AI-8850 LLM Acceleration M.2 Module
The M5Stack LLM-8850 Card adds 24 TOPS INT8 AI acceleration to Raspberry Pi 5, RK3588 and x86 systems, with active cooling, AXCL Runtime support and 8K H.264/H.265 hardware video processing for powerful edge-AI applications.
The M5Stack LLM-8850 Card is an M.2 M-KEY 2242 AI acceleration module designed for edge devices. It combines a compact 42 mm form factor with the Axera AX8850 SoC, delivering 24 TOPS @ INT8 performance. With simple plug-and-play installation, it allows host devices such as the Raspberry Pi 5, RK3588 boards, and x86 PCs to add multimodal large-model processing and advanced video analysis capabilities.
The card includes an active cooling system with a micro turbo fan and CNC-machined aluminium alloy fins. The onboard embedded controller (EC) adjusts fan speed automatically from temperature–current curves, which keeps performance stable and avoids thermal throttling even during long full-load runs inside enclosed cases.
Power is delivered through an onboard DC-DC and PMIC chain that the EC monitors in real time, providing “power on demand” and “cooling on demand” and helping maintain overall system stability.
The card supports the AXCL Runtime, whose C and Python APIs allow quick deployment of popular model types, including CNNs, Transformers, LLMs, and multimodal models. Supported workloads include YOLOv8/v9, CLIP, Whisper, Llama 3.2, Gemma 2, and Qwen 2.5. The card also uses the AX8850’s VPU hardware pipeline for 8K H.264/H.265 encoding and decoding, including simultaneous encode and decode, transcoding, scaling, and cropping.
Host devices can access the hardware codec directly through FFmpeg, which links AI workloads with high-performance video stream processing.
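One way to see which H.264/H.265 codecs a particular FFmpeg build exposes is to parse the table printed by `ffmpeg -encoders` (or `ffmpeg -decoders`). The Python sketch below does only that. The names the AXCL-enabled FFmpeg build registers for the card's codecs are not given here, so the `name_hint` filter is a placeholder argument rather than a known codec name, and `list_encoders` assumes an `ffmpeg` binary is on the host's `PATH`.

```python
import subprocess


def parse_codec_listing(listing: str) -> dict[str, str]:
    """Parse the table printed by `ffmpeg -encoders` or `ffmpeg -decoders`.

    Returns a mapping of codec name -> human-readable description.
    """
    codecs: dict[str, str] = {}
    in_table = False
    for line in listing.splitlines():
        stripped = line.strip()
        if not in_table:
            # The capability legend ends with a separator line of dashes.
            in_table = stripped.startswith("---")
            continue
        # Each row is: <flags> <name> <description...>
        parts = stripped.split(None, 2)
        if len(parts) >= 2:
            codecs[parts[1]] = parts[2] if len(parts) == 3 else ""
    return codecs


def find_codecs(listing: str,
                keywords: tuple[str, ...] = ("H.264", "H.265", "HEVC"),
                name_hint: str = "") -> list[str]:
    """Return codec names whose description mentions any keyword.

    `name_hint` is a hypothetical substring filter (for example a vendor
    prefix); leave it empty to list every matching codec.
    """
    matches = []
    for name, description in parse_codec_listing(listing).items():
        text = description.lower()
        if name_hint in name and any(k.lower() in text for k in keywords):
            matches.append(name)
    return sorted(matches)


def list_encoders(ffmpeg: str = "ffmpeg") -> str:
    """Run FFmpeg and return the raw encoder table as text."""
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a host with FFmpeg installed, `find_codecs(list_encoders())` lists every H.264/H.265 encoder in that build. Which entries, if any, are backed by the card depends on the FFmpeg build supplied with the AXCL software.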
Features
- Ultra-compact form factor: NGFF M.2 M-KEY 2242, PCIe 2.0 ×2, plug-and-play
- High-performance NPU: 24 TOPS @ INT8, octa-core Cortex‑A55 1.7 GHz CPU
- Intelligent cooling and power supply: onboard micro turbo fan with an integrated CNC-machined aluminium alloy heatsink; the EC runs a closed temperature-current-fan-speed control loop
- High-bandwidth memory: 64‑bit LPDDR4x, 4266 Mbps, 8 GB
- Rich I/O: 1 × USB 3.0, 2 × USB 2.0, 2 × Gigabit Ethernet MAC
- Hardware video engine: 8K @ 30 fps H.264/H.265 encoding, 8K @ 60 fps decoding, up to 32 channels of parallel 1080p decoding
- Secure boot & encryption: AES / DES / 3DES / SHA‑256 hardware security module
- Native AXCL: one-click deployment of CNN, Transformer, CLIP, Whisper, Llama 3.2, Qwen 2.5, and InternVL 2 models, plus simultaneous H.264/H.265 encode, decode, and transcoding
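As a rough cross-check of the video figures above, the decoder's stated limits can be compared by raw pixel rate. The sketch assumes "8K" means 7680 × 4320 and that each of the 32 parallel 1080p channels runs at 30 fps; neither is stated in the listing. Pixel rate is only a proxy, since real capacity also depends on codec profile, bitrate, and memory traffic.

```python
def pixel_rate(width: int, height: int, fps: int, streams: int = 1) -> int:
    """Pixels per second for `streams` identical video streams."""
    return width * height * fps * streams


def max_streams(budget: int, width: int, height: int, fps: int) -> int:
    """How many identical streams fit within a pixel-rate budget."""
    return budget // pixel_rate(width, height, fps)


# Assumption: 8K = 7680 x 4320 (8K UHD).
DECODE_BUDGET = pixel_rate(7680, 4320, 60)   # 8K @ 60 fps decode
ENCODE_BUDGET = pixel_rate(7680, 4320, 30)   # 8K @ 30 fps encode

# Assumption: the 32 parallel 1080p channels run at 30 fps each.
RATE_32_X_1080P30 = pixel_rate(1920, 1080, 30, streams=32)

print(f"8K60 decode budget : {DECODE_BUDGET:,} px/s")
print(f"32 x 1080p30       : {RATE_32_X_1080P30:,} px/s")
print(f"1080p60 streams    : {max_streams(DECODE_BUDGET, 1920, 1080, 60)}")
```

Under those assumptions the two decode figures describe the same pixel throughput, which suggests the 32-channel number is the 8K @ 60 fps budget split into 1080p streams.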
Specifications
| Specification | Value |
| --- | --- |
| SoC | Axera AX8850 |
| CPU | Octa-core Cortex-A55 1.7 GHz |
| NPU | 24 TOPS @ INT8 |
| Video Encoder | 8K @ 30 fps H.264/H.265 encoding, supports scaling / cropping |
| Video Decoder | 8K @ 60 fps H.264/H.265 decoding, supports 32 channels of parallel 1080p decoding, supports scaling / cropping |
| Memory | 64-bit LPDDR4x, 4266 Mbps, 8 GB |
| Storage | 32 Mbit QSPI NOR Flash (used for bootloader only) |
| Form Factor | M.2 M-KEY 2242, PCIe 2.0 ×2 |
| Cooling | Micro turbo fan + integrated CNC aluminium alloy heatsink, EC intelligent temperature control |
| Operating Temperature | 0 ~ 60 °C |
| Full Load Temperature at Room Temp | 70 °C |
| Power Supply | 7 W @ 3.3 V |
| Product Size | 42.6 × 24.0 × 9.7 mm |
| Product Weight | 14.7 g |
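A few planning figures follow directly from the table. The sketch below derives theoretical peaks only: PCIe 2.0 signals at 5 GT/s per lane with 8b/10b coding, the memory figure treats 4266 Mbps as the per-pin rate across the 64-bit bus, and the supply current is simply power divided by voltage. The 3 GB weight size in the last two calls is a hypothetical example (roughly a 3B-parameter model at INT8), and the tokens-per-second value is an upper bound that assumes every generated token reads all weights once and ignores KV-cache traffic and other overheads.

```python
PCIE2_GT_PER_S = 5.0      # PCIe 2.0 raw signalling rate per lane
PCIE2_CODING = 8 / 10     # 8b/10b line coding efficiency


def pcie2_peak_gbytes(lanes: int) -> float:
    """Theoretical PCIe 2.0 bandwidth per direction, in GB/s."""
    return PCIE2_GT_PER_S * PCIE2_CODING * lanes / 8


def dram_peak_gbytes(bus_bits: int, mbps_per_pin: float) -> float:
    """Theoretical DRAM bandwidth in GB/s (assumes the rate is per pin)."""
    return bus_bits * mbps_per_pin / 8 / 1000


def supply_current_amps(power_w: float, voltage_v: float) -> float:
    """Current drawn from the supply rail at the stated power."""
    return power_w / voltage_v


def load_time_s(weights_gbytes: float, link_gbytes: float) -> float:
    """Best-case time to copy model weights from the host over the link."""
    return weights_gbytes / link_gbytes


def decode_tokens_per_s_bound(dram_gbytes: float, weights_gbytes: float) -> float:
    """Memory-bound ceiling on LLM decode speed (one weight pass per token)."""
    return dram_gbytes / weights_gbytes


link = pcie2_peak_gbytes(lanes=2)                 # PCIe 2.0 x2
dram = dram_peak_gbytes(64, 4266)                 # 64-bit LPDDR4x, 4266 Mbps
amps = supply_current_amps(7.0, 3.3)              # 7 W @ 3.3 V
HYPOTHETICAL_WEIGHTS_GB = 3.0                     # illustrative, not a measured model size

print(f"PCIe link peak    : {link:.2f} GB/s")
print(f"DRAM peak         : {dram:.1f} GB/s")
print(f"3.3 V rail current: {amps:.2f} A")
print(f"Load 3 GB model   : >= {load_time_s(HYPOTHETICAL_WEIGHTS_GB, link):.1f} s")
print(f"Decode ceiling    : <= {decode_tokens_per_s_bound(dram, HYPOTHETICAL_WEIGHTS_GB):.1f} tok/s")
```

Because the onboard flash holds only the bootloader, model weights presumably reach the card from the host over PCIe, so the load-time line is a best case rather than a measurement.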
Resources
- Documents
- User Guide
Package Contents
- 1x AI-8850 LLM Acceleration M.2 Module