# Gemma-4-21B-A4B-it-REAP (MLX 4-bit)
This is a high-speed 4-bit (INT4) conversion of 0xSero/gemma-4-21b-a4b-it-REAP optimized for Apple Silicon using the MLX framework.
## Model Highlights
- Architecture: Gemma 4 multimodal Mixture-of-Experts, 21B total parameters with ~4B active per token (A4B).
- Optimization: REAP (Router-weighted Expert Activation Pruning), which prunes low-impact experts to reduce model size while preserving vision-language quality.
- Precision: 4-bit (optimized for inference speed and a low memory footprint).
- Size: ~12 GB (recommended for any Apple Silicon Mac with 16 GB+ RAM).
## Conversion Details
The model was converted locally to ensure bit-perfect compatibility with the MLX ecosystem.
- Hardware: Mac mini (M4) with 32 GB RAM.
- Library: `mlx-vlm`, converted with `-q --q-bits 4`.
- Format: Native MLX safetensors.
## Usage

### Installation

```bash
pip install mlx-vlm
```
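### Generation

A minimal one-shot generation sketch using the `mlx-vlm` command-line interface. The image path and prompt are placeholders, and exact flag names can vary between `mlx-vlm` releases, so check `python -m mlx_vlm.generate --help` for your installed version:

```bash
# Generate a caption for a local image with the 4-bit MLX weights.
# "example.jpg" is a placeholder; substitute any local image path.
python -m mlx_vlm.generate \
  --model Z3NN001/gemma4-21b-a4b-REAP-it-mlx-Q4 \
  --max-tokens 256 \
  --prompt "Describe this image." \
  --image example.jpg
```

The first invocation downloads the weights from the Hugging Face Hub; subsequent runs load them from the local cache.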
## Credits
- Original Model: 0xSero/gemma-4-21b-a4b-it-REAP
- MLX Conversion: Z3NN001