Performance with the draft model NapYang/Qwen3.5-2B-Claude-4.6-Opus-Reasoning-Distilled-v2-text-SpecPrefill-oQ3.5 on a Mac Studio (M2 Max, 32 GB):

Prompt Processing (excl. cached): 145.9 tok/s

Token Generation: 15.4 tok/s
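
To put the two throughput figures in context, here is a minimal sketch that turns them into a rough end-to-end latency estimate. It assumes both rates stay constant over the whole request and that none of the prompt is served from cache; the function name and the request sizes are hypothetical, chosen only for illustration.

```python
# Throughput figures reported above, in tokens per second.
PROMPT_TOK_PER_S = 145.9
GEN_TOK_PER_S = 15.4


def estimate_latency_s(prompt_tokens: int, output_tokens: int) -> float:
    """Rough request time: uncached prompt processing plus token generation.

    Assumes constant throughput, which real runs will only approximate.
    """
    prefill_s = prompt_tokens / PROMPT_TOK_PER_S
    decode_s = output_tokens / GEN_TOK_PER_S
    return prefill_s + decode_s


if __name__ == "__main__":
    # Hypothetical request: 4,000 prompt tokens and 500 generated tokens.
    print(f"{estimate_latency_s(4000, 500):.1f} s")
```

At these rates, generation takes a large share of the total time even when the reply is much shorter than the prompt, because decoding is roughly an order of magnitude slower per token than prompt processing.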

Downloads last month: 2,714
Safetensors
Model size: 27B params
Tensor type: U8 · U32 · BF16 · F32 · MLX

3-bit


Model tree for NapYang/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-text-oQ3.5