Performance with the draft model NapYang/Qwen3.5-2B-Claude-4.6-Opus-Reasoning-Distilled-v2-text-SpecPrefill-oQ3.5 on a Mac Studio (M2 Max, 32 GB):
Prompt Processing (excl. cached): 145.9 tok/s
Token Generation: 15.4 tok/s
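
To put these throughput figures in practical terms, the sketch below converts them into an approximate end-to-end time for a single request. It assumes both rates stay constant over the whole request and that nothing is served from cache; the function name `estimate_latency` and its parameters are illustrative, not part of any tool mentioned here.

```python
# Rough latency estimate from the throughput figures reported above.
# Assumption: both rates are constant and no prompt tokens are cached.

PROMPT_TOK_PER_S = 145.9   # prompt processing rate from the model card
GEN_TOK_PER_S = 15.4       # token generation rate from the model card


def estimate_latency(prompt_tokens: int, output_tokens: int,
                     prompt_rate: float = PROMPT_TOK_PER_S,
                     gen_rate: float = GEN_TOK_PER_S) -> dict:
    """Return approximate seconds spent on the prompt, on generation, and in total."""
    prompt_s = prompt_tokens / prompt_rate
    gen_s = output_tokens / gen_rate
    return {"prompt_s": prompt_s, "generation_s": gen_s, "total_s": prompt_s + gen_s}


if __name__ == "__main__":
    est = estimate_latency(prompt_tokens=4000, output_tokens=500)
    print(f"prompt: {est['prompt_s']:.1f} s, "
          f"generation: {est['generation_s']:.1f} s, "
          f"total: {est['total_s']:.1f} s")
```

With these rates, generation usually dominates: each output token costs roughly as much time as nine to ten prompt tokens.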
Model size
27B params
Tensor type
U8 · U32 · BF16 · F32
Hardware compatibility
3-bit
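
As a rough check on whether the weights fit a given machine, the sketch below estimates weight storage from the parameter count and bit width listed here. It assumes every parameter is stored at exactly 3 bits, which is a simplification: the mixed tensor types on this card suggest some tensors are kept at higher precision, and quantization scales add overhead, so the real footprint is likely larger. The helper name `approx_weight_gb` is hypothetical.

```python
# Lower-bound estimate of weight storage for a quantized model.
# Assumption: all parameters use the same bit width; scales, higher-precision
# tensors, KV cache, and runtime buffers are ignored.

def approx_weight_gb(num_params: float, bits_per_param: float) -> float:
    """Return approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    size_gb = approx_weight_gb(27e9, 3)  # 27B params at 3 bits each
    print(f"approx. weight size: {size_gb:.3f} GB")
```

Treat the result as a floor rather than a prediction; memory for the draft model and the context cache comes on top of it.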
Model tree for NapYang/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-text-oQ3.5
Base model
Qwen/Qwen3.5-27B