Performance with draft model NapYang/Qwen3.5-2B-Claude-4.6-Opus-Reasoning-Distilled-v2-text-SpecPrefill-oQ3.5 on a Mac Studio (M2 Max, 32 GB):

Prompt processing (excl. cached): 145.9 tok/s

Token generation: 15.4 tok/s
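
To put the throughput figures above in practical terms, here is a minimal sketch that turns them into a rough end-to-end latency estimate. The function name `estimate_latency_seconds` is hypothetical, and the model is deliberately simple: it assumes the whole prompt is uncached, that both rates stay constant regardless of context length, and that model load time is zero. Real runs will deviate from this.

```python
# Rough latency estimate from the reported throughput numbers.
# Assumptions (not from the model card): constant rates, no prompt caching,
# no model load time. The function name is illustrative only.

PROMPT_TOK_PER_S = 145.9  # reported prompt processing rate (excl. cached)
GEN_TOK_PER_S = 15.4      # reported token generation rate


def estimate_latency_seconds(
    prompt_tokens: int,
    output_tokens: int,
    prompt_rate: float = PROMPT_TOK_PER_S,
    gen_rate: float = GEN_TOK_PER_S,
) -> float:
    """Return estimated seconds to process the prompt and generate the output."""
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_rate <= 0 or gen_rate <= 0:
        raise ValueError("rates must be positive")
    # Time is tokens divided by tokens-per-second, summed over both phases.
    return prompt_tokens / prompt_rate + output_tokens / gen_rate


if __name__ == "__main__":
    seconds = estimate_latency_seconds(prompt_tokens=2000, output_tokens=500)
    print(f"Estimated latency: {seconds:.1f} s")
```

Under these assumptions, a 2,000-token prompt with a 500-token reply comes out at about 46 seconds, most of it spent in generation rather than prompt processing.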

Safetensors

Model size: 27B params

Tensor type: U32, BF16, F32

MLX: 3-bit

Model tree for NapYang/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-text-oQ3.5: base model, adapter, quantized (21), this model