Precision discrepancy between variants
#6
by represiv - opened
I'm a little confused by the main-tensor precision across variants and the final model sizes:
I-Compact uses Q4_K for the main tensor, model size 17.3 GB, as expected.
I-Balanced uses Q6_K for the main tensor, model size 25.3 GB, NOT as expected.
I-Quality uses Q6_K for the main tensor, model size 22.8 GB, as expected.
The recently uploaded gemma-4-26B-A4B variants have the tensor precisions and sizes I would expect: I-Compact Q4_K, I-Balanced Q5_K, I-Quality Q6_K.
So my question is: why does the Qwen3.5 I-Balanced variant not use Q5_K for the main tensor, and why is the I-Balanced model larger than I-Quality?
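For context on why the sizes look off, file size roughly tracks the average bits per weight of the quantization. Here is a minimal back-of-the-envelope sketch; the ~30B parameter count and the bits-per-weight figures are my assumptions (approximate llama.cpp k-quant averages), not numbers from this repo:

```python
def est_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Rough GGUF size estimate: parameters * bits-per-weight / 8 bits-per-byte."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9  # decimal GB

# Approximate average bits per weight for llama.cpp k-quants (assumed values).
for name, bpw in [("Q4_K", 4.5), ("Q5_K", 5.5), ("Q6_K", 6.56)]:
    print(f"{name}: ~{est_size_gb(30, bpw):.1f} GB")
```

By this estimate, a Q6_K main tensor should land well above a Q5_K one at the same parameter count, so I-Balanced coming out both at Q6_K and larger than I-Quality is what made me suspect a mix-up in the variant configs.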
P.S. Thank you for the APEX variants; really good results and speed.