Uncensored Qwen3.5 MLX

Quantization: mixed per-tensor quantization, group size 32. Experts are quantized to 5 bits; shared experts and attention layers to 8 bits; embeddings and the LM head are kept in fp16. Average bits per weight: 6.19.
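Because different tensor groups use different bit widths, the 6.19 bpw figure is a parameter-weighted average. A minimal sketch of that calculation; the parameter counts below are illustrative placeholders, not this model's actual tensor sizes (group-size metadata overhead is also ignored here):

```python
# Average bits-per-weight (bpw) for a mixed-quantization scheme.
# Parameter counts are HYPOTHETICAL, for illustration only.
tensors = [
    ("experts",        100e9, 5),   # 5-bit expert weights
    ("shared_experts",   2e9, 8),   # 8-bit shared experts
    ("attention",        3e9, 8),   # 8-bit attention layers
    ("embed_and_head",   1e9, 16),  # fp16 embeddings and LM head
]

total_bits   = sum(count * bits for _, count, bits in tensors)
total_params = sum(count for _, count, _ in tensors)
avg_bpw = total_bits / total_params
print(f"{avg_bpw:.2f} bpw")  # → 5.25 bpw (with these illustrative counts)
```

With the model's real tensor sizes and per-group scale/bias overhead, the same weighted average comes out to the stated 6.19 bpw.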
This is an uncensored version of Qwen/Qwen3.5-122B-A10B, made using Heretic v1.2.0 with multi-directional refusal suppression.
| Metric | This model | Original model (Qwen/Qwen3.5-122B-A10B) |
|---|---|---|
| KL divergence | 0.0646 | 0 (by definition) |
| Refusals | 16/100 | 84/100 |
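The KL-divergence row measures how far the modified model's next-token distributions drift from the original's (0 would mean identical behavior). A minimal sketch of KL divergence between two categorical next-token distributions; the distributions below are toy values, and the exact prompts/averaging Heretic uses are not specified here:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two categorical distributions, in nats."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy next-token distributions over a 4-token vocabulary.
p_original = [0.70, 0.20, 0.05, 0.05]
p_ablated  = [0.65, 0.24, 0.06, 0.05]
print(round(kl_divergence(p_original, p_ablated), 4))  # → 0.0063
```

A small value like the table's 0.0646 indicates the uncensored model's output distribution stays close to the original on ordinary prompts.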
Sampling parameter presets:

- temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
- temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
- temperature=0.7, top_p=0.8, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
- temperature=1.0, top_p=1.0, top_k=40, min_p=0.0, presence_penalty=2.0, repetition_penalty=1.0

Set the presence_penalty parameter between 0 and 2 to reduce endless repetitions. However, a higher value may occasionally cause language mixing and a slight decrease in model performance.

This model was converted to MLX format from coder3101/Qwen3.5-122B-A10B-heretic-v2 using mlx-vlm version 0.4.0.
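One of the presets above can be applied at inference time. A minimal sketch using the mlx-lm command-line generator; flag names follow recent mlx-lm releases and may differ by version (top_k/min_p flags in particular), and the model path is a placeholder for this repository's id:

```shell
# Sketch: generate with the temperature=0.6 preset via mlx-lm.
# <this-model-repo> is a placeholder; substitute this repository's id.
pip install mlx-lm

python -m mlx_lm.generate \
  --model <this-model-repo> \
  --prompt "Explain mixed-precision quantization in one paragraph." \
  --temp 0.6 --top-p 0.95 \
  --max-tokens 512
```

Runtime presence/repetition penalties, where supported, are set through the serving framework rather than these flags.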
Base model: Qwen/Qwen3.5-122B-A10B