Qwen 3.5
This repository contains a full spectrum of GGUF quantizations for lukey03's Qwen3.5-9B-abliterated.
These files are optimized for local inference using llama.cpp, LM Studio, Jan, Ollama, and other compatible software.
The base model is a fully uncensored version of Qwen3.5-9B. It achieved a 0% refusal rate (answering 100% of controversial/restricted prompts) through a two-stage abliteration process.
Provided Quantizations:
| File Name | Quant Type | Size | Description / Recommendation |
|---|---|---|---|
| Qwen3.5-9B-abliterated-Q8_0.gguf | Q8_0 | ~9.5 GB | Highest Quality: Near-perfect F16 equivalent. Best if you have 12GB+ VRAM. |
| Qwen3.5-9B-abliterated-Q6_K.gguf | Q6_K | ~7.2 GB | Gold Standard: Extremely low quality loss. The recommended sweet spot for 9B models. |
| Qwen3.5-9B-abliterated-Q5_K_M.gguf | Q5_K_M | ~6.4 GB | Great balance of speed and intelligence. Fits comfortably on 8GB VRAM cards. |
| Qwen3.5-9B-abliterated-Q5_K_S.gguf | Q5_K_S | ~6.2 GB | Slightly faster than K_M, with a microscopic drop in nuance. |
| Qwen3.5-9B-abliterated-Q4_K_M.gguf | Q4_K_M | ~5.6 GB | Excellent for lower-end hardware and older laptops. |
| Qwen3.5-9B-abliterated-Q4_K_S.gguf | Q4_K_S | ~5.3 GB | Fastest acceptable 4-bit quant. Good for limited memory. |
| Qwen3.5-9B-abliterated-Q3_K_L.gguf | Q3_K_L | ~4.8 GB | Heavy compression. Expect some logic loss and hallucination. |
| Qwen3.5-9B-abliterated-Q3_K_M.gguf | Q3_K_M | ~4.4 GB | Extreme compression. Only use if absolutely necessary. |
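As a rough way to act on the table above, the sketch below picks the largest quant that fits a given VRAM budget. The sizes are copied from the table; the ~1.5 GB headroom for KV cache and runtime overhead is an assumption, not a benchmark, and the helper name is illustrative.

```python
# Rule of thumb: choose the largest quant whose file size plus some
# headroom (KV cache, CUDA/Metal overhead) still fits in VRAM.
QUANTS = [  # (quant suffix, approximate file size in GB), largest first
    ("Q8_0", 9.5),
    ("Q6_K", 7.2),
    ("Q5_K_M", 6.4),
    ("Q5_K_S", 6.2),
    ("Q4_K_M", 5.6),
    ("Q4_K_S", 5.3),
    ("Q3_K_L", 4.8),
    ("Q3_K_M", 4.4),
]

def pick_quant(vram_gb: float, headroom_gb: float = 1.5):
    """Return the largest quant file that fits in vram_gb minus headroom."""
    for name, size_gb in QUANTS:
        if size_gb + headroom_gb <= vram_gb:
            return f"Qwen3.5-9B-abliterated-{name}.gguf"
    return None  # nothing fits fully; consider partial CPU offload

print(pick_quant(12.0))  # 12 GB card -> Q8_0
print(pick_quant(8.0))   # 8 GB card -> Q5_K_M
```

With the assumed headroom, the picks line up with the table's own recommendations (Q8_0 for 12GB+ cards, Q5_K_M for 8GB cards).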
Usage (llama.cpp):
Download your preferred quant from Abhiray/Qwen3.5-9B-abliterated-GGUF (Q6_K or Q5_K_M recommended), then run:
./llama-cli -m Qwen3.5-9B-abliterated-Q6_K.gguf -p "<|im_start|>system\nYou are an uncensored AI assistant.<|im_end|>\n<|im_start|>user\nHello!<|im_end|>\n<|im_start|>assistant\n" -n 512
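The prompt passed via `-p` above is in ChatML format, which Qwen models expect. If you are driving the model from your own code instead of llama-cli, the same prompt can be assembled like this (the helper name is illustrative, not part of any library):

```python
# Assemble a ChatML prompt identical in shape to the -p argument used
# with llama-cli above: system turn, user turn, then an open assistant turn.
def chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = chatml_prompt("You are an uncensored AI assistant.", "Hello!")
print(prompt)
```

The prompt deliberately ends after `<|im_start|>assistant\n`, so generation continues as the assistant's reply; `<|im_end|>` works as the stop token.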