Qwen3.5-9B-Abliterated-Claude-4.6-Opus-Reasoning-Distilled-v2 (GGUF)

This repository contains GGUF quantizations of the abliterated, reasoning-distilled Qwen3.5-9B model.

🛠 Abliteration Process (The "Deep Scrub")

This model underwent a three-round iterative ablation process using Orthogonalization via Null-Space SVD. Unlike standard uncensored models, this version uses an aggressive configuration to target "Soft Refusals."

Configuration Profile:

| Parameter | Value | Description |
| --- | --- | --- |
| Direction Multiplier | 1.25 | Increased force to bypass "helpful assistant" pivots. |
| Null-Space Rank Ratio | 0.70 | Tightened shield so that only core reasoning logic is protected. |
| Intervention Range | (0.0, 1.0) | Full coverage from layer 0 to layer 48. |
| Filter by Refusal | Enabled | Specifically targets the activations associated with refusal lectures. |
| Skip State Proj | No | Ensures the attention heads cannot "detect and pivot" to safety. |
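The directional-ablation idea behind the Direction Multiplier can be illustrated with a minimal numpy sketch. This is a conceptual illustration only, not the actual tooling used for this model; the function name and the stand-in matrices are our own assumptions, and the null-space SVD shield that constrains the edit is omitted for brevity:

```python
import numpy as np

def ablate_weight(W, refusal_dir, multiplier=1.0):
    """Project the refusal direction out of a weight matrix's output space.

    Conceptual sketch only: the real pipeline applies this per layer and
    restricts the edit via a null-space SVD shield (not shown here).
    """
    d = refusal_dir / np.linalg.norm(refusal_dir)
    # Subtract the component of W that writes along d, scaled by the multiplier.
    return W - multiplier * np.outer(d, d) @ W

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))   # stand-in for a layer's output projection
d = rng.standard_normal(8)        # stand-in refusal direction
W_ablated = ablate_weight(W, d)

# With multiplier 1.0 the ablated matrix can no longer write along d:
d_hat = d / np.linalg.norm(d)
residual = d_hat @ W_ablated      # effectively zero in every coordinate
```

A multiplier above 1.0 (as in the 1.25 profile) overshoots the projection, pushing the output space slightly *away* from the refusal direction rather than merely neutralizing it.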

🧠 Reasoning Capabilities

Despite the aggressive ablation, the model's core capabilities remain intact. It retains the ability to:

  • Perform complex mathematical and logical reasoning.
  • Execute multi-step coding tasks without "hallucinating" safety blocks.
  • Maintain a coherent internal monologue inside <think> tags.
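Client code usually needs to separate that `<think>` monologue from the final answer. A minimal sketch of such a parser (the tag format follows this model card; the helper name is ours):

```python
import re

def split_reasoning(text: str):
    """Split a completion into (reasoning, answer) around <think> tags."""
    m = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    if m is None:
        return None, text.strip()  # no reasoning block was emitted
    return m.group(1).strip(), text[m.end():].strip()

thought, answer = split_reasoning("<think>2 + 2 = 4.</think>The answer is 4.")
```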

⚠️ Usage & Disclaimer

This model is unbound. It has had its safety guardrails removed for research and creative purposes. It will follow instructions that the base model would otherwise refuse.

User Discretion is Advised: This model may generate content that is considered harmful, offensive, or controversial. The creator is not responsible for the outputs generated. Use it for research, roleplay, and complex reasoning only.

Available Quantizations

The following quantization methods are provided to balance VRAM usage and model performance:

  • Q3_K_M: Smallest size, noticeable loss of quality. Good for very low VRAM.
  • Q4_K_S: Slightly smaller than Q4_K_M, faster inference but lower quality.
  • Q4_K_M: Recommended baseline. Excellent balance of size, speed, and quality.
  • Q5_K_M: Higher quality, slightly larger VRAM footprint.
  • Q6_K: Near unquantized quality. Recommended if you have the VRAM.
  • Q8_0: Practically identical to f16, very large.
  • mmproj-f16: Multimodal projector file in f16 format.
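As a rough way to match a file to your VRAM budget, on-disk size scales with bits per weight. A back-of-the-envelope sketch; the bits-per-weight figures below are typical approximations for llama.cpp k-quants, not measurements of these specific files:

```python
def approx_gguf_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    # parameters * bits / 8 bytes; ignores metadata and embedding overhead
    return n_params_billion * bits_per_weight / 8

# Approximate effective bits-per-weight for common llama.cpp quant types
# (assumed ballpark values, not measured from this repository's files):
BPW = {"Q3_K_M": 3.9, "Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5}

sizes = {q: round(approx_gguf_size_gb(9, b), 1) for q, b in BPW.items()}
```

Add a margin on top of the file size for the KV cache and runtime overhead when judging whether a quant fits in VRAM.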

Usage

These models are compatible with modern GGUF loaders like llama.cpp, text-generation-webui, LM Studio, and Ollama.

To run via llama.cpp (the `-cml` flag enables the ChatML prompt format; newer llama.cpp builds ship this binary as `llama-cli`):

```shell
./main -m Qwen3.5-9B-Abliterated-Claude-4.6-Opus-Reasoning-Distilled-v2.Q4_K_M.gguf -n 2048 --color -i -cml
```
Model Details

  • Format: GGUF
  • Model size: 9B params
  • Architecture: qwen35