Qwen3.5-9B-Abliterated-Claude-4.6-Opus-Reasoning-Distilled-v2 (GGUF)

This repository contains GGUF quantizations of the abliterated, reasoning-distilled Qwen3.5-9B model.

🛠 Abliteration Process (The "Deep Scrub")

This model underwent a three-round iterative ablation process using Orthogonalization via Null-Space SVD. Unlike standard uncensored models, this version uses an aggressive configuration to target "Soft Refusals."

Configuration Profile:

| Parameter | Value | Description |
| --- | --- | --- |
| Direction Multiplier | 1.25 | Increased force to bypass "helpful assistant" pivots. |
| Null-Space Rank Ratio | 0.70 | Tightened shield so that only core reasoning logic is protected. |
| Intervention Range | (0.0, 1.0) | Full coverage from layer 0 to layer 48. |
| Filter by Refusal | Enabled | Specifically targets the activations associated with refusal lectures. |
| Skip State Proj | No | Ensures the attention heads cannot "detect and pivot" to safety. |
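The directional-ablation idea behind the Direction Multiplier can be illustrated with a minimal numpy sketch. This is a conceptual illustration only, not the actual tooling used for this model; the function name and the stand-in matrices are our own assumptions, and the null-space SVD shield that constrains the edit is omitted for brevity:

```python
import numpy as np

def ablate_weight(W, refusal_dir, multiplier=1.0):
    """Project the refusal direction out of a weight matrix's output space.

    Conceptual sketch only: the real pipeline applies this per layer and
    restricts the edit via a null-space SVD shield (not shown here).
    """
    d = refusal_dir / np.linalg.norm(refusal_dir)
    # Subtract the component of W that writes along d, scaled by the multiplier.
    return W - multiplier * np.outer(d, d) @ W

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))   # stand-in for a layer's output projection
d = rng.standard_normal(8)        # stand-in refusal direction
W_ablated = ablate_weight(W, d)

# With multiplier 1.0 the ablated matrix can no longer write along d:
d_hat = d / np.linalg.norm(d)
residual = d_hat @ W_ablated      # effectively zero in every coordinate
```

A multiplier above 1.0 (as in the 1.25 profile) overshoots the projection, pushing the output space slightly *away* from the refusal direction rather than merely neutralizing it.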

🧠 Reasoning Capabilities

Despite the aggressive ablation, the model's core capabilities remain intact. It retains the ability to:

  • Perform complex mathematical and logical reasoning.
  • Execute multi-step coding tasks without "hallucinating" safety blocks.
  • Maintain a coherent internal monologue inside <think> tags.
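Client code usually needs to separate that `<think>` monologue from the final answer. A minimal sketch of such a parser (the tag format follows this model card; the helper name is ours):

```python
import re

def split_reasoning(text: str):
    """Split a completion into (reasoning, answer) around <think> tags."""
    m = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    if m is None:
        return None, text.strip()  # no reasoning block was emitted
    return m.group(1).strip(), text[m.end():].strip()

thought, answer = split_reasoning("<think>2 + 2 = 4.</think>The answer is 4.")
```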

⚠️ Usage & Disclaimer

This model is unbound. It has had its safety guardrails removed for research and creative purposes. It will follow instructions that the base model would otherwise refuse.

User Discretion is Advised: This model may generate content that is considered harmful, offensive, or controversial. The creator is not responsible for the outputs generated. Use it for research, roleplay, and complex reasoning only.

Available Quantizations

The following quantization methods are provided to balance VRAM usage and model performance:

  • Q3_K_M: Smallest size, noticeable loss of quality. Good for very low VRAM.
  • Q4_K_S: Slightly smaller than Q4_K_M, faster inference but lower quality.
  • Q4_K_M: Recommended baseline. Excellent balance of size, speed, and quality.
  • Q5_K_M: Higher quality, slightly larger VRAM footprint.
  • Q6_K: Near unquantized quality. Recommended if you have the VRAM.
  • Q8_0: Practically identical to f16, very large.
  • mmproj-f16: Multimodal projector file in f16 format.
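As a rough way to match a file to your VRAM budget, on-disk size scales with bits per weight. A back-of-the-envelope sketch; the bits-per-weight figures below are typical approximations for llama.cpp k-quants, not measurements of these specific files:

```python
def approx_gguf_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    # parameters * bits / 8 bytes; ignores metadata and embedding overhead
    return n_params_billion * bits_per_weight / 8

# Approximate effective bits-per-weight for common llama.cpp quant types
# (assumed ballpark values, not measured from this repository's files):
BPW = {"Q3_K_M": 3.9, "Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5}

sizes = {q: round(approx_gguf_size_gb(9, b), 1) for q, b in BPW.items()}
```

Add a margin on top of the file size for the KV cache and runtime overhead when judging whether a quant fits in VRAM.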

Usage

These models are compatible with modern GGUF loaders like llama.cpp, text-generation-webui, LM Studio, and Ollama.

To run via llama.cpp (the `-cml` flag enables the ChatML prompt format; newer llama.cpp builds ship this binary as `llama-cli`):

```shell
./main -m Qwen3.5-9B-Abliterated-Claude-4.6-Opus-Reasoning-Distilled-v2.Q4_K_M.gguf -n 2048 --color -i -cml
```
Model Details

  • Format: GGUF
  • Model size: 9B params
  • Architecture: qwen35