# Qwen3.5-27B-heretic-v2-Opus-4.6-Distilled
A fine-tuned version of llmfan46/Qwen3.5-27B-ultra-uncensored-heretic-v2, distilled from Claude Opus 4.6 reasoning traces.
## Model Details
- Base Model: llmfan46/Qwen3.5-27B-ultra-uncensored-heretic-v2
- Architecture: Qwen3.5-27B (Dense, 27B parameters, all active)
- Training Method: LoRA fine-tuning with Unsloth, merged to bf16
- Training Data: Jongsim/claude-opus-4.6-reasoning-12k-en-filtered-v2 (12,822 examples)
- Format: SafeTensors (bf16)
- Size: ~54.7 GB
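The reported size is consistent with the parameter count and precision: bf16 stores each weight in 2 bytes, so a dense 27B-parameter model comes to roughly 54 GB before tokenizer files and metadata. A quick sanity check:

```python
# bf16 stores each parameter in 2 bytes, so checkpoint size ≈ params × 2 bytes.
params = 27e9          # nominal 27B parameters (dense, all active)
bytes_per_param = 2    # bfloat16
size_gb = params * bytes_per_param / 1e9
print(f"{size_gb:.1f} GB")  # → 54.0 GB; the ~54.7 GB on disk includes extra weights and metadata
```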
## Training Configuration
| Parameter | Value |
|---|---|
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Epochs | 3 |
| Batch size | 1 (gradient accumulation = 8, effective batch = 8) |
| Learning rate | 2e-4 |
| Scheduler | Cosine |
| Max sequence length | 2048 |
| Optimizer | AdamW 8-bit |
| Precision | bfloat16 |
| Hardware | NVIDIA DGX Spark (GB10 Blackwell GPU, 128GB unified memory) |
| Total training time | ~113 hours |
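The adapter hyper-parameters in the table map onto a standard PEFT `LoraConfig`. A sketch of the equivalent configuration (the actual run used Unsloth's wrapper, so argument names there may differ; `task_type` is an assumption, and dropout/bias are not stated in this card):

```python
from peft import LoraConfig

# Mirrors the table above; values not listed in the card are left at PEFT defaults.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",  # assumption: causal-LM fine-tuning
)
```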
## Training Loss
| Epoch | Start Loss | Final Loss | Avg Loss | Improvement |
|---|---|---|---|---|
| 1 | 0.5551 | 0.2747 | 0.3375 | — |
| 2 | 0.2738 | 0.1283 | 0.1757 | -47.9% |
| 3 | 0.1287 | 0.0585 | 0.0735 | -58.2% |
Overall training loss: 0.1954
The model shows strong convergence, with loss falling consistently across all three epochs and no signs of overfitting.
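The Improvement column is the relative drop in average loss between consecutive epochs; recomputing it from the table values:

```python
avg_loss = [0.3375, 0.1757, 0.0735]  # per-epoch average loss from the table

# Relative change between consecutive epochs, as a percentage.
for prev, cur in zip(avg_loss, avg_loss[1:]):
    change = (cur - prev) / prev * 100
    print(f"{change:+.1f}%")  # → -47.9%, then -58.2%
```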
## Dataset
The training dataset consists of 12,822 high-quality English reasoning examples generated by Claude Opus 4.6, featuring:
- Complex multi-step reasoning with chain-of-thought
- Structured reasoning traces wrapped in `<think>...</think>` tags
- Diverse domains: math, logic, coding, analysis, creative writing
- Quality-filtered for coherent and complete reasoning chains
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Jongsim/Qwen3.5-27B-heretic-v2-Opus-4.6-Distilled"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain the concept of gradient descent in machine learning, step by step."}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=2048, temperature=0.7, top_p=0.9)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
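Since the training data wraps chain-of-thought in `<think>...</think>` tags, generated output will typically contain a reasoning block before the final answer. A minimal way to separate the two, assuming the model emits a closing `</think>` tag (`split_reasoning` is an illustrative helper, not part of any library):

```python
def split_reasoning(text: str):
    """Split generated text into (reasoning, answer) on the closing </think> tag."""
    head, sep, tail = text.partition("</think>")
    if not sep:  # no think block present; treat everything as the answer
        return "", text.strip()
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()

reasoning, answer = split_reasoning("<think>Check the gradient sign.</think>Gradient descent updates weights...")
print(answer)  # → Gradient descent updates weights...
```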
## Attribution
- Base model abliteration: llmfan46 — created the uncensored heretic variant
- Original architecture: Qwen Team — Qwen3.5-27B
- Fine-tuning & distillation: Jongsim — LoRA training with Claude Opus 4.6 reasoning data
- Training framework: Unsloth
## License
This model inherits the Apache 2.0 license from the base Qwen3.5 model.