# bioreasoning-qwen3-30ba3b-sft-20260218-ckpt-final
LoRA adapter checkpoint from supervised fine-tuning (SFT) of Qwen/Qwen3-30B-A3B on the bioreasoning dataset.
## Training Details
- Base model: Qwen/Qwen3-30B-A3B
- Method: LoRA (rank 32)
- Training platform: Tinker (ThinkingMachines)
- Dataset: abugoot-primeintellect/bioreasoning_v0211_prime_sft (~193k train examples)
- Hyperparameters: batch_size=128, lr=2e-5 (linear decay), weight_decay=0.01, seq_len=8192
- Epochs: 3
- Checkpoint: step final (epoch 3, batch 1511)
- Loss masking: Last assistant message only
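To illustrate the loss-masking setting above: with "last assistant message only", every token outside the final assistant turn is given weight 0, so gradients flow only through the tokens the model is trained to produce. The sketch below is a minimal illustration of that idea; the message structure and integer "tokens" are made up for the example and are not the actual Tinker training code.

```python
def last_assistant_loss_mask(messages):
    """Return one 0/1 loss weight per token; only tokens belonging to
    the last assistant message contribute to the training loss."""
    last_assistant = max(
        i for i, m in enumerate(messages) if m["role"] == "assistant"
    )
    mask = []
    for i, m in enumerate(messages):
        weight = 1 if i == last_assistant else 0
        mask.extend([weight] * len(m["tokens"]))
    return mask

# Toy multi-turn chat (token ids are arbitrary placeholders).
chat = [
    {"role": "user", "tokens": [101, 102, 103]},
    {"role": "assistant", "tokens": [201, 202]},       # earlier turn: masked out
    {"role": "user", "tokens": [104]},
    {"role": "assistant", "tokens": [203, 204, 205]},  # final turn: trained on
]
print(last_assistant_loss_mask(chat))  # -> [0, 0, 0, 0, 0, 0, 1, 1, 1]
```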
## All Checkpoints
| Step | Epoch | HF Repo |
|---|---|---|
| 1500 | ~1 | bioreasoning-qwen3-30ba3b-sft-20260218-ckpt-1500 |
| 3000 | ~2 | bioreasoning-qwen3-30ba3b-sft-20260218-ckpt-3000 |
| final | 3 | bioreasoning-qwen3-30ba3b-sft-20260218-ckpt-final |
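Since these are LoRA adapters rather than full model weights, any of the checkpoints above can be loaded on top of the base model with `transformers` and `peft`. A hedged sketch (the adapter repo id below omits the Hub organization prefix, matching the table; substitute the full `org/name` path of the checkpoint you want):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model the adapter was trained against.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")

# Attach the LoRA adapter (repo id shown without org prefix, as in the table).
model = PeftModel.from_pretrained(
    base, "bioreasoning-qwen3-30ba3b-sft-20260218-ckpt-final"
)

# Optional: fold the LoRA weights into the base for plain inference.
model = model.merge_and_unload()
```

Loading the 30B base requires substantial GPU memory; the same pattern works for the ckpt-1500 and ckpt-3000 repos.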