Clarify canonical SFT checkpoint and failed v3 archive
README.md
CHANGED
@@ -14,6 +14,12 @@ Continuation of [AGILLM-3-Large](https://huggingface.co/OpenTransformer/AGILLM-3
**Demo Space:** [OpenTransformer/AGILLM-3-large-v2-demo](https://huggingface.co/spaces/OpenTransformer/AGILLM-3-large-v2-demo)
## Current recommended checkpoint (2026-05-21)
**Use [sft_chat_1024_220k_20260520/final.pt](sft_chat_1024_220k_20260520/final.pt) in AR mode.** This is the current canonical AGILLM-3 v2 chat checkpoint: in the smoke rerun it correctly answers 1+1=2, 2+2=4, and 47+28=75.
**Do not use [sft_sat_repair_v3_20260521/final.pt](sft_sat_repair_v3_20260521/final.pt) except as a failed-experiment archive.** That run regressed AR arithmetic and did not repair SAT. SAT mode remains experimental/broken pending a separate objective/inference fix.
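
The arithmetic smoke check mentioned above could be reproduced with something like the following minimal sketch. It is not the project's actual test harness: `generate` is a hypothetical callable wrapping whatever AR-mode inference entry point you use, the prompt wording is invented for illustration, and reading the last integer in the reply as the answer is an assumption about the output format.

```python
import re
from typing import Callable, Dict, Optional

# Sums and expected answers quoted in the README text above.
SMOKE_CASES: Dict[str, int] = {"1+1": 2, "2+2": 4, "47+28": 75}


def last_integer(text: str) -> Optional[int]:
    """Return the last run of digits in `text` as an int, or None if absent."""
    numbers = re.findall(r"\d+", text)
    return int(numbers[-1]) if numbers else None


def run_smoke(generate: Callable[[str], str]) -> Dict[str, bool]:
    """Ask each sum once and record whether the reply ends on the right number.

    `generate` is a hypothetical prompt -> reply function, not a real API.
    """
    results: Dict[str, bool] = {}
    for expr, expected in SMOKE_CASES.items():
        reply = generate(f"What is {expr}?")  # prompt wording is an assumption
        results[expr] = last_integer(reply) == expected
    return results
```

A checkpoint would pass this check only if every value in the returned dictionary is `True`; a regression like the one described for the v3 run would show up as one or more `False` entries.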
## What happened to v1?
A `transformers` library update (to v5.3.0) on 2026-03-11 silently broke the DeepSeek-V3.2 tokenizer's encode/decode pipeline: