Control LLM: Controlled Evolution for Intelligence Retention in LLM
Paper: [arXiv:2501.10979](https://arxiv.org/abs/2501.10979)
This is a fine-tuned version of Llama-3.1-8B-Instruct for coding tasks, trained on the OpenCoder SFT dataset.
This model is associated with the Control-LLM paper and the Control-LLM GitHub repository.
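The checkpoint should load with the standard `transformers` causal-LM API, like any other Llama-3.1-based model. Below is a minimal inference sketch; the `MODEL_ID` is a placeholder rather than the actual published repo name, and the generation settings are illustrative only.

```python
# Minimal inference sketch (MODEL_ID is a placeholder, not the actual repo name).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/Control-LLM-Llama3.1-8B-OpenCoder"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Llama-3.1-Instruct-style chat formatting via the tokenizer's chat template.
messages = [
    {"role": "user",
     "content": "Write a Python function that checks whether a string is a palindrome."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```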
Here is an overview of the evaluation results and findings:
The following plot illustrates benchmark results and the mitigation of catastrophic forgetting on the OpenCoder SFT dataset.
The table below summarizes evaluation results across coding tasks (MB+ = MBPP Plus, MS = MBPP Sanitized, HE+ = HumanEval Plus, HE = HumanEval, C-Avg = coding average) and original capabilities (ARC = ARC Challenge, GP = GPQA, MLU = MMLU, MLUP = MMLU Pro, O-Avg = original-capability average).
| Model | MB+ | MS | HE+ | HE | C-Avg | ARC | GP | MLU | MLUP | O-Avg | Overall |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama3.1-8B-Ins | 70.4 | 67.7 | 66.5 | 70.7 | 69.1 | 83.4 | 29.9 | 72.4 | 46.7 | 60.5 | 64.8 |
| OpenCoder-8B-Ins | 81.2 | 76.3 | 78.0 | 82.3 | 79.5 | 8.2 | 25.4 | 37.4 | 11.3 | 24.6 | 52.1 |
| Full Param Tune | 75.1 | 69.6 | 71.3 | 76.8 | 73.3 | 24.4 | 21.9 | 43.0 | 19.2 | 31.5 | 52.4 |
| Partial Param Tune | 75.7 | 71.6 | 74.4 | 79.3 | 75.0 | 70.2 | 28.1 | 60.7 | 32.4 | 48.3 | 61.7 |
| Stack Expansion | 77.2 | 72.8 | 73.2 | 78.7 | 75.6 | 80.0 | 26.3 | 66.6 | 38.2 | 54.2 | 64.9 |
| Hybrid Expansion* | 77.5 | 73.5 | 76.2 | 82.3 | 77.1 | 80.9 | 32.6 | 68.1 | 40.3 | 56.0 | 66.6 |
| Control LLM* | 80.4 | 75.9 | 74.4 | 81.1 | 78.3 | 82.5 | 29.7 | 68.2 | 40.9 | 56.3 | 67.3 |
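As a note on the aggregate columns: in every row, the reported Overall score matches the simple mean of C-Avg and O-Avg up to rounding, which suggests Overall averages the coding and original-capability aggregates. A small sketch using the values from the table above (an observed relationship, not an official evaluation script):

```python
# Sanity check: Overall appears to be the mean of C-Avg and O-Avg
# (values copied from the table above; agreement is within rounding).
rows = [
    ("Llama3.1-8B-Ins",    69.1, 60.5, 64.8),
    ("OpenCoder-8B-Ins",   79.5, 24.6, 52.1),
    ("Full Param Tune",    73.3, 31.5, 52.4),
    ("Partial Param Tune", 75.0, 48.3, 61.7),
    ("Stack Expansion",    75.6, 54.2, 64.9),
    ("Hybrid Expansion",   77.1, 56.0, 66.6),
    ("Control LLM",        78.3, 56.3, 67.3),
]
for name, c_avg, o_avg, overall in rows:
    mean = (c_avg + o_avg) / 2
    print(f"{name:20s} mean(C-Avg, O-Avg) = {mean:.2f}  reported Overall = {overall}")
```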
Base model: [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B)