Update README.md
README.md
CHANGED
|
@@ -8,9 +8,15 @@ language:
|
|
| 8 |
|
| 9 |
# Introduction
|
| 10 |
|
| 11 |
-
We announce **Motif 2.6B**, a 2.6 billion parameter language model trained from scratch on AMD Instinct™ MI250X GPUs. Motif 2.6B marks our very first step toward building helpful, reliable AI aligned with human values.
|
| 12 |
|
| 13 |
-
|
| 14 |
|
| 15 |
# Evaluation
|
| 16 |
|
|
@@ -168,9 +174,7 @@ The benchmarks and corresponding scores listed in the table below are taken dire
|
|
| 168 |
|
| 169 |
## Evaluation Appendix
|
| 170 |
|
| 171 |
-
In the comparisons presented above, Motif 2.6B showed average performance improvements of -15.36% and -14.78% over Llama 3 8B and Gemma 2 9B, respectively, based on the benchmark scores reported in their original technical reports.
|
| 172 |
-
|
| 173 |
-
However, when compared against the benchmarks and scores reported in the Qwen 2.5 technical report, Motif 2.6B demonstrated a +18.55% average improvement over Llama 3 8B and a +2.63% improvement over Gemma 2 9B. See the table below for details.
|
| 174 |
|
| 175 |
### Comparison to Llama 3 8B and Gemma 2 9B based on scores from the *Qwen2.5 technical report*
|
| 176 |
The benchmarks and corresponding scores listed in the table below are taken directly from the [Qwen2.5 technical report](https://arxiv.org/abs/2412.15115).
|
|
|
|
| 8 |
|
| 9 |
# Introduction
|
| 10 |
|
| 11 |
+
We announce **Motif 2.6B**, a 2.6 billion parameter language model trained from scratch on AMD Instinct™ MI250X GPUs. Motif 2.6B marks our very first step toward building helpful, reliable AI aligned with human values. With this initial release, our goal is for Motif 2.6B to match the performance of well-known open-source models such as Phi, Llama, and Qwen — particularly those in the sLLM regime.
|
| 12 |
|
| 13 |
+
# Training information
|
| 14 |
+
|
| 15 |
+
- GPUs: 384 MI250X
|
| 16 |
+
- Training time: 42 days
|
| 17 |
+
- Training data: 2.4T tokens
|
| 18 |
+
|
| 19 |
+
**A detailed technical report will be released at a later time.**
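
As a rough sanity check on the training figures listed above, the sketch below derives a few aggregate quantities from them. It assumes (this is not stated in the README) that all 384 GPUs ran continuously for the full 42 days; the function name `training_budget` and the derived metrics are illustrative only and are not figures reported by the authors.

```python
# Figures stated in the README.
NUM_GPUS = 384
TRAINING_DAYS = 42
TRAINING_TOKENS = 2.4e12   # 2.4T tokens
MODEL_PARAMS = 2.6e9       # 2.6B parameters

# Assumption (not in the README): every GPU was busy for the whole period.
HOURS_PER_DAY = 24
SECONDS_PER_DAY = 86_400


def training_budget(num_gpus, days, tokens, params):
    """Derive coarse aggregate quantities from the headline training figures."""
    gpu_hours = num_gpus * days * HOURS_PER_DAY
    total_seconds = days * SECONDS_PER_DAY
    return {
        "gpu_hours": gpu_hours,
        "tokens_per_parameter": tokens / params,
        "tokens_per_second": tokens / total_seconds,
        "tokens_per_gpu_second": tokens / (total_seconds * num_gpus),
    }


budget = training_budget(NUM_GPUS, TRAINING_DAYS, TRAINING_TOKENS, MODEL_PARAMS)
for name, value in budget.items():
    print(f"{name}: {value:,.1f}")
```

These are upper-bound style estimates: any downtime, restarts, or partial use of the cluster would lower the GPU-hours actually spent and raise the effective per-GPU throughput.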
|
| 20 |
|
| 21 |
# Evaluation
|
| 22 |
|
|
|
|
| 174 |
|
| 175 |
## Evaluation Appendix
|
| 176 |
|
| 177 |
+
In the comparisons presented above, Motif 2.6B shows average performance differences of -15.36% and -14.78% relative to Llama 3 8B and Gemma 2 9B, respectively, based on the benchmark scores reported in their original technical reports. However, when compared against the benchmarks and scores reported in the Qwen2.5 technical report, Motif 2.6B shows an average improvement of +18.55% over Llama 3 8B and +2.63% over Gemma 2 9B. See the table below for details.
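
The README does not say how the average percentages are computed. One plausible reading, shown here purely as an assumption, is the mean of per-benchmark relative differences against the baseline. The function name `average_relative_improvement`, the benchmark names, and the scores below are hypothetical placeholders, not values from any technical report.

```python
def average_relative_improvement(model_scores, baseline_scores):
    """Mean of per-benchmark relative differences, in percent.

    Assumed method: ((model - baseline) / baseline) * 100, averaged over
    the benchmarks that both score tables have in common.
    """
    shared = [name for name in model_scores if name in baseline_scores]
    if not shared:
        raise ValueError("no benchmarks in common")
    diffs = [
        (model_scores[name] - baseline_scores[name]) / baseline_scores[name] * 100.0
        for name in shared
    ]
    return sum(diffs) / len(diffs)


# Hypothetical placeholder scores, for illustration only.
model = {"bench_a": 60.0, "bench_b": 45.0}
baseline = {"bench_a": 50.0, "bench_b": 50.0}

result = average_relative_improvement(model, baseline)
print(f"average relative difference: {result:+.2f}%")
```

Under this reading, a negative result means the model scores lower than the baseline on average. It also shows why the sign can flip between the two comparisons: the baseline scores differ depending on which report they are taken from.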
|
| 178 |
|
| 179 |
### Comparison to Llama 3 8B and Gemma 2 9B based on scores from the *Qwen2.5 technical report*
|
| 180 |
The benchmarks and corresponding scores listed in the table below are taken directly from the [Qwen2.5 technical report](https://arxiv.org/abs/2412.15115).
|