Update README.md
README.md
CHANGED
@@ -6,11 +6,10 @@ license: mit
 </div>

 <div align="center">
-<a href="https://github.com/MoonshotAI/Kimi-Linear/blob/master/tech_report.pdf"><img src="figures/logo.png" height="16" width="16" style="vertical-align:middle"><b> Tech Report</b></a> |
-<a href="https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Instruct"><img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" height="16" width="16" style="vertical-align:middle"><b> HuggingFace</b></a>
+<a href="https://github.com/MoonshotAI/Kimi-Linear/blob/master/tech_report.pdf" ><img src="figures/logo.png" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> Tech Report</b></a> |
+<a href="https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Instruct"><img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> HuggingFace</b></a>
 </div>

-
 <div align="center">
 <img width="90%" src="figures/perf_speed.png">
 <p><em><b>(a)</b> On MMLU-Pro (4k context length), Kimi Linear achieves 51.0 performance with similar speed as full attention. On RULER (128k context length), it shows Pareto-optimal performance (84.3) and 3.98x speedup. <b>(b)</b> Kimi Linear achieves 6.3x faster TPOT compared to MLA, offering significant speedups at long sequence lengths (1M tokens).</em></p>