AI & ML interests

Merging models

mlabonne posted an update 3 months ago
LiquidAI/LFM2-8B-A1B just dropped!

8.3B params with only 1.5B active/token 🚀

> Quality ≈ 3–4B dense, yet faster than Qwen3-1.7B
> MoE designed to run on phones/laptops (llama.cpp / vLLM)
> Pre-trained on 12T tokens → strong math/code/IF
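The "1.5B active/token" figure comes from mixture-of-experts routing: a small router picks only a few experts per token, so most parameters sit idle on any given forward pass. A stdlib-only toy sketch of top-k routing (expert count, dimensions, and weights here are made up for illustration, not LFM2-8B-A1B's actual config):

```python
import math
import random

random.seed(0)

NUM_EXPERTS = 8   # hypothetical; real MoE models use many more/larger experts
TOP_K = 2         # experts that actually run per token
DIM = 4

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def matvec(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

# Random stand-ins for trained router and expert weights.
router = [[random.gauss(0, 1) for _ in range((DIM))] for _ in range(NUM_EXPERTS)]
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]

def moe_forward(x):
    # Router scores for this token -> keep only the top-k experts.
    scores = softmax(matvec(router, x))
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    # Renormalise the gate weights over the chosen experts.
    z = sum(scores[i] for i in top)
    out = [0.0] * DIM
    for i in top:
        y = matvec(experts[i], x)
        out = [o + (scores[i] / z) * yi for o, yi in zip(out, y)]
    return out, top

out, used = moe_forward([0.5, -1.0, 0.3, 2.0])
print(f"experts used: {sorted(used)} of {NUM_EXPERTS}")
```

Only `TOP_K / NUM_EXPERTS` of the expert parameters run per token, which is why total parameter count and active parameter count can differ so much.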

mlabonne posted an update 3 months ago
โš›๏ธ New drop of tiny task-specific models!

Want to do data extraction, translation, RAG, tool use, or math on a Raspberry Pi? We got you covered! ✅

These tiny models were fine-tuned to perform narrow tasks extremely well, making them competitive with much larger models.

You can deploy them today on-device or even on GPUs for big data operations!

LiquidAI/liquid-nanos-68b98d898414dd94d4d5f99a

mlabonne posted an update 5 months ago
Liquid just released two VLMs with 450M and 1.6B params!

They're super fast and leverage SigLIP2 NaFlex encoders to handle native resolutions without distortion, making them ideal for on-device deployment in constrained environments like phones.

They're available today on Hugging Face, with inference and fine-tuning Colab notebooks.

LiquidAI/LFM2-VL-450M
LiquidAI/LFM2-VL-1.6B
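The "native resolutions without distortion" point is about variable patch grids: instead of squashing every image into a fixed square, a NaFlex-style encoder scales the image to fit a token budget while preserving its aspect ratio. A stdlib-only toy of that idea (patch size, budget, and rounding are illustrative, not SigLIP2's exact scheme):

```python
import math

PATCH = 16        # hypothetical patch size
MAX_TOKENS = 256  # hypothetical vision-token budget

def naflex_grid(w, h, patch=PATCH, max_tokens=MAX_TOKENS):
    # Scale so (w/patch) * (h/patch) <= max_tokens while keeping the
    # aspect ratio, then snap each side to a whole number of patches.
    tokens = (w / patch) * (h / patch)
    scale = min(1.0, math.sqrt(max_tokens / tokens))
    gw = max(1, round(w * scale / patch))
    gh = max(1, round(h * scale / patch))
    return gw, gh

# A small image fits as-is; a large one is scaled, not squashed square.
print(naflex_grid(160, 160))
print(naflex_grid(1920, 1080))
```

The payoff: a 16:9 photo keeps roughly a 16:9 patch grid, so text and fine detail aren't distorted the way they are with fixed-square resizing.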

mlabonne posted an update 6 months ago
LiquidAI open-sources a new generation of edge LLMs! 🥳

Based on a new hybrid architecture, these 350M, 700M, and 1.2B models are both fast and performant, ideal for on-device deployment.

I recommend fine-tuning them to power your next edge application. We already provide Colab notebooks to guide you. More to come soon!

๐Ÿ“ Blog post: https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models
🤗 Models: https://huggingface.co/collections/LiquidAI/lfm2-686d721927015b2ad73eaa38
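A common recipe for fine-tuning small models like these is low-rank adaptation (LoRA): freeze the pretrained weights and learn only a small low-rank update. A stdlib-only toy of the core idea (sizes and values are made up; this is not the notebooks' actual code):

```python
import random

random.seed(0)

DIM, RANK = 6, 2  # hypothetical sizes; real layers are far larger

# Frozen pretrained weight matrix W (random stand-in values).
W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
# Trainable low-rank factors: delta = B @ A, with far fewer parameters.
A = [[random.gauss(0, 0.01) for _ in range(DIM)] for _ in range(RANK)]
B = [[0.0] * RANK for _ in range(DIM)]  # B starts at zero, so delta starts at zero

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(x):
    # y = W x + B (A x): frozen path plus the learned low-rank update.
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + d for b, d in zip(base, delta)]

x = [1.0] * DIM
full = DIM * DIM          # parameters in W
lora = 2 * RANK * DIM     # parameters in A and B
print(f"trainable params: {lora} vs {full}")
```

Because `B` is initialised to zero, the adapted model starts out exactly equal to the base model, and training only ever touches the small `A`/`B` factors.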

mlabonne posted an update 10 months ago
โœ‚๏ธ Gemma 3 Abliterated

I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.

I experimented with different recipes and improved the abliteration technique I wrote about last year.

It's still experimental but the refusal rate is super low in my tests. Enjoy!

mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated
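For context, the core abliteration step removes a "refusal direction" from the model's weights by orthogonal projection, so the edited layer can no longer write along that direction. A stdlib-only toy of that single step (the direction here is random rather than estimated from harmful/harmless prompt activations, and the real recipe chooses which layers to edit):

```python
import random

random.seed(0)

DIM = 5

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = dot(v, v) ** 0.5
    return [x / n for x in v]

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Hypothetical unit "refusal direction" (in practice: difference of mean
# activations on refused vs. answered prompts, normalised).
r = normalize([random.gauss(0, 1) for _ in range(DIM)])

# Stand-in output weight matrix to be edited.
W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]

def ablate(W, r):
    # W' = (I - r r^T) W: every output of W' has zero component along r.
    n = len(r)
    rW = [sum(r[i] * W[i][j] for i in range(n)) for j in range(n)]  # r^T W
    return [[W[i][j] - r[i] * rW[j] for j in range(n)] for i in range(n)]

Wp = ablate(W, r)
x = [random.gauss(0, 1) for _ in range(DIM)]
y = matvec(Wp, x)
print(abs(dot(y, r)))  # ~0: edited weights cannot write along r
```

Doing this across the relevant layers is what suppresses refusals without any further fine-tuning.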

mlabonne posted an update 12 months ago
🆕 LLM Course 2025 edition!

I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.

The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.

I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.

Thanks everyone, hope you'll enjoy it!

💻 LLM Course: https://huggingface.co/blog/mlabonne/llm-course