Article We Got Claude to Fine-Tune an Open Source LLM burtenshaw, evalstate • Dec 4, 2025 • 627
Article Performant local mixture-of-experts CPU inference with GPU acceleration in llama.cpp Doctor-Shotgun • Jan 30 • 27
Article Small Language Models (SLM): A Comprehensive Overview jjokah • Feb 22, 2025 • 155
Article Scaling Real-Time Voice Agents with Cache-Aware Streaming ASR nvidia • Jan 5 • 86
Article The Great Classification Showdown: OSS vs BERT on Consumer Hardware BenTouss • Jan 26 • 12
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs Paper • 2601.17058 • Published Jan 22 • 190
Scaling Embeddings Outperforms Scaling Experts in Language Models Paper • 2601.21204 • Published Jan 29 • 104
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security Paper • 2601.18491 • Published Jan 26 • 125
Article We Got Claude to Build CUDA Kernels and teach open models! burtenshaw, evalstate, merve, pcuenq • Jan 28 • 156
Load 4bit models 4x faster Collection Native bitsandbytes 4bit pre quantized models • 25 items • Updated Apr 22 • 61
Embedding Models Collection Run or fine-tune embedding models with Unsloth. • 14 items • Updated Apr 22 • 6