- Article: NVIDIA Releases Nemotron-CC-Math Pre-Training Dataset: A High-Quality, Web-Scale Math Corpus for Pretraining Large Language Models (published Aug 18)
- Paper: NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model (arXiv:2508.14444, published Aug 20)
- Paper: Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset (arXiv:2508.15096, published Aug 20)
- Collection: Nemotron-Pre-Training-Datasets, large-scale pre-training datasets used in the Nemotron family of models (11 items)
- Paper: NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning (arXiv:2504.13941, published Apr 15)
- Paper: LLM Pruning and Distillation in Practice: The Minitron Approach (arXiv:2408.11796, published Aug 21, 2024)
- Collection: Canary, multilingual and multitask speech-to-text models from NVIDIA NeMo (5 items)