DPO
RLLab/allenai-Dolci-Instruct-DPO-Length-Filtered Viewer • Updated Mar 1 • 146k • 5
RLLab/olmo-3-7b-it-sft Text Generation • 7B • Updated Dec 18, 2025 • 3
allenai/Dolci-Instruct-SFT-No-Tools Viewer • Updated Jan 5 • 1.92M • 375 • 4
RLLab/gemma-3-4b-text-sft Text Generation • 4B • Updated Feb 28 • 24
RL-Dataset
open-r1/DAPO-Math-17k-Processed Viewer • Updated Nov 10, 2025 • 34.8k • 7.15k • 67
DigitalLearningGmbH/MATH-lighteval Viewer • Updated Jan 15, 2025 • 25k • 34.8k • 65
POLARIS-Project/Polaris-Dataset-53K Viewer • Updated Jun 18, 2025 • 53.3k • 2.25k • 37
RLLab/math-rl Viewer • Updated Nov 25, 2025 • 57.5k • 7