DPO
RLLab/allenai-Dolci-Instruct-DPO-Length-Filtered Viewer • Updated 21 days ago • 146k • 39
RLLab/olmo-3-7b-it-sft Text Generation • 7B • Updated Dec 18, 2025 • 1.24k
allenai/Dolci-Instruct-SFT-No-Tools Viewer • Updated Jan 5 • 1.92M • 204 • 4
RLLab/gemma-3-4b-text-sft Text Generation • 4B • Updated 22 days ago • 97
RL-Dataset
open-r1/DAPO-Math-17k-Processed Viewer • Updated Nov 10, 2025 • 34.8k • 5.31k • 62
DigitalLearningGmbH/MATH-lighteval Viewer • Updated Jan 15, 2025 • 25k • 29.1k • 64
POLARIS-Project/Polaris-Dataset-53K Viewer • Updated Jun 18, 2025 • 53.3k • 775 • 34
RLLab/math-rl Viewer • Updated Nov 25, 2025 • 57.5k • 9