SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? Paper • 2507.12415 • Published Jul 16, 2025 • 42
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy Paper • 2507.01352 • Published Jul 2, 2025 • 56
Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework Paper • 2506.02454 • Published Jun 3, 2025 • 7
Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency Paper • 2506.08343 • Published Jun 10, 2025 • 54
view article Article RegMix: Data Mixture as Regression for Language Model Pre-training Jul 11, 2024 • 15
view article Article Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo Dec 23, 2024 • 51
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21, 2024 • 58
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20, 2024 • 45
RegMix: Data Mixture as Regression for Language Model Pre-training Paper • 2407.01492 • Published Jul 1, 2024 • 40