Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models Paper • 2508.10751 • Published Aug 14, 2025 • 29
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published 19 days ago • 211 • 8
MMGenBench: Fully Automatically Evaluating LMMs from the Text-to-Image Generation Perspective Paper • 2411.14062 • Published Nov 21, 2024 • 1
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published 19 days ago • 272
Adaptive Batch-Wise Sample Scheduling for Direct Preference Optimization Paper • 2506.17252 • Published Jun 8, 2025 • 2
TTCS: Test-Time Curriculum Synthesis for Self-Evolving Paper • 2601.22628 • Published 29 days ago • 35