TMAS: Scaling Test-Time Compute via Multi-Agent Synergy Paper • 2605.10344 • Published 15 days ago • 49
RLVR Linearity Collection RL training and evaluation datasets, and checkpoints in 'Linear Dynamics in the RLVR Training of Large Language Models' • 3 items • Updated 3 days ago