The Ultra-Scale Playbook: The ultimate guide to training LLM on large GPU Clusters • Running • 3.6k
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published Jul 14 • 89
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20, 2024 • 45
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17, 2024 • 52
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published May 1, 2024 • 28