Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2602.09856

Interesting paper

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 200

SWE-Universe: Scale Real-World Verifiable Environments to Millions

Paper • 2602.02361 • Published Feb 2 • 60
LongCodeZip: Compress Long Context for Code Language Models

Paper • 2510.00446 • Published Oct 1, 2025 • 107
Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 200
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Paper • 2601.11868 • Published Jan 17 • 34

THINKSAFE: Self-Generated Safety Alignment for Reasoning Models

Paper • 2601.23143 • Published Jan 30 • 39
PaperBanana: Automating Academic Illustration for AI Scientists

Paper • 2601.23265 • Published Jan 30 • 218
Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 201
BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published Jan 10 • 200

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 106
MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published Dec 16, 2025 • 119
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Paper • 2512.23447 • Published Dec 29, 2025 • 98
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published Dec 29, 2025 • 65

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2, 2025 • 9
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Paper • 2504.08600 • Published Apr 11, 2025 • 33
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL

Paper • 2503.23157 • Published Mar 29, 2025 • 10
AI Agents: Evolution, Architecture, and Real-World Applications

Paper • 2503.12687 • Published Mar 16, 2025 • 2

maybe there is something to it

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 200
How2Everything: Mining the Web for How-To Procedures to Evaluate and Improve LLMs

Paper • 2602.08808 • Published Feb 9 • 8
Thinking Makes LLM Agents Introverted: How Mandatory Thinking Can Backfire in User-Engaged Agents

Paper • 2602.07796 • Published Feb 8 • 7
QP-OneModel: A Unified Generative LLM for Multi-Task Query Understanding in Xiaohongshu Search

Paper • 2602.09901 • Published Feb 10 • 6

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 200
UI-Venus-1.5 Technical Report

Paper • 2602.09082 • Published Feb 9 • 155

Towards Pixel-Level VLM Perception via Simple Points Prediction

Paper • 2601.19228 • Published Jan 27 • 18
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Paper • 2601.19895 • Published Jan 27 • 25
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision

Paper • 2601.19798 • Published Jan 27 • 42
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models

Paper • 2601.21639 • Published Jan 29 • 50

The Debugging Decay Index: Rethinking Debugging Strategies for Code LLMs

Paper • 2506.18403 • Published Jun 23, 2025 • 3
ReCode: Updating Code API Knowledge with Reinforcement Learning

Paper • 2506.20495 • Published Jun 25, 2025 • 10
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution

Paper • 2507.23348 • Published Jul 31, 2025 • 12
LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering

Paper • 2509.09614 • Published Sep 11, 2025 • 7

Reasoning Models

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30, 2025 • 61
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!

Paper • 2502.07374 • Published Feb 11, 2025 • 40
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10, 2025 • 152
S*: Test Time Scaling for Code Generation

Paper • 2502.14382 • Published Feb 20, 2025 • 63

Interesting paper

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 200

maybe there is something to it

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 200
How2Everything: Mining the Web for How-To Procedures to Evaluate and Improve LLMs

Paper • 2602.08808 • Published Feb 9 • 8
Thinking Makes LLM Agents Introverted: How Mandatory Thinking Can Backfire in User-Engaged Agents

Paper • 2602.07796 • Published Feb 8 • 7
QP-OneModel: A Unified Generative LLM for Multi-Task Query Understanding in Xiaohongshu Search

Paper • 2602.09901 • Published Feb 10 • 6

SWE-Universe: Scale Real-World Verifiable Environments to Millions

Paper • 2602.02361 • Published Feb 2 • 60
LongCodeZip: Compress Long Context for Code Language Models

Paper • 2510.00446 • Published Oct 1, 2025 • 107
Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 200
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Paper • 2601.11868 • Published Jan 17 • 34

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 200
UI-Venus-1.5 Technical Report

Paper • 2602.09082 • Published Feb 9 • 155

THINKSAFE: Self-Generated Safety Alignment for Reasoning Models

Paper • 2601.23143 • Published Jan 30 • 39
PaperBanana: Automating Academic Illustration for AI Scientists

Paper • 2601.23265 • Published Jan 30 • 218
Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 201
BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published Jan 10 • 200

Towards Pixel-Level VLM Perception via Simple Points Prediction

Paper • 2601.19228 • Published Jan 27 • 18
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Paper • 2601.19895 • Published Jan 27 • 25
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision

Paper • 2601.19798 • Published Jan 27 • 42
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models

Paper • 2601.21639 • Published Jan 29 • 50

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 106
MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published Dec 16, 2025 • 119
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Paper • 2512.23447 • Published Dec 29, 2025 • 98
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published Dec 29, 2025 • 65

The Debugging Decay Index: Rethinking Debugging Strategies for Code LLMs

Paper • 2506.18403 • Published Jun 23, 2025 • 3
ReCode: Updating Code API Knowledge with Reinforcement Learning

Paper • 2506.20495 • Published Jun 25, 2025 • 10
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution

Paper • 2507.23348 • Published Jul 31, 2025 • 12
LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering

Paper • 2509.09614 • Published Sep 11, 2025 • 7

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2, 2025 • 9
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Paper • 2504.08600 • Published Apr 11, 2025 • 33
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL

Paper • 2503.23157 • Published Mar 29, 2025 • 10
AI Agents: Evolution, Architecture, and Real-World Applications

Paper • 2503.12687 • Published Mar 16, 2025 • 2

Reasoning Models

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30, 2025 • 61
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!

Paper • 2502.07374 • Published Feb 11, 2025 • 40
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10, 2025 • 152
S*: Test Time Scaling for Code Generation

Paper • 2502.14382 • Published Feb 20, 2025 • 63

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs