SunSwallow

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 17 days ago

SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents

Paper • 2512.22322 • Published 21 days ago • 38

upvoted 2 papers about 2 months ago

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Paper • 2511.22570 • Published Nov 27, 2025 • 88

V-ReasonBench: Toward Unified Reasoning Benchmark Suite for Video Generation Models

Paper • 2511.16668 • Published Nov 20, 2025 • 54

upvoted 3 papers 3 months ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 271

Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning

Paper • 2509.22601 • Published Sep 26, 2025 • 29

Training-Free Group Relative Policy Optimization

Paper • 2510.08191 • Published Oct 9, 2025 • 44

upvoted 2 papers 4 months ago

From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature

Paper • 2509.16591 • Published Sep 20, 2025 • 2

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 228

upvoted a paper 5 months ago

WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7, 2025 • 141

upvoted a collection 5 months ago

OpenMathReasoning

Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated about 7 hours ago • 46

commented on a paper 6 months ago

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26, 2025 • 158