arxiv:2604.18292

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Published on Apr 20 · Submitted by KABI on Apr 21
#2 Paper of the day
Authors:

Abstract

AI-generated summary: Agent-World introduces a self-evolving training framework that advances general agent intelligence through autonomous environment discovery and continuous learning across diverse real-world scenarios.

Large language models are increasingly expected to serve as general-purpose agents that interact with external, stateful tool environments. The Model Context Protocol (MCP) and broader agent skills offer a unified interface for connecting agents with scalable real-world services, but training robust agents remains limited by the lack of realistic environments and principled mechanisms for lifelong learning. In this paper, we present Agent-World, a self-evolving training arena for advancing general agent intelligence through scalable environments. Agent-World has two main components: (1) Agentic Environment-Task Discovery, which autonomously explores topic-aligned databases and executable tool ecosystems from thousands of real-world environment themes and synthesizes verifiable tasks with controllable difficulty; and (2) Continuous Self-Evolving Agent Training, which combines multi-environment reinforcement learning with a self-evolving agent arena that automatically identifies capability gaps through dynamic task synthesis and drives targeted learning, enabling the co-evolution of agent policies and environments. Across 23 challenging agent benchmarks, Agent-World-8B and -14B consistently outperform strong proprietary models and environment-scaling baselines. Further analyses reveal scaling trends in relation to environment diversity and self-evolution rounds, offering insights for building general agent intelligence.

Community

Paper author · Paper submitter

We introduce Agent-World, a general-purpose agent training arena that couples real-world environment synthesis with continuous self-evolving training, forming a closed loop in which agents and environments co-evolve.


It consists of two parts:

(1) Agentic environment–task discovery. A deep-search agent, anchored on real-world environment themes, autonomously mines environment databases from the web, generates executable tools, and synthesizes verifiable tasks.
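To make the discovery pipeline concrete, here is a minimal Python sketch of how one environment–task discovery pass could be structured: mine a topic-aligned database, wrap it behind executable tools, then keep only tasks that pass a verification check. All names below (mine_web_database, generate_tools, synthesize_task, verify_task, Environment) are illustrative assumptions, not the paper's actual API.

```python
from dataclasses import dataclass, field
import random

# --- hypothetical stubs standing in for the deep-search agent's sub-steps ---

def mine_web_database(theme: str) -> dict:
    """Stub: in the paper, a deep-search agent mines topic-aligned data from the web."""
    return {"theme": theme, "records": [f"{theme}-record-{i}" for i in range(100)]}

def generate_tools(database: dict) -> list:
    """Stub: expose the database through executable, callable tools."""
    return [f"search_{database['theme']}", f"update_{database['theme']}"]

def synthesize_task(database: dict, tools: list, difficulty: float) -> dict:
    """Stub: propose a task whose outcome can be checked against environment state."""
    return {"goal": f"use {random.choice(tools)}", "difficulty": difficulty}

def verify_task(task: dict) -> bool:
    """Stub: keep only tasks with a programmatically checkable solution."""
    return task["difficulty"] <= 1.0


@dataclass
class Environment:
    theme: str
    database: dict = field(default_factory=dict)
    tools: list = field(default_factory=list)
    tasks: list = field(default_factory=list)


def discover_environment(theme: str, difficulty: float, n_tasks: int = 10) -> Environment:
    """One pass of agentic environment-task discovery for a single theme."""
    db = mine_web_database(theme)
    tools = generate_tools(db)
    tasks = [t for t in (synthesize_task(db, tools, difficulty) for _ in range(n_tasks))
             if verify_task(t)]
    return Environment(theme=theme, database=db, tools=tools, tasks=tasks)


if __name__ == "__main__":
    env = discover_environment("flight-booking", difficulty=0.7)
    print(len(env.tools), "tools,", len(env.tasks), "verifiable tasks")
```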


(2) Continuous self-evolving training. Agents are trained with multi-environment reinforcement learning, while the synthesized environments serve as a training arena that automatically diagnoses capability gaps and targets environment/task expansion, enabling sustained self-evolution.
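As a rough illustration (again with hypothetical names, not the paper's implementation), the self-evolving arena can be pictured as a loop that alternates multi-environment RL updates with capability-gap diagnosis and targeted environment/task expansion:

```python
import random

def evaluate(policy: dict, env: dict) -> float:
    """Stub: roll out the current policy in an environment and return a success rate."""
    return random.random()

def rl_update(policy: dict, envs: list) -> dict:
    """Stub: one multi-environment reinforcement-learning update of the agent policy."""
    policy = dict(policy)
    policy["steps"] = policy.get("steps", 0) + 1
    return policy

def expand_environment(env: dict) -> dict:
    """Stub: synthesize harder or more diverse tasks for an environment the agent fails in."""
    harder = dict(env)
    harder["difficulty"] = env.get("difficulty", 0.5) + 0.1
    return harder


def self_evolve(policy: dict, envs: list, rounds: int = 3, gap_threshold: float = 0.6):
    """Alternate multi-environment RL with capability-gap-driven arena expansion."""
    for r in range(rounds):
        # 1. Train the agent policy on the current pool of environments.
        policy = rl_update(policy, envs)

        # 2. Diagnose capability gaps: environments where success stays low.
        scores = {i: evaluate(policy, env) for i, env in enumerate(envs)}
        gaps = [i for i, s in scores.items() if s < gap_threshold]

        # 3. Expand the arena around those gaps so the next round targets them.
        envs += [expand_environment(envs[i]) for i in gaps]
        print(f"round {r}: {len(gaps)} gaps found, arena size now {len(envs)}")

    return policy, envs


if __name__ == "__main__":
    policy, arena = self_evolve({"name": "agent-world-8b"},
                                [{"theme": "crm", "difficulty": 0.5},
                                 {"theme": "travel", "difficulty": 0.5}])
```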


In total, Agent-World builds 1,978 environments and 19,822 tools, with synthesized tasks averaging more than 15 interaction turns.


Across 23 challenging benchmarks (including $\tau^2$-Bench, BFCL V4, MCP-Mark, ClawEval, SkillsBench, etc.), Agent-World-8B/14B consistently outperform existing environment-scaling methods and strong open-source foundation models. Further analyses reveal a clear scaling relationship among environment diversity, self-evolution rounds, and agent performance.


Paper author · Paper submitter

Check out our Demo here:

https://agent-tars-world.github.io/-/


