Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence
Abstract
Agent-World is a self-evolving training framework that advances general agent intelligence through autonomous environment discovery and continuous learning across diverse real-world scenarios.
Large language models are increasingly expected to serve as general-purpose agents that interact with external, stateful tool environments. The Model Context Protocol (MCP) and broader agent skills offer a unified interface for connecting agents with scalable real-world services, but training robust agents remains limited by the lack of realistic environments and principled mechanisms for lifelong learning. In this paper, we present Agent-World, a self-evolving training arena for advancing general agent intelligence through scalable environments. Agent-World has two main components: (1) Agentic Environment-Task Discovery, which autonomously explores topic-aligned databases and executable tool ecosystems from thousands of real-world environment themes and synthesizes verifiable tasks with controllable difficulty; and (2) Continuous Self-Evolving Agent Training, which combines multi-environment reinforcement learning with a self-evolving agent arena that automatically identifies capability gaps through dynamic task synthesis and drives targeted learning, enabling the co-evolution of agent policies and environments. Across 23 challenging agent benchmarks, Agent-World-8B and -14B consistently outperform strong proprietary models and environment-scaling baselines. Further analyses reveal scaling trends in relation to environment diversity and self-evolution rounds, offering insights for building general agent intelligence.
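To ground the two components, the sketch below illustrates in plain Python what a synthesized environment could look like: a topic-aligned state database, MCP-style executable tools, and a verifiable task whose success is checked programmatically against the final environment state. All names here (`Tool`, `Environment`, `Task`, the toy booking theme) are hypothetical illustrations, not the paper's released schema.

```python
# Minimal sketch of a synthesized environment plus a verifiable task.
# Class and field names are illustrative assumptions, not the paper's API.
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Tool:
    """An executable tool exposed to the agent (MCP-style)."""
    name: str
    description: str
    run: Callable[..., Any]          # operates on the environment state

@dataclass
class Environment:
    """A topic-aligned environment: a stateful database plus its tools."""
    theme: str                       # e.g. "hotel booking", "git hosting"
    state: dict                      # the environment database
    tools: list[Tool] = field(default_factory=list)

@dataclass
class Task:
    """A verifiable task: instruction, difficulty knob, success checker."""
    instruction: str
    difficulty: int                  # controllable difficulty
    verify: Callable[[dict], bool]   # checks the final environment state

# Toy example: one booking tool and one programmatically checkable task.
def book_room(state: dict, guest: str, room: str) -> str:
    state["bookings"][room] = guest
    return f"booked {room} for {guest}"

env = Environment(
    theme="hotel booking",
    state={"bookings": {}},
    tools=[Tool("book_room", "Reserve a room for a guest",
                lambda guest, room: book_room(env.state, guest, room))],
)
task = Task(
    instruction="Book room 101 for Alice.",
    difficulty=1,
    verify=lambda s: s["bookings"].get("101") == "Alice",
)

env.tools[0].run(guest="Alice", room="101")   # an agent's tool call
assert task.verify(env.state)                 # automatic reward signal
```

Because `verify` inspects only the resulting state, rollouts can be scored automatically, which is what lets synthesized tasks serve as reward signals during reinforcement learning.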
Community
We introduce Agent-World, a general-purpose agent training arena that couples real-world environment synthesis with continuous self-evolving training, forming a closed loop in which agents and environments co-evolve.
It consists of two parts:
(1) Agentic environment–task discovery. A deep-search agent, anchored on real-world environment themes, autonomously mines environment databases from the web, generates executable tools, and synthesizes verifiable tasks.
(2) Continuous self-evolving training. Agents are trained with multi-environment reinforcement learning, while the synthesized environments serve as a training arena that automatically diagnoses capability gaps and targets environment/task expansion, enabling sustained self-evolution (see the sketch below).
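Under the assumption that evaluation, task synthesis, and policy updates are available as black-box stages, this co-evolution loop can be sketched as follows; `evaluate`, `synthesize_tasks`, and `rl_update` are placeholder stubs, not the paper's implementation.

```python
# Minimal sketch of the self-evolving training loop described above.
# The three stubs stand in for evaluation, task synthesis, and RL stages.
import random

def evaluate(agent, env) -> float:
    """Placeholder: success rate of `agent` on tasks sampled from `env`."""
    return random.random()

def synthesize_tasks(env, difficulty: str) -> list:
    """Placeholder: synthesize new verifiable tasks for a weak theme."""
    return [f"{env['theme']}::{difficulty}-task"]

def rl_update(agent, environments, tasks):
    """Placeholder: one multi-environment RL round over the task pool."""
    return agent

def self_evolve(agent, environments, rounds: int, gap_threshold: float = 0.5):
    for _ in range(rounds):
        # 1. Evaluate the current policy on every environment theme.
        scores = {env["theme"]: evaluate(agent, env) for env in environments}
        # 2. Diagnose capability gaps: themes with low success rates.
        gaps = [theme for theme, s in scores.items() if s < gap_threshold]
        # 3. Targeted expansion: synthesize harder tasks for weak themes.
        new_tasks = [task
                     for env in environments if env["theme"] in gaps
                     for task in synthesize_tasks(env, difficulty="harder")]
        # 4. Multi-environment RL so the policy and the arena co-evolve.
        agent = rl_update(agent, environments, new_tasks)
    return agent

agent = self_evolve(agent={"policy": "init"},
                    environments=[{"theme": "hotel booking"},
                                  {"theme": "git hosting"}],
                    rounds=3)
```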
In total, Agent-World builds 1,978 environments and 19,822 tools, with synthesized tasks averaging more than 15 interaction turns.
Across 23 challenging benchmarks (including $\tau^2$-Bench, BFCL V4, MCP-Mark, ClawEval, SkillsBench, etc.), Agent-World-8B/14B consistently outperform existing environment-scaling methods and strong open-source foundation models. Further analyses reveal a clear scaling relationship among environment diversity, self-evolution rounds, and agent performance.
This is an automated message from the Librarian Bot. The following papers, similar to this one, were recommended by the Semantic Scholar API:
- MagicAgent: Towards Generalized Agent Planning (2026)
- Agentic Tool Use in Large Language Models (2026)
- Benchmark Test-Time Scaling of General LLM Agents (2026)
- EigenData: A Self-Evolving Multi-Agent Platform for Function-Calling Data Synthesis, Auditing, and Repair (2026)
- The World Won't Stay Still: Programmable Evolution for Agent Benchmarks (2026)
- SEARL: Joint Optimization of Policy and Tool Graph Memory for Self-Evolving Agents (2026)
- Safe and Scalable Web Agent Learning via Recreated Websites (2026)