EvoMaster: A Foundational Agent Framework for Building Evolving Autonomous Scientific Agents at Scale Paper • 2604.17406 • Published 3 days ago • 1
Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration Paper • 2604.18131 • Published 2 days ago • 4
WebCompass: Towards Multimodal Web Coding Evaluation for Code Language Models Paper • 2604.18224 • Published 2 days ago • 18
ClawEnvKit: Automatic Environment Generation for Claw-Like Agents Paper • 2604.18543 • Published 2 days ago • 17
MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval Paper • 2604.18584 • Published 2 days ago • 7
PRL-Bench: A Comprehensive Benchmark Evaluating LLMs' Capabilities in Frontier Physics Research Paper • 2604.15411 • Published 6 days ago • 2
VEFX-Bench: A Holistic Benchmark for Generic Video Editing and Visual Effects Paper • 2604.16272 • Published 5 days ago • 1
Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems Paper • 2604.14228 • Published 8 days ago • 22
LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories Paper • 2604.15311 • Published 6 days ago • 11
DR³-Eval: Towards Realistic and Reproducible Deep Research Evaluation Paper • 2604.14683 • Published 6 days ago • 32
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 7 days ago • 103
MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation Paper • 2604.15309 • Published 6 days ago • 6
SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments Paper • 2604.14144 • Published 7 days ago • 62
UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding Paper • 2604.14113 • Published 7 days ago • 10
InfiniteScienceGym: An Unbounded, Procedurally-Generated Benchmark for Scientific Analysis Paper • 2604.13201 • Published 8 days ago • 2
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 7 days ago • 148
VideoFlexTok: Flexible-Length Coarse-to-Fine Video Tokenization Paper • 2604.12887 • Published 8 days ago • 4