SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization Paper • 2604.02268 • Published 6 days ago • 88
GEMS: Agent-Native Multimodal Generation with Memory and Skills Paper • 2603.28088 • Published 9 days ago • 84
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 10 days ago • 137
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published 27 days ago • 91
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing Paper • 2603.03143 • Published Mar 3 • 145
CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation Paper • 2603.08652 • Published 30 days ago • 39
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence Paper • 2603.07660 • Published about 1 month ago • 85
CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation Paper • 2603.08652 • Published 30 days ago • 39
MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios Paper • 2602.22638 • Published Feb 26 • 107
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? Paper • 2603.03241 • Published Mar 3 • 87
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published Mar 3 • 102
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation Paper • 2602.03796 • Published Feb 3 • 64
On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models Paper • 2602.03392 • Published Feb 3 • 59
GEBench: Benchmarking Image Generation Models as GUI Environments Paper • 2602.09007 • Published Feb 9 • 39
GEBench: Benchmarking Image Generation Models as GUI Environments Paper • 2602.09007 • Published Feb 9 • 39