AgentFold: Long-Horizon Web Agents with Proactive Context Management • Paper • 2510.24699 • Published 3 days ago • 60
DeepAgent: A General Reasoning Agent with Scalable Toolsets • Paper • 2510.21618 • Published 7 days ago • 87
Search Self-play: Pushing the Frontier of Agent Capability without Supervision • Paper • 2510.18821 • Published 10 days ago • 15
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset • Paper • 2510.15742 • Published 14 days ago • 49
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints • Paper • 2510.14847 • Published 15 days ago • 55
Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training • Paper • 2510.12586 • Published 17 days ago • 107
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation • Paper • 2510.08673 • Published 22 days ago • 120
CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning • Paper • 2509.20712 • Published Sep 25 • 18
Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets • Paper • 2509.21245 • Published Sep 25 • 36
Seedream 4.0: Toward Next-generation Multimodal Image Generation • Paper • 2509.20427 • Published Sep 24 • 76
S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models • Paper • 2508.12880 • Published Aug 18 • 46
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation • Paper • 2508.07981 • Published Aug 11 • 58
360+x: A Panoptic Multi-modal Scene Understanding Dataset • Paper • 2404.00989 • Published Apr 1, 2024 • 1
Game4Loc: A UAV Geo-Localization Benchmark from Game Data • Paper • 2409.16925 • Published Sep 25, 2024 • 8