OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes • Paper • 2510.26800 • Published 2 days ago • 17
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning • Paper • 2510.23473 • Published 5 days ago • 79
VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning • Paper • 2510.25772 • Published 3 days ago • 32
Uniform Discrete Diffusion with Metric Path for Video Generation • Paper • 2510.24717 • Published 4 days ago • 39
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction • Paper • 2510.22706 • Published 6 days ago • 35
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations • Paper • 2510.23607 • Published 5 days ago • 166
Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation • Paper • 2510.21583 • Published 8 days ago • 30
LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas • Paper • 2510.20820 • Published 9 days ago • 8
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives • Paper • 2510.20822 • Published 9 days ago • 38
Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall • Paper • 2510.19304 • Published 10 days ago • 22
GigaBrain-0: A World Model-Powered Vision-Language-Action Model • Paper • 2510.19430 • Published 10 days ago • 43
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing • Paper • 2510.19808 • Published 10 days ago • 27
Unified Reinforcement and Imitation Learning for Vision-Language Models • Paper • 2510.19307 • Published 10 days ago • 26
LightMem: Lightweight and Efficient Memory-Augmented Generation • Paper • 2510.18866 • Published 11 days ago • 106
ConsistEdit: Highly Consistent and Precise Training-free Visual Editing • Paper • 2510.17803 • Published 12 days ago • 12
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM • Paper • 2510.15870 • Published 15 days ago • 85