RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling • Paper • 2510.20206 • Published 5 days ago • 11
Video-As-Prompt: Unified Semantic Control for Video Generation • Paper • 2510.20888 • Published 4 days ago • 33
LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas • Paper • 2510.20820 • Published 4 days ago • 7
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives • Paper • 2510.20822 • Published 4 days ago • 35
Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence • Paper • 2510.20579 • Published 4 days ago • 48
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing • Paper • 2510.19808 • Published 5 days ago • 23
GigaBrain-0: A World Model-Powered Vision-Language-Action Model • Paper • 2510.19430 • Published 5 days ago • 39
LightMem: Lightweight and Efficient Memory-Augmented Generation • Paper • 2510.18866 • Published 6 days ago • 104
UltraGen: High-Resolution Video Generation with Hierarchical Attention • Paper • 2510.18775 • Published 6 days ago • 15
MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models • Paper • 2510.17519 • Published 7 days ago • 9
MoGA: Mixture-of-Groups Attention for End-to-End Long Video Generation • Paper • 2510.18692 • Published 6 days ago • 38
ConsistEdit: Highly Consistent and Precise Training-free Visual Editing • Paper • 2510.17803 • Published 7 days ago • 12
Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling • Paper • 2510.16751 • Published 9 days ago • 19
Glyph: Scaling Context Windows via Visual-Text Compression • Paper • 2510.17800 • Published 7 days ago • 61
Latent Diffusion Model without Variational Autoencoder • Paper • 2510.15301 • Published 11 days ago • 47
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks • Paper • 2510.15019 • Published 11 days ago • 56
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning • Paper • 2510.15444 • Published 11 days ago • 139
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset • Paper • 2510.15742 • Published 10 days ago • 49
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints • Paper • 2510.14847 • Published 11 days ago • 55