Plan-X: Instruct Video Generation via Semantic Planning • Paper • 2511.17986 • Published Nov 22, 2025 • 16
FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation • Paper • 2509.25187 • Published Sep 29, 2025 • 2
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback • Paper • 2510.16888 • Published Oct 19, 2025 • 21
BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration • Paper • 2510.00438 • Published Oct 1, 2025 • 8
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling • Paper • 2509.12201 • Published Sep 15, 2025 • 105
CineScale: Free Lunch in High-Resolution Cinematic Visual Generation • Paper • 2508.15774 • Published Aug 21, 2025 • 20
Wan-S2V: Audio-Driven Cinematic Video Generation • Paper • 2508.18621 • Published Aug 26, 2025 • 20
Accelerate High-Quality Diffusion Models with Inner Loop Feedback • Paper • 2501.13107 • Published Jan 22, 2025 • 2
Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity • Paper • 2502.01776 • Published Feb 3, 2025 • 3
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer • Paper • 2501.18427 • Published Jan 30, 2025 • 23
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale • Paper • 2508.10711 • Published Aug 14, 2025 • 145
HPSv3: Towards Wide-Spectrum Human Preference Score • Paper • 2508.03789 • Published Aug 5, 2025 • 19
Captain Cinema: Towards Short Movie Generation • Paper • 2507.18634 • Published Jul 24, 2025 • 41
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning • Paper • 2507.13348 • Published Jul 17, 2025 • 77
Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective • Paper • 2507.08801 • Published Jul 11, 2025 • 30
SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation • Paper • 2507.09862 • Published Jul 14, 2025 • 49
Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation • Paper • 2507.05963 • Published Jul 8, 2025 • 12