DriveGen3D: Boosting Feed-Forward Driving Scene Generation with Efficient Video Diffusion • Paper • 2510.15264 • Published Oct 17, 2025 • 1
EgoLCD: Egocentric Video Generation with Long Context Diffusion • Paper • 2512.04515 • Published 28 days ago • 5
BlockVid: Block Diffusion for High-Quality and Consistent Minute-Long Video Generation • Paper • 2511.22973 • Published Nov 28, 2025 • 4
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots • Paper • 2511.17889 • Published Nov 22, 2025 • 5
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation • Paper • 2511.20714 • Published Nov 25, 2025 • 47
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models • Paper • 2510.01623 • Published Oct 2, 2025 • 10
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction • Paper • 2509.19297 • Published Sep 23, 2025 • 24
FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion • Paper • 2506.04648 • Published Jun 5, 2025 • 1
StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes • Paper • 2509.16415 • Published Sep 19, 2025 • 2
ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS • Paper • 2505.23734 • Published May 29, 2025 • 4
Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting • Paper • 2506.05327 • Published Jun 5, 2025 • 11
SSS: Semi-Supervised SAM-2 with Efficient Prompting for Medical Imaging Segmentation • Paper • 2506.08949 • Published Jun 10, 2025
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding • Paper • 2507.23478 • Published Jul 31, 2025 • 15
ReMoMask: Retrieval-Augmented Masked Motion Generation • Paper • 2508.02605 • Published Aug 4, 2025 • 4
PresentAgent: Multimodal Agent for Presentation Video Generation • Paper • 2507.04036 • Published Jul 5, 2025 • 10
MediAug: Exploring Visual Augmentation in Medical Imaging • Paper • 2504.18983 • Published Apr 26, 2025 • 7