TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 8 days ago • 58
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published 7 days ago • 20
CaptionQA: Is Your Caption as Useful as the Image Itself? Paper • 2511.21025 • Published about 1 month ago • 27
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published 29 days ago • 79
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 29 days ago • 212
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published Nov 12 • 76
Revisiting Multimodal Positional Encoding in Vision-Language Models Paper • 2510.23095 • Published Oct 27 • 20
ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation Paper • 2511.01163 • Published Nov 3 • 31