Towards Interactive Intelligence for Digital Humans Paper • 2512.13674 • Published about 10 hours ago
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management Paper • 2512.12967 • Published 1 day ago • 14
Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans? Paper • 2512.13281 • Published about 17 hours ago • 2
Exploring MLLM-Diffusion Information Transfer with MetaCanvas Paper • 2512.11464 • Published 4 days ago • 8
SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder Paper • 2512.11749 • Published 3 days ago • 33
PersonaLive! Expressive Portrait Image Animation for Live Streaming Paper • 2512.11253 • Published 4 days ago • 15
The N-Body Problem: Parallel Execution from Single-Person Egocentric Video Paper • 2512.11393 • Published 4 days ago • 1
V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties Paper • 2512.11799 • Published 3 days ago • 25
MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos Paper • 2512.10881 • Published 4 days ago • 26
Confucius Code Agent: An Open-sourced AI Software Engineer at Industrial Scale Paper • 2512.10398 • Published 5 days ago • 4
Evaluating Gemini Robotics Policies in a Veo World Simulator Paper • 2512.10675 • Published 5 days ago • 12
The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality Paper • 2512.10791 • Published 5 days ago • 4
UniUGP: Unifying Understanding, Generation, and Planing For End-to-end Autonomous Driving Paper • 2512.09864 • Published 5 days ago • 10
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation Paper • 2512.09363 • Published 6 days ago • 67
OmniPSD: Layered PSD Generation with Diffusion Transformer Paper • 2512.09247 • Published 6 days ago • 43
Learning Unmasking Policies for Diffusion Language Models Paper • 2512.09106 • Published 6 days ago • 6