Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published 6 days ago • 167
E^2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker Paper • 2510.22733 • Published 7 days ago • 31
Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation Paper • 2510.23581 • Published 6 days ago • 41
VITA-E: Natural Embodied Interaction with Concurrent Seeing, Hearing, Speaking, and Acting Paper • 2510.21817 • Published 12 days ago • 41
Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences Paper • 2510.23451 • Published 6 days ago • 26
ACG: Action Coherence Guidance for Flow-based VLA models Paper • 2510.22201 • Published 8 days ago • 36
RobotArena ∞: Scalable Robot Benchmarking via Real-to-Sim Translation Paper • 2510.23571 • Published 6 days ago • 8
LimRank: Less is More for Reasoning-Intensive Information Reranking Paper • 2510.23544 • Published 6 days ago • 8
PixelRefer: A Unified Framework for Spatio-Temporal Object Referring with Arbitrary Granularity Paper • 2510.23603 • Published 6 days ago • 21
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction Paper • 2510.22706 • Published 7 days ago • 36
StreamingVLM: Real-Time Understanding for Infinite Video Streams Paper • 2510.09608 • Published 23 days ago • 49
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 20 days ago • 169
MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech Paper • 2509.25131 • Published Sep 29 • 14
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4 • 130
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens Paper • 2508.01191 • Published Aug 2 • 236