OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published 16 days ago • 85
Has GPT-5 Achieved Spatial Intelligence? An Empirical Study Paper • 2508.13142 • Published Aug 18 • 34
StyleMM: Stylized 3D Morphable Face Model via Text-Driven Aligned Image Translation Paper • 2508.11203 • Published Aug 15 • 10
FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation Paper • 2508.11255 • Published Aug 15 • 11
TexVerse: A Universe of 3D Objects with High-Resolution Textures Paper • 2508.10868 • Published Aug 14 • 17
SPARSE Data, Rich Results: Few-Shot Semi-Supervised Learning via Class-Conditioned Image Translation Paper • 2508.06429 • Published Aug 8 • 2
MAESTRO: Masked AutoEncoders for Multimodal, Multitemporal, and Multispectral Earth Observation Data Paper • 2508.10894 • Published Aug 14 • 6