Visual Representation Alignment for Multimodal Large Language Models Paper • 2509.07979 • Published Sep 9 • 83
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion Paper • 2507.02813 • Published Jul 3 • 60
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention Paper • 2507.17745 • Published Jul 23 • 35
The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner Paper • 2507.13332 • Published Jul 17 • 48
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation Paper • 2507.10524 • Published Jul 14 • 70
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models Paper • 2507.13344 • Published Jul 17 • 57
A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning Paper • 2507.08267 • Published Jul 11 • 10
DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation Paper • 2310.13119 • Published Oct 19, 2023 • 13
Drag GAN - Interactive Point-based Manipulation on the Generative Image Manifold Article • Dec 17, 2023 • 3
GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers Paper • 2409.04196 • Published Sep 6, 2024 • 16
SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement Paper • 2408.00653 • Published Aug 1, 2024 • 32
Collaborative Control for Geometry-Conditioned PBR Image Generation Paper • 2402.05919 • Published Feb 8, 2024 • 6
Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality Article • Jun 24, 2024 • 34