DreamFoley: Scalable VLMs for High-Fidelity Video-to-Audio Generation • Paper • 2512.06022 • Published 24 days ago • 3
EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture • Paper • 2512.04810 • Published 24 days ago • 25