V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models • Paper 2308.09300 • Published Aug 18, 2023
BannerAgency: Advertising Banner Design with Multimodal LLM Agents • Paper 2503.11060 • Published Mar 14
DesignLab: Designing Slides Through Iterative Detection and Correction • Paper 2507.17202 • Published Jul 23
Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds • Paper 2204.10688 • Published Apr 22, 2022