MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models
Paper • 2603.25744 • Published • 12
See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models