FastVLM Collection Efficient Vision Encoding for Vision Language Models • 9 items • Updated Sep 2 • 103
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data Jun 3 • 270
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 5 items • Updated Sep 1 • 129