✴️ ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use • Ziyang • Jan 3, 2025 • 28
SmolVLM2: Bringing Video Understanding to Every Device • orrzohar, mfarre, andito, merve, pcuenq, cyrilzakka, Xenova • Feb 20, 2025 • 343
Mixture of Experts Explained • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.15k
We now support VLMs in smolagents! • m-ric, merve, albertvillanova • Jan 24, 2025 • 114
Zero-shot image-to-text generation with BLIP-2 • MariaK, JunnanLi • Feb 15, 2023 • 28
Blazingly fast whisper transcriptions with Inference Endpoints • mfuntowicz, freddyaboulton, Steveeeeeeen, reach-vb, erikkaum, michellehbn • May 13, 2025 • 82