Andres Marafioti

andito

AI & ML interests

Multimodal models, VLM and TTS

Recent Activity

updated a model about 16 hours ago

andito/nanoVLM

upvoted an article 3 days ago

Building the Open Agent Ecosystem Together: Introducing OpenEnv

upvoted a collection 3 days ago

authored a paper 5 days ago

FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published 7 days ago • 54

authored a paper 5 months ago

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 140

authored 2 papers 7 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 200

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14 • 117

authored a paper 9 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 243

authored 3 papers about 1 year ago

GACELA -- A generative adversarial context encoder for long audio inpainting

Paper • 2005.05032 • Published May 11, 2020

Adversarial Generation of Time-Frequency Features with application in audio synthesis

Paper • 1902.04072 • Published Feb 11, 2019

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 133