AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials • Paper • 2412.09605 • Published Dec 12, 2024 • 30
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction • Paper • 2412.04454 • Published Dec 5, 2024 • 71
ShowUI: One Vision-Language-Action Model for GUI Visual Agent • Paper • 2411.17465 • Published Nov 26, 2024 • 90
Transformers.js demos • Collection • A collection of my favorite WebML demos, built with Transformers.js! • 30 items • Updated Jul 11, 2024 • 127
Health AI Developer Foundations (HAI-DEF) • Collection • Groups models released for use in health AI by Google. Read more about HAI-DEF at https://developers.google.com/health-ai-developer-foundations • 15 items • Updated Jul 10 • 108
Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 544
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub • Paper • 2405.13058 • Published May 20, 2024 • 2
CAT3D: Create Anything in 3D with Multi-View Diffusion Models • Paper • 2405.10314 • Published May 16, 2024 • 48
🐒 Stable Diffusion LoRAs • Collection • Awesome LoRAs found on the hub - using only 🐵 • 7 items • Updated May 5 • 16
📦 3D creation workflow • Collection • Going from a text prompt to a nice 3D model • 3 items • Updated May 5 • 30
Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion • Paper • 2402.10009 • Published Feb 15, 2024 • 22