Does your data spark joy? Performance gains from domain upsampling at the end of training Paper • 2406.03476 • Published Jun 5, 2024 • 4
Changelog • Inference Providers now fully support OpenAI-compatible API • Jul 18, 2025 • 95
Changelog • Introducing HF Jobs: Run scalable compute jobs on Hugging Face • Jul 30, 2025 • 200
Article • Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face • Jul 29, 2025 • 206
Article • Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders • Jul 9, 2025 • 747
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 75
Article • SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data • Jun 3, 2025 • 305
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23, 2025 • 81
Changelog • Xet is now the default storage option for new users and organizations • May 23, 2025 • 74
Article • LeRobot Community Datasets: The “ImageNet” of Robotics — When and How? • May 11, 2025 • 88