facebook/metaclip-h14-fullcc2.5b Zero-Shot Image Classification • 1.0B • Updated Jan 11, 2024 • 16.3k • 44
openai/clip-vit-large-patch14 Zero-Shot Image Classification • 0.4B • Updated Sep 15, 2023 • 9.76M • 1.88k
Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots • 13.6k
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization Paper • 2503.10615 • Published Mar 13 • 17
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training Paper • 2503.08525 • Published Mar 11 • 17
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published Mar 12 • 36
Self-Rewarding Vision-Language Model via Reasoning Decomposition Paper • 2508.19652 • Published Aug 27 • 84
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions Paper • 2509.06951 • Published Sep 8 • 31
Interactive Training: Feedback-Driven Neural Network Optimization Paper • 2510.02297 • Published 24 days ago • 41
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models Paper • 2510.05034 • Published 20 days ago • 45
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation Paper • 2510.00515 • Published 26 days ago • 39
Demystifying Reinforcement Learning in Agentic Reasoning Paper • 2510.11701 • Published 13 days ago • 31