Collections
Collections including paper arxiv:2503.19786
- Gemma 3 Technical Report
  Paper • 2503.19786 • Published • 54
- Kimi-VL Technical Report
  Paper • 2504.07491 • Published • 132
- InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
  Paper • 2504.10479 • Published • 298
- FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding
  Paper • 2504.09925 • Published • 38

- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
  Paper • 2508.06471 • Published • 186
- NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
  Paper • 2508.14444 • Published • 36
- Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
  Paper • 2507.06261 • Published • 63
- MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
  Paper • 2506.13585 • Published • 267

- Reinforcement Learning: An Overview
  Paper • 2412.05265 • Published • 8
- Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis
  Paper • 2411.01156 • Published • 10
- VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness
  Paper • 2503.21755 • Published • 33
- Qwen2.5-Omni Technical Report
  Paper • 2503.20215 • Published • 166

- MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
  Paper • 2501.02955 • Published • 44
- 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
  Paper • 2501.00958 • Published • 107
- MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
  Paper • 2501.12380 • Published • 85
- VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
  Paper • 2501.09781 • Published • 28