- VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks
  Paper • 2504.05118 • Published • 26
- T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models
  Paper • 2504.04718 • Published • 42
- SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement
  Paper • 2504.03561 • Published • 18
- Concept Lancet: Image Editing with Compositional Representation Transplant
  Paper • 2504.02828 • Published • 16

Collections
Collections including paper arxiv:2504.03553 
						
					
- Agentic Knowledgeable Self-awareness
  Paper • 2504.03553 • Published • 27
- Benchmarking LLMs' Swarm intelligence
  Paper • 2505.04364 • Published • 20
- Multi-Agent System for Comprehensive Soccer Understanding
  Paper • 2505.03735 • Published • 25
- LIMI: Less is More for Agency
  Paper • 2509.17567 • Published • 100

- R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization
  Paper • 2503.10615 • Published • 17
- UniGoal: Towards Universal Zero-shot Goal-oriented Navigation
  Paper • 2503.10630 • Published • 6
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 36
- LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
  Paper • 2503.07536 • Published • 88

- MLGym: A New Framework and Benchmark for Advancing AI Research Agents
  Paper • 2502.14499 • Published • 192
- SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
  Paper • 2502.14739 • Published • 104
- How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?
  Paper • 2502.14502 • Published • 91
- PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC
  Paper • 2502.14282 • Published • 29

- SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
  Paper • 2412.11605 • Published • 18
- Byte Latent Transformer: Patches Scale Better Than Tokens
  Paper • 2412.09871 • Published • 108
- Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
  Paper • 2412.17739 • Published • 41
- SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
  Paper • 2412.15443 • Published • 10

- Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
  Paper • 2503.24290 • Published • 62
- I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
  Paper • 2503.18878 • Published • 119
- START: Self-taught Reasoner with Tools
  Paper • 2503.04625 • Published • 113
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 141

- Visual-RFT: Visual Reinforcement Fine-Tuning
  Paper • 2503.01785 • Published • 84
- When an LLM is apprehensive about its answers -- and when its uncertainty is justified
  Paper • 2503.01688 • Published • 21
- Predictive Data Selection: The Data That Predicts Is the Data That Teaches
  Paper • 2503.00808 • Published • 56
- Chain of Draft: Thinking Faster by Writing Less
  Paper • 2502.18600 • Published • 49

- LinFusion: 1 GPU, 1 Minute, 16K Image
  Paper • 2409.02097 • Published • 34
- Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
  Paper • 2409.11406 • Published • 27
- Diffusion Models Are Real-Time Game Engines
  Paper • 2408.14837 • Published • 126
- Segment Anything with Multiple Modalities
  Paper • 2408.09085 • Published • 22