Collections
Discover the best community collections!
Collections including paper arxiv:2502.07640
-
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99 -
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
Paper • 2502.03544 • Published • 43 -
FoNE: Precise Single-Token Number Embeddings via Fourier Features
Paper • 2502.09741 • Published • 15 -
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers
Paper • 2502.20545 • Published • 22
-
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 140 -
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
Paper • 2409.12191 • Published • 78 -
Expect the Unexpected: FailSafe Long Context QA for Finance
Paper • 2502.06329 • Published • 132 -
Competitive Programming with Large Reasoning Models
Paper • 2502.06807 • Published • 68
-
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 115 -
ProcessBench: Identifying Process Errors in Mathematical Reasoning
Paper • 2412.06559 • Published • 84 -
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
Paper • 2412.15084 • Published • 13 -
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99
-
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper • 2411.02337 • Published • 36 -
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Paper • 2411.04996 • Published • 51 -
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 68 -
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Paper • 2410.08815 • Published • 47
-
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 115 -
ProcessBench: Identifying Process Errors in Mathematical Reasoning
Paper • 2412.06559 • Published • 84 -
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
Paper • 2412.15084 • Published • 13 -
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99
-
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99 -
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
Paper • 2502.03544 • Published • 43 -
FoNE: Precise Single-Token Number Embeddings via Fourier Features
Paper • 2502.09741 • Published • 15 -
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers
Paper • 2502.20545 • Published • 22
-
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper • 2411.02337 • Published • 36 -
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Paper • 2411.04996 • Published • 51 -
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 68 -
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Paper • 2410.08815 • Published • 47
-
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 140 -
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
Paper • 2409.12191 • Published • 78 -
Expect the Unexpected: FailSafe Long Context QA for Finance
Paper • 2502.06329 • Published • 132 -
Competitive Programming with Large Reasoning Models
Paper • 2502.06807 • Published • 68