- Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI
  Paper • 2505.19443 • Published • 15
- Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
  Paper • 2506.19290 • Published • 52
- CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks
  Paper • 2105.12655 • Published
- StarCoder 2 and The Stack v2: The Next Generation
  Paper • 2402.19173 • Published • 148

Collections including paper arxiv:2510.08697
- BigCodeArena
  🚀 Compare two AI models by sending them code and seeing their responses • 33
- BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
  Paper • 2510.08697 • Published • 33
- bigcode/bigcodearena-raw-14k
  Viewer • Updated • 14.1k • 65 • 1
- bigcode/bigcodearena-preference-5k
  Viewer • Updated • 4.73k • 70

- lusxvr/nanoVLM-222M
  Image-Text-to-Text • 0.2B • Updated • 266 • 96
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 36
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 97
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 88

- THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning
  Paper • 2509.13761 • Published • 16
- Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
  Paper • 2509.25849 • Published • 47
- Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models
  Paper • 2510.03561 • Published • 23
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 461

- The Debugging Decay Index: Rethinking Debugging Strategies for Code LLMs
  Paper • 2506.18403 • Published • 3
- ReCode: Updating Code API Knowledge with Reinforcement Learning
  Paper • 2506.20495 • Published • 9
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
  Paper • 2507.23348 • Published • 11
- LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering
  Paper • 2509.09614 • Published • 7