Towards General Agentic Intelligence via Environment Scaling Paper • 2509.13311 • Published Sep 16 • 70
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation Paper • 2509.25849 • Published 27 days ago • 46
Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models Paper • 2509.26628 • Published 26 days ago • 14