TIPS: Turn-Level Information-Potential Reward Shaping for Search-Augmented LLMs Paper • 2603.22293 • Published Mar 11 • 1
IsoCompute Playbook: Optimally Scaling Sampling Compute for LLM RL Paper • 2603.12151 • Published Mar 12 • 2