Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window Paper • 2510.08276 • Published 18 days ago • 9 • 2
RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback Paper • 2507.15024 • Published Jul 20 • 14 • 1