Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26, 2025 • 59
Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment Paper • 2505.11821 • Published May 17, 2025 • 14