Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published 26 days ago • 93
RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing Paper • 2507.20352 • Published Jul 27