AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs Paper • 2507.05687 • Published Jul 8 • 27
Perception-Aware Policy Optimization for Multimodal Reasoning Paper • 2507.06448 • Published Jul 8 • 47
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny Paper • 2507.16331 • Published Jul 22 • 20