Checkpoints "Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning" arxiv [2509.22601]
Yulei Qin
yolay
AI & ML interests
Medical Imaging, Computer Vision, Language Models
Recent Activity
Updated a model 12 days ago: yolay/SPEAR-ReTool-Qwen2.5-32B
Updated a model 12 days ago: yolay/SPEAR-ReTool-Qwen3-32B
Updated a model 12 days ago: yolay/SPEAR-ALFWorld-DrBoT-GRPO-1.5B