Geyang
geyang627
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Safe and Scalable Web Agent Learning via Recreated Websites
upvoted an article about 2 months ago
Deriving the PPO Loss from First Principles
upvoted an article about 2 months ago
A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond