Ranger Loh
carrobot
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Language Models Can Learn from Verbal Feedback Without Scalar Rewards
upvoted a paper 5 months ago
Through the Valley: Path to Effective Long CoT Training for Small Language Models
commented on a paper over 1 year ago
Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning
Organizations
None yet