Ranger Loh
carrobot
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Language Models Can Learn from Verbal Feedback Without Scalar Rewards
upvoted a paper 5 months ago
Through the Valley: Path to Effective Long CoT Training for Small Language Models
commented on a paper over 1 year ago
Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning
Organizations
None yet