hyung gyu rho

sirano1004

AI & ML interests

None yet

Recent Activity

authored a paper 22 days ago

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

upvoted a paper 22 days ago

A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling

upvoted a paper 22 days ago

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

Organizations

None yet

authored a paper 22 days ago

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

Paper • 2510.05342 • Published 23 days ago • 5

upvoted 2 papers 22 days ago

A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling

Paper • 2510.04087 • Published 25 days ago • 1

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

Paper • 2510.05342 • Published 23 days ago • 5

commented 2 papers 22 days ago

A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling

Paper • 2510.04087 • Published 25 days ago • 1

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

Paper • 2510.05342 • Published 23 days ago • 5

authored a paper 23 days ago

A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling

Paper • 2510.04087 • Published 25 days ago • 1