arxiv:2510.05342
hyung gyu rho
sirano1004
AI & ML interests
None yet
Recent Activity
authored a paper 21 days ago
Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization
upvoted a paper 21 days ago
A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling
Organizations
None yet