hyung gyu rho

sirano1004

AI & ML interests

None yet

Organizations

None yet

authored 2 papers 3 months ago

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

Paper • 2510.05342 • Published Oct 6, 2025

A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling

Paper • 2510.04087 • Published Oct 5, 2025