Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence
Paper • arXiv:2406.10957
Resources for the EMNLP 2024 paper "Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence"