The Invisible Leash: Why RLVR May Not Escape Its Origin Paper • 2507.14843 • Published Jul 20, 2025
Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning Paper • 2502.19655 • Published Feb 27, 2025