A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning Paper • 2507.08267 • Published Jul 11, 2025