Experience of Training a 1.7B-Parameter LLaMa Model From Scratch. Paper 2412.13335, published Dec 17, 2024.
On the Effectiveness of Incremental Training of Large Language Models. Paper 2411.18700, published Nov 27, 2024.