Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation (arXiv:2510.22115, published Oct 25, 2025)
Arrows of Math Reasoning Data Synthesis for Large Language Models: Diversity, Complexity and Correctness (arXiv:2508.18824, published Aug 26, 2025)
Toward Stable and Consistent Evaluation Results: A New Methodology for Base Model Evaluation (arXiv:2503.00812, published Mar 2, 2025)
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs (arXiv:2503.05139, published Mar 7, 2025)
Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models (arXiv:2507.17702, published Jul 23, 2025)
WSM: Decay-Free Learning Rate Schedule via Checkpoint Merging for LLM Pre-training (arXiv:2507.17634, published Jul 23, 2025)