Test-Time Compute/Optimal Scaling
Scaling LLM Inference with Optimized Sample Compute Allocation
Paper • 2410.22480 • Published
Test-time Computing: from System-1 Thinking to System-2 Thinking
Paper • 2501.02497 • Published • 45
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
Paper • 2412.14135 • Published
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
Paper • 2501.04682 • Published • 99
O1 Replication Journey: A Strategic Progress Report -- Part 1
Paper • 2410.18982 • Published • 3
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning
Paper • 2501.06458 • Published • 31
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
Paper • 2410.03864 • Published • 12
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
Paper • 2501.18585 • Published • 61
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
Paper • 2501.13200 • Published • 69
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper • 2502.03373 • Published • 58
Inference-Time Scaling for Generalist Reward Modeling
Paper • 2504.02495 • Published • 57
TTRL: Test-Time Reinforcement Learning
Paper • 2504.16084 • Published • 120
Scaling Test-time Compute for LLM Agents
Paper • 2506.12928 • Published • 63