Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent • Paper • 2606.30616 • Published 6 days ago • 86
MOPD: Multi-Teacher On-Policy Distillation for Capability Integration in LLM Post-Training • Paper • 2606.30406 • Published 6 days ago • 13
Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs • Paper • 2606.32032 • Published 5 days ago • 22
PerceptionRubrics: Calibrating Multimodal Evaluation to Human Perception • Paper • 2606.28322 • Published 9 days ago • 38
Agentic Abstention: Do Agents Know When to Stop Instead of Act? • Paper • 2606.28733 • Published 8 days ago • 142
Translation as a Bridging Action: Transferring Manipulation Skills from Humans to Robots • Paper • 2606.28133 • Published 9 days ago • 39
NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers? • Paper • 2606.24530 • Published 12 days ago • 62
The Verification Horizon: No Silver Bullet for Coding Agent Rewards • Paper • 2606.26300 • Published 11 days ago • 47
OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning • Paper • 2606.26790 • Published 10 days ago • 54
Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation • Paper • 2606.26907 • Published 10 days ago • 49
Qwen-AgentWorld: Language World Models for General Agents • Paper • 2606.24597 • Published 12 days ago • 144