Who Should Lead Decoding Now? Tracking Reliable Trajectories for Ensembling Masked Diffusion Language Models Paper • 2606.16281 • Published 13 days ago • 34
Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts Paper • 2606.05922 • Published 23 days ago • 69
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models Paper • 2606.03988 • Published 25 days ago • 126
AgentHijack: Benchmarking Computer Use Agent Robustness to Common Environment Corruptions Paper • 2605.25707 • Published May 25 • 6