MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments • Paper • 2602.06075 • Published 6 days ago • 11
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos • Paper • 2602.06949 • Published 3 days ago • 11
Accurate Failure Prediction in Agents Does Not Imply Effective Failure Prevention • Paper • 2602.03338 • Published 6 days ago • 25
InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions • Paper • 2602.06035 • Published 4 days ago • 21
Reinforcement World Model Learning for LLM-based Agents • Paper • 2602.05842 • Published 4 days ago • 18
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents • Paper • 2602.02474 • Published 7 days ago • 48
VLS: Steering Pretrained Robot Policies via Vision-Language Models • Paper • 2602.03973 • Published 6 days ago • 20
Likelihood-Based Reward Designs for General LLM Reasoning • Paper • 2602.03979 • Published 6 days ago • 8
EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models • Paper • 2602.04515 • Published 5 days ago • 36
Self-Hinting Language Models Enhance Reinforcement Learning • Paper • 2602.03143 • Published 6 days ago • 25
VIOLA: Towards Video In-Context Learning with Minimal Annotations • Paper • 2601.15549 • Published 18 days ago • 4
Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning • Paper • 2601.16163 • Published 18 days ago • 13
PROGRESSLM: Towards Progress Reasoning in Vision-Language Models • Paper • 2601.15224 • Published 19 days ago • 12
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces • Paper • 2601.11868 • Published 23 days ago • 32
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience • Paper • 2601.15876 • Published 18 days ago • 89