LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published Nov 25, 2025 • 182
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20, 2025 • 92
RynnVLA-002: A Unified Vision-Language-Action and World Model Paper • 2511.17502 • Published Nov 21, 2025 • 25
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14, 2025 • 165
PixelRefer: A Unified Framework for Spatio-Temporal Object Referring with Arbitrary Granularity Paper • 2510.23603 • Published Oct 27, 2025 • 22
Scaling Language-Centric Omnimodal Representation Learning Paper • 2510.11693 • Published Oct 13, 2025 • 100
High-Fidelity Simulated Data Generation for Real-World Zero-Shot Robotic Manipulation Learning with Gaussian Splatting Paper • 2510.10637 • Published Oct 12, 2025 • 12
TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios Paper • 2505.12891 • Published May 19, 2025 • 10
Multi-Agent Tool-Integrated Policy Optimization Paper • 2510.04678 • Published Oct 6, 2025 • 30
Residual Off-Policy RL for Finetuning Behavior Cloning Policies Paper • 2509.19301 • Published Sep 23, 2025 • 18