OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer • Paper • 2601.14250 • Published 4 days ago • 40
Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation • Paper • 2601.10880 • Published 9 days ago • 15
Transition Matching Distillation for Fast Video Generation • Paper • 2601.09881 • Published 10 days ago • 31
3AM: Segment Anything with Geometric Consistency in Videos • Paper • 2601.08831 • Published 11 days ago • 34
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning • Paper • 2601.06943 • Published 14 days ago • 206
OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent • Paper • 2601.07779 • Published 13 days ago • 27
VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction • Paper • 2601.05966 • Published 16 days ago • 23
NitroGen: An Open Foundation Model for Generalist Gaming Agents • Paper • 2601.02427 • Published 21 days ago • 42
LTX-2: Efficient Joint Audio-Visual Foundation Model • Paper • 2601.03233 • Published 19 days ago • 134
KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs • Paper • 2601.01046 • Published 22 days ago • 13
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation • Paper • 2601.00664 • Published 23 days ago • 54
FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation • Paper • 2512.24724 • Published 25 days ago • 7
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion • Paper • 2512.17504 • Published Dec 19, 2025 • 97
Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding • Paper • 2512.17220 • Published Dec 19, 2025 • 111
Spatia: Video Generation with Updatable Spatial Memory • Paper • 2512.15716 • Published Dec 17, 2025 • 33
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations • Paper • 2512.21004 • Published Dec 24, 2025 • 13
StoryMem: Multi-shot Long Video Storytelling with Memory • Paper • 2512.19539 • Published Dec 22, 2025 • 18