MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation Paper • 2508.19320 • Published Aug 26, 2025 • 29
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 3 days ago • 549
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control +2 Feb 4, 2025 • 186
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 23 items • Updated 11 days ago • 103