TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 8 days ago • 49
Region-Constraint In-Context Generation for Instructional Video Editing Paper • 2512.17650 • Published 6 days ago • 46
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Paper • 2512.15603 • Published 8 days ago • 55
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published Nov 19 • 226
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models Paper • 2511.10629 • Published Nov 13 • 122
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published Nov 12 • 76
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14 • 145
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19 • 134
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 550
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published Jun 23 • 33
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11 • 48
SeedVR Collection A diffusion transformer model for high-resolution image and video restoration. • 9 items • Updated Aug 19 • 9
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training Paper • 2506.05301 • Published Jun 5 • 58
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15 • 120