FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing • Paper • 2601.01720 • Published 4 days ago • 4
LongVie 2: Multimodal Controllable Ultra-Long Video World Model • Paper • 2512.13604 • Published 24 days ago • 73
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models • Paper • 2511.11007 • Published Nov 14, 2025 • 15
Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow • Paper • 2509.21789 • Published Sep 26, 2025 • 9
StrandDesigner: Towards Practical Strand Generation with Sketch Guidance • Paper • 2508.01650 • Published Aug 3, 2025 • 6
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation • Paper • 2504.18087 • Published Apr 25, 2025 • 5
When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO • Paper • 2503.16921 • Published Mar 21, 2025 • 6