OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448 • Video-Text-to-Text • 2B • Updated Mar 16, 2025 • 1.7k • 26
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated 26 days ago • 211k • 1.56k
docling-project/SmolDocling-256M-preview • Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 49.7k • 1.6k