Video2LoRA SmolVLM Checkpoints

Video2LoRA: Parametric Video Internalization for Vision-Language Models

Official implementation of Video2LoRA

Manan Suri · Sarvesh Baskar · Dinesh Manocha

University of Maryland, College Park

This repository contains two Video2LoRA Stage 1 checkpoint files:

video2lora-smolvlm2-500m-best-ce.pt for HuggingFaceTB/SmolVLM2-500M-Video-Instruct
video2lora-smolvlm2-2.2b-best-ce.pt for HuggingFaceTB/SmolVLM2-2.2B-Instruct
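
As a minimal sketch of fetching one of these files, the snippet below maps each base model to its checkpoint filename and downloads it with `huggingface_hub`. The repository id `MananSuri27/Video2LoRA-SmolVLM-ckpts` is assumed from this model's Hub page, and the helper names (`checkpoint_for`, `download_checkpoint`, `inspect_checkpoint`) are hypothetical, not part of the official Video2LoRA code. The internal layout of the `.pt` files is not documented here, so the sketch only lists top-level keys rather than assuming a structure.

```python
# Assumption: repository id taken from the Hub page for these checkpoints.
REPO_ID = "MananSuri27/Video2LoRA-SmolVLM-ckpts"

# Base model -> checkpoint filename, as listed above.
CHECKPOINTS = {
    "HuggingFaceTB/SmolVLM2-500M-Video-Instruct": "video2lora-smolvlm2-500m-best-ce.pt",
    "HuggingFaceTB/SmolVLM2-2.2B-Instruct": "video2lora-smolvlm2-2.2b-best-ce.pt",
}


def checkpoint_for(base_model: str) -> str:
    """Return the checkpoint filename that matches a base model id."""
    try:
        return CHECKPOINTS[base_model]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(
            f"No checkpoint listed for {base_model!r}; known models: {known}"
        ) from None


def download_checkpoint(base_model: str) -> str:
    """Download the matching checkpoint and return its local cache path."""
    # Imported lazily so the mapping above works without the dependency.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=checkpoint_for(base_model))


def inspect_checkpoint(path: str) -> list:
    """Load a checkpoint on CPU and list its top-level keys, if it is a dict."""
    import torch

    # weights_only=True avoids unpickling arbitrary objects; it may fail if
    # the file stores more than tensors and plain containers (unknown here).
    obj = torch.load(path, map_location="cpu", weights_only=True)
    return sorted(obj.keys()) if isinstance(obj, dict) else []
```

With `huggingface_hub` and `torch` installed, `inspect_checkpoint(download_checkpoint("HuggingFaceTB/SmolVLM2-500M-Video-Instruct"))` would show what the 500M checkpoint contains before wiring it into a model.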

Cite us

@misc{suri2026video2loraparametricvideointernalization,
      title={Video2LoRA: Parametric Video Internalization for Vision-Language Models}, 
      author={Manan Suri and Sarvesh Baskar and Dinesh Manocha},
      year={2026},
      eprint={2606.04351},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2606.04351}, 
}
