Video2LoRA SmolVLM Checkpoints

Video2LoRA: Parametric Video Internalization for Vision-Language Models

Official implementation of Video2LoRA

Manan Suri  ·  Sarvesh Baskar  ·  Dinesh Manocha

University of Maryland, College Park

Project page  ·  arXiv paper

This repository contains two Video2LoRA Stage 1 checkpoint files:

  • video2lora-smolvlm2-500m-best-ce.pt for HuggingFaceTB/SmolVLM2-500M-Video-Instruct
  • video2lora-smolvlm2-2.2b-best-ce.pt for HuggingFaceTB/SmolVLM2-2.2B-Instruct
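
As a minimal sketch of how one of these files might be fetched and inspected, the snippet below maps each base model to its checkpoint and downloads it with `huggingface_hub`. The repository id and file names come from this card. The helper names (`checkpoint_filename`, `download_checkpoint`, `inspect_checkpoint`) are hypothetical, and the internal layout of the `.pt` files is not documented here, so the sketch only lists top-level keys rather than assuming any structure.

```python
# Repo id and file names are taken from this model card.
# All function names below are illustrative, not part of the Video2LoRA code.
REPO_ID = "MananSuri27/Video2LoRA-SmolVLM-ckpts"

CHECKPOINTS = {
    "HuggingFaceTB/SmolVLM2-500M-Video-Instruct": "video2lora-smolvlm2-500m-best-ce.pt",
    "HuggingFaceTB/SmolVLM2-2.2B-Instruct": "video2lora-smolvlm2-2.2b-best-ce.pt",
}


def checkpoint_filename(base_model: str) -> str:
    """Return the checkpoint file name listed for a given base model."""
    try:
        return CHECKPOINTS[base_model]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(
            f"No checkpoint listed for {base_model!r}; known: {known}"
        ) from None


def download_checkpoint(base_model: str) -> str:
    """Download the matching checkpoint and return its local cache path."""
    # Imported lazily so the mapping above works without the dependency.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=checkpoint_filename(base_model))


def inspect_checkpoint(path: str) -> list[str]:
    """List top-level keys of the checkpoint, if it is a dict (an assumption)."""
    import torch

    # Depending on the PyTorch version and on what the file contains,
    # torch.load may need weights_only=False; only do that for trusted files.
    obj = torch.load(path, map_location="cpu")
    return sorted(obj.keys()) if isinstance(obj, dict) else [type(obj).__name__]


if __name__ == "__main__":
    local_path = download_checkpoint("HuggingFaceTB/SmolVLM2-500M-Video-Instruct")
    print(local_path)
    print(inspect_checkpoint(local_path))
```

Running the script requires `huggingface_hub` and `torch` to be installed. How the loaded weights are applied to the base model is defined by the official Video2LoRA implementation, not by this sketch.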

Cite us

@misc{suri2026video2loraparametricvideointernalization,
      title={Video2LoRA: Parametric Video Internalization for Vision-Language Models}, 
      author={Manan Suri and Sarvesh Baskar and Dinesh Manocha},
      year={2026},
      eprint={2606.04351},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2606.04351}, 
}