Usage rights?
Hi,
the model pages on HF state that:
The models are available in float32, bfloat16 and float16 format for research purposes only.
The "for research purposes" portion is confusing because:
- the Gemma license, which is linked as "Terms", does not have this provision
- the README.md on GitHub does have such a provision, but only for Transfer Checkpoints:
"We provide checkpoints transferred to most of the tasks we evaluated transfer on [...] for academic research purposes only.". This could imply that the base model and the "mix" checkpoints are free to use for non-academic work (subject to the Gemma license), but that contradicts the HF page for, e.g., paligemma-3b-mix-448, which does carry the "for research purposes" statement.
Please clarify.
Thanks,
Nils
Asked the other way round: is it correct that only the "downstream" trained models are restricted to academic purposes, such as:
- google/paligemma-3b-ft-ocrvqa-896
- google/paligemma-3b-ft-docvqa-896
- google/paligemma-3b-ft-infovqa-896
?
Hi Nils,
If you are considering non-research deployments, we recommend starting from the paligemma-3b-pt-{224|448|896} checkpoints and fine-tuning them on your own datasets, which can be done very quickly.
Fine-tuning PaliGemma is fairly lightweight; you can try our Colab at:
https://colab.sandbox.google.com/github/google-research/big_vision/blob/main/big_vision/configs/proj/paligemma/finetune_paligemma.ipynb
I also found this fine-tuning tutorial via Twitter, for reference: https://blog.roboflow.com/how-to-fine-tune-paligemma/
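As a rough illustration of the recommendation above, here is a minimal sketch of how one might select and load one of the pretrained ("pt") checkpoints with Transformers before fine-tuning. The helper names `pt_checkpoint_id` and `load_pt_checkpoint` are hypothetical and made up for this example; the sketch also assumes the `transformers` package is installed and that you have accepted the model terms on the Hub. It is not the fine-tuning procedure from the Colab, only a starting point.

```python
# Sketch only: the helper names below are illustrative, not an official API.
VALID_RESOLUTIONS = (224, 448, 896)


def pt_checkpoint_id(resolution: int) -> str:
    """Return the Hub ID of the pretrained ("pt") checkpoint for a resolution."""
    if resolution not in VALID_RESOLUTIONS:
        raise ValueError(
            f"resolution must be one of {VALID_RESOLUTIONS}, got {resolution}"
        )
    return f"google/paligemma-3b-pt-{resolution}"


def load_pt_checkpoint(resolution: int = 224):
    """Load the processor and model for a pt checkpoint.

    Downloads the weights, so it needs network access and Hub authentication.
    """
    # Imported lazily so pt_checkpoint_id works without transformers installed.
    from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

    model_id = pt_checkpoint_id(resolution)
    processor = AutoProcessor.from_pretrained(model_id)
    model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
    return processor, model
```

Calling `load_pt_checkpoint(448)` would return a processor and model that can then be passed to whatever training loop you use on your own dataset.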
Best,
Xiaohua
I also created https://huggingface.co/google/paligemma-3b-mix-224/discussions/7. I think the "License: gemma" property then needs to be fixed to make clear that the model is licensed under different (research-only) terms.