Instructions to use lvyufeng/PaddleOCR-VL-0.9B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use lvyufeng/PaddleOCR-VL-0.9B with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("image-text-to-text", model="lvyufeng/PaddleOCR-VL-0.9B")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("lvyufeng/PaddleOCR-VL-0.9B", dtype="auto") - MindSpore
How to use lvyufeng/PaddleOCR-VL-0.9B with MindSpore:
# No code snippets available yet for this library. # To use this model, check the repository files and the library's documentation. # Want to help? PRs adding snippets are welcome at: # https://github.com/huggingface/huggingface.js
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use lvyufeng/PaddleOCR-VL-0.9B with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "lvyufeng/PaddleOCR-VL-0.9B" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "lvyufeng/PaddleOCR-VL-0.9B", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/lvyufeng/PaddleOCR-VL-0.9B
- SGLang
How to use lvyufeng/PaddleOCR-VL-0.9B with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "lvyufeng/PaddleOCR-VL-0.9B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "lvyufeng/PaddleOCR-VL-0.9B", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "lvyufeng/PaddleOCR-VL-0.9B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "lvyufeng/PaddleOCR-VL-0.9B", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use lvyufeng/PaddleOCR-VL-0.9B with Docker Model Runner:
docker model run hf.co/lvyufeng/PaddleOCR-VL-0.9B
prompt
Orginal paper said that there are 4 prompt for 4 different objects
sure, you can use the prompts below:
query = "OCR:"
query = "Table Recognition:"
query = "Chart Recognition:"
query = "Formula Recognition:"
For table recognition I use the prompt "Table Recognition:" but it does not output OTSL style like in the paper noticed. Is this the problem with the checkpoint/prompt or the paper?
For table recognition I use the prompt "Table Recognition:" but it does not output OTSL style like in the paper noticed. Is this the problem with the checkpoint/prompt or the paper?
it seems the preprocess should be exactly same as PaddleX, i will analyze the code when i have free time
Hi, the prompt "OTSL Table Recognition:" works better for me
OTSL Table Recognition:
<fcel>Millions of dollars and shares except per share data<fcel>Year Ended December 31<lcel><lcel><nl><ucel><fcel>2008<fcel>2007<fcel>2006<nl><fcel>Revenue:<ecel><ecel><ecel><nl><fcel>Services<fcel>$ 13,391<fcel>$ 11,256<fcel>$ 9,643<nl><fcel>Product sales<fcel>4,888<fcel>4,008<fcel>3,312<nl><fcel>Total revenue<fcel>18,279<fcel>15,264<fcel>12,955<nl><fcel>Operating costs and expenses:<ecel><ecel><ecel><nl><fcel>Cost of services<fcel>10,079<fcel>8,167<fcel>6,751<nl><fcel>Cost of sales<fcel>3,970<fcel>3,358<fcel>2,675<nl><fcel>Revenue and administrative<fcel>282<fcel>293<fcel>342<nl><fcel>Gain on sale of business assets, net<fcel>(62)<fcel>(52)<fcel>(58)<nl><fcel>Total operating costs and expenses<fcel>14,269<fcel>11,766<fcel>9,710<nl><fcel>Operating income<fcel>4,010<fcel>3,498<fcel>3,245<nl><fcel>Interest expense<fcel>(160)<fcel>(154)<fcel>(165)<nl><fcel>Interest income<fcel>39<fcel>124<fcel>129<nl><fcel>Other, net<fcel>(726)<fcel>(8)<fcel>(10)<nl><fcel>Income from continuing operations before income taxes and minority interest provision for income taxes<fcel>3,163<fcel>3,460<fcel>3,199<nl><fcel>Minority interest in net income of subsidiaries<fcel>(1,211)<fcel>(907)<fcel>(1,003)<nl><fcel>Income from continuing operations<fcel>9<fcel>(29)<fcel>(19)<nl><fcel>Income from continuing operations, net of income tax (provision) benefit of $3, $15), and $ (183)<fcel>1,961<fcel>2,524<fcel>2,177<nl><fcel>Net income<fcel>(423)<fcel>975<fcel>171<nl><fcel>Net income<fcel>$ 1,538<fcel>$ 3,499<fcel>$ 2,348<nl><fcel>Basic income (loss) per share:<ecel><ecel><ecel><nl><fcel>Income from continuing operations<fcel>$ 2,24<fcel>$ 2,76<fcel>$ 2,15<nl><fcel>Income (loss) from discontinued operations, net<fcel>(0.49)<fcel>1.07<fcel>0.16<nl><fcel>Net income per share<fcel>$ 1.75<fcel>$ 3.83<fcel>$ 2.31<nl><fcel>Diluted income (loss) per share:<ecel><ecel><ecel><nl><fcel>Income from continuing operations<fcel>$ 2.17<fcel>$ 2.66<fcel>$ 2.07<nl><fcel>Income (loss) from discontinued operations, net<fcel>(0.47)<fcel>1.02<fcel>0.16<nl><fcel>Net income per share<fcel>$ 1.70<fcel>$ 3.68<fcel>$ 2.23<nl><fcel>Basic weighted average common shares outstanding<fcel>877<fcel>913<fcel>1,014<nl><fcel>Diluted weighted average common shares outstanding<fcel>904<fcel>950<fcel>1,054<nl>