
PP-OCRv5 ONNX Models

Fast and accurate multilingual OCR models from PaddleOCR, converted to ONNX format for easy deployment.

Original Models: PaddlePaddle PP-OCRv5 Collection
Converted by: Community contribution
Format: ONNX (optimized for inference)
License: Apache 2.0


🎯 What's Inside

This repository contains 11 production-ready ONNX models:

  • 1 Detection Model - Finds text in images (works with all languages)
  • 7 Recognition Models - Reads text in 39+ languages
  • 3 Preprocessing Models - Fixes rotated or distorted documents (optional)

Total Size: ~258 MB
Languages: English, French, German, Spanish, Italian, Portuguese, Russian, Ukrainian, Korean, Chinese, Japanese, Thai, Greek, and 25+ more!


🚀 Quick Start

Installation

pip install rapidocr-onnxruntime

That's it! No PaddlePaddle, no CUDA required. Works on CPU out of the box.

Basic Usage - English

from rapidocr_onnxruntime import RapidOCR

# Initialize OCR
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="english/en_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="english/ppocrv5_en_dict.txt"
)

# Run OCR
result, elapsed = ocr("your_image.jpg")

# Print results (result is None when no text is detected)
for box, text, confidence in result or []:
    # Each entry is [box, text, confidence]
    print(f"{text} (confidence: {float(confidence):.2%})")
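
The raw output is easier to work with once it is turned into records. The following is a minimal sketch, assuming each result entry has the form [box, text, score] and that `result` is None when no text is found. `to_records` and `min_score` are hypothetical names, not part of rapidocr-onnxruntime, and the sample data is a stand-in rather than real model output.

```python
from typing import Any, Dict, List, Optional


def to_records(result: Optional[List[list]], min_score: float = 0.5) -> List[Dict[str, Any]]:
    """Convert raw OCR output into dictionaries, dropping low-confidence lines.

    Assumes each entry is [box, text, score]; adjust the unpacking if your
    installed version returns a different shape.
    """
    records = []
    for box, text, score in result or []:  # `result` may be None
        score = float(score)  # tolerate scores delivered as strings
        if score >= min_score:
            records.append({"text": text, "score": score, "box": box})
    return records


# Stand-in data in the assumed shape (not real model output).
sample = [
    [[[0, 0], [10, 0], [10, 5], [0, 5]], "Hello", 0.98],
    [[[0, 6], [10, 6], [10, 11], [0, 11]], "w0rld", "0.31"],
]

for record in to_records(sample):
    print(record["text"], record["score"])
```

In real use, pass the first value returned by `ocr(...)` in place of `sample`.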

Other Languages

Just change the model paths:

# French, German, Spanish, Italian, etc. (32 languages)
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="latin/latin_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="latin/ppocrv5_latin_dict.txt"
)

# Russian, Bulgarian, Ukrainian, Belarusian
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="eslav/eslav_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="eslav/ppocrv5_eslav_dict.txt"
)

# Korean
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="korean/korean_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="korean/ppocrv5_korean_dict.txt"
)

# Chinese / Japanese
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="chinese/PP-OCRv5_server_rec.onnx",
    rec_keys_path="chinese/ppocrv5_dict.txt"
)

# Thai
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="thai/th_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="thai/ppocrv5_th_dict.txt"
)

# Greek
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="greek/el_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="greek/ppocrv5_el_dict.txt"
)
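
Since only the recognition model and dictionary change between languages, the paths can be kept in one lookup table. This is a sketch under stated assumptions: `RECOGNIZERS`, `DETECTOR`, and `ocr_kwargs` are hypothetical names introduced here, and the paths are copied from the snippets above.

```python
from pathlib import Path
from typing import Dict

DETECTOR = "detection/PP-OCRv5_server_det.onnx"

# Model directory -> (recognition model, character dictionary).
RECOGNIZERS = {
    "english": ("english/en_PP-OCRv5_mobile_rec.onnx", "english/ppocrv5_en_dict.txt"),
    "latin": ("latin/latin_PP-OCRv5_mobile_rec.onnx", "latin/ppocrv5_latin_dict.txt"),
    "eslav": ("eslav/eslav_PP-OCRv5_mobile_rec.onnx", "eslav/ppocrv5_eslav_dict.txt"),
    "korean": ("korean/korean_PP-OCRv5_mobile_rec.onnx", "korean/ppocrv5_korean_dict.txt"),
    "chinese": ("chinese/PP-OCRv5_server_rec.onnx", "chinese/ppocrv5_dict.txt"),
    "thai": ("thai/th_PP-OCRv5_mobile_rec.onnx", "thai/ppocrv5_th_dict.txt"),
    "greek": ("greek/el_PP-OCRv5_mobile_rec.onnx", "greek/ppocrv5_el_dict.txt"),
}


def ocr_kwargs(model: str, root: str = ".") -> Dict[str, str]:
    """Build the three path keyword arguments for one model directory."""
    try:
        rec_model, rec_keys = RECOGNIZERS[model]
    except KeyError:
        raise ValueError(f"unknown model {model!r}; choose from {sorted(RECOGNIZERS)}") from None
    base = Path(root)  # folder where this repository was downloaded
    return {
        "det_model_path": (base / DETECTOR).as_posix(),
        "rec_model_path": (base / rec_model).as_posix(),
        "rec_keys_path": (base / rec_keys).as_posix(),
    }


print(ocr_kwargs("korean"))
```

With the package installed, the result can be unpacked directly: `ocr = RapidOCR(**ocr_kwargs("korean"))`.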

📦 Available Models

Text Recognition Models

| Model | Languages | Accuracy | Size | Best For |
|---|---|---|---|---|
| english/ | English | 85.25% | 7.5 MB | English documents |
| latin/ | French, German, Spanish, Italian, Portuguese, Dutch, Polish, Czech, + 24 more | 84.7% | 7.5 MB | European documents |
| eslav/ | Russian, Bulgarian, Ukrainian, Belarusian, English | 81.6% | 7.5 MB | Cyrillic scripts |
| korean/ | Korean, English | 88.0% | 13 MB | Korean documents |
| chinese/ | Chinese, Japanese, English | - | 81 MB | CJK documents |
| thai/ | Thai, English | 82.68% | 7.5 MB | Thai documents |
| greek/ | Greek, English | 89.28% | 7.4 MB | Greek documents |

Detection Model

  • detection/ - Universal text detection (84 MB) - Works with all languages

Preprocessing Models (Optional)

Enhance OCR accuracy on challenging documents:

  • preprocessing/doc-orientation/ - Fixes rotated documents (6.5 MB, 99.06% accuracy)
  • preprocessing/textline-orientation/ - Fixes upside-down text (6.5 MB, 98.85% accuracy)
  • preprocessing/doc-unwarping/ - Fixes curved/warped pages (30 MB)

🌍 Supported Languages (39+)

Latin Model (32 languages)

English • French • German • Spanish • Italian • Portuguese • Dutch • Polish • Czech • Slovak • Croatian • Bosnian • Serbian (Latin) • Slovenian • Danish • Norwegian • Swedish • Icelandic • Estonian • Lithuanian • Hungarian • Albanian • Welsh • Irish • Turkish • Indonesian • Malay • Afrikaans • Swahili • Tagalog • Uzbek • Latin

Other Models

  • English - English (optimized)
  • East Slavic - Russian • Bulgarian • Ukrainian • Belarusian
  • Korean - Korean
  • Chinese/Japanese - Simplified Chinese • Traditional Chinese • Pinyin • Japanese (Hiragana, Katakana, Kanji)
  • Thai - Thai
  • Greek - Greek

📁 Repository Structure

.
├── detection/                    # Text detection (84 MB)
│   ├── PP-OCRv5_server_det.onnx
│   └── config.json
│
├── english/                      # English (7.5 MB)
│   ├── en_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_en_dict.txt
│   └── config.json
│
├── latin/                        # 32 languages (7.5 MB)
│   ├── latin_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_latin_dict.txt
│   └── config.json
│
├── eslav/                        # Russian/Ukrainian (7.5 MB)
│   ├── eslav_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_eslav_dict.txt
│   └── config.json
│
├── korean/                       # Korean (13 MB)
│   ├── korean_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_korean_dict.txt
│   └── config.json
│
├── chinese/                      # Chinese/Japanese (81 MB)
│   ├── PP-OCRv5_server_rec.onnx
│   ├── ppocrv5_dict.txt
│   └── config.json
│
├── thai/                         # Thai (7.5 MB)
│   ├── th_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_th_dict.txt
│   └── config.json
│
├── greek/                        # Greek (7.4 MB)
│   ├── el_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_el_dict.txt
│   └── config.json
│
└── preprocessing/                # Optional (43 MB)
    ├── doc-orientation/
    ├── textline-orientation/
    └── doc-unwarping/

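Each recognition model directory contains:

  • .onnx - The model file
  • .txt - Character dictionary
  • config.json - Model metadata

The detection/ directory contains only the .onnx file and config.json; the detector needs no dictionary.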
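
Before initializing OCR it can help to confirm that the download is complete. The sketch below is one way to do that, assuming the file names shown in the tree above; `EXPECTED_FILES` and `missing_files` are hypothetical names, and the preprocessing files are left out because their names are not listed in the tree.

```python
from pathlib import Path
from typing import List

# Model and dictionary files named in the repository tree.
EXPECTED_FILES = [
    "detection/PP-OCRv5_server_det.onnx",
    "english/en_PP-OCRv5_mobile_rec.onnx",
    "english/ppocrv5_en_dict.txt",
    "latin/latin_PP-OCRv5_mobile_rec.onnx",
    "latin/ppocrv5_latin_dict.txt",
    "eslav/eslav_PP-OCRv5_mobile_rec.onnx",
    "eslav/ppocrv5_eslav_dict.txt",
    "korean/korean_PP-OCRv5_mobile_rec.onnx",
    "korean/ppocrv5_korean_dict.txt",
    "chinese/PP-OCRv5_server_rec.onnx",
    "chinese/ppocrv5_dict.txt",
    "thai/th_PP-OCRv5_mobile_rec.onnx",
    "thai/ppocrv5_th_dict.txt",
    "greek/el_PP-OCRv5_mobile_rec.onnx",
    "greek/ppocrv5_el_dict.txt",
]


def missing_files(root: str = ".") -> List[str]:
    """Return the expected files that are absent under `root`."""
    base = Path(root)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


missing = missing_files(".")
if missing:
    print("Missing files:", *missing, sep="\n  ")
else:
    print("All model and dictionary files are present.")
```

If only one language is needed, trim `EXPECTED_FILES` to the detection model plus that language's two files.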

💡 Why Use These Models?

✅ Advantages

  1. ONNX Format - Fast inference, works on any platform (CPU/GPU)
  2. No PaddlePaddle Required - Just install rapidocr-onnxruntime
  3. 39+ Languages - Multilingual support out of the box
  4. Production Ready - All models tested and validated
  5. Complete Package - Detection + Recognition + Dictionaries included
  6. Well Documented - Every model has detailed config and usage info

📊 Performance

  • Speed: Fast inference on CPU (~100-300ms per image)
  • Accuracy: 30% improvement over PP-OCRv3
  • Size: Compact models (6.5-84 MB each)

🛠️ Advanced Usage

With GPU Acceleration

pip install onnxruntime-gpu

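With onnxruntime-gpu installed in place of onnxruntime, inference can run on a CUDA-capable GPU, which can be up to 10x faster. Depending on your rapidocr-onnxruntime version, you may also need to enable CUDA in its configuration.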
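
Whether the GPU is usable depends on which execution providers ONNX Runtime exposes on your machine. The sketch below checks for the CUDA provider. `gpu_available` is a hypothetical helper, while `get_available_providers` and the provider names come from ONNX Runtime itself. The import is wrapped so the snippet still runs where ONNX Runtime is not installed.

```python
from typing import Iterable

try:
    import onnxruntime as ort

    providers = ort.get_available_providers()
except ImportError:  # ONNX Runtime not installed in this environment
    providers = []


def gpu_available(available: Iterable[str]) -> bool:
    """True when ONNX Runtime reports the CUDA execution provider."""
    return "CUDAExecutionProvider" in set(available)


print("Execution providers:", providers)
print("CUDA available:", gpu_available(providers))
```

If the CUDA provider is missing after installing onnxruntime-gpu, a common cause is that the CPU-only onnxruntime package is still installed alongside it.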

Batch Processing

from rapidocr_onnxruntime import RapidOCR
import glob

ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="latin/latin_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="latin/ppocrv5_latin_dict.txt"
)

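# Process all images in a folder
for image_path in sorted(glob.glob("documents/*.jpg")):
    result, elapsed = ocr(image_path)
    # elapsed may hold per-stage timings (detection, classification, recognition)
    total = sum(elapsed) if isinstance(elapsed, (list, tuple)) else (elapsed or 0.0)
    print(f"Processed {image_path} in {total:.2f}s")
    for line in result or []:  # result is None when no text is found
        print(f"  {line[1]}")  # entry layout: [box, text, confidence]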

With Preprocessing (for rotated/distorted documents)

# Enable angle classification for rotated text.
# rapidocr-onnxruntime names the classifier keyword cls_model_path;
# confirm this against your installed version.
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="english/en_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="english/ppocrv5_en_dict.txt",
    use_angle_cls=True,
    cls_model_path="preprocessing/textline-orientation/PP-LCNet_x1_0_textline_ori.onnx"
)

📖 Model Details

How It Works

  1. Detection - Finds all text regions in the image
  2. Recognition - Reads text from each region using language-specific model
  3. Decoding - Converts model output to text using character dictionary
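
The decoding step can be illustrated with greedy CTC decoding, which is how recognition models of this kind are commonly decoded. This is a sketch and not the library's implementation. It assumes index 0 is the CTC blank and that dictionary characters start at index 1. `ctc_greedy_decode`, the three-character dictionary, and the probabilities are made up for illustration.

```python
from typing import List, Sequence, Tuple


def ctc_greedy_decode(
    probs: Sequence[Sequence[float]],
    charset: Sequence[str],
    blank: int = 0,
) -> Tuple[str, float]:
    """Pick the best class per time step, merge repeats, drop blanks.

    Returns the decoded text and the mean probability of the kept steps.
    """
    chars: List[str] = []
    scores: List[float] = []
    prev = None
    for step in probs:
        idx = max(range(len(step)), key=step.__getitem__)  # argmax
        if idx != blank and idx != prev:
            chars.append(charset[idx - 1])  # shift by one for the blank
            scores.append(step[idx])
        prev = idx
    confidence = sum(scores) / len(scores) if scores else 0.0
    return "".join(chars), confidence


# Toy dictionary and per-step class probabilities: [blank, c, a, t].
charset = ["c", "a", "t"]
probs = [
    [0.10, 0.80, 0.05, 0.05],  # c
    [0.10, 0.70, 0.10, 0.10],  # c again, merged with the previous step
    [0.90, 0.05, 0.03, 0.02],  # blank
    [0.10, 0.10, 0.70, 0.10],  # a
    [0.10, 0.10, 0.10, 0.70],  # t
    [0.20, 0.10, 0.10, 0.60],  # t again, merged
]

text, confidence = ctc_greedy_decode(probs, charset)
print(text, f"{confidence:.2f}")
```

The blank between two identical characters is what lets a doubled letter survive the merge step, so a sequence such as a, blank, a decodes to "aa".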

Model Specifications

  • Framework: Converted from PaddlePaddle to ONNX
  • ONNX Opset: 11
  • Precision: FP32
  • Input: RGB images (dynamic size)
  • Output: Text + confidence scores + bounding boxes

Accuracy Benchmarks

Tested on official PP-OCRv5 datasets:

  • Greek: 89.28%
  • Korean: 88.0%
  • English: 85.25%
  • Latin: 84.7%
  • Thai: 82.68%
  • East Slavic: 81.6%

🎯 Use Cases

  • Document Digitization - Scan and extract text from documents
  • Multilingual OCR - Process documents in 39+ languages
  • Mobile Apps - Lightweight models perfect for mobile deployment
  • Batch Processing - Process thousands of documents efficiently
  • Real-time OCR - Fast enough for real-time applications
  • Custom Pipelines - Integrate into your existing workflows

📝 Language Selection Guide

| Your Document | Use This Model |
|---|---|
| English only | english/ |
| French, German, Spanish, Italian, etc. | latin/ (best choice for European languages) |
| Russian, Bulgarian, Ukrainian, Belarusian | eslav/ |
| Korean | korean/ |
| Chinese or Japanese | chinese/ |
| Thai | thai/ |
| Greek | greek/ |
| Mixed European languages | latin/ (supports 32 languages!) |

Pro Tip: The latin/ model is the most versatile - it handles 32 different languages!


❓ FAQ

Q: Do I need PaddlePaddle installed?
A: No! These are ONNX models. Just install rapidocr-onnxruntime.

Q: Can I use GPU?
A: Yes! Install onnxruntime-gpu instead of onnxruntime.

Q: Which model should I use for French?
A: Use the latin/ model - it supports French and 31 other languages.

Q: Are these models free to use?
A: Yes! Licensed under Apache 2.0.

Q: How accurate are these models?
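A: Recognition accuracy ranges from 81.6% (East Slavic) to 89.28% (Greek) on the benchmarks listed under Accuracy Benchmarks. PP-OCRv5 has 30% better accuracy than PP-OCRv3.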

Q: Can I use these commercially?
A: Yes! Apache 2.0 license allows commercial use.


🔗 Links


🙏 Credits


📄 License

Apache License 2.0 (inherited from PaddleOCR)

You are free to:

  • ✅ Use commercially
  • ✅ Modify
  • ✅ Distribute
  • ✅ Use privately

πŸ› Issues & Support

For issues with: