PP-OCRv5 ONNX Models

Fast and accurate multilingual OCR models from PaddleOCR, converted to ONNX format for easy deployment.

Original Models: PaddlePaddle PP-OCRv5 Collection
Converted by: Community contribution
Format: ONNX (optimized for inference)
License: Apache 2.0

🎯 What's Inside

This repository contains 11 production-ready ONNX models:

1 Detection Model - Finds text in images (works with all languages)
7 Recognition Models - Reads text in 39+ languages
3 Preprocessing Models - Fixes rotated or distorted documents (optional)

Total Size: ~258 MB
Languages: English, French, German, Spanish, Italian, Portuguese, Russian, Ukrainian, Korean, Chinese, Japanese, Thai, Greek, and 25+ more!

🚀 Quick Start

Installation

pip install rapidocr-onnxruntime

That's it! No PaddlePaddle, no CUDA required. Works on CPU out of the box.

Basic Usage - English

from rapidocr_onnxruntime import RapidOCR

# Initialize OCR
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="english/en_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="english/ppocrv5_en_dict.txt"
)

# Run OCR (result is None when no text is detected)
result, elapsed = ocr("your_image.jpg")

# Print results; rapidocr-onnxruntime returns each line as [box, text, score]
for box, text, score in result or []:
    print(f"{text} (confidence: {float(score):.2%})")

Other Languages

Just change the model paths:

# French, German, Spanish, Italian, etc. (32 languages)
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="latin/latin_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="latin/ppocrv5_latin_dict.txt"
)

# Russian, Bulgarian, Ukrainian, Belarusian
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="eslav/eslav_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="eslav/ppocrv5_eslav_dict.txt"
)

# Korean
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="korean/korean_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="korean/ppocrv5_korean_dict.txt"
)

# Chinese / Japanese
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="chinese/PP-OCRv5_server_rec.onnx",
    rec_keys_path="chinese/ppocrv5_dict.txt"
)

# Thai
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="thai/th_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="thai/ppocrv5_th_dict.txt"
)

# Greek
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="greek/el_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="greek/ppocrv5_el_dict.txt"
)
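
Every configuration above differs only in the recognition model and its dictionary, so the paths can live in one lookup table. The following is a minimal sketch: REC_MODELS and model_kwargs are hypothetical names introduced here (they are not part of RapidOCR), and the file names are copied from the snippets above.

```python
from pathlib import Path

# Recognition model and dictionary file names, copied from the snippets above.
# REC_MODELS and model_kwargs are illustrative names, not part of RapidOCR.
REC_MODELS = {
    "english": ("en_PP-OCRv5_mobile_rec.onnx", "ppocrv5_en_dict.txt"),
    "latin": ("latin_PP-OCRv5_mobile_rec.onnx", "ppocrv5_latin_dict.txt"),
    "eslav": ("eslav_PP-OCRv5_mobile_rec.onnx", "ppocrv5_eslav_dict.txt"),
    "korean": ("korean_PP-OCRv5_mobile_rec.onnx", "ppocrv5_korean_dict.txt"),
    "chinese": ("PP-OCRv5_server_rec.onnx", "ppocrv5_dict.txt"),
    "thai": ("th_PP-OCRv5_mobile_rec.onnx", "ppocrv5_th_dict.txt"),
    "greek": ("el_PP-OCRv5_mobile_rec.onnx", "ppocrv5_el_dict.txt"),
}


def model_kwargs(model_dir: str, root: str = ".") -> dict:
    """Return the three path arguments used in the RapidOCR(...) calls above."""
    if model_dir not in REC_MODELS:
        raise ValueError(f"unknown model directory: {model_dir!r}")
    rec_file, dict_file = REC_MODELS[model_dir]
    base = Path(root)
    return {
        "det_model_path": (base / "detection" / "PP-OCRv5_server_det.onnx").as_posix(),
        "rec_model_path": (base / model_dir / rec_file).as_posix(),
        "rec_keys_path": (base / model_dir / dict_file).as_posix(),
    }
```

With this in place, switching languages becomes a one-word change, for example RapidOCR(**model_kwargs("korean")).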

📦 Available Models

Text Recognition Models

Model	Languages	Accuracy	Size	Best For
english/	English	85.25%	7.5 MB	English documents
latin/	French, German, Spanish, Italian, Portuguese, Dutch, Polish, Czech, + 24 more	84.7%	7.5 MB	European documents
eslav/	Russian, Bulgarian, Ukrainian, Belarusian, English	81.6%	7.5 MB	Cyrillic scripts
korean/	Korean, English	88.0%	13 MB	Korean documents
chinese/	Chinese, Japanese, English	-	81 MB	CJK documents
thai/	Thai, English	82.68%	7.5 MB	Thai documents
greek/	Greek, English	89.28%	7.4 MB	Greek documents

Detection Model

detection/ - Universal text detection (84 MB) - Works with all languages

Preprocessing Models (Optional)

Enhance OCR accuracy on challenging documents:

preprocessing/doc-orientation/ - Fixes rotated documents (6.5 MB, 99.06% accuracy)
preprocessing/textline-orientation/ - Fixes upside-down text (6.5 MB, 98.85% accuracy)
preprocessing/doc-unwarping/ - Fixes curved/warped pages (30 MB)

🌍 Supported Languages (39+)

Latin Model (32 languages)

English • French • German • Spanish • Italian • Portuguese • Dutch • Polish • Czech • Slovak • Croatian • Bosnian • Serbian (Latin) • Slovenian • Danish • Norwegian • Swedish • Icelandic • Estonian • Lithuanian • Hungarian • Albanian • Welsh • Irish • Turkish • Indonesian • Malay • Afrikaans • Swahili • Tagalog • Uzbek • Latin

Other Models

English - English (optimized)
East Slavic - Russian • Bulgarian • Ukrainian • Belarusian
Korean - Korean
Chinese/Japanese - Simplified Chinese • Traditional Chinese • Pinyin • Japanese (Hiragana, Katakana, Kanji)
Thai - Thai
Greek - Greek

📁 Repository Structure

.
├── detection/                    # Text detection (84 MB)
│   ├── PP-OCRv5_server_det.onnx
│   └── config.json
│
├── english/                      # English (7.5 MB)
│   ├── en_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_en_dict.txt
│   └── config.json
│
├── latin/                        # 32 languages (7.5 MB)
│   ├── latin_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_latin_dict.txt
│   └── config.json
│
├── eslav/                        # Russian/Ukrainian (7.5 MB)
│   ├── eslav_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_eslav_dict.txt
│   └── config.json
│
├── korean/                       # Korean (13 MB)
│   ├── korean_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_korean_dict.txt
│   └── config.json
│
├── chinese/                      # Chinese/Japanese (81 MB)
│   ├── PP-OCRv5_server_rec.onnx
│   ├── ppocrv5_dict.txt
│   └── config.json
│
├── thai/                         # Thai (7.5 MB)
│   ├── th_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_th_dict.txt
│   └── config.json
│
├── greek/                        # Greek (7.4 MB)
│   ├── el_PP-OCRv5_mobile_rec.onnx
│   ├── ppocrv5_el_dict.txt
│   └── config.json
│
└── preprocessing/                # Optional (43 MB)
    ├── doc-orientation/
    ├── textline-orientation/
    └── doc-unwarping/

Each model directory contains:

.onnx - The model file
.txt - Character dictionary (recognition models only; detection/ has none)
config.json - Model metadata
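
After downloading, it can be worth checking that a model directory matches the layout above before handing its paths to the OCR engine. This is a small sketch using only the standard library; missing_files is a hypothetical helper name, and the checks assume nothing beyond the three file types listed above.

```python
from pathlib import Path


def missing_files(model_dir: str, needs_dict: bool = True) -> list:
    """List what is missing from a model directory, per the layout above."""
    path = Path(model_dir)
    problems = []
    if not any(path.glob("*.onnx")):
        problems.append("no .onnx model file")
    # Only recognition models ship a character dictionary.
    if needs_dict and not any(path.glob("*.txt")):
        problems.append("no .txt character dictionary")
    if not (path / "config.json").is_file():
        problems.append("no config.json")
    return problems
```

For the detection directory, call missing_files("detection", needs_dict=False), since it has no dictionary file.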

💡 Why Use These Models?

✅ Advantages

ONNX Format - Fast inference, works on any platform (CPU/GPU)
No PaddlePaddle Required - Just install rapidocr-onnxruntime
39+ Languages - Multilingual support out of the box
Production Ready - All models tested and validated
Complete Package - Detection + Recognition + Dictionaries included
Well Documented - Every model has detailed config and usage info

📊 Performance

Speed: Fast inference on CPU (~100-300ms per image)
Accuracy: 30% improvement over PP-OCRv3
Size: Compact models (6.5-84 MB each)

🛠️ Advanced Usage

With GPU Acceleration

pip install onnxruntime-gpu

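With onnxruntime-gpu installed in place of the CPU-only onnxruntime package, ONNX Runtime can run the models on a CUDA GPU. Depending on the rapidocr-onnxruntime version, GPU execution may have to be enabled explicitly (in recent releases, through the det_use_cuda, cls_use_cuda, and rec_use_cuda arguments). The speedup, up to about 10x, depends on hardware and image size.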
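
To confirm that the GPU build is actually in effect, you can ask ONNX Runtime which execution providers it was built with. The sketch below is illustrative: pick_providers is a hypothetical helper, while onnxruntime.get_available_providers() is the library call it wraps. The resulting list is the kind of value accepted by the providers argument of onnxruntime.InferenceSession.

```python
def pick_providers(available):
    """Order execution providers so CUDA is tried first and CPU is the fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [name for name in preferred if name in available]


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        # get_available_providers() lists the providers this build supports.
        print(pick_providers(ort.get_available_providers()))
```

If CUDAExecutionProvider is missing from the output, inference will run on the CPU even with a GPU in the machine.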

Batch Processing

from rapidocr_onnxruntime import RapidOCR
import glob

ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="latin/latin_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="latin/ppocrv5_latin_dict.txt"
)

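# Process all images in a folder
for image_path in glob.glob("documents/*.jpg"):
    result, elapsed = ocr(image_path)
    # elapsed holds per-stage timings (detection, classification, recognition)
    total = sum(elapsed) if isinstance(elapsed, (list, tuple)) else float(elapsed or 0.0)
    print(f"Processed {image_path} in {total:.2f}s")
    # Each entry is [box, text, score]; result is None when nothing is detected
    for box, text, score in result or []:
        print(f"  {text}")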
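
Downstream code usually wants plain records rather than positional rows, often with low-confidence lines filtered out. The sketch below assumes rows shaped [box, text, score] and, as a precaution, also accepts the PaddleOCR-style [box, (text, score)] layout. normalize_rows is a hypothetical helper, not a RapidOCR function.

```python
def normalize_rows(result, min_score=0.0):
    """Turn OCR result rows into dicts, dropping rows below min_score.

    Accepts rows shaped [box, text, score] or [box, (text, score)];
    a None result (no text found) yields an empty list.
    """
    rows = []
    for row in result or []:
        if len(row) == 3:
            box, text, score = row
        else:
            box, (text, score) = row
        score = float(score)  # some versions may return the score as a string
        if score >= min_score:
            rows.append({"text": text, "score": score, "box": box})
    return rows
```

Inside the batch loop, normalize_rows(result, min_score=0.5) would keep only lines recognized with at least 50% confidence.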

With Preprocessing (for rotated/distorted documents)

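# Enable text-line angle classification for rotated text.
# rapidocr-onnxruntime names classifier options with a cls_ prefix, so the
# model path is passed as cls_model_path. Classification is on by default
# (the switch is use_cls in recent releases, use_angle_cls in older ones).
ocr = RapidOCR(
    det_model_path="detection/PP-OCRv5_server_det.onnx",
    rec_model_path="english/en_PP-OCRv5_mobile_rec.onnx",
    rec_keys_path="english/ppocrv5_en_dict.txt",
    cls_model_path="preprocessing/textline-orientation/PP-LCNet_x1_0_textline_ori.onnx"
)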

📖 Model Details

How It Works

Detection - Finds all text regions in the image
Recognition - Reads text from each region using a language-specific model
Decoding - Converts the model output to text using the character dictionary
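
The decoding step can be illustrated in a few lines. This is a sketch under stated assumptions, not the code RapidOCR runs: it assumes a CTC-style output where index 0 is the blank token, dictionary lines map to indices 1 and up, and a space character is appended at the end, which is the usual convention in PaddleOCR recognition models. load_charset and ctc_greedy_decode are hypothetical names.

```python
def load_charset(dict_lines, blank="<blank>", add_space=True):
    """Build the index-to-character table: blank at 0, then one character per line."""
    chars = [line.rstrip("\n") for line in dict_lines]
    if add_space:
        chars.append(" ")  # assumed convention: space is the last class
    return [blank] + chars


def ctc_greedy_decode(indices, charset, blank_index=0):
    """Collapse repeated indices, drop blanks, and map the rest to characters."""
    out = []
    prev = None
    for idx in indices:
        if idx != prev and idx != blank_index:
            out.append(charset[idx])
        prev = idx
    return "".join(out)
```

Here indices stands for the per-timestep argmax of the recognition model's output. A blank between two equal indices is what allows a doubled letter to survive the collapse.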

Model Specifications

Framework: Converted from PaddlePaddle to ONNX
ONNX Opset: 11
Precision: FP32
Input: RGB images (dynamic size)
Output: Text + confidence scores + bounding boxes

Accuracy Benchmarks

Tested on official PP-OCRv5 datasets:

Greek: 89.28%
Korean: 88.0%
English: 85.25%
Latin: 84.7%
Thai: 82.68%
East Slavic: 81.6%

🎯 Use Cases

Document Digitization - Scan and extract text from documents
Multilingual OCR - Process documents in 39+ languages
Mobile Apps - Lightweight models perfect for mobile deployment
Batch Processing - Process thousands of documents efficiently
Real-time OCR - Fast enough for real-time applications
Custom Pipelines - Integrate into your existing workflows

📝 Language Selection Guide

Your Document	Use This Model
English only	`english/`
French, German, Spanish, Italian, etc.	`latin/` (best choice for European languages)
Russian, Bulgarian, Ukrainian, Belarusian	`eslav/`
Korean	`korean/`
Chinese or Japanese	`chinese/`
Thai	`thai/`
Greek	`greek/`
Mixed European languages	`latin/` (supports 32 languages!)

Pro Tip: The latin/ model is the most versatile - it handles 32 different languages!
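
The guide above can also be expressed as a lookup, which is handy when the document language is known from metadata. This is an illustrative sketch: LANGUAGE_TO_MODEL and choose_model are hypothetical names, and only a subset of the 32 Latin-script languages is listed.

```python
# Language-to-directory lookup built from the selection guide above.
# Only some of the 32 Latin-script languages are listed; extend as needed.
LANGUAGE_TO_MODEL = {
    "english": "english",
    "french": "latin",
    "german": "latin",
    "spanish": "latin",
    "italian": "latin",
    "portuguese": "latin",
    "russian": "eslav",
    "bulgarian": "eslav",
    "ukrainian": "eslav",
    "belarusian": "eslav",
    "korean": "korean",
    "chinese": "chinese",
    "japanese": "chinese",
    "thai": "thai",
    "greek": "greek",
}


def choose_model(languages):
    """Pick one model directory for a document containing the given languages."""
    dirs = set()
    for lang in languages:
        key = lang.strip().lower()
        if key not in LANGUAGE_TO_MODEL:
            raise ValueError(f"no model listed for {lang!r}")
        dirs.add(LANGUAGE_TO_MODEL[key])
    # Every model in the tables also lists English, so English never forces a choice.
    if len(dirs) > 1:
        dirs.discard("english")
    if len(dirs) != 1:
        raise ValueError(f"no single model covers {sorted(languages)}")
    return dirs.pop()
```

For example, choose_model(["English", "Korean"]) selects korean/, while a Thai and Greek mix raises an error because no single recognition model lists both.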

❓ FAQ

Q: Do I need PaddlePaddle installed?
A: No! These are ONNX models. Just install rapidocr-onnxruntime.

Q: Can I use GPU?
A: Yes! Install onnxruntime-gpu instead of onnxruntime.

Q: Which model should I use for French?
A: Use the latin/ model - it supports French and 31 other languages.

Q: Are these models free to use?
A: Yes! Licensed under Apache 2.0.

Q: How accurate are these models?
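A: Recognition accuracy on the official PP-OCRv5 datasets ranges from 81.6% (East Slavic) to 89.28% (Greek), and PP-OCRv5 is reported to be 30% more accurate than PP-OCRv3.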

Q: Can I use these commercially?
A: Yes! Apache 2.0 license allows commercial use.

🔗 Links

Original Models: PaddlePaddle PP-OCRv5 Collection
PaddleOCR GitHub: github.com/PaddlePaddle/PaddleOCR
Documentation: PaddleOCR Docs
RapidOCR: github.com/RapidAI/RapidOCR
ONNX Runtime: onnxruntime.ai

🙏 Credits

Original Models: PaddlePaddle Team
Conversion: Community contribution using paddle2onnx
Based on: PP-OCRv5 Official Collection

📄 License

Apache License 2.0 (inherited from PaddleOCR)

You are free to:

✅ Use commercially
✅ Modify
✅ Distribute
✅ Use privately

🐛 Issues & Support

For issues with:

These ONNX models: Open an issue in this repository
Original PaddleOCR models: PaddleOCR Issues
ONNX Runtime: onnxruntime Issues