This document describes all available OCR models in this project.
All models are sourced from:
- OnnxOCR - ONNX-converted PaddleOCR models
- ppu-paddle-ocr - Additional English models
- ID: ppocrv5-mobile
- Type: Mobile (Fast)
- Best For: General purpose, balanced speed and accuracy
- Languages: English, Chinese
- Components:
  - Detection: ch_PP-OCRv5_det.onnx (4.6MB)
  - Recognition: ch_PP-OCRv5_rec.onnx (16.5MB)
  - Classification: ch_PP-OCRv5_cls.onnx (583KB)

- ID: ppocrv4-mobile
- Type: Mobile (Fast)
- Best For: Stable performance, smaller model size
- Languages: English, Chinese
- Components:
  - Detection: ch_PP-OCRv4_det.onnx (4.6MB)
  - Recognition: ch_PP-OCRv4_rec.onnx (10.8MB)
  - Classification: ch_PP-OCRv4_cls.onnx (583KB)

- ID: ppocrv2-server
- Type: Server (Accurate)
- Best For: High accuracy detection, slower processing
- Languages: Chinese (primary)
- Components:
  - Detection: ch_ppocr_server_v2.0_det.onnx (47MB)
  - Recognition: ch_PP-OCRv4_rec.onnx (10.8MB), fallback from v4
  - Classification: ch_ppocr_server_v2.0_cls.onnx (583KB)

- ID: en-mobile
- Type: Mobile (Very Fast)
- Best For: English-only text, fastest processing
- Languages: English
- Components:
  - Detection: PP-OCRv5_mobile_det.onnx (4.7MB)
  - Recognition: en_PP-OCRv4_mobile_rec.onnx (8.6MB)
  - Classification: ch_PP-OCRv4_cls.onnx (583KB)

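The model list above can be captured as plain data, which makes it easy to compare download sizes. The following is a minimal sketch: the `ModelEntry` and `ModelComponent` types and the `totalSizeMB` helper are hypothetical names used for illustration and are not taken from the project's code. File names and sizes are copied from the list, with 583KB written as 0.583MB.

```typescript
// Hypothetical shapes for illustration; the project's real config type may differ.
interface ModelComponent {
  file: string;
  sizeMB: number;
}

interface ModelEntry {
  id: string;
  languages: string[];
  detection: ModelComponent;
  recognition: ModelComponent;
  classification: ModelComponent;
}

// File names and sizes copied from the model list (583KB written as 0.583MB).
const MODELS: ModelEntry[] = [
  {
    id: "ppocrv5-mobile",
    languages: ["English", "Chinese"],
    detection: { file: "ch_PP-OCRv5_det.onnx", sizeMB: 4.6 },
    recognition: { file: "ch_PP-OCRv5_rec.onnx", sizeMB: 16.5 },
    classification: { file: "ch_PP-OCRv5_cls.onnx", sizeMB: 0.583 },
  },
  {
    id: "ppocrv4-mobile",
    languages: ["English", "Chinese"],
    detection: { file: "ch_PP-OCRv4_det.onnx", sizeMB: 4.6 },
    recognition: { file: "ch_PP-OCRv4_rec.onnx", sizeMB: 10.8 },
    classification: { file: "ch_PP-OCRv4_cls.onnx", sizeMB: 0.583 },
  },
  {
    id: "ppocrv2-server",
    languages: ["Chinese"],
    detection: { file: "ch_ppocr_server_v2.0_det.onnx", sizeMB: 47 },
    recognition: { file: "ch_PP-OCRv4_rec.onnx", sizeMB: 10.8 },
    classification: { file: "ch_ppocr_server_v2.0_cls.onnx", sizeMB: 0.583 },
  },
  {
    id: "en-mobile",
    languages: ["English"],
    detection: { file: "PP-OCRv5_mobile_det.onnx", sizeMB: 4.7 },
    recognition: { file: "en_PP-OCRv4_mobile_rec.onnx", sizeMB: 8.6 },
    classification: { file: "ch_PP-OCRv4_cls.onnx", sizeMB: 0.583 },
  },
];

// Sum the three component sizes, rounded to one decimal place.
function totalSizeMB(model: ModelEntry): number {
  const total =
    model.detection.sizeMB +
    model.recognition.sizeMB +
    model.classification.sizeMB;
  return Math.round(total * 10) / 10;
}

for (const model of MODELS) {
  console.log(`${model.id}: ${totalSizeMB(model)} MB`);
}
```

With the sizes listed above, the totals come to roughly 21.7MB (v5 mobile), 16MB (v4 mobile), 58.4MB (v2 server), and 13.9MB (English mobile), so the server detection model accounts for most of the difference.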
All models follow the PaddleOCR architecture:
- Detection Model (DB)
  - Uses the Differentiable Binarization (DB) algorithm
  - Outputs probability maps for text regions
  - Input: RGB image (any size, resized so each side is a multiple of 32)
  - Output: Binary map of text regions, obtained by thresholding the probability map
- Recognition Model (CRNN)
  - Uses CRNN + CTC for text recognition
  - Processes cropped text regions
  - Input: Grayscale image (48px height)
  - Output: Character sequence probabilities
- Classification Model (CNN)
  - Detects text orientation (0° or 180°)
  - Used to correct upside-down text
  - Input: Text region (48x192px)
  - Output: Orientation probability
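
Two of the steps above involve small calculations that are easy to get wrong: sizing the detector input to a multiple of 32, and deciding when to flip a region by 180°. The sketch below illustrates both under stated assumptions. The function names are hypothetical, the `maxSide` limit of 960 and the 0.9 rotation threshold mirror common PaddleOCR defaults rather than values confirmed for this project, and the classifier output is assumed to be ordered as `[p(0°), p(180°)]`.

```typescript
// Round a dimension to the nearest multiple of 32, never going below 32.
function toMultipleOf32(value: number): number {
  return Math.max(32, Math.round(value / 32) * 32);
}

// Compute the detector input size for an image.
// The longer side is first capped at `maxSide` (assumed default: 960),
// then both sides are snapped to a multiple of 32.
function detectionInputSize(
  width: number,
  height: number,
  maxSide = 960,
): { width: number; height: number } {
  const scale = Math.min(1, maxSide / Math.max(width, height));
  return {
    width: toMultipleOf32(width * scale),
    height: toMultipleOf32(height * scale),
  };
}

// Decide whether a cropped text region should be rotated by 180 degrees.
// `probs` is assumed to be [p(0°), p(180°)]; the threshold is an assumption.
function shouldRotate180(probs: [number, number], threshold = 0.9): boolean {
  return probs[1] > threshold;
}

console.log(detectionInputSize(1280, 720)); // { width: 960, height: 544 }
console.log(shouldRotate180([0.05, 0.95])); // true
```

Snapping to a multiple of 32 slightly changes the aspect ratio (720p becomes 960x544 here), so detected box coordinates need to be scaled back by the per-axis ratio before cropping from the original image.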
Each model configuration includes a dictionary file that maps output indices to characters:
- ppocr_keys_v1.txt - Standard PaddleOCR dictionary (6623 chars)
- ppocrv5_dict.txt - Extended v5 dictionary
- en_dict.txt - English-only dictionary (96 chars)
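
To show how a dictionary turns recognition output into text, here is a minimal sketch of greedy CTC decoding. It assumes the PaddleOCR convention that class 0 is the CTC blank and class `i` (for `i >= 1`) maps to line `i - 1` of the dictionary file. The function name `ctcGreedyDecode` and the toy dictionary are hypothetical and not taken from the project.

```typescript
// Greedy CTC decoding: pick the best class per time step,
// collapse consecutive repeats, and drop blanks.
// Assumes class 0 is the blank and class i maps to dictionary[i - 1].
function ctcGreedyDecode(probs: number[][], dictionary: string[]): string {
  let text = "";
  let previous = -1;
  for (const step of probs) {
    // Argmax over the class probabilities for this time step.
    let best = 0;
    for (let i = 1; i < step.length; i++) {
      if (step[i] > step[best]) best = i;
    }
    // Emit a character only when the class changes and is not the blank.
    if (best !== previous && best !== 0) {
      text += dictionary[best - 1] ?? "";
    }
    previous = best;
  }
  return text;
}

// Toy dictionary; a real one would be read line by line from the dictionary file.
const dictionary = ["a", "b", "c"];
const probs = [
  [0.1, 0.8, 0.05, 0.05], // "a"
  [0.1, 0.8, 0.05, 0.05], // "a" again, collapsed as a repeat
  [0.9, 0.05, 0.03, 0.02], // blank, separates genuine double letters
  [0.1, 0.7, 0.1, 0.1], // "a", emitted because a blank came before it
  [0.1, 0.1, 0.7, 0.1], // "b"
];

console.log(ctcGreedyDecode(probs, dictionary)); // "aab"
```

The dictionary must match the recognition model it ships with: pairing a model with a dictionary of a different length shifts or truncates the index mapping and produces wrong characters.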
| Model | Detection Speed | Recognition Speed | Overall Accuracy |
|---|---|---|---|
| v5 Mobile | Fast | Fast | 95%+ |
| v4 Mobile | Fast | Very Fast | 93%+ |
| v2 Server | Slow | Fast* | 97%+ |
| English Mobile | Very Fast | Very Fast | 94%+ (English) |
*v2 Server uses v4 recognition model as fallback
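
As an illustration of how the comparison table might drive model selection, the sketch below encodes the speed labels as ranks and picks the fastest model for a language. Everything here is hypothetical: the `ModelProfile` type, the `fastestModelFor` helper, the mapping of table rows to model IDs, and the scoring rule (sum of detection and recognition ranks) are assumptions for illustration, not the project's selection logic.

```typescript
// Speed labels from the table, ranked from slowest to fastest.
const SPEED_RANK = { Slow: 0, Fast: 1, "Very Fast": 2 } as const;
type Speed = keyof typeof SPEED_RANK;

// Hypothetical profile shape combining the table with the language lists.
interface ModelProfile {
  id: string;
  detection: Speed;
  recognition: Speed;
  languages: string[];
}

const PROFILES: ModelProfile[] = [
  { id: "ppocrv5-mobile", detection: "Fast", recognition: "Fast", languages: ["English", "Chinese"] },
  { id: "ppocrv4-mobile", detection: "Fast", recognition: "Very Fast", languages: ["English", "Chinese"] },
  { id: "ppocrv2-server", detection: "Slow", recognition: "Fast", languages: ["Chinese"] },
  { id: "en-mobile", detection: "Very Fast", recognition: "Very Fast", languages: ["English"] },
];

// Combined speed score: higher means faster overall (an assumed heuristic).
function speedScore(profile: ModelProfile): number {
  return SPEED_RANK[profile.detection] + SPEED_RANK[profile.recognition];
}

// Return the ID of the fastest model that supports the language, if any.
function fastestModelFor(language: string): string | undefined {
  const candidates = PROFILES.filter((p) => p.languages.includes(language));
  candidates.sort((a, b) => speedScore(b) - speedScore(a));
  return candidates[0]?.id;
}

console.log(fastestModelFor("English")); // "en-mobile"
console.log(fastestModelFor("Chinese")); // "ppocrv4-mobile"
```

A real selector would likely weigh accuracy as well; this version only ranks by the speed columns.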
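import { getModelConfig } from './core/model-config'
// Look up the default model configuration (v5 mobile)
const defaultConfig = getModelConfig('ppocrv5-mobile')
// Initialize the engine with that model's file paths and ID
// (`engine` is assumed to be an OCR engine instance created elsewhere)
await engine.initialize(defaultConfig.paths, defaultConfig.id)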