OnnxTR

Latest version: v0.6.2



0.4.0


<p align="center">
<img src="https://github.com/felixdittrich92/OnnxTR/blob/main/docs/images/logo.jpg" width="50%">
</p>

What's Changed

- Sync with current docTR state
- HF Hub integration

HuggingFace Hub integration

You can now load models from and push models to the hub directly.

Loading

```python
from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor, from_hub

img = DocumentFile.from_images(['<image_path>'])
# Load your model from the hub
model = from_hub('onnxtr/my-model')

# Pass it to the predictor
# If your model is a recognition model:
predictor = ocr_predictor(
    det_arch='db_mobilenet_v3_large',
    reco_arch=model
)

# If your model is a detection model:
predictor = ocr_predictor(
    det_arch=model,
    reco_arch='crnn_mobilenet_v3_small'
)

# Get your predictions
res = predictor(img)
```
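
To consume the predictions programmatically, OnnxTR keeps docTR's result API. A minimal sketch, assuming that parity (the `export` and `render` method names come from docTR's `Document` object, so treat them as an assumption here):

```python
# ASSUMPTION: OnnxTR mirrors docTR's result API (export/render on the returned Document)
json_output = res.export()  # nested dict with pages, blocks, lines, words and geometries
text_output = res.render()  # plain-text rendering of the recognized content
```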


Push

```python
from onnxtr.models import parseq, linknet_resnet18, push_to_hf_hub, login_to_hub
from onnxtr.utils.vocabs import VOCABS

# Login to the hub
login_to_hub()

# Recognition model
model = parseq("~/onnxtr-parseq-multilingual-v1.onnx", vocab=VOCABS["multilingual"])
push_to_hf_hub(
    model,
    model_name="onnxtr-parseq-multilingual-v1",
    task="recognition",  # The task for which the model is intended [detection, recognition, classification]
    arch="parseq",  # The name of the model architecture
    override=False  # Set to `True` if you want to override an existing model / repository
)

# Detection model
model = linknet_resnet18("~/onnxtr-linknet-resnet18.onnx")
push_to_hf_hub(
    model,
    model_name="onnxtr-linknet-resnet18",
    task="detection",
    arch="linknet_resnet18",
    override=True
)
```

HF Hub search: [here](https://huggingface.co/models?search=onnxtr).

Collection: [here](https://huggingface.co/collections/Felix92/onnxtr-66bf213a9f88f7346c90e842)
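
If you prefer to discover models programmatically rather than through the search link above, here is a minimal sketch using the official `huggingface_hub` client (the `search="onnxtr"` filter mirrors the URL query; this is not an OnnxTR API):

```python
from huggingface_hub import HfApi

# List Hub repositories matching "onnxtr" - equivalent to the search link above
api = HfApi()
for model in api.list_models(search="onnxtr"):
    print(model.id)
```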


**Full Changelog**: https://github.com/felixdittrich92/OnnxTR/compare/v0.3.2...v0.4.0

0.3.2



What's Changed

- Fix: Resize transformation / interpolation adjusted to match docTR (#10, #22)

**Full Changelog**: https://github.com/felixdittrich92/OnnxTR/compare/v0.3.1...v0.3.2

0.3.1



What's Changed

- Minor configuration fix for the CUDAExecutionProvider
- Adjusted default batch sizes (see the sketch after this list for overriding them)
- Avoid initializing EngineConfig multiple times
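
For reference, a minimal sketch of overriding the default batch sizes. The `det_bs` / `reco_bs` keyword names are taken from docTR's predictor signature and assumed to carry over to OnnxTR, so treat them as an assumption rather than a documented OnnxTR API:

```python
from onnxtr.models import ocr_predictor

# ASSUMPTION: `det_bs` / `reco_bs` follow docTR's predictor signature
predictor = ocr_predictor(
    det_arch="db_resnet50",
    reco_arch="crnn_vgg16_bn",
    det_bs=2,     # detection batch size
    reco_bs=128,  # recognition batch size
)
```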

**Full Changelog**: https://github.com/felixdittrich92/OnnxTR/compare/v0.3.0...v0.3.1

0.3.0



What's Changed

- Sync with current docTR state
- Added advanced options to configure the underlying execution engine
- Added new `db_mobilenet_v3_large` converted models (fp32 & 8-bit); see the snippet below
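
Selecting the new backbone is just a matter of the `det_arch` name; a minimal sketch combining it with the `load_in_8_bit` flag introduced in v0.2.0:

```python
from onnxtr.models import ocr_predictor

# Use the new detection backbone; load_in_8_bit=True picks the 8-bit converted model
predictor = ocr_predictor(
    det_arch="db_mobilenet_v3_large",
    reco_arch="crnn_vgg16_bn",
    load_in_8_bit=True
)
```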

Advanced engine configuration

```python
from onnxruntime import SessionOptions

from onnxtr.models import ocr_predictor, EngineConfig

# For configuration options see: https://onnxruntime.ai/docs/api/python/api_summary.html#sessionoptions
general_options = SessionOptions()
general_options.enable_cpu_mem_arena = False

# NOTE: The following forces execution onto the GPU only - if no GPU is available, it will raise an error.
# Provide a list of strings, e.g. ["CUDAExecutionProvider", "CPUExecutionProvider"],
# or a list of tuples with the provider and its options, e.g.
# [("CUDAExecutionProvider", {"device_id": 0}), ("CPUExecutionProvider", {"arena_extend_strategy": "kSameAsRequested"})]
# For available providers see: https://onnxruntime.ai/docs/execution-providers/
providers = [("CUDAExecutionProvider", {"device_id": 0})]

engine_config = EngineConfig(
    session_options=general_options,
    providers=providers
)
# We use the default predictor with the custom engine configuration.
# NOTE: You can define different engine configurations for detection, recognition and classification depending on your needs.
predictor = ocr_predictor(
    det_engine_cfg=engine_config,
    reco_engine_cfg=engine_config,
    clf_engine_cfg=engine_config
)
```
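
Since each predictor accepts its own engine configuration, providers can be mixed per task. A small sketch building on the snippet above (same `EngineConfig` API; the provider split is illustrative):

```python
from onnxtr.models import ocr_predictor, EngineConfig

# Illustrative split: detection on GPU, recognition and classification on CPU
gpu_config = EngineConfig(providers=[("CUDAExecutionProvider", {"device_id": 0})])
cpu_config = EngineConfig(providers=["CPUExecutionProvider"])

predictor = ocr_predictor(
    det_engine_cfg=gpu_config,
    reco_engine_cfg=cpu_config,
    clf_engine_cfg=cpu_config
)
```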





**Full Changelog**: https://github.com/felixdittrich92/OnnxTR/compare/v0.2.0...v0.3.0

0.2.0



What's Changed

- Added 8-bit quantized models
- Added Dockerfile and CI for CPU/GPU usage

8-bit quantized models

8-bit quantized variants of all models were added (except the FAST models, which are already reparameterized).

```python
from onnxtr.models import ocr_predictor, detection_predictor, recognition_predictor

predictor = ocr_predictor(det_arch="db_resnet50", reco_arch="crnn_vgg16_bn", load_in_8_bit=True)

det_predictor = detection_predictor("db_resnet50", load_in_8_bit=True)
reco_predictor = recognition_predictor("parseq", load_in_8_bit=True)
```


- CPU benchmarks:

|Library |FUNSD (199 pages) |CORD (900 pages) |
|--------------------------------|-------------------------------|-------------------------------|
|docTR (CPU) - v0.8.1 | ~1.29s / Page | ~0.60s / Page |
|OnnxTR (CPU) - v0.1.2 | ~0.57s / Page | ~0.25s / Page |
|OnnxTR (CPU) 8-bit - v0.1.2 | ~0.38s / Page | ~0.14s / Page |
|EasyOCR (CPU) - v1.7.1 | ~1.96s / Page | ~1.75s / Page |
|PyTesseract (CPU) - v0.3.10 | ~0.50s / Page | ~0.52s / Page |
|Surya (line) (CPU) - v0.4.4 | ~48.76s / Page | ~35.49s / Page |
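
For context on how such per-page figures can be produced, a minimal timing sketch (not the benchmark script behind the table; `sample.pdf` is a placeholder):

```python
import time

from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor

pages = DocumentFile.from_pdf("sample.pdf")  # placeholder document
predictor = ocr_predictor(det_arch="db_resnet50", reco_arch="crnn_vgg16_bn")

start = time.perf_counter()
result = predictor(pages)
print(f"~{(time.perf_counter() - start) / len(pages):.2f}s / page")
```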

0.1.2

This release:

- Fixed some typos
- Updated the README and added a first minimal benchmark
- Cleaned up build dependencies

