<p align="center">
<img src="https://user-images.githubusercontent.com/76527547/135670324-5fee4530-26f9-413b-b6e0-282cdfbd746a.gif" width="50%">
</p>
Highlights of the release:
**Note**: doctr 0.6.0 requires either TensorFlow >= 2.9.0 or PyTorch >= 1.8.0.
Full integration with Huggingface Hub (docTR meets Huggingface)
![hf](https://assets.st-note.com/production/uploads/images/35450010/rectangle_large_type_2_7f287c8bb8ad90f69c4a537719b32ace.png?fit=bounds&quality=85&width=1280)
- Loading from hub:
```python
from doctr.io import DocumentFile
from doctr.models import ocr_predictor, from_hub

image = DocumentFile.from_images(['data/example.jpg'])

# Load a custom detection model from the Huggingface Hub
det_model = from_hub('Felix92/doctr-torch-db-mobilenet-v3-large')

# Load a custom recognition model from the Huggingface Hub
reco_model = from_hub('Felix92/doctr-torch-crnn-mobilenet-v3-large-french')

# You can easily plug these models into the OCR predictor
predictor = ocr_predictor(det_arch=det_model, reco_arch=reco_model)
result = predictor(image)
```
- Pushing to the hub:
```python
from doctr.models import recognition, login_to_hub, push_to_hf_hub

# Log in to the Hub before pushing
login_to_hub()

my_awesome_model = recognition.crnn_mobilenet_v3_large(pretrained=True)
push_to_hf_hub(
    my_awesome_model,
    model_name='doctr-crnn-mobilenet-v3-large-french-v1',
    task='recognition',
    arch='crnn_mobilenet_v3_large',
)
```
Documentation: https://mindee.github.io/doctr/using_doctr/sharing_models.html
Predefined datasets can now also be used for the recognition task
```python
from doctr.datasets import CORD

# Crop boxes as-is (crops can be irregular)
train_set = CORD(train=True, download=True, recognition_task=True)

# Crop rotated boxes (crops are always regular)
train_set = CORD(train=True, download=True, use_polygons=True, recognition_task=True)

img, target = train_set[0]
```
Documentation: https://mindee.github.io/doctr/using_doctr/using_datasets.html
New models (both frameworks)
- classification: VisionTransformer (ViT)
- recognition: Vision Transformer for Scene Text Recognition (ViTSTR)
Bug fixes for recognition models
- MASTER and SAR architectures are now operational in both frameworks (TensorFlow and PyTorch)
ONNX support (experimental)
- All models can now be exported to ONNX format (only the TF MobileNet models remain, deferred to 0.7.0)
NOTE: a full production pipeline with ONNX / build is planned for 0.7.0 (for now, models can only be exported up to the logits, without any post-processing included)
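
Since exported models stop at the logits, any decoding has to be done by the caller for now. As an illustration only, here is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is not docTR's post-processor: the function name `ctc_greedy_decode`, the toy vocabulary, and the convention that the blank token is the last class index are all assumptions made for this example, and it only applies to CTC-based recognition models.

```python
from typing import List, Sequence


def ctc_greedy_decode(logits: Sequence[Sequence[float]], vocab: str) -> str:
    """Turn per-timestep logits into a string with best-path CTC decoding.

    Hypothetical helper for illustration; assumes the blank token is the
    last class index (len(vocab)).
    """
    blank = len(vocab)
    chars: List[str] = []
    previous = blank
    for step in logits:
        # Pick the most likely class at this timestep
        best = max(range(len(step)), key=step.__getitem__)
        # Keep it only if it is not blank and not a repeat of the previous step
        if best != blank and best != previous:
            chars.append(vocab[best])
        previous = best
    return "".join(chars)


vocab = "ab"
logits = [
    [2.0, 0.1, 0.3],  # 'a'
    [1.5, 0.2, 0.4],  # 'a' repeated, merged with the previous step
    [0.1, 0.2, 3.0],  # blank
    [1.9, 0.3, 0.2],  # 'a' again after a blank, kept
    [0.2, 2.5, 0.1],  # 'b'
]
decoded = ctc_greedy_decode(logits, vocab)
print(decoded)  # aab
```

In practice the logits would come from running the exported model in an ONNX runtime, and the vocabulary would be the one the model was trained with.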
Further features
- our demo is now also PyTorch compatible, thanks to odulcy-mindee
- it is now possible to detect the language of the extracted text, thanks to aminemindee
What's Changed
Breaking Changes 🛠
* feat: :sparkles: allow beam width > 1 in the CRNN postprocessor by khalidMindee in https://github.com/mindee/doctr/pull/630
* [Fix] TensorFlow SAR_Resnet31 implementation by felixdittrich92 in https://github.com/mindee/doctr/pull/925
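
To give an idea of why a beam width above 1 can matter for a CRNN-style (CTC) postprocessor, here is a small, self-contained sketch of CTC prefix beam search. It is not the code from the pull request above: the function name `ctc_beam_search`, the toy probabilities, and the blank-is-last-index convention are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Prefix = Tuple[int, ...]


def ctc_beam_search(probs: Sequence[Sequence[float]], vocab: str, beam_width: int = 2) -> str:
    """Hypothetical CTC prefix beam search; blank is assumed to be index len(vocab)."""
    blank = len(vocab)
    # For each prefix: (probability of paths ending in blank, ending in a character)
    beams: Dict[Prefix, Tuple[float, float]] = {(): (1.0, 0.0)}
    for step in probs:
        candidates: Dict[Prefix, List[float]] = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_char) in beams.items():
            total = p_blank + p_char
            for idx, p in enumerate(step):
                if idx == blank:
                    # A blank leaves the prefix unchanged
                    candidates[prefix][0] += total * p
                elif prefix and prefix[-1] == idx:
                    # Same character without a blank in between: merged
                    candidates[prefix][1] += p_char * p
                    # Same character after a blank: a genuine repeat
                    candidates[prefix + (idx,)][1] += p_blank * p
                else:
                    candidates[prefix + (idx,)][1] += total * p
        # Keep only the beam_width most probable prefixes
        ranked = sorted(candidates.items(), key=lambda item: sum(item[1]), reverse=True)
        beams = {prefix: (scores[0], scores[1]) for prefix, scores in ranked[:beam_width]}
    best = max(beams, key=lambda prefix: sum(beams[prefix]))
    return "".join(vocab[i] for i in best)


vocab = "a"
probs = [[0.4, 0.6], [0.4, 0.6]]  # columns: 'a', blank
narrow = ctc_beam_search(probs, vocab, beam_width=1)
wide = ctc_beam_search(probs, vocab, beam_width=2)
print(repr(narrow))  # ''
print(repr(wide))    # 'a'
```

With a width of 1 the search follows the blank at every step and returns an empty string (probability 0.36), while a width of 2 finds that the paths collapsing to "a" sum to 0.64.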
New Features
* [onnx] classification models export by felixdittrich92 in https://github.com/mindee/doctr/pull/830
* feat: Added Vietnamese entry in VOCAB by calibretaliation in https://github.com/mindee/doctr/pull/878
* feat: Added Czech to the set of vocabularies in datasets/vocabs.py by Xargonus in https://github.com/mindee/doctr/pull/885
* feat: Add ability to upload PT/TF models to Huggingface Hub by felixdittrich92 in https://github.com/mindee/doctr/pull/881
* [feature][tf/pt] integrate from_hub for all tasks by felixdittrich92 in https://github.com/mindee/doctr/pull/892
* [feature] Part 2 from use datasets for recognition by felixdittrich92 in https://github.com/mindee/doctr/pull/891
* [datasets] Add MJSynth (Synth90K) by felixdittrich92 in https://github.com/mindee/doctr/pull/827
* [docu]: add documentation for datasets by felixdittrich92 in https://github.com/mindee/doctr/pull/905
* add a Slack Community badge by fharper in https://github.com/mindee/doctr/pull/936
* Feat/add language detection by aminemindee in https://github.com/mindee/doctr/pull/1023
* add ViT as classification model TF and PT by felixdittrich92 in https://github.com/mindee/doctr/pull/1050
* [models] add ViTSTR TF and PT and update ViT to work as backbone by felixdittrich92 in https://github.com/mindee/doctr/pull/1055
Bug Fixes
* [PyTorch][references] fix pretrained with different vocabs by felixdittrich92 in https://github.com/mindee/doctr/pull/874
* [classification] Fix cfgs by felixdittrich92 in https://github.com/mindee/doctr/pull/883
* docs: Fixed typo in installation instructions by frgfm in https://github.com/mindee/doctr/pull/901
* [Fix] imgur5k test by felixdittrich92 in https://github.com/mindee/doctr/pull/903
* fix: Fixed load_pretrained_params in PyTorch when ignoring keys by frgfm in https://github.com/mindee/doctr/pull/902
* [Fix]: Documentation add missing in vocabs and correct tab in sharing models by felixdittrich92 in https://github.com/mindee/doctr/pull/904
* Fix links in readme by jsn5 in https://github.com/mindee/doctr/pull/937
* [Fix] PyTorch MASTER implementation by felixdittrich92 in https://github.com/mindee/doctr/pull/941
* [Fix] MJSynth dataset: filter corrupted or missing images by felixdittrich92 in https://github.com/mindee/doctr/pull/956
* [Fix] SVT dataset: clip box values and add shape and label check by felixdittrich92 in https://github.com/mindee/doctr/pull/955
* [Fix] Tensorflow MASTER implementation by felixdittrich92 in https://github.com/mindee/doctr/pull/949
* [FIX] MASTER AMP and onnxruntime issue with master PT by felixdittrich92 in https://github.com/mindee/doctr/pull/986
* pytest-api test: fix ping server step by odulcy-mindee in https://github.com/mindee/doctr/pull/997
* docs/index: fix two minor typos by mara004 in https://github.com/mindee/doctr/pull/1002
* Fix orientation details export by aminemindee in https://github.com/mindee/doctr/pull/1022
* Changed return type of multithread_exec to iterator by mtvch in https://github.com/mindee/doctr/pull/1019
* [datasets] Fix recognition parts of SynthText and IMGUR5K by felixdittrich92 in https://github.com/mindee/doctr/pull/1038
* [Fix] rotation classifier input move to model device by felixdittrich92 in https://github.com/mindee/doctr/pull/1039
* [models] Vit: fix intermediate size scale and unify TF to PT by felixdittrich92 in https://github.com/mindee/doctr/pull/1063
Improvements
* chore: Applied post release modifications v0.5.1 by felixdittrich92 in https://github.com/mindee/doctr/pull/870
* [refactor][fix]: Part1 from use datasets for recognition task by felixdittrich92 in https://github.com/mindee/doctr/pull/889
* ci: Add swagger ping in API CI job by frgfm in https://github.com/mindee/doctr/pull/906
* [docs] Add naming conventions for upload models to hf hub by felixdittrich92 in https://github.com/mindee/doctr/pull/921
* docs: Improved error message of encode_string by frgfm in https://github.com/mindee/doctr/pull/929
* [Refactor] PyTorch SAR_Resnet31 make it ONNX exportable (again) by felixdittrich92 in https://github.com/mindee/doctr/pull/930
* Add support page in README by jonathanMindee in https://github.com/mindee/doctr/pull/946
* [references] Add eval recognition and update eval detection scripts by felixdittrich92 in https://github.com/mindee/doctr/pull/933
* update pypdfium2 dep and improve code quality by felixdittrich92 in https://github.com/mindee/doctr/pull/953
* docs: Moved need help section after code snippet by frgfm in https://github.com/mindee/doctr/pull/959
* chore: Updated TF requirements to fix grouped convolutions on CPU by frgfm in https://github.com/mindee/doctr/pull/963
* style: Fixed mypy and moved tool configs to pyproject.toml by frgfm in https://github.com/mindee/doctr/pull/966
* Updating the readme by Atomme1 in https://github.com/mindee/doctr/pull/938
* Update docs in `using_doctr` by odulcy-mindee in https://github.com/mindee/doctr/pull/993
* feat: add a basic example of text detection by ianardee in https://github.com/mindee/doctr/pull/999
* Add pytorch demo by odulcy-mindee in https://github.com/mindee/doctr/pull/1008
* [build] move requirements to pyproject.toml by felixdittrich92 in https://github.com/mindee/doctr/pull/1031
* Migrate static data from github to monitoring middleware. by marvinmindee in https://github.com/mindee/doctr/pull/1033
* Changes needed to be able to use doctr on AWS Lambda by mtvch in https://github.com/mindee/doctr/pull/1017
* [Fix] unify recognition dataset parts return signature by felixdittrich92 in https://github.com/mindee/doctr/pull/1041
* Updated README.md for custom fonts by carl-krikorian in https://github.com/mindee/doctr/pull/1051
* [refactor] detection script by felixdittrich92 in https://github.com/mindee/doctr/pull/1060
* [models] ViT add checkpoints and some rework to use pretrained ViT backbone in ViTSTR by felixdittrich92 in https://github.com/mindee/doctr/pull/1072
* upgrade pypdfium2 by felixdittrich92 in https://github.com/mindee/doctr/pull/1075
* ViTSTR disable pretrained backbone by default by felixdittrich92 in https://github.com/mindee/doctr/pull/1080
Miscellaneous
* [Refactor] commit tags by felixdittrich92 in https://github.com/mindee/doctr/pull/871
* Update `io/pdf.py` to new pypdfium2 API by mara004 in https://github.com/mindee/doctr/pull/944
* docs: Documentation the reason for keras version specifier by frgfm in https://github.com/mindee/doctr/pull/958
* [datasets] update IC / SROIE / FUNSD / CORD by felixdittrich92 in https://github.com/mindee/doctr/pull/983
* [datasets] revert whitespace filtering and fix svhn reco by felixdittrich92 in https://github.com/mindee/doctr/pull/987
* fix: update tensorflow-addons to match tensorflow version by ianardee in https://github.com/mindee/doctr/pull/998
* move transformers implementation to modules by felixdittrich92 in https://github.com/mindee/doctr/pull/1013
* [FIX] revert dev deps mistake by felixdittrich92 in https://github.com/mindee/doctr/pull/1047
* [models] update vit and transformer layer norm by felixdittrich92 in https://github.com/mindee/doctr/pull/1059
* make pretrained backbone flexible in predictor by felixdittrich92 in https://github.com/mindee/doctr/pull/1061
* handle LocalizationConfusion memory consuption and upgrade min weasyprint version by felixdittrich92 in https://github.com/mindee/doctr/pull/1062
* Fixed small typo in references recognition by carl-krikorian in https://github.com/mindee/doctr/pull/1070
* [docs] install extras for MacBooks with M1 chip by felixdittrich92 in https://github.com/mindee/doctr/pull/1076
* update version for minor release by felixdittrich92 in https://github.com/mindee/doctr/pull/1073
New Contributors
* calibretaliation made their first contribution in https://github.com/mindee/doctr/pull/878
* Xargonus made their first contribution in https://github.com/mindee/doctr/pull/885
* khalidMindee made their first contribution in https://github.com/mindee/doctr/pull/630
* frgfm made their first contribution in https://github.com/mindee/doctr/pull/901
* jsn5 made their first contribution in https://github.com/mindee/doctr/pull/937
* fharper made their first contribution in https://github.com/mindee/doctr/pull/936
* jonathanMindee made their first contribution in https://github.com/mindee/doctr/pull/946
* Atomme1 made their first contribution in https://github.com/mindee/doctr/pull/938
* odulcy-mindee made their first contribution in https://github.com/mindee/doctr/pull/993
* ianardee made their first contribution in https://github.com/mindee/doctr/pull/998
* aminemindee made their first contribution in https://github.com/mindee/doctr/pull/1022
* mtvch made their first contribution in https://github.com/mindee/doctr/pull/1019
* marvinmindee made their first contribution in https://github.com/mindee/doctr/pull/1033
* carl-krikorian made their first contribution in https://github.com/mindee/doctr/pull/1051
**Full Changelog**: https://github.com/mindee/doctr/compare/v0.5.1...v0.6.0