- Fix performance issue when processing PDFs with a large number of pages
- Complete overhaul of the borderless table detection algorithm
0.0.19
- Fix bug when text size cannot be computed from OCR
- Fix bug in image rotation when no text is retrieved from OCR
- Improve bordered table cell creation
- Add PSM option for TesseractOCR
0.0.18
- Patch breaking change introduced by polars update
- Add Dependabot to the project
0.0.17
- Add support for Python 3.11 (excluding PaddleOCR)
- Fix bug on borderless tables
0.0.16
**Bordered tables**
- Adapt line detection parameters with OCR results if available
- Fix bug in cell normalization causing malformed output tables
0.0.15
- Add installation validation before using Tesseract
- Fix issues with image preprocessing before table identification
- Borderless tables:
  - Improve row detection for multiline rows
  - Improve consistency of table extraction across all OCRs