- Added support for Doc2X's asynchronous functionality.
- Added a processing flow designed specifically for Doc2X: each page of the processed PDF contains the content of the corresponding page of the source document.
0.0.4
- Added Doc2X as a new OCR engine.
- Doc2X's conversion function can also be used on its own: `from pdfdeal.doc2x import Doc2x`
0.0.3
- Fixed a build bug.
- `easyocr` or `pytesseract` can now be used as the OCR engine, or OCR can be skipped entirely.
- Improved package installation.
- Fixed a bug where the output PDF had no line breaks.
0.0.2
- `easyocr` or `pytesseract` can now be used as the OCR engine, or OCR can be skipped entirely.
- Improved package installation.
- Fixed a bug where the output PDF had no line breaks.