Added
- Support for Python 3.8.
([86](https://github.com/HazyResearch/pdftotree/pull/86), [HiromuHota][HiromuHota])
Changed
- Switch the output format from "HTML-like" to hOCR.
([62](https://github.com/HazyResearch/pdftotree/pull/62), [HiromuHota][HiromuHota])
- Loosen the Keras version restriction, which had become unnecessarily strict.
([68](https://github.com/HazyResearch/pdftotree/pull/68), [HiromuHota][HiromuHota])
- Greedily extract contents from a PDF even if it looks scanned.
([71](https://github.com/HazyResearch/pdftotree/pull/71), [HiromuHota][HiromuHota])
- Upgrade Keras to 2.4.0 or later (and TensorFlow 2.2 or later).
([86](https://github.com/HazyResearch/pdftotree/pull/86), [HiromuHota][HiromuHota])
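
For readers new to hOCR, here is a minimal sketch of reading word bounding boxes from an hOCR document with the Python standard library. It assumes word-level elements carry the `ocrx_word` class and a `bbox x0 y0 x1 y1` property in their `title` attribute, as the hOCR specification describes. The sample markup and the helper names (`parse_bbox`, `HocrWordParser`, `extract_words`) are illustrative assumptions, not pdftotree's API, and pdftotree's actual output may use different element classes.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    # hOCR properties are separated by semicolons, e.g. "bbox 1 2 3 4; x_wconf 95".
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) pairs from elements with class "ocrx_word".

    Assumes word elements contain plain text only (no nested markup).
    """

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bbox of the word currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def extract_words(hocr):
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hypothetical hOCR fragment, used only for illustration.
SAMPLE = """<div class='ocr_page' title='bbox 0 0 612 792'>
<span class='ocr_line' title='bbox 72 90 200 104'>
<span class='ocrx_word' title='bbox 72 90 110 104'>Hello</span>
<span class='ocrx_word' title='bbox 116 90 200 104; x_wconf 95'>world</span>
</span></div>"""

if __name__ == "__main__":
    for text, bbox in extract_words(SAMPLE):
        print(text, bbox)
```
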
Removed
- Remove the "favor_figures" option; everything is now extracted.
([77](https://github.com/HazyResearch/pdftotree/pull/77), [HiromuHota][HiromuHota])
- Remove the "dry_run" option.
([89](https://github.com/HazyResearch/pdftotree/pull/89), [HiromuHota][HiromuHota])
Fixed
- Fix a bug where an HTML file was not created at the given path.
([64](https://github.com/HazyResearch/pdftotree/pull/64), [HiromuHota][HiromuHota])
- Extract LTChar objects even if they are not children of an LTTextLine.
([79](https://github.com/HazyResearch/pdftotree/pull/79), [HiromuHota][HiromuHota])
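
The idea behind the LTChar fix can be sketched as a recursive walk that collects characters wherever they sit in the layout tree, not only under text lines. The classes below (`Char`, `Container`, `TextLine`) are hypothetical stand-ins for pdfminer's layout objects so that the example runs without extra dependencies. This is a sketch of the general technique, not the code from the pull request.

```python
class Char:
    """Stand-in for a single character in the layout tree."""

    def __init__(self, text):
        self.text = text


class Container:
    """Stand-in for any layout object that holds child objects."""

    def __init__(self, *children):
        self.children = list(children)

    def __iter__(self):
        return iter(self.children)


class TextLine(Container):
    """Stand-in for a text line, the usual parent of characters."""


def iter_chars(node):
    """Yield every Char under node, regardless of its parent's type."""
    if isinstance(node, Char):
        yield node
    elif isinstance(node, Container):
        for child in node:
            yield from iter_chars(child)


# A page with characters inside a text line, inside a non-line
# container (for example a figure), and directly on the page.
page = Container(
    TextLine(Char("a"), Char("b")),
    Container(Char("c")),
    Char("d"),
)

if __name__ == "__main__":
    print("".join(c.text for c in iter_chars(page)))
```

A walk that only descended into `TextLine` nodes would miss the characters held by the other container and by the page itself.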