Amazon-textract-textractor

Latest version: v1.8.5

Page 4 of 8

1.7.0

What's Changed

* Loosen XlsxWriter version constraints by mdscruggs in https://github.com/aws-samples/amazon-textract-textractor/pull/292
* Rework the linearization heuristic to ensure that no words are missing or duplicated
* Fix KeyValues being assigned twice on overlapping table cells; going forward, KVs inside a table are ignored (table structure takes precedence)
* Harden parser code against missing children in layouts or KeyValues with missing keys
* Fix markdown tables not having header rows when one of the cells is empty
* Add support for Python 3.11 and 3.12 in the GitHub Actions workflows
* Add `textractor.__version__` to allow easier identification of the installed Textractor version in code
* Add `hide_table_layout`
* Remove amazon-textract-response-parser as a dependency, since its use for validating the input schema could add 200 ms of latency in some cases. Textractor-only parsing takes <30 ms.
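
The `textractor.__version__` attribute mentioned above can be read at runtime to identify the installed release. A minimal sketch, assuming the package is importable as `textractor`; the helper name `get_package_version` is hypothetical (not part of Textractor) and returns `None` when the module is missing or does not define `__version__`:

```python
import importlib
from typing import Optional


def get_package_version(module_name: str = "textractor") -> Optional[str]:
    """Return the module's __version__, or None if it is unavailable.

    Hypothetical helper for illustration; not part of Textractor itself.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in this environment.
        return None
    # Per the notes above, __version__ was added in 1.7.0, so older
    # installs may not expose it.
    return getattr(module, "__version__", None)
```

Calling `get_package_version()` with no arguments looks up `textractor`; a `None` result may mean either a missing install or a release older than 1.7.0.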

Breaking changes
* Remove `linearize_table` and `linearize_key_value` from `TextLinearizationConfig` as both were not used
* Remove the `s3_output_path` parameter from `analyze_expense` as the API does not support outputting to S3
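
Code that still passes the removed linearization options may need a small migration step. The sketch below assumes the settings are held in a plain dict of keyword arguments; `strip_removed_options` and the `other_option` key are hypothetical names used only for illustration, not part of Textractor:

```python
from typing import Any, Dict

# Options removed from TextLinearizationConfig in 1.7.0, per the notes above.
REMOVED_OPTIONS = ("linearize_table", "linearize_key_value")


def strip_removed_options(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs without the options removed in 1.7.0."""
    return {key: value for key, value in kwargs.items()
            if key not in REMOVED_OPTIONS}


# "other_option" is a placeholder key, not a real Textractor setting.
legacy = {"linearize_table": True, "linearize_key_value": False, "other_option": 1}
cleaned = strip_removed_options(legacy)
```

Since the release notes describe both options as unused, dropping them should not change the linearized output, though that is worth confirming against your own documents.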

New Contributors
* mdscruggs made their first contribution in https://github.com/aws-samples/amazon-textract-textractor/pull/292

**Full Changelog**: https://github.com/aws-samples/amazon-textract-textractor/compare/v1.6.1...v1.7.0

1.6.1

What's new

- Fix bug in table-to-markdown conversion

**Full Changelog**: https://github.com/aws-samples/amazon-textract-textractor/compare/v1.6.0...v1.6.1

1.6.0

What's Changed
* Fix selection elements in table by Belval in https://github.com/aws-samples/amazon-textract-textractor/pull/289


**Full Changelog**: https://github.com/aws-samples/amazon-textract-textractor/compare/v1.5.0...v1.6.0

1.5.0

What's Changed

* Add GetResult from S3 in LazyDocument
* Add more linearization formatting options
* Fix exception thrown when a CHILD relationship maps to a non-existent LINE
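
The CHILD-relationship fix above concerns block IDs that point at nothing. As an illustration only (this is not Textractor's parser code), the sketch below resolves CHILD relationships in a Textract-style block list and skips dangling IDs instead of raising. The function name `resolve_children` and the sample blocks are assumptions:

```python
from typing import Any, Dict, List


def resolve_children(block: Dict[str, Any],
                     blocks_by_id: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the child blocks of `block`, skipping IDs with no matching block."""
    children = []
    # "Relationships" may be absent or None on leaf blocks.
    for relationship in block.get("Relationships") or []:
        if relationship.get("Type") != "CHILD":
            continue
        for child_id in relationship.get("Ids", []):
            child = blocks_by_id.get(child_id)
            if child is None:
                # Dangling reference: ignore it rather than raise KeyError.
                continue
            children.append(child)
    return children


# Sample data: the layout block references one existing LINE and one missing ID.
blocks = [
    {"Id": "layout-1", "BlockType": "LAYOUT_TEXT",
     "Relationships": [{"Type": "CHILD", "Ids": ["line-1", "line-missing"]}]},
    {"Id": "line-1", "BlockType": "LINE", "Text": "Hello"},
]
blocks_by_id = {b["Id"]: b for b in blocks}
found = resolve_children(blocks[0], blocks_by_id)
```

Here `found` contains only the `line-1` block; the missing ID is silently dropped, which is one possible policy (logging a warning would be another).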

**Full Changelog**: https://github.com/aws-samples/amazon-textract-textractor/compare/v1.4.5...v1.5.0

1.4.5

What's Changed

* Fix missing words in get_text_and_words by Belval in https://github.com/aws-samples/amazon-textract-textractor/pull/270


**Full Changelog**: https://github.com/aws-samples/amazon-textract-textractor/compare/v1.4.4...v1.4.5

1.4.4

What's Changed

* Add page_layout property to Page object by Belval in https://github.com/aws-samples/amazon-textract-textractor/pull/268


**Full Changelog**: https://github.com/aws-samples/amazon-textract-textractor/compare/v1.4.3...v1.4.4

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.