Unstructured

Latest version: v0.17.2

Safety actively analyzes 723607 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

Page 19 of 39

0.9.2

Not secure
Enhancements

* Update table extraction section in API documentation to sync with change in Prod API
* Update Notion connector to extract to html
* Added UUID option for `element_id`
* Bump unstructured-inference==0.5.9:
- better caching of models
- another version of detectron2 available, though the default layout model is unchanged
* Added UUID option for element_id
* Added UUID option for element_id
* CI improvements to run ingest tests in parallel

Features

* Adds Sharepoint connector.

Fixes

* Bump unstructured-inference==0.5.9:
- ignores Tesseract errors where no text is extracted for tiles that indeed, have no text

0.9.1

Not secure
Enhancements

* Adds --partition-pdf-infer-table-structure to unstructured-ingest.
* Enable `partition_html` to skip headers and footers with the `skip_headers_and_footers` flag.
* Update `partition_doc` and `partition_docx` to track emphasized texts in the output
* Adds post processing function `filter_element_types`
* Set the default strategy for partitioning images to `hi_res`
* Add page break parameter section in API documentation to sync with change in Prod API
* Update `partition_html` to track emphasized texts in the output
* Update `XMLDocument._read_xml` to create `<p>` tag element for the text enclosed in the `<pre>` tag
* Add parameter `include_tail_text` to `_construct_text` to enable (skip) tail text inclusion
* Add Notion connector

Features

Fixes

* Remove unused `_partition_via_api` function
* Fixed emoji bug in `partition_xlsx`.
* Pass `file_filename` metadata when partitioning file object
* Skip ingest test on missing Slack token
* Add Dropbox variables to CI environments
* Remove default encoding for ingest
* Adds new element type `EmailAddress` for recognising email address in the text
* Simplifies `min_partition` logic; makes partitions falling below the `min_partition`
less likely.
* Fix bug where ingest test check for number of files fails in smoke test
* Fix unstructured-ingest entrypoint failure

0.9.0

Not secure
Enhancements

* Dependencies are now split by document type, creating a slimmer base installation.

0.8.8

Not secure
Enhancements

Features

Fixes

* Rename "date" field to "last_modified"
* Adds Box connector

Fixes

0.8.7

Not secure
Enhancements

* Put back useful function `split_by_paragraph`

Features

Fixes

* Fix argument order in NLTK download step

0.8.6

Not secure
Enhancements

Features

Fixes

* Remove debug print lines and non-functional code

Page 19 of 39

Links

Releases

Has known vulnerabilities

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.