Unstructured

Latest version: v0.17.2

0.3.7

Fixes

* **Correct fsspec connectors date metadata field types** - sftp, azure, box and gcs
* **Fix Kafka source connection problems**
* **Fix Azure AI Search session handling**
* **Fixes issue with SingleStore Source Connector not being available**
* **Fixes issue with SQLite Source Connector using wrong Indexer** - Caused indexer config parameter error when trying to use SQLite Source
* **Fixes issue with Snowflake Destination Connector `nan` values** - `nan` values were not properly replaced with `None`
* **Fixes Snowflake source `'SnowflakeCursor' object has no attribute 'mogrify'` error**
* **Box source connector can now use raw JSON as access token instead of file path to JSON**
* **Fix fsspec upload paths to be OS independent**
* **Properly log Elasticsearch upload errors**
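
The Snowflake `nan` fix above describes replacing `nan` values with `None` before upload. The following is a minimal sketch of that idea in plain Python, not the connector's actual code; the helper name `replace_nan_with_none` and the sample row are hypothetical.

```python
import math


def replace_nan_with_none(row: dict) -> dict:
    """Return a copy of `row` with float NaN values replaced by None.

    Hypothetical helper for illustration only; the real connector may
    perform this replacement differently (for example on a DataFrame).
    """
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in row.items()
    }


# Sample record with a missing numeric field represented as NaN.
row = {"id": "abc", "page_number": float("nan"), "text": "hello"}
cleaned = replace_nan_with_none(row)
```

Checking `isinstance(value, float)` first matters because `math.isnan` raises a `TypeError` on strings and other non-numeric values.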

Enhancements

* **Kafka source connector has new field: `group_id`**
* **Support personal access token for Confluence auth**
* **Leverage deterministic id for uploaded content**
* **Makes multiple SQL connectors (Snowflake, SingleStore, SQLite) more robust against SQL injection.**
* **Optimizes memory usage of Snowflake Destination Connector.**
* **Added Qdrant Cloud integration test**
* **Add DuckDB destination connector** Adds support for storing artifacts in a local DuckDB database.
* **Add MotherDuck destination connector** Adds support for storing artifacts in a MotherDuck database.
* **Update Weaviate v2 example**
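
The release notes do not say how the SQL connectors were hardened against SQL injection. A common approach is to pass values as bound parameters instead of formatting them into the SQL string. The sketch below shows that technique with the standard-library `sqlite3` module; the table name `elements` and the sample values are assumptions made for illustration.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elements (id TEXT PRIMARY KEY, text TEXT)")

# A value that would break out of a naively formatted SQL string.
malicious = "x'); DROP TABLE elements; --"

# The `?` placeholders let the driver bind values as data, so the
# payload is stored verbatim rather than executed as SQL.
conn.execute("INSERT INTO elements (id, text) VALUES (?, ?)", ("1", malicious))

rows = conn.execute("SELECT text FROM elements WHERE id = ?", ("1",)).fetchall()
```

Placeholders only cover values. Table and column names cannot be bound this way and would need to be validated separately.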

0.3.6

Fixes

* **Fix Azure AI Search error handling**

0.3.5

Not secure
* Add support for local inference
* Add new pattern to recognize plain text dash bullets
* Add test for bullet patterns
* Fix for `partition_html` that allows for processing `div` tags that have both text and child elements
* Add ability to extract document metadata from `.docx`, `.xlsx`, and `.jpg` files.
* Helper functions for identifying and extracting phone numbers
* Add new function `extract_attachment_info` that extracts and decodes the attachment of an email.
* Staging brick to convert a list of `Element`s to a `pandas` dataframe.
* Add plain text functionality to `partition_email`
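
To illustrate what extracting and decoding an email attachment involves, here is a sketch that uses only the standard-library `email` package. It is not the implementation of `extract_attachment_info`; the helper `list_attachments` and the sample message are hypothetical.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Build a small message with one binary attachment.
msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("See attached.")
msg.add_attachment(
    b"hello world",
    maintype="application",
    subtype="octet-stream",
    filename="note.bin",
)

# Round-trip through bytes, as if the message had been read from disk.
parsed = message_from_bytes(msg.as_bytes(), policy=policy.default)


def list_attachments(message):
    """Return the filename and decoded payload of each attachment."""
    return [
        {
            "filename": part.get_filename(),
            # decode=True reverses the transfer encoding (base64 here).
            "payload": part.get_payload(decode=True),
        }
        for part in message.iter_attachments()
    ]


attachments = list_attachments(parsed)
```

Parsing with `policy.default` is what makes `iter_attachments` available, since it yields `EmailMessage` objects rather than the legacy `Message` class.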

0.3.4

Not secure
* Python 3.7 compatibility

0.3.3

Not secure
* Removes BasicConfig from logger configuration
* Adds the `partition_email` partitioning brick
* Adds the `replace_mime_encodings` cleaning bricks
* Small fix to HTML parsing related to processing list items with sub-tags
* Add `EmailElement` data structure to store email documents

0.3.2

Not secure
* Added `translate_text` brick for translating text between languages
* Add an `apply` method to make it easier to apply cleaners to elements
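
The `apply` entry above describes running cleaners over an element's text. The sketch below shows one way such a method could work. `TextElement` is a hypothetical stand-in class, not the library's element type, and its signature is an assumption.

```python
from typing import Callable


class TextElement:
    """Hypothetical stand-in for a text-bearing document element."""

    def __init__(self, text: str) -> None:
        self.text = text

    def apply(self, *cleaners: Callable[[str], str]) -> None:
        """Run each cleaner over the text, in order, updating it in place."""
        for cleaner in cleaners:
            cleaned = cleaner(self.text)
            if not isinstance(cleaned, str):
                raise ValueError("Cleaner must return a string.")
            self.text = cleaned


element = TextElement("  Hello   WORLD  ")
# Strip the ends, lowercase, then collapse internal runs of whitespace.
element.apply(str.strip, str.lower, lambda t: " ".join(t.split()))
```

Taking cleaners as plain `str -> str` callables keeps them composable: any function with that shape can be chained without the element knowing about it.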
