* Adds console_entrypoint for unstructured-ingest, other structure/doc updates related to ingest.
* Add `parser` parameter to `partition_html`.
0.4.11
* Adds `partition_doc` for partitioning Word documents in `.doc` format. Requires `libreoffice`.
* Adds `partition_ppt` for partitioning PowerPoint documents in `.ppt` format. Requires `libreoffice`.
0.4.10
* Fixes `ElementMetadata` so that it's JSON serializable when the filename is a `Path` object.
0.4.9
* Added ingest modules, an S3 connector, and a sample ingest script
* Default to `url=None` for `partition_pdf` and `partition_image`
* Add ability to skip the English-specific check by setting the `UNSTRUCTURED_LANGUAGE` env var to `""`.
* Document `Element` objects now track metadata
0.4.8
* Modified XML and HTML parsers not to load comments.