Markitdown

Latest version: v0.1.1

0.1.1

What's Changed

`convert_url` renamed to `convert_uri`, and now handles data and file URIs by afourney in https://github.com/microsoft/markitdown/pull/1153

**NOTE**: `convert_url` remains an alias to `convert_uri`, for backward compatibility.

Both now accept file URIs and data URIs:

e.g.,

```python
from markitdown import MarkItDown

markitdown = MarkItDown()
result = markitdown.convert_uri("file:///path/to/file.txt")
print(result.markdown)
```

And,

```python
from markitdown import MarkItDown

markitdown = MarkItDown()
result = markitdown.convert_uri("data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==")
print(result.markdown)
```

**Full Changelog**: https://github.com/microsoft/markitdown/compare/v0.1.0...v0.1.1

0.1.0

Overview
Version 0.1.0 (previously 0.1.0a6) is a large release, bringing many improvements over the previous 0.0.2 version.

High-level changes include:

* Organized dependencies into feature groups — install only the converters you need, or get everything with `pip install markitdown[all]`
* A new plugin-based architecture, allowing 3rd-party developers to add functionality to MarkItDown (see the [sample plugin](https://github.com/microsoft/markitdown/tree/main/packages/markitdown-sample-plugin))
* All conversions are performed in-memory — no more temporary files
* Support for new formats including EPUB
* Option to keep data URIs in converted Markdown
* Option to override MIME type, extension, and charset in the command-line interface (useful when reading input from a pipe or stdin)

Breaking changes
* As noted above, dependencies are now organized into optional feature groups. Use `pip install markitdown[all]` for backward-compatible behavior.
* `convert_stream()` now requires a binary file-like object (e.g., a file opened in binary mode, or an io.BytesIO object). This is a breaking change from the previous version, which also accepted text file-like objects, like io.StringIO.
* The `DocumentConverter` class interface has changed to read from file-like streams rather than file paths. No temporary files are created anymore. If you are the maintainer of a plugin or custom DocumentConverter, you likely need to update your code. Otherwise, if you're only using the MarkItDown class or CLI (as in these examples), you should not need to change anything.
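
For code that previously passed a text stream such as `io.StringIO` to `convert_stream()`, one possible migration is to re-encode the text into an in-memory binary stream first. This is a minimal sketch using only the standard library: `to_binary_stream` is a hypothetical helper name, and UTF-8 is an assumed encoding.

```python
import io
from typing import BinaryIO, TextIO


def to_binary_stream(text_stream: TextIO, encoding: str = "utf-8") -> BinaryIO:
    """Re-encode a text stream into an in-memory binary stream.

    Hypothetical helper, not a MarkItDown API. Reads the remaining text,
    encodes it, and wraps the bytes in io.BytesIO.
    """
    return io.BytesIO(text_stream.read().encode(encoding))


text_stream = io.StringIO("# Title\n\nSome text.")
binary_stream = to_binary_stream(text_stream)

# binary_stream is a binary file-like object of the kind convert_stream()
# now requires; a file opened with open(path, "rb") would work as well.
print(type(binary_stream).__name__)
```

The resulting stream starts at position 0, so it can be handed to `convert_stream()` without an explicit `seek(0)`.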
 
Detailed list of contributions
* Cleanup and refactor, in preparation for plugin support. by afourney in https://github.com/microsoft/markitdown/pull/318
* Skip generating md links in 'pre' blocks by t-kalinowski in https://github.com/microsoft/markitdown/pull/322
* Fix a typo in sample RTF plugin by rickygao in https://github.com/microsoft/markitdown/pull/320
* Added priority argument to all converter constructors. by afourney in https://github.com/microsoft/markitdown/pull/324
* Doc Intelligence fixes for refactored code by KennyZhang1 in https://github.com/microsoft/markitdown/pull/325
* Added CLI tests. by afourney in https://github.com/microsoft/markitdown/pull/327
* Fix UnboundLocalError in MarkItDown._convert by menezesandre in https://github.com/microsoft/markitdown/pull/1038
* add necessary imports by tanreinama in https://github.com/microsoft/markitdown/pull/861
* fix: Implement retry logic for YouTube transcript fetching and fix URL decoding issue by iw4p in https://github.com/microsoft/markitdown/pull/1035
* Add Support For PPTX Shape Groups (Fix in code design to not miss out on slide content) by C0dingMast3r in https://github.com/microsoft/markitdown/pull/331
* Make sure extensions are unique in MarkItDown's convert methods. by afourney in https://github.com/microsoft/markitdown/pull/1076
* Don't have ZipConverter accept OOXML files. by afourney in https://github.com/microsoft/markitdown/pull/1078
* Print and log better exceptions when file conversions fail. by afourney in https://github.com/microsoft/markitdown/pull/1080
* Exceptions should subclass Exception not BaseException. by afourney in https://github.com/microsoft/markitdown/pull/1082
* [Draft] Exploring ways to allow Optional dependencies by afourney in https://github.com/microsoft/markitdown/pull/1079
* Fixed property name by afourney in https://github.com/microsoft/markitdown/pull/1085
* Update converter API, user streams rather than filepaths by afourney in https://github.com/microsoft/markitdown/pull/1088
* Bump version. by afourney in https://github.com/microsoft/markitdown/pull/1094
* Fixed loading of plugins. by afourney in https://github.com/microsoft/markitdown/pull/1096
* Fixed version. by afourney in https://github.com/microsoft/markitdown/pull/1097
* fix(README): correct pip install command formatting by Piero24 in https://github.com/microsoft/markitdown/pull/1090
* Fixed deepcopy failure when passing llm_client by scalabreseGD in https://github.com/microsoft/markitdown/pull/1089
* Fixed formatting. by afourney in https://github.com/microsoft/markitdown/pull/1098
* feat: sort pptx shapes to be parsed in top-to-bottom, left-to-right order by richardye101 in https://github.com/microsoft/markitdown/pull/1104
* feat(docker): improve dockerfile build by syaghoubi00 in https://github.com/microsoft/markitdown/pull/220
* Fix exiftool in well-known paths. by afourney in https://github.com/microsoft/markitdown/pull/1106
* fix typo in well-known path list by 0xmohit in https://github.com/microsoft/markitdown/pull/1109
* Switch from puremagic to magika. by afourney in https://github.com/microsoft/markitdown/pull/1108
* Minimize guesses when guesses are compatible. by afourney in https://github.com/microsoft/markitdown/pull/1114
* Added CLI options for extension, mime-types, and charset. by afourney in https://github.com/microsoft/markitdown/pull/1115
* Fix string formatting in FileConversionException error message by yushihang in https://github.com/microsoft/markitdown/pull/1121
* Handle not supported plot type in pptx by EmanueleMeazzo in https://github.com/microsoft/markitdown/pull/1122
* Small fixes for autogen integration. by afourney in https://github.com/microsoft/markitdown/pull/1124
* Added epub test file. by afourney in https://github.com/microsoft/markitdown/pull/1130
* Fix remaining mypy errors. by afourney in https://github.com/microsoft/markitdown/pull/1132
* Have magika read from the stream. by afourney in https://github.com/microsoft/markitdown/pull/1136
* EPub Support. Adapted #123 to not use epublib. by afourney in https://github.com/microsoft/markitdown/pull/1131
* Consider anything with a charset as plain text-convertible. by afourney in https://github.com/microsoft/markitdown/pull/1142
* Adjust warning filters and update dependencies by afourney in https://github.com/microsoft/markitdown/pull/1143
* Add support for preserving base64 encoded images by BetterAndBetterII in https://github.com/microsoft/markitdown/pull/1140
* Resolve a console encoding error. by afourney in https://github.com/microsoft/markitdown/pull/1149
* Bump version to 0.1.0 by afourney in https://github.com/microsoft/markitdown/pull/1150

New Contributors
* t-kalinowski made their first contribution in https://github.com/microsoft/markitdown/pull/322
* rickygao made their first contribution in https://github.com/microsoft/markitdown/pull/320
* menezesandre made their first contribution in https://github.com/microsoft/markitdown/pull/1038
* tanreinama made their first contribution in https://github.com/microsoft/markitdown/pull/861
* iw4p made their first contribution in https://github.com/microsoft/markitdown/pull/1035
* C0dingMast3r made their first contribution in https://github.com/microsoft/markitdown/pull/331
* Piero24 made their first contribution in https://github.com/microsoft/markitdown/pull/1090
* scalabreseGD made their first contribution in https://github.com/microsoft/markitdown/pull/1089
* richardye101 made their first contribution in https://github.com/microsoft/markitdown/pull/1104
* syaghoubi00 made their first contribution in https://github.com/microsoft/markitdown/pull/220
* 0xmohit made their first contribution in https://github.com/microsoft/markitdown/pull/1109
* yushihang made their first contribution in https://github.com/microsoft/markitdown/pull/1121
* EmanueleMeazzo made their first contribution in https://github.com/microsoft/markitdown/pull/1122
* BetterAndBetterII made their first contribution in https://github.com/microsoft/markitdown/pull/1140

**Full Changelog**: https://github.com/microsoft/markitdown/compare/v0.0.2...v0.1.0

0.1.0a6

What's Changed
* Add support for preserving base64 encoded images by BetterAndBetterII in https://github.com/microsoft/markitdown/pull/1140
* Bump version and resolve a console encoding error. by afourney in https://github.com/microsoft/markitdown/pull/1149

New Contributors
* BetterAndBetterII made their first contribution in https://github.com/microsoft/markitdown/pull/1140

**Full Changelog**: https://github.com/microsoft/markitdown/compare/v0.1.0a5...v0.1.0a6

0.1.0a5

What's Changed
* Consider anything with a charset as plain text-convertible. by afourney in https://github.com/microsoft/markitdown/pull/1142
* Adjust warning filters and update dependencies by afourney in https://github.com/microsoft/markitdown/pull/1143

**Full Changelog**: https://github.com/microsoft/markitdown/compare/v0.1.0a4...v0.1.0a5

0.1.0a4

Features
* Basic EPub support from 0xRaduan, in collaboration with afourney
* Switch from puremagic to magika. by afourney in https://github.com/microsoft/markitdown/pull/1108
* Added CLI options for extension, mime-types, and charset. by afourney in https://github.com/microsoft/markitdown/pull/1115
* Sort pptx shapes to be parsed in top-to-bottom, left-to-right order by richardye101 in https://github.com/microsoft/markitdown/pull/1104

Bug fixes and enhancements
* fix(README): correct pip install command formatting by Piero24 in https://github.com/microsoft/markitdown/pull/1090
* Fixed deepcopy failure when passing llm_client by scalabreseGD in https://github.com/microsoft/markitdown/pull/1089
* feat(docker): improve dockerfile build by syaghoubi00 in https://github.com/microsoft/markitdown/pull/220
* Fix exiftool in well-known paths. by afourney in https://github.com/microsoft/markitdown/pull/1106
* fix typo in well-known path list by 0xmohit in https://github.com/microsoft/markitdown/pull/1109
* Minimize guesses when guesses are compatible. by afourney in https://github.com/microsoft/markitdown/pull/1114
* Fix string formatting in FileConversionException error message by yushihang in https://github.com/microsoft/markitdown/pull/1121
* Refactored tests. by afourney in https://github.com/microsoft/markitdown/pull/1120
* Handle not supported plot type in pptx by EmanueleMeazzo in https://github.com/microsoft/markitdown/pull/1122
* Fix remaining mypy errors. by afourney in https://github.com/microsoft/markitdown/pull/1132
* Investigate and silence warnings. by afourney in https://github.com/microsoft/markitdown/pull/1133

New Contributors
* 0xRaduan made their first contribution in https://github.com/microsoft/markitdown/pull/123
* Piero24 made their first contribution in https://github.com/microsoft/markitdown/pull/1090
* scalabreseGD made their first contribution in https://github.com/microsoft/markitdown/pull/1089
* richardye101 made their first contribution in https://github.com/microsoft/markitdown/pull/1104
* syaghoubi00 made their first contribution in https://github.com/microsoft/markitdown/pull/220
* 0xmohit made their first contribution in https://github.com/microsoft/markitdown/pull/1109
* yushihang made their first contribution in https://github.com/microsoft/markitdown/pull/1121
* EmanueleMeazzo made their first contribution in https://github.com/microsoft/markitdown/pull/1122

**Full Changelog**: https://github.com/microsoft/markitdown/compare/v0.1.0a1...v0.1.0a4

0.1.0a1

What's Changed
This MarkItDown _alpha_ introduces numerous bug-fixes, and the following major changes:

* Dependencies are now organized into optional feature-groups (further details below). Use `pip install markitdown[all]` to have backward-compatible behavior.
* The DocumentConverter class interface has changed to read from file-like streams rather than file paths. *No temporary files are created anymore*. If you are the maintainer of a DocumentConverter, you likely need to update your code. Otherwise, if only using the MarkItDown class or CLI, you should not need to change anything.
* MarkItDown now supports extension through 3rd-party plugins. See [markitdown-sample-plugin](https://github.com/microsoft/markitdown/tree/main/packages/markitdown-sample-plugin) for more details!
