Docx2python

Latest version: v3.5.0



3.5.0

Feat

- Remove Python 3.8 support.
- Refactor File.path inference to support rare files with rels in a
`word/glossary` directory.
- Test Python 3.13 support.

3.4.0

Feat

- edit and save rels files. You can now access the `rels_element` attribute of
File instances to update hyperlink urls and other values. These will be saved
on DocxReader.save(). This is an advanced feature and will not change text
extraction.
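
To make concrete what updating a hyperlink url in a rels file involves, here is a minimal standard-library sketch. It does not call docx2python: the sample XML and the helper name `retarget_hyperlinks` are assumptions made up for illustration. In docx2python the corresponding element would presumably be the one reached through `rels_element`, with the change written out by DocxReader.save().

```python
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship (.rels) parts.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)
STYLES_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"
)

# Hypothetical rels content, standing in for a real word/_rels/document.xml.rels.
SAMPLE_RELS = f"""<Relationships xmlns="{RELS_NS}">
  <Relationship Id="rId1" Type="{HYPERLINK_TYPE}"
                Target="http://old.example.com" TargetMode="External"/>
  <Relationship Id="rId2" Type="{STYLES_TYPE}" Target="styles.xml"/>
</Relationships>"""


def retarget_hyperlinks(rels_root, old_url, new_url):
    """Point hyperlink relationships at new_url; return how many changed."""
    changed = 0
    for rel in rels_root.iter(f"{{{RELS_NS}}}Relationship"):
        if rel.get("Type") == HYPERLINK_TYPE and rel.get("Target") == old_url:
            rel.set("Target", new_url)
            changed += 1
    return changed


rels_root = ET.fromstring(SAMPLE_RELS)
n_changed = retarget_hyperlinks(
    rels_root, "http://old.example.com", "https://new.example.com"
)
targets = [rel.get("Target") for rel in rels_root]
```

Only the hyperlink relationship is touched; the styles relationship keeps its target, which is why an edit like this leaves text extraction unchanged.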

3.3.0

Feat

- skip elements with invalid tags. Issue a warning. These are usually the
result of faulty conversion software.

3.2.1

Feat

- add an `elem` attribute to `Par` instances, returning the xml element from
which the paragraph was generated

3.0.0

BREAKING CHANGE

- The html and duplicate_merged_cells arguments to docx2python are now keyword
only.
- Inserts empty cells and whitespace into exported
tables.
- Removed IndexedItem class which was *probably* only used internally, but it
was a part of the public interface.
- Removed function get_text, which was a public function. It mirrored the
identical flatten_text from the docx_text module.
- This change breaks the way paragraph styles (internally pStyle) were handled.
The input argument `do_pStyle` will now raise an error.
- This doesn't change the interface and doesn't break any of my tests, but it
took a lot of refactoring to make this change and it may break some
unofficial patches I've made for clients.
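
The keyword-only change is the one most likely to break existing calls, so here is a small sketch of what it means in practice. The function `extract` is a hypothetical stand-in, not docx2python's real signature, and the default values are picked for illustration only.

```python
def extract(docx_path, *, html=False, duplicate_merged_cells=True):
    """Hypothetical stand-in: the bare * makes the later args keyword only."""
    return {
        "path": docx_path,
        "html": html,
        "duplicate_merged_cells": duplicate_merged_cells,
    }


# Passing the flag by name works.
ok = extract("report.docx", html=True)

# Passing it positionally, as older code may have done, raises TypeError.
try:
    extract("report.docx", True)
    positional_error = None
except TypeError as exc:
    positional_error = exc
```

Code that previously passed html or duplicate_merged_cells by position would need to name them when upgrading.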

Feat

- improve type hints for DocxContent properties
- insert blank cells to match gridSpan
- add list_position attribute for Par instances
- explicate return types in iterators
- use input file namespace
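
As a rough illustration of the gridSpan item above: a cell that spans several grid columns is followed by blank cells so that every row ends up with the same number of columns. The helper `pad_row` and the `(text, grid_span)` input shape are assumptions for this sketch, not docx2python internals.

```python
def pad_row(cells):
    """Expand (text, grid_span) pairs into a flat row of strings.

    A cell spanning n grid columns is followed by n - 1 blank cells.
    """
    padded = []
    for text, grid_span in cells:
        padded.append(text)
        padded.extend([""] * (grid_span - 1))
    return padded


# "Address" spans two grid columns in the header row.
header = pad_row([("Name", 1), ("Address", 2)])
body = pad_row([("Ann", 1), ("Main St", 1), ("Springfield", 1)])
```

With the padding in place, `header` and `body` have the same length, so the table can be indexed by column without special cases for merged cells.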

Fix

- eliminate double html tags for paragraph styles

Refactor

- make boolean args keyword only
- use pathlib in lieu of os.path
- remove Any types from DocxContent close method
- convert HtmlFormatter lambdas to defs
- specialize join_leaves into join_runs
- insert html when extracting text
- make queuing text outside paragraphs explicit
- make _open_pars private
- stop accepting extract_image bool argument
- default duplicate_merged_cells to True
- remove unused helper functions
- use pathlib in conftest
- expose numPr, ilvl, and number in BulletGenerator
- remove redundant functions
- remove do_pStyle argument from flatten_text
- remove function get_text from iterators module
- store content table as nested list of Par instances
- move xml2html_format attrib from TagRunner to DepthCollector
- factor out DepthCollector.item_depth param
- make set_caret recursive
- remove unused `styled` param from insert_text_as_new_run
- remove relative imports in src modules
