Wikipron

Latest version: v1.3.3

1.0.0

----------------------

Added

- Handled Japanese. (#109, #114)
- Handled Latin, whose graphemes cannot be taken from the Wiktionary
page titles and must come from within the page. (#92, #93)
- Handled Thai, whose pronunciations are embedded in HTML tables. (#90)
- Handled Khmer, whose pronunciations are embedded in HTML tables. (#88)
- Added IPA segmentation using spaces by default, with the `--no-segment` flag to
optionally turn it off. (#69, #79, #83, #89, #100)
- Added TSV files for all Wiktionary languages with over 100 entries.
(#61, #76, #95, #97, #103, #104)
- Resolved Wiktionary language names for languages with at least 100
pronunciation entries. (#52, #55)
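
As an illustration of what space-based IPA segmentation means, here is a minimal sketch. It is not WikiPron's implementation: the function name `segment_ipa` is hypothetical, and the rule it applies (combining marks and modifier letters attach to the preceding symbol, and a tie bar also pulls in the following symbol) is a simplifying assumption. Stress marks, for instance, are not treated specially.

```python
import unicodedata

# Combining tie bars (above and below) that join two symbols into one segment.
TIE_BARS = {"\u0361", "\u035c"}

# Unicode categories treated as "attaches to the previous symbol":
# combining marks (Mn, Mc, Me) and modifier letters (Lm), e.g. aspiration
# and length marks. This grouping is an assumption made for the sketch.
ATTACHING = {"Mn", "Mc", "Me", "Lm"}


def segment_ipa(pron: str) -> str:
    """Return the pronunciation with segments separated by single spaces."""
    segments = []
    for ch in pron:
        if ch.isspace():
            continue  # drop any existing whitespace
        attaches = unicodedata.category(ch) in ATTACHING
        after_tie = bool(segments) and segments[-1][-1] in TIE_BARS
        if segments and (attaches or after_tie):
            segments[-1] += ch
        else:
            segments.append(ch)
    return " ".join(segments)


if __name__ == "__main__":
    print(segment_ipa("t\u0361\u0283i\u02d0z"))
```

Turning segmentation off, as `--no-segment` does, corresponds to leaving the pronunciation string untouched.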

Changed

- Removed duplicate <word, pronunciation> pairs in the persisted data. (#85, #111, #116)
- Split Welsh into Northern Wales and Southern Wales dialects in the persisted data. (#110)
- Factored out casefolding. (#102)
- Split Serbo-Croatian into Cyrillic and Latin TSVs. (#96)
- Generalized word and pronunciation extraction. (#88)
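
The deduplication and casefolding changes above can be pictured with a short sketch. The helper `dedupe_pairs` and its `casefold` parameter are hypothetical names chosen for illustration, and keeping the first occurrence in input order is an assumption rather than documented behavior.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (word, pronunciation)


def dedupe_pairs(pairs: Iterable[Pair], casefold: bool = False) -> List[Pair]:
    """Drop repeated <word, pronunciation> pairs, keeping first occurrences."""
    seen = set()
    unique = []
    for word, pron in pairs:
        if casefold:
            # str.casefold() is a more aggressive form of lowercasing.
            word = word.casefold()
        if (word, pron) not in seen:
            seen.add((word, pron))
            unique.append((word, pron))
    return unique


if __name__ == "__main__":
    data = [("cat", "k æ t"), ("cat", "k æ t"), ("Cat", "k æ t")]
    print(dedupe_pairs(data))
    print(dedupe_pairs(data, casefold=True))
```

Keeping casefolding as a separate switch mirrors the idea of factoring it out: deduplication works on whatever form of the word it is given.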

Removed

- Removed the timeout in smoke tests. (#107)
- Removed the `output` option. (#82)
- Removed the `require_dialect_label` option. (#77)

Fixed

- Skipped pronunciations with a dash. (#106)
- Skipped empty pronunciations in scraping. (#59)
- Updated the `<li>` XPath selector for an optional layer of `<span>` to cover
previously unhandled languages (e.g., Korean). (#50)
- Updated the `<li>` XPath selector for
`title="wikipedia:<language> phonology"` to cover previously unhandled
languages (e.g., Estonian and Slovak). (#49)
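
The first two fixes in this list amount to filtering scraped pronunciations. A minimal sketch follows, assuming that "dash" means the ASCII hyphen-minus; the function name `clean_pron` is hypothetical and the actual check in WikiPron may differ.

```python
from typing import Optional


def clean_pron(pron: str) -> Optional[str]:
    """Return the stripped pronunciation, or None if it should be skipped."""
    pron = pron.strip()
    # Skip empty strings and anything containing a hyphen-minus
    # (assumed here to be the "dash" the changelog refers to).
    if not pron or "-" in pron:
        return None
    return pron


if __name__ == "__main__":
    scraped = [" k\u00e6t ", "", "-\u026a\u014b"]
    kept = [p for p in map(clean_pron, scraped) if p is not None]
    print(kept)
```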

Security

- Avoided using `exec` to retrieve the version string; used `pkg_resources`
instead. (#63)
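
To show the general idea of reading a version string from installed package metadata rather than executing a source file, here is a minimal sketch. Note the swap: the entry above names `pkg_resources`, whereas this example uses the standard library's `importlib.metadata` (Python 3.8+), which serves the same purpose. The helper name `get_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def get_version(dist_name: str) -> Optional[str]:
    """Look up an installed distribution's version without running its code."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    # Prints None unless wikipron is installed in the current environment.
    print(get_version("wikipron"))
```

Because the lookup only reads metadata, no code from the package's source files is evaluated, which is the concern that motivates avoiding `exec`.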

0.1.1

----------------------

Fixed

- Fixed import bug. (#45)

0.1.0

----------------------

First release.
