Added
- Added the `parse_text` function for analyzing Cantonese text data.
- Characters-to-Jyutping conversion:
The `characters_to_jyutping` function now has the `segmenter` kwarg for
customizing word segmentation.
- Added support for Python 3.10.
- Turned on Windows testing on CircleCI.
- Added `pyproject.toml`, in connection with the switch to `setup.cfg` for specifying
build metadata and options.
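
The two API additions above can be tried out roughly as follows. This is a minimal sketch, not an excerpt from the documentation: the helper name `demo` and the sample sentence are illustrative assumptions, and the `Segmenter` class with its `disallow` argument is assumed to be importable from `pycantonese.word_segmentation`.

```python
def demo(text="廣東話容唔容易學？"):
    """Illustrative helper (hypothetical name) exercising the new APIs."""
    # Imported inside the function so the sketch can be loaded
    # even where PyCantonese is not installed.
    import pycantonese
    from pycantonese.word_segmentation import Segmenter

    # parse_text analyzes raw Cantonese text and returns a corpus reader.
    corpus = pycantonese.parse_text(text)
    words = corpus.words()

    # A customized segmenter: here the word "容易" is disallowed,
    # so it is not kept as a single segmented word.
    segmenter = Segmenter(disallow={"容易"})

    # The new `segmenter` kwarg passes the customization through
    # to characters-to-Jyutping conversion.
    pairs = pycantonese.characters_to_jyutping(text, segmenter=segmenter)
    return words, pairs


if __name__ == "__main__":
    words, pairs = demo()
    print(words)
    print(pairs)
```

Without the `segmenter` argument, `characters_to_jyutping` falls back to the default word segmentation.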
Changed
- Characters-to-Jyutping conversion:
For the `characters_to_jyutping` function,
when rime-cantonese and HKCanCor disagree,
the rime-cantonese data (more accurate) is preferred.
- Updated the rime-cantonese data to the latest `2021.05.16` release,
improving both characters-to-Jyutping conversion and word segmentation.
- Updated the PyLangAcq dependency to v0.16.0, allowing PyCantonese's `CHATReader`
to use the new methods `to_chat`, `to_strs`, `info`, `head`, and `tail`.
- Switched to `setup.cfg` to fully specify build metadata and options,
while keeping a minimal `setup.py` for backward compatibility.
This change accompanies the new `pyproject.toml`.
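
The precedence rule for `characters_to_jyutping` noted in this list can be pictured as a dictionary merge in which one source overrides the other. The sketch below is only an illustration of that idea, not the library's actual implementation; the function name `merge_pronunciations` and all entries are hypothetical placeholders rather than real corpus data.

```python
def merge_pronunciations(rime, hkcancor):
    """Combine two word-to-Jyutping mappings, preferring `rime` on conflict."""
    merged = dict(hkcancor)  # start from the HKCanCor-derived entries
    merged.update(rime)      # rime-cantonese entries overwrite any disagreement
    return merged


# Placeholder entries (hypothetical, for illustration only).
hkcancor = {"word_a": "reading_from_hkcancor", "word_b": "only_in_hkcancor"}
rime = {"word_a": "reading_from_rime", "word_c": "only_in_rime"}

merged = merge_pronunciations(rime, hkcancor)
print(merged["word_a"])
```

Entries found in only one source are kept unchanged; only conflicting entries are resolved in favor of rime-cantonese.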
Removed
- Dropped support for Python 3.6.
Security
- Turned on `safety` and `bandit` checks in CircleCI builds.