Features
- The C++ library and command-line tools can now be installed with `make install`.
- `make build-cli` has been renamed to `make build`.
Bug fixes
- Catch the case where `in_num_p` is not switched off.
  - Before: `"文字123汉语" -> ["文字", "123", "汉", "语"]`
  - After: `"文字123汉语" -> ["文字", "123", "汉语"]`
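The behavior after the fix can be pictured with a minimal sketch. This is not the library's actual code: `split_runs` and the local `in_num` flag are hypothetical names standing in for the tokenizer and `in_num_p`, and the only rule modeled here is "ASCII digits versus everything else". The point it illustrates is that the flag is refreshed on every character, so it is cleared as soon as a number ends and the text that follows stays in one token.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of the library: splits text into
// alternating runs of ASCII digits and non-digits.
std::vector<std::u32string> split_runs(const std::u32string& text) {
    std::vector<std::u32string> tokens;
    bool in_num = false;  // stands in for the `in_num_p` flag

    for (char32_t c : text) {
        const bool is_digit = (c >= U'0' && c <= U'9');

        // Open a new token at the first character, or whenever we
        // cross a digit/non-digit boundary.
        if (tokens.empty() || is_digit != in_num) {
            tokens.emplace_back();
        }
        tokens.back().push_back(c);

        // Update the flag on every character so it is switched off
        // again once the number has ended.
        in_num = is_digit;
    }
    return tokens;
}
```

Under these assumptions, `split_runs(U"文字123汉语")` yields three tokens: `文字`, `123`, and `汉语`, matching the "After" line above.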
Todo
- Determine how the tokenizer should handle characters in the ["other letters"](https://www.compart.com/en/unicode/category) category.
- Reduce the number of flags.
  - Remove those that are out of scope for this package, e.g. lowercase.
  - Remove those that add unnecessary bloat to the logic, e.g. URL handling.