Konoha

Latest version: v5.5.6

Page 5 of 7

4.3.0

Not secure

core feature

- Add tokenization server. (79)
- Add type annotations. (83)


integration

- Support tokenizers in AllenNLP integration. (73)

documentation

- Update README. (84)
- Add reference to blog articles. (85)
- Update README to use shields.io. (88)


other

- Replace `unittest` with `pytest`. (74)
- Make simple code fixes and revise error messages. (86)
- Install `sudachidict_core` as part of `pip install`. (89)
- Bump version number to v4.3.0. (90)

4.2.0

Not secure
- Support tokenizers in AllenNLP integration. (73)

This PR added full support for Konoha word tokenizers.

4.1.0

Not secure
- [beta] Add integration for `AllenNLP`. (71)

4.0.0

Not secure

Support remote files ([59](https://github.com/himkt/konoha/pull/59))

You can specify an S3 path for `user_dictionary_path`, `system_dictionary_path`, and `model_path`.
To use a remote path, you must set AWS credentials.
For more information, please read [the documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html).
(Konoha supports credentials supplied through environment variables or a shared credentials file.)

```python
from konoha import WordTokenizer

if __name__ == "__main__":
    sentence = "首都大学東京"

    word_tokenizer = WordTokenizer("mecab")
    print(word_tokenizer.tokenize(sentence))

    word_tokenizer = WordTokenizer("mecab", user_dictionary_path="s3://abc/xxx.dic")
    print(word_tokenizer.tokenize(sentence))

    word_tokenizer = WordTokenizer("mecab", system_dictionary_path="s3://abc/yyy")
    print(word_tokenizer.tokenize(sentence))

    word_tokenizer = WordTokenizer("sentencepiece", model_path="s3://abc/zzz.model")
    print(word_tokenizer.tokenize(sentence))
```
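
The remote paths above follow the `s3://bucket/key` form. As a rough illustration of how such a path can be told apart from a local one and split into its parts, here is a minimal sketch. The helper names `is_s3_path` and `split_s3_path` are hypothetical and are not part of Konoha's API; how Konoha resolves remote files internally is not described in these notes.

```python
from typing import Tuple
from urllib.parse import urlparse


def is_s3_path(path: str) -> bool:
    """Return True when the path uses the s3:// scheme (hypothetical helper)."""
    return urlparse(path).scheme == "s3"


def split_s3_path(path: str) -> Tuple[str, str]:
    """Split "s3://bucket/key" into (bucket, key) (hypothetical helper)."""
    parsed = urlparse(path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 path: {path!r}")
    # The URL path keeps its leading slash; an S3 key does not have one.
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    for path in ("s3://abc/xxx.dic", "s3://abc/yyy", "/usr/local/lib/mecab/dic"):
        if is_s3_path(path):
            print(path, "->", split_s3_path(path))
        else:
            print(path, "-> local path")
```

A downloader built on this would presumably hand the bucket and key to boto3, for example `boto3.client("s3").download_file(bucket, key, local_path)`, which picks up credentials from environment variables or the shared credentials file.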


Rename the repository ([60](https://github.com/himkt/konoha/pull/60))

The name `tiny_tokenizer` was ambiguous (`tiny_segmenter` already exists).

3.1.0

- Use poetry for development. (53)
- Support Janome, a pure-Python morphological analyzer. (57)

3.0.2

- Support system dictionary in MeCab. (42)
- Support custom model in KyTea. (49)

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.