Classla

Latest version: v2.1.1

2.1.1

Updated reldi-tokeniser to version 1.0.3.

This ensures proper tokenization of abbreviations, e.g. "To je djelo dr. Ljubešića" is now tokenized as a single sentence instead of being split into two sentences at the full stop.
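
The abbreviation problem described above can be illustrated with a toy splitter. This is only a minimal sketch, not the reldi-tokeniser implementation: the `ABBREVIATIONS` set and the `split_sentences` function are hypothetical names invented for this example, and the real tokenizer uses its own, much larger language-specific rules.

```python
import re

# Hypothetical abbreviation list, for illustration only.
ABBREVIATIONS = {"dr.", "prof.", "npr.", "itd."}


def split_sentences(text):
    """Split on ., ! or ? followed by whitespace, unless the
    punctuation ends a known abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        # Text from the current sentence start up to and including the punctuation.
        candidate = text[start:match.start() + 1]
        last_token = candidate.split()[-1].lower()
        if last_token in ABBREVIATIONS:
            continue  # abbreviation, not a sentence boundary
        sentences.append(candidate.strip())
        start = match.end()
    # Whatever remains after the last boundary is the final sentence.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


print(split_sentences("To je djelo dr. Ljubešića"))
```

Without the abbreviation check, the full stop after "dr" would be treated as a sentence boundary and the example would be split in two.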

2.1

- Added new models for all languages
- Added new "web" processing type
- Fixed sentence splitting in the tokenizers

2.0

- Added new models for standard Slovenian
- Added new inflectional lexicon for Slovenian
- Adapted tests to new model outputs
- Modified lexicon to store underscores instead of empty strings
- Other changes

1.2.0

- Added SRL (semantic role labeling) parsing for Slovenian
- Fixed training for the lemmatizer and POS tagger
- Added toy tests for all training procedures
- Other smaller fixes

1.1.1

- Updated external package version requirements, mainly due to updates in the Slovenian obeliks tokenizer

1.1.0

- Added tokenizer pretag option for both obeliks and reldi-tokeniser (via `pos_lemma_pretag`)
- Updated the Slovene inflectional lexicon and moved it from the lemmatizer model to the morphosyntactic annotation model
- Added upos and ufeats control to Slovene inflectional lexicon
- Other smaller fixes
