Lingua-py

Latest version: v0.1.1

1.2.2

Features

- The enums `Language`, `IsoCode639_1` and `IsoCode639_3` now implement [`std::str::FromStr`](https://doc.rust-lang.org/std/str/trait.FromStr.html) in order to instantiate enum variants by string values. This comes in handy for JavaScript bindings and the like. (#15)
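
As a rough illustration of what implementing `FromStr` for an enum looks like, here is a minimal sketch. The enum `Lang`, the error type `ParseLangError` and the case-insensitive matching are all assumptions made for this example; they are not Lingua's actual definitions.

```rust
use std::str::FromStr;

// Hypothetical stand-in for an enum such as `Language`; not Lingua's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lang {
    English,
    German,
    Maori,
}

// Hypothetical error returned when the input matches no variant.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseLangError(pub String);

impl FromStr for Lang {
    type Err = ParseLangError;

    // Case-insensitive matching is a choice of this sketch, not a documented behavior.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_lowercase().as_str() {
            "english" => Ok(Lang::English),
            "german" => Ok(Lang::German),
            "maori" => Ok(Lang::Maori),
            _ => Err(ParseLangError(s.to_string())),
        }
    }
}
```

With such an implementation in place, a variant can be obtained through the standard `str::parse` method, for example `"german".parse::<Lang>()`.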

Improvements

- The performance of preloading the language models has been improved.

Bug Fixes

- Language detection for sentences with more than 120 characters was supposed to iterate through trigrams only, but this was never actually the case. This has been corrected.
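
The trigram-only rule for long input can be sketched roughly as follows. The 120-character threshold comes from the entry above; the helper names `ngrams` and `ngram_lengths`, counting length in Unicode scalar values, and the use of n-gram lengths 1 to 5 for shorter input are assumptions of this sketch.

```rust
// Threshold from the changelog entry; counting `char`s is an assumption.
const LONG_TEXT_THRESHOLD: usize = 120;

/// Returns all character n-grams of length `n` contained in `text`.
fn ngrams(text: &str, n: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    if n == 0 || chars.len() < n {
        return Vec::new();
    }
    chars.windows(n).map(|w| w.iter().collect()).collect()
}

/// Chooses which n-gram lengths to evaluate: trigrams only for long input,
/// otherwise lengths 1 to 5 (the range 1..=5 is an assumption).
fn ngram_lengths(text: &str) -> Vec<usize> {
    if text.chars().count() > LONG_TEXT_THRESHOLD {
        vec![3]
    } else {
        (1..=5).collect()
    }
}
```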

1.2.1

Improvements

- Language detection for sentences with more than 120 characters is now faster because it iterates through trigrams only, which is enough to achieve high detection accuracy.
- Textual input that includes logograms from Chinese, Japanese or Korean is now split at each logogram and not only at whitespace. This provides for more reliable language detection for sentences that include multi-language content.
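
Splitting at each logogram in addition to whitespace might look roughly like the sketch below. The function names and the Unicode ranges used to recognize Chinese, Japanese or Korean characters are assumptions for illustration only; the library's real character classification is likely more thorough.

```rust
/// Rough check for Chinese, Japanese or Korean characters. Only a few Unicode
/// blocks are covered; this is an assumption of the sketch.
fn is_logogram(c: char) -> bool {
    matches!(
        c,
        '\u{4E00}'..='\u{9FFF}'       // CJK Unified Ideographs
            | '\u{3040}'..='\u{30FF}' // Hiragana and Katakana
            | '\u{AC00}'..='\u{D7AF}' // Hangul Syllables
    )
}

/// Splits `text` at whitespace and additionally emits every logogram
/// as a word of its own.
fn split_words(text: &str) -> Vec<String> {
    let mut words = Vec::new();
    let mut current = String::new();
    for c in text.chars() {
        if is_logogram(c) {
            // Close the word in progress, then emit the logogram by itself.
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
            words.push(c.to_string());
        } else if c.is_whitespace() {
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
        } else {
            current.push(c);
        }
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}
```

Under these assumptions, mixed input such as `"hello 上海 world"` yields the words `hello`, `上`, `海` and `world`, so each logogram can be classified independently of the surrounding Latin text.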

Bug Fixes

- Errors in the rule engine for the Latvian language have been resolved.
- Corrupted characters in the Latvian test data have been corrected.

1.2.0

Features

- A `LanguageDetector` can now be built either to lazy-load the required language models on demand (default) or to preload all language models at once by calling `LanguageDetectorBuilder.with_preloaded_language_models()`. (#10)
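
The difference between the two loading strategies can be sketched with a small model store. Everything here (`Model`, `load_model`, `ModelStore` and its methods) is hypothetical and only illustrates the general idea; it does not reproduce the library's builder API.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a language model.
#[derive(Debug, Clone, PartialEq)]
pub struct Model {
    pub language: String,
}

// Hypothetical loader; a real one would read model data from disk.
fn load_model(language: &str) -> Model {
    Model { language: language.to_string() }
}

pub struct ModelStore {
    languages: Vec<String>,
    loaded: HashMap<String, Model>,
}

impl ModelStore {
    /// Lazy mode: nothing is loaded until `get` is called.
    pub fn lazy(languages: &[&str]) -> Self {
        ModelStore {
            languages: languages.iter().map(|l| l.to_string()).collect(),
            loaded: HashMap::new(),
        }
    }

    /// Preloaded mode: every model is loaded up front.
    pub fn preloaded(languages: &[&str]) -> Self {
        let mut store = Self::lazy(languages);
        for language in store.languages.clone() {
            store.loaded.insert(language.clone(), load_model(&language));
        }
        store
    }

    /// Number of models currently held in memory.
    pub fn loaded_count(&self) -> usize {
        self.loaded.len()
    }

    /// Returns the model for `language`, loading it on first access.
    /// Returns `None` for languages the store was not built with.
    pub fn get(&mut self, language: &str) -> Option<&Model> {
        if !self.languages.iter().any(|l| l == language) {
            return None;
        }
        let model: &Model = self
            .loaded
            .entry(language.to_string())
            .or_insert_with(|| load_model(language));
        Some(model)
    }
}
```

The trade-off in this sketch is the usual one: the lazy store starts immediately and pays the loading cost on first use of each language, while the preloaded store pays the full cost once at construction.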

1.1.0

Languages

- The Maori language is now supported. Thanks to eekkaiia for the contribution. (#5)

Performance

- Loading and searching the language models has been quite slow so far. Using parallel iterators from the [Rayon](https://github.com/rayon-rs/rayon) library, this process is now at least 50% faster, depending on how many CPU cores are available. (#8)
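
To give an idea of how per-language work can be spread across CPU cores, here is a dependency-free sketch. Note that it swaps in scoped threads from the standard library (`std::thread::scope`) instead of Rayon's parallel iterators, and that the `score` function is a hypothetical placeholder for loading and searching a model.

```rust
use std::thread;

// Hypothetical placeholder for loading and searching one language model:
// it simply counts occurrences of `pattern` in `text`.
fn score(pattern: &str, text: &str) -> usize {
    text.matches(pattern).count()
}

/// Scores every pattern on its own scoped thread and returns the results
/// in the same order as the input slice.
fn score_all_parallel(patterns: &[&str], text: &str) -> Vec<(String, usize)> {
    thread::scope(|s| {
        // Spawn one worker per pattern; scoped threads may borrow `text`.
        let handles: Vec<_> = patterns
            .iter()
            .map(|&pattern| s.spawn(move || (pattern.to_string(), score(pattern, text))))
            .collect();
        // Joining in spawn order keeps the output order deterministic.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

With Rayon, the same shape would typically be written as `par_iter().map(...).collect()`, and the library's thread pool would size itself to the available cores rather than spawning one thread per item.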

Accuracy Reports

- Accuracy reports are now also generated for the [*CLD2*](https://github.com/emk/rust-cld2) library and included in the language detector comparison plots. (#6)

1.0.3

Bug Fixes

- Lingua could not be used within other projects because of an accidental attempt to expose a private serde module. Thanks to luananama for reporting this bug. (#9)

1.0.2

Bug Fixes

- Bug #3 had only been partially fixed. This has been corrected.
