Lingua-py

Latest version: v0.1.1

Page 1 of 4

1.5.0

Features

- The new method `LanguageDetector.detect_multiple_languages_of()` has been introduced. It detects multiple languages in mixed-language text. (1)

- The new method `LanguageDetectorBuilder.with_low_accuracy_mode()` has been introduced. When activated, it trades detection accuracy for short text against a smaller memory footprint and faster detection performance. (119)

- The new method `LanguageDetector.compute_language_confidence()` has been introduced. It retrieves the confidence value for one specific language only, given the input text. (102)

Improvements

- The computation of the confidence values has been revised. The softmax function is now applied to the values, which makes them easier to compare because they behave more like real probabilities. (120)

- The WASM API has been revised. It now uses the same builder pattern as the Rust API. (122)

- The language model files are now compressed with the Brotli algorithm, which reduces the file size by 15 % on average. (189)

- The language model ngrams are now stored in a `CompactString` type, which reduces memory consumption by 20 %. (198)

- Several performance optimizations have been applied, which make the library nearly twice as fast as the previous version. Big thanks go out to serega and koute for their help. (82, 148, 177)

- The enums `IsoCode639_1` and `IsoCode639_3` now implement some new traits such as `Copy`, `Hash` and Serde's `Serialize` and `Deserialize`. The enum `Language` now implements `Copy` as well. (175)
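
The softmax step mentioned in the improvements above can be illustrated with a small standalone sketch. It does not use the library and does not reproduce its internal computation, which the release notes do not describe. The raw scores and the `softmax` helper below are hypothetical and serve only to show why softmax output behaves like probabilities: every value lies between 0 and 1, and the values sum to 1.

```python
import math


def softmax(scores):
    """Map raw scores to values in (0, 1) that sum to 1."""
    if not scores:
        return {}
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores.values())
    exps = {lang: math.exp(value - peak) for lang, value in scores.items()}
    total = sum(exps.values())
    return {lang: value / total for lang, value in exps.items()}


# Hypothetical raw scores (assumed for illustration, not real library output).
raw_scores = {"english": -12.0, "german": -14.5, "french": -20.0}

confidences = softmax(raw_scores)

# The language with the highest raw score also has the highest confidence.
best = max(confidences, key=confidences.get)
```

Because softmax preserves the ordering of the raw scores, the top-ranked language stays the same. Only the scale changes, so values from different inputs can be compared on a common 0 to 1 range.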

1.4.0

Features

- The library can now be compiled to WebAssembly and be used in any JavaScript project. Big thanks to martindisch for bringing this forward. (14)

Improvements

- Some minor performance tweaks have been applied to the rule engine.

1.3.3

Bug Fixes

- This release updates outdated dependencies and fixes an incompatibility between different versions of the `include_dir` crate which are used in the main `lingua` crate and the language model crates.

1.3.2

Bug Fixes

- Another compilation error has been fixed. It occurred when the Latin language was left out as a Cargo feature.

1.3.1

Bug Fixes

- When Chinese, Japanese or Korean were left out as Cargo features, there were compilation errors. This has been fixed.

1.3.0

Features

- The language model dependencies are now separate Cargo features. Users can decide which languages are downloaded and used in the library. (12)

Improvements

- The code that lazy-loads the language models has been refactored significantly, making it more stable and less error-prone.

Bug Fixes

- In very rare cases, the language returned by the detector was non-deterministic. This has been fixed. Big thanks to asg0451 for identifying this problem. (17)
