Compress-fasttext

Latest version: v0.1.5

0.1.1

- Wrap up the refactoring related to the new `gensim` version
- Add `FastTextTransformer`, a scikit-learn-like wrapper for feature extraction
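
To illustrate what a scikit-learn-like feature-extraction wrapper looks like, here is a minimal sketch. It is not the library's `FastTextTransformer`: the class name `MeanVectorTransformer`, its `dim` parameter, and the toy dictionary model are all hypothetical, and averaging word vectors is only an assumed pooling strategy. Unlike a real fastText model, the toy dictionary has no n-gram fallback for unknown words, so such texts map to a zero vector here.

```python
class MeanVectorTransformer:
    """Hypothetical scikit-learn-style transformer: text -> mean word vector."""

    def __init__(self, model, dim):
        self.model = model  # any mapping from a word to a list of floats
        self.dim = dim      # vector size, used for texts with no known words

    def fit(self, X, y=None):
        # Nothing is learned; returning self allows method chaining.
        return self

    def transform(self, X):
        rows = []
        for text in X:
            vecs = [self.model[w] for w in text.split() if w in self.model]
            if not vecs:
                rows.append([0.0] * self.dim)
                continue
            # Average the word vectors component by component.
            rows.append([sum(col) / len(vecs) for col in zip(*vecs)])
        return rows

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


# Toy stand-in for an embedding model (assumption, not real data).
toy_model = {"red": [1.0, 0.0], "car": [0.0, 1.0]}
features = MeanVectorTransformer(toy_model, dim=2).fit_transform(
    ["red car", "unknown"]
)
```

Because the object exposes `fit` and `transform`, the same interface shape is what lets a wrapper of this kind sit in front of a classifier in a pipeline.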

gensim-4-draft
Supports `gensim>=4.0.0` and deprecates earlier gensim versions.
Newly released models

Russian models based on `geowac_tokens_none_fasttextskipgram_300_5_2020` from [RusVectores](https://rusvectores.org/ru/models/), 1.9GB:
| Model | RAM size, MB | Similarity to the original model |
| --- | --- | --- |

0.0.6

- Require `sklearn` and `pqkmeans` only in the `[full]` setup mode

0.0.4

- Publish more compressed models and compare their quality
- Make the compressed models downloadable

0.0.3

Arithmetic operations on compressed matrices no longer raise errors. However, they convert these matrices to `numpy.array`, which costs time and memory.

0.0.2

The `prune_ft_freq` method now takes into account not only the frequency of each n-gram but also the norm of its embedding.
This improves the accuracy of the compressed model at the same model size.
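
The idea of ranking n-grams by both frequency and embedding norm can be sketched as follows. This is an illustration, not the library's implementation: the release note does not say how the two signals are combined, so the product `frequency * norm` below is an assumption, and the function name `prune_ngrams` and the toy data are hypothetical.

```python
import math


def prune_ngrams(freqs, vectors, keep):
    """Return the indices of the `keep` highest-scoring n-grams.

    Score = frequency * L2 norm of the embedding (an assumed combination).
    """
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    scores = [f * norm(v) for f, v in zip(freqs, vectors)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


# Toy data: n-gram 0 is frequent but has a near-zero embedding,
# n-gram 2 is rare but has a large embedding.
freqs = [100, 90, 5]
vectors = [[0.01, 0.0], [1.0, 1.0], [3.0, 4.0]]
kept = prune_ngrams(freqs, vectors, keep=2)
```

With frequency alone, n-grams 0 and 1 would be kept. Under the assumed score, n-gram 0 is dropped in favour of n-gram 2, because a near-zero vector contributes little to any word embedding no matter how often the n-gram occurs.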

0.0.1

We publish the code for compressing Gensim FastText models and for using the resulting small versions.

We also publish 4 compressed versions of the [ruscorpora_none_fasttextskipgram_300_2_2019](http://vectors.nlpl.eu/repository/20/181.zip) model from [RusVectores](https://rusvectores.org/ru/models/).

| Model | RAM, MB | Similarity to the original | Intrinsic evaluation (relative to the original) |
| --- | --- | --- | --- |
