PolyFuzz

Latest version: v0.4.2

Page 1 of 2

0.4.2

Removed restrictive pytorch dependencies for Flair

0.4.1

* Fixed deprecated `np.float`
* Fixed [53](https://github.com/MaartenGr/PolyFuzz/issues/53)
* Fixed typo in README ([45](https://github.com/MaartenGr/PolyFuzz/pull/45)) by [dobraczka](https://github.com/dobraczka)
* Fixed API documentation ([38](https://github.com/MaartenGr/PolyFuzz/pull/38)) by [maxbachmann](https://github.com/maxbachmann)

0.4.0

* Added new models (SentenceTransformers, Gensim, USE, Spacy)
* Added `.fit`, `.transform`, and `.fit_transform` methods
* Added `.save` and `PolyFuzz.load()`

SentenceTransformers, Gensim, USE, and Spacy

**SentenceTransformers**
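```python
from polyfuzz import PolyFuzz
from polyfuzz.models import SentenceEmbeddings

# Use a SentenceTransformers model to embed and compare strings
distance_model = SentenceEmbeddings("all-MiniLM-L6-v2")
model = PolyFuzz(distance_model)
```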


**Gensim**
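```python
from polyfuzz import PolyFuzz
from polyfuzz.models import GensimEmbeddings

# Use pretrained Gensim word vectors to embed and compare strings
distance_model = GensimEmbeddings("glove-twitter-25")
model = PolyFuzz(distance_model)
```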


**USE**
```python
from polyfuzz import PolyFuzz
from polyfuzz.models import USEEmbeddings

# Use the Universal Sentence Encoder from TensorFlow Hub
distance_model = USEEmbeddings("https://tfhub.dev/google/universal-sentence-encoder/4")
model = PolyFuzz(distance_model)
```


**Spacy**
```python
from polyfuzz import PolyFuzz
from polyfuzz.models import SpacyEmbeddings

# Use the word vectors of a spaCy pipeline
distance_model = SpacyEmbeddings("en_core_web_md")
model = PolyFuzz(distance_model)
```



fit, transform, fit_transform
Added `fit`, `transform`, and `fit_transform` so that a fitted PolyFuzz model can be used in production (34)

```python
from polyfuzz import PolyFuzz

train_words = ["apple", "apples", "appl", "recal", "house", "similarity"]
unseen_words = ["apple", "apples", "mouse"]

# Fit
model = PolyFuzz("TF-IDF")
model.fit(train_words)

# Transform
results = model.transform(unseen_words)
```


In the code above, we fit our TF-IDF model on `train_words` and use `.transform()` to match the words in `unseen_words` to the words that we trained on in `train_words`.
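To make the fit/transform split concrete, here is a minimal standard-library sketch of the idea. It is not PolyFuzz's implementation: `NgramMatcher`, `char_ngrams`, and `cosine` are hypothetical names, and plain character trigram counts stand in for TF-IDF weighting. `fit` vectorizes the reference words once, and `transform` matches unseen words against those stored vectors.

```python
import math
from collections import Counter


def char_ngrams(word, n=3):
    """Count the character n-grams of a word, padded with one space per side."""
    padded = f" {word} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class NgramMatcher:
    """Hypothetical stand-in for a fitted matcher; not the PolyFuzz API."""

    def fit(self, words):
        # Vectorize the reference words once and keep them for later lookups.
        self.vectors_ = {word: char_ngrams(word) for word in words}
        return self

    def transform(self, words):
        # Match each unseen word to its most similar reference word.
        results = []
        for word in words:
            vector = char_ngrams(word)
            best = max(self.vectors_, key=lambda ref: cosine(vector, self.vectors_[ref]))
            score = round(cosine(vector, self.vectors_[best]), 3)
            results.append((word, best, score))
        return results


train_words = ["apple", "apples", "appl", "recal", "house", "similarity"]
unseen_words = ["apple", "apples", "mouse"]

matcher = NgramMatcher().fit(train_words)
matches = matcher.transform(unseen_words)
for source, target, score in matches:
    print(source, "->", target, score)
```

Because the reference vectors are stored on the fitted object, new words can be matched later without seeing the training list again, which is the property that makes the pattern useful in production.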

After fitting our model, we can save it as follows:

```python
model.save("my_model")
```


Then, we can load our model to be used elsewhere:

```python
from polyfuzz import PolyFuzz

model = PolyFuzz.load("my_model")
```

0.3.4

- When the two lists passed to `match` are exactly the same, identical terms now receive a score of 1:

```python
from polyfuzz import PolyFuzz

from_list = ["apple", "house"]
model = PolyFuzz("TF-IDF")
model.match(from_list, from_list)
```


This matches each word in `from_list` to itself with a score of 1: `apple` is matched to `apple` and `house` to `house`.
However, if you pass only a single list, each word is matched to the other words in that list and never to itself:

```python
from polyfuzz import PolyFuzz

from_list = ["apple", "apples"]
model = PolyFuzz("TF-IDF")
model.match(from_list)
```


In the example above, `apple` will be mapped to `apples` and not to `apple`. Here, we assume that the user wants to
find the most similar words within a list without mapping to itself.
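The difference between the two calling conventions can be sketched with the standard library alone. This is an illustration of the described behavior, not PolyFuzz's code: the helpers `best_match` and `match` are hypothetical, and `difflib.SequenceMatcher` is swapped in for TF-IDF similarity.

```python
from difflib import SequenceMatcher


def best_match(word, candidates):
    """Return the candidate most similar to `word` and its similarity score."""
    if not candidates:
        # Nothing to compare against, e.g. a single-element list.
        return None, 0.0
    scored = [(SequenceMatcher(None, word, cand).ratio(), cand) for cand in candidates]
    score, matched = max(scored)
    return matched, score


def match(from_list, to_list=None):
    """Hypothetical helper: match within `from_list` when `to_list` is omitted."""
    results = {}
    for index, word in enumerate(from_list):
        if to_list is None:
            # Single list: compare against every other entry, never the word itself.
            candidates = from_list[:index] + from_list[index + 1:]
        else:
            # Two lists: identical terms are allowed to match and score 1.
            candidates = to_list
        results[word] = best_match(word, candidates)
    return results


two_lists = match(["apple", "house"], ["apple", "house"])
single_list = match(["apple", "apples"])
print(two_lists)
print(single_list)
```

In the two-list call every word finds itself and scores 1, while in the single-list call `apple` and `apples` are matched to each other with a score below 1.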

0.3.3

Quickfix for issues 21 and 23

0.3.2

Fixed an issue with sparse_dot_n exploding memory usage when trying to access the top_n of a sparse matrix.
