pyJedAI

Latest version: v0.2.4


0.1.2

⚒️ Fixed
- Export methods, including the use case where no ground truth is provided
- Vectorization time, reduced by saving and retrieving the distance matrix
- Bug resolution in PER indexing, Dirty ER
- Speed/Memory optimizations in NN Blocking & Join PER

➕ Added
- 'sqeuclidean' metric in matching step
- Valentine as a Schema Matching plugin
- Frequency Evaluator compatible with base ER matching
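
For readers unfamiliar with the 'sqeuclidean' metric named above, the following is a minimal sketch of squared Euclidean distance in plain Python. It is not pyJedAI's implementation; the function name `sqeuclidean` and the sample vectors are chosen only for illustration.

```python
def sqeuclidean(u, v):
    """Squared Euclidean distance: the sum of squared coordinate differences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Two illustrative embedding vectors (hypothetical values).
a = [1.0, 2.0, 3.0]
b = [4.0, 6.0, 3.0]

print(sqeuclidean(a, b))  # 3^2 + 4^2 + 0^2 = 25.0
```

Because squaring preserves order for non-negative values, ranking candidate pairs by squared distance gives the same ordering as plain Euclidean distance while skipping the square root.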

⚠️ Issues
- Vectorizers (TF-IDF, etc.) don't support Dirty ER. This will be fixed in the next release.

0.1.1

⚒️ Fixed
- Removed deprecated whoosh imports from prioritization file

➕ Added
- None

⚠️ Issues
- None

0.1.0

⚒️ Fixed
- Restructured Matching Module - vectorizer, tokenizer, and qgrams as arguments (not inferred)
- Clustering step randomization bug

➕ Added
- PER notebook tutorials
- PER grid-search pipeline (config files, search scripts, storage)
- PER workflows visualization and comparison through:
  - feature configuration budget-centric metric progress plots
  - feature configuration dataset-centric sorting and comparison

⚠️ Issues
- None

0.0.9

⚒️Fixed:
- FAISS euclidean distance
- Workflow methods
- Removed whoosh
- Removed SCANN

➕Added:
- 3 New workflow methods
- Export pairs in each step
- Tfidf weights in matching options
- Website:
  - code API
  - new tutorials

⚠️ Issues:
- None

0.0.8

**Fixed**:
- Word grams tokenization
- Code architecture in entity matching
- py_stringmatching dependencies
- Pypi readme

**Added**:
- Boolean/Tfidf/Tf weights
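
As a rough illustration of how the three weighting schemes named above differ, here is a sketch in plain Python. It does not reproduce pyJedAI's code: the function names are hypothetical, and the smoothed IDF formula is one common variant assumed for the example.

```python
import math
from collections import Counter


def boolean_weights(tokens):
    """Boolean: every token that appears gets weight 1."""
    return {t: 1 for t in set(tokens)}


def tf_weights(tokens):
    """TF: token count divided by the number of tokens in the record."""
    counts = Counter(tokens)
    total = len(tokens)
    return {t: c / total for t, c in counts.items()}


def tfidf_weights(tokens, corpus):
    """TF-IDF: TF scaled by a smoothed inverse document frequency (assumed variant)."""
    n_docs = len(corpus)
    weights = {}
    for t, tf in tf_weights(tokens).items():
        df = sum(1 for doc in corpus if t in doc)  # records containing the token
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        weights[t] = tf * idf
    return weights


# Hypothetical tokenized entity records.
corpus = [
    ["apple", "iphone", "apple"],
    ["samsung", "galaxy"],
    ["apple", "watch"],
]

print(boolean_weights(corpus[0]))
print(tf_weights(corpus[0]))
print(tfidf_weights(corpus[0], corpus))
```

In this toy corpus, "apple" has the higher TF in the first record, but TF-IDF discounts it relative to "iphone" because it also occurs in another record.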

0.0.7

**Fixed**:
- Issues in block filtering
- Issues in vector based blocking
- Data model set types
- EJoin wrong naming

**Added**:
- Prioritization algorithms
- Tf-Idf functionality
- More metrics on entity matching
- Optional data cleaning functionalities
- New visualizations
- New stats for the blocking workflows

