Hlink

Latest version: v3.8.0

3.3.1

Changed

* Updated documentation for column mapping transforms. [PR 77][pr77]
* Updated documentation for the `present_both_years` and `neither_are_null` comparison
types, clarifying how they differ. [PR 79][pr79]

Fixed

* Fixed a bug where comparison features were marked as categorical whenever the
`categorical` key was present, even if it was set to false. [PR 82][pr82]
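The fix above comes down to testing the value of the key rather than its presence. A minimal sketch of that distinction, assuming comparison features are plain dictionaries; the helper `is_categorical` and the aliases below are hypothetical and not taken from hlink's code:

```python
def is_categorical(comparison_feature: dict) -> bool:
    """Return True only when the `categorical` key is present and truthy."""
    # The buggy pattern is `"categorical" in comparison_feature`, which is
    # True even for {"categorical": False} because it only tests for the key.
    return bool(comparison_feature.get("categorical", False))


# Hypothetical comparison features, for illustration only.
features = [
    {"alias": "race", "categorical": True},
    {"alias": "namelast_jw", "categorical": False},
    {"alias": "age_diff"},
]

categorical_aliases = [f["alias"] for f in features if is_categorical(f)]
print(categorical_aliases)
```

Only the first feature is treated as categorical; the second has the key set to false and the third omits it.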

3.3.0

Added

* Added logging for user input to the script. This is extremely helpful for diagnosing
errors. [PR 64][pr64]
* Added and improved documentation for several comparison types. [PR 47][pr47]

Changed

* Started writing to a unique log file for each script run. [PR 55][pr55]
* Updated and improved the tutorial in examples/tutorial. [PR 63][pr63]
* Switched the packaging configuration from setup.py and setup.cfg to pyproject.toml. [PR 71][pr71]
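Writing to a unique log file for each script run can be sketched with the standard `logging` module. This is an illustration only: the file naming scheme, the logger name `run_sketch`, and both helper functions are assumptions, not hlink's actual behavior.

```python
import logging
import uuid
from datetime import datetime
from pathlib import Path


def make_run_log_path(log_dir: Path) -> Path:
    """Build a log file path that is different for every call."""
    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # A short random suffix keeps two runs in the same second apart.
    run_id = uuid.uuid4().hex[:8]
    return log_dir / f"run-{timestamp}-{run_id}.log"


def configure_run_logging(log_dir: Path) -> Path:
    """Attach a file handler for this run and return the log file path."""
    log_dir.mkdir(parents=True, exist_ok=True)
    log_path = make_run_log_path(log_dir)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger = logging.getLogger("run_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return log_path
```

Logging each command the user enters through the same logger would then leave a per-run record that can be read back when diagnosing an error.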

Fixed

* Fixed a bug that caused the Jaro-Winkler score for two empty strings to be 1.0. The
score is now 0.0 when both strings are empty. [PR 59][pr59]
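The empty-string behavior described above can be illustrated with a small, independent Python implementation of Jaro-Winkler. This is a sketch of the standard algorithm with an explicit empty-input guard; it is not hlink's own implementation, and the function names and the `prefix_scale` default are assumptions.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity; returns 0.0 if either string is empty."""
    # Guard first, so that two empty strings score 0.0 rather than 1.0.
    if not s1 or not s2:
        return 0.0
    if s1 == s2:
        return 1.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    s1_matched = [False] * len(s1)
    s2_matched = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not s2_matched[j] and s2[j] == ch:
                s1_matched[i] = s2_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    s1_seq = [c for c, m in zip(s1, s1_matched) if m]
    s2_seq = [c for c, m in zip(s2, s2_matched) if m]
    transpositions = sum(a != b for a, b in zip(s1_seq, s2_seq)) // 2
    return (
        matches / len(s1)
        + matches / len(s2)
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted by a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)
```

With this guard, `jaro_winkler("", "")` evaluates to 0.0, while identical non-empty strings still score 1.0.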

3.2.7

Added

* Added a configuration validation that checks that both data sources contain the id column. [PR 13][pr13]
* Added driver memory options to `SparkConnection`. [PR 40][pr40]
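The id column validation mentioned above might look roughly like the following. This is a sketch under stated assumptions: the function name `check_id_column`, the labels `datasource_a` and `datasource_b`, and the use of plain column-name lists are all hypothetical rather than taken from hlink's `conf_validations` module.

```python
def check_id_column(id_column: str, columns_a: list, columns_b: list) -> None:
    """Raise ValueError if either data source lacks the id column."""
    sources = (("datasource_a", columns_a), ("datasource_b", columns_b))
    missing = [name for name, columns in sources if id_column not in columns]
    if missing:
        raise ValueError(
            f"id column '{id_column}' is missing from: {', '.join(missing)}"
        )


# Passes silently: both column lists contain the id column.
check_id_column("id", ["id", "namefrst"], ["id", "namelast"])
```

Running a check like this before any linking work starts turns a late, hard-to-trace failure into an immediate configuration error.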

Changed

* Upgraded from PySpark 3.2 to 3.3. [PR 11][pr11]
* Capped the number of partitions requested at 10,000. [PR 40][pr40]

Fixed

* Fixed a bug where `feature_selections` was always required in the config file.
It now defaults to an empty list as intended. [PR 15][pr15]
* Fixed a bug where an error message in `conf_validations` was not formatted correctly. [PR 13][pr13]
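The `feature_selections` default described above amounts to reading an optional key with a fallback. A minimal sketch, assuming the parsed config file is a plain dictionary; the helper name and the sample config are hypothetical:

```python
def get_feature_selections(config: dict) -> list:
    """Return the configured feature selections, or an empty list if absent."""
    # Indexing with config["feature_selections"] would raise KeyError when the
    # section is omitted, which makes an optional section effectively required.
    return config.get("feature_selections", [])


config_without_section = {"id_column": "id"}  # hypothetical minimal config
print(get_feature_selections(config_without_section))
```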

3.2.6

Added

* Made hlink installable with `pip` via PyPI.org.

3.2.1

Added

* Improved logging during startup and for the `LinkTask.run_all_steps()` method.
[PR 7][pr7]

Changed

* Added code to adjust the number of Spark partitions based on the size of the input
datasets for some link steps. This should help these steps scale better with large
datasets. [PR 10][pr10]
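Scaling the partition count with input size, as described above, can be sketched as a small pure function. The rows-per-partition target and the minimum are assumptions chosen for illustration; the 10,000 ceiling mirrors the cap listed under 3.2.7. The function name is hypothetical.

```python
import math


def choose_num_partitions(
    row_count_a: int,
    row_count_b: int,
    rows_per_partition: int = 1_000_000,  # assumed target, for illustration
    min_partitions: int = 200,            # assumed lower bound
    max_partitions: int = 10_000,         # cap listed under 3.2.7
) -> int:
    """Pick a partition count that grows with the combined input size."""
    total_rows = row_count_a + row_count_b
    wanted = math.ceil(total_rows / rows_per_partition)
    # Clamp the result so tiny inputs and huge inputs both stay in range.
    return max(min_partitions, min(wanted, max_partitions))


print(choose_num_partitions(300_000_000, 200_000_000))
```

In a Spark job, a value like this could be passed to `DataFrame.repartition` before an expensive join.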

Fixed

* Fixed a bug where model exploration's step 3 raised a `TypeError` because it
built a file path manually. [PR 8][pr8]
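The changelog does not say exactly how the path was built, so the following is only one plausible illustration of this class of bug: concatenating a `str` with a non-string path component raises `TypeError`, while `pathlib` joins accept both. The function name and file names are hypothetical.

```python
from pathlib import Path


def build_output_path(base_dir, table_name: str) -> Path:
    """Join a directory and a file name without string concatenation."""
    # Path(...) accepts both str and Path inputs, so the join cannot fail
    # with a TypeError the way `base_dir + "/" + name` can.
    return Path(base_dir) / f"{table_name}.csv"


try:
    # Manual concatenation: str + Path is not defined and raises TypeError.
    broken = "/tmp/out" + "/" + Path("scores.csv")
except TypeError as exc:
    print(f"manual concatenation failed: {exc}")

print(build_output_path("/tmp/out", "scores"))
```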

3.2.0

Changed

* Upgraded from Python 3.6 to 3.10. [PR 5][pr5]
* Upgraded from PySpark 2 to PySpark 3. [PR 5][pr5]
* Upgraded from Java 8 to Java 11. [PR 5][pr5]
* Upgraded from Scala 2.11 to Scala 2.12. [PR 5][pr5]
* Upgraded from Scala Commons Text 1.4 to 1.9. This includes some bug fixes which
may slightly change Jaro-Winkler scores. [PR 5][pr5]
