Hlink

Latest version: v3.7.0

3.2.7

Overview

This release of hlink contains some bug fixes and maintenance items, along with some tuning of hlink for large datasets. It modifies the `hlink.spark.session.SparkConnection` class to allow easier adjustment of the `spark.driver.memory` configuration setting, and it upgrades hlink from Spark 3.2 to 3.3.

Changes

- Upgraded from Spark 3.2 to 3.3.0. This required only a few internal changes to hlink.
- Fixed a bug where `feature_selections` was always required in the config file. Now it defaults to `[]` as intended.
- Fixed a bug where an error message in `conf_validations` wasn't formatted correctly.
- Added a check to `conf_validations` to confirm that both data sources contain the id column specified in the config file.
- Improved the project README.
- Capped the number of Spark partitions requested at 10,000 to prevent hlink from requesting too many partitions with very large datasets.
- Added driver memory options to `SparkConnection`.
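The two `conf_validations` changes above can be pictured with a minimal sketch. It is not hlink's actual code: the function names, the `id_column` key, and the `datasource_a`/`datasource_b` labels are assumptions made for illustration, and the config is modeled as a plain dictionary.

```python
def apply_defaults(config: dict) -> dict:
    """Return a copy of the config with optional keys filled in (hypothetical helper)."""
    filled = dict(config)
    # An omitted feature_selections key is treated as an empty list.
    filled.setdefault("feature_selections", [])
    return filled


def check_id_column(config: dict, columns_a: list, columns_b: list) -> None:
    """Raise ValueError if either data source lacks the configured id column."""
    id_column = config["id_column"]  # assumed key name
    sources = (("datasource_a", columns_a), ("datasource_b", columns_b))
    for name, columns in sources:
        if id_column not in columns:
            raise ValueError(f"id column '{id_column}' not found in {name}")
```

Checking both sources before any linking work starts means a misconfigured id column fails fast with a readable message rather than partway through a Spark job.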

Notes

- Added developer documentation on how to push hlink to PyPI.
- Cleaned up some old files and reorganized some test files that were in a confusing location.

3.2.6

Overview

With this release, hlink is now installable from [pypi.org](https://pypi.org) with `pip install hlink`. hlink went through several small intermediate updates to get packaging with PyPI set up correctly.

Changes

- Updated metadata to integrate with PyPI.
- Updated documentation to include instructions on installing from PyPI.

Notes

- Versions 3.2.2 through 3.2.5 are intermediate versions needed to get hlink working on PyPI. They don't have associated releases.

3.2.1

Overview

This is a small patch release with a bug fix and a couple of usability improvements to hlink.

Changes

- Fixed a bug where step 3 of model exploration raised a `TypeError` while manually building a file path.
- Improved logging during startup and for the `LinkTask.run_all_steps()` method.
- Added code to adjust the number of Spark partitions based on the size of input datasets for a few link steps. This should help these steps scale better with large datasets.
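As a rough illustration of size-based partitioning, the sketch below derives a partition count from a row count. It is not hlink's implementation: `choose_num_partitions`, `rows_per_partition`, and `min_partitions` are hypothetical names and values. The upper bound of 10,000 is the cap described in the 3.2.7 notes, and 200 is Spark's default for `spark.sql.shuffle.partitions`.

```python
MAX_PARTITIONS = 10_000  # cap described in the 3.2.7 release notes


def choose_num_partitions(
    row_count: int,
    rows_per_partition: int = 500_000,  # assumed target size, not from hlink
    min_partitions: int = 200,  # Spark's default shuffle partition count
) -> int:
    """Pick a partition count that grows with the data but stays bounded."""
    # Integer ceiling division avoids floating-point rounding on large counts.
    needed = -(-row_count // rows_per_partition)
    return max(min_partitions, min(needed, MAX_PARTITIONS))
```

In a Spark job, a value like this could be passed to `DataFrame.repartition()` before an expensive step, so that small inputs keep the default parallelism and very large inputs do not request an unbounded number of partitions.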

Notes

- Updated the pre-commit installation file to work with Python 3.10.

3.2.0

Overview

This release upgrades many of hlink's dependencies to newer versions. This should make hlink easier to maintain and keep it from being stuck on old package versions that lack newer features. Because so many dependencies have been updated, hlink's behavior has changed in some places. Most of these changes should be slight, and the more noticeable ones are explained below.

Changes

- Upgraded from Python 3.6 to 3.10.
- Upgraded from Java 8 to Java 11.
- Upgraded from Scala 2.11 to Scala 2.12.
- Upgraded from pyspark 2 to pyspark 3.
- Upgraded the package that hlink's Scala code uses for computing Jaro-Winkler scores, Apache Commons Text, from version 1.4 to version 1.9. This included some bug fixes. Scores were incorrect in some cases in previous versions, so Jaro-Winkler scores may change slightly with this upgrade. See the [Apache Commons Text changelog](https://commons.apache.org/proper/commons-text/changes-report.html) for more information.
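For readers unfamiliar with the metric, here is a minimal Python sketch of the textbook Jaro-Winkler similarity. It is illustrative only: hlink computes these scores through a Scala package, and the library's implementation may differ in details such as thresholds or edge-case handling, which is why scores can shift between library versions. The function names and default parameters are assumptions.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as transpositions.
    a = [c for c, m in zip(s1, matched1) if m]
    b = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(x != y for x, y in zip(a, b)) // 2
    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(
    s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4
) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    score = jaro(s1, s2)
    prefix = 0
    for x, y in zip(s1, s2):
        if x != y or prefix == max_prefix:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)
```

With this sketch, `jaro_winkler("MARTHA", "MARHTA")` evaluates to roughly 0.961, the value commonly quoted for this pair.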

Notes

- Upgraded to Jinja2 3, which slightly changed how Jinja's `PackageLoader` works. Adding some empty `templates/` subdirectories to some hlink packages fixed the issues.
- Upgraded to newer versions of flake8 and black, which caused only a few minor formatting changes.
- Made Sphinx docs automatically track hlink's version instead of requiring a manual update.

3.1.0

Overview

This version of hlink contains a few tweaks to v3.0.0, the initial open source version of hlink.

Changes

- Started exporting true positive and true negative data along with false positive and false negative data in model exploration.
- Fixed a bug where `exact_all_mult` wasn't handled correctly in config validation.
- Polished the repo for open sourcing.

Notes

- Added a `quickcheck` pytest marker for quickly testing a subset of core functionality.
- Set up GitHub Actions.
