Amazon-denseclus

Latest version: v0.2.2

0.2.2

* Updated `evaluate` helper function to do both DBCV and Calinski-Harabasz
* Added new Notebook for exploring clustering on SageMaker Jumpstart
* Dependency version bumps
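
For readers unfamiliar with the second metric, the Calinski-Harabasz index is the ratio of between-cluster dispersion to within-cluster dispersion, each scaled by its degrees of freedom. The following is a minimal pure-Python sketch of that formula, not the code inside `evaluate`; the function name and the zero-dispersion handling are assumptions made for illustration.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for a list of equal-length tuples and integer labels."""
    n = len(points)
    dim = len(points[0])

    # Group the points by cluster label.
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")

    def mean(rows):
        return [sum(row[d] for row in rows) / len(rows) for d in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # size-weighted squared distance of centroids to the overall mean
    within = 0.0   # squared distance of points to their own centroid
    for members in groups.values():
        centroid = mean(members)
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(p, centroid) for p in members)

    if within == 0:
        # Choice made for this sketch only: perfectly tight clusters score as infinite.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


score = calinski_harabasz([(0, 0), (0, 2), (10, 0), (10, 2)], [0, 0, 1, 1])
print(score)  # 50.0
```

Higher values indicate tighter, better-separated clusters. This sketch treats every distinct label as a cluster, so noise points labelled `-1` would need to be filtered out first.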

0.2.1

* Split up modules: numerical and categorical now have their own files for future enhancements
* Changed `score` method to `evaluate`; it now scores via DBCV and coverage and returns labels
* Consolidated GPU settings: now just `use_gpu`, set to `True` or `False`
* Added version file for automated setup
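
The notes do not define "coverage". A common reading, assumed here, is the fraction of points assigned to a cluster rather than marked as noise (HDBSCAN labels noise as `-1`). The helper below is a hypothetical sketch of that reading, not the library's implementation.

```python
def coverage(labels, noise_label=-1):
    """Fraction of points that were assigned to a cluster (assumed definition)."""
    if not labels:
        raise ValueError("labels must not be empty")
    clustered = sum(1 for label in labels if label != noise_label)
    return clustered / len(labels)


print(coverage([0, 0, 1, -1]))  # 0.75
```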

0.2.0

Summary

Add `predict` method based on the combine method for `ensemble`.
When ensemble is selected, **DenseClus does not combine** the UMAPs; instead it fits a clusterer for each UMAP.
When `predict` is called, it uses `approximate_predict` in HDBSCAN to then vote on the cluster assignment.
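
The voting rule itself is not spelled out here. As a rough sketch of one possible rule, assume each clusterer yields a pair of per-point labels and membership strengths (the shape that `hdbscan.approximate_predict` returns), and that the vote keeps the label with the highest strength. The `vote` function and that rule are assumptions for illustration; the sketch also ignores that label ids from separately fitted clusterers need not refer to the same clusters.

```python
def vote(predictions, noise_label=-1):
    """Pick, per point, the non-noise label with the highest membership strength.

    predictions: list of (labels, strengths) pairs, one pair per clusterer.
    """
    n_points = len(predictions[0][0])
    final = []
    for i in range(n_points):
        best_label, best_strength = noise_label, 0.0
        for labels, strengths in predictions:
            if labels[i] != noise_label and strengths[i] > best_strength:
                best_label, best_strength = labels[i], strengths[i]
        final.append(best_label)
    return final


numerical = ([0, 1, -1], [0.9, 0.2, 0.0])
categorical = ([0, 2, -1], [0.4, 0.7, 0.0])
print(vote([numerical, categorical]))  # [0, 2, -1]
```

A point stays noise only when every clusterer labels it as noise.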

**Other changes**

* Change default method from 'contrast' to 'intersection'
* Change default distance metric for categoricals to `jaccard` for later rapids integration
* Increase overall test coverage
* `prediction_data=False` for combined UMAPs, `True` for ensemble
* Update examples to reflect changes
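
For reference, the Jaccard distance between two binary vectors is one minus the number of positions where both are set divided by the number of positions where either is set. The sketch below assumes the categorical features are one-hot encoded before the distance is taken; the function name is illustrative only.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length binary vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        return 0.0  # two all-zero vectors are treated as identical in this sketch
    return 1.0 - both / either


# Two rows that share one active category out of three active overall.
print(round(jaccard_distance([1, 0, 1, 0], [1, 0, 0, 1]), 4))  # 0.6667
```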

0.1.2

A few minor tweaks to the library, primarily to help with maintenance.

1) Added a Continuous Deployment (CD) workflow to publish directly to PyPI when merged into main
2) Fixed `__repr__` and `__str__` methods so they don't return the whole fitted dataframe
3) Fixed coverage runs and made tox a single call
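
To illustrate the idea behind item 2, a compact `__repr__` can summarize the fitted data instead of printing it. The `FittedModel` class below is hypothetical and is not the DenseClus class; it only shows the pattern.

```python
class FittedModel:
    """Hypothetical stand-in for a fitted estimator that holds its training rows."""

    def __init__(self, data, random_state=42):
        self.data = data  # list of rows, each row a list of values
        self.random_state = random_state

    def __repr__(self):
        # Report the shape of the data rather than its full contents.
        n_rows = len(self.data)
        n_cols = len(self.data[0]) if self.data else 0
        return (
            f"{type(self).__name__}(random_state={self.random_state}, "
            f"n_rows={n_rows}, n_cols={n_cols})"
        )


model = FittedModel([[1, 2], [3, 4]])
print(model)  # FittedModel(random_state=42, n_rows=2, n_cols=2)
```

Because `object.__str__` falls back to `__repr__`, defining `__repr__` alone covers both `print(model)` and `repr(model)` in this sketch.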

0.1.1

Added a feature to auto-impute missing values.
It calls simple imputation under the hood for both categorical and numerical features.
The user can change the defaults with keyword arguments.
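
As a rough picture of what simple imputation means, the sketch below fills missing numerical values with the median and missing categorical values with the most frequent value. Both strategies and both function names are assumptions; these notes do not state which defaults the library uses.

```python
from collections import Counter
from statistics import median


def impute_numeric(values):
    """Replace None with the median of the observed values (assumed strategy)."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(present)
    return [fill if v is None else v for v in values]


def impute_categorical(values):
    """Replace None with the most frequent observed value (assumed strategy)."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no observed values")
    fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


print(impute_numeric([1.0, None, 3.0]))           # [1.0, 2.0, 3.0]
print(impute_categorical(["a", None, "b", "a"]))  # ['a', 'a', 'b', 'a']
```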

In addition, updated HDBSCAN so that the parameter search comes first, as DenseClus converges to the optimal solution for DBCV. I don't know why.

PS: Really should be semantic version 2 but I am going this route instead.

https://github.com/awslabs/amazon-denseclus/issues/23

0.1.0

*Description of changes:*

**New Feature: Configure underlying algorithms**

**Update: Now supported for Python 3.11 (and only Python 3.11)**

**Other Updates**
* Move to using Ruff for linting
* Address some bugs and user warnings in the package code
* Update and lint notebooks
* Refactor unit tests with fixtures
* Update tox, pre-commit, etc. to run on latest Python
* Refactor of Makefile to support all above
* Better error handling
* Update workflows in GHA to remove redundancy
* Better issue-tracking templates
