Pyspark-ds-toolbox

Latest version: v0.4.3

Page 1 of 2

0.4.2

Fixed
* `pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv()`: `bucket_fraction` argument behavior.

Changed
* `pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv()`: The returned value changed from `dict[dfs_iv]` (a Spark DataFrame) to `dict[df_iv]` (a pandas DataFrame).

0.4.1

Fixed
* `pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv()`: behavior with `num_features` and `cat_features` arguments.

0.4.0

Added

* Added the `pyspark_ds_toolbox.ml.feature_selection.information_value` module and all its functionalities:
  * `feature_selection_with_iv()`
  * `compute_woe_iv()`
  * `WeightOfEvidenceComputer()`
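
For readers unfamiliar with the quantities this module is named after, the following is a minimal plain-Python sketch of weight of evidence (WoE) and information value (IV) for one categorical feature and a binary target. It does not reproduce the library's implementation or its API: the function name `woe_iv_sketch`, the `smoothing` parameter, and the sign convention (log of the non-event share over the event share) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def woe_iv_sketch(values, targets, smoothing=0.5):
    """Hypothetical helper: return ({category: woe}, iv) for a 0/1 target.

    Assumed convention: WoE = ln(share of non-events / share of events).
    `smoothing` is added to every count so that empty cells do not
    produce a division by zero or log(0).
    """
    events = Counter()
    non_events = Counter()
    for value, target in zip(values, targets):
        if target == 1:
            events[value] += 1
        else:
            non_events[value] += 1

    categories = sorted(set(events) | set(non_events))
    total_events = sum(events.values()) + smoothing * len(categories)
    total_non_events = sum(non_events.values()) + smoothing * len(categories)

    woe = {}
    iv = 0.0
    for category in categories:
        pct_event = (events[category] + smoothing) / total_events
        pct_non_event = (non_events[category] + smoothing) / total_non_events
        woe[category] = math.log(pct_non_event / pct_event)
        # Each term is non-negative, so IV grows with class separation.
        iv += (pct_non_event - pct_event) * woe[category]
    return woe, iv


# Toy data: category "a" is mostly events, category "b" mostly non-events.
woe, iv = woe_iv_sketch(["a", "a", "a", "b", "b", "b"], [1, 1, 0, 0, 0, 1])
```

In this sketch a feature whose categories all have the same event rate gets an IV of zero, and higher IV values indicate a feature that separates the two classes more strongly.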

0.3.4

Breaking Changes

* `pyspark_ds_toolbox.ml.data_prep.features_vector.get_features_vector`: Now returns a list of pyspark indexers, encoders and assemblers, to be used with pipelines.
* `pyspark_ds_toolbox.ml.classification.baseline_classifiers.py`: Models are now returned as pipelines.

0.3.3

Changed

* `pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers` now has a `mlflow_experiment_name` argument.


Fixed

* `pyspark_ds_toolbox.ml.feature_importance.native_spark`.

0.3.2

Changed

* Functionalities from module `pyspark_ds_toolbox.wrangling` were refactored into `pyspark_ds_toolbox.wrangling.reshape.py` and `pyspark_ds_toolbox.wrangling.data_quality.py`;
* Functionalities from module `pyspark_ds_toolbox.ml.data_prep` were refactored into `pyspark_ds_toolbox.ml.data_prep.class_weights.py` and `pyspark_ds_toolbox.ml.data_prep.features_vector.py`.
