Changed
* `pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv()`: The returned dictionary now holds a pandas DataFrame under `dict[df_iv]` instead of a Spark DataFrame under `dict[dfs_iv]`.
0.4.1
Fixed
* `pyspark_ds_toolbox.ml.feature_selection.information_value.feature_selection_with_iv()`: Fixed the behavior with the `num_features` and `cat_features` arguments.
0.4.0
Added
* Added the `pyspark_ds_toolbox.ml.feature_selection.information_value` module and all its functionalities:
  * `feature_selection_with_iv()`
  * `compute_woe_iv()`
  * `WeightOfEvidenceComputer()`
0.3.4
Breaking Changes
* `pyspark_ds_toolbox.ml.data_prep.features_vector.get_features_vector`: Now returns a list of pyspark indexers, encoders and assemblers, to be used with pipelines.
* `pyspark_ds_toolbox.ml.classification.baseline_classifiers.py`: Models are now returned as pipelines.
0.3.3
Changed
* `pyspark_ds_toolbox.ml.classification.baseline_binary_classfiers` now has a `mlflow_experiment_name` argument.
* Functionalities from module `pyspark_ds_toolbox.wrangling` were refactored into `pyspark_ds_toolbox.wrangling.reshape.py` and `pyspark_ds_toolbox.wrangling.data_quality.py`.
* Functionalities from module `pyspark_ds_toolbox.ml.data_prep` were refactored into `pyspark_ds_toolbox.ml.data_prep.class_weights.py` and `pyspark_ds_toolbox.ml.data_prep.features_vector.py`.