Automunge

Latest version: v8.33

3.37

- fixed bug for scenario of dataframes passed with non-range index introduced in 3.31
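The changelog does not say what caused this bug. As a general illustration only, one common way a non-range index trips up pandas code is label alignment on assignment. The sketch below uses plain pandas and is not Automunge's code; the frame and column names are made up for the example.

```python
import numpy as np
import pandas as pd

# A dataframe whose index is not the default 0..n-1 range.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0]}, index=[10, 20, 30])

# A Series built without an explicit index gets a RangeIndex (0, 1, 2).
# Column assignment aligns on labels, so no row matches and every entry is NaN.
misaligned = df.copy()
misaligned["y"] = pd.Series(np.array([7, 8, 9]))

# Reusing the dataframe's own index keeps the values in their intended rows.
aligned = df.copy()
aligned["y"] = pd.Series(np.array([7, 8, 9]), index=df.index)
```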

3.36

- new infill type 'lcinfill' available for assignment in assigninfill, comparable to modeinfill but applies least common value of a set instead of most common value
- new validation check run on passed assigncat dictionary to ensure passed keys have corresponding entries of family tree assignments available in process_dict
- fixed issue with ML infill in postmunge application (originated from an edge case where one of the columns had its entire set subject to infill)
- fixed second issue with ML infill associated with resetting a marker after infill
- added note to READ ME about infilliterate
- updates to feature importance associated with edge cases when model not trained
- updated process_dict MLinfilltype entries for exc2, exc3, exc4 from 'label' to 'numeric' ('label' MLinfilltype for now discontinued)
- removed convention of dropping rows where the label column is NaN; the new convention is that label columns are subject to the default infill associated with their transformation category
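The 'lcinfill' idea from the list above (fill missing entries with the least common value rather than the most common one) can be sketched in plain pandas. This is a minimal illustration, not the library's implementation; the function name `least_common_infill` is hypothetical, and tie-breaking between equally rare values is left to whatever order pandas returns.

```python
import pandas as pd

def least_common_infill(series: pd.Series) -> pd.Series:
    """Fill missing entries with the least frequent non-missing value."""
    counts = series.value_counts()  # missing values are excluded by default
    if counts.empty:
        # Every entry is missing, so there is no value to infill with.
        return series.copy()
    least_common = counts.idxmin()
    return series.fillna(least_common)

example = pd.Series(["a", "a", "a", "b", "b", "c", None])
filled = least_common_infill(example)
```

Swapping `idxmin` for `idxmax` in this sketch gives the most-common-value behavior that the entry compares it to.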

3.35

- added support for hyperparameter tuning of predictive models to feature importance evaluation in both automunge(.) and postmunge(.) (previously was only available for ML infill)

3.34

- found and fixed edge case bug for assigned infill options meaninfill / medianinfill / modeinfill associated with cases where an entire set is subject to infill
- found that infill assignments were in some cases resetting the dtypes of sets returned from transform functions (e.g. changing a column from integers to floats). Updated infill functions to ensure returned data types are consistent with the types returned from transform functions
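One way to keep dtypes stable through an infill step is to record the column's dtype before filling and cast back afterwards. The sketch below assumes plain pandas and a boolean mask marking the rows subject to infill; the helper name `infill_preserving_dtype` is hypothetical and this is not the library's code.

```python
import pandas as pd

def infill_preserving_dtype(column: pd.Series, infill_mask: pd.Series, value) -> pd.Series:
    """Replace masked rows with `value`, then restore the original dtype."""
    original_dtype = column.dtype
    # where() keeps entries where the condition is True and uses `value` elsewhere;
    # a float `value` may upcast an integer column along the way.
    filled = column.where(~infill_mask, value)
    return filled.astype(original_dtype)

encoded = pd.Series([1, 0, 1, 0], dtype="int8")
mask = pd.Series([False, True, False, False])
restored = infill_preserving_dtype(encoded, mask, 1.0)  # infill value arrives as a float
```

The cast is only safe when the infill value is representable in the original dtype, as with a mode taken from an integer column; a fractional mean would be truncated.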

3.33

- added support for "modeinfill" (infill with most common value) to '1010' binary encoded sets
- reintroduced "modeinfill" for one-hot encoded sets
- improved implementation of modeinfill (moved all scenarios into single function for simplicity and clarity)
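For a one-hot encoded set, a mode infill can be sketched as: find the column with the most activations among rows not subject to infill, then activate only that column in the rows that are. This is an illustrative pandas sketch under that assumption, not the library's function; `onehot_mode_infill` and the column names are hypothetical.

```python
import pandas as pd

def onehot_mode_infill(df: pd.DataFrame, columns: list, infill_mask: pd.Series) -> pd.DataFrame:
    """Set masked rows of a one-hot column set to the most common activation."""
    valid = df.loc[~infill_mask, columns]
    if valid.empty:
        # Entire set is subject to infill: no mode exists, leave data unchanged.
        return df.copy()
    mode_column = valid.sum().idxmax()  # column activated most often
    result = df.copy()
    result.loc[infill_mask, columns] = 0
    result.loc[infill_mask, mode_column] = 1
    return result

onehot = pd.DataFrame({"color_a": [1, 0, 1, 0], "color_b": [0, 1, 0, 0]})
needs_infill = pd.Series([False, False, False, True])
infilled = onehot_mode_infill(onehot, ["color_a", "color_b"], needs_infill)
```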

3.32

- new process_dict MLinfilltype entry option of 'binary' to distinguish between single column categorical ordinal entries (singlct) and single column categorical boolean entries (binary)
- updated methods that test if a column is boolean by basing on the process_dict entry for MLinfilltype instead of inspecting unique values in column (for improved operating efficiency)
- (exception: the label smoothing functions retain the original boolean column test, in case a user wants to call them outside of an automunge(.) call)
- corrected process_dict MLinfilltype entries for bkt1, bkt2 to singlct
- fixed PCA bug associated with checking for excl categories
- removed normalize_components parameter from scikit SparsePCA application option as it will be deprecated
- fixed postmunge feature importance bug associated with PCA
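The efficiency point in the 3.32 notes (testing whether a column is boolean by metadata lookup rather than by scanning its values) can be illustrated with a toy dictionary. The layout below, a nested dict keyed by category with an 'MLinfilltype' entry, is an assumption about how process_dict is shaped, and `my_bool_category` is a made-up category name; only the 'binary', 'singlct', and 'numeric' values and the bkt1 / exc2 assignments come from the notes.

```python
import pandas as pd

# Toy stand-in for process_dict (assumed layout, not the library's real object).
toy_process_dict = {
    "my_bool_category": {"MLinfilltype": "binary"},  # hypothetical category
    "bkt1": {"MLinfilltype": "singlct"},
    "exc2": {"MLinfilltype": "numeric"},
}

def is_boolean_by_lookup(category: str, process_dict: dict) -> bool:
    """Constant-time check based on the category's metadata."""
    return process_dict[category]["MLinfilltype"] == "binary"

def is_boolean_by_inspection(column: pd.Series) -> bool:
    """Scans the column's values, so cost grows with the number of rows."""
    return set(column.dropna().unique()) <= {0, 1}

ordinal_column = pd.Series([0, 1, 2, 1])
boolean_column = pd.Series([0, 1, 1, 0])
```

The lookup does not touch the data at all, which is where the efficiency gain comes from; the inspection version remains usable when no category metadata is available, such as when a function is called on its own.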
