Automunge

Latest version: v8.33

4.60

- created companion columntype_report for columns returned from label set
- available in postprocess_dict as postprocess_dict['label_columntype_report']
- new label processing category 'lbos' that applies ordinal encoding followed by conversion to string
- this supports downstream machine learning libraries that inspect the data type of labels to decide between regression and classification
- (i.e. some libraries may treat integer label sets as regression targets when the user intends classification)
- note that inversion is supported to recover the original form, e.g. to convert predictions back to the original label values
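
The 'lbos' idea above (ordinal encoding, then conversion to string, with inversion) can be sketched in plain Python. This is only an illustration of the technique, not Automunge's implementation: the function names and the sort-based code assignment are assumptions made for the example.

```python
def fit_ordinal_string_encoder(labels):
    """Assign each distinct label an integer code, stored as a string."""
    # Assumption: codes follow sorted order so the mapping is deterministic.
    categories = sorted(set(labels), key=str)
    encode_map = {cat: str(code) for code, cat in enumerate(categories)}
    decode_map = {code: cat for cat, code in encode_map.items()}
    return encode_map, decode_map


def encode(labels, encode_map):
    # String codes keep a downstream library from reading the labels as numeric.
    return [encode_map[label] for label in labels]


def invert(encoded, decode_map):
    # Recover the original label values, e.g. from model predictions.
    return [decode_map[code] for code in encoded]


labels = ["cat", "dog", "cat", "bird"]
encode_map, decode_map = fit_ordinal_string_encoder(labels)
encoded = encode(labels, encode_map)
recovered = invert(encoded, decode_map)
```

Because the encoded labels are strings rather than integers, a library that infers the task from the label dtype is more likely to treat the problem as classification.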

4.59

- added returned_PCA_columns to postprocess_dict
- added a memory clear operation to PCA transform to reduce overhead
- added returned_Binary_columns to postprocess_dict
- improved printouts for PCA and Binary dimensionality reductions
- removed ordinal columns from Binary dimensionality reduction, as it made more sense to exclude them
- improved README writeup for ord4, intended as a scaled metric that ranks redundant entries by frequency of occurrence
- new report classifying returned columns by data type now available in postprocess_dict as postprocess_dict['columntype_report']
- includes aggregated lists of columns per types: continuous, boolean, ordinal, onehot, onehot_sets, binary, binary_sets, passthrough
- where onehot captures all one-hot encoded columns, and onehot_sets holds the same columns grouped by originating transform as a list of lists
- similarly for binary and binary_sets
- these aggregations should be helpful for training downstream models in libraries that accept specification of column types, e.g. for entity embeddings
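
As a rough sketch of how such a report might be consumed, the snippet below builds a dictionary with the keys listed above and groups the categorical columns per originating transform. The column names and values are hypothetical placeholders, and the helper functions are not part of Automunge; in practice the report would be read from postprocess_dict['columntype_report'].

```python
# Hypothetical report contents, using the keys listed in the 4.59 notes.
columntype_report = {
    "continuous": ["age"],
    "boolean": ["member"],
    "ordinal": ["city"],
    "onehot": ["color_red", "color_blue", "size_S", "size_L"],
    "onehot_sets": [["color_red", "color_blue"], ["size_S", "size_L"]],
    "binary": ["zip_0", "zip_1"],
    "binary_sets": [["zip_0", "zip_1"]],
    "passthrough": ["record_id"],
}


def flatten(list_of_lists):
    """Collapse a list of column groups into a single flat list."""
    return [col for group in list_of_lists for col in group]


def categorical_groups(report):
    """Return one column group per categorical source feature."""
    groups = [[col] for col in report["ordinal"]]  # one column per feature
    groups += report["onehot_sets"]                # one group per transform
    groups += report["binary_sets"]
    return groups


groups = categorical_groups(columntype_report)
```

Each group could then be passed to a model as a single categorical input, which is the kind of per-feature specification an entity embedding layer typically expects.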

4.58

- improved printouts to support postmunge(.) troubleshooting
- for cases where the df_train columns passed to automunge(.) are inconsistent with the df_test columns passed to postmunge(.)
- removed a validation test for a bug scenario that was eliminated as part of the 4.55 update
- also simplified the reporting for the ML infill validation test added in 4.56, since it only needs to run once instead of once for every column
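
The kind of train/test column mismatch described above can be checked by hand before calling postmunge(.). The following is a minimal sketch on plain lists of column names; compare_columns is a hypothetical helper, not an Automunge function, and it does not account for cases such as a label column that is legitimately absent from the test data.

```python
def compare_columns(train_columns, test_columns):
    """Report column headers that differ between train and test data."""
    train_set, test_set = set(train_columns), set(test_columns)
    return {
        # Present at training time but absent from the test data.
        "missing": [c for c in train_columns if c not in test_set],
        # Present in the test data but never seen at training time.
        "unexpected": [c for c in test_columns if c not in train_set],
        # True only when names and order match exactly.
        "same_order": list(train_columns) == list(test_columns),
    }


result = compare_columns(
    ["age", "city", "income"],  # hypothetical df_train headers
    ["age", "income", "zip"],   # hypothetical df_test headers
)
```

With pandas dataframes, the same helper could be called as compare_columns(df_train.columns, df_test.columns).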

4.57

- found and fixed small bug in postmunge feature importance evaluation

4.56

- added a validation and printout for ML infill edge case scenario
- a little cleanup to infill validation aggregations

4.55

- rewrote the insertinfill function for simplicity / clarity, and also to segregate supporting columns from attachment to df_train, which eliminates one more edge case failure mode
- this is a pretty central support function and was one of the first written, which is part of the reason it was a little sloppy; it is much cleaner now
- new validation infill_suffixoverlap_results added out of an abundance of caution
- also a few more tweaks to dataframe column inspections to use more efficient .columns instead of list()
- similar tweaks to a few dictionary key inspections
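
To illustrate the dictionary key tweak, and assuming the inspections in question were membership tests, the sketch below contrasts a lookup through a temporary list with a lookup directly on the dictionary. The dictionary contents are hypothetical.

```python
# Hypothetical dictionary standing in for one of the library's lookups.
column_dict = {"column_a": 1, "column_b": 2}

# Builds a throwaway list of keys, then scans it linearly.
found_via_list = "column_b" in list(column_dict.keys())

# Hash lookup directly on the dictionary, with no intermediate copy.
found_direct = "column_b" in column_dict
```

Both expressions give the same answer; the second avoids the copy. For dataframes, the analogous form is a membership test against df.columns, which pandas supports directly.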
