Automunge

Latest version: v8.33



2.71

- a few code comment cleanups
- fixed default MLinfill parameter

2.70

Updates associated with quality control audit to ensure application matches documentation.
- extensive update to README documentation
- new design philosophy: any processing function can be applied to any type of data,
although if the function is not suited to that data (such as a numerical transform
applied to a categorical set) it will return all 0's
- updated all processing functions to achieve this functionality
- updated NArw function and evalcategory function to perform evaluations on copy of
column instead of source column to avoid potential for data corruption
- new NArw root categories based on distinct NArowtypes: NArw (justNaN), NAr2
(numeric) and NAr3 (positivenumeric)
- updated log transforms and bxcx transform functions for positivenumeric NArowtype
- exc2 now forces all data to numeric and applies modeinfill
- exc3 now includes the bins transform, for cases such as preparing a numerical set for
class imbalance oversampling while leaving the format of the source label column
intact (previously we had these in exc2; since exc2 is the base transform, I think it is
less confusing to include bins in exc3)
- corrected mnm3 transform, found an inconsistency in infill methods between dualprocess
and postprocess functions
- updated postmunge label processing to be consistent with automunge in that rows
corresponding to missing label values are dropped.
- found and corrected bug associated with infill (was saving and accessing infill values
incorrectly resulting in unintentional overwrite in some instances)
- fixed bug with ors6
- fixed bug with dhmc
- fixed bug with sccs
- a few code comment cleanups in suite of time-series processing functions
- corrected derivation of driftreport metrics for a few of the numerical transforms to
take place prior to transformations
- corrected family tree for nmdx
- addressed outlier scenario for splt, spl2, spl6 when categorical entries include
numeric values; all numbers are now converted to strings for these transforms
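
As a rough illustration of the three NArowtypes named in the 2.70 notes, the sketch below flags rows that would need infill under each type. It is plain Python and not Automunge's own code: the function names `na_row_markers` and `_to_float` are hypothetical, and treating zero as invalid for positivenumeric is an assumption. The evaluation runs on a copy of the column, in the spirit of the NArw change described above.

```python
import math


def _to_float(value):
    """Return value as a float, or None if it is missing, NaN, or not numeric."""
    if value is None:
        return None
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def _is_missing(value):
    """True for None and for float NaN, False for everything else."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def na_row_markers(values, rowtype="justNaN"):
    """Return a 0/1 marker per row, where 1 means the row would need infill.

    Hypothetical helper; the rowtype strings follow the changelog wording.
    """
    column = list(values)  # evaluate on a copy, never on the source column
    markers = []
    for value in column:
        if rowtype == "justNaN":
            flagged = _is_missing(value)
        elif rowtype == "numeric":
            flagged = _to_float(value) is None
        elif rowtype == "positivenumeric":
            number = _to_float(value)
            # Assumption: zero and negative values are invalid (e.g. for log)
            flagged = number is None or number <= 0
        else:
            raise ValueError(f"unknown rowtype: {rowtype}")
        markers.append(1 if flagged else 0)
    return markers


sample = [1.5, "abc", None, -2, float("nan"), "3"]
print(na_row_markers(sample, "justNaN"))
print(na_row_markers(sample, "numeric"))
print(na_row_markers(sample, "positivenumeric"))
```

The same column yields progressively more flagged rows as the rowtype becomes stricter, which is why transforms such as log and bxcx pair naturally with the positivenumeric type.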

2.69

- New processing functions spl6, ors6
(comparable to spl5, ors5, but with an additional splt transform applied downstream of the spl5 to identify a second tier of text overlaps within the text overlaps)
- New driftreport metric for activation ratios of categorical entries boolean encoded in 1010
- Corrected column id suffix appender for spl5 transform
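
One way to read the activation ratio metric from the 2.69 notes is as the fraction of rows in which each binary-encoded column equals 1. The sketch below follows that reading in plain Python. The helper names `binary_encode` and `activation_ratios` are hypothetical, and the encoding shown (sorted categories mapped to fixed-width binary codes) is an assumption rather than the exact 1010 implementation.

```python
def binary_encode(entries):
    """Encode string categories as fixed-width lists of 0/1 bits.

    Assumption: categories are sorted and numbered from zero.
    """
    categories = sorted(set(entries))
    width = max(1, (len(categories) - 1).bit_length())
    codes = {cat: format(i, f"0{width}b") for i, cat in enumerate(categories)}
    rows = [[int(bit) for bit in codes[entry]] for entry in entries]
    return rows, codes


def activation_ratios(rows):
    """Return, for each encoded column, the share of rows where the bit is 1."""
    if not rows:
        return []
    count = len(rows)
    width = len(rows[0])
    return [sum(row[j] for row in rows) / count for j in range(width)]


rows, codes = binary_encode(["a", "b", "c", "a"])
print(codes)
print(activation_ratios(rows))
```

Comparing these ratios between a training set and later data gives a simple signal that the category mix has shifted.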

2.68

- added dependencies to setup file
- new root category spl5, comparable to spl2 in which strings are parsed for overlap and new column created with overlap entries consolidated, but in this version entries without overlap are set to 0
- new root category ors5, uses spl5, in which a copy of column is ordinal encoded with ord3 and a second copy of column has spl5 transforms applied then ordinal encoded with ord3
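
The overlap consolidation described for spl5 can be sketched roughly as follows. This is a simplified, hypothetical version and not the library's algorithm: each entry is replaced by the longest substring (of at least `min_len` characters, an assumed parameter) that also appears in another distinct entry, and entries with no such overlap are set to 0. Entries are assumed to be strings.

```python
def longest_shared_substring(entry, others, min_len):
    """Return the longest substring of entry found in any other entry, or None."""
    for length in range(len(entry), min_len - 1, -1):
        for start in range(len(entry) - length + 1):
            candidate = entry[start:start + length]
            if any(candidate in other for other in others):
                return candidate
    return None


def consolidate_overlaps(entries, min_len=5):
    """Replace each entry with its overlap, or 0 when no overlap is found."""
    unique = sorted(set(entries))
    mapping = {}
    for entry in unique:
        others = [other for other in unique if other != entry]
        overlap = longest_shared_substring(entry, others, min_len)
        mapping[entry] = overlap if overlap is not None else 0
    return [mapping[entry] for entry in entries]


print(consolidate_overlaps(["north street", "south street", "plaza"]))
```

The consolidated column could then be ordinal encoded, in the manner the ors5 entry describes for its second copy of the column.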

2.67

- new automunge parameter evalcat modularizes the automated evaluation of column properties for assignment of root transformation categories, allowing the user to pass custom functions for this purpose (as we're a little stretched thin I think there is much room for innovation on this point and don't want to hold anyone back with my version if they have a better mousetrap)
formal documentation for this option is pending; for now, if you'd like to experiment, copy the evalcategory function from the master file
- new driftreport metrics for numerical sets such as standard deviation for min-max scaled columns, min and max for z-score normalized columns, activation ratios for binary encoded sets, and column specific activation ratios for one-hot-encoded sets
- fixed bug associated with driftreport assembly for one-hot encoded columns
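
The numerical drift metrics listed for 2.67 can be illustrated with a small standalone sketch: the standard deviation of a min-max scaled column, and the min and max of a z-score normalized column. The function names and the returned dictionary keys are hypothetical, and the use of population standard deviation is an assumption. The actual driftreport layout is not shown in these notes.

```python
import statistics


def minmax_scale(values):
    """Scale values to the range [0, 1]; a constant column maps to all 0.0."""
    low, high = min(values), max(values)
    span = high - low
    return [(v - low) / span if span else 0.0 for v in values]


def zscore(values):
    """Center on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std if std else 0.0 for v in values]


def drift_metrics(values):
    """Collect summary statistics of the transformed column (hypothetical keys)."""
    scaled = minmax_scale(values)
    normalized = zscore(values)
    return {
        "minmax_std": statistics.pstdev(scaled),
        "zscore_min": min(normalized),
        "zscore_max": max(normalized),
    }


print(drift_metrics([0.0, 5.0, 10.0]))
```

Computing the same metrics on training data and on later data, then comparing them, is one simple way to notice a change in the distribution of a column.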

2.66

- small correction to 'spl2' processing function for populating data structures
