Automunge

Latest version: v8.33

5.50

- a quality control audit performed on returned data types from ordinal transforms
- turns out we had a few inconsistent approaches
- where base ordinal transforms ordl and ord3 set the returned data type as a function of size of encoding space
- (ranging from uint8 / uint16 / uint32)
- and a few of the other ordinal transforms did not include these conditional types
- so went ahead and updated returned data types for transformation functions pwor / bnwo / bneo / bkt3 / bkt4 to be conditional based on size of encoding space
- also updated wkds transform to set data type as int8
- also small cleanup to correct a few MLinfill types associated with spl2 transform (from singlct to exclude)
- note this isn't expected to change operation for any current family tree configurations, just trying to keep everything consistent
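
As a rough illustration of a data type that is conditional on the size of the encoding space, the sketch below picks the smallest unsigned integer type that can hold every code. The function name `smallest_uint_dtype` and the exact thresholds (taken from `np.iinfo`) are assumptions for illustration; the cutoffs used inside Automunge may differ.

```python
import numpy as np

def smallest_uint_dtype(encoding_space_size):
    """Return the smallest of uint8 / uint16 / uint32 able to hold
    integer codes 0 .. encoding_space_size - 1 (hypothetical helper)."""
    for dtype in (np.uint8, np.uint16, np.uint32):
        # np.iinfo reports the largest value representable by the dtype
        if encoding_space_size - 1 <= np.iinfo(dtype).max:
            return dtype
    raise ValueError("encoding space too large for uint32")

# example: an ordinal encoding with 300 distinct codes needs uint16
codes = np.array([0, 3, 299, 41])
encoded = codes.astype(smallest_uint_dtype(300))
```

Choosing the type from the encoding size keeps small encodings compact while still avoiding overflow for high cardinality columns.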

5.49

- a small improvement
- added an entry to the postreports_dict reports returned from postmunge
- now includes details of row counts that served as basis for drift stats
- including row counts from automunge train set and postmunge test set
- may be a little helpful for quickly running a sanity check on the validity of drift stats

5.48

- a review of the noise injection transform DPod identified an opportunity for cleaner code
- by replacing a for loop through activations with a single operation performed in parallel
- much cleaner this way, should be more efficient
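
The general idea of swapping a per-activation loop for one array operation can be sketched as follows. This is a generic illustration of noise injection for an ordinal encoded column, not the DPod implementation; the function name, the parameters `n_categories` and `flip_prob`, and the uniform replacement rule are all assumptions.

```python
import numpy as np

def inject_ordinal_noise(codes, n_categories, flip_prob, rng):
    """Replace each code with a uniformly drawn code with probability
    flip_prob, using whole-array operations instead of a Python loop."""
    codes = np.asarray(codes)
    # one Bernoulli draw per row decides which entries receive noise
    flip_mask = rng.random(codes.shape) < flip_prob
    # candidate replacement codes drawn for every row in a single call
    random_codes = rng.integers(0, n_categories, size=codes.shape)
    # select replacement or original per row in one vectorized step
    return np.where(flip_mask, random_codes, codes)

rng = np.random.default_rng(42)
column = np.array([0, 1, 2, 3, 2, 1, 0, 3])
noisy_column = inject_ordinal_noise(column, n_categories=4, flip_prob=0.25, rng=rng)
```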

5.47

- a review of the noise injection transforms identified a point inconsistent with code comments / readme description
- specifically in DPmm and DPrt the scaled noise is subject to a cap on outliers at +/- midpoint of range
- to ensure the returned range is consistent with the scaled input
- realized the implementation had, instead of capping the noise, subjected outliers to infill
- so reverted to capped outliers to be consistent with documentation
- note that this is not expected to have any material impact on experiment results from Numbers Game paper
- since the noise profile of those experiments had standard deviations well below the cap threshold
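
A minimal sketch of capping noise outliers at +/- 0.5 for a min-max scaled column, assuming inputs in the range 0 to 1. The cap is the step described above; the headroom-based scaling that follows it is an illustrative assumption showing one way the returned range can stay within 0 to 1, and is not taken from the Automunge source. The function name `add_capped_noise` is hypothetical.

```python
import numpy as np

def add_capped_noise(scaled, sigma, rng):
    """Add Gaussian noise to values in [0, 1], capping noise outliers
    at +/- 0.5 rather than replacing them (hypothetical helper)."""
    scaled = np.asarray(scaled, dtype=float)
    noise = rng.normal(0.0, sigma, size=scaled.shape)
    # cap outliers at +/- midpoint of the 0-1 range
    noise = np.clip(noise, -0.5, 0.5)
    # assumed scaling rule: shrink noise by the room left in its direction
    headroom = np.where(noise < 0, scaled, 1.0 - scaled)
    scale = np.minimum(headroom / 0.5, 1.0)
    return scaled + noise * scale

rng = np.random.default_rng(0)
scaled_column = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
noisy = add_capped_noise(scaled_column, sigma=0.3, rng=rng)
```

With a standard deviation well below 0.5, as in the case noted above, the cap is rarely reached, so capping and infilling outliers give nearly the same result.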

5.46

- found and fixed a small issue where postmunge inversion was overwriting one of the entries in postprocess_dict
- associated with entry for postprocess_dict['finalcolumns_train']
- which was creating an edge case associated with numpy inversion in cases where Binary transform applied
- also two other mostly immaterial revisions to prevent postprocess_dict overwrites during postmunge
- associated with entries for traindata and infilliterate
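
One common way to avoid this kind of overwrite is to work on a copy of the shared dictionary entry rather than assigning back into it. The sketch below assumes a hypothetical helper `columns_for_inversion` and made-up column names; only the `'finalcolumns_train'` key comes from the notes above, and this is not the fix as implemented in Automunge.

```python
import copy

def columns_for_inversion(postprocess_dict, requested_columns):
    """Return the subset of final train columns to invert without
    modifying postprocess_dict (hypothetical helper)."""
    # copy the entry so filtering cannot alter the caller's dictionary
    final_columns = copy.deepcopy(postprocess_dict['finalcolumns_train'])
    return [c for c in final_columns if c in requested_columns]

# hypothetical column names for illustration
postprocess_dict = {'finalcolumns_train': ['col1_nmbr', 'col2_1010_0', 'col2_1010_1']}
subset = columns_for_inversion(postprocess_dict, {'col1_nmbr'})
```

Keeping postprocess_dict read-only during postmunge means repeated calls see the same state as was returned from automunge.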

5.45

- found a small oversight with auto ML options for ML infill
- realized the autogluon and catboost implementations were missing the random seeding
- from randomseed parameter passed to automunge
- so went ahead and incorporated random seeding into fit / model initialization
- in general we try to use a consistent random seed for all methods based on this parameter
