Automunge

Latest version: v8.33

5.50

- a quality control audit performed on returned data types from ordinal transforms
- turns out we had a few inconsistent approaches
- where base ordinal transforms ordl and ord3 set the returned data type as a function of size of encoding space
- (ranging from uint8 / uint16 / uint32)
- and a few of the other ordinal transforms did not include these conditional types
- so went ahead and updated returned data types for transformation functions pwor / bnwo / bneo / bkt3 / bkt4 to be conditional based on size of encoding space
- also updated wkds transform to set data type as int8
- also small cleanup to correct a few MLinfill types associated with spl2 transform (from singlct to exclude)
- note this isn't expected to change operation for any current family tree configurations, just trying to keep everything consistent
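The conditional typing described above can be sketched as follows. This is a minimal illustration, not Automunge's code: the helper name `ordinal_dtype_name` is hypothetical, and the thresholds are assumed to be the maximum values each unsigned type can hold (the library's actual cutoffs may differ).

```python
# Hypothetical helper, not part of the Automunge API.
# Assumed thresholds: the largest value each unsigned integer type can store.
UINT_LIMITS = [
    ("uint8", 2**8 - 1),
    ("uint16", 2**16 - 1),
    ("uint32", 2**32 - 1),
]


def ordinal_dtype_name(max_encoding):
    """Return the name of the smallest unsigned type that can hold max_encoding."""
    if max_encoding < 0:
        raise ValueError("ordinal encodings are assumed to be non-negative")
    for name, limit in UINT_LIMITS:
        if max_encoding <= limit:
            return name
    raise ValueError("encoding space too large for uint32")
```

The returned name is a standard dtype string, so it could be passed to a call such as `astype` on a numpy array or pandas column.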

5.49

- a small improvement
- added an entry to the postreports_dict reports returned from postmunge
- now includes details of row counts that served as basis for drift stats
- including row counts from automunge train set and postmunge test set
- may be a little helpful for quickly running a sanity check on validity of drift stats

5.48

- a review of the noise injection transform DPod identified an opportunity for cleaner code
- by replacing a for loop through activations with a single operation performed in parallel
- much cleaner this way, should be more efficient
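As a rough illustration of this kind of refactor, the sketch below compares a loop over activations with a single vectorized selection. It assumes numpy and a simplified noise model in which each entry is replaced, with probability `flip_prob`, by a uniformly drawn activation. The function names and the noise model are assumptions for illustration and are not taken from the DPod source.

```python
import numpy as np


def inject_noise_loop(values, activations, flip_prob, rng):
    """Loop form: one masked assignment per activation."""
    values = np.asarray(values)
    activations = np.asarray(activations)
    flip = rng.random(values.shape[0]) < flip_prob
    draws = rng.integers(0, len(activations), size=values.shape[0])
    out = values.copy()
    for i, act in enumerate(activations):
        out[flip & (draws == i)] = act
    return out


def inject_noise_vectorized(values, activations, flip_prob, rng):
    """Vectorized form: index into activations once, then select with np.where."""
    values = np.asarray(values)
    activations = np.asarray(activations)
    flip = rng.random(values.shape[0]) < flip_prob
    draws = rng.integers(0, len(activations), size=values.shape[0])
    return np.where(flip, activations[draws], values)
```

With generators seeded identically, both forms consume the same random draws and return the same array, so the vectorized version is a drop-in replacement in this sketch.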

5.47

- a review of the noise injection transforms identified a point inconsistent with code comments / readme description
- specifically in DPmm and DPrt the scaled noise is subject to a cap on outliers at +/- midpoint of range
- to ensure returned range consistent with scaled input
- realized the implementation had, instead of capping the noise, subjected outliers to infill
- so reverted to capped outliers to be consistent with documentation
- note that this is not expected to have any material impact on experiment results from Numbers Game paper
- since the noise profile of those experiments had standard deviations well below the cap threshold
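The difference between the two behaviors can be sketched as below. This is an illustration only: the function names are hypothetical, the cap of 0.5 assumes a scaled range of 0 to 1 (so the midpoint is 0.5), and the infill value of 0.0 is an assumption since the notes do not say what infill was applied.

```python
def cap_noise(noise, cap=0.5):
    """Clip each noise sample to the interval [-cap, +cap]."""
    return [max(-cap, min(cap, n)) for n in noise]


def infill_noise(noise, cap=0.5, fill=0.0):
    """Replace any sample outside [-cap, +cap] with a fill value (assumed 0.0)."""
    return [n if -cap <= n <= cap else fill for n in noise]


samples = [0.2, 0.9, -1.3]
capped = cap_noise(samples)      # outliers are pulled back to the cap
infilled = infill_noise(samples) # outliers are discarded and replaced
```

For samples well inside the cap the two functions return identical values, which is consistent with the note that low standard deviation noise profiles are not materially affected.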

5.46

- found and fixed a small issue where postmunge inversion was overwriting one of the entries in postprocess_dict
- associated with entry for postprocess_dict['finalcolumns_train']
- which was creating an edge case associated with numpy inversion in cases where Binary transform applied
- also two other mostly immaterial revisions to prevent postprocess_dict overwrites during postmunge
- associated with entries for traindata and infilliterate
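One general way to avoid this class of bug is to derive working values into local variables rather than assigning back into the shared dictionary. The sketch below is a hypothetical stand-in, not the postmunge implementation: `run_inversion` and its arguments are invented for illustration, and only the key name `finalcolumns_train` comes from the notes above.

```python
def run_inversion(postprocess_dict, inversion_columns):
    """Select columns for inversion without mutating postprocess_dict."""
    # Copy the stored list into a local variable instead of reassigning
    # postprocess_dict['finalcolumns_train'] during processing.
    finalcolumns = list(postprocess_dict["finalcolumns_train"])
    selected = [c for c in finalcolumns if c in inversion_columns]
    return selected
```

Because the dictionary entry is only read, a later call that relies on the original `finalcolumns_train` sees the same contents it would have seen before the inversion.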

5.45

- found a small oversight with auto ML options for ML infill
- realized the autogluon and catboost implementations were missing the random seeding
- from randomseed parameter passed to automunge
- so went ahead and incorporated random seeding into fit / model initialization
- in general we try to use a consistent random seed for all methods based on this parameter
