Automunge

Latest version: v8.33



3.49

- new automunge(.) parameters defaultcategoric, defaultnumeric, defaultdatetime
- to simplify overwriting default transform categories for categoric, numeric, and datetime data under automation
- these default to defaultcategoric = '1010', defaultnumeric = 'nmbr', defaultdatetime = 'dat6'
- for example to change default categorical encoding from binary encoding to one-hot, pass defaultcategoric = 'text'
- or to change default numeric normalization from z-score to mean scaling, pass defaultnumeric = 'mean'
- note that family trees for default transformations can alternatively be overwritten with a passed transformdict
- updated READ ME description of evalcat parameter for new evalcategory function arguments in support of defaultcategoric, defaultnumeric, defaultdatetime
- 'mean' normalization transform now accepts comparable parameters to 'nmbr', including cap/floor/multiplier/offset
- corrected description of mean scaling derivation for 'mean' transform in READ ME
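
The difference between the default z-score normalization ('nmbr') and mean scaling ('mean') can be illustrated with a small plain-Python sketch. This is not Automunge's implementation: it assumes the conventional definitions, z-score as (x - mean) / std and mean scaling as (x - mean) / (max - min), and the function names `zscore` and `mean_scale` are hypothetical.

```python
from statistics import mean, pstdev

def zscore(values):
    """z-score normalization: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population std (assumption; a library may use sample std)
    return [(x - mu) / sigma for x in values]

def mean_scale(values):
    """Mean scaling: (x - mean) / (max - min)."""
    mu = mean(values)
    span = max(values) - min(values)
    return [(x - mu) / span for x in values]

data = [2.0, 4.0, 6.0, 8.0]
z = zscore(data)       # centered at 0 with unit standard deviation
m = mean_scale(data)   # centered at 0 with a total spread of exactly 1
```

Both outputs are centered on zero; they differ only in the divisor, so mean scaling bounds the spread of the output while z-score fixes its standard deviation.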

3.48

- corrected the default setting for retn transform multiplier parameter from False to 1
- updated methods for retn transform, now if all values in set are <0, scales data between -1 and 0
- (all positive sets are given minmax scaling between 0 and 1, all negative sets are given maxmin scaling between -1 and 0, and mixed sets are given retn scaling between -1 and 1, at min/max decimal points based on set's min/max)
- corrected MLinfilltype process_dict entries for wkdy, bshr, hldy from singlct to binary
- removed the postmunge(.) convention to drop rows corresponding to label column nan to match recent update to automunge(.)
- found and replaced two straggler methods that had relied on evaluating header string composition
- moved postmunge(.) initialization of empty label set a little earlier to fix just found bug for shuffletrain parameter for cases without passed labels
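
The three retn scaling cases noted in this release can be sketched as follows. This is an illustration rather than the library's code: the function name `retn_scale` is hypothetical, the divisor is assumed to be max - min, and the handling of sets that touch zero (grouped here with the all-positive or all-negative case) is an assumption.

```python
def retn_scale(values):
    """Scale values while retaining their sign, following the three cases."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("constant data has no range to scale by")
    if lo >= 0:
        # all non-negative: min-max scaling into [0, 1]
        return [(x - lo) / span for x in values]
    if hi <= 0:
        # all non-positive: max-min scaling into [-1, 0]
        return [(x - hi) / span for x in values]
    # mixed signs: divide by the range so zero stays at zero
    return [x / span for x in values]

positive = retn_scale([1.0, 2.0, 3.0])
negative = retn_scale([-3.0, -2.0, -1.0])
mixed = retn_scale([-1.0, 3.0])
```

In the mixed case the output endpoints land strictly inside -1 and 1, at positions set by the data's own min and max.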

3.47

- corrected parameter validations for featuremethod parameter for legal entries in range 0-100
- updated feature importance dimensionality reduction methods to only retain NArw columns that correspond to those columns that remain in the set
- label smoothing is now available for one-hot encoded label sets in which some rows may be missing an activation
- new options for automunge(.) parameter TrainLabelFreqLevel 'test' and 'traintest' for oversampling preparation in test set in addition to train set (for scenario where the set designated as 'df_test' may also include labels and be intended for a training operation)

3.46

- new feature importance report variation and printouts associated with columns sorted by metrics 'metric' and 'metric2'
- available for automunge(.) in postprocess_dict['FS_sorted'] and for postmunge(.) in postreports_dict['FS_sorted']
- added an 'origcolumn' entry to feature importance report
- added support to hldy transform for passed data with timestamp included

3.45

- Rolled out a new section of READ ME containing concise sorted list of all suffix appenders so user passing custom transform function can confirm no overlap.
- A little housekeeping, found that suffix appenders for bnep/bne7/bne9 had an extra underscore so a few small tweaks to match conventions of other transforms.
- New 'holiday_list' parameter accepted for hldy transform as list of dates in string encodings (eg ['2020/03/30'])
- (hldy is boolean indicator if a date is a US Federal Holiday)
- 'holiday_list' parameter allows user to add additional dates to list of recognized holidays.
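
The effect of 'holiday_list' can be pictured with a short sketch of a boolean holiday indicator that accepts extra dates. This is a hypothetical stand-in, not the hldy transform itself: `BASE_HOLIDAYS` holds only two 2020 dates for illustration, the function name `holiday_flags` is made up, and the '%Y/%m/%d' string format is assumed from the example above.

```python
from datetime import date, datetime

# Illustrative stand-in for a federal holiday calendar (two 2020 dates only).
BASE_HOLIDAYS = {date(2020, 1, 1), date(2020, 12, 25)}

def holiday_flags(dates, holiday_list=()):
    """Return 1 for dates that fall on a recognized holiday, else 0."""
    # Parse user-supplied strings such as '2020/03/30' into date objects.
    extra = {datetime.strptime(s, "%Y/%m/%d").date() for s in holiday_list}
    recognized = BASE_HOLIDAYS | extra
    return [int(d in recognized) for d in dates]

sample = [date(2020, 1, 1), date(2020, 3, 30), date(2020, 3, 31)]
default_flags = holiday_flags(sample)
extended_flags = holiday_flags(sample, holiday_list=["2020/03/30"])
```

Passing the extra date only adds to the recognized set, so dates already flagged by default remain flagged.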

3.44

- New parameters accepted for 'nmbr' (z-score normalization function):
- 'cap' and 'floor' for setting cap and floor to pre-transform values, note this is applied prior to default mean infill derivation, note can also be passed as True for setting based on min/max found in train set, or can be passed as False for off
- 'multiplier' and 'offset' for applying multiplier and offset to posttransform values, note multiplier is applied prior to offset, these default to 1 and 0
- New parameters accepted for 'retn' (retained sign normalization function):
- same as those shown above (cap/floor/multiplier/offset)
- an additional retn parameter of 'divisor' to select between 'minmax' for divided by max-min or 'std' to divide by standard deviation
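
The order of operations described for the 'nmbr' parameters (cap and floor on the raw values, then normalization, then multiplier, then offset) can be sketched as below. This is a plain-Python illustration under stated assumptions, not the library's code: the function name `nmbr_sketch` is hypothetical, the population standard deviation is assumed, and the True option (deriving cap/floor from the train set's min/max) is left out.

```python
from statistics import mean, pstdev

def nmbr_sketch(values, cap=False, floor=False, multiplier=1, offset=0):
    """z-score normalization with optional clipping and post-scaling."""
    if cap is True or floor is True:
        raise NotImplementedError("train-set-derived cap/floor not sketched here")
    # cap/floor act on the raw values, before mean and std are derived
    clipped = list(values)
    if cap is not False:
        clipped = [min(x, cap) for x in clipped]
    if floor is not False:
        clipped = [max(x, floor) for x in clipped]
    mu = mean(clipped)
    sigma = pstdev(clipped)
    # multiplier is applied before offset
    return [(x - mu) / sigma * multiplier + offset for x in clipped]

# The outlier 100.0 is capped to 20.0 before the statistics are computed.
scaled = nmbr_sketch([0.0, 10.0, 100.0], cap=20.0, multiplier=2, offset=1)
```

Because the cap is applied first, the outlier no longer distorts the mean and standard deviation, and the middle value maps exactly to the offset.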
