Automunge

Latest version: v8.33

7.94

- fixed a variable name typo originating from 7.87 that was interfering with leakage_tolerance inspection for MLinfill

7.93

- bug fix for variable initialization in the labelsencoding_dict support function
- this round of testing identified a bug channel that was not identifiable in our prior validation setup
- associated with Python dictionary memory sharing
- this rollout includes relocating our validation tests to run on a cloud setup, with support of TestPyPI prior to formal PyPI upload
- which we expect will resolve this channel going forward

7.92

- quick bug fix to __stochastic_compute_categoric
- was missing a column header variable initialization, which I believe was unintentionally struck in 7.88

7.91

- found a bug that I believe was introduced in 7.87
- turns out to be a hard-to-spot validation channel from a misnamed dictionary variable
- which didn't show up in local testing due to memory sharing of Python dictionaries
- but was spotted in a cloud session when accessing from PyPI
- will probably start running redundant validations in a cloud session adjacent to rollouts in some fashion
- this case originated from the assignparam variable in the support function __check_for_protected_features
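
As a general illustration of the Python behavior referenced above (not a reproduction of the actual Automunge bug), the sketch below shows how two names can refer to the same dictionary, so that a mutation through one is visible through the other. The names default_params, apply_override_shared, and apply_override_isolated are hypothetical and chosen only for this example.

```python
import copy

# Hypothetical parameter dictionary, standing in for something like assignparam.
default_params = {'suffix': 'nmbr', 'options': {'cap': False}}


def apply_override_isolated(params, override):
    """Work on a deep copy so the caller's dictionary is left untouched."""
    params = copy.deepcopy(params)
    params['options'].update(override)
    return params


def apply_override_shared(params, override):
    """Mutate the nested dictionary in place; the caller sees the change."""
    params['options'].update(override)
    return params


isolated = apply_override_isolated(default_params, {'cap': True})
cap_after_isolated = default_params['options']['cap']  # caller's value unchanged

shared = apply_override_shared(default_params, {'cap': True})
cap_after_shared = default_params['options']['cap']  # caller's value changed

# The "shared" result is the very same object as the input dictionary.
same_object = shared is default_params
```

In a long-lived local session, state left behind by an in-place mutation like this can mask a mistake that only surfaces in a fresh environment, which is one plausible reason a clean cloud session would behave differently.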

7.90

- new automunge parameter cat_type, accepts boolean defaulting to False
- when activated, returned categoric integer encodings are converted to the pandas categorical dtype on a column by column basis based on MLinfilltype
- in some cases this may actually slightly increase dataframe memory usage and is redundant with information stored in the postprocess_dict, however we expect there are potential downstream workflows where a user may prefer a categoric data type, which is the reason for the option.
- note that for cases where a categoric transform feature did not have full representation in the training data set (e.g. as could be the case for fixed width bins with bnwd/bnwo/variants), it is possible that this option will result in test data returned with missing values designated as NaN entries (which is partly why this is not the default).
- note that this same basis is carried through to postmunge.
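
The NaN behavior described above can be sketched with plain pandas. This is a minimal illustration that assumes the categories are fixed from the training data and then reused for test data; it does not call automunge or postmunge, and the column name and values are made up for the example.

```python
import pandas as pd

# Hypothetical integer encodings, e.g. bin ids from a fixed width bin transform.
train = pd.Series([0, 1, 2, 1, 0], name='feature_bins')
test = pd.Series([0, 3, 2], name='feature_bins')  # bin 3 never appeared in train

# Fix the set of categories from the training data only.
train_dtype = pd.CategoricalDtype(categories=sorted(train.unique()))

train_cat = train.astype(train_dtype)
test_cat = test.astype(train_dtype)  # values outside the categories become NaN

# Memory usage can be compared directly; which one is smaller depends on the data.
int_bytes = train.memory_usage(deep=True)
cat_bytes = train_cat.memory_usage(deep=True)

unseen_mask = test_cat.isna()
```

Here the test entry with encoding 3 has no matching category and is returned as a missing value, which is the scenario the note warns about.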

7.89

- new entropy seeding option available for sampling_dict specification as 'seeding_type'
- refers to the distinction of whether user passed entropy seeds are to be integrated as supplemental seeds in conjunction with those seeds sourced from the operating system versus using the passed entropy seeds as the only seeds
- sampling_dict['seeding_type'] = 'supplemental_seeds' means that entropy seeds are integrated into np.random.SeedSequence in conjunction with entropy seeding from the OS.
- sampling_dict['seeding_type'] = 'primary_seeds' means the passed entropy seeds are the only source of seeding.
- note that unless otherwise specified, 'primary_seeds' is used as the default seeding_type in conjunction with the sampling_dict['sampling_type'] = 'bulk_seeds' scenario, and 'supplemental_seeds' is used as the default seeding_type in conjunction with all other sampling_dict['sampling_type'] scenarios
- also new printstatus scenario available for automunge and postmunge as 'summary'
- 'summary' only prints the opening and closing messages for succinctness purposes
- struck a redundant support function in mlti
- also, a validation result is now logged as missing_process_dict_noise_transform when the user doesn't specify processdict['noise_transform']
- in other words, for custom noise transforms passed through processdict, it is important to specify processdict['noise_transform'] to ensure entropy seeding works consistently with the documentation
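
The distinction between the two seeding_type options above can be sketched with np.random.SeedSequence directly. This is a minimal sketch under stated assumptions, not Automunge's internal implementation: the helpers resolve_seeding_type and build_seed_sequence are hypothetical, and only the sampling_dict keys and values quoted in the notes above are taken from the source.

```python
import numpy as np


def resolve_seeding_type(sampling_dict):
    """Hypothetical helper mirroring the documented defaults."""
    if 'seeding_type' in sampling_dict:
        return sampling_dict['seeding_type']
    if sampling_dict.get('sampling_type') == 'bulk_seeds':
        return 'primary_seeds'
    return 'supplemental_seeds'


def build_seed_sequence(user_seeds, sampling_dict):
    """Hypothetical helper: build a SeedSequence per the seeding_type."""
    if resolve_seeding_type(sampling_dict) == 'primary_seeds':
        # Passed seeds are the only source of entropy, so results repeat.
        return np.random.SeedSequence(list(user_seeds))
    # Supplemental: mix fresh OS entropy with the passed seeds.
    os_entropy = np.random.SeedSequence().entropy
    return np.random.SeedSequence([os_entropy, *user_seeds])


user_seeds = [12345, 67890]
bulk = {'sampling_type': 'bulk_seeds'}  # defaults to 'primary_seeds'

rng_a = np.random.default_rng(build_seed_sequence(user_seeds, bulk))
rng_b = np.random.default_rng(build_seed_sequence(user_seeds, bulk))
draw_a = rng_a.integers(0, 1_000_000, size=5)
draw_b = rng_b.integers(0, 1_000_000, size=5)  # same values as draw_a

# With supplemental seeding, OS entropy is mixed in, so draws vary per run.
supplemental = {'seeding_type': 'supplemental_seeds'}
rng_c = np.random.default_rng(build_seed_sequence(user_seeds, supplemental))
draw_c = rng_c.integers(0, 1_000_000, size=5)
```

With 'primary_seeds' the same passed seeds reproduce the same draws, while 'supplemental_seeds' trades that repeatability for entropy that also depends on the operating system.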
