Automunge

Latest version: v8.33

3.67

- new processing function srch
- accepts parameter 'search' as a list of search strings
- parses categorical set entry strings to find character subset overlaps with search strings
- when overlaps are identified, returns new columns with boolean activations for the identified overlaps
- new processing function src2
- comparable to srch, but assumes the unique values in the test set are the same as, or a subset of, those in the train set, for more efficient operation
- fixed bug with postmunge(.) infill application originating from 3.61 (a real head scratcher)
- fixed edge case bug for postmunge ML infill
- corrected process_dict entry for splt transform
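
As a rough illustration of the srch and src2 behavior described above, here is a minimal standalone sketch in plain Python. It does not call Automunge; the function names and the returned column naming (`<column>_srch_<term>`) are assumptions made for illustration only.

```python
def search_activations(entries, search, column="col"):
    """For each search string, build a 0/1 column marking the entries
    that contain it as a substring (hypothetical column naming)."""
    return {
        f"{column}_srch_{term}": [1 if term in str(entry) else 0 for entry in entries]
        for term in search
    }


def build_lookup(train_entries, search):
    """src2-style shortcut (assumed reading): evaluate each unique train
    value once and reuse the result for every matching row."""
    return {
        value: [1 if term in value else 0 for term in search]
        for value in set(train_entries)
    }


def apply_lookup(test_entries, lookup, search):
    """Apply the train-derived lookup; unseen values get no activations."""
    default = [0] * len(search)
    return [lookup.get(entry, default) for entry in test_entries]


train = ["North Florida", "The Keys", "South Florida"]
columns = search_activations(train, ["Florida", "Keys"])
lookup = build_lookup(train, ["Florida", "Keys"])
rows = apply_lookup(["The Keys", "North Florida"], lookup, ["Florida", "Keys"])
```

The lookup variant only pays the substring search once per unique value, which is the kind of saving the src2 note alludes to when test values are known to be a subset of train values.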

3.66

- extended the methods rolled out in 3.65, with additional parameter accepted for splt family transforms splt/spl2/spl5/spl7/spl8/spl9/sp10
- (splt transforms are the string parsing functions which identify character overlaps between unique entries in a categorical set)
- 'excluded_characters' parameter can be passed as a list of strings, defaults to `[' ', ',', '.', '?', '!', '(', ')']`
- these are the strings that are excluded from the evaluation of character overlaps when the 'space_and_punctuation' parameter is passed as False
- thus a user can designate custom sets of characters which they wish to exclude from overlap identifications
- note that entries in this list may also include multi-character strings

3.65

- new parameter accepted for splt family transforms splt/spl2/spl5/spl7/spl8/spl9/sp10
- (splt transforms are the string parsing functions which identify character overlaps between unique entries in a categorical set)
- 'space_and_punctuation' parameter can be passed as True/False, defaults to True
- when passed as False, character overlaps are only recorded if they contain no space or punctuation characters
- based on the space and punctuation characters [' ', ',', '.', '?', '!', '(', ')']
- as an example, when using the spl9 function to evaluate a set with unique entries {'North Florida', 'Central Florida', 'South Florida', 'The Keys'}, if this parameter is left at its default of True, the returned set would have unique entries {'th Florida', 'Central Florida', 'The Keys'}
- if this parameter is passed as False, the returned set would instead have unique entries {'Florida', 'The Keys'}
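
The effect of excluding space and punctuation characters can be sketched in standalone Python. This is a simplified illustration, not Automunge's implementation: the function names, the "longest shared substring" rule, and the minimum overlap length of 5 are all assumptions. It is only meant to illustrate the False case from the example above; the library's handling of the default True case is not modelled here.

```python
EXCLUDED = [' ', ',', '.', '?', '!', '(', ')']


def longest_shared_substring(entry, others, excluded, min_len=5):
    """Longest substring of `entry` that also occurs in another entry and
    contains none of the `excluded` strings; None if there is no such overlap."""
    for length in range(len(entry), min_len - 1, -1):
        for start in range(len(entry) - length + 1):
            candidate = entry[start:start + length]
            # Multi-character excluded strings work too, since this is a substring test.
            if any(ex in candidate for ex in excluded):
                continue
            if any(candidate in other for other in others):
                return candidate
    return None


def collapse_overlaps(unique_entries, excluded=EXCLUDED, min_len=5):
    """Replace each entry with its qualifying overlap, or keep it unchanged."""
    result = set()
    for entry in unique_entries:
        others = [e for e in unique_entries if e != entry]
        overlap = longest_shared_substring(entry, others, excluded, min_len)
        result.add(overlap if overlap is not None else entry)
    return result


entries = ['North Florida', 'Central Florida', 'South Florida', 'The Keys']
collapsed = collapse_overlaps(entries)
print(sorted(collapsed))  # ['Florida', 'The Keys']
```

Passing a different `excluded` list to `collapse_overlaps` mimics designating a custom set of characters or strings to keep out of the overlaps.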

3.64

- new processing function "tail bins" as tlbn
- intended for use to evaluate feature importance of different segments of a numerical set's distribution
- returns equal population bins in separate columns, with activations replaced by min-max scaled values within that segment's range (between 0 and 1) and other values subject to an infill of -1
- (where the -1 infill is intended as a register to signal to ML training that infill was applied)
- note that the bottom bin has order reversed to maintain consistent -1 register and support subsequent values out of range
- accepts parameter 'bincount' as an integer to specify number of bins returned
- when run through feature importance, the metric 'metric2', such as printed and returned in postprocess_dict['FS_sorted'], can give an indication of relative importance of different segments of the distribution
- this may be useful to evaluate influence of tail events for instance
- also corrected parameter initializations for bnep/bne7/bne9/bneo/bn7o/bn9o
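
The tail-bin idea can be sketched in standalone Python as follows. This is one possible reading of the notes above, not the library's code: the function name `tail_bins`, the index-based choice of bin edges, and the interpretation of "order reversed" for the bottom bin as `1 - scaled` are all assumptions.

```python
def tail_bins(values, bincount=3):
    """Return `bincount` columns. In each column, values falling in that
    equal-population bin are min-max scaled to [0, 1] within the bin's
    range; all other rows receive the infill -1."""
    ordered = sorted(values)
    n = len(ordered)
    # Simple index-based cut points giving roughly equal populations (assumed).
    edges = [ordered[0]]
    edges += [ordered[(n * i) // bincount] for i in range(1, bincount)]
    edges.append(ordered[-1])

    columns = []
    for b in range(bincount):
        lo, hi = edges[b], edges[b + 1]
        is_last = b == bincount - 1
        column = []
        for v in values:
            in_bin = (lo <= v < hi) or (is_last and v == hi)
            if not in_bin:
                column.append(-1.0)
                continue
            scaled = (v - lo) / (hi - lo) if hi > lo else 0.0
            # Assumed reading of "order reversed": in the bottom bin the
            # smallest value maps to 1, so values further into the lower
            # tail move away from the -1 infill register.
            column.append(1.0 - scaled if b == 0 else scaled)
        columns.append(column)
    return columns


data = [0, 1, 2, 3, 4, 5, 6, 7, 8]
bins = tail_bins(data, bincount=3)
```

With this layout, each row is active in exactly one column, so a feature importance run can score each segment of the distribution separately.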

3.63

- new validation check to detect presence of infinite loops in transformdict entries
- in other words, solved the halting problem
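
A check of this kind can be approximated with ordinary cycle detection. The sketch below is a simplified model and an assumption on my part: it treats the transformdict as a plain mapping from a category to the categories whose transforms it invokes downstream, which is not the actual family-tree schema, and the category names are hypothetical.

```python
def find_cycle(graph):
    """Depth-first search for a cycle in a {category: [next categories]}
    mapping. Returns the cycle as a list of categories, or None."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(node, path):
        state[node] = IN_PROGRESS
        path.append(node)
        for nxt in graph.get(node, []):
            status = state.get(nxt, UNSEEN)
            if status == IN_PROGRESS:
                # Reached a category that is still being expanded: a loop.
                return path[path.index(nxt):] + [nxt]
            if status == UNSEEN:
                cycle = visit(nxt, path)
                if cycle:
                    return cycle
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state.get(node, UNSEEN) == UNSEEN:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None


looping = {'aaa': ['bbb'], 'bbb': ['ccc'], 'ccc': ['aaa']}
acyclic = {'aaa': ['bbb', 'ccc'], 'bbb': ['ccc'], 'ccc': []}
print(find_cycle(looping))  # ['aaa', 'bbb', 'ccc', 'aaa']
print(find_cycle(acyclic))  # None
```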

3.62

- fixed bug in postmunge(.) originating from 3.61 update
- fixed material typo in README associated with documenting one of the data structures for 'singlct' MLinfilltype
