Automunge

Latest version: v8.33

5.8

- corrected printouts in assignnan
- one more revision to categoric transforms associated with recent updates to allow distinct encodings between numbers and string equivalents
- added str_convert parameter to ordl, ord3, 1010, onht, such as to revert to consistent encoding between strings and numbers
- str_convert defaults to False, e.g. 2 != '2'; when passed as True, e.g. 2 == '2'
- thus allowing the user to have a consistent convention between these transforms and the text transform if desired
- again, where text is the one-hot encoding that requires a string convert operation due to the convention for returned column headers
- this is really diving into the weeds. Details matter.
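
The str_convert behavior described above can be illustrated with a minimal sketch in plain pandas. The helper name ordinal_encode and its first-appearance ordering are assumptions made for illustration only; this is not Automunge's implementation, and ordl and ord3 sort their encodings differently.

```python
import pandas as pd

def ordinal_encode(series, str_convert=False):
    """Hypothetical helper: integer-encode entries in order of first appearance."""
    # With str_convert=True every entry is cast to a string first, so 2 and '2' merge
    values = series.astype(str) if str_convert else series
    mapping = {}
    codes = []
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
        codes.append(mapping[v])
    return codes

s = pd.Series([2, '2', 'a', 2], dtype=object)

distinct = ordinal_encode(s)                   # 2 and '2' get separate codes
merged = ordinal_encode(s, str_convert=True)   # 2 and '2' share a code
```

Here distinct evaluates to [0, 1, 2, 0] and merged to [0, 0, 1, 0].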

5.7

- cleaned up the implementation to address edge case identified in 5.6
- now dtype conversion is conditional which may improve efficiency
- applicable to categoric transforms with sequences of replace operations

5.6

- revision to one hot encoding function onht and binary encoding function 1010
- to facilitate distinguished encodings between numerical and equivalent string entries
- for example, previously entries of 2 and '2' would have been consistently encoded
- now these each will return a distinct encoding
- note that due to column labeling convention, 'text' version of one hot encoding retains treatment of numbers as strings
- so if you want to distinguish numbers from strings in one hot encoding use onht instead of text
- also found a remote edge case for ordl associated with dtype shift between object and int after a replace operation, cleaned it up
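
To see why a column labeling convention can force numbers to be treated as strings, consider the sketch below. It assumes, for illustration only, that one convention builds headers from the entry value and another from an integer index; the exact header formats used by text and onht are not taken from this changelog.

```python
import pandas as pd

s = pd.Series([2, '2', 'a'], dtype=object, name='col')
entries = list(s.unique())  # 2 and '2' are distinct entries here

# Assumed convention A: header = column name + '_' + entry value.
# The number 2 and the string '2' collapse to the same header.
value_headers = [f"{s.name}_{v}" for v in entries]

# Assumed convention B: header = column name + '_' + integer index.
# Every distinct entry keeps its own header.
index_headers = [f"{s.name}_{i}" for i in range(len(entries))]
```

Under convention A the headers are ['col_2', 'col_2', 'col_a'], so the two entries cannot be told apart by header; under convention B they remain distinct.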

5.5

- revision to ordinal encoding functions ordl and ord3
- to facilitate distinguished encodings between numerical and equivalent string entries
- for example, previously entries of 2 and '2' would have been consistently encoded
- now these each will return a distinct encoding
- consistent support to one-hot and binary encodings pending

5.4

- a revision to support functions associated with edge cases for infill such as inf values and those infill points assigned in assignnan
- to use .loc instead of np.where in order to retain pandas column properties
- such as to ensure ordered categoric method rolled out in 5.3 works properly
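
The difference between the two approaches can be sketched in plain pandas and NumPy. The column name and values below are made up for illustration and the snippet is not taken from Automunge's code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'size': pd.Categorical(['small', 'large', 'small'],
                           categories=['small', 'large'],
                           ordered=True)
})
mask = df['size'] == 'large'  # rows to mark for infill

# np.where builds a plain NumPy array, so the ordered categorical dtype is lost
via_where = pd.Series(np.where(mask, np.nan, df['size']))

# .loc assigns in place, so the column keeps its ordered categorical dtype
df.loc[mask, 'size'] = np.nan
```

After the .loc assignment, df['size'] is still an ordered categorical with one missing entry, while via_where is no longer categorical.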

5.3

- one more cleanup to columntype_report populating support function
- (fixed scenario when Binary transform applied)
- columntype_report now in good shape
- a few code comment cleanups
- added 'ordered_overide' parameter to ordinal encodings ordl and ord3
- which are the integer encodings sorted alphabetically and by frequency, respectively
- ordered_overide is a boolean that defaults to True
- when activated, target columns are inspected for whether they are Pandas categorical with ordered = True
- in which case pandas may have already received input from the user of an ordered sequence of categoric entries
- in which case the ordinal integer encoding order defers to the received designation
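
A minimal sketch of the ordered_overide idea, assuming a hypothetical helper named ordinal_codes with an alphabetic fallback. It illustrates the deferral to a pandas ordered categorical and is not Automunge's implementation.

```python
import pandas as pd

def ordinal_codes(series, ordered_overide=True):
    """Hypothetical helper: integer-encode string entries of a column."""
    dtype = series.dtype
    if ordered_overide and isinstance(dtype, pd.CategoricalDtype) and dtype.ordered:
        # Defer to the order the user already registered with pandas
        order = list(series.cat.categories)
    else:
        # Fallback assumed here: alphabetic order of the entries
        order = sorted(series.astype(str).unique())
    mapping = {entry: i for i, entry in enumerate(order)}
    return [mapping[v] for v in series.astype(str)]

sizes = pd.Series(pd.Categorical(['medium', 'small', 'large'],
                                 categories=['small', 'medium', 'large'],
                                 ordered=True))

deferred = ordinal_codes(sizes)                            # follows small < medium < large
alphabetic = ordinal_codes(sizes, ordered_overide=False)   # follows large, medium, small
```

Here deferred evaluates to [1, 0, 2] and alphabetic to [1, 2, 0].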
