- found and fixed a bug in the oversampling method
- removed an unused legacy function
- consolidated two of the MLinfilltypes into a single entry, which is cleaner and less confusing
- in summary, MLinfilltypes multirt and multisp are now both aggregated as multirt, and multisp is retired
4.29
- removed a redundant adjacent row infill application in dxdt and dxd2
- corrected the NArowtype processdict entry for the shuffle transform to exclude
4.28
- new data privacy option for string parsing functions via the 'int_headers' parameter
- 'int_headers' is a boolean and defaults to False
- when passed as True, string partitions are not included in returned column headers
- this may be appropriate for applications with sensitive data, such as healthcare
- also improved inversion for string parsing with concurrent activations, now with more information retention
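
To illustrate the idea behind 'int_headers', here is a minimal standalone sketch. It is not the library's implementation: the function `substring_activations`, its arguments, and the header naming scheme are all hypothetical, chosen only to show how integer suffixes keep parsed substrings out of the returned column headers.

```python
# Hypothetical sketch, not the library's code: builds one activation
# column per substring partition and names it either by the partition
# text or by an integer index.
def substring_activations(values, column, partitions, int_headers=False):
    columns = {}
    for i, part in enumerate(partitions):
        # With int_headers=True the partition text never appears in a header.
        suffix = str(i) if int_headers else part
        columns[f"{column}_{suffix}"] = [int(part in v) for v in values]
    return columns


records = ["type 2 diabetes", "hypertension", "type 1 diabetes"]
partitions = ["diabetes", "hypertension"]

plain = substring_activations(records, "diagnosis", partitions)
private = substring_activations(records, "diagnosis", partitions, int_headers=True)

print(list(plain))    # headers contain the parsed substrings
print(list(private))  # headers contain only integer suffixes
```

The activations are identical in both cases; only the headers differ, so the sensitive substrings stay out of the column names.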
4.27
- found and fixed a bug in feature importance dimensionality reduction that in some cases interfered with postmunge
- fixed feature importance dimensionality reduction printouts to retain the order of columns in the returned set
- added a new printout to Binary dimensionality reduction listing the consolidated boolean columns
- tweaked code from 4.26 to run more efficiently
4.26
- tweak to feature importance dimensionality reduction
- binary encoded columns, such as via the '1010' transform, are now retained as a full set in returned data, even when a subset would otherwise have been removed by dimensionality reduction
- (based on inspection of MLinfilltype for the transformation category)
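
A short standalone sketch of why a binary encoded set is kept whole. It is an illustration under assumptions, not the library's code: `binary_encode` is a hypothetical helper that assigns each category a fixed-width bit string, and dropping any one bit column makes distinct categories collide.

```python
# Hypothetical sketch, not the library's code: assigns each distinct
# category a fixed-width binary code, one character per encoded column.
def binary_encode(categories):
    distinct = sorted(set(categories))
    # Number of bit columns needed to tell all distinct categories apart.
    width = max(1, (len(distinct) - 1).bit_length())
    return {c: format(i, f"0{width}b") for i, c in enumerate(distinct)}


codes = binary_encode(["a", "b", "c", "d"])

# Full set of bit columns: every category has a unique code.
full = set(codes.values())

# Drop the first bit column, as a column-wise reduction might do.
truncated = set(code[1:] for code in codes.values())

print(len(full), len(truncated))
```

With the full set, the four categories remain distinguishable; after removing one column, only two distinct codes remain, so the encoding can no longer be inverted.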
4.25
- aggregated postmunge inversion operations into a support function
- inversion now supported for sets with feature importance dimensionality reduction