Updates associated with a quality control audit to ensure the application matches the
documentation.
- extensive update to README documentation
- new design philosophy: any processing function can be applied to any type of data,
although if the function is not suited to that data (such as applying a numerical
transform to a categorical set) it will return all 0's
- updated all processing functions to achieve this functionality
- updated NArw function and evalcategory function to perform evaluations on a copy of
the column instead of the source column, to avoid potential data corruption
- new NArw root categories based on the distinct NArowtypes: NArw (justNaN), NAr2
(numeric) and NAr3 (positivenumeric)
- updated log transforms and bxcx transform functions for positivenumeric NArowtype
- exc2 now forces all data to numeric and applies modeinfill
- exc3 now includes the bins transform, for example if a user wants to prepare for class
imbalance oversampling of a numerical set while leaving the format of the source label
column intact (previously we had these in exc2; I think since exc2 is the base transform
it is less confusing to include bins in exc3)
- corrected mnm3 transform, found an inconsistency in infill methods between dualprocess
and postprocess functions
- updated postmunge label processing to be consistent with automunge in that rows
corresponding to missing label values are dropped.
- found and corrected bug associated with infill (infill values were being saved and
accessed incorrectly, resulting in unintentional overwrite in some instances)
- fixed bug with ors6
- fixed bug with dhmc
- fixed bug with sccs
- a few code comment cleanups in suite of time-series processing functions
- corrected derivation of driftreport metrics for a few of the numerical transforms to
take place prior to transformations
- corrected family tree for nmdx
- addressed outlier scenario for splt, spl2, spl6 when categorical entries include
numeric values; all numbers are now converted to strings for this transform
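
As a rough illustration of the behavior described above for exc2 (force to numeric, apply mode infill), combined with the copy-instead-of-source practice and the all 0's fallback for unsuited data, here is a minimal sketch in plain Python. The function names `to_numeric_or_nan` and `force_numeric_with_mode_infill` are hypothetical and are not part of the library; the actual implementation may differ.

```python
import math
from collections import Counter


def to_numeric_or_nan(value):
    """Return value as a float, or NaN if it cannot be parsed as a number."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan


def force_numeric_with_mode_infill(column):
    """Hypothetical helper, not the library's own code.

    Coerces every entry to numeric and fills the gaps with the most
    common valid value. A new list is built, so the source column is
    never modified. If no entry is numeric, all 0's are returned.
    """
    coerced = [to_numeric_or_nan(v) for v in column]  # new list, source untouched
    valid = [v for v in coerced if not math.isnan(v)]
    if not valid:
        # data not suited to a numeric transform: return all 0's
        return [0.0] * len(coerced)
    mode = Counter(valid).most_common(1)[0][0]
    return [mode if math.isnan(v) else v for v in coerced]


source = [1, "2", "x", 2, None]
result = force_numeric_with_mode_infill(source)
```

In this sketch the entries `"x"` and `None` cannot be parsed, so they receive the mode of the remaining values, while a purely categorical column such as `["a", "b"]` comes back as all 0's.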