Rdt

Latest version: v1.15.0

Page 4 of 10

1.8.0

This release adds the 'random' missing value replacement strategy, which fills in missing values with random values drawn from the dataset.
Additionally, users can now use the `UniformUnivariate` distribution within the `GaussianNormalizer`.

This release also fixes a crash in the `ClusterBasedNormalizer` reverse transform that was caused by out-of-bounds values,
and patches a randomization issue that produced different values after calling `reset_randomization`.

Anonymization has been moved from SDV into the RDT library, since it was found to be a self-contained module for RDT and moving it reduces the dependencies needed in SDV.
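
To make the 'random' replacement strategy concrete, here is a minimal standard-library sketch of the idea. It is not RDT's implementation: the function name `fill_missing_with_random` is hypothetical, and it assumes replacement values are drawn uniformly between the smallest and largest observed values, which the release note does not spell out.

```python
import random


def fill_missing_with_random(values, seed=None):
    """Replace None entries with random values taken from the observed range.

    Assumption: replacements are drawn uniformly between the minimum and
    maximum of the non-missing values. This is a sketch, not RDT code.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to sample from, so return the data unchanged.
        return list(values)
    low, high = min(observed), max(observed)
    return [v if v is not None else rng.uniform(low, high) for v in values]


column = [1.5, None, 3.0, None, 4.5]
filled = fill_missing_with_random(column, seed=0)
```

Unlike a constant fill such as the mean, this keeps the filled column from piling up on a single value.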

Features

* Make the default missing value imputation 'mean' - Issue [730](https://github.com/sdv-dev/RDT/issues/730) by R-Palazzo
* When no rounding scheme is detected, log the info instead of showing a warning - Issue [709](https://github.com/sdv-dev/RDT/issues/709) by frances-h
* The GaussianNormalizer should accept distribution names that are consistent with scipy - Issue [656](https://github.com/sdv-dev/RDT/issues/656) by fealho
* The GaussianNormalizer should accept uniform distributions - Issue [655](https://github.com/sdv-dev/RDT/issues/655) by fealho
* Remove psutil - Issue [615](https://github.com/sdv-dev/RDT/issues/615) by fealho
* Consider deprecating the FrequencyEncoder - Issue [614](https://github.com/sdv-dev/RDT/issues/614) by fealho
* Replace missing values with variable (random) values from the dataset - Issue [606](https://github.com/sdv-dev/RDT/issues/606)

Bugs

* RDT Uniform Encoder creates nan Value bug - Issue [719](https://github.com/sdv-dev/RDT/issues/719) by lajohn4747
* HyperTransformer transforms while fitting and messes up the random seed - Issue [716](https://github.com/sdv-dev/RDT/issues/716) by pvk-developer
* Resolve locales warning for specific sdtype/locale combos (eg. en_US with postcode) - Issue [701](https://github.com/sdv-dev/RDT/issues/701) by pvk-developer
* The OrderedLabelEncoder should not accept duplicate categories - Issue [673](https://github.com/sdv-dev/RDT/issues/673) by frances-h
* ClusterBasedNormalizer crashes on reverse transform (IndexError) - Issue [672](https://github.com/sdv-dev/RDT/issues/672) by fealho
* Unnecessary warning in OneHotEncoder when there are nan values - Issue [616](https://github.com/sdv-dev/RDT/issues/616) by fealho

Maintenance

* Remove performance tests - Issue [707](https://github.com/sdv-dev/RDT/issues/707) by fealho
* ClusterBasedNormalizer code cleanup - Issue [696](https://github.com/sdv-dev/RDT/issues/696) by fealho
* Switch default branch from master to main - Issue [687](https://github.com/sdv-dev/RDT/issues/687) by amontanez24

Deprecations

* The `FrequencyEncoder` transformer will no longer be supported in future versions of RDT. Please use the `UniformEncoder` transformer instead.
* `GaussianNormalizer` distribution option names have been updated to be consistent with scipy: `gaussian` -> `norm`, `student_t` -> `t`, and `truncated_gaussian` -> `truncnorm`.
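
For code that still passes the old distribution names, the renames listed above can be captured in a small lookup. The helper below is a hypothetical illustration (neither `RENAMED_DISTRIBUTIONS` nor `normalize_distribution_name` is part of RDT); only the three old-to-new pairs come from the deprecation note.

```python
import warnings

# Old -> new names, exactly as listed in the deprecation note.
RENAMED_DISTRIBUTIONS = {
    "gaussian": "norm",
    "student_t": "t",
    "truncated_gaussian": "truncnorm",
}


def normalize_distribution_name(name):
    """Return the scipy-consistent name, warning if a deprecated one was given."""
    if name in RENAMED_DISTRIBUTIONS:
        new_name = RENAMED_DISTRIBUTIONS[name]
        warnings.warn(
            f"Distribution name '{name}' is deprecated; use '{new_name}' instead.",
            FutureWarning,
        )
        return new_name
    # Names that were not renamed pass through unchanged.
    return name
```

Calling `normalize_distribution_name("student_t")` returns `"t"` and emits a `FutureWarning`, while a name that was not renamed is returned as-is.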

1.7.0

This release adds 3 new transformers:

1. `UniformEncoder` - A categorical and boolean transformer that converts the column into a uniform distribution.
2. `OrderedUniformEncoder` - The same as above, but the order for the categories can be specified, changing which range in the uniform distribution each category belongs to.
3. `IDGenerator` - A text transformer that drops the input column during transform and returns IDs during reverse transform. The IDs all take the form `<prefix><number><suffix>` and can be configured with a custom prefix, suffix, and starting point.
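
The idea behind the two uniform encoders can be sketched with the standard library alone. This is an illustration under assumptions, not RDT's code: the function names are hypothetical, and it assumes each category receives a sub-interval of [0, 1] whose width equals the category's frequency in the fitted data. Passing `order` mimics the ordered variant by changing which range each category gets.

```python
import random
from collections import Counter


def fit_intervals(values, order=None):
    """Assign each category a sub-interval of [0, 1] sized by its frequency."""
    counts = Counter(values)
    # Default to order of first appearance; an explicit order moves the ranges.
    categories = list(order) if order is not None else list(counts)
    intervals = {}
    start = 0.0
    for category in categories:
        end = start + counts[category] / len(values)
        intervals[category] = (start, end)
        start = end
    return intervals


def transform(values, intervals, seed=None):
    """Replace each category with a random number from its interval."""
    rng = random.Random(seed)
    return [rng.uniform(*intervals[v]) for v in values]


def reverse_transform(numbers, intervals):
    """Map each number back to the category whose interval contains it."""
    categories = list(intervals)
    result = []
    for x in numbers:
        for category in categories:
            start, end = intervals[category]
            if start <= x < end:
                result.append(category)
                break
        else:
            # x sits on the upper edge of the range; use the last category.
            result.append(categories[-1])
    return result


data = ["a", "b", "a", "c"]
intervals = fit_intervals(data)
encoded = transform(data, intervals, seed=0)
decoded = reverse_transform(encoded, intervals)
```

Because more frequent categories cover wider intervals, the encoded column is approximately uniform on [0, 1] when the data follows the fitted frequencies.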

Additionally, the `AnonymizedFaker` is enhanced to support the text sdtype.
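
The ID format described for the `IDGenerator` above can be illustrated with a short sketch. The names `make_id_generator` and `starting_value` are hypothetical, and continuing the counter across calls is an assumption of this sketch rather than documented behavior.

```python
from itertools import count


def make_id_generator(prefix="", suffix="", starting_value=0):
    """Return a function that yields IDs of the form <prefix><number><suffix>."""
    counter = count(starting_value)

    def generate(num_rows):
        # Each call continues numbering where the previous call stopped.
        return [f"{prefix}{next(counter)}{suffix}" for _ in range(num_rows)]

    return generate


generate_ids = make_id_generator(prefix="ID_", suffix="_X", starting_value=100)
first_batch = generate_ids(3)
second_batch = generate_ids(1)
```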

Deprecations

* The `get_input_sdtype` method is being deprecated in favor of `get_supported_sdtypes`.

New Features

* Create IDGenerator transformer - Issue [675](https://github.com/sdv-dev/RDT/issues/675) by R-Palazzo
* Add UniformEncoder (and its ordered version) - Issue [678](https://github.com/sdv-dev/RDT/issues/678) by R-Palazzo
* Allow me to use AnonymizedFaker with sdtype text columns - Issue [688](https://github.com/sdv-dev/RDT/issues/688) by amontanez24

Maintenance

* Deprecate get_input_sdtype - Issue [682](https://github.com/sdv-dev/RDT/issues/682) by R-Palazzo

1.6.1

This release updates the default transformers used for certain sdtypes. It also enables the `AnonymizedFaker` and `PseudoAnonymizedFaker` to work with any sdtype besides boolean, categorical, datetime, numerical or text.

Bugs

* [Enterprise Usage] Unable to assign generic PII transformers (eg. AnonymizedFaker) - Issue [674](https://github.com/sdv-dev/RDT/issues/674) by amontanez24

New Features

* Update the default transformers that HyperTransformer assigns to each sdtype - Issue [664](https://github.com/sdv-dev/RDT/issues/664) by amontanez24

1.6.0

This release adds the ability to generate missing values to the `AnonymizedFaker`. Users can now provide the `missing_value_generation` parameter during initialization. They can set it to `None` to not generate any missing values, or `'random'` to generate random missing values in the same proportion as the fitted data.
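
The two settings described above can be sketched as follows. This is a standard-library illustration, not the `AnonymizedFaker` itself: the class `MissingValueSketch` and its methods are hypothetical, and "same proportion" is interpreted here as nulling each generated value independently with probability equal to the fitted null fraction.

```python
import random


class MissingValueSketch:
    """Hypothetical stand-in showing the None and 'random' settings."""

    def __init__(self, missing_value_generation="random", seed=None):
        if missing_value_generation not in (None, "random"):
            raise ValueError("missing_value_generation must be None or 'random'")
        self.missing_value_generation = missing_value_generation
        self.null_fraction = 0.0
        self._rng = random.Random(seed)

    def fit(self, values):
        # Record the share of missing entries in the fitted data.
        if values:
            self.null_fraction = sum(v is None for v in values) / len(values)

    def apply(self, generated):
        if self.missing_value_generation is None:
            # No missing values are generated.
            return list(generated)
        # Null each value with probability equal to the fitted fraction.
        return [
            None if self._rng.random() < self.null_fraction else v
            for v in generated
        ]


sketch = MissingValueSketch(missing_value_generation="random", seed=0)
sketch.fit(["x", None, "y", None])
output = sketch.apply(["a", "b", "c", "d", "e", "f"])
```

With this approach the share of nulls in the output matches the fitted proportion only in expectation, so a short column may contain more or fewer nulls than the fitted data.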

Additionally, this release improves the `NullTransformer` by allowing nulls to be replaced on the forward transform even if `missing_value_generation` is set to `None`. It also fixes a bug that caused the `UnixTimestampEncoder` to return a different dtype than the input on `reverse_transform`. This was particularly problematic when datetime columns were represented as ints.

New Features

* AnonymizedFaker should be able to model and generate missing values - Issue [660](https://github.com/sdv-dev/RDT/issues/660) by R-Palazzo

Bugs

* The datetime transformers don't give me back the same dtype sometimes - Issue [657](https://github.com/sdv-dev/RDT/issues/657) by frances-h
* RDT NullTransformer doesn't replace nulls if missing_value_generation is None - Issue [658](https://github.com/sdv-dev/RDT/issues/658) by amontanez24

Maintenance

* Remove python 3.7 builds - Issue [663](https://github.com/sdv-dev/RDT/issues/663) by amontanez24
* Drop support for Python 3.7 - Issue [666](https://github.com/sdv-dev/RDT/issues/666) by amontanez24

Internal

* Add add-on modules to sys.modules - Issue [653](https://github.com/sdv-dev/RDT/issues/653) by amontanez24

1.5.0

This release adds a new parameter called `missing_value_generation` to the initialization of certain transformers to specify how missing values should be created. The parameter can be used in the `FloatFormatter`, `BinaryEncoder`, `UnixTimestampEncoder`, `OptimizedTimestampEncoder`, `GaussianNormalizer` and `ClusterBasedNormalizer`. Additionally, it fixes a bug that was causing every column that had nulls to generate them in the same place.

Deprecations

* The `model_missing_values` parameter is being deprecated in favor of the new `missing_value_generation` parameter.

Bugs

* Fix randomization when creating null values - Issue [639](https://github.com/sdv-dev/RDT/issues/639) by fealho

New Features

* Allow a no-op handling strategy for missing values (nulls) - Issue [644](https://github.com/sdv-dev/RDT/issues/644) by pvk-developer
* Add add-on detection for premium transformers - Issue [646](https://github.com/sdv-dev/RDT/issues/646) by frances-h

Maintenance

* Performance tests still fragile - Issue [641](https://github.com/sdv-dev/RDT/issues/641) by fealho
* Investigate removing quality tests - Issue [642](https://github.com/sdv-dev/RDT/issues/642) by amontanez24

1.4.2

This release fixes a bug that caused datetime and numerical transformers to crash if a column was all NaNs. Additionally, it adds support for Pandas 2.0!

Bugs

* Numerical & datetime transformers crash if the entire column is null - Issue [637](https://github.com/sdv-dev/RDT/issues/637) by frances-h

Maintenance

* Remove upper bound for pandas - Issue [633](https://github.com/sdv-dev/RDT/issues/633) by pvk-developer
