Gretel-synthetics

Latest version: v0.22.14


0.15.0

Major changes:

- Totally refactored modules and package structure. This will enable future contributions to utilize other underlying ML libraries as the core engine. Configurations are now specific to the underlying engine: `LocalConfig` can be replaced with `TensorFlowConfig`, although the former is still supported for backwards compatibility.

- With TensorFlow 2.4.x, TensorFlow Privacy can be used to provide differentially private training via modified Keras DP optimizers.

- Added a new tokenizer module that can be used independently of the underlying model training. By default, we continue to use SentencePiece as the tokenizer. We have also added a character-by-character tokenizer that can be useful when training with differential privacy.

- Misc bug fixes and optimizations

- Changes in this release are backwards compatible with previous versions.
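The backwards-compatibility note above can be illustrated with a plain-Python sketch of the aliasing pattern (the class and field names here are illustrative, not the library's actual definitions): the engine-specific config class does the work, and the old name is kept as an alias so existing code keeps running.

```python
from dataclasses import dataclass


@dataclass
class TensorFlowConfig:
    # Engine-specific settings (illustrative fields only)
    epochs: int = 100
    checkpoint_dir: str = "checkpoints"


# Old name kept as an alias for backwards compatibility
LocalConfig = TensorFlowConfig

old_style = LocalConfig(epochs=50)
new_style = TensorFlowConfig(epochs=50)
assert type(old_style) is type(new_style)
```

Because the alias points at the same class, code written against the old name needs no changes when the engine-specific name becomes canonical.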
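The DP training mentioned above pairs per-example gradient clipping with calibrated Gaussian noise. A toy, library-free sketch of that idea (not TensorFlow Privacy's actual optimizer; the function name and 1-D "gradients" are illustrative):

```python
import random


def dp_average_gradient(per_example_grads, l2_norm_clip=1.0,
                        noise_multiplier=1.1, seed=0):
    """Clip each per-example gradient to l2_norm_clip, add Gaussian noise,
    and return the noisy average."""
    rng = random.Random(seed)
    clipped = []
    for g in per_example_grads:
        norm = abs(g)  # scalar "gradient" for illustration
        scale = min(1.0, l2_norm_clip / norm) if norm > 0 else 1.0
        clipped.append(g * scale)
    noise = rng.gauss(0.0, noise_multiplier * l2_norm_clip)
    return (sum(clipped) + noise) / len(per_example_grads)


grad = dp_average_gradient([0.5, 3.0, -2.0])
```

Clipping bounds any single example's influence on the update, which is what makes the added noise sufficient for a differential-privacy guarantee.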
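A character-by-character tokenizer is simple enough to sketch without the library (an illustrative stand-in, not gretel-synthetics' implementation): every distinct character becomes one token, which keeps the vocabulary small and the token boundaries independent of the training data's word statistics.

```python
class CharTokenizer:
    """Toy char-level tokenizer: one token id per distinct character."""

    def __init__(self, corpus):
        self.vocab = {ch: i for i, ch in enumerate(sorted(set(corpus)))}
        self.inverse = {i: ch for ch, i in self.vocab.items()}

    def encode(self, text):
        return [self.vocab[ch] for ch in text]

    def decode(self, ids):
        return "".join(self.inverse[i] for i in ids)


tok = CharTokenizer("hello world")
ids = tok.encode("hello")
assert tok.decode(ids) == "hello"
```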

Please see our updated README and examples directory.

0.15.0.rc0

0.14.1

Enable "Smart Seeding," which allows a prefix to be provided during line generation; the generator completes the line based on the provided seed. When training on structured data (DataFrames), this enables the first N column values to be supplied up front, with the remaining columns generated based on those initial values.
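The seeding behavior described above can be mimicked with a toy generator (illustrative only, not the library's API): the caller supplies the first N field values, and the generator fills in the rest.

```python
import random


def generate_row(columns, seed_values, sampler, rng=None):
    """Return a full row: seed_values for the first columns, sampled values after."""
    rng = rng or random.Random(0)
    row = list(seed_values)
    for col in columns[len(seed_values):]:
        row.append(sampler(col, rng))
    return row


# Toy "model": sample remaining column values from per-column choices
choices = {"city": ["Austin", "Boston"], "age": [25, 40]}
sampler = lambda col, rng: rng.choice(choices[col])

row = generate_row(["name", "city", "age"], ["Alice"], sampler)
assert row[0] == "Alice" and len(row) == 3
```

In the real feature, the "sampler" is the trained language model conditioning on the seed prefix rather than a lookup table.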

0.14.0

- Introduce Keras Early Stopping and Save Best Model features. Set the default number of epochs to 100, which should allow most training runs to stop automatically without overfitting.

- Provide better tracking of which epoch's model was used as the best one in the model history table

- Temporarily disable DP mode
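The early-stopping and save-best behavior above can be sketched as a plain training loop (a toy, not the Keras callbacks themselves): stop when validation loss has not improved for `patience` epochs, and remember which epoch produced the best model.

```python
def train_with_early_stopping(losses_by_epoch, patience=3):
    """Return (best_epoch, best_loss, epochs_run) for a sequence of val losses."""
    best_epoch, best_loss, stale = 0, float("inf"), 0
    for epoch, loss in enumerate(losses_by_epoch):
        if loss < best_loss:
            # New best: this is the checkpoint a "save best model" hook would keep
            best_epoch, best_loss, stale = epoch, loss, 0
        else:
            stale += 1
            if stale >= patience:
                return best_epoch, best_loss, epoch + 1
    return best_epoch, best_loss, len(losses_by_epoch)


# Loss improves until epoch 2, then plateaus: training stops early
result = train_with_early_stopping([0.9, 0.5, 0.3, 0.4, 0.4, 0.4])
# → (2, 0.3, 6)
```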

0.13.0

🏃 🏃‍♀️ Speedups! Utilized TensorFlow functions and batch predictions for massive speedups in text generation!
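Batch-prediction speedups come from amortizing per-call overhead: advance every pending line one step per model call instead of making one call per line. A toy illustration of that loop structure (illustrative only, not the library's generator):

```python
def generate_batch(step_fn, seeds, max_len=5):
    """Advance all sequences together: one step_fn call per position, not per line."""
    sequences, calls = [list(s) for s in seeds], 0
    while max(len(s) for s in sequences) < max_len:
        # One "model call" predicts the next token for every sequence at once
        next_tokens = step_fn(sequences)
        calls += 1
        for seq, tok in zip(sequences, next_tokens):
            seq.append(tok)
    return ["".join(s) for s in sequences], calls


# Toy step function: repeat each sequence's last character
step = lambda batch: [s[-1] for s in batch]
lines, calls = generate_batch(step, ["ab", "cd"])
# → lines == ["abbbb", "cdddd"], calls == 3
```

With a real model, `step_fn` would be a single batched (and graph-compiled) forward pass, so generating N lines costs roughly one line's worth of calls.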

0.12.0

- Included SentencePiece support for tokenizing large datasets. By default, if a dataset is larger than 100,000 lines, we will utilize a sample of 100k lines to build the tokenizer.

- Switched to a `loky` based backend for multi-processing. This fixes several run-time bugs on Windows platforms and increases general stability for consistent multi-processing text generation.
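Drawing a fixed-size subset of a large dataset for tokenizer training can be done in one pass with reservoir sampling (an illustrative approach; the note above does not specify how the library draws its 100k sample):

```python
import random


def reservoir_sample(lines, k, seed=0):
    """Uniformly sample k items from an iterable of unknown length in one pass."""
    rng = random.Random(seed)
    sample = []
    for i, line in enumerate(lines):
        if i < k:
            sample.append(line)
        else:
            # Replace an existing element with probability k / (i + 1)
            j = rng.randrange(i + 1)
            if j < k:
                sample[j] = line
    return sample


subset = reservoir_sample((f"line-{i}" for i in range(100_000)), k=5)
```

The one-pass property matters here: the dataset never needs to fit in memory or be counted in advance.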
