Gretel-synthetics

Latest version: v0.22.14



0.9.3

⬆️ Upgraded to the latest SentencePiece and added a `max_line_len` param to the Config options. This lets you override the default SentencePiece line limit with a custom one. During our testing, we found that the limit had to be set a few thousand characters higher than the actual longest line. For example, a line that was 49500 chars long required a limit of about 53000.
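As a rough illustration of the headroom described above, the sketch below picks a limit a few percent above the longest training line. The helper name `suggest_max_line_len` and the 7% default are assumptions extrapolated from the single 49500 → ~53000 data point in this note; they are not part of the gretel-synthetics API, and the right margin for your data may differ.

```python
def suggest_max_line_len(lines, headroom_percent=7):
    """Return a line limit slightly above the longest line in `lines`.

    Hypothetical helper: the 7% default is a guess based on one
    observation in the release note, not a documented rule.
    """
    longest = max((len(line) for line in lines), default=0)
    # Integer arithmetic keeps the result exact and reproducible.
    return longest + (longest * headroom_percent) // 100


training_lines = ["a" * 49500, "short line"]
limit = suggest_max_line_len(training_lines)
print(limit)  # 52965
```

The resulting value could then be passed as `max_line_len` when building your configuration; if training still rejects long lines, increase the headroom.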

0.9.2

🐛 Fixed an issue where `setup.py` would fail on installation from pip.

📓 Updates to UCI Notebook

0.9.1

This update removes the use of the `annotations` module for type checks. We also provide Python 3.6 support through the `[3.6]` extras option. By default, the package works on Colab, since Colab already installs a backport of `dataclasses`, so installing with the extras on Colab is not necessary.

0.9.0

**NOTE**: This release introduces some new constructs that are NOT backwards compatible with older versions.

⚙️ Configuration Changes:
* By default, we will not assume any structure in your training text. Lines will be generated without any presumed delimiter between the text. To use a delimiter you must specify the `field_delimiter` param when constructing your configuration. Our example notebooks have been updated to reflect this.
* Overwrite protection: if there is already a model and tokenizer in your checkpoint directory, you will receive a `RuntimeError` when attempting to train a new model that would overwrite the old data. If you wish to keep overwriting (for example, during rapid model generation and testing), set the `overwrite` param to `True` in your configuration. The example notebook has been updated to show this param.
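The overwrite protection described above can be pictured with a small stand-alone sketch. This is not the library's implementation: `ensure_checkpoint_dir` and the `model.bin` filename are hypothetical, and the sketch only mirrors the documented behavior (an error when the checkpoint directory already holds data, unless `overwrite` is set), using Python's built-in `RuntimeError`.

```python
import tempfile
from pathlib import Path


def ensure_checkpoint_dir(checkpoint_dir, overwrite=False):
    """Create the directory, refusing to reuse a non-empty one unless overwrite is set.

    Hypothetical stand-in for the behavior in the release note.
    """
    path = Path(checkpoint_dir)
    path.mkdir(parents=True, exist_ok=True)
    if any(path.iterdir()) and not overwrite:
        raise RuntimeError(
            f"{path} already contains files; pass overwrite=True to replace them"
        )
    return path


with tempfile.TemporaryDirectory() as tmp:
    ensure_checkpoint_dir(tmp)                    # empty directory: accepted
    (Path(tmp) / "model.bin").write_text("stub")  # simulate a saved model
    try:
        ensure_checkpoint_dir(tmp)                # non-empty and no opt-in
    except RuntimeError as err:
        print("refused:", err)
    ensure_checkpoint_dir(tmp, overwrite=True)    # explicit opt-in succeeds
```

Defaulting to refusal means an accidental re-run cannot silently destroy a trained model; rapid experimentation opts in explicitly.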

👩‍🍳 Cooking up new data
* Previously, we would yield a `dict` when generating a new record. Now we yield a `gen_text` object instead. This object has the same data, but you access the various components as attrs of the object. For example, if you have a `line` variable that was emitted from the generator, you can access the raw text with `line.text`.
* If you provided a delimiter during configuration, the `gen_text` objects are aware of it, and you can get your generated fields by using the `values_to_list()` method of the object. See our docs for more detail on this object: https://gretel-synthetics.readthedocs.io/en/stable/api/generate.html
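To make the shift from `dict` access to attribute access concrete, here is a minimal stand-in. `GenTextSketch` is a hypothetical class written for this illustration, not the library's `gen_text`; it copies only the `text` attribute and the method name mentioned above, and its handling of a missing delimiter (returning `None`) is an assumption. Check the linked docs for the real object's fields and method names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GenTextSketch:
    """Hypothetical stand-in for a generated record."""

    text: str
    delimiter: Optional[str] = None

    def values_to_list(self) -> Optional[List[str]]:
        # Without a delimiter there is no field structure to split on.
        if self.delimiter is None:
            return None
        return self.text.split(self.delimiter)


line = GenTextSketch(text="alice,42,NYC", delimiter=",")
print(line.text)              # alice,42,NYC
print(line.values_to_list())  # ['alice', '42', 'NYC']
```

Code that previously indexed the yielded `dict` by key would switch to attribute access of this shape.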

👨‍💻 Code cleanup and test updates

0.8.0

📖 Module docs now available at https://gretel-synthetics.readthedocs.io

🚧 Minor updates to internals to support better documentation

0.7.1

📚 Tutorial and doc improvements

* Use the installed TensorFlow library by default (Colab uses an optimized TensorFlow version for TPU)
* Optionally, install a pinned version of TensorFlow with `pip install gretel-synthetics[tf]`

