Datasets

Latest version: v3.1.0

Page 2 of 6

4.9.1

Added

Changed

Deprecated

Removed

Fixed

- The installation on macOS now works (see issues
[4805](https://github.com/tensorflow/datasets/issues/4805) and
[4852](https://github.com/tensorflow/datasets/issues/4852)). The ArrayRecord
dependency is lazily loaded, so the
[TensorFlow-less path](https://www.tensorflow.org/datasets/tfless_tfds) is
not possible at the moment on macOS. A fix for this will follow soon.

Security

4.9.0

Added

- Native support for JAX and PyTorch. TensorFlow is no longer a dependency for
reading datasets. See the
[documentation](https://www.tensorflow.org/datasets/tfless_tfds).
- Added minival split to
[LVIS dataset](https://www.tensorflow.org/datasets/catalog/lvis).
- [Mixed-human](https://www.tensorflow.org/datasets/catalog/robomimic_mh) and
[machine-generated](https://www.tensorflow.org/datasets/catalog/robomimic_mg)
robomimic datasets.
- WebVid dataset.
- ImagenetPI dataset.
- [Wikipedia](https://www.tensorflow.org/datasets/catalog/wikipedia) dataset
for the 20230201 dump.
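As a rough illustration of the TensorFlow-less reading path mentioned in the first item above, the sketch below wraps `tfds.data_source`, the random-access entry point described in the linked documentation. It assumes a TFDS release that exposes that function and a dataset that can be downloaded and prepared locally. The helper name `load_train_source` and the default dataset `mnist` are hypothetical choices made for this example.

```python
def load_train_source(name: str = "mnist"):
    """Return a random-access data source for the train split of `name`.

    Assumption: the installed TFDS version provides `tfds.data_source`.
    """
    # Imported lazily so that defining this helper does not itself
    # require TFDS to be installed.
    import tensorflow_datasets as tfds

    # The returned object behaves like a sequence: it supports len()
    # and integer indexing, which is what JAX or PyTorch input
    # pipelines typically need.
    return tfds.data_source(name, split="train")
```

Calling `load_train_source()` triggers the download and preparation of the dataset if it is not already on disk, so the first call may take a while.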

Changed

- Support for `tensorflow=2.12`.

Deprecated

Removed

Fixed

Security

4.8.3

Added

Changed

Deprecated

- Python 3.7 support: this version and future versions require Python 3.8.

Removed

Fixed

- The `ignore_verifications` flag of Hugging Face's `datasets.load_dataset` is
deprecated and used to cause errors in `tfds.load('huggingface:foo')`.

Security

4.8.2

Deprecated

- Python 3.7 support: this is the last version of TFDS supporting Python 3.7.
Future versions will use Python 3.8.

Fixed

- `tfds new` and `tfds build` better support the new recommended dataset
organization, where each dataset has its own package under `datasets/`, and
the builder class is called `Builder` and is defined in the module
`${dsname}_dataset_builder.py`.

Security

4.8.1

Changed

- Added file `valid_tags.txt` to not break builds.
- TFDS no longer relies on TensorFlow DTypes. We chose NumPy DTypes to keep the
typing expressiveness while dropping the heavy dependency on TensorFlow. We
migrated all our internal datasets. Please migrate accordingly:
  - `tf.bool`: `np.bool_`
  - `tf.string`: `np.str_`
  - `tf.int64`, `tf.int32`, etc.: `np.int64`, `np.int32`, etc.
  - `tf.float64`, `tf.float32`, etc.: `np.float64`, `np.float32`, etc.
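The DType mapping above is mechanical, so it can be applied to source text with a small script. The following is a minimal sketch, not a tool shipped with TFDS: the function name `migrate_dtypes` is hypothetical, and the handling of `tf.uintN` is an assumption that extends the "etc." in the list to unsigned integers.

```python
import re

# Names that change between TensorFlow and NumPy, per the list above.
# Sized numeric dtypes (int32, float64, ...) keep the same name.
_RENAMED = {"bool": "bool_", "string": "str_"}

# Matches tf.bool, tf.string and sized numeric dtypes such as tf.int64.
# Including uintN is an assumption; the changelog only names int and float.
_TF_DTYPE = re.compile(r"\btf\.(bool|string|(?:int|uint|float)\d+)\b")


def migrate_dtypes(source: str) -> str:
    """Rewrite TensorFlow dtype references in `source` to NumPy dtypes."""

    def _swap(match: re.Match) -> str:
        name = match.group(1)
        return "np." + _RENAMED.get(name, name)

    return _TF_DTYPE.sub(_swap, source)
```

For example, `migrate_dtypes("dtype=tf.string")` yields `"dtype=np.str_"`. The script only rewrites dtype references; adding `import numpy as np` to migrated files is still a manual step.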

4.8.0

Added

- [API] `DatasetBuilder`'s description and citations can be specified in
dedicated `README.md` and `CITATIONS.bib` files, within the dataset package
(see https://www.tensorflow.org/datasets/add_dataset).
- Tags can be associated with datasets in the `TAGS.txt` file. For
now, they are only used in the generated documentation.
- [API][Experimental] New `ViewBuilder` to define datasets as transformations
of existing datasets. Also adds `tfds.transform` with functionality to apply
transformations.
- Loggers are also called on `tfds.as_numpy(...)`; the base `Logger` class has
a new corresponding method.
- `tfds.core.DatasetBuilder` can have a default limit for the number of
simultaneous downloads. `tfds.download.DownloadConfig` can override it.
- `tfds.features.Audio` supports storing raw audio data for lazy decoding.
- The number of shards can be overridden when preparing a dataset:
`builder.download_and_prepare(download_config=tfds.download.DownloadConfig(num_shards=42))`.
Alternatively, you can configure the min and max shard size if you want TFDS
to compute the number of shards for you, but want to have control over the
shard sizes.
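To make the relationship between shard-size bounds and shard count in the last item concrete, the sketch below computes which shard counts keep every shard within a minimum and maximum size. It is an illustration of the arithmetic only and is not TFDS's actual sharding logic. The function name `valid_shard_counts` and its parameters are hypothetical.

```python
import math


def valid_shard_counts(total_bytes: int, min_shard_bytes: int,
                       max_shard_bytes: int) -> tuple[int, int]:
    """Return the (fewest, most) shard counts that respect both bounds.

    Assumes shards are equally sized, which real datasets only approximate.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if not 0 < min_shard_bytes <= max_shard_bytes:
        raise ValueError("need 0 < min_shard_bytes <= max_shard_bytes")

    # Fewest shards such that no shard exceeds the maximum size.
    fewest = math.ceil(total_bytes / max_shard_bytes)
    # Most shards such that no shard falls below the minimum size.
    most = max(1, total_bytes // min_shard_bytes)
    if fewest > most:
        raise ValueError("no shard count satisfies both bounds")
    return fewest, most
```

With 1000 bytes of data, a 100-byte minimum, and a 400-byte maximum, any count from 3 to 10 satisfies both bounds. Fixing `num_shards` explicitly, as in the changelog entry, bypasses this kind of calculation entirely.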

Changed

Deprecated

Removed

Fixed

Security

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.