Added
- Dataset-as-folder: a dataset can now be a self-contained module in a folder with
checksums, dummy data, etc. This simplifies implementing datasets outside the
TFDS repository.
- `tfds.load` can now load a dataset without using the generation class, so
`tfds.load('my_dataset:1.0.0')` can work even if `MyDataset.VERSION ==
'2.0.0'` (see 2493).
- TFDS CLI (see https://www.tensorflow.org/datasets/cli for details).
- `tfds.testing.mock_data` no longer requires metadata files.
- `tfds.as_dataframe(ds, ds_info)` with custom visualisation
([example](https://www.tensorflow.org/datasets/overview#tfdsas_dataframe)).
- `tfds.even_splits` to generate subsplits (e.g. `tfds.even_splits('train',
n=3) == ['train[0%:33%]', 'train[33%:67%]', ...]`).
- `DatasetBuilder.RELEASE_NOTES` property.
- `tfds.features.Image` now supports 4-channel PNG images.
- `tfds.ImageFolder` now supports custom shape and dtype.
- Downloaded URLs are available through `MyDataset.url_infos`.
- `skip_prefetch` option to `tfds.ReadConfig`.
- `as_supervised=True` support for `tfds.show_examples`, `tfds.as_dataframe`.
- `tfds.features` can now be saved/loaded. You may have to override
[FeatureConnector.from_json_content](https://www.tensorflow.org/datasets/api_docs/python/tfds/features/FeatureConnector?version=nightly#from_json_content)
and `FeatureConnector.to_json_content` to support this feature.
- Script to detect dead URLs.
- New datasets.
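
To illustrate the subsplit strings produced by `tfds.even_splits`, here is a minimal pure-Python sketch. The helper `even_split_specs` is hypothetical and is not the TFDS implementation; it assumes boundaries are rounded to whole percents, which matches the `n=3` example above but may differ from the library for other values.

```python
from typing import List


def even_split_specs(split: str, n: int) -> List[str]:
    """Hypothetical helper: build n percent-based subsplit strings."""
    if not 1 <= n <= 100:
        raise ValueError("n must be between 1 and 100")
    # Boundaries at multiples of 100/n, rounded to whole percents (assumption).
    bounds = [round(i * 100 / n) for i in range(n + 1)]
    # Pair each boundary with the next one to form contiguous slices.
    return [f"{split}[{lo}%:{hi}%]" for lo, hi in zip(bounds, bounds[1:])]


specs = even_split_specs("train", n=3)
print(specs)
```

Each resulting string uses the same slicing syntax as the split names shown in the `tfds.even_splits` entry.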
Changed
- `tfds.as_numpy()` now returns an iterable which can be iterated multiple
times. To migrate: `next(ds)` -> `next(iter(ds))`.
- Rename `tfds.features.text.Xyz` -> `tfds.deprecated.text.Xyz`.
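
The `tfds.as_numpy()` migration comes down to the difference between an iterable and an iterator. The sketch below uses a hypothetical stand-in class (`ReiterableExamples`, not part of TFDS) to show why `next(ds)` fails on a re-iterable object while `next(iter(ds))` works.

```python
class ReiterableExamples:
    """Hypothetical stand-in for an iterable that can be iterated many times."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        # A fresh iterator is returned on every call, so each loop restarts.
        return iter(self._items)


ds = ReiterableExamples([1, 2, 3])

# The object defines __iter__ but not __next__, so next(ds) raises TypeError.
try:
    next(ds)
    raised = False
except TypeError:
    raised = True

first = next(iter(ds))   # Migration: wrap in iter() before calling next().
first_pass = list(ds)    # Full pass over the data.
second_pass = list(ds)   # A second pass yields the same elements again.
```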
Removed
- `DatasetBuilder.IN_DEVELOPMENT` property.
- `tfds.core.disallow_positional_args` (use the Python 3 `*,` keyword-only syntax instead).
- Testing against TF 1.15. Requires Python 3.6.8+.
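
As a replacement for `tfds.core.disallow_positional_args`, the Python 3 `*,` marker makes every following parameter keyword-only. A minimal sketch, with a hypothetical function `read_split` and hypothetical parameter names chosen only for illustration:

```python
def read_split(name, *, shuffle=False, batch_size=None):
    """Hypothetical function: arguments after `*` must be passed by keyword."""
    return {"name": name, "shuffle": shuffle, "batch_size": batch_size}


# Keyword arguments are accepted.
config = read_split("train", shuffle=True, batch_size=32)

# Passing the same values positionally is rejected by the interpreter.
try:
    read_split("train", True, 32)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```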
Fixed
- Better archive extension detection for `dl_manager.download_and_extract`.
- Fix `tfds.__version__` in TFDS nightly to be PEP 440 compliant.
- Fix crash when GCS not available.
- Improved open-source workflow, contributor guide, documentation.
- Many other internal cleanups, bug fixes, dead code removal, py2->py3 cleanup,
pytype annotations, etc.
- Datasets updates.