* Graduation of `MultiProcessingReadingService` from prototype to beta
  * This is the default `ReadingService` that we expect most users to use; it closely aligns with the functionality of the old `DataLoader`, with improvements
  * With this graduation, we expect the APIs and behaviors to be mostly stable going forward. We will continue to add new features as they become ready.
* Introduction of Sequential ReadingService
  * Enables the usage of multiple `ReadingService`s at the same time (see the sketch after this list)
* Addition of a comprehensive tutorial of `DataLoader2` and its subcomponents
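The snippet below is a minimal sketch of composing two `ReadingService`s with `SequentialReadingService`. It assumes the classes are importable from `torchdata.dataloader2`, that services are applied to the graph in the order they are passed, and that a distributed process group is already initialized for `DistributedReadingService`.

```python
from torchdata.dataloader2 import (
    DataLoader2,
    DistributedReadingService,
    MultiProcessingReadingService,
    SequentialReadingService,
)
from torchdata.datapipes.iter import IterableWrapper

dp = IterableWrapper(range(1000)).shuffle().sharding_filter()

# Compose two ReadingServices: shard across distributed ranks first,
# then across worker processes within each rank.
dist_rs = DistributedReadingService()
mp_rs = MultiProcessingReadingService(num_workers=2)
rs = SequentialReadingService(dist_rs, mp_rs)

dl = DataLoader2(dp, reading_service=rs)
for batch in dl:
    ...
dl.shutdown()
```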
## Backwards Incompatible Change
### DataLoader2
* Officially graduate `PrototypeMultiProcessingReadingService` to `MultiProcessingReadingService` ([1009](https://github.com/pytorch/data/pull/1009))
  * The APIs of `MultiProcessingReadingService`, as well as its internal implementation, have changed. Overall, this should provide a better user experience. A usage sketch with the new signature follows the comparison table below.
  * Please refer to [our documentation](https://pytorch.org/data/0.6/dataloader2.html#readingservice) for details.
<p align="center">
<table align="center">
<tr><th>0.5.0</th><th>0.6.0</th></tr>
<tr valign="top">
<td><sub> It previously took the following arguments:
<pre lang="python">
MultiProcessingReadingService(
num_workers: int = 0,
pin_memory: bool = False,
timeout: float = 0,
worker_init_fn: Optional[Callable[[int], None]] = None,
multiprocessing_context=None,
prefetch_factor: Optional[int] = None,
persistent_workers: bool = False,
)
</pre></sub></td>
<td><sub> The new version takes these arguments: <pre lang="python">
MultiProcessingReadingService(
num_workers: int = 0,
multiprocessing_context: Optional[str] = None,
worker_prefetch_cnt: int = 10,
main_prefetch_cnt: int = 10,
worker_init_fn: Optional[Callable[[DataPipe, WorkerInfo], DataPipe]] = None,
worker_reset_fn: Optional[Callable[[DataPipe, WorkerInfo, SeedGenerator], DataPipe]] = None,
)
</pre></sub></td>
</tr>
</table>
</p>
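The snippet below is a minimal usage sketch with the 0.6.0 signature. The argument values are illustrative; prefetching now appears to be configured via `worker_prefetch_cnt` and `main_prefetch_cnt` rather than the old `prefetch_factor`.

```python
from torchdata.dataloader2 import DataLoader2, MultiProcessingReadingService
from torchdata.datapipes.iter import IterableWrapper

dp = IterableWrapper(range(100)).shuffle().sharding_filter()

rs = MultiProcessingReadingService(
    num_workers=2,           # spawn two worker processes
    worker_prefetch_cnt=10,  # prefetch buffer inside each worker
    main_prefetch_cnt=10,    # prefetch buffer in the main process
)

dl = DataLoader2(dp, reading_service=rs)
for x in dl:
    ...
dl.shutdown()
```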
* Deep copy ReadingService during `DataLoader2` initialization ([746](https://github.com/pytorch/data/pull/746))
  * Within `DataLoader2`, a deep copy of the passed-in `ReadingService` object is created during initialization and used thereafter.
  * This prevents multiple `DataLoader2`s from accidentally sharing state when the same `ReadingService` object is passed into them.
<p align="center">
<table align="center">
<tr><th>0.5.0</th><th>0.6.0</th></tr>
<tr valign="top">
<td><sub> Previously, a ReadingService object used across multiple DataLoader2 instances shared state among them.
<pre lang="python">
>>> dp = IterableWrapper([0, 1, 2, 3, 4])
>>> rs = MultiProcessingReadingService(num_workers=2)
>>> dl1 = DataLoader2(dp, reading_service=rs)
>>> dl2 = DataLoader2(dp, reading_service=rs)
>>> next(iter(dl1))
>>> print(f"Number of processes that exist in `dl1`'s RS after initializing `dl1`: {len(dl1.reading_service._worker_processes)}")
Number of processes that exist in `dl1`'s RS after initializing `dl1`: 2
>>> next(iter(dl2))
# Note that we are still examining `dl1.reading_service` below
>>> print(f"Number of processes that exist in `dl1`'s RS after initializing `dl2`: {len(dl1.reading_service._worker_processes)}")
Number of processes that exist in `dl1`'s RS after initializing `dl2`: 4
</pre></sub></td>
<td><sub> DataLoader2 now deep copies the ReadingService object during initialization and the ReadingService state is no longer shared.
<pre lang="python">
>>> dp = IterableWrapper([0, 1, 2, 3, 4])
>>> rs = MultiProcessingReadingService(num_workers=2)
>>> dl1 = DataLoader2(dp, reading_service=rs)
>>> dl2 = DataLoader2(dp, reading_service=rs)
>>> next(iter(dl1))
>>> print(f"Number of processes that exist in `dl1`'s RS after initializing `dl1`: {len(dl1.reading_service._worker_processes)}")
Number of processes that exist in `dl1`'s RS after initializing `dl1`: 2
>>> next(iter(dl2))
# Note that we are still examining `dl1.reading_service` below
>>> print(f"Number of processes that exist in `dl1`'s RS after initializing `dl2`: {len(dl1.reading_service._worker_processes)}")
Number of processes that exist in `dl1`'s RS after initializing `dl2`: 2
</pre></sub></td>
</tr>
</table>
</p>
## Deprecations
### DataPipe
#### In PyTorch Core
* Remove previously deprecated `FileLoaderDataPipe` ([89794](https://github.com/pytorch/pytorch/pull/89794))
* Mark imports from ``torch.utils.data.datapipes.iter.grouping`` as deprecated ([94527](https://github.com/pytorch/pytorch/pull/94527))
#### TorchData
* Remove certain deprecated functional APIs as previously scheduled (890)
### Releng
* Drop support for Python 3.7 as aligned with PyTorch core library ([974](https://github.com/pytorch/data/pull/974))
## New Features
### DataLoader2
* Add graph function to list DataPipes from DataPipe graphs (888)
* Add functions to set seeds to DataPipe graphs (894)
* Add `worker_init_fn` and `worker_reset_fn` to MultiProcessingReadingService (907); a hook sketch follows this list
* Add round robin sharding to support non-replicable DataPipe for MultiProcessing (919)
* Guarantee that DataPipes execute `reset_iterator` when all loops have received reset request in the dispatching process (994)
* Add initial support for randomness control within `DataLoader2` (801)
* Add support for Sequential ReadingService ([commit](https://github.com/pytorch/data/commit/807db8f8c7282b2f48b48b1e07439c119a2ba12f#diff-d5ce955f25b587c0cbadcc87ad1b22b6027053e46e2920a8de7abbf5312cc24c))
* Enable SequentialReadingService to support MultiProcessing + Distributed (985)
* Add `limit`, `pause`, `resume` operations to halt DataPipes in `DataLoader2` (879)
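A hedged sketch of the new per-worker hooks, following the signatures shown in the comparison table above; it assumes `WorkerInfo` exposes `worker_id` and `num_workers`, and that both hooks receive the worker's DataPipe and must return a (possibly modified) DataPipe.

```python
from torchdata.dataloader2 import DataLoader2, MultiProcessingReadingService
from torchdata.datapipes.iter import IterableWrapper

def init_fn(datapipe, worker_info):
    # Runs once in each worker before iteration starts; `worker_id` and
    # `num_workers` are assumed attributes of WorkerInfo.
    print(f"worker {worker_info.worker_id}/{worker_info.num_workers} ready")
    return datapipe

def reset_fn(datapipe, worker_info, seed_generator):
    # Runs in each worker at the start of every epoch; the SeedGenerator can
    # be used to derive per-epoch, per-worker randomness.
    return datapipe

dp = IterableWrapper(range(100)).shuffle().sharding_filter()
rs = MultiProcessingReadingService(
    num_workers=2, worker_init_fn=init_fn, worker_reset_fn=reset_fn
)
dl = DataLoader2(dp, reading_service=rs)
```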
### DataPipe
* Add `ShardExpander` IterDataPipe (405)
* Add `RoundRobinDemux` IterDataPipe (903)
* Implement `PinMemory` IterDataPipe (1014)
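A minimal sketch of the `PinMemory` DataPipe added above, assuming its functional form is `pin_memory()` (see the documentation item under Docs) and that pinning only takes effect when CUDA is available.

```python
import torch
from torchdata.datapipes.iter import IterableWrapper

# Assumes the functional form of the PinMemory IterDataPipe is `pin_memory()`.
dp = IterableWrapper([torch.randn(4) for _ in range(10)]).pin_memory()

if torch.cuda.is_available():
    for tensor in dp:
        assert tensor.is_pinned()
```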
### Releng
* Add conda Python 3.11 builds (1010)
* Enable Python 3.11 conda builds for Mac/Windows (1026)
* Update C++ standard to 17 (1051)
## Improvements
### DataLoader2
#### In PyTorch Core
* Fix `apply_sharding` to accept one `sharding_filter` per branch ([90769](https://github.com/pytorch/pytorch/pull/90769))
#### TorchData
* Consolidate checkpoint contract with checkpoint component ([867](https://github.com/pytorch/data/pull/867))
* Update `load_state_dict()` signature to align with `TorchSnapshot` ([887](https://github.com/pytorch/data/pull/887))
* Apply sharding based on priority and combine `DistInfo` and `ExtraInfo` (used to store distributed metadata) ([916](https://github.com/pytorch/data/pull/916))
* Prevent reset iteration message from being sent to workers twice ([917](https://github.com/pytorch/data/pull/917))
* Add support to keep non-replicable DataPipe in the main process ([950](https://github.com/pytorch/data/pull/950))
* Safeguard `DataLoader2Iterator`'s `__getattr__` method ([1004](https://github.com/pytorch/data/pull/1004))
* Forward worker exceptions and have `DataLoader2` exit with them ([1003](https://github.com/pytorch/data/pull/1003))
* Attach traceback to Exception and test dispatching process ([1036](https://github.com/pytorch/data/pull/1036))
### DataPipe
#### In PyTorch Core
* Add auto-completion to DataPipes in REPLs (e.g. Jupyter notebook) ([86960](https://github.com/pytorch/pytorch/pull/86960))
* Add group support to `sharding_filter` ([88424](https://github.com/pytorch/pytorch/pull/88424))
* Add `keep_key` option to `Grouper` ([92532](https://github.com/pytorch/pytorch/pull/92532))
#### TorchData
* Add a masks option to filter files in S3 DataPipe ([880](https://github.com/pytorch/data/pull/880))
* Make HeaderIterDataPipe with `limit=None` a no-op ([908](https://github.com/pytorch/data/pull/908))
* Update `fsspec` DataPipe to be compatible with the latest version of `fsspec` ([957](https://github.com/pytorch/data/pull/957))
* Expand the possible input options for HuggingFace DataPipe ([952](https://github.com/pytorch/data/pull/952))
* Improve exception handling/skipping in online DataPipes ([968](https://github.com/pytorch/data/pull/968))
* Allow the option to place key in output in `MapKeyZipper` ([1042](https://github.com/pytorch/data/pull/1042))
* Allow single key option for `Slicer` ([1041](https://github.com/pytorch/data/pull/1041))
### Releng
* Add pure Python platform-agnostic wheel ([988](https://github.com/pytorch/data/pull/988))
## Bug Fixes
### DataLoader2
#### In PyTorch Core
* Change serialization wrapper implementation to be an iterator ([87459](https://github.com/pytorch/pytorch/pull/87459))
### DataPipe
#### In PyTorch Core
* Fix type checking to accept both Iter and Map DataPipe ([87285](https://github.com/pytorch/pytorch/pull/87285))
* Fix: Make ``__len__`` of datapipes dynamic ([88302](https://github.com/pytorch/pytorch/pull/88302))
* Properly cleanup unclosed files within generator function ([89973](https://github.com/pytorch/pytorch/pull/89973))
* Remove iterator depletion in `Zipper` ([89974](https://github.com/pytorch/pytorch/pull/89974))
#### TorchData
* Fix `to_graph` DataPipeGraph visualization function (872)
* Make lengths of DataPipe dynamic (873)
* Fix `max_token_bucketize` to accept incomparable data (883)
* Fix `S3FileLoader` local file clobbering (895)
* Fix `fsspec` DataPipe for paths starting with `az://` (849)
* Properly cleanup unclosed files within generator function (910)
## Performance
### DataLoader2
* Add minimal, reproducible AWS S3 benchmark ([847](https://github.com/pytorch/data/pull/847))
## Docs
### DataLoader2
* Add Distributed ReadingService `DataLoader2` training loop example ([863](https://github.com/pytorch/data/pull/863))
* Update README and documentation with latest changes ([954](https://github.com/pytorch/data/pull/954))
* Update Colab example with `DataLoader2` content ([979](https://github.com/pytorch/data/pull/979))
* Add initial `DataLoader2` Tutorial ([980](https://github.com/pytorch/data/pull/980))
* Add LAION-5B Example with `DataLoader2` ([1034](https://github.com/pytorch/data/pull/1034))
* Add Round Robin Sharding documentation ([1050](https://github.com/pytorch/data/pull/1050))
### DataPipe
* Add `pin_memory` to documentation (1046)
### Releng
* Fix links in README ([995](https://github.com/pytorch/data/pull/995))
* Fix links in contribution guide ([1053](https://github.com/pytorch/data/pull/1053))
## Devs
### DataPipe
#### In PyTorch Core
* Add container template for _Fork and _Demux ([89216](https://github.com/pytorch/pytorch/pull/89216))
* Refactor sharding data pipe into a separate file ([94095](https://github.com/pytorch/pytorch/pull/94095))
* Fix interface generation in setup.py ([87081](https://github.com/pytorch/pytorch/pull/87081))
#### TorchData
* Add tests to validate iteration over combining DataPipe with infinite input (912)
### Releng
* Update GHA version to utilize Node16 ([830](https://github.com/pytorch/data/pull/830))
* Enable usage of `sphinx` doctest ([850](https://github.com/pytorch/data/pull/850))
* Update submodule ([955](https://github.com/pytorch/data/pull/955))
* Make `portalocker` optional dependency ([1007](https://github.com/pytorch/data/pull/1007))
## Future Plans
For `DataLoader2`, we are actively developing new features, such as checkpointing and the ability to execute part of the DataPipe graph on a single process before dispatching the outputs to worker processes. You may begin to see some of these features in nightly builds, and we expect them to be part of the next release.
We welcome feedback and feature requests (let us know your use cases!). We always welcome potential contributors.
## Beta Usage Note
This library is currently in the Beta stage and does not yet have a fully stable release. The API may change based on user feedback or performance. We are committed to bringing this library to a stable release, but future changes may not be completely backward compatible. If you install from source or use the nightly version of this library, use it along with the PyTorch nightly binaries. If you have suggestions on the API or use cases you'd like covered, please open a GitHub issue. We'd love to hear thoughts and feedback.