In this release, we focused on building a [**Data Hub for offline RL**](https://pytorch.org/rl/reference/data.html#datasets), providing a universal `<lib>2gym` conversion tool (#1795) and improving the documentation.
TorchRL Data Hub
TorchRL now offers many offline datasets for robotics, control and gaming, all under a single data format ([TED, the TorchRL Episode Data format](https://pytorch.org/rl/reference/data.html#torchrl-episode-data-format-ted)). Every dataset is one step away from being downloaded: `dataset = <Name>ExperienceReplay(dataset_id, root="/path/to/storage", download=True)` is all you need to get started.
This means that you can now download OpenX (#1751) or Roboset (#1743) datasets and combine them in a single replay buffer (#1768), or swap one for another, in no time and with no extra code.
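To picture what combining datasets in a single buffer means, here is a conceptual sketch in plain Python. It is not TorchRL's implementation: `ToyBuffer` and `ToyEnsemble` are hypothetical names, the transitions are made-up dictionaries, and splitting the batch equally across members is an assumption made for brevity.

```python
import random


class ToyBuffer:
    """Hypothetical stand-in for one offline dataset: a list of transitions."""

    def __init__(self, name, transitions):
        self.name = name
        self.transitions = list(transitions)

    def sample(self, batch_size, rng):
        # Uniform sampling with replacement.
        return [rng.choice(self.transitions) for _ in range(batch_size)]


class ToyEnsemble:
    """Hypothetical composite buffer drawing an equal share from each member."""

    def __init__(self, *buffers):
        self.buffers = buffers

    def sample(self, batch_size, rng):
        if batch_size % len(self.buffers):
            raise ValueError("batch_size must be divisible by the number of buffers")
        share = batch_size // len(self.buffers)
        batch = []
        for buffer in self.buffers:
            # Tag each transition with its source so its origin stays visible.
            batch.extend((buffer.name, t) for t in buffer.sample(share, rng))
        return batch


rng = random.Random(0)
openx = ToyBuffer("openx", [{"obs": i, "action": -i} for i in range(10)])
roboset = ToyBuffer("roboset", [{"obs": 100 + i, "action": i} for i in range(10)])

ensemble = ToyEnsemble(openx, roboset)
batch = ensemble.sample(8, rng)
sources = [name for name, _ in batch]
```

Because both members expose the same `sample` interface, swapping one dataset for another only changes the arguments passed to the ensemble. A shared data format gives the same benefit.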
We also support many new sampling techniques, such as sampling slices of trajectories with or without repetition.
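The idea behind slice sampling can be sketched in plain Python as follows. This is an illustration only, not TorchRL's sampler: `sample_slices` is a hypothetical helper, and "without repetition" is assumed here to mean that the same slice is never returned twice within one call (slices may still overlap).

```python
import random


def sample_slices(traj_ids, num_slices, slice_len, rng, replacement=True):
    """Return `num_slices` lists of storage indices, each inside one trajectory."""
    # Group storage indices by trajectory id, preserving time order.
    by_traj = {}
    for index, traj in enumerate(traj_ids):
        by_traj.setdefault(traj, []).append(index)

    # Enumerate every window of `slice_len` consecutive steps of a trajectory.
    # Trajectories shorter than `slice_len` contribute no candidate.
    candidates = [
        indices[offset:offset + slice_len]
        for indices in by_traj.values()
        for offset in range(len(indices) - slice_len + 1)
    ]
    if replacement:
        return [rng.choice(candidates) for _ in range(num_slices)]
    # Without repetition: each candidate slice is returned at most once.
    return rng.sample(candidates, num_slices)


# Trajectory id of each stored step: three trajectories of length 4, 3 and 5.
traj_ids = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
rng = random.Random(0)
slices = sample_slices(traj_ids, num_slices=4, slice_len=3, rng=rng, replacement=False)
```

Each returned slice stays within a single trajectory, which is what makes this kind of sampling suitable for sequence models.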
As always, you can append your favourite transforms to these datasets.
TorchRL2Gym universal converter
#1795 introduces a new universal converter from simulation libraries to gym.
As an RL practitioner, it is sometimes difficult to accommodate the many different environment APIs that exist. TorchRL now provides a way of registering any env in gym(nasium). This allows users to build their datasets in torchrl and integrate them in their code base with no effort if they are already using gym as a backend. It also makes it possible to convert DMControl or Brax envs (among others) to gym without the need for an extra library.
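Conceptually, such a conversion amounts to adapting a dict-in/dict-out environment to the tuple-based signatures that gym users expect. The sketch below shows that adapter pattern in plain Python. It is not the converter introduced by the PR: `ToyDictEnv` and `GymStyleAdapter` are hypothetical classes, and the key names are assumptions chosen for the example.

```python
class ToyDictEnv:
    """Hypothetical dict-in/dict-out environment: a counter that ends at 3."""

    def reset(self):
        self.count = 0
        return {"observation": self.count}

    def step(self, data):
        self.count += data["action"]
        return {
            "next": {
                "observation": self.count,
                "reward": float(data["action"]),
                "terminated": self.count >= 3,
                "truncated": False,
            }
        }


class GymStyleAdapter:
    """Hypothetical wrapper exposing gymnasium-style reset/step signatures."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        data = self.env.reset()
        return data["observation"], {}  # (observation, info)

    def step(self, action):
        nxt = self.env.step({"action": action})["next"]
        # (observation, reward, terminated, truncated, info)
        return nxt["observation"], nxt["reward"], nxt["terminated"], nxt["truncated"], {}


env = GymStyleAdapter(ToyDictEnv())
obs, info = env.reset()
total_reward, terminated, truncated = 0.0, False, False
while not (terminated or truncated):
    obs, reward, terminated, truncated, info = env.step(1)
    total_reward += reward
```

The training loop at the bottom only relies on the tuple-based interface, so it does not need to know which library implements the wrapped environment.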
PPO and A2C compatibility with distributed models
Functional calls can now be turned off for PPO and A2C loss modules, allowing users to run RLHF training loops at scale! (#1804)
TensorDict-free replay buffers
You can now use TorchRL's replay buffer with ANY tensor-based structure, whether it involves dicts, tuples or lists. In principle, storing data **contiguously** on **disk** given any gym environment is as simple as:
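```python
from torchrl.data import LazyMemmapStorage, ReplayBuffer

# `env`, `obs`, `action` and `capacity` are assumed to be defined by the surrounding loop;
# the batch size of 32 is an arbitrary choice for illustration
rb = ReplayBuffer(storage=LazyMemmapStorage(capacity), batch_size=32)
obs_, reward, terminal, truncated, info = env.step(action)
rb.add((obs, obs_, reward, terminal, truncated, info, action))
# sampling returns a tuple with the same structure as the stored items
obs, obs_, reward, terminal, truncated, info, action = rb.sample()
```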
This is independent of TensorDict and it supports many components of our replay buffers as well as transforms. Check the doc [here](https://pytorch.org/rl/reference/data.html#composable-replay-buffers).
Multiprocessed replay buffers
TorchRL's replay buffers can now be shared across processes. Multiprocessed RBs can not only be read from but also extended on different workers. (#1724)
SOTA checks
We introduce a set of scripts to check that our training scripts run correctly before each release (#1822).
Throughput of Gym and DMControl
We now skip many checks in `GymLikeEnv` when some basic conditions are met, which significantly improves throughput for simple envs (#1803).
Algorithms
We introduce discrete CQL (#1666), discrete IQL (#1793) and IMPALA (#1506).
What's Changed: PR description
* [BugFix] Fix incorrect deprecation warning by mikemykhaylov in https://github.com/pytorch/rl/pull/1655
* [Bug] TensorDictMaxValueWriter raises error when no sample in a batch is accepted by albertbou92 in https://github.com/pytorch/rl/pull/1664
* [BugFix] Fix "done" instead of "terminated" mistakes by MarCnu in https://github.com/pytorch/rl/pull/1661
* [Feature] CatFrames constant padding by albertbou92 in https://github.com/pytorch/rl/pull/1663
* doc(README): remove typo by Deep145757 in https://github.com/pytorch/rl/pull/1665
* [Docs] Update README.md by vaibhav-009 in https://github.com/pytorch/rl/pull/1667
* [Minor] Update dreamer example tests by vmoens in https://github.com/pytorch/rl/pull/1668
* [Feature] Introduce grouping in VMAS by matteobettini in https://github.com/pytorch/rl/pull/1658
* [BugFix] assertion error message, envs/util.py by laszloKopits in https://github.com/pytorch/rl/pull/1669
* [Doc] Set `action_spec` instead of `input_spec` by FrankTianTT in https://github.com/pytorch/rl/pull/1657
* [BugFix] Fix submitit IP address/node name retrieval by vmoens in https://github.com/pytorch/rl/pull/1672
* [Doc] Document (and test) compound actor by vmoens in https://github.com/pytorch/rl/pull/1673
* [Doc] Update rollout_recurrent.png to account for terminal by vmoens in https://github.com/pytorch/rl/pull/1677
* [Doc] Add EGreedyWrapper back in the doc by vmoens in https://github.com/pytorch/rl/pull/1680
* [Doc] Fix `TanhDelta` docstring by matteobettini in https://github.com/pytorch/rl/pull/1683
* [Doc] Add discord badge on README by vmoens in https://github.com/pytorch/rl/pull/1686
* [CI] Downgrade RAY to fix CI by vmoens in https://github.com/pytorch/rl/pull/1687
* [BugFix] MaxValueWriter cuda compatibility by albertbou92 in https://github.com/pytorch/rl/pull/1689
* Upload docs for preview on HUD by DanilBaibak in https://github.com/pytorch/rl/pull/1682
* [Doc] Update pendulum and rnn tutos by vmoens in https://github.com/pytorch/rl/pull/1691
* [Algorithm] Discrete CQL by BY571 in https://github.com/pytorch/rl/pull/1666
* [BugFix] Minor fix in the logging of PPO and A2C examples by albertbou92 in https://github.com/pytorch/rl/pull/1693
* [CI] Enable retry mechanism by DanilBaibak in https://github.com/pytorch/rl/pull/1681
* [Refactor] Minor changes in prep of https://github.com/pytorch/tensordict/pull/541 by vmoens in https://github.com/pytorch/rl/pull/1696
* [BugFix] fix dreamer actor by FrankTianTT in https://github.com/pytorch/rl/pull/1697
* [Refactor] Deprecate direct usage of memmap tensors by vmoens in https://github.com/pytorch/rl/pull/1684
* Revert "[Refactor] Deprecate direct usage of memmap tensors" by vmoens in https://github.com/pytorch/rl/pull/1698
* [Refactor] Deprecate direct usage of memmap tensors by vmoens in https://github.com/pytorch/rl/pull/1699
* [Doc] Fix discord link by vmoens in https://github.com/pytorch/rl/pull/1701
* [BugFix] make sure the params of exploration-wrapper is float by FrankTianTT in https://github.com/pytorch/rl/pull/1700
* [Fix] EndOfLifeTransform fix in end of life detection by albertbou92 in https://github.com/pytorch/rl/pull/1705
* [CI] Fix benchmark on gpu by vmoens in https://github.com/pytorch/rl/pull/1706
* [Algorithm] IMPALA and VTrace module by albertbou92 in https://github.com/pytorch/rl/pull/1506
* [Doc] Fix discord link by vmoens in https://github.com/pytorch/rl/pull/1712
* [Refactor] Refactor functional calls in losses by vmoens in https://github.com/pytorch/rl/pull/1707
* [CI] Fix CI by vmoens in https://github.com/pytorch/rl/pull/1711
* [BugFix] Make casting to 'meta' device uniform across cost modules by vmoens in https://github.com/pytorch/rl/pull/1715
* [BugFix] Change ppo mujoco example to match paper results by albertbou92 in https://github.com/pytorch/rl/pull/1714
* [Minor] Hide params in ddpg actor-critic by vmoens in https://github.com/pytorch/rl/pull/1716
* [BugFix] Fix hold_out_net by vmoens in https://github.com/pytorch/rl/pull/1719
* [BugFix] `RewardSum` key check by matteobettini in https://github.com/pytorch/rl/pull/1718
* [Feature] Allow usage of a different device on main and sub-envs in ParallelEnv and SerialEnv by vmoens in https://github.com/pytorch/rl/pull/1626
* [Refactor] Better weight update in collectors by vmoens in https://github.com/pytorch/rl/pull/1723
* [Feature] Shared replay buffers by vmoens in https://github.com/pytorch/rl/pull/1724
* [CI] FIx nightly builds on osx by vmoens in https://github.com/pytorch/rl/pull/1726
* [BugFix] _call_actor_net does not handle multiple inputs by albertbou92 in https://github.com/pytorch/rl/pull/1728
* [Feature] Python-based RNN Modules by albertbou92 in https://github.com/pytorch/rl/pull/1720
* [BugFix, Test] Fix flaky gym vecenvs tests by vmoens in https://github.com/pytorch/rl/pull/1727
* [BugFix] Fix non-full TensorStorage indexing by vmoens in https://github.com/pytorch/rl/pull/1730
* [Feature] Minari datasets by vmoens in https://github.com/pytorch/rl/pull/1721
* [Feature] All VMAS scenarios available by matteobettini in https://github.com/pytorch/rl/pull/1731
* [Feature] pickle-free RB checkpointing by vmoens in https://github.com/pytorch/rl/pull/1733
* [CI] Fix doc upload by vmoens in https://github.com/pytorch/rl/pull/1738
* [BugFix] Fix RNNs trajectory split in VMAP calls by vmoens in https://github.com/pytorch/rl/pull/1736
* [CI] Fix doc upload by vmoens in https://github.com/pytorch/rl/pull/1739
* [BugFix, Feature] Fix DDQN implementation by vmoens in https://github.com/pytorch/rl/pull/1737
* [Algorithm] Update DQN example by albertbou92 in https://github.com/pytorch/rl/pull/1512
* [BugFix] Use rsync in doc workflow by vmoens in https://github.com/pytorch/rl/pull/1741
* [BugFix] Fix compat with new memmap API by vmoens in https://github.com/pytorch/rl/pull/1744
* [Feature] Roboset datasets by vmoens in https://github.com/pytorch/rl/pull/1743
* [Algorithm] Simpler IQL example by BY571 in https://github.com/pytorch/rl/pull/998
* [Performance] Faster RNNs by vmoens in https://github.com/pytorch/rl/pull/1732
* [BugFix, Test] Fix torch.vmap call in RNN tests by vmoens in https://github.com/pytorch/rl/pull/1749
* [BugFix] Fix discrete SAC log-prob by vmoens in https://github.com/pytorch/rl/pull/1750
* [Minor] Remove dead code in RolloutFromModel by ianbarber in https://github.com/pytorch/rl/pull/1752
* [Minor] Fix runnability of RLHF example in examples/rlhf by ianbarber in https://github.com/pytorch/rl/pull/1753
* [Feature] SliceSampler by vmoens in https://github.com/pytorch/rl/pull/1748
* [CI] Fix windows CI by vmoens in https://github.com/pytorch/rl/pull/1746
* [CI] Fix CI for optional dependencies by vmoens in https://github.com/pytorch/rl/pull/1754
* [Feature] V-D4RL by vmoens in https://github.com/pytorch/rl/pull/1756
* [Benchmark] Fix RB benchmarks by vmoens in https://github.com/pytorch/rl/pull/1760
* [BugFix] Fix RLHF by vmoens in https://github.com/pytorch/rl/pull/1757
* [BugFix] Fix slice sampler by vmoens in https://github.com/pytorch/rl/pull/1762
* [Feature] BurnInTransform by albertbou92 in https://github.com/pytorch/rl/pull/1765
* [Bug] Minor change burnin transform by albertbou92 in https://github.com/pytorch/rl/pull/1770
* [BugFix] Fix sampling of last item in SliceSampler by vmoens in https://github.com/pytorch/rl/pull/1774
* [Feature] Open-X Embodiement datasets by vmoens in https://github.com/pytorch/rl/pull/1751
* [BugFix] Fix documentation of threads for batched envs. by skandermoalla in https://github.com/pytorch/rl/pull/1776
* [BugFix, CI] Fix OpenML datasets runs by vmoens in https://github.com/pytorch/rl/pull/1779
* [Versioning] Bump v0.3.0 and fix m1-wheels by vmoens in https://github.com/pytorch/rl/pull/1780
* [Feature] Composite replay buffers by vmoens in https://github.com/pytorch/rl/pull/1768
* [BugFix, Feature] Vmap randomness in losses by BY571 in https://github.com/pytorch/rl/pull/1740
* [Algorithm] Update discrete SAC example by BY571 in https://github.com/pytorch/rl/pull/1745
* [Docs] Pointers to BenchMARL by matteobettini in https://github.com/pytorch/rl/pull/1710
* [Feature] Immutable writer for datasets by vmoens in https://github.com/pytorch/rl/pull/1781
* [Feature] Remove and check for prints in codebase using flake8-print by vmoens in https://github.com/pytorch/rl/pull/1758
* [BUG] Missing import for some Samplers in Data module by albertbou92 in https://github.com/pytorch/rl/pull/1784
* [BugFix] Ensure that infos and samples have the same batch-size in SamplerEnsemble by vmoens in https://github.com/pytorch/rl/pull/1786
* [BugFix] Writers extend() method should always return indices in data.device by albertbou92 in https://github.com/pytorch/rl/pull/1785
* [Doc] Revamp envs doc by vmoens in https://github.com/pytorch/rl/pull/1787
* [BugFix] Less flaky gym vecenv test by vmoens in https://github.com/pytorch/rl/pull/1790
* [CI] Regroup tests by vmoens in https://github.com/pytorch/rl/pull/1791
* [CI] Remove stable GPU tests from CI by vmoens in https://github.com/pytorch/rl/pull/1792
* Update README.md to fix CI banner by vmoens in https://github.com/pytorch/rl/pull/1794
* [Feature] `SamplerWithoutReplacement` state dictionary by matteobettini in https://github.com/pytorch/rl/pull/1788
* [BugFix] Higher time threshold for PEnv by vmoens in https://github.com/pytorch/rl/pull/1799
* [Feature] SignTransform by albertbou92 in https://github.com/pytorch/rl/pull/1798
* [Feature] Extend MaxValueWriter with reduce parameter for the rank_key by albertbou92 in https://github.com/pytorch/rl/pull/1796
* [BugFix] Fixes bug in MaxValueWriter tests by albertbou92 in https://github.com/pytorch/rl/pull/1801
* [Performance] faster gym-like class by vmoens in https://github.com/pytorch/rl/pull/1803
* [Feature] GenDGRL by vmoens in https://github.com/pytorch/rl/pull/1773
* [Performance] Minor improvements to step_and_maybe_reset in batched envs by vmoens in https://github.com/pytorch/rl/pull/1807
* [Algorithm] Discrete IQL by BY571 in https://github.com/pytorch/rl/pull/1793
* [Doc] More depth in VMAS docs by matteobettini in https://github.com/pytorch/rl/pull/1802
* [BugFix] Remove select() in favor of empty() by vmoens in https://github.com/pytorch/rl/pull/1811
* Bump jinja2 from 3.1.2 to 3.1.3 in /docs by dependabot in https://github.com/pytorch/rl/pull/1812
* [BugFix] Make `TransformedEnv` mirror `allow_done_after_reset` property of base env by matteobettini in https://github.com/pytorch/rl/pull/1810
* [Doc] Update StepCounter doc by skandermoalla in https://github.com/pytorch/rl/pull/1813
* [Feature] Improve info_dict reader by vmoens in https://github.com/pytorch/rl/pull/1809
* [CI, Minor] Regroup Gen-DGRL CI with other libs by vmoens in https://github.com/pytorch/rl/pull/1814
* [Versioning] Housekeeping in setup.py by vmoens in https://github.com/pytorch/rl/pull/1816
* [Feature] TorchRL2Gym conversion by vmoens in https://github.com/pytorch/rl/pull/1795
* [BugFix, CI] Fix snapshop imports in stable CI by vmoens in https://github.com/pytorch/rl/pull/1821
* [Feature] More flexibility in loading PettingZoo by matteobettini in https://github.com/pytorch/rl/pull/1817
* [Docs] Fix doc of ToTensorImage transforms.py by skandermoalla in https://github.com/pytorch/rl/pull/1824
* [BugFix] Fix device of container generated values in transforms by vmoens in https://github.com/pytorch/rl/pull/1827
* [Feature] Atari DQN dataset by vmoens in https://github.com/pytorch/rl/pull/1815
* [Feature] Non-functional objectives (PPO, A2C, Reinforce) by vmoens in https://github.com/pytorch/rl/pull/1804
* [Refactor] change default CKPT_BACKEND to torch by vmoens in https://github.com/pytorch/rl/pull/1830
* pyproject.toml: remove unknown properties by GaetanLepage in https://github.com/pytorch/rl/pull/1828
* [Doc, Feature] Doc improvements for video recording and CSV video formats by vmoens in https://github.com/pytorch/rl/pull/1829
* [Feature] PyTrees in replay buffers by vmoens in https://github.com/pytorch/rl/pull/1831
* [BugFix] Fix sequential step counts by vmoens in https://github.com/pytorch/rl/pull/1838
* [Doc] TED format by vmoens in https://github.com/pytorch/rl/pull/1836
* [Doc] References to TED by vmoens in https://github.com/pytorch/rl/pull/1839
* [BugFix] Temporarily set lazy legacy to True by vmoens in https://github.com/pytorch/rl/pull/1840
* [BugFix] Fix gym info scalar infos by vmoens in https://github.com/pytorch/rl/pull/1842
* [Refactor] LAZY_LEGACY_OP=False by vmoens in https://github.com/pytorch/rl/pull/1832
* [Feature] `serial_for_single` arg in batched envs by vmoens in https://github.com/pytorch/rl/pull/1846
* [BugFix] Fix VD4RL by vmoens in https://github.com/pytorch/rl/pull/1834
* [Doc] Make tutos runnable without colab by vmoens in https://github.com/pytorch/rl/pull/1826
* [Feature] Fine control over devices in collectors by vmoens in https://github.com/pytorch/rl/pull/1835
* [Feature, BugFix] Better thread control in penv and collectors by vmoens in https://github.com/pytorch/rl/pull/1848
* [CI] Update macos image by vmoens in https://github.com/pytorch/rl/pull/1849
* [BugFix] thread setting bug by vmoens in https://github.com/pytorch/rl/pull/1852
* Remove unused completed_keys property from StepCounter. by skandermoalla in https://github.com/pytorch/rl/pull/1854
* [Feature] Submitit run script by albertbou92 in https://github.com/pytorch/rl/pull/1822
* [BugFix] Fix flaky gym penv test by vmoens in https://github.com/pytorch/rl/pull/1853
* [CI] Fix macos build by vmoens in https://github.com/pytorch/rl/pull/1856
New Contributors
* mikemykhaylov made their first contribution in https://github.com/pytorch/rl/pull/1655
* MarCnu made their first contribution in https://github.com/pytorch/rl/pull/1661
* Deep145757 made their first contribution in https://github.com/pytorch/rl/pull/1665
* vaibhav-009 made their first contribution in https://github.com/pytorch/rl/pull/1667
* laszloKopits made their first contribution in https://github.com/pytorch/rl/pull/1669
* ianbarber made their first contribution in https://github.com/pytorch/rl/pull/1752
* dependabot made their first contribution in https://github.com/pytorch/rl/pull/1812
* GaetanLepage made their first contribution in https://github.com/pytorch/rl/pull/1828
**Full Changelog**: https://github.com/pytorch/rl/compare/v0.2.1...v0.3.0