Torchrl

Latest version: v0.7.2


0.7.2

We are releasing TorchRL 0.7.2, a minor update that delivers several important bug fixes to improve the stability and reliability of our library.

This release is particularly crucial as it resolves a critical issue (2840) where, under certain conditions, the device setting of the parallel environment would prevent the tensors in the buffers from being properly cloned. This resulted in rollouts returning the same tensor instances across steps, potentially leading to incorrect behavior and results.

**Due to the severity of this bug, we strongly recommend that all users upgrade to TorchRL 0.7.2 to ensure the accuracy and reliability of their experiments.**
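To make the failure mode of the device-copy bug concrete, here is a minimal, library-free sketch of buffer aliasing. It is a toy model and not TorchRL code: the `rollout` helper, the dictionary buffer, and the `clone` flag are hypothetical names chosen for illustration. It only shows why storing a reference to a buffer that is overwritten in place makes every step of a rollout look like the last one.

```python
import copy


def rollout(num_steps, clone):
    """Collect per-step outputs from a buffer that is overwritten in place."""
    # Stands in for a shared, preallocated buffer (hypothetical toy structure).
    buffer = {"observation": [0.0]}
    steps = []
    for t in range(num_steps):
        # The environment writes the new step into the same buffer.
        buffer["observation"][0] = float(t)
        # Without a copy, every stored step aliases the same underlying object.
        steps.append(copy.deepcopy(buffer) if clone else buffer)
    return steps


aliased = rollout(3, clone=False)
cloned = rollout(3, clone=True)

# All aliased steps show the value written last; cloned steps keep their own.
aliased_obs = [step["observation"][0] for step in aliased]
cloned_obs = [step["observation"][0] for step in cloned]
print(aliased_obs, cloned_obs)
```

In the aliased case the three entries are one and the same object, which is the kind of silent corruption that makes upgrading worthwhile.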

The full list of changes can be found below:

- [Doc] Fix formatting errors by vmoens (2786)
- [BugFix] correct dim for resolving dtype in _split_and_pad_sequence by KubaMichalczyk and vmoens (2801)
- [BugFix] Fix collector with no buffers and devices by vmoens (2809)
- [BE] Fix some typos by antoinebrl and vmoens (2811)
- [Doc] Add docstring for MCTSForest.extend by kurtamohler and vmoens (2795)
- [CI] Fix libs workflows by vmoens (2800)
- [BugFix] Fix env.full_done_spec~s~ by vmoens (2815)
- [BugFix] Fix batch_locked check in check_env_specs + error message ca… by vmoens (2817)
- [BugFix] GAE warning when gamma/lmbda are tensors by louisfaury and vmoens (2838)
- [BugFix] Tree make node fix by rolo and vmoens (2839)
- [BugFix] Fix PEnv device copies by vmoens (2840)

**Full Changelog**: https://github.com/pytorch/rl/compare/v0.7.1...v0.7.2

0.7.1

We are pleased to announce the release of torchrl v0.7.1, which includes several bug fixes, documentation updates, and backend improvements.

Bug Fixes
- Fixed collector timeouts (2774)
- Fixed composite setitem (2778)
- Ensured that Composite.set returns self as TensorDict does (2784)
- Fixed PPOs with composite distribution (2791)
- Used brackets to get non-tensor data in gym envs (2769)
- Avoided calling reset during env init (2770)
- NonTensor should not convert anything to numpy (2771)

Documentation Updates:
- Fixed tutorials (2768)
- Solved ref issues in docstrings (2776)
- Fixed formatting errors (2786)

Backend Improvements:
- Made better logits in cost tests (2775)
- Ensured abstractmethods are implemented for specs (2790)
- Removed deprec specs from tests (2767)

Thank you to antoinebrl and louisfaury for contributing to this release!

**Full Changelog**: https://github.com/pytorch/rl/compare/v0.7.0...v0.7.1

0.7.0

As always, we want to warmly thank the RL community that supports this project. Special thanks to our first-time
contributors:

* priba made their first contribution in https://github.com/pytorch/rl/pull/2543
* carschandler made their first contribution in https://github.com/pytorch/rl/pull/2545
* 4d616e61 made their first contribution in https://github.com/pytorch/rl/pull/2624
* valterschutz made their first contribution in https://github.com/pytorch/rl/pull/2626
* raresdan made their first contribution in https://github.com/pytorch/rl/pull/2616
* oslumbers made their first contribution in https://github.com/pytorch/rl/pull/2609
* codingWhale13 made their first contribution in https://github.com/pytorch/rl/pull/2682

as well as all the users who filed issues, made suggestions, or started discussions here, on [Discord](https://discord.gg/cZs26Qq3Dd),
on the [PyTorch forum](https://discuss.pytorch.org/), or elsewhere! We value your feedback!

BC-Breaking changes and Deprecated behaviors

Removed classes

As announced, we removed the following classes:

- AdditiveGaussianWrapper
- InPlaceSampler
- NormalParamWrapper
- OrnsteinUhlenbeckProcessWrapper

Default MLP config
The default MLP depth has changed from 3 to 0 (i.e., `MLP(in_features=3, out_features=4)` is now equivalent to a regular
`nn.Linear` layer).
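The effect of the new default can be sketched without any dependency by listing the shapes of the linear layers a given depth would produce. This is a toy model under stated assumptions: `mlp_layer_shapes` is a hypothetical helper, `depth` is assumed to count hidden layers, and the hidden width of 32 is an arbitrary illustrative value.

```python
def mlp_layer_shapes(in_features, out_features, depth=0, num_cells=32):
    """Return (in, out) pairs for each linear layer of a toy MLP.

    Assumption: `depth` counts hidden layers, so depth=0 leaves a single
    linear map from inputs to outputs.
    """
    widths = [in_features] + [num_cells] * depth + [out_features]
    return list(zip(widths[:-1], widths[1:]))


# New default (depth=0): a single linear layer.
new_default = mlp_layer_shapes(3, 4)
# Old default (depth=3): three hidden layers plus the output layer.
old_default = mlp_layer_shapes(3, 4, depth=3)
print(new_default)
print(old_default)
```

Code that relied on the old default should pass the depth explicitly to keep the same architecture.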

Locking envs
Environment specs are now locked by default (2729, 2730). This means that `env.observation_spec = spec` is allowed (specs are unlocked and re-locked automatically), but `env.observation_spec["value"] = spec` will not work. The core idea is that we want to cache as much information as we can, such as action keys or whether
the env has dynamic specs. We can only do that if we can guarantee that the env has not been modified, and locking the specs
gives us that guarantee.
Note that a version of this already existed, but it was not as robust as the new one.
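The locking rule can be illustrated with a small, self-contained sketch. It does not use TorchRL: `LockedSpec` and `ToyEnv` are hypothetical classes that only mimic the behavior described above, where assigning a whole spec goes through a setter that refreshes the cache, while in-place edits of a locked spec are rejected.

```python
class LockedSpec(dict):
    """Toy spec container that refuses item assignment while locked."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.locked = False

    def __setitem__(self, key, value):
        if self.locked:
            raise RuntimeError("spec is locked; assign a whole new spec instead")
        super().__setitem__(key, value)


class ToyEnv:
    """Toy environment that caches derived info and keeps its spec locked."""

    def __init__(self, spec):
        self.observation_spec = spec  # goes through the property setter

    @property
    def observation_spec(self):
        return self._observation_spec

    @observation_spec.setter
    def observation_spec(self, spec):
        # Whole-spec assignment: lock the new spec and refresh the cache.
        spec.locked = True
        self._observation_spec = spec
        self._cached_keys = tuple(spec)


env = ToyEnv(LockedSpec(value=1))
env.observation_spec = LockedSpec(value=2, extra=3)  # allowed

try:
    env.observation_spec["value"] = 4  # in-place edit of a locked spec
    in_place_allowed = True
except RuntimeError:
    in_place_allowed = False
print(in_place_allowed, env._cached_keys)
```

Because every modification has to go through the setter, the cached keys can never drift out of sync with the spec, which is the guarantee the caching relies on.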

Changes to composite distributions

TL;DR: We're changing the way log-probs and entropies are collected and written in ProbabilisticTensorDictModule and
in CompositeDistribution. The `"sample_log_prob"` default key will soon be `"<value>_log_prob"` (or
`("path", "to", "<value>_log_prob")` for nested keys). For `CompositeDistribution`, a different log-prob will be
written for each leaf tensor in the distribution. This new behavior is controlled by the
`tensordict.nn.set_composite_lp_aggregate(mode: bool)` function or by the `COMPOSITE_LP_AGGREGATE` environment variable.
We strongly encourage users to adopt the new behavior by setting `tensordict.nn.set_composite_lp_aggregate(False).set()`
at the beginning of their training script.
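The naming rule for the new per-leaf log-prob keys can be written down in a few lines. This is an illustrative sketch only: `log_prob_key` is a hypothetical helper, not a tensordict function, and it simply applies the `"<value>_log_prob"` convention described above to flat and nested keys.

```python
def log_prob_key(sample_key):
    """Map a sample key to its per-leaf log-prob key."""
    if isinstance(sample_key, str):
        return f"{sample_key}_log_prob"
    # Nested keys are tuples: only the last element gets the suffix.
    *prefix, leaf = sample_key
    return (*prefix, f"{leaf}_log_prob")


flat = log_prob_key("action")
nested = log_prob_key(("agents", "action"))
print(flat, nested)
```

Code that reads `"sample_log_prob"` directly will need to look up these per-leaf keys once the new behavior is enabled.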

The behavior of `CompositeDistribution` and its interaction with on-policy losses such as PPO has changed.
The PPO documentation now includes a section about multi-head policies and the examples also give such information.

See the [tensordict v0.7.0 release notes](https://github.com/pytorch/tensordict/releases/tag/v0.7.0) or #2707 to learn more.

[Deprecation] Change the default MLP depth (2746) (12e6bce60) by vmoens ghstack-source-id: bd34b8e9112c4fc3a30bd095e3ac073a7d0b5469
[Deprecation] Gracing old *Spec with v0.8 versioning (2751) (fa697fe59) by vmoens ghstack-source-id: e7c6e0a4b8520da887fe7e602a351c3c72a08c4c
[Deprecation] Remove AdditiveGaussianWrapper (2748) (6c7f4fbda) by vmoens ghstack-source-id: 78f248e1239a04fc5213aa4418a158f741679593
[Deprecation] Remove InPlaceSampler (2750) (0feef11f9) by vmoens ghstack-source-id: eeae1bf0611a5d293f533767eee7b9700e720cc8
[Deprecation] Remove NormalParamWrapper (2747) (a38604e47) by vmoens ghstack-source-id: 4a70178f54f9e25d602c86a0b61248d66f3e39bd
[Deprecation] Remove OrnsteinUhlenbeckProcessWrapper (2749) (0111a8795) by vmoens ghstack-source-id: 401fdfaca2e27122d5a67fc7177e1015047f0098

New features

Compile compatibility

We focused on better compatibility with `torch.compile` across the SOTA training scripts, which now
all accept a `compile=1` argument. The overall speedups range from 1x to 4x.

<img width="566" alt="Screenshot 2025-02-05 at 21 20 54" src="https://github.com/user-attachments/assets/bd36ce4d-426e-4d7d-a4da-ff62eee78240" />

Loss module speedups are displayed in the [README.md](https://github.com/pytorch/rl) page.

Replay buffers are also mostly compatible with compile now (with the notable exception of distributed and memory-mapped ones).

Specs: auto_specs_, `<attr>_spec_unbatched`

You can now use `env.auto_specs_` to set the specs automatically based on a dummy rollout.

For batched environments, the unbatched spec can now be accessed via `env.<attr>_spec_unbatched`. This is useful to
create random policies, for example.

New transforms

We added `TrajCounter` (2532), `Hash` and `Tokenizer` (2648, 2700), and `LineariseReward` (2681).

LazyStackStorage

We provide a new `ListStorage`-based storage (`LazyStackStorage`) that automatically represents samples as a `LazyStackedTensorDict`,
which makes it easy to store ragged tensors, although not contiguously in memory (2723).

ChessEnv

A new `torchrl.envs.ChessEnv` allows users to train agents to play chess!

Tutorials on exporting torchrl modules

We also open-sourced a tutorial on exporting TorchRL modules to run on hardware (2557).

Full list of features

[Feature, Test] Adding tests for envs that have no specs (2621) (c72583f75) by vmoens ghstack-source-id: 4c75691baa1e70f417e518df15c4208cff189950
[Feature,Refactor] Chess improvements: fen, pgn, pixels, san, action mask (2702) (d425777b8) by vmoens ghstack-source-id: f294a2bc99a17911c9b62558d530b148d3c0350f
[Feature] A2C compatibility with compile (2464) (507766a88) by vmoens ghstack-source-id: 66a7f0d1dd82d6463d61c1671e8e0a14ac9a55e7
[Feature] ActionDiscretizer custom sampling (2609) (3da76f006) by oslumbers Co-authored-by: Oliver Slumbers <oliver.slumbers@helsing.ai>
[Feature] Add Hash transform (2648) (50011dcf1) kurtamohler ghstack-source-id: dccf63fe4f9d5f76947ddb7d5dedcff87ff8cdc5
[Feature] Add `Choice` spec (2713) (9368ca68e) kurtamohler ghstack-source-id: afa315a311845ab39ade3e75046f32757f9d94f1
[Feature] Add `LossModule.reset_parameters_recursive` (2546) (218d5bf70) by kurtamohler
[Feature] Add `Stack` transform (2567) (594462d6b) by kurtamohler
[Feature] Add deterministic_sample to masked categorical (2708) (49d9897af) by vmoens ghstack-source-id: d34fcf9b44d7a7c60dbde80b0835189f990ef226
[Feature] Adds ordinal distributions (2520) (c851e1698) by louisfaury Co-authored-by: louisfaury
[Feature] Avoid some recompiles of `ReplayBuffer.extend/sample` (2504) (0f29c7e93) kurtamohler
[Feature] CQL compatibility with compile (2553) (e2be42e82) by vmoens ghstack-source-id: d362d6c17faa0eb609009bce004bb4766e345d5e
[Feature] CROSSQ compatibility with compile (2554) (01a421e76) by vmoens ghstack-source-id: 98a2b30e8f6a1b0bc583a9f3c51adc2634eb8028
[Feature] CatFrames.make_rb_transform_and_sampler (2643) (9ee1ae7ee) by vmoens ghstack-source-id: 7ecf952ec9f102a831aefdba533027ff8c4c29cc
[Feature] ChessEnv (2641) (17983d43e) by vmoens ghstack-source-id: 087c3b12cd621ea11a252b34c4896133697bce1a
[Feature] Composite.batch_size (2597) (2e82cab19) by vmoens ghstack-source-id: 621884a559a71e80a4be36c7ba984fd08be47952
[Feature] Composite.pop (2598) (8d16c12bd) by vmoens ghstack-source-id: 64d5bd736657ef56e37d57726dfcfd25b16b699f
[Feature] Composite.separates (2599) (83e0b0568) by vmoens ghstack-source-id: fbfc4308a81cd96ecc61723df8c0eb870c442def
[Feature] Custom conversion tool for gym specs (2726) (dbc8e2ee0) by vmoens ghstack-source-id: d38bb02f15267a9b1637b3ed25fb44ef013e2456
[Feature] DDPG compatibility with compile (2555) (7d7cd9538) by vmoens ghstack-source-id: f18928a419f81794d6870fd4e9fe1205c1b137e1
[Feature] DQN compatibility with compile (2571) (f149811da) by vmoens ghstack-source-id: 113dc8c4a5562d217ed867ace1942b2f6b8a39f9
[Feature] DT compatibility with compile (2556) (fbfe10488) by vmoens ghstack-source-id: 362b6e88bad4397f35036391729e58f4f7e4a25d
[Feature] Discrete SAC compatibility with compile (2569) (9e2d214fa) by vmoens ghstack-source-id: ddc131acedbbe451b28758e757a8c240ebd72b80
[Feature] Ensure out-place policy compatibility in rollout and collectors (2717) (ec370c6b6) by vmoens ghstack-source-id: 41a6aa56e0a045a20224b96f9537a7ae3ae14494
[Feature] EnvBase.auto_specs_ (2601) (d537dcb63) by vmoens ghstack-source-id: 329679238c5172d7ff13097ceaa189479d4f4145
[Feature] EnvBase.check_env_specs (2600) (00d3199ec) by vmoens ghstack-source-id: 332dbf92db496c71c5ce6aba340ad123eac0f5d6
[Feature] GAIL compatibility with compile (2573) (6482766b8) by vmoens ghstack-source-id: 98c7602ec0343d7a83cb19bddeb579484c42e77e
[Feature] IQL compatibility with compile (2649) (2cfc2abd6) by vmoens ghstack-source-id: 77bca166701d28dd69ef3964f55ab4f3e4b17fed
[Feature] LLMHashingEnv (2635) (30d21e599) by vmoens ghstack-source-id: d1a20ecd023008683cf18cf9e694340cfdbdac8a
[Feature] LazyStackStorage (2723) (fe3f00c6c) by vmoens ghstack-source-id: e9c031470aa0bdafbb2b26c73c06b25685a128e5
[Feature] Linearise reward transform (2681) (ff1ff7e9c) by louisfaury Co-authored-by: louisfaury
[Feature] Log each entropy for composite distributions in PPO (2707) (319bb68f0) by louisfaury Co-authored-by: louisfaury
[Feature] Log pbar rate in SOTA implementations (2662) (1ce25f19a) by vmoens ghstack-source-id: 283cc1bb4ad2d60281296d2cfb78ec41c77f4129
[Feature] MCTSForest (2307) (e9d167711) by vmoens ghstack-source-id: 9ac5cd3de39a4dbe1c7c33cb71ff6f45a886ae65
[Feature] Make PPO compatible with composite actions and log-probs (2665) (256a7002c) by vmoens ghstack-source-id: c41718e697f9b6edda17d4ddb5bd6d41402b7c30
[Feature] PPO compatibility with compile (2652) (f5a187d7d) by vmoens ghstack-source-id: 0ed29f352fcd85f0dc0683d90e95bdbecf6c14f9
[Feature] Re-enable cache for specs (2730) (4262ab91e) by vmoens ghstack-source-id: 797132312bfd9749f8926a2dd0b03eff65b8f51c
[Feature] SAC compatibility with compile (2655) (87a59fb30) by vmoens ghstack-source-id: b57caeaf6e2d3690fb3311f4c9b8cca8575d3974
[Feature] Send info dict to the storage device in RBs (2527) (d524d0d6b) by vmoens ghstack-source-id: 4ed60d649b17f96b49f90d234e679937c60a3c32
[Feature] TD3 compatibility with compile (2658) (1b7eda199) by vmoens ghstack-source-id: fb94307557f2b8604403b48211e3da6fb2139e28
[Feature] TD3-bc compatibility with compile (2657) (91064bc27) by vmoens ghstack-source-id: 8a33e39829f620c1e1a579a0255162ba93eaca91
[Feature] TensorSpec.enumerate() (2354) (14b63e4f0) by vmoens ghstack-source-id: 9db2f5ee47a197eb0403cb4622266fb03b99360f
[Feature] TrajCounter transform (2532) (05aeb8975) by vmoens ghstack-source-id: 62a3091e5c9072f26266143319f30de1729c0d4e
[Feature] UnaryTransform for input entries (2700) (093a1599f) by vmoens ghstack-source-id: bb0ea97f47bdad6ba5e73692969fece4e2efbfb4
[Feature] `example_data` for NonTensor spec (2698) (80690d221) by vmoens ghstack-source-id: 6fe5d82763dfcc9044d6debe88f0f34bb739c987
[Feature] automatically determine return_contiguous (2724) (cac93eb0e) by vmoens ghstack-source-id: 6d1fc31d87cb021e6286cdb07db2d9b0e2302f7d
[Feature] env.step_mdp (2636) (4bc40a808) by vmoens ghstack-source-id: 145e37cd772fdd74e35e5ffe6accc5c81ad689f3
[Feature] flexible batch_locked for jumanji (2382) (35a78139b) by vmoens ghstack-source-id: e356b6511ff3da8a6c583747214cfa90f42c9083
[Feature] lock_ / unlock_ graphs (2729) (601483e71) by vmoens ghstack-source-id: 01e375e636b97b26a89f9bbab2e955db6c85978a
[Feature] multiagent data standardization: PPO advantages (2677) (b7a0d11e5) by matteobettini Co-authored-by: Vincent Moens <vmoens@meta.com>
[Feature] no_cuda_sync arg in collectors (2727) (280297aee) by vmoens ghstack-source-id: 9baba31b3ee844882fd4b6a6f69874946caf3b3e
[Feature] single_<attr>_spec (2549) (58c384713) by vmoens ghstack-source-id: 27e247ea1775e455999a114dd6d95fac748376c4
[Feature] spec.cardinality (2638) (dd26ae79f) by vmoens ghstack-source-id: 1160900f8a81dd51dc72436e1af69c8248bff162
[Feature] spec.is_empty(recurse) (2596) (097d8ad98) by vmoens ghstack-source-id: faa3b1df5133c77462d6dd013d3854d684cc7e94
[Feature] timeit.printevery (2653) (187de7c8b) by vmoens ghstack-source-id: 19165bbfbea5cdc0a6b159493fb02571bab872f3
[Minor,Feature] Add `prefix` arg to `timeit.todict` (2576) (7bc84d15d) by vmoens ghstack-source-id: f1ff685caf6e8950d02dfc44ad2c1eb496495ad1
[Minor,Feature] `group_optimizers` (2577) (7829bd3f3) by vmoens ghstack-source-id: 81a94ed641544a420bb1c455921ca6a17ecd6a22

Doc

[Doc] Add AOTInductor back (2564) (9f8f77cdb) by vmoens ghstack-source-id: 774eb5973045861f284fdc67f74945b1eecdeaf2
[Doc] Add Tokenizer and auto-reset doc link (2754) (ee4006a6b) by vmoens ghstack-source-id: 90f55b568e85ae151bea4370025144c19e74602b
[Doc] Add `Stack` transform link in docs (2689) (c5f1565de) by kurtamohler
[Doc] Adding recurrent policies to export tutorial (2559) (705123870) by vmoens ghstack-source-id: 1f1af399b120db8bbb1789748641f44fd3b1bd5e
[Doc] Better doc for SliceSampler (2607) (90572ac11) by vmoens ghstack-source-id: 7d79ef7d37c4dc2ffbdff5b422cf5da24d93c0da
[Doc] Fix broken links and formatting issues in doc (2574) (5a2d9e205) by vmoens ghstack-source-id: 4e3f84fe436de6a6e9696894cd06318a98e4a23b
[Doc] Fix modules doc (2531) (edbf3dee3) by vmoens
[Doc] Fix tutorials (2560) (2f3b4cd4d) by vmoens ghstack-source-id: 6c9114384015e76e96b3bbd0c8893cc42344537a
[Doc] Fix typo in torchrl/modules/distributions/continuous.py (2624) (b2e9f291a) by Mana
[Doc] Fix typos (2682) (f672c708f) by Nils Kiele Co-authored-by: Vincent Moens <vincentmoens@gmail.com>
[Doc] MADDPG bug fix of buffer device and improve explaination (2519) (3e4b2928e) by matteobettini
[Doc] Minor fixes to the docs and type hints (2548) (50a35f69b) by thomasbbrunner
[Doc] Tutorial on exporting TorchRL models (2557) (c0187a93e) by vmoens ghstack-source-id: b93146e22d8376563e7ac302b5cff95f09ae50d4
[Doc] Typo in docs for actors.py (2545) (19dbeebf0) by carschandler
[Doc] Update docstring for TruncatedNormal with correct parameter names (2625) (d22266d05) by valterschutz Co-authored-by: Valter Schutz <valterschutz@proton.me>
[Doc] actor docstrings (2626) (825779935) by valterschutz Co-authored-by: Valter Schutz <valterschutz@proton.me>
[Doc] fix several typos (2603) (de153bf45) by carschandler
[Doc] torchrl_demo.py revamp (2561) (304e707ef) by vmoens ghstack-source-id: 2f0087850e4a7d4d4393f0662156af9bfca8e3e1
[Example] Efficient Trajectory Sampling with CompletedTrajRepertoire (2642) (b840a772c) by vmoens ghstack-source-id: 4d5c587c69230aa8f3a1b9b6fe19f52fa683d703
[Example] RNN-based policy example (2675) (d009835b4) by vmoens ghstack-source-id: ef0087e9b5cba40be428f57ef70ecd2f63483d03
[Example] Using Collector's device args (2705) (539c2158d) by vmoens ghstack-source-id: 9aec8daa53000bdfd6091be706c7bc46778d5983

Performance

[Performance] Accelerate slice sampler on GPU (2672) (84c3ec322) by vmoens ghstack-source-id: a4dc1515d8b51f5ec150b2fae4e1a84254f2af09
[Performance] Avoid cloning trajs in SliceSampler (2671) (4fd54fef4) by vmoens ghstack-source-id: 2e133fcea716b202694cfa84df3f6e4ba3507bbc
[Performance] Improve performance of compiled ReplayBuffer (2529) (2a07f4c0f) by kurtamohler
[Benchmark] Add benchmark for compiled `ReplayBuffer.extend/sample` (2514) (5e03a5518) kurtamohler ghstack-source-id: d4562697e2c1a8392cf5bdcadb50f8b7b6939e41

Better engineering

[BE] Add trailing spaces when necessary (2581) (600760f5b) by vmoens ghstack-source-id: 198b5b5668cce8336d44206c10dacb8a9b1a9785
[BE] Add type annotation for tensor_keys to facilitate auto-complete (2696) (4b3279a3f) by vmoens ghstack-source-id: b4a8fe38e7c6b028759eef082f65f26036bc0250
[Refactor,CI] Refactor SOTA tests (2583) (c0ba3ff54) by vmoens ghstack-source-id: b14c59bb1ca7bf056bde05fa0abd01fa7e9b3710
[Refactor] Allow safe-tanh for torch >= 2.6.0 (2580) (1474f8517) by vmoens ghstack-source-id: 92df1954451453ee051bbde499f6db5ebaafed49
[Refactor] Deprecate recurrent_mode API to use decorators/CMs instead (2584) (14b277513) by vmoens ghstack-source-id: 80f705e022abc111df3960fc09576d5e266ed4dd
[Refactor] Refactor trees (2634) (57dc25a44) by vmoens ghstack-source-id: 368ba4c4402b6db0bc8b0688802ce161db9776b7
[Refactor] Rename Recorder and LogReward (2616) (607ebc52d) by Goia Rares Dan Tiago
[Refactor] Use <spec>_unbatched in VMAS (2593) (a126a6f94) by vmoens ghstack-source-id: 2190278de44ba59a3bc8d38398fddae9ecc42a84
[Refactor] Use default device instead of CPU in losses (2687) (c3b9d1dc7) by vmoens ghstack-source-id: 8b98062c3ae88d8780ef7428fdfa07e305c790b9
[Refactor] compile compatibility improvements (2578) (db7f08d76) by vmoens ghstack-source-id: 95f8241b56e42b80e828485cb5f377288bff6f5e
[Quality,BE] Better doc for step_mdp (2639) (ef5a37d8a) by vmoens ghstack-source-id: 1f5aed6fb2e97ead9d379f9545ae742f7728c585
[Quality] Better TD construction in codebase (2565) (a4c1ee3b3) by vmoens ghstack-source-id: 9e280d9d7d4a735e5055beb0450d933547530e55
[Quality] Better warning when c++ binaries failed to be imported (2541) (0a13cbd5e) by vmoens
[Quality] IMPALA auto-device (2654) (526b38d5c) by vmoens ghstack-source-id: abbb3048f33c9f7f6a623e32e139871093ea74fa
[Minor] Fix doc and MARL tests (2759) (ad7d2a10b) by vmoens ghstack-source-id: 9308be3ebc7fac30b5bde321792eb97069d55996
[Minor] Fix fbcode imports of mocking classes (2526) (da0bf1897) by vmoens ghstack-source-id: 74f9f3bedf8f48988a1956084548f6cd2f720934
[Minor] Make fbcode happy with imports (2517) (a70b258cd) by vmoens ghstack-source-id: d4bfce9d51269bc0ab6154ee4c2d1e1ff7af0895

Bug fixes

[BugFix, BE] Document and fix fps passing in recorder and loggers (2694) (61e05b3d9) by vmoens ghstack-source-id: b3996a9a27643eb5da8a78135f6b9fcef3685f17
[BugFix,Doc] Fix BATCHED_PIPE_TIMEOUT refs and doc (2695) (dc25a55a7) by vmoens ghstack-source-id: 6e43c4ff1c319545cf0952abf6f35f3e7ed473e0
[BugFix,Doc] Revert dynamic shape in export tutorial (2563) (9d292a007) by vmoens ghstack-source-id: fc856218e840469a5bb0143241d100e9cc612538
[BugFix,Test,Benchmark] Fix graph breaks induced by device context manager (2602) (152bc81b7) by vmoens ghstack-source-id: 0df2728928280a43de4abd30afed20826b0af091
[BugFix,Test] test chess rendering (2721) (ddbb6fdd5) by vmoens ghstack-source-id: 59b37e6fa2f8c11f600eea334da0bd8257ed382c
[BugFix] Account for composite actions in gym (2718) (1246db197) by vmoens ghstack-source-id: c09b59904a89d45fa24a61a5e8a24fe307320794
[BugFix] Account for terminating data in SAC losses (2606) (c8676f4a8) by vmoens ghstack-source-id: dc1870292786c262b4ab6a221b3afb551e0efb9b
[BugFix] ActionDiscretizer scalar integration (2619) (830f2f26c) by vmoens ghstack-source-id: b22102f3730914b125ef0f813f4d2f22dec0b26e
[BugFix] Allow expanding TensorDictPrimer transforms shape with parent batch size (2552) (83a7a57da) by Albert Bou Co-authored-by: Vincent Moens <vmoens@meta.com>
[BugFix] Avoid KeyError in slice sampler (for compile) (2670) (21eeca42c) by vmoens ghstack-source-id: 6e2a3036f0e50d365387cced50a761b97a47317d
[BugFix] Better account of composite distributions in PPO (2622) (90c8e40f6) by vmoens ghstack-source-id: 3d86f99bc5b20a53e4092d786e96a5f7e83405ac
[BugFix] Compatibility of tensordict primers with batched envs (specifically for LSTM and GRU) (2668) (f4709c143) by vmoens ghstack-source-id: e1da58ecfd36ca01b8a11fe90e5f3c5fe77f064c
[BugFix] Fix MARL PPO tutorial action_spec call (2628) (1ca134cc3) by vmoens ghstack-source-id: 1d9058c45b28c0f0279e4243a2a0f96c622a51d8
[BugFix] Fix batching envs with non tensor data (2674) (ab4250ec7) by vmoens ghstack-source-id: daba8a95459cfa978da09291757b6380fab4f308
[BugFix] Fix call to tree.plot in tests (2547) (09d6866e0) by vmoens ghstack-source-id: 4a5babbf46294ab6ed4a791e26cfacaf3a41a2e0
[BugFix] Fix collector length with non-empty batch size (2575) (b87597922) by vmoens ghstack-source-id: 0c6a7a49f0570fad083340a64dd89c0f4c220c06
[BugFix] Fix compile weakrefs errors (2742) (ffa99b2a2) by vmoens ghstack-source-id: 3cb4c62f465a3c0581064b3ff89290b9d225eb3f
[BugFix] Fix device transfer for collectors with init_random_frames mixed devices (2704) (1d45117ba) by vmoens ghstack-source-id: 1684399a7c84dd19b396db6c903fbf68c971c73d
[BugFix] Fix export aoti_compile_and_package API change (2629) (1cffffee9) by vmoens ghstack-source-id: 07a0f063f8955815157c2a3eac02c6460a82f672
[BugFix] Fix failing tests (2582) (863121a27) by vmoens ghstack-source-id: a43a2e3dbf76cd63c57ae00028df04b41a4e2f2b
[BugFix] Fix get_default_device calls in older PT versions (2586) (705ecc2bb) by vmoens ghstack-source-id: fd3a739d38feba075073801dda362be598822a94
[BugFix] Fix imports (2605) (d90b9e3d1) by vmoens ghstack-source-id: db85f2611c1c0b22e9179b4fdd6c2dcea78ac8dd
[BugFix] Fix init_random_frames=0 (2645) (19dfefc84) by vmoens ghstack-source-id: 38a544ea15631f9affb4c385c09e7c4df94af55d
[BugFix] Fix missing min/max alpha clamps in losses (2684) (ed656a15f) by vmoens
[BugFix] Fix output of `SipHash(as_tensor=False)` (2664) (1fc9577c4) by kurtamohler
[BugFix] Fix partial device transfers in collector (2703) (afb81de51) by vmoens ghstack-source-id: 2cd74c2d6fceaf079122ae801b67bdbfc29cddaf
[BugFix] Fix pendulum device (2516) (6799a7f5d) by vmoens ghstack-source-id: bcaf20de6e317d4bda0e1511e0b1e46653a6f352
[BugFix] Fix safe probabilistic backward by removing in-place modif (2755) (2f8c118e3) by vmoens ghstack-source-id: 574eb1f9b662c1eb5be25e97020e11b3fadf625e
[BugFix] Fix tests failing because of https://github.com/pytorch/pytorch/pull/137602 (165163abe) by vmoens
[BugFix] Fix typing for python 3.9 (2631) (e7062a1d6) by vmoens ghstack-source-id: 663da84096214611804a726e2d38d27a6f21c958
[BugFix] Fix typing in chess env (2646) (cb8e241b2) by vmoens ghstack-source-id: ad6086bbb7d1ee528ca24ec1d1232da47372e2b5
[BugFix] Fix typing in llm env (2647) (e3c304733) by vmoens ghstack-source-id: b5608f91756b5a81141941903158417a111e0710
[BugFix] Fix version parsing in extensions (2542) (997d90e1b) by vmoens ghstack-source-id: 903f2b01b508b81b1b4f92c4297d390da79fe8a2
[BugFix] PettingZoo dict action spaces (2692) (1a6c9e2d0) by matteobettini
[BugFix] Remove erroneous python 3.8 compatibility classifier (2540) (528875a9f) by vmoens
[BugFix] Remove raisers in specs (2651) (bb6f87adb) by vmoens ghstack-source-id: a005a62847aa2ff1d286f2c4ad13fd14f9e631d3
[BugFix] Rename RayCollector example file to avoid ImportError (2525) (8eac84ad2) by Albert Bou
[BugFix] Support for tensor collection in the `PPOLoss` (2543) (0eabb7897) by Pau Riba Co-authored-by: Pau Riba <pau.riba@helsing.ai>
[BugFix] Temporarily remove unsafe caching in envs (2728) (dc63e820d) by vmoens ghstack-source-id: a139cf6dc9fcfcfa525a6aa6375163d379593550
[BugFix] Wrong spec returned (2604) (a1e21f598) by matteobettini
[BugFix] action_spec_unbatched whenever necessary (2592) (d30599ec0) by vmoens ghstack-source-id: ec87794dabaf5023dac85cfc898a7c000e93331d
[BugFix] adapt log-prob TD batch-size to advantage shape in PPO (2756) (cb37521e1) by vmoens ghstack-source-id: 8ccd12f65f4a74a42356a630e0e5a1f015337d4a
[BugFix] make buffers zero-dim in exploration modules (2591) (a47b32c07) by vmoens ghstack-source-id: fd2705eb9132169da4871b27b354f7895c644061
[BugFix] patch rand_action in TransformedEnv to read the base_env method (2699) (2c19fcc70) by vmoens ghstack-source-id: 04e2e85e2675cf34c349ebadb8fa85a5aff2e532
[BugFix] requested_frames_per_batch in distributed collectors (2579) (408cf7d04) by vmoens ghstack-source-id: 49289de6956460d9aed13d982eb8003eafc35118
[BugFix] skip_done_states in SAC (2613) (de61e4d5e) by vmoens ghstack-source-id: 39d97360e3b0e45dd8c327487eac50ddafe2254d

CI and Tests

[Test] Add tests and a few fixes for ChessEnv (2661) (7bbd7e3b6) kurtamohler ghstack-source-id: d0fbb520e35c74305041340722a7560ac2f958f2
[Test] Add tests for CatFrames with PermuteTransform (2715) (d4e401993) kurtamohler ghstack-source-id: e554d1cda8d7e4458c9397f1f93345c855e68e5c
[Test] Add tests for Tree (2738) (bb9440b40) kurtamohler ghstack-source-id: 8f7aa07a4d36aa3664eaa19cc35bd66fb9e61c24
[Test] Fix warnings in SOTA tests (2710) (a90106475) by vmoens ghstack-source-id: c79223b5d6548a6c5a6ef649f6eb8e1703258815
[Test] More comprehensive tests for auto_spec (2640) (6c7d233a4) by vmoens ghstack-source-id: 75352490436fd706af3d36f9b8016e80a8a3f46a
[Test] Skip tokenizer tests if transformers is not in workspace (2744) (20a19fe2a) by vmoens ghstack-source-id: b92facfd14cba62511e7888567c94d3986419ab5
[Test] Str2StrEnv test (2725) (5fd509232) by vmoens ghstack-source-id: 45a0e5f4b33c4624758171b9fe31f1e3932ff5e4
[CI, BugFix] Py3.8 for old deps (2568) (f3275dab0) by vmoens ghstack-source-id: 13c7923c0e5c8725c12c3bacc6c21b250d9f7457
[CI] Change doc image (2632) (2511c04a5) by vmoens ghstack-source-id: eceab242294ec55135d79f29e848345a5d5d455e
[CI] Cuda 12.4 (2733) (37a514d6c) by vmoens ghstack-source-id: 2f3842a17d03e530add9608ee4525347a7c6a0e5
[CI] Fix Cairo-2 Chess import error (2743) (10f015e0c) by vmoens ghstack-source-id: c2bcbfc4522bd1b4f1fea3dbb006dc9552b09cb4
[CI] Fix docs upload (2587) (0f592266f) by vmoens ghstack-source-id: 49d7df06340fc432c29cd9f2d0ed2ae3d5619a38
[CI] Fix dreamer run in SOTA tests (2627) (aed03fda4) by vmoens ghstack-source-id: dfe3ab6fe0d29fcdcaf57f31f84d04e07e36bad3
[CI] Fix nightly build (2666) (133d70936) by vmoens ghstack-source-id: 5502fa94b6abcc154e020dcb165093fdc30ca025
[CI] Fix olddeps dm_control (2734) (3ac61270f) by vmoens ghstack-source-id: 750edcb8cd6b17167f77fb7c9ebd538608cfbde6
[CI] Fix windows build (2760) (03f56ffb0) by vmoens
[CI] Install stable torch/tv in docs when on release branch (2761) (57bdc6aec) by vmoens ghstack-source-id: 7c39c049c7cff0ee112be2d07597f2e291d2fafd
[CI] Local import of PIL (2720) (d628a507f) by vmoens ghstack-source-id: 6eb4ace11022632e902a7277dd51344bb9fe1f65
[CI] Longer timeout for windows (2765) (4c06ce2b8) by vmoens ghstack-source-id: 381e7e39d650e0178178a78076321a2210237b39
[CI] Make MAX_IDLE_COUNT a feature of tests (2752) (963f3cdf6) by vmoens ghstack-source-id: 2bf31dfff3d7862a54abeea86c8c5cc47a0f302d
[CI] Remove gym import in test_libs.py (2719) (f2cf5e044) by vmoens ghstack-source-id: b0474588cfc81ed135d70efb58203c0b503f4ff0
[CI] Revert upgrade of upload image in docs (2585) (236d38f8a) by vmoens ghstack-source-id: f323dd2667a073b6c763ed17a793ecd0eec6b7be
[CI] Upgrade GHA versions (2740) (cd4f359ef) by vmoens ghstack-source-id: 1876f1f0c18cb11c74edc9d96c17fdc985bc7b1a
[CI] Upgrade cu121 to cu124 (2764) (5da1f6522) by vmoens ghstack-source-id: 4b3c9c0c31a60a5e151ff13b21e54853dc426416
[CI] Upgrade to v0.7 (2745) (0ecfbe36e) by vmoens ghstack-source-id: e548bbbb4578d44a8eee000ab0a40c89713afc27
[CI] linux_job_v2.yml (2570) (527a26a27) by vmoens ghstack-source-id: ae13b53bd2885263e80019c087171421f5f7d0d5
[CI] minari[hf] (2722) (dda0df165) by vmoens ghstack-source-id: 6eb84d906dfbc66839706f328e214014aef7b65f
[CI] workflow permissions (2706) (b000685f3) by vmoens ghstack-source-id: f520a1b1e7697b1147cb29e66e2ecb1d07cb4cbc

0.6.0

What's Changed

We introduce wrappers for [**ML-Agents**](https://github.com/Unity-Technologies/ml-agents) and [**OpenSpiel**](https://github.com/google-deepmind/open_spiel). See the doc [here for OpenSpiel](https://pytorch.org/rl/main/reference/generated/torchrl.envs.OpenSpielEnv.html?highlight=openspiel#torchrl.envs.OpenSpielEnv) and [here for MLAgents](https://pytorch.org/rl/main/reference/generated/torchrl.envs.UnityMLAgentsEnv.html#torchrl.envs.UnityMLAgentsEnv).

We introduce support for partial steps (2377, 2381), allowing you to run rollouts that end only when all envs are done, without resetting those that have already reached a termination point.
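The control flow behind partial steps can be sketched with plain Python. This is a toy model, not the TorchRL rollout: `rollout_until_all_done` and its arguments are hypothetical, and each "env" is reduced to a fixed episode length. The point is only that finished envs are skipped rather than reset, and the loop stops once all of them are done.

```python
def rollout_until_all_done(episode_lengths, max_steps):
    """Step a batch of toy envs, skipping finished ones instead of resetting them."""
    steps_taken = [0] * len(episode_lengths)
    done = [False] * len(episode_lengths)
    for _ in range(max_steps):
        if all(done):
            break  # stop as soon as every env has terminated
        for i, length in enumerate(episode_lengths):
            if done[i]:
                continue  # partial step: a finished env is left untouched
            steps_taken[i] += 1
            done[i] = steps_taken[i] >= length
    return steps_taken, done


# Every env runs exactly one episode, whatever its length.
steps_taken, done = rollout_until_all_done([2, 5, 3], max_steps=100)
# With a tight step budget, the longest episode is cut short.
short_steps, short_done = rollout_until_all_done([2, 5, 3], max_steps=4)
print(steps_taken, done)
print(short_steps, short_done)
```

Each env contributes exactly one episode to the result, which is convenient for evaluation runs where extra post-reset steps would skew the statistics.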

We add the capability of passing replay buffers directly to data collectors, to avoid inter-process synced communications - thereby drastically speeding up data collection. See the [doc of the collectors](https://pytorch.org/rl/main/reference/collectors.html) for more info.

The GAIL algorithm has also been integrated in the library (2273).

We ensure that all loss modules are compatible with `torch.compile` without graph breaks (for a typical build). Compiled losses usually run about 2x faster than their eager counterparts.

Finally, we have sadly decided not to support Gymnasium v1.0 and future releases as the new autoreset API is fundamentally incompatible with TorchRL. Furthermore, it does not guarantee the same level of reproducibility as previous releases. See [this discussion](https://github.com/pytorch/rl/discussions/2483) for more information.

We provide wheels for aarch64 machines. Since we are unable to upload them to PyPI, they are attached to these release notes.

Deprecations
* [Deprecation] Deprecate default num_cells in MLP (2395) by vmoens
* [Deprecations] Deprecate in view of v0.6 release 2446 by vmoens

New environments
* [Feature] Add `OpenSpielWrapper` and `OpenSpielEnv` (2345) by kurtamohler
* [Feature] Add env wrapper for Unity MLAgents (2469) by kurtamohler

New features
* [Feature] Add `group_map` support to MLAgents wrappers (2491) by kurtamohler
* [Feature] Add scheduler for alpha/beta parameters of PrioritizedSampler (2452) Co-authored-by: Vincent Moens by LTluttmann
* [Feature] Check number of kwargs matches num_workers (2465) Co-authored-by: Vincent Moens by antoine.broyelle
* [Feature] Compiled and cudagraph for policies 2478 by vmoens
* [Feature] Consistent Dropout (2399) Co-authored-by: Vincent Moens by depictiger
* [Feature] Deterministic sample for Masked one-hot 2440 by vmoens
* [Feature] Dict specs in vmas (2415) Co-authored-by: Vincent Moens by 55539777+matteobettini
* [Feature] Ensure transformation keys have the same number of elements (2466) by f.broyelle
* [Feature] Make benchmarked losses compatible with torch.compile 2405 by vmoens
* [Feature] Partial steps in batched envs 2377 by vmoens
* [Feature] Pass replay buffers to MultiaSyncDataCollector 2387 by vmoens
* [Feature] Pass replay buffers to SyncDataCollector 2384 by vmoens
* [Feature] Prevent loading existing mmap files in storages if they already exist 2438 by vmoens
* [Feature] RNG for RBs (2379) by vmoens
* [Feature] Randint on device for buffers 2470 by vmoens
* [Feature] SAC compatibility with composite distributions. (2447) by albertbou92
* [Feature] Store MARL parameters in module (2351) by vmoens
* [Feature] Support wrapping IsaacLab environments with GymEnv (2380) by yu-fz
* [Feature] TensorDictMap 2306 by vmoens
* [Feature] TensorDictMap Query module 2305 by vmoens
* [Feature] TensorDictMap hashing functions 2304 by vmoens
* [Feature] break_when_all_done in rollout 2381 by vmoens
* [Feature] inline `hold_out_net` 2499 by vmoens
* [Feature] replay_buffer_chunk 2388 by vmoens

New Algorithms
* [Algorithm] GAIL (2273) Co-authored-by: Vincent Moens by Sebastian.dittert

Fixes
* [BugFix, CI] Set `TD_GET_DEFAULTS_TO_NONE=1` in all CIs (2363) by vmoens
* [BugFix] Add `MultiCategorical` support in PettingZoo action masks (2485) Co-authored-by: Vincent Moens by matteobettini
* [BugFix] Allow for composite action distributions in PPO/A2C losses (2391) by albertbou92
* [BugFix] Avoid `reshape(-1)` for inputs to `DreamerActorLoss` (2496) by kurtamohler
* [BugFix] Avoid `reshape(-1)` for inputs to `objectives` modules (2494) Co-authored-by: Vincent Moens by kurtamohler
* [BugFix] Better dumps/loads (2343) by vmoens
* [BugFix] Extend RB with lazy stack 2453 by vmoens
* [BugFix] Extend RB with lazy stack (revamp) 2454 by vmoens
* [BugFix] Fix Compose input spec transform (2463) Co-authored-by: Louis Faury by louisfaury
* [BugFix] Fix DeviceCastTransform 2471 by vmoens
* [BugFix] Fix LSTM in GAE with vmap (2376) by vmoens
* [BugFix] Fix MARL-DDPG tutorial and other MODE usages (2373) by vmoens
* [BugFix] Fix displaying of tensor sizes in buffers 2456 by vmoens
* [BugFix] Fix dumps for SamplerWithoutReplacement (2506) by vmoens
* [BugFix] Fix get-related errors (2361) by vmoens
* [BugFix] Fix invalid CUDA ID error when loading Bounded variables across devices (2421) by cbhua
* [BugFix] Fix listing of updated keys in collectors (2460) by vmoens
* [BugFix] Fix old deps tests 2500 by vmoens
* [BugFix] Fix support for MiniGrid envs (2416) by kurtamohler
* [BugFix] Fix tictactoeenv.py 2417 by vmoens
* [BugFix] Fixes to RenameTransform (2442) Co-authored-by: Vincent Moens by thomasbbrunner
* [BugFix] Make sure keys are exclusive in envs (1912) by vmoens
* [BugFix] TensorDictPrimer updates spec instead of overwriting (2332) Co-authored-by: Vincent Moens by matteobettini
* [BugFix] Use a RL-specific NO_DEFAULT instead of TD's one (2367) by vmoens
* [BugFix] compatibility to new Composite dist log_prob/entropy APIs 2435 by vmoens
* [BugFix] torch 2.0 compatibility fix 2475 by vmoens

Performance
* [Performance] Faster `CatFrames.unfolding` with `padding="same"` (2407) by kurtamohler
* [Performance] Faster `PrioritizedSliceSampler._padded_indices` (2433) by kurtamohler
* [Performance] Faster `SliceSampler._tensor_slices_from_startend` (2423) by kurtamohler
* [Performance] Faster target update using foreach (2046) by vmoens

Documentation
* [Doc] Better doc for inverse transform semantic 2459 by vmoens
* [Doc] Correct minor erratum in `knowledge_base` entry (2383) by depictiger
* [Doc] Document losses in README.md 2408 by vmoens
* [Doc] Fix README example (2398) by vmoens
* [Doc] Fix links to tutos (2409) by vmoens
* [Doc] Fix pip3install typos in Readme (2342) by 43245438+TheRisenPhoenix
* [Doc] Fix policy in getting started (2429) by vmoens
* [Doc] Fix tutorials for release 2476 by vmoens
* [Doc] Fix wrong default value for flatten_tensordicts in ReplayBufferTrainer (2502) by vmoens
* [Doc] Minor fixes to comments and docstrings (2443) by thomasbbrunner
* [Doc] Refactor README (2352) by vmoens
* [Docs] Use more appropriate ActorValueOperator in PPOLoss documentation (2350) by GaetanLepage
* [Documentation] README rewrite and broken links (2023) by vmoens

Not user facing
* [CI, BugFix] Fix CI (2489) by vmoens
* [CI] Add aarch64-linux wheels (2434) by vmoens
* [CI] Disable compile tests on windows 2510 by vmoens
* [CI] Fix 3.12 gymnasium installation 2474 by vmoens
* [CI] Fix CI errors (2394) by vmoens
* [CI] Fix GPU benchmark upload (2508) by vmoens
* [CI] Fix Minari tests (2419) Co-authored-by: Vincent Moens by 42100908+younik
* [CI] Fix benchmark workflows (2488) by vmoens
* [CI] Fix broken workflows (2418) by vmoens
* [CI] Fix broken workflows (2428) by vmoens
* [CI] Fix gymnasium version in minari 2512 by vmoens
* [CI] Fix h5py dependency in olddeps 2513 by vmoens
* [CI] Fix windows build legacy 2450 by vmoens
* [CI] Fix windows compile tests 2511 by vmoens
* [CI] Remove 3.8 jobs 2412 by vmoens
* [CI] Resolve DMC and mujoco pinned versions (2396) by vmoens
* [CI] Run docs on all PRs (2413) by vmoens
* [CI] pin DMC and mujoco (2374) by vmoens
* [Minor] Fix test_compose_action_spec (2493) Co-authored-by: Louis Faury by louisfaury
* [Minor] Fix typos in `advantages.py` (2492) Co-authored-by: Louis Faury by louisfaury
* [Quality] Split utils.h and utils.cpp (2348) by vmoens
* [Refactor] Limit the deepcopies in collectors 2451 by vmoens
* [Refactor] Refactor calls to get without default that raise KeyError (2353) by vmoens
* [Refactor] Rename specs to simpler names (2368) by vmoens
* [Refactor] Use empty_like in storage construction 2455 by vmoens
* [Versioning] Fix torch deps (2340) by vmoens
* [Versioning] Gymnasium 1.0 incompatibility errors 2484 by vmoens
* [Versioning] Versions for 0.6 (2509) by vmoens

New Contributors
As always, we want to show how appreciative we are of the vibrant open-source community that keeps TorchRL alive.

* yu-fz made their first contribution in https://github.com/pytorch/rl/pull/2380
* cbhua made their first contribution in https://github.com/pytorch/rl/pull/2421
* younik made their first contribution in https://github.com/pytorch/rl/pull/2419
* thomasbbrunner made their first contribution in https://github.com/pytorch/rl/pull/2442
* LTluttmann made their first contribution in https://github.com/pytorch/rl/pull/2452
* louisfaury made their first contribution in https://github.com/pytorch/rl/pull/2463
* antoinebrl made their first contribution in https://github.com/pytorch/rl/pull/2466

**Full Changelog**: https://github.com/pytorch/rl/compare/v0.5.0...v0.6.0

0.5.0

What's Changed

This new release makes it possible to run environments that output non-tensor data (1944).

We also introduce dynamic specs, allowing environments to change the size of the observations / actions during the course of a rollout. This feature is compatible with parallel environments and collectors (2143)!

Additionally, it is now possible to update a Replay Buffer in-place by assigning values at a given index (2224).
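
To make the idea of index assignment concrete, here is a library-free sketch. `ToyReplayBuffer` is a hypothetical list-backed class written for this example and is not TorchRL's implementation; it only shows what "assigning values at a given index" means for stored transitions.

```python
class ToyReplayBuffer:
    """Hypothetical list-backed buffer; not TorchRL's implementation."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._storage = []

    def extend(self, items):
        for item in items:
            if len(self._storage) >= self.capacity:
                raise RuntimeError("buffer is full")
            self._storage.append(item)

    def __len__(self):
        return len(self._storage)

    def __getitem__(self, index):
        return self._storage[index]

    def __setitem__(self, index, value):
        # In-place update: overwrite the stored entry, leave the others untouched.
        self._storage[index] = value


rb = ToyReplayBuffer(capacity=4)
rb.extend({"obs": i, "reward": 0.0} for i in range(3))
rb[1] = {"obs": 1, "reward": 1.0}  # relabel one stored transition
print(rb[1]["reward"], len(rb))  # → 1.0 3
```

A typical use for this kind of update is relabelling rewards or priorities of transitions that are already stored, without rebuilding the buffer.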

Finally, TorchRL is now compatible with Python 3.12 (2282, 2281).

As always, a huge thanks to the vibrant OSS community that helps us develop this library!

New algorithms
* [Algorithm] CrossQ by BY571 in https://github.com/pytorch/rl/pull/2033
* [Algorithm] TD3+BC by BY571 in https://github.com/pytorch/rl/pull/2249

Features
* [Feature] ActionDiscretizer by vmoens in https://github.com/pytorch/rl/pull/2247
* [Feature] Add KL approximation in PPO loss metadata by albertbou92 in https://github.com/pytorch/rl/pull/2166
* [Feature] Add `modules.AdditiveGaussianModule` by kurtamohler in https://github.com/pytorch/rl/pull/2296
* [Feature] Add `modules.OrnsteinUhlenbeckProcessModule` by kurtamohler in https://github.com/pytorch/rl/pull/2297
* [Feature] Autocomplete for losses by vmoens in https://github.com/pytorch/rl/pull/2148
* [Feature] Crop Transform by albertbou92 in https://github.com/pytorch/rl/pull/2336
* [Feature] Dynamic specs by vmoens in https://github.com/pytorch/rl/pull/2143
* [Feature] Extract primers from modules that contain RNNs by albertbou92 in https://github.com/pytorch/rl/pull/2127
* [Feature] Jumanji from_pixels=True by vmoens in https://github.com/pytorch/rl/pull/2129
* [Feature] Make ProbabilisticActor compatible with Composite distributions by vmoens in https://github.com/pytorch/rl/pull/2220
* [Feature] Replay buffer checkpointers by vmoens in https://github.com/pytorch/rl/pull/2137
* [Feature] Some improvements to VecNorm by vmoens in https://github.com/pytorch/rl/pull/2251
* [Feature] Split-trajectories and represent as nested tensor by vmoens in https://github.com/pytorch/rl/pull/2043
* [Feature] _make_ordinal_device by vmoens in https://github.com/pytorch/rl/pull/2237
* [Feature] assigning values to RB storage by vmoens in https://github.com/pytorch/rl/pull/2224

Bug fixes
* [BugFix,Feature] Allow non-tensor data in envs by vmoens in https://github.com/pytorch/rl/pull/1944
* [BugFix] Allow zero alpha value for PrioritizedSampler by albertbou92 in https://github.com/pytorch/rl/pull/2164
* [BugFix] Expose MARL modules by vmoens in https://github.com/pytorch/rl/pull/2321
* [BugFix] Fit vecnorm out_keys by vmoens in https://github.com/pytorch/rl/pull/2157
* [BugFix] Fix Brax by vmoens in https://github.com/pytorch/rl/pull/2233
* [BugFix] Fix OOB sampling in PrioritizedSliceSampler by vmoens in https://github.com/pytorch/rl/pull/2239
* [BugFix] Fix VecNorm test in test_collectors.py by vmoens in https://github.com/pytorch/rl/pull/2162
* [BugFix] Fix `to` in MultiDiscreteTensorSpec by Quinticx in https://github.com/pytorch/rl/pull/2204
* [BugFix] Fix and test PRB priority update across dims and rb types by vmoens in https://github.com/pytorch/rl/pull/2244
* [BugFix] Fix another ctx test by vmoens in https://github.com/pytorch/rl/pull/2284
* [BugFix] Fix async gym env with non-sync resets by vmoens in https://github.com/pytorch/rl/pull/2170
* [BugFix] Fix async gym when all reset by vmoens in https://github.com/pytorch/rl/pull/2144
* [BugFix] Fix brax wrapping by vmoens in https://github.com/pytorch/rl/pull/2190
* [BugFix] Fix collector tests where device ordinal is needed by vmoens in https://github.com/pytorch/rl/pull/2240
* [BugFix] Fix collectors with non tensors by vmoens in https://github.com/pytorch/rl/pull/2232
* [BugFix] Fix done/terminated computation in slice samplers by vmoens in https://github.com/pytorch/rl/pull/2213
* [BugFix] Fix info reading with async gym by vmoens in https://github.com/pytorch/rl/pull/2150
* [BugFix] Fix isaac - bis by vmoens in https://github.com/pytorch/rl/pull/2119
* [BugFix] Fix lib tests by vmoens in https://github.com/pytorch/rl/pull/2218
* [BugFix] Fix max value within buffer during update priority by vmoens in https://github.com/pytorch/rl/pull/2242
* [BugFix] Fix max-priority update by vmoens in https://github.com/pytorch/rl/pull/2215
* [BugFix] Fix non-tensor passage in _StepMDP by vmoens in https://github.com/pytorch/rl/pull/2260
* [BugFix] Fix non-tensor passage in _StepMDP by vmoens in https://github.com/pytorch/rl/pull/2262
* [BugFix] Fix prefetch in samples without replacement - .sample() compatibility issues by vmoens in https://github.com/pytorch/rl/pull/2226
* [BugFix] Fix sampling in NonTensorSpec by vmoens in https://github.com/pytorch/rl/pull/2172
* [BugFix] Fix sampling of values from NonTensorSpec by vmoens in https://github.com/pytorch/rl/pull/2169
* [BugFix] Fix slice sampler end computation at the cursor place by vmoens in https://github.com/pytorch/rl/pull/2225
* [BugFix] Fix sliced PRB when only traj is provided by vmoens in https://github.com/pytorch/rl/pull/2228
* [BugFix] Fix strict length in PRB+SliceSampler by vmoens in https://github.com/pytorch/rl/pull/2202
* [BugFix] Fix strict_length in prioritized slice sampler by vmoens in https://github.com/pytorch/rl/pull/2194
* [BugFix] Fix tanh normal mode by vmoens in https://github.com/pytorch/rl/pull/2198
* [BugFix] Fix tensordict private imports by vmoens in https://github.com/pytorch/rl/pull/2275
* [BugFix] Fix test_specs.py by vmoens in https://github.com/pytorch/rl/pull/2214
* [BugFix] Fix torch 2.3 compatibility of padding indices by vmoens in https://github.com/pytorch/rl/pull/2216
* [BugFix] Fix truncated normal by vmoens in https://github.com/pytorch/rl/pull/2147
* [BugFix] Fix typo in weight assignment in PRB by vmoens in https://github.com/pytorch/rl/pull/2241
* [BugFix] Fix update_priority generic signature for Samplers by vmoens in https://github.com/pytorch/rl/pull/2252
* [BugFix] Fix vecnorm state-dicts by vmoens in https://github.com/pytorch/rl/pull/2158
* [BugFix] Global import of optional library by matteobettini in https://github.com/pytorch/rl/pull/2217
* [BugFix] Gym async with _reset full of `True` by vmoens in https://github.com/pytorch/rl/pull/2145
* [BugFix] MLFlow logger by GJBoth in https://github.com/pytorch/rl/pull/2152
* [BugFix] Make DMControlEnv aware of truncated signals by vmoens in https://github.com/pytorch/rl/pull/2196
* [BugFix] Make `_reset` follow `done` shape by matteobettini in https://github.com/pytorch/rl/pull/2189
* [BugFix] `EnvBase._complete_done` to complete "terminated" key properly by kurtamohler in https://github.com/pytorch/rl/pull/2294
* [BugFix] `LazyTensorStorage` only allocates data on the given device by matteobettini in https://github.com/pytorch/rl/pull/2188
* [BugFix] `done = done | truncated` in collector by vmoens in https://github.com/pytorch/rl/pull/2333
* [BugFix] buffer __iter__ for samplers without replacement + prefetch by JulianKu in https://github.com/pytorch/rl/pull/2185
* [BugFix] buffer `__iter__` for samplers without replacement + prefetch by JulianKu in https://github.com/pytorch/rl/pull/2178
* [BugFix] missing deprecated kwargs by fedebotu in https://github.com/pytorch/rl/pull/2125

Docs
* [Doc] Add Custom Options for VideoRecorder by N00bcak in https://github.com/pytorch/rl/pull/2259
* [Doc] Add documentation for masks in tensor specs by kurtamohler in https://github.com/pytorch/rl/pull/2289
* [Doc] Better doc for make_tensordict_primer by vmoens in https://github.com/pytorch/rl/pull/2324
* [Doc] Dynamic envs by vmoens in https://github.com/pytorch/rl/pull/2191
* [Doc] Edit README for local installs by vmoens in https://github.com/pytorch/rl/pull/2255
* [Doc] Fix algorithms references in tutos by vmoens in https://github.com/pytorch/rl/pull/2320
* [Doc] Fix documentation mismatch for default argument by TheRisenPhoenix in https://github.com/pytorch/rl/pull/2149
* [Doc] Fix links in doc by vmoens in https://github.com/pytorch/rl/pull/2151
* [Doc] Fix mistakes in docs for Trainer checkpointing backends by kurtamohler in https://github.com/pytorch/rl/pull/2285
* [Doc] Indicate necessary context to run multiprocessed collectors in doc by GJBoth in https://github.com/pytorch/rl/pull/2126
* [Doc] Restore colab links by vmoens in https://github.com/pytorch/rl/pull/2197
* [Doc] Update README.md by KPCOFGS in https://github.com/pytorch/rl/pull/2155
* [Doc] default_interaction_type doc by vmoens in https://github.com/pytorch/rl/pull/2177
* [Docs] InitTracker cleanup by matteobettini in https://github.com/pytorch/rl/pull/2330
* [Docs] Reintroduce BenchMARL pointers in MARL tutos by matteobettini in https://github.com/pytorch/rl/pull/2159

Performance
* [Performance, Refactor, BugFix] Faster loading of uninitialized storages by vmoens in https://github.com/pytorch/rl/pull/2221
* [Performance] consolidate TDs in ParallelEnv without buffers by vmoens in https://github.com/pytorch/rl/pull/2231

Others
* Fix "Run in Colab" and "Download Notebook" links in tutorials by kurtamohler in https://github.com/pytorch/rl/pull/2268
* Fix brax examples by Jendker in https://github.com/pytorch/rl/pull/2318
* Fixed several broken links in readme.md by drMJ in https://github.com/pytorch/rl/pull/2156
* Revert "[BugFix] Fix non-tensor passage in _StepMDP" by vmoens in https://github.com/pytorch/rl/pull/2261
* Revert "[BugFix] Fix tensordict private imports" by vmoens in https://github.com/pytorch/rl/pull/2276
* Revert "[BugFix] buffer `__iter__` for samplers without replacement + prefetch" by vmoens in https://github.com/pytorch/rl/pull/2182
* [CI, Tests] Fix windows tests by vmoens in https://github.com/pytorch/rl/pull/2337
* [CI] Bump jinja2 from 3.1.3 to 3.1.4 in /docs by dependabot in https://github.com/pytorch/rl/pull/2250
* [CI] Fix CI by vmoens in https://github.com/pytorch/rl/pull/2245
* [CI] Fix nightly by vmoens in https://github.com/pytorch/rl/pull/2279
* [CI] Fix wheels by vmoens in https://github.com/pytorch/rl/pull/2274
* [CI] Pin transformers version to < 4.42.0 to make vmap happy by vmoens in https://github.com/pytorch/rl/pull/2278
* [CI] Upgrade SDL to install pygame 2.6 by vmoens in https://github.com/pytorch/rl/pull/2248
* [CI] Windows build fix by vmoens in https://github.com/pytorch/rl/pull/2335
* [CI] python 3.12 nightlies by vmoens in https://github.com/pytorch/rl/pull/2281
* [Example,BugFix] Add a Async gym env example by vmoens in https://github.com/pytorch/rl/pull/2139
* [MINOR] Fix unclear language by software-samurai in https://github.com/pytorch/rl/pull/2165
* [Minor] Code quality improvements by vmoens in https://github.com/pytorch/rl/pull/2140
* [Quality] Fix low/high in SOTA implementations by vmoens in https://github.com/pytorch/rl/pull/2266
* [Quality] Fix repr of MARL modules by vmoens in https://github.com/pytorch/rl/pull/2192
* [Quality] Remove global seeding in set_seed by vmoens in https://github.com/pytorch/rl/pull/2195
* [Quality] Warn if the sampler is not prioritized but update_priority is called by vmoens in https://github.com/pytorch/rl/pull/2253
* [Quality] better error message for CompositeSpec shape mismatch by vmoens in https://github.com/pytorch/rl/pull/2223
* [Refactor] Deprecate NormalParamWrapper by vmoens in https://github.com/pytorch/rl/pull/2308
* [Refactor] Remove `_run_checks` from `TensorDict.__init__` by vmoens in https://github.com/pytorch/rl/pull/2256
* [Refactor] Update all instances of exploration `*Wrapper` to `*Module` by kurtamohler in https://github.com/pytorch/rl/pull/2298
* [Refactor] Use td.transpose in multi-step transform by vmoens in https://github.com/pytorch/rl/pull/2288
* [Refactor] tensordict._tensordict -> tensordict._C by vmoens in https://github.com/pytorch/rl/pull/2286
* [Tests] Fix VMAS tests by matteobettini in https://github.com/pytorch/rl/pull/2287
* [Tests] Fix windows tests by vmoens in https://github.com/pytorch/rl/pull/2219
* [Versioning] Add python 3.12 to setup.py by vmoens in https://github.com/pytorch/rl/pull/2282
* [Versioning] Allow any torch version for local builds by vmoens in https://github.com/pytorch/rl/pull/2130
* [Versioning] Bump torch 2.0 as minimal version by vmoens in https://github.com/pytorch/rl/pull/2200
* [Versioning] v0.5 bump by vmoens in https://github.com/pytorch/rl/pull/2267
* [Versioning] windows build - add legacy back and .bat env-script by vmoens in https://github.com/pytorch/rl/pull/2339
* init by vmoens in https://github.com/pytorch/rl/pull/2322

New Contributors
* GJBoth made their first contribution in https://github.com/pytorch/rl/pull/2126
* TheRisenPhoenix made their first contribution in https://github.com/pytorch/rl/pull/2149
* drMJ made their first contribution in https://github.com/pytorch/rl/pull/2156
* KPCOFGS made their first contribution in https://github.com/pytorch/rl/pull/2155
* software-samurai made their first contribution in https://github.com/pytorch/rl/pull/2165
* JulianKu made their first contribution in https://github.com/pytorch/rl/pull/2178
* Quinticx made their first contribution in https://github.com/pytorch/rl/pull/2204
* kurtamohler made their first contribution in https://github.com/pytorch/rl/pull/2268
* N00bcak made their first contribution in https://github.com/pytorch/rl/pull/2259
* Jendker made their first contribution in https://github.com/pytorch/rl/pull/2318

**Full Changelog**: https://github.com/pytorch/rl/compare/v0.4.0...v0.5.0

0.4.0

New Features:

- Better video rendering
* [Feature] A PixelRenderTransform by vmoens in https://github.com/pytorch/rl/pull/2099
* [Feature] Video recording in SOTA examples by vmoens in https://github.com/pytorch/rl/pull/2070
* [Feature] VideoRecorder for datasets and replay buffers by vmoens in https://github.com/pytorch/rl/pull/2069
- Replay buffer: sampling trajectories is now much easier, cleaner and faster
* [Benchmark] Benchmark slice sampler by vmoens in https://github.com/pytorch/rl/pull/1992
* [Feature] Add PrioritizedSliceSampler by Cadene in https://github.com/pytorch/rl/pull/1875
* [Feature] Span slice indices on the left and on the right by vmoens in https://github.com/pytorch/rl/pull/2107
* [Feature] batched trajectories - SliceSampler compatibility by vmoens in https://github.com/pytorch/rl/pull/1775
* [Performance] Faster slice sampler by vmoens in https://github.com/pytorch/rl/pull/2031
- Datasets: allow preprocessing datasets after download
* [Feature] Preproc for datasets by vmoens in https://github.com/pytorch/rl/pull/1989
- Losses: reduction parameters and non-functional execution
* [Feature] Add reduction parameter to On-Policy losses. by albertbou92 in https://github.com/pytorch/rl/pull/1890
* [Feature] Adds value clipping in ClipPPOLoss loss by albertbou92 in https://github.com/pytorch/rl/pull/2005
* [Feature] Offline objectives reduction parameter by albertbou92 in https://github.com/pytorch/rl/pull/1984
- Environment API: support "fork" start method in ParallelEnv, better handling of auto-resetting envs.
* [Feature] Use non-default mp start method in ParallelEnv by vmoens in https://github.com/pytorch/rl/pull/1966
* [Feature] Auto-resetting envs by vmoens in https://github.com/pytorch/rl/pull/2073
- Transforms
* [Feature] Allow any callable to be used as transform by vmoens in https://github.com/pytorch/rl/pull/2027
* [Feature] invert transforms appended to a RB by vmoens in https://github.com/pytorch/rl/pull/2111
* [Feature] Extend TensorDictPrimer default_value options by albertbou92 in https://github.com/pytorch/rl/pull/2071
* [Feature] Fine grained DeviceCastTransform by vmoens in https://github.com/pytorch/rl/pull/2041
* [Feature] BatchSizeTransform by vmoens in https://github.com/pytorch/rl/pull/2030
* [Feature] Allow non-sorted keys in CatFrames by vmoens in https://github.com/pytorch/rl/pull/1913
* [Feature] env.append_transform by vmoens in https://github.com/pytorch/rl/pull/2040
- New environment and improvements:
* [Environment] Meltingpot by matteobettini in https://github.com/pytorch/rl/pull/2054
* [Feature] Return depth from RoboHiveEnv by sriramsk1999 in https://github.com/pytorch/rl/pull/2058
* [Feature] PettingZoo possibility to choose reset strategy by matteobettini in https://github.com/pytorch/rl/pull/2048

Other features
* [Feature] Add time_dim arg in value modules by vmoens in https://github.com/pytorch/rl/pull/1946
* [Feature] Batched actions wrapper by vmoens in https://github.com/pytorch/rl/pull/2018
* [Feature] Better repr of RBs by vmoens in https://github.com/pytorch/rl/pull/1991
* [Feature] Execute rollouts with regular nn.Module instances by vmoens in https://github.com/pytorch/rl/pull/1947
* [Feature] Logger by vmoens in https://github.com/pytorch/rl/pull/1858
* [Feature] Passing lists of keyword arguments in ``reset`` for batched envs by vmoens in https://github.com/pytorch/rl/pull/2076
* [Feature] RB MultiStep transform by vmoens in https://github.com/pytorch/rl/pull/2008
* [Feature] Replace RewardClipping with SignTransform in Atari examples by albertbou92 in https://github.com/pytorch/rl/pull/1870
* [Feature] `reset_parameters` for multiagent nets by matteobettini in https://github.com/pytorch/rl/pull/1970
* [Feature] optionally set truncated = True at the end of rollouts by vmoens in https://github.com/pytorch/rl/pull/2042

Miscellaneous

* Fix onw typo by kit1980 in https://github.com/pytorch/rl/pull/1917
* Rename SOTA-IMPLEMENTATIONS.md to README.md by matteobettini in https://github.com/pytorch/rl/pull/2093
* Revert "[BugFix] Fix Isaac" by vmoens in https://github.com/pytorch/rl/pull/2118
* Update getting-started-5.py by vmoens in https://github.com/pytorch/rl/pull/1894
* [BugFix, Performance] Fewer imports at root by vmoens in https://github.com/pytorch/rl/pull/1930
* [BugFix,CI] Fix Windows CI by vmoens in https://github.com/pytorch/rl/pull/1983
* [BugFix,CI] Fix sporadically failing tests in CI by vmoens in https://github.com/pytorch/rl/pull/2098
* [BugFix,Refactor] Dreamer refactor by BY571 in https://github.com/pytorch/rl/pull/1918
* [BugFix] Adaptable non-blocking for mps and non cuda device in batched-envs by vmoens in https://github.com/pytorch/rl/pull/1900
* [BugFix] Call contiguous on rollout results in TestMultiStepTransform by vmoens in https://github.com/pytorch/rl/pull/2025
* [BugFix] Dedicated tests for on policy losses reduction parameter by albertbou92 in https://github.com/pytorch/rl/pull/1974
* [BugFix] Extend with a list of tensordicts by vmoens in https://github.com/pytorch/rl/pull/2032
* [BugFix] Fix Atari DQN ensembling by vmoens in https://github.com/pytorch/rl/pull/1981
* [BugFix] Fix CQL/IQL pbar update by vmoens in https://github.com/pytorch/rl/pull/2020
* [BugFix] Fix Exclude / Double2Float transforms by vmoens in https://github.com/pytorch/rl/pull/2101
* [BugFix] Fix Isaac by vmoens in https://github.com/pytorch/rl/pull/2072
* [BugFix] Fix KLPENPPOLoss KL computation by vmoens in https://github.com/pytorch/rl/pull/1922
* [BugFix] Fix MPS sync in device transform by vmoens in https://github.com/pytorch/rl/pull/2061
* [BugFix] Fix OOB TruncatedNormal LP by vmoens in https://github.com/pytorch/rl/pull/1924
* [BugFix] Fix R2Go once more by vmoens in https://github.com/pytorch/rl/pull/2089
* [BugFix] Fix Ray collector example error by albertbou92 in https://github.com/pytorch/rl/pull/1908
* [BugFix] Fix Ray collector on Python > 3.8 by albertbou92 in https://github.com/pytorch/rl/pull/2015
* [BugFix] Fix RoboHiveEnv tests by sriramsk1999 in https://github.com/pytorch/rl/pull/2062
* [BugFix] Fix _reset data passing in parallel env by vmoens in https://github.com/pytorch/rl/pull/1880
* [BugFix] Fix a bug in SliceSampler, indexes outside sampler lengths were produced by vladisai in https://github.com/pytorch/rl/pull/1874
* [BugFix] Fix args/kwargs passing in advantages by vmoens in https://github.com/pytorch/rl/pull/2001
* [BugFix] Fix batch-size expansion in functionalization by vmoens in https://github.com/pytorch/rl/pull/1959
* [BugFix] Fix broken gym tests by vmoens in https://github.com/pytorch/rl/pull/1980
* [BugFix] Fix clip_fraction in PO losses by vmoens in https://github.com/pytorch/rl/pull/2021
* [BugFix] Fix colab in tutos by vmoens in https://github.com/pytorch/rl/pull/2113
* [BugFix] Fix env.shape regex matches by vmoens in https://github.com/pytorch/rl/pull/1940
* [BugFix] Fix examples by vmoens in https://github.com/pytorch/rl/pull/1945
* [BugFix] Fix exploration in losses by vmoens in https://github.com/pytorch/rl/pull/1898
* [BugFix] Fix flaky rb tests by vmoens in https://github.com/pytorch/rl/pull/1901
* [BugFix] Fix habitat by vmoens in https://github.com/pytorch/rl/pull/1941
* [BugFix] Fix jumanji by vmoens in https://github.com/pytorch/rl/pull/2064
* [BugFix] Fix load_state_dict and is_empty td bugfix impact by vmoens in https://github.com/pytorch/rl/pull/1869
* [BugFix] Fix mp_start_method for ParallelEnv with single_for_serial by vmoens in https://github.com/pytorch/rl/pull/2007
* [BugFix] Fix multiple context syntax in multiagent examples by matteobettini in https://github.com/pytorch/rl/pull/1943
* [BugFix] Fix offline CatFrames by vmoens in https://github.com/pytorch/rl/pull/1953
* [BugFix] Fix offline CatFrames for pixels by vmoens in https://github.com/pytorch/rl/pull/1964
* [BugFix] Fix prints of size error when no file is associated with memmap by vmoens in https://github.com/pytorch/rl/pull/2090
* [BugFix] Fix replay buffer extension with lists by vmoens in https://github.com/pytorch/rl/pull/1937
* [BugFix] Fix reward2go for nd tensors by vmoens in https://github.com/pytorch/rl/pull/2087
* [BugFix] Fix robohive by vmoens in https://github.com/pytorch/rl/pull/2080
* [BugFix] Fix sampling without replacement with ndim storages by vmoens in https://github.com/pytorch/rl/pull/1999
* [BugFix] Fix slice sampler compatibility with split_trajs and MultiStep by vmoens in https://github.com/pytorch/rl/pull/1961
* [BugFix] Fix slicesampler terminated/truncated signaling by vmoens in https://github.com/pytorch/rl/pull/2044
* [BugFix] Fix strict-length for spanning trajectories by vmoens in https://github.com/pytorch/rl/pull/1982
* [BugFix] Fix strict_length=True in SliceSampler by vmoens in https://github.com/pytorch/rl/pull/2037
* [BugFix] Fix unwanted lazy stacks by vmoens in https://github.com/pytorch/rl/pull/2102
* [BugFix] Fix update in serial / parallel env by vmoens in https://github.com/pytorch/rl/pull/1866
* [BugFix] Fix vmas stacks by vmoens in https://github.com/pytorch/rl/pull/2105
* [BugFix] Fixed import for importlib by DanilBaibak in https://github.com/pytorch/rl/pull/1914
* [BugFix] Make KL-controllers independent of the model by vmoens in https://github.com/pytorch/rl/pull/1903
* [BugFix] Make sure ParallelEnv does not overflow mem when policy requires grad by vmoens in https://github.com/pytorch/rl/pull/1909
* [BugFix] More robust _StepMDP and multi-purpose envs by vmoens in https://github.com/pytorch/rl/pull/2038
* [BugFix] No grad on collector reset by matteobettini in https://github.com/pytorch/rl/pull/1927
* [BugFix] Non exclusive terminated and truncated by vmoens in https://github.com/pytorch/rl/pull/1911
* [BugFix] Refactor reductions by vmoens in https://github.com/pytorch/rl/pull/1968
* [BugFix] Remove `split_trajectories`'s reference to `("next", "done")`. by initmaks in https://github.com/pytorch/rl/pull/2094
* [BugFix] Remove reset on last step of a rollout by matteobettini in https://github.com/pytorch/rl/pull/1936
* [BugFix] Robust sync for non_blocking=True by vmoens in https://github.com/pytorch/rl/pull/2034
* [BugFix] Set default value for `normalize_advantage` to `False`. by DobromirM in https://github.com/pytorch/rl/pull/2050
* [BugFix] Set strict=False in tensordict.select() calls for objective classes by albertbou92 in https://github.com/pytorch/rl/pull/2004
* [BugFix] SliceSampler device and index mesh by vmoens in https://github.com/pytorch/rl/pull/1996
* [BugFix] Solve recursion issue in losses hook by vmoens in https://github.com/pytorch/rl/pull/1897
* [BugFix] Update cql docstring example by BY571 in https://github.com/pytorch/rl/pull/1951
* [BugFix] Update iql docstring example by BY571 in https://github.com/pytorch/rl/pull/1950
* [BugFix] Use same signature for append_transform in all cases by vmoens in https://github.com/pytorch/rl/pull/2091
* [BugFix] Use setdefault in _cache_values by vmoens in https://github.com/pytorch/rl/pull/1910
* [BugFix] Use traj_terminated in SliceSampler by Cadene in https://github.com/pytorch/rl/pull/1884
* [BugFix] Vmap randomness for value estimator by BY571 in https://github.com/pytorch/rl/pull/1942
* [BugFix] better device consistency in EGreedy by vmoens in https://github.com/pytorch/rl/pull/1867
* [BugFix] check_env_specs seeding logic by vmoens in https://github.com/pytorch/rl/pull/1872
* [BugFix] fix formatting for VideoRecorder docstring by sriramsk1999 in https://github.com/pytorch/rl/pull/1985
* [BugFix] fix trunc normal device by vmoens in https://github.com/pytorch/rl/pull/1931
* [BugFix] missing annotations import by vmoens in https://github.com/pytorch/rl/pull/2074
* [BugFix] state typo in RNG control module by vmoens in https://github.com/pytorch/rl/pull/1878
* [BugFix] to_observation_norm now works with keys which are not strings by maxweissenbacher in https://github.com/pytorch/rl/pull/2045
* [BugFix] union -> intersection in _StepMDP check by vmoens in https://github.com/pytorch/rl/pull/2039
* [CI,Doc] Sanitize version by vmoens in https://github.com/pytorch/rl/pull/2120
* [CI] Doc on release tag by vmoens in https://github.com/pytorch/rl/pull/2116
* [CI] Fix CI issues by vmoens in https://github.com/pytorch/rl/pull/2084
* [CI] Fix Doc CI by matteobettini in https://github.com/pytorch/rl/pull/2106
* [CI] Fixes sympy error by fixing mpmath version by vmoens in https://github.com/pytorch/rl/pull/1988
* [CI] Install ffmpeg in Robohive tests by vmoens in https://github.com/pytorch/rl/pull/2063
* [CI] Install stable torch and tensordict for release tests by vmoens in https://github.com/pytorch/rl/pull/1978
* [CI] Remove all macos x86 jobs by vmoens in https://github.com/pytorch/rl/pull/2117
* [CI] Remove x86 OSX jobs by vmoens in https://github.com/pytorch/rl/pull/2112
* [CI] Schedule workflows for releases by vmoens in https://github.com/pytorch/rl/pull/2114
* [CI] Temporarily remove snapshot from CI by vmoens in https://github.com/pytorch/rl/pull/2000
* [CI] Unpin mpmath by vmoens in https://github.com/pytorch/rl/pull/1997
* [CI] Upgrade 3.8 to 3.10 GPU jobs by vmoens in https://github.com/pytorch/rl/pull/2013
* [Deprecation] Deprecate in prep for release by vmoens in https://github.com/pytorch/rl/pull/1820
* [Doc,Feature] Better doc for modules and list of kwargs when possible by vmoens in https://github.com/pytorch/rl/pull/1990
* [Doc] Fix tutos by vmoens in https://github.com/pytorch/rl/pull/1863
* [Doc] Getting started tutos by vmoens in https://github.com/pytorch/rl/pull/1886
* [Doc] Improve PrioritizedSampler doc and get rid of np dependency as much as possible by vmoens in https://github.com/pytorch/rl/pull/1881
* [Doc] Installation instructions in API ref by vmoens in https://github.com/pytorch/rl/pull/1871
* [Doc] Per-release doc by vmoens in https://github.com/pytorch/rl/pull/2108
* [Documentation] Correct MaskedEnv Example in ActionMask Transform Documentation by Jonathanace in https://github.com/pytorch/rl/pull/2060
* [Examples] Move examples to sota-implementations by vmoens in https://github.com/pytorch/rl/pull/2016
* [Minor] Add env.shape attribute by vmoens in https://github.com/pytorch/rl/pull/1938
* [Minor] Lint by vmoens in https://github.com/pytorch/rl/pull/2096
* [Minor] Move distributed examples to examples by vmoens in https://github.com/pytorch/rl/pull/2097
* [Minor] Remove duplicate if statement in storages by vmoens in https://github.com/pytorch/rl/pull/2066
* [Minor] Remove warnings in test_cost by vmoens in https://github.com/pytorch/rl/pull/1902
* [Minor] Support init lazy storages with add by vmoens in https://github.com/pytorch/rl/pull/2028
* [Minor] Use the main branch for the M1 build wheels by DanilBaibak in https://github.com/pytorch/rl/pull/1965
* [Performance] Faster DMC by vmoens in https://github.com/pytorch/rl/pull/2002
* [Quality] Capture errors in specs transforms by vmoens in https://github.com/pytorch/rl/pull/2092
* [Quality] Make sure deprec warnings are displayed by vmoens in https://github.com/pytorch/rl/pull/2088
* [Refactor,Feature] Refactor collector shapes and stack_result in sync collector by vmoens in https://github.com/pytorch/rl/pull/1994
* [Refactor] Clearer separation between single_task and share_individual_td by vmoens in https://github.com/pytorch/rl/pull/2026
* [Refactor] Faster and more generic multi-agent nets by vmoens in https://github.com/pytorch/rl/pull/1921
* [Refactor] Refactor split_trajectories by vmoens in https://github.com/pytorch/rl/pull/1955
* [Refactor] Remove remnant legacy functional calls by vmoens in https://github.com/pytorch/rl/pull/1973
* [Refactor] Use filter_empty=False in apply for params by vmoens in https://github.com/pytorch/rl/pull/1882
* [Refactor] Use filter_empty=True in apply by vmoens in https://github.com/pytorch/rl/pull/1879
* [Tutorial] PettingZoo Parallel competitive tutorial by matteobettini in https://github.com/pytorch/rl/pull/2047
* [Versioning] Deprecations for 0.4 by vmoens in https://github.com/pytorch/rl/pull/2109
* [Versioning] New torch version by vmoens in https://github.com/pytorch/rl/pull/2110
* [Versioning] v0.4.0 by vmoens in https://github.com/pytorch/rl/pull/1860

New Contributors
* vladisai made their first contribution in https://github.com/pytorch/rl/pull/1874
* Cadene made their first contribution in https://github.com/pytorch/rl/pull/1884
* sriramsk1999 made their first contribution in https://github.com/pytorch/rl/pull/1985
* DobromirM made their first contribution in https://github.com/pytorch/rl/pull/2050
* Jonathanace made their first contribution in https://github.com/pytorch/rl/pull/2060
* maxweissenbacher made their first contribution in https://github.com/pytorch/rl/pull/2045
* initmaks made their first contribution in https://github.com/pytorch/rl/pull/2094

A big thanks to our contributors and to the entire user base for helping with this library!

**Full Changelog**: https://github.com/pytorch/rl/compare/v0.3.0...v0.4.0
