d3rlpy

Latest version: v2.7.0



Benchmarks
The full benchmark results are finally available at [d3rlpy-benchmarks](https://github.com/takuseno/d3rlpy-benchmarks).

Algorithms
- Implicit Q-Learning (IQL) (a usage sketch follows this list)
  - https://arxiv.org/abs/2110.06169
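
A minimal usage sketch, assuming the v1.x-style API in which algorithms are instantiated directly and trained with `fit` (the dataset name and step count are illustrative, not taken from the release notes):

```py
import d3rlpy

# load an offline dataset (dataset name is illustrative)
dataset, env = d3rlpy.datasets.get_d4rl('hopper-medium-v0')

# instantiate and train IQL offline
iql = d3rlpy.algos.IQL(use_gpu=True)
iql.fit(dataset, n_steps=500000)
```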

Enhancements
- `deterministic` option is added to `collect` method (see the sketch after this list)
- `rollout_return` metric is added to online training
- `random_steps` is added to `fit_online` method
- `--save` option is added to `d3rlpy` CLI commands (thanks, pstansell)
- `multiplier` option is added to reward normalizers
- many reproduction scripts are added
- `policy_type` option is added to BC
- `get_atari_transition` function is added for the Atari 2600 offline benchmark procedure
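
A rough sketch of how the new `deterministic` and `random_steps` options might be used, assuming the v1.x-style online API (the surrounding calls and parameter names such as `n_steps` are assumptions):

```py
import gym
import d3rlpy

env = gym.make('Hopper-v2')
sac = d3rlpy.algos.SAC(use_gpu=True)

# warm up with uniformly random actions before using the learned policy
sac.fit_online(env, n_steps=100000, random_steps=10000)

# collect rollouts with the deterministic (greedy) policy
buffer = sac.collect(env, deterministic=True, n_steps=10000)
```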

Bugfix
- Documentation fixes (thanks, araffin)
- Fix TD3+BC's actor loss function
- Fix Gaussian noise for TD3 exploration


batch online training
When training with computationally expensive environments such as robotics simulators or rich 3D games, training takes a long time because each environment step is slow.
To address this, d3rlpy supports batch online training.
```py
import gym

from d3rlpy.algos import SAC
from d3rlpy.envs import AsyncBatchEnv

if __name__ == '__main__':  # this guard is necessary if you use AsyncBatchEnv
    # distribute 10 environments across different processes
    env = AsyncBatchEnv([lambda: gym.make('Hopper-v2') for _ in range(10)])

    sac = SAC(use_gpu=True)

    # train with 10 environments concurrently
    sac.fit_batch_online(env)
```


docker image
A pre-built d3rlpy Docker image is available on [Docker Hub](https://hub.docker.com/repository/docker/takuseno/d3rlpy).

```
$ docker run -it --gpus all --name d3rlpy takuseno/d3rlpy:latest bash
```


enhancements
- `BEAR` algorithm is updated based on the official implementation
  - new `mmd_kernel` option is available
- `to_mdp_dataset` method is added to `ReplayBuffer`
- `ConstantEpsilonGreedy` explorer is added (see the sketch after this list)
- `d3rlpy.envs.ChannelFirst` wrapper is added (thanks for reporting, feyza-droid)
- new dataset utility function `d3rlpy.datasets.get_d4rl` is added
  - timeouts are handled inside the function
- offline RL paper reproduction codes are added
- smoothed moving-average plot in the `d3rlpy plot` CLI command (thanks, pstansell)
- user-friendly messages for assertion errors
- reduced memory consumption
- `save_interval` argument is added to `fit_online`
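
A sketch combining a few of these additions, assuming the v1.x-style API (the module path `d3rlpy.online.explorers` and the parameter names are assumptions):

```py
import gym
from d3rlpy.algos import DQN
from d3rlpy.online.explorers import ConstantEpsilonGreedy

env = gym.make('CartPole-v0')

# constant epsilon-greedy exploration during online training
dqn = DQN()
explorer = ConstantEpsilonGreedy(epsilon=0.1)

# save_interval controls how often model parameters are saved during training
dqn.fit_online(env, explorer=explorer, n_steps=100000, save_interval=10)
```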

bugfix
- core dumps are fixed in Google Colaboratory tutorials
- typos in some documentation pages (thanks for reporting, pstansell)

2.7.0

Breaking changes

Dependency
:warning: This release updates the following dependencies.
- Python 3.9 or later
- PyTorch v2.5.0 or later

OptimizerFactory
The import path of `OptimizerFactory` has been changed from `d3rlpy.models.OptimizerFactory` to `d3rlpy.optimizers.OptimizerFactory`.
```py
import d3rlpy

# before
optim = d3rlpy.models.AdamFactory()

# after
optim = d3rlpy.optimizers.AdamFactory()
```
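
As a sketch of how the relocated factory is used, it can be passed to an algorithm config as before (the `actor_optim_factory`/`critic_optim_factory` parameter names are assumptions, not taken from the release notes):

```py
import d3rlpy

optim = d3rlpy.optimizers.AdamFactory()

# parameter names below are assumptions based on the v2 config-style API
sac = d3rlpy.algos.SACConfig(
    actor_optim_factory=optim,
    critic_optim_factory=optim,
).create(device="cuda:0")
```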



2-3x speedup with CudaGraph and torch.compile
In this PR, d3rlpy adds support for CudaGraph and torch.compile to dramatically speed up training. You can turn on this new feature simply by providing the `compile_graph` option:
```py
import d3rlpy

# enable CudaGraph and torch.compile
sac = d3rlpy.algos.SACConfig(compile_graph=True).create(device="cuda:0")
```

Here are some benchmark results with an NVIDIA RTX 4070:

2.6.2

This is an emergency update to resolve an issue caused by the new Gymnasium version v1.0.0. Additionally, d3rlpy now internally checks the versions of both Gym and Gymnasium to make sure that the dependencies are correct.

2.6.1

Bugfix
There was an issue in the data-parallel distributed training feature of d3rlpy: the processes did not correctly synchronize parameters. This release fixes the issue, and data-parallel distributed training now works properly. Please check the latest example [script](https://github.com/takuseno/d3rlpy/blob/master/examples/distributed_offline_training.py) to see how to use it.

2.6.0

New Algorithm
[ReBRAC](https://arxiv.org/abs/2305.09836) has been added to d3rlpy! Please check the reproduction script [here](https://github.com/takuseno/d3rlpy/blob/master/reproductions/offline/rebrac.py).
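
A minimal instantiation sketch, assuming the v2 config-style API (the dataset loading and step count are illustrative):

```py
import d3rlpy

# dataset name is illustrative
dataset, env = d3rlpy.datasets.get_d4rl('hopper-medium-v0')

# create and train ReBRAC offline
rebrac = d3rlpy.algos.ReBRACConfig().create(device="cuda:0")
rebrac.fit(dataset, n_steps=500000)
```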

Enhancement
- DeepMind Control support has been added. You can install its dependencies with `d3rlpy install dm_control`. Please check an example script [here](https://github.com/takuseno/d3rlpy/blob/master/examples/deepmind_control.py).
- `use_layer_norm` option has been added to `VectorEncoderFactory` (see the sketch after this list).
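
A sketch of enabling layer normalization in the vector encoder (the `actor_encoder_factory`/`critic_encoder_factory` parameter names are assumptions, not taken from the release notes):

```py
import d3rlpy

# enable layer normalization in the hidden layers
encoder = d3rlpy.models.VectorEncoderFactory(
    hidden_units=[256, 256],
    use_layer_norm=True,
)

# parameter names below are assumptions based on the v2 config-style API
sac = d3rlpy.algos.SACConfig(
    actor_encoder_factory=encoder,
    critic_encoder_factory=encoder,
).create(device="cuda:0")
```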

Bugfix
- Fix return-to-go calculation for Decision Transformer.
- Fix custom model documentation.

