d3rlpy

Latest version: v2.8.1


Benchmarks
The full benchmark results are finally available at [d3rlpy-benchmarks](https://github.com/takuseno/d3rlpy-benchmarks).

Algorithms
- [Implicit Q-Learning (IQL)](https://arxiv.org/abs/2110.06169)

Enhancements
- `deterministic` option is added to `collect` method
- `rollout_return` metric is added to online training
- `random_steps` is added to `fit_online` method
- `--save` option is added to `d3rlpy` CLI commands (thanks, pstansell)
- `multiplier` option is added to reward normalizers
- many reproduction scripts are added
- `policy_type` option is added to BC
- `get_atari_transition` function is added for the Atari 2600 offline benchmark procedure
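
The `multiplier` option on reward normalizers boils down to scaling every reward by a constant factor before it reaches the learner. A minimal sketch of the idea, assuming nothing about d3rlpy's actual scaler API (the class name below is illustrative, not d3rlpy's):

```python
import math

# Minimal sketch of a constant reward multiplier; the class name is
# illustrative and is NOT d3rlpy's actual reward scaler API.
class MultiplyReward:
    def __init__(self, multiplier: float):
        self.multiplier = multiplier

    def transform(self, rewards):
        # scale every reward by the constant factor
        return [r * self.multiplier for r in rewards]

scaler = MultiplyReward(0.1)
scaled = scaler.transform([1.0, -2.0, 5.0])
assert all(math.isclose(a, b) for a, b in zip(scaled, [0.1, -0.2, 0.5]))
```

Such a constant rescaling is commonly used to keep value-function magnitudes in a numerically comfortable range without changing the optimal policy.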

Bugfix
- Documentation fix (thanks, araffin)
- Fix TD3+BC's actor loss function
- Fix gaussian noise for TD3 exploration

batch online training
When training with computationally expensive environments such as robotics simulators or rich 3D games, experiments take a long time to finish because environment steps are slow.
To solve this, d3rlpy supports batch online training.
```py
import gym

from d3rlpy.algos import SAC
from d3rlpy.envs import AsyncBatchEnv

if __name__ == '__main__':  # this guard is necessary if you use AsyncBatchEnv
    # distributing 10 environments in different processes
    env = AsyncBatchEnv([lambda: gym.make('Hopper-v2') for _ in range(10)])

    sac = SAC(use_gpu=True)

    # train with 10 environments concurrently
    sac.fit_batch_online(env)
```


docker image
A pre-built d3rlpy Docker image is available on [DockerHub](https://hub.docker.com/repository/docker/takuseno/d3rlpy).

```
$ docker run -it --gpus all --name d3rlpy takuseno/d3rlpy:latest bash
```


enhancements
- `BEAR` algorithm is updated based on the official implementation
  - new `mmd_kernel` option is available
- `to_mdp_dataset` method is added to `ReplayBuffer`
- `ConstantEpsilonGreedy` explorer is added
- `d3rlpy.envs.ChannelFirst` wrapper is added (thanks for reporting, feyza-droid)
- new dataset utility function `d3rlpy.datasets.get_d4rl` is added
  - timeouts are handled inside the function
- offline RL paper reproduction codes are added
- smoothed moving average plot in the `d3rlpy plot` CLI command (thanks, pstansell)
- user-friendly messages for assertion errors
- reduced memory consumption
- `save_interval` argument is added to `fit_online`
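
For intuition, a constant epsilon-greedy explorer picks a uniformly random action with fixed probability ε and the greedy action otherwise. A self-contained sketch of that idea, assuming a discrete action space (this re-implements the concept and is not d3rlpy's actual code):

```python
import random

# Sketch of constant epsilon-greedy exploration for a discrete action space;
# it mirrors the idea behind d3rlpy's ConstantEpsilonGreedy explorer but is
# NOT its actual implementation.
class EpsilonGreedy:
    def __init__(self, epsilon: float, action_size: int, seed: int = 0):
        self.epsilon = epsilon
        self.action_size = action_size
        self.rng = random.Random(seed)

    def sample(self, greedy_action: int) -> int:
        # with probability epsilon, explore with a uniformly random action
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.action_size)
        # otherwise exploit the greedy action
        return greedy_action

explorer = EpsilonGreedy(epsilon=0.1, action_size=4)
actions = [explorer.sample(greedy_action=2) for _ in range(100)]
```

Because ε stays constant rather than decaying, the agent keeps a fixed exploration floor for the entire run.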

bugfix
- core dumps are fixed in Google Colaboratory tutorials
- typos in documentation are fixed (thanks for reporting, pstansell)

2.8.1

Bugfix
- Pin the Gymnasium version at 1.0.0 to prevent version-mismatch errors between `gymnasium` and `gymnasium-robotics`.

Enhancement
- maze2d datasets are now supported.

2.8.0

New algorithms
- [PRDC](https://arxiv.org/abs/2306.06569) (thanks, liyc-ai)
- [QDT](https://arxiv.org/abs/2209.03993) (thanks, takuyamagata)
- [TACR](https://dl.acm.org/doi/10.5555/3545946.3599088)

Enhancement
- The health check is updated to verify that the PyTorch version is 2.5.0 or later.
- Shimmy version has been upgraded.
- Minari version has been upgraded.

Bugfix
- Model loading error caused by mismatched optimizer data has been fixed (thanks, hasan-yaman).
- Fix `map_location` to support loading models trained with GPU onto CPU.
- Fix Adroit dataset support.

2.7.0

Breaking changes

Dependency
:warning: This release updates the following dependencies.
- Python 3.9 or later
- PyTorch v2.5.0 or later

OptimizerFactory
The import path of `OptimizerFactory` has been changed from `d3rlpy.models.OptimizerFactory` to `d3rlpy.optimizers.OptimizerFactory`.
```py
# before
optim = d3rlpy.models.AdamFactory()

# after
optim = d3rlpy.optimizers.AdamFactory()
```


x2-3 speed up with CudaGraph and torch.compile
This release adds support for CudaGraph and `torch.compile` to dramatically speed up training. You can turn on this feature by providing the `compile_graph` option:
```py
import d3rlpy

# enable CudaGraph and torch.compile
sac = d3rlpy.algos.SACConfig(compile_graph=True).create(device="cuda:0")
```

Benchmarks on an NVIDIA RTX 4070 show roughly a 2-3x speedup.

2.6.2

This is an emergency update to resolve an issue caused by the new Gymnasium version v1.0.0. Additionally, d3rlpy internally checks versions of both Gym and Gymnasium to make sure that dependencies are correct.
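
Such an internal check amounts to comparing installed versions against a known-good range and failing fast before training starts. A rough, self-contained sketch under that assumption (this is not d3rlpy's actual implementation):

```python
# Rough sketch of a dependency version gate, in the spirit of d3rlpy's
# internal Gym/Gymnasium check; this is NOT its actual implementation.
def parse_version(v: str) -> tuple:
    """Turn '2.5.0' into (2, 5, 0); suffixes like 'rc1' are ignored."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at the first non-digit, e.g. '0rc1' -> '0'
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def check_minimum(name: str, installed: str, minimum: str) -> None:
    # fail fast at import time rather than deep inside a training run
    if parse_version(installed) < parse_version(minimum):
        raise RuntimeError(
            f"{name} {installed} is too old; {minimum} or later is required"
        )

check_minimum("gymnasium", "1.0.0", "1.0.0")  # passes silently
```

Failing at import time with an explicit message turns an obscure runtime crash from mismatched dependencies into a clear, actionable error.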
