## batch online training
When training with computationally expensive environments such as robotics simulators or rich 3D games, online training takes a long time because each environment step is slow.
To solve this, d3rlpy supports batch online training.
```py
import gym

from d3rlpy.algos import SAC
from d3rlpy.envs import AsyncBatchEnv

if __name__ == '__main__':  # this guard is necessary if you use AsyncBatchEnv
    # distribute 10 environments across different processes
    env = AsyncBatchEnv([lambda: gym.make('Hopper-v2') for _ in range(10)])

    sac = SAC(use_gpu=True)

    # train with 10 environments concurrently
    sac.fit_batch_online(env)
```
## docker image
A pre-built d3rlpy Docker image is available on [DockerHub](https://hub.docker.com/repository/docker/takuseno/d3rlpy).
```
$ docker run -it --gpus all --name d3rlpy takuseno/d3rlpy:latest bash
```
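If you want local datasets or scripts visible inside the container, a volume mount can be added to the same command; the `/workspace` mount point below is an arbitrary choice for this example, not a directory defined by the image.

```
$ docker run -it --gpus all -v $(pwd):/workspace --name d3rlpy takuseno/d3rlpy:latest bash
```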
## enhancements
- `BEAR` algorithm is updated based on the official implementation
- new `mmd_kernel` option is available
- `to_mdp_dataset` method is added to `ReplayBuffer`
- `ConstantEpsilonGreedy` explorer is added
- `d3rlpy.envs.ChannelFirst` wrapper is added (thanks for reporting, feyza-droid)
- new dataset utility function `d3rlpy.datasets.get_d4rl` is added (illustrated in the sketch after this list)
  - this handles timeouts inside the function
- offline RL paper reproduction codes are added
- smoothed moving average plot in the `d3rlpy plot` CLI command (thanks, pstansell)
- user-friendly messages for assertion errors
- reduced memory consumption
- `save_interval` argument is added to `fit_online`
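As a rough illustration, the sketch below strings together a few of the items above: `d3rlpy.datasets.get_d4rl`, the `ConstantEpsilonGreedy` explorer, the new `save_interval` argument of `fit_online`, and `ReplayBuffer.to_mdp_dataset`. The module paths, argument names, and values (`epsilon`, `maxlen`, the meaning of `save_interval`) are assumptions made for this example rather than a verified reference.

```py
import gym

from d3rlpy.algos import DQN
from d3rlpy.datasets import get_d4rl
from d3rlpy.online.buffers import ReplayBuffer
from d3rlpy.online.explorers import ConstantEpsilonGreedy

# load a D4RL dataset; timeouts are handled inside get_d4rl
dataset, d4rl_env = get_d4rl('hopper-medium-v0')

# online training on a discrete-action task with a constant-epsilon explorer
env = gym.make('CartPole-v0')
buffer = ReplayBuffer(maxlen=100000, env=env)
explorer = ConstantEpsilonGreedy(0.1)  # epsilon=0.1 is an arbitrary choice

dqn = DQN()
dqn.fit_online(
    env,
    buffer,
    explorer=explorer,
    save_interval=10,  # newly added argument; assumed to be in epochs
)

# convert the collected online experience into an MDPDataset for offline methods
offline_dataset = buffer.to_mdp_dataset()
```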
## bugfix
- core dumps are fixed in Google Colaboratory tutorials
- typos in some documentation (thanks for reporting, pstansell)