Algorithms
- Support the discrete version of Soft Actor-Critic
- https://arxiv.org/abs/1910.07207
- `fit_online` now takes an `n_steps` argument instead of `n_epochs`, so that the training budgets reported in the papers can be reproduced exactly.
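When porting an epoch-based schedule to a step-based budget, the total is simply the number of epochs times the steps per epoch. A minimal sketch in plain Python follows; `steps_from_epochs` and `epoch_of_step` are hypothetical helpers for illustration only and are not part of d3rlpy.

```python
def steps_from_epochs(n_epochs: int, steps_per_epoch: int) -> int:
    """Convert an epoch-based budget into a total number of steps."""
    if n_epochs < 0 or steps_per_epoch <= 0:
        raise ValueError("n_epochs must be >= 0 and steps_per_epoch must be > 0")
    return n_epochs * steps_per_epoch


def epoch_of_step(step: int, steps_per_epoch: int) -> int:
    """Return the 1-indexed epoch that a 1-indexed step falls in."""
    return (step - 1) // steps_per_epoch + 1


# Hypothetical schedule: 100 epochs of 10,000 steps each.
n_steps = steps_from_epochs(n_epochs=100, steps_per_epoch=10_000)
print(n_steps)  # 1000000
```

The resulting value is what you would pass as the step budget; the epoch helper only shows where logging boundaries would fall under the assumed schedule.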
OptimizerFactory
d3rlpy now provides more flexible control over optimizer configuration via `OptimizerFactory`.
```py
from d3rlpy.optimizers import AdamFactory
from d3rlpy.algos import DQN

dqn = DQN(optim_factory=AdamFactory(weight_decay=1e-4))
```
See more at https://d3rlpy.readthedocs.io/en/v0.40/references/optimizers.html .
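
The point of a factory here is that an optimizer cannot be built until the algorithm has created its network parameters, so the settings are stored first and applied later. Below is a minimal pure-Python sketch of that pattern. It assumes nothing about d3rlpy internals: `ToyOptimizer` and `ToyOptimizerFactory` are hypothetical stand-ins, not the library's classes.

```python
class ToyOptimizer:
    """Hypothetical stand-in for an optimizer such as Adam."""

    def __init__(self, params, lr, weight_decay=0.0):
        self.params = list(params)
        self.lr = lr
        self.weight_decay = weight_decay


class ToyOptimizerFactory:
    """Stores optimizer settings now and builds the optimizer later."""

    def __init__(self, optim_cls, **kwargs):
        self.optim_cls = optim_cls
        self.kwargs = kwargs

    def create(self, params, lr):
        # Parameters and learning rate are only known at build time.
        return self.optim_cls(params, lr=lr, **self.kwargs)


# Configure first (no parameters exist yet) ...
factory = ToyOptimizerFactory(ToyOptimizer, weight_decay=1e-4)

# ... and build later, once the parameters are available.
optim = factory.create(params=[0.1, 0.2], lr=3e-4)
print(optim.lr, optim.weight_decay)
```

Deferring construction this way lets the same configuration object be reused for several networks inside one algorithm.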
EncoderFactory
d3rlpy now provides more flexible control over the neural network architecture via `EncoderFactory`.
```py
from d3rlpy.algos import DQN
from d3rlpy.encoders import VectorEncoderFactory

# encoder factory
encoder_factory = VectorEncoderFactory(hidden_units=[300, 400], activation='tanh')

# set EncoderFactory
dqn = DQN(encoder_factory=encoder_factory)
```
You can also build your own encoders.
```py
import torch
import torch.nn as nn

from d3rlpy.algos import DQN
from d3rlpy.encoders import EncoderFactory

# your own neural network
class CustomEncoder(nn.Module):
    def __init__(self, observation_shape, feature_size):
        super().__init__()  # required before registering submodules
        self.feature_size = feature_size
        self.fc1 = nn.Linear(observation_shape[0], 64)
        self.fc2 = nn.Linear(64, feature_size)

    def forward(self, x):
        h = torch.relu(self.fc1(x))
        h = torch.relu(self.fc2(h))
        return h

    # THIS IS IMPORTANT!
    def get_feature_size(self):
        return self.feature_size

# your own encoder factory
class CustomEncoderFactory(EncoderFactory):
    TYPE = 'custom'  # this is necessary

    def __init__(self, feature_size):
        self.feature_size = feature_size

    def create(self, observation_shape, action_size=None, discrete_action=False):
        return CustomEncoder(observation_shape, self.feature_size)

    def get_params(self, deep=False):
        return {
            'feature_size': self.feature_size
        }

dqn = DQN(encoder_factory=CustomEncoderFactory(feature_size=64))
```
See more at https://d3rlpy.readthedocs.io/en/v0.40/references/network_architectures.html .
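
One plausible reason `TYPE` and `get_params` matter is that together they let a saved configuration be turned back into a factory. The sketch below illustrates that round trip in plain Python. The registry, `ToyEncoderFactory`, `dump`, and `rebuild` are hypothetical names chosen for illustration and do not describe d3rlpy's actual implementation.

```python
import json

REGISTRY = {}  # hypothetical: maps a TYPE string to a factory class


def register(cls):
    """Record a factory class under its TYPE string."""
    REGISTRY[cls.TYPE] = cls
    return cls


@register
class ToyEncoderFactory:
    TYPE = "custom"

    def __init__(self, feature_size):
        self.feature_size = feature_size

    def get_params(self, deep=False):
        return {"feature_size": self.feature_size}


def dump(factory):
    """Serialize a factory to JSON using its TYPE and constructor params."""
    return json.dumps({"type": factory.TYPE, "params": factory.get_params()})


def rebuild(text):
    """Recreate a factory from the JSON produced by dump()."""
    config = json.loads(text)
    return REGISTRY[config["type"]](**config["params"])


saved = dump(ToyEncoderFactory(feature_size=64))
restored = rebuild(saved)
print(restored.feature_size)  # 64
```

Under this assumption, the keys returned by `get_params` need to match the constructor's argument names, otherwise the rebuild step fails.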
Stable Baselines 3 wrapper
- d3rlpy is now partially compatible with [Stable Baselines 3](https://github.com/DLR-RM/stable-baselines3).
- https://github.com/takuseno/d3rlpy/blob/master/d3rlpy/wrappers/sb3.py
- More documentation will be available soon.
bugfix
- Fix the memory leak in `fit_online`.
- You can now train online algorithms with a large replay buffer for image observations.
- Fix preprocessing in CQL.
- Fix the ColorJitter augmentation.
installation
PyPI
- From this version, d3rlpy officially supports Windows.
- Binary packages for each platform are built in GitHub Actions and uploaded, so you no longer need to install Cython to install this package from PyPI.
Anaconda
- Since the previous version, d3rlpy has been available on conda-forge.