The major update has finally been released! Since its start, the project has earned almost 1K GitHub stars :star:, which is a great milestone for d3rlpy. This update includes many major changes.
## Upgrade Gym version
From this version, d3rlpy only supports the latest Gym version `0.26.0`. This change allows us to support `Gymnasium` in a future update.
## Algorithm

### Clear separation between configuration and algorithm
From this version, each algorithm (e.g. `DQN`) has a dedicated config class (e.g. `DQNConfig`). This allows us to serialize and deserialize algorithms, as described later.
```py
import d3rlpy

dqn = d3rlpy.algos.DQNConfig(learning_rate=3e-4).create(device="cuda:0")
```
### Decision Transformer
`Decision Transformer` is finally available! You can check the [reproduction](https://github.com/takuseno/d3rlpy/blob/master/reproductions/offline/decision_transformer.py) script to see how to use it.
```py
import d3rlpy

dataset, env = d3rlpy.datasets.get_pendulum()

dt = d3rlpy.algos.DecisionTransformerConfig(
    batch_size=64,
    learning_rate=1e-4,
    optim_factory=d3rlpy.models.AdamWFactory(weight_decay=1e-4),
    encoder_factory=d3rlpy.models.VectorEncoderFactory(
        [128],
        exclude_last_activation=True,
    ),
    observation_scaler=d3rlpy.preprocessing.StandardObservationScaler(),
    reward_scaler=d3rlpy.preprocessing.MultiplyRewardScaler(0.001),
    context_size=20,
    num_heads=1,
    num_layers=3,
    warmup_steps=10000,
    max_timestep=1000,
).create(device="cuda:0")

dt.fit(
    dataset,
    n_steps=100000,
    n_steps_per_epoch=1000,
    save_interval=10,
    eval_env=env,
    eval_target_return=0.0,
)
```
## Serialization
In this version, d3rlpy introduces a compact serialization format, `d3`, that packs both hyperparameters and model parameters into a single file. This makes it easy to save checkpoints and reconstruct algorithms for evaluation and deployment.
```py
import d3rlpy

dataset, env = d3rlpy.datasets.get_cartpole()

dqn = d3rlpy.algos.DQNConfig().create()
dqn.fit(dataset, n_steps=10000)

# save as a d3 file
dqn.save("model.d3")

# reconstruct exactly the same DQN
new_dqn = d3rlpy.load_learnable("model.d3")
```
## ReplayBuffer
From this version, there is no longer a clear separation between `ReplayBuffer` and `MDPDataset`. Instead, `ReplayBuffer` is flexible enough to support any kind of algorithm and experiment. Please check the details in the [documentation](https://d3rlpy.readthedocs.io/en/v2.0.2/references/dataset.html).