Breaking Changes
New Features
- Distributed optimization (SammyRamone)
- Added ``--load-checkpoints`` to load particular checkpoints
- Added ``--num-threads`` to the enjoy script (see the usage note after this list)
- Added DQN support
- Added saving of command line args (SammyRamone)
- Added DDPG support
- Added a version number
- Added ``RMSpropTFLike`` support (see the sketch after this list)
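For example (the algorithm and environment below are illustrative), the two new flags can be passed to the enjoy script along the lines of `python enjoy.py --algo dqn --env CartPole-v1 --num-threads 2`, with `--load-checkpoints` pointing at a particular saved checkpoint instead of the final model.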
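A minimal sketch of how the TF-like optimizer can be selected, assuming a recent SB3 version (the algorithm, environment, and `eps` value are illustrative, not recommendations):

```python
from stable_baselines3 import A2C
from stable_baselines3.common.sb2_compat.rmsprop_tf_like import RMSpropTFLike

# Use the TF-like RMSprop (epsilon applied inside the square root, as in
# TensorFlow) instead of PyTorch's default RMSprop; eps is illustrative.
model = A2C(
    "MlpPolicy",
    "CartPole-v1",
    policy_kwargs=dict(
        optimizer_class=RMSpropTFLike,
        optimizer_kwargs=dict(eps=1e-5),
    ),
)
model.learn(total_timesteps=10_000)
```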
Bug fixes
- Fixed Optuna warning (SammyRamone)
- Fixed `--save-freq`, which was not taking the number of parallel environments into account
- Set `buffer_size` to 1 when testing an off-policy model (e.g. SAC/DQN) to avoid a memory allocation issue (see the sketch after this list)
- Fixed seed at load time for `enjoy.py`
- Use non-deterministic evaluation when doing hyperparameter optimization on Atari games
- Use 'maximize' for hyperparameter optimization (SammyRamone)
- Fixed a bug where rewards were not normalized when doing hyperparameter optimization (caburu)
- Removed `nminibatches` from `ppo.yml` for `MountainCar-v0` and `Acrobot-v1`. (blurLake)
- Fixed `--save-replay-buffer` to be compatible with latest SB3 version
- Close environment at the end of training
- Updated DQN hyperparameters for simpler gym envs (due to an update in the implementation)
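To illustrate the `buffer_size` workaround above, a minimal sketch (the environment and model path are illustrative): overriding the attribute at load time keeps SB3 from allocating the full replay buffer.

```python
import gym

from stable_baselines3 import SAC

env = gym.make("Pendulum-v0")

# buffer_size=1 overrides the saved value before the replay buffer is
# created, so loading for inference does not allocate the full buffer
# (the path below is illustrative).
model = SAC.load("logs/sac/Pendulum-v0_1/best_model.zip", buffer_size=1)

obs = env.reset()
action, _ = model.predict(obs, deterministic=True)
```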
Documentation
Other
- Reformat `enjoy.py`, `test_enjoy.py`, `test_hyperparams_opt.py`, `test_train.py`, `train.py`, `callbacks.py`, `hyperparams_opt.py`, `utils.py`, `wrappers.py` (salmannotkhan)
- Reformat `record_video.py` (salmannotkhan)
- Added codestyle check `make lint` using `flake8` (see the note after this list)
- Reformat `benchmark.py` (salmannotkhan)
- Added GitHub CI
- Fixed most linter warnings
- Now using `black` and `isort` for auto-formatting
- Updated plots
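Usage note for the tooling entries above (exact targets and flags may differ; see the project `Makefile` and setup configuration): `make lint` runs the `flake8` checks locally, while `black .` and `isort .` apply the auto-formatting.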