RL Baselines3 Zoo

Latest version: v1.6.2

1.6.2

Breaking Changes
- RL Zoo is now a Python package
- Low-pass filter was removed

New Features
- RL Zoo cli: `rl_zoo train` and `rl_zoo enjoy`
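
The new entry points can be sketched as below (a hedged example: the algo/env names are illustrative, and the snippet only probes for the CLI rather than assuming the package is installed):

```shell
# Hypothetical usage of the new entry points (algo/env names are examples);
# they replace the old `python train.py` / `python enjoy.py` invocations:
#   rl_zoo train --algo ppo --env CartPole-v1
#   rl_zoo enjoy --algo ppo --env CartPole-v1 -f logs/
# Probe for the CLI without assuming it is installed:
if command -v rl_zoo >/dev/null 2>&1; then status="installed"; else status="not installed"; fi
echo "rl_zoo CLI: $status"
```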

Bug fixes

Documentation

Other

1.6.1

**Progress bar and custom yaml file**

Breaking Changes
- Upgraded to Stable-Baselines3 (SB3) >= 1.6.1
- Upgraded to sb3-contrib >= 1.6.1

New Features
- Added `--yaml-file` argument option for `train.py` to read hyperparameters from custom yaml files (JohannesUl)
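
A minimal sketch of the new option (file name and hyperparameter values are made up for illustration; the `train.py` invocation is shown commented out):

```shell
# Write a custom hyperparameter file and point train.py at it.
# File name and values below are hypothetical examples.
cat > my_hyperparams.yml <<'EOF'
CartPole-v1:
  n_timesteps: 50000
  policy: 'MlpPolicy'
EOF
# python train.py --algo ppo --env CartPole-v1 --yaml-file my_hyperparams.yml
echo "wrote my_hyperparams.yml"
```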

Bug fixes
- Added `custom_object` parameter on `record_video.py` (Affonso-Gui)
- Changed `optimize_memory_usage` to `False` for DQN/QR-DQN on `record_video.py` (Affonso-Gui)
- In `ExperimentManager` `_maybe_normalize` set `training` to `False` for eval envs,
to prevent normalization stats from being updated in eval envs (e.g. in EvalCallback) (pchalasani).
- Only one env is used to get the action space while optimizing hyperparameters and it is correctly closed (SammyRamone)
- Added progress bar via the `-P` argument using tqdm and rich

Documentation

Other

1.6.0

**RecurrentPPO (ppo_lstm) and Huggingface integration**

Breaking Changes
- Change default value for number of hyperparameter optimization trials from 10 to 500. (ernestum)
- Derive number of intermediate pruning evaluations from number of time steps (1 evaluation per 100k time steps.) (ernestum)
- Updated default `--eval-freq` from 10k to 25k steps
- Update default horizon to 2 for the `HistoryWrapper`
- Upgrade to Stable-Baselines3 (SB3) >= 1.6.0
- Upgrade to sb3-contrib >= 1.6.0

New Features
- Support setting PyTorch's device with the `--device` flag (gregwar)
- Add `--max-total-trials` parameter to help with distributed optimization. (ernestum)
- Added `vec_env_wrapper` support in the config (works the same as `env_wrapper`)
- Added Huggingface hub integration
- Added `RecurrentPPO` support (aka `ppo_lstm`)
- Added autodownload for "official" sb3 models from the hub
- Added Humanoid-v3, Ant-v3, Walker2d-v3 models for A2C (pseudo-rnd-thoughts)
- Added MsPacman models
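
As a sketch, `vec_env_wrapper` can be set in a hyperparameter file just like `env_wrapper` (the env id and wrapper choice below are illustrative; `VecMonitor` is a standard SB3 vec-env wrapper):

```shell
# Hypothetical config entry (env id and wrapper choice are examples):
# wrap the vectorized env with SB3's VecMonitor via vec_env_wrapper.
cat > vec_wrapper_example.yml <<'EOF'
CartPole-v1:
  vec_env_wrapper: stable_baselines3.common.vec_env.VecMonitor
  n_timesteps: 50000
  policy: 'MlpPolicy'
EOF
grep "vec_env_wrapper" vec_wrapper_example.yml
```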

Bug fixes
- Fix `Reacher-v3` name in PPO hyperparameter file
- Pinned ale-py==0.7.4 until new SB3 version is released
- Fix enjoy / record videos with LSTM policy
- Fix bug with environments that have a slash in their name (ernestum)
- Changed `optimize_memory_usage` to `False` for DQN/QR-DQN on Atari games;
if you want to save RAM, you need to deactivate `handle_timeout_termination`
in the `replay_buffer_kwargs`
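
A hedged sketch of such an override, following the note above (the env id and exact yaml shape are illustrative):

```shell
# Hypothetical override (env id and yaml shape are illustrative):
# re-enable the memory optimization and drop timeout handling to save RAM.
cat > dqn_memory_patch.yml <<'EOF'
PongNoFrameskip-v4:
  optimize_memory_usage: true
  replay_buffer_kwargs:
    handle_timeout_termination: false
EOF
grep "handle_timeout_termination" dqn_memory_patch.yml
```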

Documentation

Other
- When pruner is set to `"none"`, use `NopPruner` instead of diverted `MedianPruner` (qgallouedec)

1.5.0

**Support for Weights and Biases experiment tracking**

Breaking Changes
- Upgrade to Stable-Baselines3 (SB3) >= 1.5.0
- Upgrade to sb3-contrib >= 1.5.0
- Upgraded to gym 0.21

New Features
- Verbose mode for each trial (when doing hyperparam optimization) can now be activated using the debug mode (verbose == 2)
- Support experiment tracking via Weights and Biases via the `--track` flag (vwxyzjn)
- Support tracking raw episodic stats via `RawStatisticsCallback` (vwxyzjn, see https://github.com/DLR-RM/rl-baselines3-zoo/pull/216)
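
Tracking can be switched on from the command line; a hedged sketch (the invocation is shown commented out, since it requires the `wandb` package and a prior `wandb login`):

```shell
# Hypothetical invocation (algo/env names are examples):
#   python train.py --algo ppo --env CartPole-v1 --track
# Probe for the wandb CLI without assuming it is installed:
if command -v wandb >/dev/null 2>&1; then status="wandb available"; else status="wandb not installed"; fi
echo "$status"
```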

Bug fixes
- Policies saved during optimization with distributed Optuna now load on new systems (jkterry)
- Fixed script for recording video that was not up to date with the enjoy script

Documentation

Other

1.4.0

Breaking Changes
- Dropped Python 3.6 support
- Upgrade to Stable-Baselines3 (SB3) >= 1.4.0
- Upgrade to sb3-contrib >= 1.4.0

New Features
- Added mujoco hyperparameters
- Added MuJoCo pre-trained agents
- Added script to parse best hyperparameters of an optuna study
- Added TRPO support
- Added ARS support and pre-trained agents

Bug fixes

Documentation
- Replace front image

Other

1.3.0

**rliable plots and bug fixes**

**WARNING: This version will be the last one supporting Python 3.6 (end of life in Dec 2021). We highly recommend that you upgrade to Python >= 3.7.**

Breaking Changes
- Upgrade to panda-gym 1.1.1
- Upgrade to Stable-Baselines3 (SB3) >= 1.3.0
- Upgrade to sb3-contrib >= 1.3.0

New Features
- Added support for using rliable for performance comparison

Bug fixes
- Fix training with Dict obs and channel last images

Documentation

Other
- Updated docker image
- Constrained gym version: gym>=0.17,<0.20
- Better hyperparameters for A2C/PPO on Pendulum
