Stable-baselines3

Latest version: v2.6.0

1.1.0

Breaking Changes
- Added support for Dictionary observation spaces (cf. SB3 doc)
- Upgraded to Stable-Baselines3 >= 1.1.0
- Added proper handling of timeouts for off-policy algorithms (cf. SB3 doc)
- Updated usage of logger (cf. SB3 doc)
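
The timeout handling noted above can be illustrated with a one-step TD target. This is a minimal sketch, not the library's code: the function name ``td_target`` and its arguments are hypothetical, and it assumes the environment reports a time-limit cutoff separately from a true terminal state.

```python
def td_target(reward, next_value, terminated, timed_out, gamma=0.99):
    """One-step TD target that treats a time-limit cutoff as non-terminal.

    Hypothetical helper for illustration only.
    """
    # Only a true termination stops bootstrapping; a timeout does not,
    # because the next state still has value even though the episode was cut.
    truly_done = terminated and not timed_out
    bootstrap = 0.0 if truly_done else 1.0
    return reward + gamma * next_value * bootstrap


# Episode cut by the time limit: the next-state value is still used.
print(td_target(1.0, 10.0, terminated=True, timed_out=True, gamma=0.9))
# Episode ended in a real terminal state: no bootstrapping.
print(td_target(1.0, 10.0, terminated=True, timed_out=False, gamma=0.9))
```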

Bug Fixes
- Removed unused code in ``TQC``

Others
- SB3 docs and tests dependencies are no longer required for installing SB3 contrib

Documentation
- Updated QR-DQN docs checkmark typo (minhlong94)

1.0

Blog post: https://araffin.github.io/post/sb3/

Breaking Changes

- Upgraded to Stable-Baselines3 v1.0

Bug Fixes
- Fixed a bug with ``QR-DQN`` predict method when using ``deterministic=False`` with image space

1.0rc1

1.0rc0

0.11.1

Breaking Changes

- Upgraded to Stable-Baselines3 >= 0.11.1

New Features

- Added ``TimeFeatureWrapper`` to the wrappers
- Added ``QR-DQN`` algorithm (ku2482)
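
As a rough illustration of what a time feature is, the sketch below appends the normalized remaining time to an observation. It is a simplified stand-in, not the ``TimeFeatureWrapper`` implementation: the function name ``add_time_feature``, the ``max_steps`` default, and the choice of a feature that decreases from 1.0 to 0.0 are assumptions.

```python
def add_time_feature(observation, current_step, max_steps=1000):
    """Return the observation with the fraction of remaining time appended.

    Hypothetical helper; observations are treated as flat sequences of floats.
    """
    # 1.0 at the start of the episode, 0.0 once max_steps is reached.
    remaining = 1.0 - current_step / max_steps
    return list(observation) + [remaining]


print(add_time_feature([0.5, -0.2], current_step=250, max_steps=1000))
```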

Bug Fixes

- Fixed bug in ``TQC`` when saving/loading the policy only with non-default number of quantiles
- Fixed bug in ``QR-DQN`` when calculating the target quantiles (ku2482, guyk1971)

Others

- Updated ``TQC`` to match new SB3 version
- Moved ``quantile_huber_loss`` to ``common/utils.py`` (ku2482)
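
For readers unfamiliar with the quantile Huber loss used by ``QR-DQN`` and ``TQC``, here is a pure-Python sketch of the idea for a single sample. It is not the function in ``common/utils.py``: the name ``quantile_huber_loss_sketch``, the scalar-list inputs, the midpoint quantile fractions, and the reduction (sum over current quantiles, mean over target quantiles) are assumptions made for illustration.

```python
def quantile_huber_loss_sketch(current_quantiles, target_quantiles, kappa=1.0):
    """Quantile Huber loss between two lists of quantile values (one sample)."""
    n = len(current_quantiles)
    total = 0.0
    for i, theta in enumerate(current_quantiles):
        # Midpoint quantile fraction associated with the i-th estimate.
        tau = (i + 0.5) / n
        for target in target_quantiles:
            u = target - theta
            abs_u = abs(u)
            # Huber loss: quadratic near zero, linear beyond kappa.
            if abs_u <= kappa:
                huber = 0.5 * u * u
            else:
                huber = kappa * (abs_u - 0.5 * kappa)
            # Asymmetric weight: penalize over- and under-estimation differently.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            total += weight * huber
    # Sum over current quantiles, mean over target quantiles.
    return total / len(target_quantiles)


print(quantile_huber_loss_sketch([0.0], [2.0]))
```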

0.10.0

Breaking Changes

- **Warning:** Renamed ``common.cmd_util`` to ``common.env_util`` for clarity (affects ``make_vec_env`` and ``make_atari_env`` functions)

New Features

- Allow custom actor/critic network architectures using ``net_arch=dict(qf=[400, 300], pi=[64, 64])`` for off-policy algorithms (SAC, TD3, DDPG)
- Added Hindsight Experience Replay ``HER``. (megan-klaiber)
- ``VecNormalize`` now supports ``gym.spaces.Dict`` observation spaces
- Support logging videos to Tensorboard (SwamyDev)
- Added ``share_features_extractor`` argument to ``SAC`` and ``TD3`` policies
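
The custom architecture option above would typically be supplied through ``policy_kwargs``. The fragment below only builds the configuration; the commented-out lines show one assumed way of passing it to ``SAC`` (the ``"MlpPolicy"`` and ``"Pendulum-v1"`` choices are illustrative and require Stable-Baselines3 and a matching environment to be installed).

```python
# Separate layer sizes for the critic (qf) and the actor (pi),
# using the values quoted in the release note.
policy_kwargs = dict(net_arch=dict(qf=[400, 300], pi=[64, 64]))

# Assumed usage, left commented out so the fragment has no dependencies:
# from stable_baselines3 import SAC
# model = SAC("MlpPolicy", "Pendulum-v1", policy_kwargs=policy_kwargs)

print(policy_kwargs)
```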

Bug Fixes

- Fix GAE computation for on-policy algorithms (off-by-one for the last value) (thanks Wovchena)
- Fixed potential issue when loading a different environment
- Fix ignoring the exclude parameter when recording logs using json, csv or log as logging format (SwamyDev)
- Make ``make_vec_env`` support the ``env_kwargs`` argument when using an env ID str (ManifoldFR)
- Fix model creation initializing CUDA even when ``device="cpu"`` is provided
- Fix ``check_env`` not checking if the env has a Dict action space before calling ``_check_nan`` (wmmc88)
- Update the check for spaces unsupported by Stable Baselines 3 to include checks on the action space (wmmc88)
- Fixed feature extractor bug for target network where the same net was shared instead of being separate. This bug affects ``SAC``, ``DDPG`` and ``TD3`` when using ``CnnPolicy`` (or custom feature extractor)
- Fixed a bug where an environment passed when loading a saved model with a ``CnnPolicy`` was not wrapped properly (the bug was introduced when implementing ``HER``, so it should not be present in previous versions)
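
To show where the "last value" enters the GAE computation mentioned in the first fix above, here is a minimal pure-Python sketch of Generalized Advantage Estimation for a single rollout. It does not reproduce the library's code or the exact bug: the function name ``compute_gae`` and the convention that ``dones[t]`` marks the transition at step ``t`` as terminal are assumptions.

```python
def compute_gae(rewards, values, last_value, dones, gamma=0.99, gae_lambda=0.95):
    """Return GAE advantages for one rollout.

    values[t] is the value estimate of the state at step t;
    last_value is the estimate for the state reached after the final step.
    """
    n = len(rewards)
    advantages = [0.0] * n
    last_gae = 0.0
    for t in reversed(range(n)):
        # The final step bootstraps from last_value, not from values[t].
        next_value = last_value if t == n - 1 else values[t + 1]
        non_terminal = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * non_terminal - values[t]
        last_gae = delta + gamma * gae_lambda * non_terminal * last_gae
        advantages[t] = last_gae
    return advantages


print(compute_gae([1.0], [0.5], last_value=2.0, dones=[False], gamma=0.5, gae_lambda=1.0))
```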

Others

- Improved typing coverage
- Improved error messages for unsupported spaces
- Added ``.vscode`` to the gitignore

Documentation

- Added first draft of migration guide
- Added intro to [imitation](https://github.com/HumanCompatibleAI/imitation) library (shwang)
- Enabled doc for ``CnnPolicies``
- Added advanced saving and loading example
- Added base doc for exporting models
- Added example for getting and setting model parameters
