model = QRDQN("MlpPolicy", env, learning_starts=100)
New Features:
- Added ``rollout_buffer_class`` and ``rollout_buffer_kwargs`` arguments to MaskablePPO
- Log success rate ``rollout/success_rate`` when available for on-policy algorithms
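
As a rough illustration of what a ``rollout/success_rate`` value represents, the sketch below averages an ``is_success`` flag over recently finished episodes. It assumes the environment reports ``is_success`` in its ``info`` dict when an episode ends. The helper names ``update_success_buffer`` and ``success_rate`` and the window size of 100 are hypothetical, chosen for this example only; this is not the library's own code.

```python
from collections import deque
from statistics import mean


def update_success_buffer(success_buffer, infos, dones):
    """Append the is_success flag of every episode that just ended."""
    for info, done in zip(infos, dones):
        is_success = info.get("is_success")
        # Skip environments that do not report success or are still running.
        if is_success is not None and done:
            success_buffer.append(float(is_success))


def success_rate(success_buffer):
    """Return the mean success over the window, or None if nothing was recorded."""
    return mean(success_buffer) if success_buffer else None


# Hypothetical window of the 100 most recent episodes.
buffer = deque(maxlen=100)

# One vectorized step over three environments:
# two episodes end (one success, one failure), one is still running.
infos = [{"is_success": True}, {"is_success": False}, {}]
dones = [True, True, False]

update_success_buffer(buffer, infos, dones)
rate = success_rate(buffer)
```

Returning ``None`` for an empty window mirrors the "when available" wording above: nothing is reported until at least one finished episode has carried the flag.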
Others:
- Fixed ``train_freq`` type annotation for ``TQC`` and ``QRDQN`` (Armandpl)
- Fixed ``sb3_contrib/common/maskable/*.py`` type annotations
- Fixed ``sb3_contrib/ppo_mask/ppo_mask.py`` type annotations
- Fixed ``sb3_contrib/common/vec_env/async_eval.py`` type annotations
Documentation:
- Added notes about ``MaskablePPO`` (evaluation and multi-process) (icheered)
**Full Changelog**: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/compare/v2.2.1...v2.3.0