- Added parallel stochastic in dv3: 225
- Updated dependencies and Python version: 230, 262, 263
- Added dv3 notebook for imagination and obs reconstruction: 232
- Created citation.cff: 233
- Added replay ratio for off-policy algorithms: 247
- Single strategy for the player (now it is instantiated in the `build_agent()` function): 244, 250, 258
- Proper `terminated` and `truncated` signals management: 251, 252, 253
- Added the possibility to choose whether or not to learn the initial recurrent state: 256
- Added A2C benchmarks: 266
- Added `prepare_obs()` function to all the algorithms: 267
- Improved code readability: 248, 265
- Bug fixes: 220, 222, 224, 231, 243, 255, 257