Version `0.6.6` fixes bugs in the continual learning and continual evaluation settings, and updates the competition tutorial so that its default hyperparameters make sense with the current reward scaling of the TrackMania environment.
- It is now possible to use an infinite number of episodes or samples
- The SAC hyperparameters in the competition tutorial have been adjusted to be meaningful; wandb logging is now also deactivated by default in this script (uncomment it if needed)
- The competition evaluation script now works properly
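
To illustrate what an infinite number of episodes means in practice, here is a minimal sketch of a loop that runs either a fixed number of episodes or indefinitely. It is not the library's implementation: `episode_indices`, `run`, `nb_episodes`, and `should_stop` are hypothetical names chosen for this example, and using `math.inf` as the "no limit" value is an assumption.

```python
import itertools
import math


def episode_indices(nb_episodes=math.inf):
    """Return an iterator over episode indices (hypothetical helper).

    An infinite nb_episodes gives an unbounded counter; a finite one
    gives the usual range.
    """
    if nb_episodes == math.inf:
        return itertools.count()
    return iter(range(int(nb_episodes)))


def run(nb_episodes=math.inf, should_stop=lambda episode: False):
    """Run episodes until the budget is used up or should_stop returns True."""
    completed = 0
    for episode in episode_indices(nb_episodes):
        if should_stop(episode):
            break
        # Placeholder: one episode would be collected or evaluated here.
        completed += 1
    return completed
```

With a finite budget, `run(3)` completes three episodes. With the default infinite budget, the loop only ends when the stopping condition fires, so an external stop signal is needed in that case.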