This release introduces a set of reinforcement learning tools based on [ray[rllib]](https://docs.ray.io/en/master/rllib.html). In addition, it is now easier to replay and compare trajectories stored in different formats (sequences of states, log files, log data, and the current simulation).
New features
* [python/simulator] Replay extra logs/trajectories with current simulation.
* [gym/toolbox/rllib] Add dedicated rllib toolbox.
* [gym/toolbox/rllib] Provide [PPO CAPS](https://arxiv.org/pdf/2012.06644.pdf) implementation.
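
For readers unfamiliar with CAPS, the idea in the linked paper is to add two smoothness regularizers to the policy objective: a temporal term that penalizes differences between actions at consecutive states, and a spatial term that penalizes differences between the action at a state and the action at a nearby perturbed state. The snippet below is a minimal, dependency-free sketch of that penalty, not the implementation shipped in this toolbox. The names `caps_penalty`, `lambda_t`, `lambda_s`, and `sigma` are hypothetical, and the Euclidean distance and Gaussian perturbation are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]
Policy = Callable[[Vector], List[float]]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _mean(values: List[float]) -> float:
    """Arithmetic mean, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def caps_penalty(
    policy: Policy,
    states: Sequence[Vector],
    lambda_t: float = 1.0,   # hypothetical weight of the temporal term
    lambda_s: float = 1.0,   # hypothetical weight of the spatial term
    sigma: float = 0.05,     # assumed std of the state perturbation
    rng: Optional[random.Random] = None,
) -> float:
    """Smoothness penalty over one trajectory of consecutive states.

    The result is meant to be added to a loss that is minimized
    (or subtracted from an objective that is maximized).
    """
    rng = rng or random.Random(0)
    actions = [policy(s) for s in states]

    # Temporal term: distance between actions at consecutive time steps.
    temporal = [l2_distance(a, b) for a, b in zip(actions, actions[1:])]

    # Spatial term: distance between the action at a state and the
    # action at a Gaussian-perturbed copy of that state.
    spatial = []
    for state, action in zip(states, actions):
        noisy = [x + rng.gauss(0.0, sigma) for x in state]
        spatial.append(l2_distance(action, policy(noisy)))

    return lambda_t * _mean(temporal) + lambda_s * _mean(spatial)


if __name__ == "__main__":
    trajectory = [[0.0], [1.0], [2.0]]
    # A constant policy is perfectly smooth, so it incurs no penalty.
    print(caps_penalty(lambda s: [0.0], trajectory))
    # A policy that copies the state is penalized for changing over time.
    print(caps_penalty(lambda s: list(s), trajectory))
```

In a PPO setting, such a penalty would typically be computed on the mean of the action distribution and combined with the usual surrogate loss; how the weights are tuned is task dependent.
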
Improvements
* [python/viewer] Refactor Panda3D backend screen refresh and Qt widget to avoid edge cases.
* [gym/toolbox/rllib] Do not provide a default log directory, since it is error-prone.