--------------------------

* Fix problem with soft update of target network where the update resulted in a wrong mixture.
* Fix problem with init_steps and update_steps not being saved in the initialize method of the agent.

0.2.3
--------------------------

* Fix problem with the usage of the policy network for determining q-values in a double-q learning scenario.