Important
We have reorganized the project repository. Please see the [Migrating from v0.4 to v0.5 documentation](../master/docs/Migrating.md#migrating-from-ml-agents-toolkit-v03-to-v04) for more information. Highlighted changes to repository structure include:
* The `python` folder has been renamed `ml-agents`. It now contains a Python package called `mlagents`.
* The `unity-environment` folder, containing the Unity project, has been renamed `UnitySDK`.
* The protobuf definitions used for communication have been added to a new `protobuf-definitions` folder.
* Example curricula and the trainer configuration file have been moved to a new `config` sub-directory.
Environments
To learn more about new and improved environments, see our [Example Environments page](../master/docs/Learning-Environment-Examples.md).
Improved
The following environments have been changed to use Multi Discrete Action:
* WallJump
* BananaCollector
The following environment has been modified to use Action Masking:
* GridWorld
New Features
* **[Gym]** New package `gym-unity`, which provides a gym interface to wrap `UnityEnvironment`. More information [here](../master/gym-unity/Readme.md).
* **[Training]** Can now run multiple concurrent training sessions with the `--num-runs=<n>` [command line option](../master/docs/Training-ML-Agents.md#command-line-training-options). (Training sessions are independent, and do not improve learning performance.)
* **[Unity]** [Meta-Curriculum](../master/docs/Training-Curriculum-Learning.md). Supports curriculum learning in multi-brain environments.
* **[Unity]** [Action Masking for Discrete Control](../master/docs/Learning-Environment-Design-Agents.md#masking-discrete-actions) - It is now possible to mask invalid actions each step to limit the actions an agent can take.
* **[Unity]** [Action Branches for Discrete Control](../master/docs/Learning-Environment-Design-Agents.md#discrete-action-space) - It is now possible to define discrete action spaces which contain multiple branches, each with its own space size.
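
To make the last two items concrete, here is a minimal conceptual sketch in plain Python of how action branches and action masking fit together. It is not ML-Agents API code: the function `sample_branched_action`, the branch sizes, and the mask layout are all hypothetical and chosen only for illustration. Each branch has its own number of choices, one index is picked per branch, and masked indices are never picked.

```python
import random


def sample_branched_action(branch_sizes, masks, rng):
    """Pick one action index per branch, skipping masked (invalid) indices.

    branch_sizes: list of ints, the number of choices in each branch.
    masks: list of lists of bools with matching shape; True means "masked".
    rng: a random.Random instance, passed in so results are reproducible.
    """
    action = []
    for size, mask in zip(branch_sizes, masks):
        # Keep only the indices that are not masked in this branch.
        allowed = [i for i in range(size) if not mask[i]]
        if not allowed:
            raise ValueError("every action in a branch is masked")
        action.append(rng.choice(allowed))
    return action


# Hypothetical layout: branch 0 is movement (4 choices), branch 1 is jump (2 choices).
branch_sizes = [4, 2]
# Mask movement index 3 for this step; leave the jump branch unrestricted.
masks = [[False, False, False, True], [False, False]]

action = sample_branched_action(branch_sizes, masks, random.Random(0))
print(action)  # one index per branch; the first is never 3
```

In this sketch the mask is rebuilt every step, which mirrors the "mask invalid actions each step" wording above. How the toolkit itself represents branches and masks is described in the linked documentation.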
Changes
* Can now visualize value estimates when using models trained with PPO from Unity with `GetValueEstimate()`.
* It is now possible to specify which camera the `Monitor` displays to.
* Console summaries will now be displayed even when running inference mode from Python.
* Minimum supported Unity version is now 2017.4.
Fixes & Performance Improvements
* Replaced some activation functions with `swish`.
* Visual Observations use PNG instead of JPEG to avoid compression losses.
* Improved Python unit tests.
* Fix to enable multiple training sessions on single GPU.
* Curriculum lessons are now tracked correctly.
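
For readers unfamiliar with the `swish` activation mentioned above, the following is a minimal sketch of its common definition, `x * sigmoid(x)`, in plain Python. It only illustrates the function itself. Which layers in the trainers were changed is not stated in these notes, and the helper names `sigmoid` and `swish` here are illustrative, not taken from the toolkit's code.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x):
    """Swish activation in its common form: x * sigmoid(x)."""
    return x * sigmoid(x)


# Swish is zero at the origin, close to the identity for large positive
# inputs, and close to zero for large negative inputs.
for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(value, swish(value))
```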
Known Issues
* Ending training early using `Ctrl+C` does not save the model on Windows.
* Sequentially opening and closing multiple instances of `UnityEnvironment` within a single process is not possible.
Acknowledgements
Thanks to everyone at Unity who contributed to v0.5.0, as well as: sterling000, bartlomiejwolk, Sohojoe, Phantomb.