- Isolate agents by processes instead of threads.
- Updated Documentation.
- New Environments: Battle Geese and Halite.
- Split timeout into: agent initialization, agent act, and episode duration.
- Reduced replay file size by sharing identical observations.
- Added Server Middleware.
- Configuration defaults and environment extensions.
- Training reward is now the difference between step rewards.
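
To illustrate the training-reward change, here is a minimal sketch. It assumes that each step reports a cumulative reward and that the training reward is the difference between consecutive steps, starting from zero. The function name `training_rewards` is hypothetical and is not part of the library's API.

```python
def training_rewards(step_rewards):
    """Turn a sequence of per-step rewards into step-to-step differences.

    Assumption: rewards are cumulative and the baseline before the
    first step is 0.
    """
    previous = 0
    diffs = []
    for reward in step_rewards:
        diffs.append(reward - previous)
        previous = reward
    return diffs


cumulative = [0, 0, 1, 3, 3]
print(training_rewards(cumulative))  # [0, 0, 1, 2, 0]
```

Under this assumption, a learner sees a non-zero signal only on the steps where the reward actually changed.
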
Bug Fixes
- ConnectX out-of-bounds error.
- Prevent state from being shared with agents by reference.
- Scrambled ASCII characters in the legend.
- Stop storing invalid actions in replays.
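
The shared-state fix above can be pictured with a short sketch. It assumes the general technique of handing each agent a deep copy of the state; the names `run_agent` and `mutating_agent` are hypothetical, and the library's actual implementation may differ.

```python
import copy


def run_agent(agent, state):
    # Give the agent its own deep copy so that in-place mutations
    # cannot leak back into the environment's state.
    return agent(copy.deepcopy(state))


def mutating_agent(observation):
    # A misbehaving agent that edits the observation it receives.
    observation["board"][0] = 99
    return 0


state = {"board": [0, 0, 0]}
action = run_agent(mutating_agent, state)
print(state["board"])  # [0, 0, 0]
```

Without the copy, the agent's write to `observation["board"]` would also change `state["board"]`, because both names would refer to the same list.
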