Features 🔊
- Render environments to Weights & Biases (W&B)
- Grouping of runs in W&B
- SampleToInsertRatio rate limiter
- Global Gradient Clipping to avoid exploding gradients
- Softplus for numerical stability
- YAML configuration file
- LogCosh instead of Huber loss
- Critic network with an Add layer applied to the state & action branches
- Custom uniform initializer
- XLA (Accelerated Linear Algebra) compiler
- Optimized Replay Buffer (https://github.com/deepmind/reverb/issues/90)
- Split into **Agent**, **Learner**, **Tester** and **Server**
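
As a rough illustration of two of the items above (LogCosh loss and global gradient clipping), here is a minimal pure-Python sketch. It is not the project's code: the function names `softplus`, `log_cosh` and `clip_by_global_norm` are hypothetical, and writing log-cosh in terms of softplus is just one common way to keep it numerically stable, not necessarily where this release uses softplus.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x)); never exponentiates a positive number."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_cosh(error: float) -> float:
    """log(cosh(error)) via the identity |e| + softplus(-2|e|) - log(2).

    Behaves like 0.5 * e**2 near zero and like |e| - log(2) for large errors,
    which is why it can stand in for the Huber loss without a threshold.
    """
    a = abs(error)
    return a + softplus(-2.0 * a) - math.log(2.0)


def clip_by_global_norm(grads, clip_norm):
    """Rescale all gradients by one shared factor so their joint L2 norm <= clip_norm.

    `grads` is a list of flat lists of floats (one list per variable).
    Returns the clipped gradients and the norm measured before clipping.
    """
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    scale = min(1.0, clip_norm / global_norm) if global_norm > 0.0 else 1.0
    return [[g * scale for g in grad] for grad in grads], global_norm


if __name__ == "__main__":
    # A naive math.log(math.cosh(1000.0)) overflows; the stable form does not.
    print(log_cosh(1000.0))
    clipped, norm = clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0)
    print(norm, clipped)
```

Because every gradient is scaled by the same factor, the update direction is preserved and only its magnitude is bounded, unlike clipping each variable independently.
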
Bug fixes 🛠️
- Fixed creation of the save path for models
- Fixed model's `summary()`