Added
- Adds a `total_env_steps` counter to both `GymFitness` and `BraxFitness` for easier sample-efficiency comparisons with RL algorithms.
- Adds support for new strategies/genetic algorithms:
- SAMR-GA (Clune et al., 2008)
- GESMR-GA (Kumar et al., 2022)
- SNES (Wierstra et al., 2014)
- DES (Lange et al., 2022)
- Guided ES (Maheswaranathan et al., 2018)
- ASEBO (Choromanski et al., 2019)
- CR-FM-NES (Nomura & Ono, 2022)
- MR15-GA (Rechenberg, 1978)
- Adds the full set of BBOB low-dimensional functions (`BBOBFitness`; see the usage sketch after this list)
- Adds 2D visualizer animating sampled points (`BBOBVisualizer`)
- Adds `Evosax2JAXWrapper` to wrap all evosax strategies
- Adds Adan optimizer (Xie et al., 2022)
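A rough usage sketch for the new BBOB problems (the `BBOBFitness` constructor arguments and its `rollout` signature are assumptions modeled on the other evosax problem wrappers):

```python
import jax
from evosax import CMA_ES
from evosax.problems import BBOBFitness  # assumed import path

rng = jax.random.PRNGKey(0)
# Assumed constructor: BBOB function name and search dimensionality.
evaluator = BBOBFitness("Sphere", num_dims=2)

strategy = CMA_ES(popsize=16, num_dims=2)
es_params = strategy.default_params
state = strategy.initialize(rng, es_params)

for _ in range(50):
    rng, rng_ask, rng_eval = jax.random.split(rng, 3)
    x, state = strategy.ask(rng_ask, state, es_params)
    # Assumed rollout signature: (rng, candidates) -> fitness per candidate.
    fitness = evaluator.rollout(rng_eval, x)
    state = strategy.tell(x, fitness, state, es_params)
```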
Changed
- `ParameterReshaper` can now be applied directly from within the strategy: simply provide a `pholder_params` pytree at strategy instantiation (instead of `num_dims`). See the instantiation sketch after this list.
- `FitnessShaper` can also be applied directly from within the strategy, which makes it easier to track the best-performing member across generations and addresses issue 32. Simply pass the fitness-shaping settings as arguments to the strategy (`maximize`, `centered_rank`, ...).
- Removes Brax fitness (use EvoJAX version instead)
- Adds `lrate` and `sigma` schedules to strategy instantiation.
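A minimal sketch of the new instantiation pattern; the exact keyword names (`pholder_params`, `maximize`, `centered_rank`, and the `lrate_*`/`sigma_*` schedule arguments) follow the descriptions above and should be treated as assumptions:

```python
import jax
import jax.numpy as jnp
from evosax import OpenES

# Placeholder pytree describing the parameter shapes to evolve;
# the strategy applies ParameterReshaper internally (no num_dims needed).
pholder_params = {"w": jnp.zeros((8, 2)), "b": jnp.zeros(2)}

strategy = OpenES(
    popsize=64,
    pholder_params=pholder_params,  # replaces num_dims
    maximize=True,                  # fitness shaping handled inside the strategy
    centered_rank=True,
    lrate_init=0.05,                # assumed schedule keywords
    lrate_decay=0.999,
    sigma_init=0.03,
    sigma_decay=0.999,
)

rng = jax.random.PRNGKey(0)
es_params = strategy.default_params
state = strategy.initialize(rng, es_params)
# Candidates come back as a batch of pytrees matching pholder_params.
x, state = strategy.ask(rng, state, es_params)
```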
Fixed
- Fixed reward masking in `GymFitness`. Using `jnp.sum(dones) >= 1` for the cumulative return computation zeroed out the final timestep, which caused problems with sparse-reward gym environments (e.g. MountainCar). See the sketch after this list.
- Fixed PGPE sample indexing.
- Fixed weight decay: it was incorrectly multiplied by -1 when maximizing.
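For context, a standalone sketch of the masking bug and the corrected behavior (not the exact `GymFitness` code):

```python
import jax.numpy as jnp

rewards = jnp.array([0.0, 0.0, 0.0, 1.0, 0.0])  # sparse reward at the terminal step
dones = jnp.array([0.0, 0.0, 0.0, 1.0, 1.0])

# Buggy masking: the terminal step itself is zeroed out, so the sparse reward is lost.
buggy_valid = jnp.cumsum(dones) < 1
buggy_return = jnp.sum(rewards * buggy_valid)  # -> 0.0

# Fixed masking: only steps strictly after the first done are zeroed out.
valid = (jnp.cumsum(dones) - dones) == 0
fixed_return = jnp.sum(rewards * valid)  # -> 1.0
```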