### Added
- 3D animation post-processing helpers (`animate_3D_scatter` and `animate_3D_surface`) and test coverage for visualizations (static/dynamic).
- [`nevergrad`](https://facebookresearch.github.io/nevergrad/index.html) multi-objective hyperparameter optimization. Check out the toy [example](examples/toy_multiobj).
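Multi-objective optimization keeps the set of Pareto-optimal configurations instead of a single best score. A minimal, library-free sketch of the dominance filter (this is a conceptual illustration, not the `nevergrad` API; the helpers `dominates` and `pareto_front` are hypothetical names):

```python
def dominates(a, b):
    """True if score vector `a` Pareto-dominates `b` (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scored):
    """Keep (config, scores) pairs not dominated by any other candidate."""
    return [
        (cfg, s) for cfg, s in scored
        if not any(dominates(other, s) for _, other in scored)
    ]


candidates = [({"lr": 0.1}, (0.3, 0.8)),
              ({"lr": 0.01}, (0.5, 0.2)),
              ({"lr": 0.5}, (0.6, 0.9))]  # dominated by the first candidate
front = pareto_front(candidates)
```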
- Adds `experiment` decorator for easy integration:

```python
from mle_toolbox import experiment


@experiment("configs/abc.json", model_config={"num_layers": 2})
def run(mle, a):
    print(mle.model_config)
    print(mle.log)
    print(a)


if __name__ == "__main__":
    run(a=2)
```
- Adds `combine_experiments`, which loads different `meta_log` and `hyper_log` objects and makes them "dot"-accessible:

```python
from mle_toolbox import combine_experiments

experiment_dirs = ["../tests/unit/fixtures/experiment_1",
                   "../tests/unit/fixtures/experiment_2"]
meta, hyper = combine_experiments(experiment_dirs, aggregate_seeds=False)
```
- Adds an option to run a grid search over multiple base configurations without having to create individual experiment configuration files.
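The combined search can be pictured as the Cartesian product of base configurations and grid values. A stdlib sketch of that expansion (the dict layout and file names are illustrative assumptions, not the toolbox's internal job format):

```python
import itertools

# Hypothetical base configurations and grid definition.
base_configs = ["configs/cnn.yaml", "configs/mlp.yaml"]
grid = {"lrate": [0.001, 0.01], "batch_size": [32, 64]}

# Expand every base config with every grid combination.
keys, values = zip(*grid.items())
jobs = [
    {"base_config": base, **dict(zip(keys, combo))}
    for base in base_configs
    for combo in itertools.product(*values)
]
# 2 base configs x 2 lrates x 2 batch sizes = 8 jobs
```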
### Changed
- Configuration loading is now more toolbox-specific. `load_json_config` and `load_yaml_config` are now part of `mle-logging`. The toolbox now has two "new" functions, `load_job_config` and `load_experiment_config`, which prepare the raw configs for later use.
- The `job_config` file no longer has to be a `.json` file; it can (and probably should) be a `.yaml` file, which makes formatting easier. The hyperoptimization pipeline generates configuration files of the same file type.
- Moves core hyperparameter optimization functionality to [`mle-hyperopt`](https://github.com/RobertTLange/mle-hyperopt). At this point the toolbox wraps around the search strategies and handles the `mle-logging` log loading/data retrieval.
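The wrapping pattern is an ask/tell loop around a search strategy. A library-free sketch of that loop (the `RandomStrategy` class below is a toy stand-in, not the `mle-hyperopt` API):

```python
import random


class RandomStrategy:
    """Toy search strategy exposing an ask/tell interface."""

    def __init__(self, lrate_range):
        self.lrate_range = lrate_range
        self.history = []

    def ask(self):
        # Propose a candidate configuration.
        lo, hi = self.lrate_range
        return {"lrate": random.uniform(lo, hi)}

    def tell(self, config, score):
        # Record the evaluated configuration and its score.
        self.history.append((config, score))


strategy = RandomStrategy(lrate_range=(1e-4, 1e-1))
for _ in range(5):
    config = strategy.ask()
    score = config["lrate"] ** 2  # dummy objective to minimize
    strategy.tell(config, score)

best_config, best_score = min(strategy.history, key=lambda t: t[1])
```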
- Reduces test suite since all hyperopt strategy-internal tests are taken care of in `mle-hyperopt`.
### Fixed
- Fixed unique file naming of zip files stored in GCS bucket. Now based on the time string.
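A timestamp-based name of this kind can be built with a second-resolution time string; a minimal sketch (the exact naming scheme and the `unique_zip_name` helper are assumptions, not the format used for the GCS upload):

```python
from datetime import datetime


def unique_zip_name(experiment_id: str) -> str:
    # Second-resolution timestamp avoids name collisions between uploads.
    time_str = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"{experiment_id}_{time_str}.zip"


name = unique_zip_name("exp_1")
```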
- Grid engine monitoring now also tracks waiting/pending jobs.
- Fixes a bug in the random seed setting for synchronous batch jobs. Previously, a new set of seeds was sampled for each batch, which led to problems when aggregating different logs by their seed id. Now the first set of seeds is stored and provided as an input to all subsequent `JobQueue` startups.
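The fix amounts to sampling the seed list once and reusing it for every later batch; a minimal sketch (the `JobQueue` class and `launch_batches` helper here are hypothetical, not the toolbox's actual interface):

```python
import random


class JobQueue:
    """Toy queue that records which seeds it was started with."""

    def __init__(self, seeds):
        self.seeds = seeds


def launch_batches(num_batches, num_seeds, stored_seeds=None):
    # Sample seeds only once; every batch reuses the same list so logs
    # can later be aggregated by a consistent seed id.
    if stored_seeds is None:
        stored_seeds = random.sample(range(10_000), num_seeds)
    queues = [JobQueue(stored_seeds) for _ in range(num_batches)]
    return queues, stored_seeds


queues, seeds = launch_batches(num_batches=3, num_seeds=2)
```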