Highlights
- **Collaborating local workers**: Workers now dynamically load experiments from the MongoDB, and even collaborate with Slurm
- Interactive **debugging** on Slurm (`--debug`) and remotely via debugpy (`--debug-server`; supported by VS Code)
- Advanced **config checking**: SEML now prevents duplicates, missing fixed/random/grid blocks, and invalid values in the `seml` and `slurm` blocks
- **Nested parameters** in dictionaries via dot-notation
- **`seml jupyter`** command to easily start a Jupyter notebook in Slurm
- Mattermost observer
- Slurm templates
Breaking changes
- `seml queue` is now `seml add`, and sets the experiment state to `STAGED`, in analogy to `git`. SEML still correctly handles old experiments in the `QUEUED` state.
- SEML now strictly disallows all forms of duplicate parameters and blocks in the config. The only exception is sub-configs, in which you can overwrite more general parameters.
- SEML now disallows configs and sub-configs that have no fixed/random/grid blocks, as well as unrecognized values in the `seml` and `slurm` blocks
- parameter collections are deprecated in favor of dot-notation
- removed `--unobserved` option
- renamed `--dry-run` to `--print-command`
- renamed `--no-config-check` to `--no-sanity-check`
- `--output-to-console` now prints to the console _in addition to_ the output file (disable file output via `--no-file-output`)
- removed backwards compatibility for specifying `db_collection_name` in config
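
To make the stricter duplicate rule concrete, here is a minimal sketch of such a check. It is not SEML's actual implementation: the function name `find_duplicate_parameters` is hypothetical, and treating a dot-notation prefix (`model` vs. `model.lr`) as a conflict is an assumption made for illustration.

```python
from typing import Dict, List

# Block names as they appear in these notes.
BLOCKS = ("fixed", "grid", "random")


def find_duplicate_parameters(config: Dict[str, dict]) -> List[str]:
    """Return messages for parameters defined in more than one block.

    Hypothetical helper, not part of SEML. Two names conflict when they are
    equal or when one is a dot-notation prefix of the other (an assumption).
    """
    seen: Dict[str, str] = {}  # parameter name -> block that defined it
    conflicts: List[str] = []
    for block in BLOCKS:
        for name in config.get(block, {}):
            for other, other_block in seen.items():
                if other_block == block:
                    continue  # only compare across different blocks
                same = name == other
                nested = name.startswith(other + ".") or other.startswith(name + ".")
                if same or nested:
                    conflicts.append(
                        f"{name} ({block}) conflicts with {other} ({other_block})"
                    )
            seen[name] = block
    return conflicts


example = {
    "fixed": {"model.lr": 0.01, "epochs": 10},
    "grid": {"model": {"type": "choice"}, "batch_size": {"type": "choice"}},
    "random": {"epochs": {"type": "uniform"}},
}
conflicts = find_duplicate_parameters(example)
```

In this example, `model` (grid) collides with `model.lr` (fixed), and `epochs` is defined in both the fixed and random blocks, so two conflicts are reported.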
Features
- Workers now dynamically load experiments from the MongoDB. `seml start --local` starts one worker. You can then spin up more workers with `seml launch-worker`. Workers can even take over pending experiments from Slurm with the `--steal-slurm` option (without race conditions). Moreover, we added GPU and CPU resource options to conveniently manage multiple workers on the same system.
- Interactive debugging in Slurm via `--debug`
- **`--debug-server`**: Attach your IDE to your experiment as a debug client via debugpy (supported by VS Code; PyCharm doesn't support anything comparable)
- **`seml jupyter`** command to easily start a Jupyter notebook in Slurm
- detect and prevent duplicates in config files. Warn about redefinitions in sub-configs.
- **Nested parameters** in dictionaries via dot-notation
- New advanced example with nested parameters, dictionaries and Sacred prefixes
- **Mattermost observer**: Send messages on experiment start, completion, failure, interruption, and/or periodically (e.g. every hour/day), including any experiment info (e.g. current accuracy, epoch)
- **Slurm templates**: Define custom templates for the sbatch options of different kinds of jobs/experiments (Jupyter, GPU, CPU, long, high-priority, ...) in `settings.py` and easily use them in your SEML configs
- `filter_dict` and `mongodb_config` parameters in `get_results`
- `--no-code-checkpoint` option to prevent code storage. Useful e.g. for debugging.
- `--no-file-output` option to stop printing to an output file
- some first tests (with GitHub Actions, thanks to sigeisler)
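
The nested-parameter feature above maps dotted keys onto nested dictionaries. The following is a rough sketch of that idea under stated assumptions: `unflatten` is a hypothetical helper, not SEML's API, and raising an error on clashing keys is an illustrative choice.

```python
from typing import Any, Dict


def unflatten(flat: Dict[str, Any], sep: str = ".") -> Dict[str, Any]:
    """Expand dotted keys such as 'model.optimizer.lr' into nested dicts.

    Hypothetical helper for illustration only.
    """
    nested: Dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(sep)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                # e.g. 'model' was already set to a plain value
                raise ValueError(f"'{dotted_key}' clashes with a non-dict value")
        if leaf in node:
            raise ValueError(f"'{dotted_key}' is defined more than once")
        node[leaf] = value
    return nested


params = unflatten({"model.optimizer.lr": 0.01, "model.hidden": 64, "seed": 1})
# params == {"model": {"optimizer": {"lr": 0.01}, "hidden": 64}, "seed": 1}
```
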
Bug fixes
- fix distribution in zip files and Python eggs
- print full Slurm node list (thanks to jayargo)
- fix missing directory in `seml configure`
- switch to stdlib `random` for True/False samples (thanks to cqql)
- fix handling of configuration files without a `slurm` block (thanks to johannesjung)
- correctly detect killed jobs in disjoint, long, and singleton job array ranges
- use `None` as defaults to avoid changing the Python-internal default values
- empty `logs` folder in example to avoid errors
- correctly handle numbers as collection names
- fix TF1 GPU memory stat (thanks to mnchsmn)
- catch invalid sub-configs that do not contain a fixed/random/grid block
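
For context on the job array fix above, the sketch below shows one way to expand an array index specification into individual task indices, covering disjoint lists, long ranges, and single indices. It is not SEML's code: `parse_array_indices` is a hypothetical helper, and the accepted format is assumed to follow Slurm's `--array` syntax (comma-separated indices and ranges, an optional `:step`, and an optional `%limit` suffix).

```python
from typing import List


def parse_array_indices(spec: str) -> List[int]:
    """Expand a spec like '0-3,7,10-12' into a list of task indices.

    Hypothetical helper; the input format is an assumption based on
    Slurm's `--array` option.
    """
    spec = spec.split("%", 1)[0]  # drop the optional "%N" concurrency limit
    indices: List[int] = []
    for part in spec.split(","):
        bounds, _, step = part.partition(":")
        start, _, end = bounds.partition("-")
        if end:
            # Inclusive range, optionally with a step.
            indices.extend(range(int(start), int(end) + 1, int(step or 1)))
        else:
            # Singleton index.
            indices.append(int(start))
    return indices


disjoint = parse_array_indices("0-3,7,10-12")
singleton = parse_array_indices("5")
stepped = parse_array_indices("0-8:4%2")
```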