This release includes breaking changes to how jobs and experiments are associated. It also adds many convenience commands.
New Experiment - Slurm Job Association
Previously, each experiment was directly associated with a specific job (by default, 1:1 or 1:n if `experiments_per_job: n`). This was possible because each seml config could contain only a single slurm config. Now, the `slurm` block accepts an array of slurm configs. As a consequence, an experiment does not know at submission time which Slurm job will execute it. Instead, each Slurm job greedily pulls pending experiments. This allows the following configuration to better utilize larger GPUs:
```yaml
slurm:
  - experiments_per_job: 1
    sbatch_options:
      gres: gpu:1
      partition: gpu_gtx1080
  - experiments_per_job: 8
    sbatch_options:
      gres: gpu:1
      partition: gpu_a100
```
This will now spawn two job arrays, one on the `gpu_gtx1080` partition and one on the `gpu_a100` partition, which both greedily pull experiments to execute.
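To illustrate the greedy pulling described above, here is a minimal sketch in plain Python. It is not seml's actual implementation: the shared `pending` queue, the `greedy_pull` helper, and the experiment IDs are hypothetical, and a real deployment would need to claim experiments atomically in the database rather than from an in-memory queue.

```python
from collections import deque


def greedy_pull(pending, experiments_per_job):
    """Claim up to `experiments_per_job` experiments from the shared queue.

    Hypothetical helper for illustration only; not part of seml's API.
    """
    claimed = []
    while pending and len(claimed) < experiments_per_job:
        claimed.append(pending.popleft())
    return claimed


# Hypothetical pool of ten pending experiment IDs.
pending = deque(range(10))

# One job per slurm config from the example above. Each job takes as many
# experiments as its `experiments_per_job` allows, in arrival order.
gtx1080_batch = greedy_pull(pending, experiments_per_job=1)
a100_batch = greedy_pull(pending, experiments_per_job=8)

print(gtx1080_batch, a100_batch, list(pending))
```

In this sketch the job on the larger GPU claims eight experiments while the smaller one claims a single experiment; whatever remains in `pending` would be picked up by the next job that starts.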
Breaking Changes
The document layout for an experiment in the MongoDB has been altered! Old collections are migrated to the new format automatically, and the migration cannot be reverted. Call `seml --migration-backup <col>` if you want seml to create a backup first.
* The document layout for the MongoDB has been altered. Automatic migration is added, though we urge users to be careful!
* When running with multiple experiments per job, each job now gets its own log file.
* This is likely the last release supporting Python 3.8 and Python 3.9.
Features
* seml now supports multi-process jobs (and multi-node jobs) (135)
* seml now supports multiple slurm configurations (137)
* The collection cache for autocompletion is automatically refreshed when calling `add`, `delete` or `drop`.
* We now support src-layouts by converting them to a flat-layout at runtime.
* Added `seml <col> restore-sources <path>` to restore source files from the MongoDB.
* Added `seml <col> hold` and `seml <col> release` to hold and release slurm jobs.
* Added `seml <col> print-experiment` to retrieve experiment documents from the database.
* Added `seml queue` to print the collection of slurm jobs (only works for jobs submitted with this version or newer).
* For debugging, seml now supports clickable vscode links (133)
* Experiment names may now contain `/` to create log directories.
* Experiments are now cancelled by default when they are deleted.
* seml also suggests commands that do not need a collection, such as `drop`, `queue` or `list`, in autocompletion.
* The CLI has been organized into groups.
* Autocompletion got faster.
Development
* Updated CI to use pre-commit and `uv` instead of `pip`.
* The repo has been restructured.
Fixes
* SSH connections are now handled in a separate process that tracks its health.
* Fixed multi-user SSH port forwarding.
* When encountering non-Unicode symbols while reading the file output, we now replace them with the Unicode replacement character.
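
The replacement behaviour in the last fix can be sketched with Python's built-in `errors="replace"` decoding mode. This is only an illustration of the technique: the byte string below is made up, and whether seml decodes output in exactly this way is an assumption.

```python
# Hypothetical raw output containing two bytes that are invalid in UTF-8.
raw = b"loss: 0.42 \xff\xfe done"

# errors="replace" substitutes U+FFFD (the Unicode replacement character)
# for each byte that cannot be decoded, instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", errors="replace")

print(text)
```

The same `errors="replace"` argument is accepted by `open()`, so a log file can be read without failing on stray bytes.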