Slurm Sweeps
Slurm Sweeps: parameter sweeps on SLURM clusters
In the words of David Carreto-Fidalgo, the main developer of this Python tool:
The main motivation was to provide a lightweight ASHA implementation for SLURM clusters that is fully compatible with pytorch-lightning's ddp.
It is heavily inspired by tools like Ray Tune and Optuna. However, on a SLURM cluster, these tools can be complicated to set up and introduce considerable overhead.
Slurm sweeps is simple, lightweight, and has few dependencies. It uses SLURM Job Steps to run the individual trials.
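To give a feel for what an ASHA-style scheduler decides, here is a minimal sketch of a stopping rule in the spirit of ASHA (asynchronous successive halving). It is my own illustration, not the slurm-sweeps implementation: the class name `AshaSketch`, its parameters, and the simplified cutoff rule are all assumptions. At each rung (an iteration count of the form `min_t * reduction_factor**k`), a trial only keeps running if its metric is among the best `1 / reduction_factor` of the results seen at that rung so far.

```python
from collections import defaultdict


class AshaSketch:
    """Toy ASHA-style stopping rule (hypothetical, not the slurm-sweeps API)."""

    def __init__(self, min_t=1, max_t=27, reduction_factor=3, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.eta = reduction_factor
        self.mode = mode
        # Rungs are the iterations at which trials are compared:
        # min_t, min_t * eta, min_t * eta**2, ... (strictly below max_t).
        self.rungs = []
        t = min_t
        while t < max_t:
            self.rungs.append(t)
            t *= reduction_factor
        # Metrics reported so far, grouped by rung.
        self.results = defaultdict(list)

    def should_continue(self, iteration, metric):
        """Return True if the trial may keep training past `iteration`."""
        if iteration not in self.rungs:
            return True  # decisions are only made at rungs
        recorded = self.results[iteration]
        recorded.append(metric)
        # Best results first.
        ordered = sorted(recorded, reverse=(self.mode == "max"))
        # Keep the top 1/eta of the rung, but always at least one trial.
        n_keep = max(1, len(ordered) // self.eta)
        cutoff = ordered[n_keep - 1]
        if self.mode == "min":
            return metric <= cutoff
        return metric >= cutoff
```

Because the decision only looks at results already recorded for a rung, trials never have to wait for each other, which is the property that makes this family of schedulers a good fit for independently running job steps.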
I humbly contributed to this project by refactoring the Logger so that it stores its results in a SQLite database instead of dumping them to text files.
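As a rough illustration of the idea (not the actual Logger code from the repo), a SQLite-backed results logger could look like the sketch below. The class name `SqliteLoggerSketch`, the table name `results`, and its columns are assumptions made for this example; only Python's standard `sqlite3` module is used.

```python
import sqlite3


class SqliteLoggerSketch:
    """Toy results logger backed by SQLite (hypothetical schema)."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the example self-contained; a file path would
        # persist the results across processes.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS results ("
            "trial_id TEXT, iteration INTEGER, metric REAL)"
        )
        self.conn.commit()

    def log(self, trial_id, iteration, metric):
        # Using the connection as a context manager commits on success
        # and rolls back if the insert raises.
        with self.conn:
            self.conn.execute(
                "INSERT INTO results VALUES (?, ?, ?)",
                (trial_id, iteration, metric),
            )

    def best(self, mode="min"):
        """Return the (trial_id, iteration, metric) row with the best metric."""
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        order = "ASC" if mode == "min" else "DESC"
        return self.conn.execute(
            "SELECT trial_id, iteration, metric FROM results "
            f"ORDER BY metric {order} LIMIT 1"
        ).fetchone()
```

Compared with appending lines to text files, a database lets the results be queried directly (for example, the best trial so far) without parsing, and SQLite handles the locking when several writers touch the same file.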
You can check out the repo here.