SLURM is a queue management system replacing the commercial LSF scheduler as the job manager on UAHPC.
SLURM is similar to LSF; below is a quick reference from HPC Wales comparing commands between the two. If you are coming from an environment with a different scheduler, or want more detail, see this pdf for a comparison of commands between PBS/Torque, SLURM, LSF, SGE, and LoadLeveler: Scheduler Commands Cheatsheet
More documentation can be found at the SLURM website.
LSF to Slurm Quick Reference
Commands
LSF | Slurm | Description |
---|---|---|
bsub < script_file | sbatch script_file | Submit a job from script_file |
bkill 123 | scancel 123 | Cancel job 123 |
bjobs | squeue or slurmtop | List user’s pending and running jobs |
bqueues | sinfo | Cluster status with partition (queue) list |
bqueues | sinfo -s | Summarised partition list (the ‘-s’ flag gives shorter output that is simpler to interpret) |
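
For example, a minimal command-line workflow with Slurm might look like the sketch below; the script name and job ID are placeholders:

```bash
# Submit a batch script; Slurm prints the new job ID
sbatch myjob.sh              # e.g. "Submitted batch job 123"

# List your own pending and running jobs
squeue -u $USER

# Summarised partition (queue) status
sinfo -s

# Cancel job 123 if needed
scancel 123
```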
Job Specification
LSF | Slurm | Description |
---|---|---|
#BSUB | #SBATCH | Scheduler directive |
-q queue_name | -p main --qos queue_name or -p owners --qos queue_name | Submit to queue ‘queue_name’ |
-n 64 | -n 64 | Processor count of 64 |
-W [hh:mm] | -t [minutes] or -t [days-hh:mm:ss] | Maximum wall-clock run time |
-o file_name | -o file_name | STDOUT output file |
-e file_name | -e file_name | STDERR output file |
-oo file_name | -o file_name --open-mode=append | Append to output file |
-J job_name | --job-name=job_name | Job name |
-M 128 | --mem-per-cpu=128M or --mem-per-cpu=2G | Memory requirement per core (e.g. 128 MB or 2 GB) |
-R "span[ptile=16]" | --tasks-per-node=16 | Processes per node |
-P proj_code | --account=proj_code | Project account to charge job to |
-J "job_name[array_spec]" | --array=array_spec | Job array declaration |
Job Environment Variables
LSF | Slurm | Description |
---|---|---|
$LSB_JOBID | $SLURM_JOBID | Job ID |
$LSB_SUBCWD | $SLURM_SUBMIT_DIR | Submit directory |
$LSB_JOBID | $SLURM_ARRAY_JOB_ID | Job Array Parent |
$LSB_JOBINDEX | $SLURM_ARRAY_TASK_ID | Job Array Index |
$LSB_SUB_HOST | $SLURM_SUBMIT_HOST | Submission Host |
$LSB_HOSTS or $LSB_MCPU_HOSTS | $SLURM_JOB_NODELIST | Allocated compute nodes |
$LSB_DJOB_NUMPROC | $SLURM_NTASKS (mpirun picks this up from Slurm automatically, so it does not need to be specified) | Number of processors allocated |
$LSB_QUEUE | $SLURM_JOB_PARTITION | Queue (partition) name |
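
As a sketch, a batch script can read these variables at run time to report where and how it is running; the job name and resource requests below are illustrative:

```bash
#!/bin/bash
#SBATCH --job-name=env_demo
#SBATCH -n 4
#SBATCH -t 10

# Report where and what we are running (all variables are set by Slurm at run time)
echo "Job ID:            $SLURM_JOBID"
echo "Submitted from:    $SLURM_SUBMIT_DIR on $SLURM_SUBMIT_HOST"
echo "Partition (queue): $SLURM_JOB_PARTITION"
echo "Nodes allocated:   $SLURM_JOB_NODELIST"
echo "Tasks allocated:   $SLURM_NTASKS"

# In a job array, each element sees its own index
if [ -n "$SLURM_ARRAY_TASK_ID" ]; then
    echo "Array job $SLURM_ARRAY_JOB_ID, task $SLURM_ARRAY_TASK_ID"
fi

cd "$SLURM_SUBMIT_DIR"   # start in the submission directory
```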