Slurm

Pathfinder uses the Slurm batch scheduler. Slurm is a widely used open-source job scheduler and resource manager for high-performance computing (HPC) clusters. This section introduces the basics needed to start running code on the cluster. Slurm jobs fall into two broad types: interactive sessions and batch jobs. Interactive sessions let users run code directly from the command line on a compute node, while batch jobs are submitted to the scheduler and run without requiring an active session.

Interactive Jobs

Interactive sessions are useful for running short jobs, debugging code, or compiling software. There are two ways to launch an interactive session: the interactive command and the srun command.

Interactive

The simplest way to launch an interactive session is to use the interactive command.

interactive

It will launch a bash session on a compute node with the following settings:

1 node (-N 1)
1 task (-n 1)
1 CPU core (-c 1)
4 GB of RAM (--mem=4gb)
1 hour of run time (--time=0-1:00:00)

If the cluster has sufficient free resources, this job should start almost immediately and give you a command-line session on a compute node. You can override any of these defaults by specifying options on the command line:

interactive -N 2 -n 8 -c 1 --mem=32gb --time=1-00:00:00

Srun

The following srun example places the user in a bash session on the parallel partition with a single task (-n1), 32 GB of memory, and a 30-minute time limit:

srun -p parallel -q normal -N1 -n1 -c 1 --mem=32g --time=30:00 --pty bash
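Once an interactive session starts, it can be useful to confirm that the shell is really running inside a job allocation rather than on the login node. The sketch below relies on the SLURM_JOB_ID environment variable, which Slurm sets inside a job. The function name in_slurm_job is a hypothetical helper, not part of Slurm or Pathfinder.

```shell
# Hypothetical helper: succeeds when the current shell runs inside a Slurm job.
# Slurm exports SLURM_JOB_ID into the environment of the jobs it starts.
in_slurm_job() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_job; then
    echo "Inside Slurm job ${SLURM_JOB_ID} on $(uname -n)"
else
    echo "Not inside a Slurm job (probably a login node)"
fi
```

Running this right after the interactive or srun command returns a prompt should report a job ID and the compute node's host name.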

Batch Scripts

Batch scripts, or job submission scripts, are the most common mechanism by which a user configures and submits a job for execution. A batch script is simply a shell script that also includes directives to be interpreted by the batch scheduling software (e.g. Slurm).

Batch scripts are submitted to the batch scheduler, where they are then parsed for the scheduling configuration options. The batch scheduler then places the script in the appropriate queue, where it is designated as a batch job. Once the batch job makes its way through the queue, the script will be executed on the compute nodes.

Components of a Batch Script

Batch scripts are parsed into the following three sections:

  • Interpreter Line

The first line of a script can be used to specify the script’s interpreter; this line is optional. If not used, the submitter’s default shell will be used. The line uses the hash-bang syntax, i.e., #!/path/to/shell.

  • Slurm Submission Options

The Slurm submission options are preceded by the string #SBATCH, making them appear as comments to a shell. Slurm looks for #SBATCH options from the script’s first line through the first non-comment, non-whitespace line. A comment line begins with #. #SBATCH options entered after that first executable line will not be read by Slurm.

  • Shell Commands

The shell commands follow the last #SBATCH option and represent the executable content of the batch job. If any #SBATCH lines follow executable statements, they will be treated as comments only.

The execution section of a script is interpreted by a shell and can contain multiple lines of executables, shell commands, and comments. When the job’s queue wait time is finished, commands within this section are executed on the primary compute node of the job’s allocated resources. Under normal circumstances, the batch job exits the queue after the last line of the script is executed.
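The rule about where Slurm stops reading #SBATCH lines can be approximated with a few lines of awk. This is only a sketch of the behavior described above, not Slurm's own parser, and list_sbatch_directives is a hypothetical helper name.

```shell
# Hypothetical helper: print the #SBATCH lines that would be read from a
# batch script, stopping at the first line that is neither a comment nor blank.
list_sbatch_directives() {
    awk '
        /^#SBATCH/  { print; next }   # scheduler directive
        /^#/        { next }          # interpreter line or ordinary comment
        /^[ \t]*$/  { next }          # blank line
                    { exit }          # first executable line: stop reading
    ' "$1"
}

# Usage: list_sbatch_directives test.slurm
```

Any #SBATCH line that this helper does not print sits after an executable statement and would be treated as a plain comment.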

Example Batch Script

As described above, a batch script is a shell script with added directives that request resources from, or provide information to, the scheduling system. Aside from these directives, the batch script is simply the series of commands needed to set up and run your job.

Consider the following batch script:

#!/bin/bash
#SBATCH -p parallel
#SBATCH -q normal
#SBATCH --mem-per-cpu=2gb
#SBATCH --time=0-0:10:0
#SBATCH -c 1
#SBATCH -N 4
#SBATCH -n 168
#SBATCH -J Your_job_name
#SBATCH -o out/%x-%J.out
#SBATCH -e out/%x-%J.err

module load oneapi

cd $SLURM_SUBMIT_DIR
srun Your_application

In the script, Slurm directives are preceded by #SBATCH, making them appear as comments to the shell. Slurm looks for these directives through the first non-comment, non-whitespace line. Options after that will be ignored by Slurm (and the shell).
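To make the resource request in the example script concrete, the arithmetic can be worked through in the shell. The figures below are plain arithmetic on the requested values and assume that tasks are spread evenly across nodes; actual placement depends on how the scheduler is configured.

```shell
# Values copied from the #SBATCH lines of the example script.
nodes=4            # -N 4
tasks=168          # -n 168
cpus_per_task=1    # -c 1
mem_per_cpu_gb=2   # --mem-per-cpu=2gb

tasks_per_node=$(( tasks / nodes ))
total_cpus=$(( tasks * cpus_per_task ))
total_mem_gb=$(( total_cpus * mem_per_cpu_gb ))
mem_per_node_gb=$(( total_mem_gb / nodes ))

echo "tasks per node:  $tasks_per_node"
echo "total CPU cores: $total_cpus"
echo "total memory:    ${total_mem_gb} GB (${mem_per_node_gb} GB per node)"
```

Under these assumptions the request works out to 42 tasks per node, 168 CPU cores, and 336 GB of memory in total (84 GB per node).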

Batch scripts can be submitted for execution using the sbatch command. For example, the following will submit the batch script named test.slurm:

sbatch test.slurm

Note

Users must submit the batch job with the sbatch command. If you simply run it like a normal shell script (e.g. “./test.slurm”), it will run on the login node and will not properly allocate resources on the compute nodes.

If successfully submitted, a Slurm job ID will be returned. This ID can be used to track the job. It is also helpful in troubleshooting a failed job; make a note of the job ID for each of your jobs in case you must contact us for support.
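Since the job ID is worth keeping, one option is to capture it at submission time. The sketch below assumes sbatch prints its usual success line, "Submitted batch job <id>". The helper name job_id_from_sbatch_output and the file name submitted_jobs.txt are hypothetical.

```shell
# Hypothetical helper: extract the job ID from the line sbatch prints on
# success. Returns non-zero if the line does not look like a success message.
job_id_from_sbatch_output() {
    case "$1" in
        "Submitted batch job "*)
            rest=${1#"Submitted batch job "}
            printf '%s\n' "${rest%% *}"   # keep only the first word after the prefix
            ;;
        *)
            return 1
            ;;
    esac
}

# Typical use on the cluster (commented out so the sketch runs anywhere):
#   job_id=$(job_id_from_sbatch_output "$(sbatch test.slurm)")
#   echo "$job_id" >> submitted_jobs.txt

job_id_from_sbatch_output "Submitted batch job 12345"   # prints 12345
```

An alternative that avoids parsing is sbatch --parsable, which prints the job ID on its own (followed by a semicolon and the cluster name on multi-cluster setups).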

Common Batch Options to Slurm

The following table summarizes frequently used Slurm options:

Option | Use | Description
-p | #SBATCH -p partition_name | Allocates resources on the specified partition. This option is required for all jobs.
--mem | #SBATCH --mem=32g | Requests 32 GB of memory per node.
-N | #SBATCH -N XXXXX | Number of compute nodes to allocate.
-t | #SBATCH -t XXXXX | Maximum wall-clock time, for example in HH:MM:SS or D-HH:MM:SS format.
-J | #SBATCH -J Job_name | Sets the job name.
-o | #SBATCH -o filename | Writes standard output to the named file.
-e | #SBATCH -e filename | Writes standard error to the named file.

Further details and other Slurm options may be found through the sbatch man page.
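The examples on this page write time limits in several forms (0-1:00:00, 30:00, 0-0:10:0). As a rough aid for reading them, the sketch below converts a Slurm-style time string to seconds, assuming the forms minutes, minutes:seconds, hours:minutes:seconds, and days-hours[:minutes[:seconds]]. The function name slurm_time_to_seconds is hypothetical, and the helper does no input validation.

```shell
# Hypothetical helper: convert a Slurm-style time limit to seconds.
slurm_time_to_seconds() {
    printf '%s\n' "$1" | awk '
    {
        days = 0; h = 0; m = 0; s = 0; spec = $0
        dash = index(spec, "-")
        if (dash > 0) {
            # days-hours[:minutes[:seconds]]
            days = substr(spec, 1, dash - 1)
            spec = substr(spec, dash + 1)
            n = split(spec, f, ":")
            h = f[1]
            if (n >= 2) m = f[2]
            if (n >= 3) s = f[3]
        } else {
            n = split(spec, f, ":")
            if (n == 3)      { h = f[1]; m = f[2]; s = f[3] }   # hours:minutes:seconds
            else if (n == 2) { m = f[1]; s = f[2] }             # minutes:seconds
            else             { m = f[1] }                       # minutes
        }
        print days * 86400 + h * 3600 + m * 60 + s
    }'
}

slurm_time_to_seconds 0-1:00:00   # default interactive limit, one hour
slurm_time_to_seconds 30:00       # srun example, thirty minutes
```

Note that a two-field value such as 30:00 is read as minutes and seconds, not hours and minutes.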

Partition & Queue Information

Pathfinder’s scheduling policy is a modified fair-share policy with a maximum walltime limit of 24 hours. This policy will be updated as needed to maintain job throughput.

Users can run the shownodes and showgpus commands to check the partitions and GPU nodes.

Partition Name | Description
-p serial | Partition for running single-node jobs.
-p parallel | Partition for running multi-node jobs.
-p gpu | Partition for running GPU jobs. Currently, Pathfinder offers NVIDIA A2 GPU nodes and only allows single-node GPU jobs. Specify how many GPUs the job will use on the node with --gres, for example --gres=gpu:2 (up to 6).

Queue Name | Description
-q normal | Currently, Pathfinder has only one queue: normal.
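A GPU request combines the gpu partition with a --gres option. The batch script below is a minimal sketch that asks for two GPUs on one node and reports how many the job can see. It assumes Slurm sets CUDA_VISIBLE_DEVICES for GPU jobs, which depends on site configuration. The job name gpu_check and the helper count_visible_gpus are hypothetical.

```shell
#!/bin/bash
#SBATCH -p gpu
#SBATCH -q normal
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --gres=gpu:2
#SBATCH --time=0-0:10:0
#SBATCH -J gpu_check

# Hypothetical helper: count the GPU indices listed in CUDA_VISIBLE_DEVICES
# (a comma-separated list); prints 0 when the variable is unset or empty.
count_visible_gpus() {
    if [ -z "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo 0
    else
        printf '%s\n' "$CUDA_VISIBLE_DEVICES" | awk -F, '{ print NF }'
    fi
}

echo "GPUs visible to this job: $(count_visible_gpus)"
```

Submitted with sbatch, the script should report 2 if the assumption about CUDA_VISIBLE_DEVICES holds. Run directly on a login node, it reports 0.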

Useful Slurm Commands

The batch scheduler provides a number of utility commands for managing submitted jobs. See each utility’s man page for more information.

scancel

Jobs in the queue in any state can be stopped and removed from the queue using the scancel command:

scancel job_id

scontrol hold

Jobs in the queue in a non-running state may be placed on hold using the scontrol hold command. Jobs placed on hold will not be removed from the queue, but they will not be eligible for execution.

scontrol hold job_id

scontrol release

Once on hold, the job will not be eligible to run until it is released back to a queued state. The scontrol release command removes a job from the held state.

scontrol release job_id

squeue

The Slurm utility squeue can be used to view the batch queue.

To see all jobs currently in the queue:

squeue -l

To see all of your queued jobs:

squeue -l -u $USER
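For a quick overview of a long queue listing, the job states can be tallied. The sketch below assumes one state name per line on standard input, as produced by squeue with the -h (no header) and -o %T (state only) options. The helper name count_job_states is hypothetical, and the sample input is made up for illustration.

```shell
# Hypothetical helper: summarize job states read from standard input,
# one state name per line, as "STATE count".
count_job_states() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# On the cluster (commented out so the sketch runs without Slurm):
#   squeue -h -u "$USER" -o %T | count_job_states

# Made-up sample input standing in for squeue output:
printf '%s\n' RUNNING PENDING RUNNING | count_job_states
```

For the sample input this prints one line for PENDING with a count of 1 and one line for RUNNING with a count of 2.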

sacct

sacct can be used to view jobs currently in the queue and those completed within the last few days. The utility can also be used to see job steps in each batch job.

To see all of your recent jobs (by default, sacct reports jobs since midnight of the current day):

sacct -u $USER

To see all your jobs that ran or completed on 2024-10-10:

sacct -S 2024-10-10T00:00:00 -E 2024-10-10T23:59:59 -o "jobid,user,account%16,cluster,AllocNodes,Submit,Start,End,TimeLimit" -X -P
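The -P option makes sacct print pipe-delimited output with a header row, which is convenient for scripting. The sketch below pulls one named column out of such output. The helper name sacct_column is hypothetical, and the sample rows are made-up placeholders rather than real accounting data.

```shell
# Hypothetical helper: print one column, selected by header name
# (case-insensitive), from pipe-delimited text read on standard input.
sacct_column() {
    awk -F'|' -v want="$1" '
        NR == 1 {
            # Locate the requested column in the header row.
            for (i = 1; i <= NF; i++)
                if (tolower($i) == tolower(want)) col = i
            next
        }
        col { print $col }   # prints nothing if the column was not found
    '
}

# On the cluster (commented out so the sketch runs without Slurm):
#   sacct -u "$USER" -X -P -o jobid,user,end | sacct_column End

# Made-up sample input standing in for sacct -P output:
printf '%s\n' 'JobID|User|End' \
              '101|someuser|2024-10-10T12:00:00' \
              '102|someuser|2024-10-10T13:30:00' | sacct_column End
```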

scontrol show job

Provides additional details about a given job:

scontrol show job job_id