SLURM - cheatsheet
SLURM (originally the Simple Linux Utility for Resource Management, now the Slurm Workload Manager) is an open-source job scheduler used on many HPC (High-Performance Computing) clusters. It lets users submit, monitor, and manage jobs on shared compute nodes.
Basic Commands
Command | Description |
---|---|
`sbatch <script>` | Submit a job script to the SLURM scheduler |
`squeue` | View the status of all jobs in the queue |
`sinfo` | Display information about SLURM nodes and partitions |
`scancel <job_id>` | Cancel a specific job |
`scontrol show job <job_id>` | Display detailed information about a job |
`sacct` | View job accounting information |
`salloc` | Allocate resources for interactive job sessions |
`srun <command>` | Run a command or executable as one or more parallel tasks, inside an existing allocation or a new one |
`sview` | Open a graphical interface to view SLURM status |
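
The commands above are typically chained: submit a script, keep the job ID, then query the job's state. The following is a minimal sketch, assuming a job script named `myjob.sh` (a hypothetical file name) and, for the `sacct` fallback, a cluster with job accounting enabled. The helper names `submit_job` and `job_state` are illustrative and not part of SLURM.

```shell
# Submit a script and print only the numeric job ID.
# --parsable makes sbatch print "<job_id>" or "<job_id>;<cluster>".
submit_job() {
    out=$(sbatch --parsable "$1") || return 1
    printf '%s\n' "${out%%;*}"
}

# Print the state of a job: ask squeue first (pending or running jobs),
# then fall back to sacct for jobs that have already left the queue.
job_state() {
    state=$(squeue -h -j "$1" -o '%T' 2>/dev/null)
    if [ -z "$state" ]; then
        state=$(sacct -n -X -j "$1" -o State 2>/dev/null | head -n 1 | tr -d ' ')
    fi
    printf '%s\n' "${state:-UNKNOWN}"
}

# Only talk to the scheduler when SLURM is installed and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f myjob.sh ]; then
    job_id=$(submit_job myjob.sh) && echo "Job $job_id is $(job_state "$job_id")"
    # scancel "$job_id"   # uncomment to cancel the job again
fi
```

For a closer look at a single job, `scontrol show job <job_id>` prints the full record while the job is still known to the scheduler.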
Job Script Directives
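A SLURM job script is a shell script whose `#SBATCH` comment lines pass options to `sbatch`, specifying job parameters and resource requirements. These directives must appear before the first executable command in the script. Here are some common directives: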
- `#SBATCH -J <job_name>`: Set the job name
- `#SBATCH -N <num_nodes>`: Request a specific number of nodes
- `#SBATCH -n <num_tasks>`: Request a specific number of tasks (by default, one CPU core per task)
- `#SBATCH -p <partition>`: Specify the partition (queue) to submit the job to
- `#SBATCH -t <time_limit>`: Set the maximum run time for the job, e.g. `1:00:00` for one hour
- `#SBATCH --mem=<memory>`: Specify the memory required per node, e.g. `4G`
- `#SBATCH -o <output_file>`: Redirect standard output to a file
- `#SBATCH -e <error_file>`: Redirect standard error to a file
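
Each short option has a long form that is easier to read in a script: `-J` is `--job-name`, `-N` is `--nodes`, `-n` is `--ntasks`, `-p` is `--partition`, `-t` is `--time`, `-o` is `--output`, and `-e` is `--error`. The sketch below writes a job script that uses all eight directives in long form. The file name `job_long.sh`, the partition `general`, the `4G` memory request, and the log file names are assumptions chosen for illustration; `%j` in a file name is replaced by the job ID.

```shell
# Write a job script that uses the long option names.
# job_long.sh, the partition "general", 4G and the log names are assumptions.
cat > job_long.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --partition=general
#SBATCH --time=01:00:00
#SBATCH --mem=4G
#SBATCH --output=myjob-%j.out
#SBATCH --error=myjob-%j.err

srun ./my_program
EOF

# Count the directives that were written.
grep -c '^#SBATCH' job_long.sh
```

The generated file can then be submitted with `sbatch job_long.sh`. Options given on the `sbatch` command line take precedence over the directives in the script.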
Example job script
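
    #!/bin/bash
    # Job name, 1 node, 4 tasks, partition "general", 1 hour time limit
    #SBATCH -J myjob
    #SBATCH -N 1
    #SBATCH -n 4
    #SBATCH -p general
    #SBATCH -t 1:00:00

    # Launch the program as 4 tasks on the allocated node
    srun ./my_program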
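
Because the script requests four tasks, `srun` starts four copies of the program, and each copy can find out which one it is from its environment. The following is a hypothetical stand-in for `./my_program`, sketched as a shell function named `task_banner` (an illustrative name). It relies on `SLURM_PROCID` and `SLURM_NTASKS`, which `srun` sets for every task; the fallback values are an assumption so that it also runs outside SLURM.

```shell
# Hypothetical stand-in for ./my_program: report which task this copy is.
# srun sets SLURM_PROCID (task rank, starting at 0) and SLURM_NTASKS
# (total number of tasks); the fallbacks allow a run outside of SLURM.
task_banner() {
    rank=${SLURM_PROCID:-0}
    ntasks=${SLURM_NTASKS:-1}
    printf 'task %s of %s on %s\n' "$rank" "$ntasks" "$(uname -n)"
}

task_banner
```

Run under the example script, this would print one line per task, with ranks 0 to 3 in no guaranteed order.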