SLURM
Overview
SLURM (Simple Linux Utility for Resource Management) is an open-source job scheduler and workload manager designed for high-performance computing (HPC) environments. It manages compute resources, schedules jobs, and provides a framework for parallel and distributed computing. SLURM is widely used in datacenters, research institutions, and enterprise environments to efficiently allocate computing resources among multiple users and applications.
Key Concepts
Job Scheduling
SLURM manages the allocation of compute resources (nodes, CPUs, GPUs, memory) to user jobs. It provides:
- Job queuing - Jobs wait in queues until resources become available
- Resource allocation - Automatic assignment of compute nodes and resources
- Job prioritization - Fair-share scheduling based on user quotas and priorities
- Preemption - Ability to suspend or terminate lower-priority jobs for higher-priority ones
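These scheduling decisions can be inspected from the command line. The commands below are an illustrative sketch (the job ID 12345 is a placeholder): sprio breaks a pending job's priority into its factors, sshare reports fair-share usage, and squeue --start estimates start times.
# Show the priority factors (age, fair-share, job size, partition, QoS) of a pending job
sprio -l -j 12345
# Show fair-share usage for your accounts; the fair-share factor feeds into job priority
sshare
# Estimate when a pending job is expected to start
squeue --start -j 12345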
Resource Management
SLURM tracks and manages:
- Compute nodes - Individual servers in the cluster
- Partitions - Logical groupings of nodes (e.g., “gpu”, “cpu-only”, “debug”)
- Accounts - User groups with resource quotas and limits
- Quality of Service (QoS) - Service levels that affect job priority and limits
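Each of these objects can be listed directly. As a quick sketch (the format fields shown are examples; available fields depend on your accounting configuration):
# Summarize partitions and node states
sinfo -s
# List QoS levels and account/user associations from the accounting database
sacctmgr show qos format=Name,Priority,MaxWall
sacctmgr show associations format=Account,User,Partition,QOS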
SLURM CLI Tools
Core Commands
Job Submission:
# Submit a simple job
sbatch job_script.sh
# Submit with specific requirements
sbatch --nodes=4 --ntasks-per-node=8 --gres=gpu:4 job_script.sh
# Submit an interactive job
srun --pty --nodes=1 --ntasks=1 bash
Job Management:
# List all jobs
squeue
# List jobs for specific user
squeue -u username
# Cancel a job
scancel job_id
# Hold a job (prevent from starting)
scontrol hold job_id
# Release a held job
scontrol release job_id
Resource Information:
# Show cluster status
sinfo
# Show detailed node information
scontrol show nodes
# Show partition information
scontrol show partitions
# Show account information
sacctmgr show accounts
Job Scripts
SLURM job scripts are ordinary shell scripts whose #SBATCH comment lines carry the job's resource requests and options:
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --gres=gpu:4
#SBATCH --time=02:00:00
#SBATCH --partition=gpu
#SBATCH --account=my_account
# Job commands
module load cuda/11.8
mpirun -np 32 ./my_application
Prolog and Epilog Functionality
Overview
SLURM’s prolog and epilog scripts provide hooks for custom actions before and after job execution. These scripts run on the compute nodes and can perform setup, cleanup, and integration tasks.
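Which scripts run, and in what context, is configured in slurm.conf. The excerpt below is only a sketch: the paths are placeholders, and the exact timing of the node-level Prolog depends on settings such as PrologFlags.
# slurm.conf excerpt (paths are site-specific placeholders)
Prolog=/etc/slurm/prolog.sh            # run by slurmd as root on compute nodes before the job's tasks start
Epilog=/etc/slurm/epilog.sh            # run by slurmd as root on each node after the job completes
TaskProlog=/etc/slurm/task_prolog.sh   # run as the job user before each task; "export NAME=value" lines on stdout set the task environment
TaskEpilog=/etc/slurm/task_epilog.sh   # run as the job user after each task finishes
After editing slurm.conf, the daemons must re-read it (for example with scontrol reconfigure) before the new hooks take effect.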
Prolog Scripts
Prolog scripts execute before a job starts on a compute node. Common uses include:
File System Operations:
#!/bin/bash
# Create job-specific directories
mkdir -p /scratch/job_${SLURM_JOB_ID}
ln -s /scratch/job_${SLURM_JOB_ID} $SLURM_SUBMIT_DIR/scratch
Environment Setup:
#!/bin/bash
# Load required modules (these affect only this script's own shell)
module load cuda/11.8
module load openmpi/4.1.4
# To pass environment variables into the job itself, run this as a TaskProlog:
# lines written to stdout in the form "export NAME=value" are added to the
# task's environment
echo "export CUDA_VISIBLE_DEVICES=0,1,2,3"
echo "export OMP_NUM_THREADS=4"
Epilog Scripts
Epilog scripts execute after a job completes on a compute node. Common uses include:
Cleanup Operations:
#!/bin/bash
# Remove job-specific files
rm -rf /scratch/job_${SLURM_JOB_ID}
rm -f $SLURM_SUBMIT_DIR/scratch
Logging and Monitoring:
#!/bin/bash
# Log job completion
echo "$(date): Job ${SLURM_JOB_ID} completed on $(hostname)" >> /var/log/slurm/jobs.log
# Collect performance metrics
sacct -j ${SLURM_JOB_ID} --format=JobID,JobName,Elapsed,MaxRSS,MaxVMSize >> /var/log/slurm/metrics.log
Further Reading
- SLURM Documentation - Official SLURM documentation
- Resource Groups - DPS resource group management
- DPS Integration Guide - Detailed DPS-SLURM integration
- SLURM Prolog/Epilog Guide - Official prolog/epilog documentation