Sun Grid Engine (SGE) and SLURM job scheduler concepts are quite similar. Below is a table of some common SGE commands and their SLURM equivalents. Any questions? Contact us.
Also check out Getting started with SLURM on the Sherlock pages.
Some common commands and flags in SGE and SLURM with their respective equivalents:
| User Commands | SGE | SLURM |
|---|---|---|
| Interactive login | qlogin | srun --pty bash or srun -p [partition] --time=4:0:0 --pty bash (for a quick dev node, just run sdev) |
| Job submission | qsub [script_file] | sbatch [script_file] |
| Job deletion | qdel [job_id] | scancel [job_id] |
| Job status by job | qstat -u \* [-j job_id] | squeue -j [job_id] |
| Job status by user | qstat [-u user_name] | squeue -u [user_name] |
| Job hold | qhold [job_id] | scontrol hold [job_id] |
| Job release | qrls [job_id] | scontrol release [job_id] |
| Queue list | qconf -sql | squeue |
| List nodes | qhost | sinfo -N or scontrol show nodes |
| Cluster status | qhost -q | sinfo |
| GUI | qmon | sview |
| Environment variables | | |
| Job ID | $JOB_ID | $SLURM_JOBID |
| Submit directory | $SGE_O_WORKDIR | $SLURM_SUBMIT_DIR |
| Submit host | $SGE_O_HOST | $SLURM_SUBMIT_HOST |
| Node list | $PE_HOSTFILE | $SLURM_JOB_NODELIST |
| Job array index | $SGE_TASK_ID | $SLURM_ARRAY_TASK_ID |
| Job specification | | |
| Script directive | #$ | #SBATCH |
| Queue | -q [queue] | -p [queue] |
| Count of nodes | N/A | -N [min[-max]] |
| CPU count | -pe [PE] [count] | -n [count] |
| Wall clock limit | -l h_rt=[seconds] | -t [min] or -t [days-hh:mm:ss] |
| Standard out file | -o [file_name] | -o [file_name] |
| Standard error file | -e [file_name] | -e [file_name] |
| Combine STDOUT & STDERR files | -j yes | (use -o without -e) |
| Copy environment | -V | --export=[ALL \| NONE \| variables] |
| Event notification | -m abe | --mail-type=[events] |
| Send notification email | -M [address] | --mail-user=[address] |
| Job name | -N [name] | --job-name=[name] |
| Restart job | -r [yes\|no] | --requeue or --no-requeue (note: configurable default) |
| Set working directory | -wd [directory] | --workdir=[dir_name] |
| Resource sharing | -l exclusive | --exclusive or --shared |
| Memory size | -l mem_free=[memory][K\|M\|G] | --mem=[mem][M\|G\|T] or --mem-per-cpu=[mem][M\|G\|T] |
| Charge to an account | -A [account] | --account=[account] |
| Tasks per node | (fixed allocation_rule in PE) | --tasks-per-node=[count] |
| CPUs per task | N/A | --cpus-per-task=[count] |
| Job dependency | -hold_jid [job_id \| job_name] | --depend=[state:job_id] |
| Job project | -P [name] | --wckey=[name] |
| Job host preference | -q [queue]@[node] or -q [queue]@@[hostgroup] | --nodelist=[nodes] and/or --exclude=[nodes] |
| Quality of service | N/A | --qos=[name] |
| Job arrays | -t [array_spec] | --array=[array_spec] (SLURM version 2.6+) |
| Generic resources | -l [resource]=[value] | --gres=[resource_spec] |
| Licenses | -l [license]=[count] | --licenses=[license_spec] |
| Begin time | -a [YYMMDDhhmm] | --begin=YYYY-MM-DD[THH:MM[:SS]] |
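As a quick illustration of the flag mapping above, here is a hypothetical array-job submission in both systems (the script name, job name, and resource values are made up for the example):

```bash
# SGE: 10-task array job, 1 hour wall clock, 2 GB of memory per task
qsub -N myarray -t 1-10 -l h_rt=1:0:0 -l mem_free=2G run_task.sh

# SLURM equivalent, using the corresponding flags from the table above
sbatch --job-name=myarray --array=1-10 -t 1:00:00 --mem=2G run_task.sh
```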
| SGE | SLURM |
|---|---|
| qstat | squeue |
| qstat -u username | squeue -u username |
| qsub | sbatch |
| qsub -N jobname | sbatch -J jobname |
| | sbatch --mem=4000 |
| qrsh -l h_rt=8:00:00 (interactive run, one core) | salloc -t 8:00:00 or interactive -p core -n 1 -t 8:00:00 (interactive run, one core) |
| qdel | scancel |
SGE for a single-core application:

```bash
#!/bin/bash
#
#
#$ -N test
#$ -j y
#$ -o test.output
#$ -cwd
#$ -M $USER@yourschool.edu
#$ -m bea
# Request 5 hours run time
#$ -l h_rt=5:0:0
#$ -P your_project_id_here
#
#$ -l mem=4G
#
<call your app here>
```

SLURM for a single-core application:

```bash
#!/bin/bash
#
#SBATCH -J test
#SBATCH -o test.%j.out
#SBATCH -e test.%j.err      # Default in slurm
#SBATCH --mail-user=$USER@yourschool.edu
#SBATCH --mail-type=ALL
#SBATCH -t 5:0:0            # Request 5 hours run time
#SBATCH --mem=4000
#SBATCH -p normal

<load modules, call your app here>
```
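Assuming the scripts above are saved as, say, job.sge and job.slurm (hypothetical file names), each is submitted with its scheduler's submission command:

```bash
# SGE
qsub job.sge

# SLURM
sbatch job.slurm
```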
Comparison of some environment variables set by SGE and SLURM:
| SGE | SLURM |
|---|---|
| $JOB_ID | $SLURM_JOB_ID |
| $NSLOTS | $SLURM_NPROCS |
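For example, a line in a job script that uses these variables would translate roughly as follows (my_app and the output file name are placeholders for this sketch):

```bash
# SGE script body: run on all allocated slots, write a per-job output file
mpirun -np $NSLOTS ./my_app > output.$JOB_ID

# SLURM script body: the same line using the SLURM variables from the table
mpirun -np $SLURM_NPROCS ./my_app > output.$SLURM_JOB_ID
```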