====== sbatch ======
**sbatch** submits a script to the slurm controller for execution and returns immediately. The script may be queued to run later if there are currently no resources (cores) available to run it.

The working directory for the script will be set to the current working directory at the time you submit the job.
<code>
#!/bin/bash
#SBATCH -p bigmem
#SBATCH -o gwas-test-%j.out
gwas-program filename1 filename2
</code>
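Assuming the script above is saved as gwas-test.bash (a name chosen here just for illustration), it would be submitted with:

<code>
sbatch gwas-test.bash
</code>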
The "# | The "# | ||
**sbatch** will only run scripts (i.e. not object/executable files), so to run a compiled program you wrap it in a short script.

**ex1.bash**
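A minimal sketch of such a wrapper (my-program stands in for your own compiled executable):

<code>
#!/bin/bash
# Wrapper script so that sbatch has a script to run;
# "$@" passes the sbatch command-line arguments through.
./my-program "$@"
</code>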
You can probably do everything you want to do with this sort of command.
+ | |||
+ | **ex2.bash** | ||
+ | |||
+ | < | ||
+ | #!/bin/bash | ||
+ | #SBATCH -o ex2-%j.out | ||
+ | / | ||
+ | pwd | ||
+ | echo Arg1 is $1 | ||
+ | echo Arg2 is $2 | ||
+ | </ | ||
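Submitted as, for example:

<code>
sbatch ex2.bash one two
</code>

this prints the node the job ran on, the working directory, and the two arguments into ex2-NNNN.out (where NNNN is the job id).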
==== sbatch options ====
<code>
sbatch -N4 ex1.bash one two
</code>

**Unless you intend to use srun from within your script, the -N option is probably not what you want.**

The "-N4" option asks for the job to be allocated 4 nodes. This does not reserve all 4 nodes for you exclusively: other jobs may also be running on those nodes at the same time.
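As a sketch of the case where -N is what you want (hostname here is just a stand-in for a real parallel workload):

<code>
#!/bin/bash
#SBATCH -N4
# srun starts one task on each node of the allocation,
# so this prints four (probably different) host names.
srun hostname
</code>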
==== Array Jobs ====

<code>
sbatch --array 1-22 one_chromo.bash
</code>

This submits one_chromo.bash 22 times, once for each array index from 1 to 22.
+ | |||
+ | **ex3.bash** | ||
+ | |||
+ | < | ||
+ | #!/bin/bash | ||
+ | echo Array job id = $SLURM_ARRAY_JOB_ID | ||
+ | echo Array task id = $SLURM_ARRAY_TASK_ID | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | sbatch --array 1-3 ex3.bash | ||
+ | </ | ||
+ | |||
+ | This produces 3 output files by default: slurm_NNNN_1.out, | ||
+ | |||
+ | You can also submit an "array job" like this: | ||
+ | |||
+ | sbatch --array=" | ||
+ | |||
+ | This command will run your script 20 times (possibly on different nodes) with at most 5 copies running at any one time. (The " | ||
+ | |||
+ | In the output of squeue you would see one line for the array job, and (in the example above) up to five lines for the currently running jobs. | ||
+ | |||
+ | You do have to do some work to make each job generated by the array do something different. There is an environment variable, SLURM_ARRAY_TASK_ID, | ||
+ | |||
+ | This allows you to submit a large number of jobs quickly and easily and allows easy control of the resources you are using. The output from each job goes, by default, to a file named slurm-JJJJ_AAA.txt where JJJJ is the slurm job id, and AAA is the array index value. | ||
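For example, a script along these lines (assuming a file array_files.txt with one input file name per line) can use the task id to pick which file to work on: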
+ | |||
+ | < | ||
+ | #!/bin/bash | ||
+ | |||
+ | echo Job $SLURM_JOB_ID | ||
+ | echo Array index $SLURM_ARRAY_TASK_ID | ||
+ | |||
+ | # Read the list of files to be processed from array_files.txt. | ||
+ | mapfile -t files < array_files.txt | ||
+ | index=$(( SLURM_ARRAY_TASK_ID - 1 )) | ||
+ | echo File to be processed: ${files[$index]} | ||
+ | </ | ||
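To run one task per line of the file, the array range can be computed at submission time (a sketch; ex4.bash is a stand-in name for the script above):

<code>
sbatch --array=1-$(wc -l < array_files.txt) ex4.bash
</code>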
+ | |||
+ | If you have submitted an array job and change your mind about how many of the array tasks should be running at any one time (perhaps the cluster gets less busy so it seems reasonable to increase the number of jobs you are running at one time), you can use scontrol to do that. | ||
+ | |||
+ | < | ||
+ | scontrol update JobId=NNNNNN ArrayTaskThrottle=10 | ||
+ | </ | ||
+ | |||
+ | You can cancel all of an array job in one go by using scancel on the job number of the array job entry. You can also cancel the individual tasks submitted by the array job. | ||
+ | |||
+ | To cancel the entire job (including all running tasks): | ||
+ | |||
+ | < | ||
+ | scancel NNNNNNN | ||
+ | </ | ||
+ | |||
+ | To cancel an individual task from the array job (in this case task number 2): | ||
+ | |||
+ | < | ||
+ | scancel NNNNNNN_2 | ||
+ | </ | ||
+ | |||
+ | You can also cancel a range of the individual tasks within the array as follows. This will work whether the tasks are already running or not. | ||
+ | |||
+ | < | ||
+ | scancel NNNNNNN_[9-19] | ||
+ | </ | ||