Job submission

SLURM Cluster (available on 16/03/2018)

 

While waiting for the training slides, please use these documents:

[BATCH]

sbatch: submit a batch job to slurm (default workq partition).
sarray: submit a batch job-array to slurm.

[INTERACTIVE]

srun --pty bash : start an interactive session on a compute node (default workq partition).

runVisuSession.sh: start a TurboVNC / VirtualGL session on the graphical node (interq partition). For graphical jobs only.
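
For example, an interactive session with more resources than the defaults might be requested as follows (the resource values are only illustrative):

# Interactive shell on a compute node with 4 CPUs and 16G of RAM
srun -p workq -c 4 --mem=16G --pty bash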

srun
-J job_name -> set the job name
-p partition -> which partition (~ queue) to use
--time=HH:MM:SS -> maximum run time of the job

-o (--output)=output_filename : redirect stdout to the specified file. If -e (--error) is not specified, both stdout and stderr are written to this file.
-e (--error)=error_filename : if specified, stderr is redirected to a different file than stdout.

Without any parameters, on any partition, each job is limited to 1 CPU and 4 GB of RAM (--cpus-per-task=1, --mem=4G).
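
For example, assuming a batch script named myscript.sh, these options can be combined on the command line (the job name and file names are illustrative):

# Submit myscript.sh to workq with a 2-hour limit and separate stdout/stderr files
sbatch -J test_blast -p workq --time=02:00:00 -o test_blast.out -e test_blast.err myscript.sh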

1 - First write a script (e.g. myscript.sh) with your command lines, as follows:

#!/bin/bash
#SBATCH -J test
#SBATCH -o output.out
#SBATCH -e error.out
#SBATCH -t 01:00:00
#SBATCH --mem=8G
#SBATCH --mail-type=BEGIN,END,FAIL
# (the notification email address is automatically the one of your LDAP account)
#Purge any previous modules
module purge

#Load the application
module load bioinfo/ncbi-blast-2.2.29+

# My command lines I want to run on the cluster
blastall ...

2 - To submit the job, use the sbatch command as follows:

sbatch myscript.sh

To change memory reservation, add this option to the submission command (sbatch, srun, sarray):

--mem=XG (default value is 4G)

With default parameters, each job is limited to 1 CPU.
To book more, use the following options (a combined example follows the list):

# Book n cpus on the same node (up to 64)
-c ncpus (--cpus-per-task=ncpus)

# Book n cpus on any nodes in case of MPI jobs
-N nnodes (--nodes=nnodes)

-n ntasks (--ntasks=ntasks)

-c ncpus (--cpus-per-task=ncpus)
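
As an illustration, the memory and CPU options above can be combined at submission time (the values and the script name my_mpi_script.sh are arbitrary):

# 1 task with 8 CPUs and 32G of RAM on a single node
sbatch -c 8 --mem=32G myscript.sh

# MPI-style job: 16 tasks spread over 2 nodes, 1 CPU per task
sbatch -N 2 -n 16 -c 1 my_mpi_script.sh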

Each job is submitted to a specific partition (the default one is workq).
Each partition has a different priority, depending on the maximum execution time allowed.

 

Queue                      | Access            | Priority | Max time     | Max slots
workq                      | everyone          | 100      | 4 days (96h) | 3072
unlimitq                   | everyone          | 1        | 180 days     | 500
interq (runVisuSession.sh) | on demand         | -        | 1 day (24h)  | 32
smpq                       | on demand         | -        | 180 days     | 96
wflowq                     | specific software | -        | 180 days     | 3072

To submit an array of jobs, use the sarray command (it accepts the same options as sbatch):

sarray [sbatch options] shell_command_file
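
A minimal sketch, assuming sarray accepts the same options as sbatch (as stated above); the file name commands.txt and its contents are illustrative:

# commands.txt contains one independent command per line,
# each line becoming one task of the array, e.g.:
#   blastall ... sample1.fa
#   blastall ... sample2.fa
sarray -J my_array --mem=8G commands.txt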

To check your CPU time quota, use the command:

squota_cpu

Academic account quota: 100,000 h per calendar year.
Beyond these 100,000 hours, you will need to submit a science project (via the resources request form) so that your real needs of the bioinformatics environment can be estimated.

Depending on the outcome of this evaluation, as well as on their geographical and institutional origin, users may then either continue their computations, be invited to contribute financially to the infrastructure, or be redirected to regional or national computing centres (mésocentres).

Non-academic account quota: 500 h per calendar year, for testing the infrastructure.
Computation beyond this quota will be charged (price on request).

To check your disk quota, use the following command line (on the genologin server):

mmlsquota -u username --block-size G

Example of a full MPI batch script:
#!/bin/bash
#SBATCH -J mpi_job
#SBATCH --nodes=2
#SBATCH --tasks-per-node=6
#SBATCH --time=00:10:00
# Move to the directory the job was submitted from
cd $SLURM_SUBMIT_DIR
module purge
module load compiler/intel-2018.0.128 mpi/openmpi-1.8.8-intel2018.0.128
# Run the MPI program on all the booked tasks
mpirun -n $SLURM_NTASKS -npernode $SLURM_NTASKS_PER_NODE ./hello_world

To monitor your jobs, you can use the squeue command; here are some useful options:

squeue -u username : list only the specified user's jobs.
squeue -j job_id : display detailed information on the specified job.

(see squeue --help or man squeue for more options)
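
For instance, a custom output format can make the listing easier to read (the format string below is only one possibility):

# Show job id, partition, name, state, elapsed time and node(s)/reason for your jobs
squeue -u username -o "%.18i %.9P %.30j %.8T %.10M %R"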

For more details on a job:

scontrol show job job_id

You can also use a graphical user interface that provides the same information.
This interface is accessible with the sview command.

To get information on a finished job, use the sacct command as follows:

sacct -j job_id

(see sacct --help or man sacct for more options)
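
A --format option can be added to select the fields of interest; the field list below is only a suggestion:

# Elapsed time, peak memory and final state of a finished job
sacct -j job_id --format=JobID,JobName,Elapsed,MaxRSS,State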

To kill a job, you can use the scancel command; here are some useful options:

# Kill the specified job
scancel job_id


# Kill all jobs launched by the specified user
scancel -u username
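
scancel can also filter on job state or name, for example:

# Kill only the pending jobs of the specified user
scancel -u username --state=PENDING

# Kill all jobs of the specified user with the given name
scancel -u username --name=test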

SGE Cluster

 

See our training slides: Cluster

Open Grid Scheduler (http://gridscheduler.sourceforge.net), version GE2011.11p1.

Full documentation is available as a PDF here.

[BATCH]

qsub: submit a batch job to Sun Grid Engine (default queue: workq).
qarray: submit a batch job-array to Sun Grid Engine.

[INTERACTIVE]

qlogin: submit an interactive X-windows session to Sun Grid Engine (automatically sent to the interq queue). For graphical jobs only.
qrsh: submit an interactive login session to Sun Grid Engine.

Without any parameters, on any queue, each job is limited to mem=1G, h_vmem=8G of memory and 1 CPU.
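
For example, an interactive session with more memory than the defaults could be requested as follows (the values are illustrative):

# Interactive shell with 8G reserved and a 10G hard limit per slot
qrsh -l mem=8G -l h_vmem=10G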

1 - First write a script (e.g. myscript.sh) with your command lines, as follows:

#!/bin/bash
#$ -o /work/.../output.txt
#$ -e /work/.../error.txt
#$ -q workq
#$ -m bea
# My command lines I want to run on the cluster
blastall -d swissprot -p blastx -i /save/.../z72882.fa

2 - To submit the job, use the qsub command as follows:

qsub myscript.sh

To change the memory reservation, add these options to the submission command (qsub, qarray, qrsh or qlogin):

-l mem=XG -l h_vmem=YG (with X < Y; default values are X=1G, Y=8G)

where:

  • mem is the amount of memory per slot (in megabytes M, or gigabytes G) that your job will require
  • h_vmem is the upper bound on the amount of memory per slot your job is allowed to use

 

Example:

qsub -l mem=8G -l h_vmem=10G myscript.sh

With default parameters, each job is limited to 1 core (slot).
To book more, use the following options:

# Book n slots on the same node (up to 40 on an Intel node, up to 48 on an AMD node)
qsub -pe parallel_smp n myscript.sh

# Book n slots on any nodes (possibly the same) in case of MPI jobs
qsub -pe parallel_fill n myscript.sh

# Book n slots on strictly different nodes in case of MPI jobs
qsub -pe parallel_rr n myscript.sh
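
Note that mem and h_vmem are requested per slot (see above), so memory and slot requests multiply; for example (the values are illustrative):

# 8 slots on the same node, 4G per slot = 32G reserved in total
qsub -pe parallel_smp 8 -l mem=4G -l h_vmem=6G myscript.sh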

Each job is submitted to a specific queue (the default one is workq).
Each queue has a different priority, depending on the maximum execution time allowed.

 

Queue                      | Access            | Priority | Max time     | Max slots
workq                      | everyone          | 100      | 4 days (96h) | 3072
unlimitq                   | everyone          | 1        | 180 days     | 500
interq (runVisuSession.sh) | on demand         | -        | 1 day (24h)  | 32
smpq                       | on demand         | -        | 180 days     | 96
wflowq                     | specific software | -        | 180 days     | 3072

To submit an array of jobs, use the qarray command (it accepts the same options as qsub):

qarray [qsub options] shell_command_file
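
As with sarray on the SLURM cluster, the shell_command_file contains one independent command per line, each line being run as a separate task; the file name commands.txt and the options below are illustrative:

# commands.txt contains one independent command per line, e.g.:
#   blastall ... sample1.fa
#   blastall ... sample2.fa
qarray -q workq -l mem=4G -l h_vmem=8G commands.txt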

To check your CPU time quota, use the command:

qquota_cpu login

Academic account quota: 100,000 h per calendar year.
Beyond these 100,000 hours, you will need to submit a science project (via the resources request form) so that your real needs of the bioinformatics environment can be estimated.

Depending on the outcome of this evaluation, as well as on their geographical and institutional origin, users may then either continue their computations, be invited to contribute financially to the infrastructure, or be redirected to regional or national computing centres (mésocentres).

Non-academic account quota: 500 h per calendar year, for testing the infrastructure.
Computation beyond this quota will be charged (price on request).

To check your disk quota, use the following command line (on the genotoul server):

mmlsquota -u login

To show the available parallel environments:

qconf -spl

To show a specific environment (example):

qconf -sp parallel_smp

1. First, the parallel environment has to be booked on the cluster using the -pe option. For example, in a qsub script:

#!/bin/bash
#$ -pe parallel_rr 100

2. Then load the required compiler and MPI environment modules and run your program as follows:

module load compiler/intel-2013.3.174
module load mpi/openmpi-1.8.1
mpirun cmd_name
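
Putting both steps together, a complete MPI submission script might look like the following sketch (the program name ./my_mpi_program is illustrative); it is then submitted with qsub as usual:

#!/bin/bash
#$ -q workq
#$ -pe parallel_rr 100
# Load the compiler and the MPI stack, then run the program on the booked slots
module load compiler/intel-2013.3.174
module load mpi/openmpi-1.8.1
mpirun ./my_mpi_program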

To monitor your jobs, you can use the qstat command; here are some useful options:

qstat -u login : list only the specified user's jobs.
qstat -j job_id : display detailed information on the specified job.
qstat -s r : list only the running jobs.

You can also use a graphical user interface that provides the same information.
This interface is accessible with the qmon command.

To get information on a finished job, use the qacct command as follows:

qacct -j job_id

This command can also be used to produce SGE usage statistics.

To kill a job, you can use the qdel command; here are some useful options:

# Kill the specified job
qdel -j job_id


# Kill all jobs launched by the specified user
qdel -u login

 

SGE to SLURM
SGE                                                | SLURM                                                                                           | Comments
qsub script.sh                                     | sbatch script.sh                                                                                | sbatch is only for scripts
qsub -l mem=XG -l h_vmem=YG -b y                   | srun --mem=YG                                                                                   | No h_vmem parameter with Slurm
qsub -m bea                                        | sbatch/srun --mail-type=BEGIN,END,FAIL                                                          | Notify user by email when certain event types occur
qsub -b y "command"                                | sbatch --wrap="command"                                                                         | Submit a command line
qsub -sync y "command"                             | srun "command"                                                                                  | Submit a job in real time
qsub -pe parallel_smp 8                            | sbatch/srun -c 8 (--cpus-per-task=8)                                                            | By default, jobs run on one node (-N 1 <--> --nodes=1)
qsub -pe parallel_fill n or qsub -pe parallel_rr n | sbatch/srun -N nnodes (--nodes=nnodes) -n ntasks (--ntasks=ntasks) -c ncpus (--cpus-per-task=ncpus) | No parallel environment with Slurm
qstat -u login                                     | squeue -u login                                                                                 | See all your submitted jobs
qstat -j job_id                                    | scontrol show job job_id                                                                        | Running job details
qacct -j job_id                                    | sacct --unit=G --format JobID,jobname,NTasks,nodelist,CPUTime,ReqMem,MaxVMSize,Elapsed job_id   | Finished job details
qquota_cpu login                                   | squota_cpu                                                                                      | See your CPU time quota
qdel -j job_id                                     | scancel job_id                                                                                  | Kill a job
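
As a worked example, here is how a simple SGE script maps onto its SLURM equivalent according to the table above (the command my_command is illustrative):

# SGE version (directives in myscript.sh)
#$ -q workq
#$ -pe parallel_smp 8
#$ -m bea
my_command

# Equivalent SLURM version
#SBATCH -p workq
#SBATCH -c 8
#SBATCH --mail-type=BEGIN,END,FAIL
my_command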