CIT - TORQUE/Maui

The batch system

The batch system distributes computing jobs across the cluster. It can be used in two ways: via scripts or interactively. A typical script could look as follows:

#PBS -o /home/u/username/output.dat
#PBS -l walltime=2:00:00,nodes=1
#PBS -M username@uni-muenster.de
#PBS -m ae
#PBS -q default
#PBS -N Jobname
#PBS -j oe
cd $PBS_O_WORKDIR
./a.out

The lines stand for:

Name of the standard output file
Estimated walltime of the computation (2 hours) and number of nodes (here: 1 core on an arbitrary node)
Email address of the user
Email notification when the job aborts (a) or ends (e)
Name of the queue.
Name of the job
Put standard output and standard error messages in a single file
Change to the directory from which the script was submitted
Call a program

Put these commands in a file and submit it via "qsub filename" to enqueue your job in the batch system.
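
The submission step can be sketched as follows. This is only a minimal example: the file name job.sh, the job name Jobname and the program ./a.out are placeholders, and the guard around qsub is an addition so that the snippet also runs on a machine without the batch system client.

```shell
#!/bin/sh
# Write a small job script. The quoted 'EOF' keeps $PBS_O_WORKDIR unexpanded,
# so it is evaluated by the batch system at run time and not right now.
cat > job.sh <<'EOF'
#PBS -l walltime=2:00:00,nodes=1
#PBS -q default
#PBS -N Jobname
#PBS -j oe
cd "$PBS_O_WORKDIR"
./a.out
EOF

# Submit only where the batch system client is installed.
if command -v qsub >/dev/null 2>&1; then
    qsub job.sh    # prints the ID of the new job on success
else
    echo "qsub not found; job.sh was written but not submitted"
fi
```

The job ID printed by qsub is the identifier that the monitoring commands expect.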

Job monitoring

For a graphical overview of all running jobs, the utility pbstop can be used.

Another possibility is the command line tool "qstat". The following options are useful:

qstat -a: Shows all queued and running jobs
qstat -u username: Shows only the jobs of the specified user
qstat -n jobid: Shows the nodes that are running the specified job
qstat -f jobid: Shows the full information of the specified job
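
The options above can be wrapped in two small helper functions, for example in a shell startup file. This is a sketch only; the function names my_jobs and job_details are made up for this illustration and are not part of TORQUE.

```shell
#!/bin/sh
# List the jobs of one user; defaults to the user running the script.
my_jobs() {
    qstat -u "${1:-$(id -un)}"
}

# Show the nodes assigned to a job, followed by its full record.
# The argument is the job ID that qsub printed at submission.
job_details() {
    qstat -n "$1" && qstat -f "$1"
}
```

Calling my_jobs without an argument lists your own jobs; job_details takes a single job ID.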

Choosing the compute nodes

Switch in submit script	Compute nodes
Useful statements
-l nodes=1:smp:ppn=6	6 Cores on zivsmp001
-l nodes=1:hpc:ppn=8	8 Cores of a single node (e.g. node016). Only this method ensures that you get a node of your own, so that your job does not interfere with other jobs.
-l nodes=node016:ppn=8	Explicit selection of node016 with all of its cores
Not recommended statements
-l nodes=10	10 arbitrary cores of the HPC nodes or ZIVSMP.
-l nodes=2:smp:ppn=4	The job will not start since there is only one node with the property "smp" (ZIVSMP)
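
To check what a job was actually given, a job script can inspect the file named by $PBS_NODEFILE, which TORQUE fills with one line per allocated core. The sketch below requests a whole HPC node as recommended in the table; the function name allocation_summary is hypothetical, and the fallback to /dev/null is only there so that the script also runs outside the batch system.

```shell
#!/bin/sh
#PBS -l walltime=2:00:00,nodes=1:hpc:ppn=8
#PBS -N Jobname
#PBS -j oe

# Count allocated cores (lines) and distinct nodes (unique host names).
allocation_summary() {
    nodefile="${PBS_NODEFILE:-/dev/null}"   # unset outside a job
    cores=$(wc -l < "$nodefile" | tr -d ' ')
    nodes=$(sort -u "$nodefile" | wc -l | tr -d ' ')
    echo "$nodes node(s), $cores core(s)"
}

allocation_summary
# The actual program call (e.g. ./a.out) would follow here.
```

With the request above, the summary should report one node and eight cores; a different result suggests the resource request was not applied as intended.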

The maximum walltime that can be used is 160 hours.
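
A walltime request can be checked against this limit before submitting. The following is a minimal sketch that assumes the walltime is written as H:MM:SS, as in the example script; the function names walltime_seconds and walltime_ok are invented for this illustration.

```shell
#!/bin/sh
MAX_SECONDS=$((160 * 3600))   # 160 hours, the limit stated above

# Convert a walltime given as H:MM:SS into seconds.
# awk reads fields such as "08" as decimal numbers.
walltime_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeed if the requested walltime does not exceed the limit.
walltime_ok() {
    [ "$(walltime_seconds "$1")" -le "$MAX_SECONDS" ]
}

walltime_ok "2:00:00"   && echo "2:00:00 is within the limit"
walltime_ok "200:00:00" || echo "200:00:00 exceeds the limit"
```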