Last Updated - 2019-09-02
This is a collection of sample test jobs for Spartan, written as Slurm submission scripts for a variety of applications, along with small sample MPI and OpenMP programs.
** HOW TO EXECUTE JOBS **
Executing a job on the cluster involves submitting a Slurm job request script using the `sbatch` command, or attaching your terminal session to a compute node directly by using the `sinteractive` command. Both of these commands, and the sbatch scripts used to define a job request, accept a list of flags that can be seen by running `sbatch --help`.
Functionally, this works by creating a new user session on a compute node (or in the case of an interactive job, connecting your current session to a compute node).
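As a minimal sketch (the resource values, module, and script name here are placeholders to adjust for your own work), a job script and its submission might look like this:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=0:10:00
# Load whatever software your job needs (see the software section below)
module purge
module load spartan_2019
# The actual work of the job
echo "Running on $(hostname)"
Save this as, say, myjob.slurm and submit it with `sbatch myjob.slurm`; Slurm will report a job ID you can use with the monitoring commands described below. An interactive session with similar resources can be requested with, for example, `sinteractive --ntasks=1 --time=0:10:00`.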
The compute nodes on Spartan are divided into partitions based on their intended use, which also influences the type of hardware we employ. Each node in a partition has the same number of CPUs and amount of RAM. Configurations are as follows:
cloud - 12 CPU - 100GB RAM - Intended for single node calculations of any type, or loosely coupled multi-node applications. Cloud nodes run 1:1 (non-oversubscribed) vCPUs. No fast interconnect, standard TCP networking.
physical - 12-72 CPU - 256GB-1.5TB RAM - Intended for MPI multi-node calculations. Fast interconnect.
bigmem - 32 CPU - 1.5TB RAM - Intended for single node operations that require a large amount of RAM but cannot easily be distributed over multiple nodes.
Note there are several queues that are accessible only to specific project groups:
gpgpu - 24 CPU - 127GB RAM - 4xP100 GPU - LIEF GPGPU, accessible to specified GPGPU projects only. Please see the GPU directory for details.
deeplearn - 28 CPU - 256GB RAM - 4xV100 GPU - Intended for use by MSE and associated groups, for neural network development.
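To target a particular partition, add a `--partition` flag to your request, either in the script or on the command line; for instance (the task count and walltime here are arbitrary):
#SBATCH --partition=physical
#SBATCH --ntasks=24
#SBATCH --time=1:00:00
or interactively, `sinteractive --partition=cloud --time=0:30:00`. Jobs on the restricted partitions (gpgpu, deeplearn) will also need the appropriate project `--account` set; see the GPU directory for details.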
** HOW TO LOAD SOFTWARE **
We use LMod to manage existing software installations. Because Spartan is a general purpose HPC cluster with varying requirements and uses, we sometimes need to make different versions of the same software package available, and we need to segregate them so they don't interfere with each other.
This is particularly true of applications that are a language in their own right (e.g. Python, R, Matlab) or that need to be matched to those types of applications (e.g. Tensorflow needs to be compiled against a specific Python version, which then needs to be loaded alongside it when it is used).
LMod allows us to assert that certain modules are necessary to make other modules run, and then load them automatically. As such, you can (in almost all standard cases) simply load the application you wish to use and all required components will be loaded as part of that.
Typing `module avail` will show a complete list of currently installed software. You can do a simple search with this command by adding the search term to the end (e.g. `module avail Tensorflow`), or do basic regex searches with the '-r' flag (e.g. `module -r avail '^Python'`). Note that the list is limited to your current compiler options. To view all available software and adopt a mix of compilers you will need to source the old configuration system (`source /usr/local/module/spartan_old.sh`).
Typing `module list` will show a complete list of software currently loaded into your user environment. We load a few small modules into your environment when you log in by default; none of these are strictly necessary to use the cluster.
You can use `module purge` to remove all currently loaded modules from your environment, and (of course), `module load packagename/version` to load a specific module.
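For example, a typical session might look like the following (the package and version are purely illustrative; substitute whatever `module avail` reports):
$ module avail GCC
$ module load GCC/6.2.0
$ module list
$ module purge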
Looking at the names of modules, we can see they have two parts: the package name, and a version number plus toolchain, sometimes with additional details... but these can be confusing when you're not used to interpreting them.
Here are several examples of existing packages, each slightly more complex than the last:
GCC/6.2.0 - GNU Compiler suite, version 6.2.0. The lack of any other designation after the version number marks this as either a binary (i.e. built and distributed externally) or a Toolchain. Those of you already tasked with writing and building C or Fortran code will almost certainly recognise this.
intel/2016.u3 - the Intel Compiler suite, version 2016, update 3. A toolchain that we have used to build a large amount of software over the last few years. The Intel Compiler Suite is a set of highly optimised code additions that work alongside the GCC compiler suite (if you use `module list` after loading this module, you will see that it also loads GCC/4.9.2 automatically), but also contains Intel's MPI and specialised Math libraries. This is a simple example of a 'bundle' module, which exists to load subcomponents (e.g. icc, ifort, impi, imkl).
FFTW/3.3.4-GCC-6.2.0 - The FFTW Fourier transform library, version 3.3.4, built with GCC-6.2.0. A math library we use as an underlying dependency to build higher level applications. You can also utilize these if you are building software, but you must remember to load them before the build and again before you attempt to use the resulting software.
NAMD/2.12-intel-2017.u2-mpi-CUDA - NAMD Scalable Molecular Dynamics Simulator, version 2.12, built with the Intel Compiler Suite, 2017 version, update 2. Note the extra suffixes on this toolchain version... these indicate that specific flags were used in the build. This particular version has MPI specifically enabled (for multi-node jobs) and can utilize some version of CUDA.
Tensorflow/1.4.0-intel-2017.u2-GCC-5.4.0-CUDA8-Python-3.5.2-GPU - Tensorflow 1.4.0 built for Python 3.5.2 using a toolchain comprising the Intel Compiler Suite 2017 update 2, GCC 5.4.0 and CUDA v8. It has a flag to show it has been built with specific build settings enabling GPU usage, though that's perhaps a little redundant here.
You can see some details about most modules by doing `module whatis packagename`, and can see more details about what the module does when it loads (particularly, what paths and environmental variables it adds) by doing `module show packagename`.
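For instance, using one of the modules mentioned above:
$ module whatis FFTW/3.3.4-GCC-6.2.0
$ module show FFTW/3.3.4-GCC-6.2.0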
Note that it is important that toolchains align. If you load several modules in sequence and see modules being changed in the process, it's possible you'll see a conflict at runtime... Different compilers can build the same applications with different features, and different versions of libraries can perform tasks in different ways, which can mess with the way those libraries interface with the program that's using them.
Additionally, some toolchains are built with specific goals in mind. Most notably, the CUDA suffixed toolchains (e.g. intel-2017.u2-GCC-5.4.0-CUDA8) are intended for use on the gpgpu partition and are built with the specific settings required by the hardware present in those partitions.
** USEFUL COMMANDS **
Slurm comes packaged with the following command line tools, all of which have their own flags and options. You can see usage information for each by adding `--help`; a few examples follow the list below.
`sinfo` - Show all Slurm queues & summarise the states of nodes in them. This is useful for determining how busy the cluster currently is. Note that some queues are restricted.
`squeue` - Show all currently executing and pending jobs (for all users) and their current states.
`sacct` - Report data on running and completed jobs. Note that, by default, searches start from 00:00:00 on the current day. To view older jobs you will need to use the -S flag and specify a date.
`sstat` - Show the run status of a currently running job.
`scancel` - Cancel a queued job. Of course, you will only be able to cancel jobs you have permission to control.
`scontrol` - A general, but more complex tool. Can perform many functions of the above commands from a single tool. Some useful subcommands of scontrol are `scontrol hold` and `scontrol release`, which respectively allow you to prevent a job from starting and to release said job so it can start. Note that many of scontrol's capabilities are also permission-locked.
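A few illustrative invocations (the job ID 123456 is a placeholder; substitute one of your own job IDs from `squeue` or the `sbatch` output):
$ squeue -u $USER
$ sacct -S 2020-07-01
$ sstat -j 123456
$ scancel 123456
$ scontrol hold 123456
$ scontrol release 123456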
** OTHER USEFUL INFO **
Our '/data/projects', '/home', and '/scratch' trees are all on CephFS, allowing for a shared parallel filesystem across the entire cluster. Ceph generally doesn't have good performance when dealing with small writes, and appends in particular. Some applications are particularly bad at dealing with this, notably bioinformatics tools and other processes that write, append to, or alter lots of small existing files in place in quick succession. This includes build configuration tools like CMake.
In many cases, particularly for small compilations and the like, it is worth using the location '/var/local/tmp'. This location exists on every node in the cluster, and is always on the local disk of the node, isolated from others. It is configured so that files copied to it retain the ownership of the person who copied or created them, so anything you put in there will be owned by you and not viewable by anyone else using the node by default. By implication, this area is not shared across nodes.
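A common pattern, sketched below with placeholder paths and a placeholder application name (`myapplication`), is to stage input onto the node-local disk at the start of the job, work there, and copy the results back to shared storage at the end:
# Stage data to node-local disk, compute, then copy the results home
LOCALDIR=/var/local/tmp/${SLURM_JOB_ID}
mkdir -p ${LOCALDIR}
cp ~/myproject/input.dat ${LOCALDIR}/
cd ${LOCALDIR}
myapplication input.dat > output.dat
cp output.dat ~/myproject/
# Clean up the local copy when finished
cd ~
rm -rf ${LOCALDIR}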
** TASK AND APPLICATION SPECIFIC EXAMPLES **
Each directory contains examples of submission scripts and/or relevant input for a specific set of tasks or a commonly used application. Here is a list of directories and their contents:
array/ - example array batch jobs, using Slurm's Job Array functionality.
BLAST/ - example submission script and input data for RNA/DNA alignment using BLAST
depend/ - example scripts featuring Slurm's dependency functionality, allowing for automation flow control and conditional execution based on prior job results.
FDS/ - example submission script for FDS - Fire Dynamics Simulator
FreeSurfer/ - example scripts and input for FreeSurfer MRI toolkit
FSL/ - example scripts and input files for the FMRIB Software Library (FSL)
Gaussian/ - example scripts and input for Gaussian/G09 (and other iterations)
GPU/ - example submission scripts, CUDA code and compiled CUDA-enabled executables.
GROMACS/ - example input for GROningen MAchine for Chemical Simulations (GROMACS). Scripts pending.
Gurobi/ - License information for Gurobi optimization toolkit
HPCshells/ - A variety of shell scripting examples for those looking to learn new tricks. Course material for our HPC Shells course.
interact/ - A quick guide on invoking interactive sessions using the sinteractive command
IntroLinux/ - Course material from our introductory Linux shell course
MATLAB/ - example scripts and input files for simple single node Matlab jobs.
NAMD/ - simple example scripts for basic NAMD jobs. MPI & CUDA enabled job guides pending.
Octave/ - example scripts and code for GNU Octave.
OpenMP/ - example code for OpenMP local multicore applications
OpenMPI/ - example code and submission scripts for OpenMPI multi-node applications
ORCA/ - example scripts and inputs for the ORCA ab initio quantum chemistry package
Python/ - example code and scripts for using Python on the cluster, including MPI4py
Qchem/ - example submission script for the Qchem quantum chemistry package
R/ - example code and submission scripts for using R on the cluster.
Singularity/ - example submission script and machine image for Singularity container system.
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=0:05:00
# Request the Abaqus license resource (site-specific GRES)
#SBATCH --gres=abaqus+5
module purge
source /usr/local/module/spartan_old.sh
module load ABAQUS/6.14.2-linux-x86_64
# Run the job 'Door'
abaqus job=Door
#!/bin/bash
#SBATCH --partition=physical
#SBATCH --time=1:00:00
module purge
source /usr/local/module/spartan_old.sh
module load ABINIT/8.0.8b-intel-2016.u3
abinit < tbase1_x.files >& log
#!/bin/bash
# To give your job a name, replace "MyJob" with an appropriate name
#SBATCH --job-name=2015ABRicate-test.slurm
# Run on single CPU
#SBATCH --ntasks=1
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module purge
source /usr/local/module/spartan_old.sh
module load ABRicate/0.8.7-spartan_intel-2017.u2
# The command to actually run the job
abricate ecoli_rel606.fasta
#!/bin/bash
# To give your job a name, replace "MyJob" with an appropriate name
#SBATCH --job-name=ABRicate-test.slurm
# Run on single CPU
#SBATCH --ntasks=1
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module purge
module load spartan_2019
module load foss/2019b
module load abricate/0.9.9-perl-5.30.0
# The command to actually run the job
abricate ecoli_rel606.fasta
#!/bin/bash
# To give your job a name, replace "MyJob" with an appropriate name
#SBATCH --job-name=ABySS-test.slurm
# Run on single CPU
#SBATCH --ntasks=1
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module purge
module load ABySS/2.0.2-goolf-2015a
# Assemble a small synthetic data set
tar xzvf test-data.tar.gz
sleep 20
abyss-pe k=25 name=test in='test-data/reads1.fastq test-data/reads2.fastq'
# Calculate assembly contiguity statistics
abyss-fac test-unitigs.fa
#!/bin/bash
# To give your job a name, replace "MyJob" with an appropriate name
#SBATCH --job-name=ABySS-test.slurm
# Run on single CPU
#SBATCH --ntasks=1
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module purge
module load spartan_2019
module load foss/2019b
module load abyss/2.1.5
# Assemble a small synthetic data set
tar xzvf test-data.tar.gz
sleep 20
abyss-pe k=25 name=test in='test-data/reads1.fastq test-data/reads2.fastq'
# Calculate assembly contiguity statistics
abyss-fac test-unitigs.fa
#!/bin/bash
# To give your job a name, replace "MyJob" with an appropriate name
#SBATCH --job-name=2015ADMIXTURE-test.slurm
# Run with two threads
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module load ADMIXTURE/1.3.0
# Untar sample files, run application
# See admixture --help for options.
tar xvf hapmap3-files.tar.gz
admixture -j2 hapmap3.bed 1
#!/bin/bash
# To give your job a name, replace "MyJob" with an appropriate name
#SBATCH --job-name=2019ADMIXTURE-test.slurm
# Run with two threads
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module purge
module load spartan_2019
module load admixture/1.3.0
# Untar sample files, run application
# See admixture --help for options.
tar xvf hapmap3-files.tar.gz
admixture -j2 hapmap3.bed 1
#!/bin/bash
# To give your job a name, replace "MyJob" with an appropriate name
#SBATCH --job-name=2015AFNI-test.slurm
# Run on single CPU
#SBATCH --ntasks=1
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module purge
source /usr/local/module/spartan_old.sh
module load AFNI/linux_openmp_64-spartan_intel-2017.u2-20190219
# Untar dataset and run script
tar xvf ARzs_data.tgz
./@ARzs_analyze
#!/bin/bash
# To give your job a name, replace "MyJob" with an appropriate name
#SBATCH --job-name=AFNI-test.slurm
# Run on single CPU
#SBATCH --ntasks=1
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module purge
module load spartan_2019
module load foss/2019b
module load afni/18.3.00-python-3.7.4
# Untar dataset and run script
tar xvf ARzs_data.tgz
./@ARzs_analyze
This is incomplete; still getting the tgz files organised. LL20200707
#!/bin/bash
#SBATCH --job-name="2015ANSYStest"
# Note this order has to be kept. It's horrible, but it works.
module purge
source /usr/local/module/spartan_old.sh
module load X11/20190311-spartan_gcc-6.2.0
module load motif/2.3.5-goolf-2015a
module load libXpm/3.5.11-goolf-2015a
module load ANSYS_CFD/19.0
ansys190 -b < OscillatingPlate.inp > OscillatingPlate.db
#!/bin/bash
# Job name and partition
#SBATCH --job-name=ARAGORN-test.slurm
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module purge
source /usr/local/module/spartan_old.sh
module load ARAGORN/1.2.36-GCC-4.9.2
# Run the application
aragorn -o results sample.fa
#!/bin/bash
# Add your project account details here.
# SBATCH --account=XXXX
#SBATCH --partition=gpgpu
#SBATCH --ntasks=4
#SBATCH --time=1:00:00
module purge
source /usr/local/module/spartan_old.sh
module load Amber/16-gompi-2017b-CUDA-mpi
srun /usr/local/easybuild/software/Amber/16-gompi-2017b-CUDA-mpi/amber16/bin/pmemd.cuda_DPFP.MPI -O -i mdin -o mdout -inf mdinfo -x mdcrd -r restrt
Array job indices can be specified in a number of ways.
A job array with index values between 0 and 31:
#SBATCH --array=0-31
A job array with index values of 1, 2, 5, 19, 27:
#SBATCH --array=1,2,5,19,27
A job array with index values between 1 and 7 with a step size of 2 (i.e. 1, 3, 5, 7):
#SBATCH --array=1-7:2
As with all Slurm directives, these options can be applied within the batch script (as `#SBATCH` lines) or passed to `sbatch` on the command line.
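For example, to submit an existing script as an array without editing it (the script name is a placeholder):
$ sbatch --array=1-7:2 myjob.slurm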
To convert a directory of files to include an array ID see the following example:
$ touch aaa.fastq.gz bbb.fastq.gz ccc.fastq.gz ddd.fastq.gz
$ ls
aaa.fastq.gz bbb.fastq.gz ccc.fastq.gz ddd.fastq.gz
$ n=1; for f in *fastq.gz; do mv "$f" "$((n++))$f"; done
$ ls
1aaa.fastq.gz 2bbb.fastq.gz 3ccc.fastq.gz 4ddd.fastq.gz
See also the Octave array example in /usr/local/common/Octave.
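Once the files are numbered like this, each array task can select its own input using the `$SLURM_ARRAY_TASK_ID` environment variable. A minimal sketch, assuming the renamed fastq files above and a placeholder processing command `myprogram`:
#!/bin/bash
#SBATCH --job-name="fastq-array"
#SBATCH --ntasks=1
#SBATCH --time=0-00:15:00
#SBATCH --array=1-4
# Each task picks up the file whose name begins with its array index
INPUT=$(ls ${SLURM_ARRAY_TASK_ID}*.fastq.gz)
myprogram "${INPUT}"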
#!/bin/bash
#SBATCH --job-name="file-array"
#SBATCH --ntasks=1
#SBATCH --time=0-00:15:00
#SBATCH --array=1-5
# Note: SLURM defaults to running jobs in the directory
# where they are submitted, no need for $PBS_O_WORKDIR
mkdir ${SLURM_ARRAY_TASK_ID}
#!/bin/bash
# Job name and partition
#SBATCH --job-name=2015BAMM-test.slurm
# Run on single CPU
#SBATCH --ntasks=1
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Speciation-extinction analyses
# You must have an ultrametric phylogenetic tree.
# Load the environment variables
module purge
source /usr/local/module/spartan_old.sh
module load BAMM/2.5.0-spartan_intel-2017.u2
# Example from: `http://bamm-project.org/quickstart.html`
# To run bamm you must specify a control file.
# The following is for diversification.
# You may wish to use traits instead
# bamm -c template_trait.txt
bamm -c template_diversification.txt
#!/bin/bash
# Job name and partition
#SBATCH --job-name=BAMM-test.slurm
# Run on single CPU
#SBATCH --ntasks=1
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Speciation-extinction analyses
# You must have an ultrametric phylogenetic tree.
# Load the environment variables
module purge
module load spartan_2019
module load foss/2019b
module load bamm/2.5.0
# Example from: `http://bamm-project.org/quickstart.html`
# To run bamm you must specify a control file.
# The following is for diversification.
# You may wish to use traits instead
# bamm -c template_trait.txt
bamm -c template_diversification.txt
#!/bin/bash
# To give your job a name, replace "MyJob" with an appropriate name
#SBATCH --job-name=2015BBMap-test.slurm
# Run on single CPU
#SBATCH --ntasks=1
# set your minimum acceptable walltime=days-hours:minutes:seconds
#SBATCH -t 0:15:00
# Specify your email address to be notified of progress.
# SBATCH --mail-user=youremailaddress@unimelb.edu
# SBATCH --mail-type=ALL
# Load the environment variables
module purge
source /usr/local/module/spartan_old.sh
module load BBMap/36.62-intel-2016.u3-Java-1.8.0_71
# See examples at:
# http://seqanswers.com/forums/showthread.php?t=58221
reformat.sh in=sample1.fq out=processed.fq