Commit 0780dff0 authored by root

Update for May

parent 2f8c1efb
......@@ -44,3 +44,4 @@ Trimmomatic/.backup
*.csv
*.data
*.inp
*.fast5
#!/bin/bash
# function Extract for common file formats
function extract {
    if [ -z "$1" ]; then
        # display usage if no parameters given
        echo "Usage: extract <path/file_name>.<zip|rar|bz2|gz|tar|tbz2|tgz|Z|7z|xz|lzma|exe|tar.bz2|tar.gz|tar.xz>"
    else
        if [ -f "$1" ]; then
            NAME=${1%.*}
            #mkdir $NAME && cd $NAME
            case "$1" in
                *.tar.bz2) tar xvjf ./"$1" ;;
                *.tar.gz)  tar xvzf ./"$1" ;;
                *.tar.xz)  tar xvJf ./"$1" ;;
                *.lzma)    unlzma ./"$1" ;;
                *.bz2)     bunzip2 ./"$1" ;;
                *.rar)     unrar x -ad ./"$1" ;;
                *.gz)      gunzip ./"$1" ;;
                *.tar)     tar xvf ./"$1" ;;
                *.tbz2)    tar xvjf ./"$1" ;;
                *.tgz)     tar xvzf ./"$1" ;;
                *.zip)     unzip ./"$1" ;;
                *.Z)       uncompress ./"$1" ;;
                *.7z)      7z x ./"$1" ;;
                *.xz)      unxz ./"$1" ;;
                *.exe)     cabextract ./"$1" ;;
                *)         echo "extract: '$1' - unknown archive method" ;;
            esac
        else
            echo "'$1' - file does not exist"
        fi
    fi
}
\ No newline at end of file
......@@ -6,6 +6,6 @@ cat <<- EOF > job${a}
#SBATCH -N ${a}
#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#echo $(pwd) >> results.txt
echo $(pwd) >> results.txt
EOF
done
......@@ -14,7 +14,8 @@
module purge
source /usr/local/module/spartan_old.sh
# Select a version of Amber
# Select a version of Amber - see https://ambermd.org/GetAmber.php for details on the
# differences between Amber and AmberTools
# module load Amber/16-GCC-6.2.0
# module load Amber/16-GCC-6.2.0-CUDA
......
#!/bin/bash
#Always set nodes=1 for Gaussian as it cannot use more than one node
#SBATCH --nodes=1
#SBATCH --ntasks=1
#Increase cpus-per-tasks to scale beyond 1 CPU
#SBATCH --cpus-per-task=1
#SBATCH --job-name="Gaussian Test"
# Change these as appropriate
......@@ -9,5 +14,8 @@ module purge
module load pgi/18.10-gcc-8.3.0-2.32
module load gaussian/g16c01
# Sets the number of processors/cores used by Gaussian. This can be overridden by the %NProcShared directive in the input files
export GAUSS_PDEF=${SLURM_CPUS_PER_TASK}
g16 < $INPUT_FILE > $OUTPUT_FILE
......@@ -10,11 +10,17 @@ do
cat <<- EOF > job${test}.slurm
#!/bin/bash
#SBATCH --job-name="Gaussian Test ${test}"
#Always set nodes=1 for Gaussian as it cannot use more than one node
#SBATCH --nodes=1
#SBATCH --ntasks=1
#Increase cpus-per-tasks to scale beyond 1 CPU
#SBATCH --cpus-per-task=1
#SBATCH --time=12:00:00
module purge
module load pgi/18.10-gcc-8.3.0-2.32
module load gaussian/g16c01
# Sets the number of processors/cores used by Gaussian. This can be overridden by the %NProcShared directive in the input files
export GAUSS_PDEF=${SLURM_CPUS_PER_TASK}
g16 < test${test}.com > test${test}.log
EOF
done
#!/bin/bash
# Include your project ID here to access GPGPU resources
# SBATCH -A punimXXXX
#SBATCH --partition=gpgpu
#SBATCH --gres=gpu:4
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --time=0-10:00:00
#SBATCH --job-name="2019guppy-barcode"
# Have a look at the README.md file!
# Copy the guppy directory
cp -r /usr/local/common/Guppy .
source /home/$(whoami)/ont-guppy/setup_guppy.sh
# Insert path for data, output and actions
# Sample data from Fannana Rafa. Thank you!
ont-guppy-4.4.2/bin/guppy_barcoder --input_path ~/Guppy/basecall --save_path ~/Guppy/barcode2/ --trim_barcodes -t 8 --device 'auto' --verbose_logs --detect_mid_strand_barcodes
# Introduction
The Guppy software supports MinIT and MinION instruments from Oxford Nanopore Technologies (ONT).
# Modules
You will need to load the following modules to run guppy/3.6.1:
fosscuda/2019b
guppy/3.6.1
The module fosscuda/2019b will load: gcc/8.3.0 cuda/10.1.243 openmpi/3.1.4
Or, if you want to use even older versions from the 2015 to 2019 build system:
source /usr/local/module/spartan_old.sh
module av guppy
--------------------------------------------------------- /usr/local/easybuild/modules/all ----------------------------------------------------------
Guppy/2.3.1-cpu Guppy/2.3.1-rpm Guppy/3.2.4 (D) ont-guppy/3.1.5
# New Versions and Installation
The Guppy licence changed in 2018; the licence agreement is between the user and ONT.
https://nanoporetech.com/sites/default/files/s3/terms/Nanopore-product-terms-and-conditions-nov2018-v2.pdf
As a result, we cannot perform a centralised installation and make it accessible through our modules system. It has to be installed in the
user's home directory. We can assist with the installation; the following steps may suffice.
Firstly, register yourself as a Guppy user with ONT and download the Linux64 tarball to your home directory.
Extract the tarball, then edit our setup script (setup_guppy.sh) and source it from your home directory whenever you use Guppy:
source /home/$(whoami)/ont-guppy/setup_guppy.sh
GUPPY_HOME=/home/$(whoami)/ont-guppy/ont-guppy
export LD_LIBRARY_PATH=$GUPPY_HOME/lib:$LD_LIBRARY_PATH
export PATH=$GUPPY_HOME/bin:$PATH
#!/bin/bash
#SBATCH --ntasks=8
module purge
module load OpenMPI/1.10.0-GCC-4.9.2
time srun mpi-helloworld
# Add job monitor snippet to determine resource usage
JOBID=$SLURM_JOB_ID
if [ -n "$SLURM_ARRAY_JOB_ID" ]; then
    JOBID="${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}"
fi
my-job-stats -a -j $JOBID
#!/bin/bash
module purge
module load spartan_2019
module load julia/1.3.1-linux-x86_64
julia simple.jl
## Parallel Processing
## Multithreaded Julia
Julia supports multithreaded and message passing computation.
For multithreaded applications, as with C, Fortran, etc., you need to set the number of threads in the shell. The export command must appear in
Slurm scripts, or be run on the compute node for interactive jobs.
For example, to request 4 cores for threading in an interactive job and launch Julia with 4 threads:
```
sinteractive --time=6:0:0 --ntasks=4
module load julia/1.5.1-linux-x86_64
export JULIA_NUM_THREADS=4
julia --threads 4
```
Within the Julia environment the number of threads can be confirmed with the Threads.nthreads() function, and the function Threads.threadid()
will identify which thread one is currently on.
```
julia> Threads.nthreads()
4
julia> Threads.threadid()
1
```
As with other multithreading environments (cf. OpenMP), the programmer is responsible for protecting against race conditions, including the
possibility that the same variables are written and read from multiple threads.
As a simple example, the following will create an array of 0s, and then runs a multithreaded command where each thread writes its ID to a member of the array.
```
julia> a = zeros(10)
julia> Threads.@threads for i = 1:10
           a[i] = Threads.threadid()
       end
julia> a
```
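Conversely, when all threads update the *same* variable, the access must be protected. A minimal sketch using `Threads.Atomic`; the loop and the target count are illustrative only:

```julia
using Base.Threads

# A plain `total += 1` from many threads can silently lose updates (a data race).
# Threads.Atomic serialises the increments so the result is deterministic.
total = Atomic{Int}(0)
@threads for i in 1:1000
    atomic_add!(total, 1)
end
println(total[])
```

Run with any value of `JULIA_NUM_THREADS`, this always prints 1000; replacing the atomic with an ordinary integer may not.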
## Distributed Julia
In addition to multithreaded applications, Julia also supports message passing parallel computing. In this case the `--procs` option
determines how many cores are in the communication world.
The following commands launch Julia with four worker processes, invoke the Distributed package, and then check the ID numbers of the master and
worker processes.
```
julia --procs 4
julia> using Distributed
julia> Distributed.myid()
julia> workers()
julia> @everywhere println("hello world")
```
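The worker pool can also be used for simple data-parallel maps. A minimal sketch, assuming a session with four workers as above (`f` is an illustrative function):

```julia
using Distributed
addprocs(4)

# @everywhere makes the function available on every worker process
@everywhere f(x) = x^2

# pmap farms the evaluations out across the worker pool
results = pmap(f, 1:8)
println(results)
```

This prints `[1, 4, 9, 16, 25, 36, 49, 64]`; each element may have been computed on a different worker.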
Julia is a high-level, high-performance dynamic programming language for technical computing. Homepage: http://julialang.org/
A simple example is provided of a Slurm script which reads in a Julia file (simple.jl).
Julia users sometimes require additional packages to be installed. Unfortunately these are not available system-wide, but rather have to be installed in each user's home directory. This is a multiple-step process, which requires downloading the metadata, followed by the packages required. For example:
......
Julia is a high-level, high-performance dynamic programming language for technical computing. Homepage: http://julialang.org/
A simple example is provided of a Slurm script which reads in a Julia file (simple.jl).
In addition, there is a Packages.md file describing the use of extensions in Julia, a Julia_Basics.md file which covers the core aspects of the language, and Julia_Advanced.md for parallel processing.
using DelimitedFiles
using Distributed
using Pkg
Pkg.add("SharedArrays")
@everywhere using SharedArrays
res = SharedArray(zeros(10))
# my_func stands in for the user's own function
@sync @distributed for x in 1:10
    res[x] = my_func(x)
end
writedlm("results.txt", res)
#=
Julia Ecology Example.
This is inspired by Timothée Poisot's (abandoned, only partially initiated) julia-ecology-lesson, itself a fork of the Software Carpentry
equivalent that uses Python, and has been converted to a Slurm script. It can, of course, be run in interactive mode to show the
functions.
https://github.com/tpoisot/julia-ecology-lesson
The example uses the Portal Teaching data, a subset of the data from Ernst *et al.*, "Long-term monitoring and experimental manipulation of a
Chihuahuan Desert ecosystem near Portal, Arizona, USA".
http://www.esapubs.org/archive/ecol/E090/118/default.htm
We are studying the species and weight of animals caught in plots in our study area.
The dataset is stored as a `.csv` file: each row holds information for a single animal, and the columns represent:
| Column | Description |
|:------------------|:------------------------------|
| `record_id` | Unique id for the observation |
| `month` | month of observation |
| `day` | day of observation |
| `year` | year of observation |
| `plot_id` | ID of a particular plot |
| `species_id` | 2-letter code |
| `sex` | sex of animal ("M", "F") |
| `hindfoot_length` | length of the hindfoot in mm |
| `weight` | weight of the animal in grams |
=#
## Change your directory, otherwise it will download, read, and write to $HOME! e.g.,
cd("$(homedir())/Julia/ecology")
## download function requires a URL and a name of the file to write to.
download("https://ndownloader.figshare.com/files/2292172", "surveys.csv")
# One of the best options for working with tabular data in Julia is to use the DataFrames.jl package. It provides data structures, and
# integrates nicely with other tools like Gadfly for plotting, and SQLite packages.
# Add the package if it hasn't already been installed.
using Pkg
Pkg.add("DataFrames")
using DataFrames
# Read the dataframe. A handy command!
# Adding the semi-colon at the end of the readtable() function prevents the output from being displayed to standard output.
# Remove this if running the example in interactive mode.
# Note: readtable() is from the older DataFrames versions used here; newer releases use CSV.read instead.
surveys_df = readtable("surveys.csv");
# Determine the type and the column names
# Display the species field, and determine the unique species
# See the following "cheat sheet" for Dataframes
# https://jcharistech.wordpress.com/julia-dataframes-cheat-sheets/
typeof(surveys_df)
names(surveys_df)
surveys_df[:,:species_id]
species=unique(surveys_df,:species_id)
# Write out the table
writetable("species.csv", species)
#!/bin/bash
#SBATCH -p cloud
#SBATCH --ntasks=1
module load Julia/0.6.0-binary
julia simple.jl
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
module load julia/1.5.1-linux-x86_64
export JULIA_NUM_THREADS=4
julia simple.jl
# A distributed version of Rule 30 cellular automaton.
# Derived from Przemysław Szufel, Bogumił Kamiński
# Julia 1.0 Programming Cookbook, 2018
# Invoke distributed parallelism
using Distributed
addprocs(4)
import Pkg; Pkg.add("ParallelDataTransfer")
@everywhere using ParallelDataTransfer
# Define Rule 30
@everywhere function rule30(ca::Array{Bool})
    lastv = ca[1]
    for i in 2:(length(ca)-1)
        current = ca[i]
        ca[i] = xor(lastv, ca[i] || ca[i+1])
        lastv = current
    end
end
# Define the function that can be used by an individual worker to acquire data from its neighbours
@everywhere function getsetborder(ca::Array{Bool},
                                  neighbours::Tuple{Int64,Int64})
    ca[1] = (@fetchfrom neighbours[1] caa[end-1])
    ca[end] = (@fetchfrom neighbours[2] caa[2])
end
# function to visualize the cellular automaton state
# function to visualize the cellular automaton state
function printsimdist(workers::Array{Int})
    for w in workers
        dat = @fetchfrom w caa
        for b in dat[2:end-1]
            print(b ? "#" : " ")
        end
    end
    println()
end
# function for iterating over the cellular automaton state
function runca(steps::Int, visualize::Bool)
    @sync for w in workers()
        @async @fetchfrom w fill!(caa, false)
    end
    @fetchfrom wks[Int(nwks/2)+1] caa[2] = true
    visualize && printsimdist(workers())
    for i in 1:steps
        @sync for w in workers()
            @async @fetchfrom w getsetborder(caa, neighbours)
        end
        @sync for w in workers()
            @async @fetchfrom w rule30(caa)
        end
        visualize && printsimdist(workers())
    end
end
# define the simulation state variables for each worker node, along with information about its neighbours
wks = workers()
nwks = length(wks)
for i in 1:nwks
    sendto(wks[i], neighbours=(i==1 ? wks[nwks] : wks[i-1],
                               i==nwks ? wks[1] : wks[i+1]))
    fetch(@defineat wks[i] const caa = zeros(Bool,15+2));
end
# run the distributed cellular automaton
runca(20,true)
using Distributed
addprocs(4)
@everywhere begin
    using ParallelDataTransfer
    using Random
    Random.seed!(100)
end
using Test
# creates an integer x and Matrix y on processes 1 and 2
sendto([1, 2], x=100, y=rand(2, 3))
@test abs(remotecall_fetch(getindex, 2, y, 1, 1) - .260) < 1e-2
# create a variable here, then send it everywhere else
z = randn(10, 10); sendto(workers(), z=z)
#@everywhere println(z)
# get the object named z from the Main module on process 2, and bind it locally as x
x = @getfrom(2, z)
@test x==z
y = getfrom(2, :z)
@test y==z
sendtosimple(2,:x,3)
y = @getfrom 2 x
@test y == 3
# pass variable named x from process 2 to all other processes
@spawnat 2 eval(:(x=1))
passobj(2, filter(x->x!=2, procs()), :x)
@test x==1
@defineat 3 x=3
xhome = @getfrom(3, x)
@test xhome == 3
@passobj 3 filter(x->x!=3, procs()) x
@test x==3
@defineat 3 x=5
@passobj 3 1 x
@test x==5
# broadcast needs to be fixed
@broadcast x=6
@passobj 4 1 x
@test x==6
# pass variables t, u, v from process 3 to process 1
@spawnat 3 eval(:(t=1))
@spawnat 3 eval(:(u=2))
@spawnat 3 eval(:(v=3))
passobj(3, 1, [:t, :u, :v])
@test [t;u;v] == [1;2;3]
@everywhere module Foo
foo = 1
end
passobj(3, 1, :foo, from_mod=Foo)
@test foo == 1
# Pass a variable from the `Foo` module on process 1 to Main on workers
passobj(1, workers(), :foo, from_mod=Foo)
#### @getfrom test ####
@everywhere mutable struct Bar
    a
    b
    c
end
Random.seed!(3)
bar_vec = [Bar(rand(3),rand(3),rand(3)) for n in 1:3]
sendto(workers(),bar_vec=bar_vec)
@test @getfrom(2,bar_vec[3].c) == bar_vec[3].c
remotecall(()->Main.bar_vec[3].c=ones(3),2)
mybar_3c1 = remotecall_fetch(()->Main.bar_vec[3].c,2)
@test mybar_3c1 == ones(3)
mybar_3c2 = @getfrom(2,bar_vec[3].c)
@test mybar_3c2 == ones(3)
path, io = mktemp() # Create temp file and store some definitions
println(io, "__f(x) = x")
println(io, "__g(x) = x")
close(io)
w = workers()
include_remote(path, w[1]) # Include file at remote
@test remotecall_fetch(()->@isdefined(__f), w[1])
@test remotecall_fetch(()->@isdefined(__g), w[1])
@test remotecall_fetch(()->!@isdefined(__f), w[2])
include_remote(path, w) # Include on all remotes
for w in w
    @test remotecall_fetch(()->@isdefined(__f), w)
    @test remotecall_fetch(()->@isdefined(__g), w)
end
rm(path)
......@@ -21,10 +21,8 @@ function quadratic2(a::Float64, b::Float64, c::Float64)
end
vol = sphere_vol(3)
# @printf allows number formatting but does not automatically append the \n to statements, see below
# @printf "volume = %0.3f\n" vol
# @printf deprecated, removed from example, 202007LL
quad1, quad2 = quadratic2(2.0, -2.0, -12.0)
println("result 1: ", quad1)
......
#!/bin/bash
#SBATCH --ntasks=4
module load foss/2019b
module load python/3.7.4
module load numpy/1.18.0-python-3.7.4
mpirun -np 4 python3 broadcast.py
#!/bin/bash
#SBATCH --ntasks=4
module load foss/2019b
module load python/3.7.4
srun -n 4 python3 helloworld.py
#!/bin/bash
#SBATCH --ntasks=4
module load foss/2019b
module load python/3.7.4
module load numpy/1.18.0-python-3.7.4
mpirun -np 4 python3 sendrecv.py
#!/bin/bash
#SBATCH --ntasks=4
module load foss/2019b
module load python/3.7.4
module load numpy/1.18.0-python-3.7.4
srun -n 4 python3 testmpi.py
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=8
#SBATCH --time=0-12:00:00
# Load required modules
......
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
if rank == 0:
    data = {'key1' : [7, 2.72, 2+3j],
            'key2' : ('abc', 'xyz')}
else:
    data = None
data = comm.bcast(data, root=0)
"""
Some utility functions useful for MPI parallel programming
"""
from mpi4py import MPI
#=============================================================================
# I/O Utilities
def pprint(str="", end="\n", comm=MPI.COMM_WORLD):
    """Print for MPI parallel programs: only rank 0 prints *str*."""
    if comm.rank == 0:
        print(str, end=end)
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
if rank == 0:
    data = {'a': 7, 'b': 3.14}
    req = comm.isend(data, dest=1, tag=11)
    req.wait()
elif rank == 1:
    req = comm.irecv(source=0, tag=11)
    data = req.wait()
Searching for mentions
The top ten users mentioned are:
@5SOS : 3409
@Calum5SOS : 2059
@Luke5SOS : 86
@MTV : 67
@ITSPUNKROCK : 45
@itunesfestival : 43
@AltPress : 42
@JohnFeldy : 42
@charliesimo : 40
@JackAllTimeLow : 40
real 0m6.219s
user 0m4.434s
sys 0m0.142s
Searching for topics
The top ten users mentioned are:
@5SOS : 3409
@Calum5SOS : 2059
@Luke5SOS : 86
@MTV : 67
@ITSPUNKROCK : 45
@itunesfestival : 43
@AltPress : 42
@JohnFeldy : 42
@JackAllTimeLow : 40
@charliesimo : 40
real 0m4.498s
user 0m4.358s
sys 0m0.100s
Searching for the keyword 'jumping'
The top ten users mentioned are:
@5SOS : 3409
@Calum5SOS : 2059
@Luke5SOS : 86
@MTV : 67
@ITSPUNKROCK : 45
@itunesfestival : 43
@JohnFeldy : 42
@AltPress : 42
@JackAllTimeLow : 40
@charliesimo : 40
real 0m4.487s
user 0m4.342s
sys 0m0.103s
......@@ -56,3 +56,10 @@ llafayette@unimelb.edu.au@9770l-133895-l:~$ ssh -X lev@spartan.hpc.unimelb.edu.a
[lev@spartan-rc168 ~]$ kill %1
Another common use case is to view PDFs, using the inbuilt viewer in X11. For example:
llafayette@unimelb.edu.au@9770l-133895-l:~$ ssh -X lev@spartan.hpc.unimelb.edu.au
[lev@spartan-login3 R]$ module load x11/20201008
[lev@spartan-login3 R]$ xpdf Rplots.pdf
Note that xpdf is only available on login nodes.