Commit dfc317f1 authored by Research Platforms's avatar Research Platforms

Update for Sep 2 2019

parent 0d726dc8
@@ -8,6 +8,7 @@ FSL/preCourse.tar.gz
 FSL/fmri
 Gaussian/g16
 Gaussian/tests
+Genomics
 HPCshells/NAMD
 NAMD/apoa1
 NAMD/NAMD_BENCHMARKS_SPARTAN
Abaqus example modified from Lev Lafayette, "Supercomputing with Linux", Victorian Partnership for Advanced Computing, 2015
The Abaqus FEA suite is commonly used in automotive engineering problems, using a common model data structure and integrated solver technology. As licensed software it requires a number of license tokens based on the number of cores requested, which can be calculated by the simple formula int(5 x N^0.422), where N is the number of cores. Device Analytics offers an online calculator at http://deviceanalytics.com/abaqus-token-calculator .
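The token formula above can be checked with a small sketch; the helper name here is illustrative, not part of Abaqus itself.

```python
# Illustrative helper for the licence-token formula from the text:
# tokens = int(5 * N^0.422), where N is the number of cores.
def abaqus_tokens(cores: int) -> int:
    return int(5 * cores ** 0.422)

# Token cost grows sub-linearly with core count:
for n in (1, 2, 4, 8, 16):
    print(n, abaqus_tokens(n))
# 1 core -> 5 tokens, 4 cores -> 8, 8 cores -> 12, 16 cores -> 16
```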
The case study here is a car door being propelled into a pole. This is analogous to the EURONCAP pole test, in which a car is propelled sideways into a rigid pole of diameter 254 mm at 29 km/h. While a crash event generally lasts for around 100 milliseconds, the time step of the case study has been reduced to 10 milliseconds to reduce the job time.
`Door.cae Door.inp abaqus.slurm abaqus-mpi.slurm`
The .cae file is the "complete Abaqus environment" file and the .inp file is the input file. The output files will be Door.odb and Door.jnl (the "output database" and "journal" files).
Submit the job using the following command: `sbatch abaqus.slurm`
The status of the job can be queried using the following command: `tail -f door.sta`
Once the job has completed, all files, with the exception of the output database (.odb) file, can be deleted. By default, ABAQUS/CAE writes the results of the analysis to the ODB file. When one creates a step, ABAQUS/CAE generates a default output request for the step, which in the case of this analysis is Energy Output. Check the output files for the job to ensure it has run correctly.
Use the Field Output Requests Manager to request output of variables that should be written at relatively low frequencies to the output database from the entire model or from a large portion of the model. The History Output Requests Manager is used to request output of variables that should be written to the output database at a high frequency from a small portion of the model; for example, the displacement of a single node.
The results will be visualised using ABAQUS/CAE. It should be noted that ABAQUS/Viewer is a subset of ABAQUS/CAE that contains only the post-processing capabilities of the Visualization module, so the procedure discussed in this tutorial also applies to ABAQUS/Viewer. Copy the files to your local machine and run Abaqus CAE there; do not do this on Trifid itself if at all possible. Having Abaqus on your desktop machine makes visualisation much easier.
It is almost always better to conduct computationally intensive tasks on the cluster and visualisation locally.
From the local command line: `abaqus cae`
The following procedure is used to open the ODB file:
* Select [Open Database] in the Session Start window.
* The Open Database dialog will appear. Select Output Database from the File Filter dropdown menu.
* Select Door.odb and click [OK].
By default, ABAQUS/CAE will plot the undeformed shape with exterior edges visible. For clarity (if the mesh density is high) it may be necessary to make feature edges visible. The following procedure is used:
* Select [Common Plot Options] in the Toolbox Area.
* In the Basic Tab, check Feature edges in the Visible Edges section.
* Select [OK]. The door assembly undeformed shape plot is shown in the following figure. Both exterior edges and feature edges are shown.
The following procedure can be used to plot the crash model's deformed shape:
* Select [Plot Deformed Shape] in the Toolbox area. By default, the final step is displayed. It should be noted that the Deformation Scale Factor is 1 by default in explicit analyses.
* Select [Animate: Time History] to animate the crash event. The frame rate can be adjusted by clicking [Animation Options] and moving the slider in the Player tab to the desired speed.
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=0:05:00
#SBATCH --gres=abaqus:5
module load ABAQUS/6.14.2-linux-x86_64
# Run the job 'Door'
abaqus job=Door
Description: ABINIT is a package whose main program allows one to find the total energy, charge density and electronic structure of systems made of electrons and nuclei (molecules and periodic solids) within Density Functional Theory (DFT), using pseudopotentials and a planewave or wavelet basis. - Homepage: http://www.abinit.org/
Sample job script based on: https://www.abinit.org/sites/default/files/last/tutorial/generated_files/lesson_base1.html
TUTORIAL IS UNDER DEVELOPMENT
#!/bin/bash
#SBATCH --partition=physical
#SBATCH --time=1:00:00
#SBATCH --ntasks=8
module load ABINIT/8.0.8b-intel-2016.u3
abinit < tbase1_x.files >& log
# H2 molecule in a big box
#
# In this input file, the location of the information on this or that line
# is not important : a keyword is located by the parser, and the related
# information should follow.
# The "#" symbol indicates the beginning of a comment : the remaining
# of the line will be skipped.
#Definition of the unit cell
acell 10 10 10 # The keyword "acell" refers to the
# lengths of the primitive vectors (in Bohr)
#rprim 1 0 0 0 1 0 0 0 1 # This line, defining orthogonal primitive vectors,
# is commented, because it is precisely the default value of rprim
#Definition of the atom types
ntypat 1 # There is only one type of atom
znucl 1 # The keyword "znucl" refers to the atomic number of the
# possible type(s) of atom. The pseudopotential(s)
# mentioned in the "files" file must correspond
# to the type(s) of atom. Here, the only type is Hydrogen.
#Definition of the atoms
natom 2 # There are two atoms
typat 1 1 # They both are of type 1, that is, Hydrogen
xcart # This keyword indicates that the location of the atoms
# will follow, one triplet of numbers for each atom
-0.7 0.0 0.0 # Triplet giving the cartesian coordinates of atom 1, in Bohr
0.7 0.0 0.0 # Triplet giving the cartesian coordinates of atom 2, in Bohr
#Definition of the planewave basis set
ecut 10.0 # Maximal plane-wave kinetic energy cut-off, in Hartree
#Definition of the k-point grid
kptopt 0 # Enter the k points manually
nkpt 1 # Only one k point is needed for isolated system,
# taken by default to be 0.0 0.0 0.0
#Definition of the SCF procedure
nstep 10 # Maximal number of SCF cycles
toldfe 1.0d-6 # Will stop when, twice in a row, the difference
# between two consecutive evaluations of total energy
# differ by less than toldfe (in Hartree)
# This value is way too large for most realistic studies of materials
diemac 2.0 # Although this is not mandatory, it is worthwhile to
# precondition the SCF cycle. The model dielectric
# function used as the standard preconditioner
# is described in the "dielng" input variable section.
# Here, we follow the prescriptions for molecules
# in a big box
## After modifying the following section, one might need to regenerate the pickle database with runtests.py -r
#%%<BEGIN TEST_INFO>
#%% [setup]
#%% executable = abinit
#%% [files]
#%% files_to_test =
#%% tbase1_1.out, tolnlines= 0, tolabs= 0.000e+00, tolrel= 0.000e+00
#%% psp_files = 01h.pspgth
#%% [paral_info]
#%% max_nprocs = 1
#%% [extra_info]
#%% authors = Unknown
#%% keywords =
#%% description = H2 molecule in a big box
#%%<END TEST_INFO>
tbase1_1.in
tbase1_1.out
in_tbase1
out_tbase1
tmp_tbase1
# LL 20190805
Amber (originally Assisted Model Building with Energy Refinement) is software for performing molecular dynamics and structure prediction.
TUTORIAL NOT YET COMPLETE
#!/bin/bash
# Add your project account details here.
# SBATCH --account=XXXX
#SBATCH --partition=gpgpu
#SBATCH --ntasks=4
#SBATCH --time=1:00:00
module load Amber/16-gompi-2017b-CUDA-mpi
mpiexec /usr/local/easybuild/software/Amber/16-gompi-2017b-CUDA-mpi/amber16/bin/pmemd.cuda_DPFP.MPI -O -i mdin -o mdout -inf mdinfo -x mdcrd -r restrt
#include <stdio.h>
void init(int *a, int N)
{
int i;
for (i = 0; i < N; ++i)
{
a[i] = i;
}
}
__global__
void doubleElements(int *a, int N)
{
int idx = blockIdx.x * blockDim.x + threadIdx.x;
int stride = gridDim.x * blockDim.x;
/*
* The previous code (now commented out) attempted
* to access an element outside the range of `a`.
*/
// for (int i = idx; i < N + stride; i += stride)
for (int i = idx; i < N; i += stride)
{
a[i] *= 2;
}
}
bool checkElementsAreDoubled(int *a, int N)
{
int i;
for (i = 0; i < N; ++i)
{
if (a[i] != i*2) return false;
}
return true;
}
int main()
{
int N = 10000;
int *a;
size_t size = N * sizeof(int);
cudaMallocManaged(&a, size);
init(a, N);
/*
* The previous code (now commented out) attempted to launch
* the kernel with more than the maximum number of threads per
* block, which is 1024.
*/
size_t threads_per_block = 1024;
/* size_t threads_per_block = 2048; */
size_t number_of_blocks = 32;
cudaError_t syncErr, asyncErr;
doubleElements<<<number_of_blocks, threads_per_block>>>(a, N);
/*
* Catch errors for both the kernel launch above and any
* errors that occur during the asynchronous `doubleElements`
* kernel execution.
*/
syncErr = cudaGetLastError();
asyncErr = cudaDeviceSynchronize();
/*
* Print errors should they exist.
*/
if (syncErr != cudaSuccess) printf("Error: %s\n", cudaGetErrorString(syncErr));
if (asyncErr != cudaSuccess) printf("Error: %s\n", cudaGetErrorString(asyncErr));
bool areDoubled = checkElementsAreDoubled(a, N);
printf("All elements were doubled? %s\n", areDoubled ? "TRUE" : "FALSE");
cudaFree(a);
}
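The kernel above uses the grid-stride loop pattern: each thread starts at its global index and advances by the total number of threads in the grid, so together the threads cover every element exactly once even when N exceeds the thread count. A CPU sketch of the idea, with illustrative names rather than CUDA API calls:

```python
# CPU simulation of a CUDA grid-stride loop. Each simulated
# "thread" handles elements idx, idx + stride, idx + 2*stride, ...
# where stride is the total thread count (gridDim.x * blockDim.x).
def double_elements(a, num_blocks=32, threads_per_block=256):
    total_threads = num_blocks * threads_per_block   # the "stride"
    for idx in range(total_threads):                 # one pass per thread
        for i in range(idx, len(a), total_threads):  # grid-stride loop
            a[i] *= 2

a = list(range(10000))
double_elements(a)
print(all(a[i] == i * 2 for i in range(10000)))  # True
```

Because the start indices 0..stride-1 partition the array into disjoint residue classes, no element is touched twice and none is skipped, which is why the corrected bound `i < N` (rather than `i < N + stride`) is sufficient.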
#include <stdio.h>
void init(int *a, int N)
{
int i;
for (i = 0; i < N; ++i)
{
a[i] = i;
}
}
__global__
void doubleElements(int *a, int N)
{
int idx = blockIdx.x * blockDim.x + threadIdx.x;
int stride = gridDim.x * blockDim.x;
for (int i = idx; i < N + stride; i += stride)
{
a[i] *= 2;
}
}
bool checkElementsAreDoubled(int *a, int N)
{
int i;
for (i = 0; i < N; ++i)
{
if (a[i] != i*2) return false;
}
return true;
}
int main()
{
int N = 10000;
int *a;
size_t size = N * sizeof(int);
cudaMallocManaged(&a, size);
init(a, N);
size_t threads_per_block = 2048;
size_t number_of_blocks = 32;
cudaError_t syncErr, asyncErr;
doubleElements<<<number_of_blocks, threads_per_block>>>(a, N);
/*
* Catch errors for both the kernel launch above and any
* errors that occur during the asynchronous `doubleElements`
* kernel execution.
*/
syncErr = cudaGetLastError();
asyncErr = cudaDeviceSynchronize();
/*
* Print errors should they exist.
*/
if (syncErr != cudaSuccess) printf("Error: %s\n", cudaGetErrorString(syncErr));
if (asyncErr != cudaSuccess) printf("Error: %s\n", cudaGetErrorString(asyncErr));
bool areDoubled = checkElementsAreDoubled(a, N);
printf("All elements were doubled? %s\n", areDoubled ? "TRUE" : "FALSE");
cudaFree(a);
}
#include <stdio.h>
void init(int *a, int N)
{
int i;
for (i = 0; i < N; ++i)
{
a[i] = i;
}
}
__global__
void doubleElements(int *a, int N)
{
int idx = blockIdx.x * blockDim.x + threadIdx.x;
int stride = gridDim.x * blockDim.x;
for (int i = idx; i < N + stride; i += stride)
{
a[i] *= 2;
}
}
bool checkElementsAreDoubled(int *a, int N)
{
int i;
for (i = 0; i < N; ++i)
{
if (a[i] != i*2) return false;
}
return true;
}
int main()
{
/*
* Add error handling to this source code to learn what errors
* exist, and then correct them. Googling error messages may be
* of service if actions for resolving them are not clear to you.
*/
int N = 10000;
int *a;
size_t size = N * sizeof(int);
cudaMallocManaged(&a, size);
init(a, N);
size_t threads_per_block = 2048;
size_t number_of_blocks = 32;
doubleElements<<<number_of_blocks, threads_per_block>>>(a, N);
cudaDeviceSynchronize();
bool areDoubled = checkElementsAreDoubled(a, N);
printf("All elements were doubled? %s\n", areDoubled ? "TRUE" : "FALSE");
cudaFree(a);
}
#include <stdio.h>
/*
* Refactor firstParallel so that it can run on the GPU.
*/
__global__ void firstParallel()
{
printf("This should be running in parallel.\n");
}
int main()
{
/*
* Refactor this call to firstParallel to execute in parallel
* on the GPU.
*/
firstParallel<<<5,5>>>();
/*
* Some code is needed below so that the CPU will wait
* for the GPU kernels to complete before proceeding.
*/
cudaDeviceSynchronize();
}
#include <stdio.h>
void init(int *a, int N)
{
int i;
for (i = 0; i < N; ++i)
{
a[i] = i;
}
}
__global__
void doubleElements(int *a, int N)
{
int i;
i = blockIdx.x * blockDim.x + threadIdx.x;
if (i < N)
{
a[i] *= 2;
}
}
bool checkElementsAreDoubled(int *a, int N)
{
int i;
for (i = 0; i < N; ++i)
{
if (a[i] != i*2) return false;
}
return true;
}
int main()
{
int N = 1000;
int *a;
size_t size = N * sizeof(int);
/*
* Use `cudaMallocManaged` to allocate pointer `a` available
* on both the host and the device.
*/
cudaMallocManaged(&a, size);
init(a, N);
size_t threads_per_block = 256;
size_t number_of_blocks = (N + threads_per_block - 1) / threads_per_block;
doubleElements<<<number_of_blocks, threads_per_block>>>(a, N);
cudaDeviceSynchronize();
bool areDoubled = checkElementsAreDoubled(a, N);
printf("All elements were doubled? %s\n", areDoubled ? "TRUE" : "FALSE");
/*
* Use `cudaFree` to free memory allocated
* with `cudaMallocManaged`.
*/
cudaFree(a);
}
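The block count in the solution above, `(N + threads_per_block - 1) / threads_per_block`, is integer ceiling division: it launches just enough blocks that the grid has at least N threads, with the `if (i < N)` guard in the kernel discarding the excess. A quick sketch of the arithmetic:

```python
# Integer ceiling division, as used to size the CUDA grid above.
def blocks_needed(n, threads_per_block):
    return (n + threads_per_block - 1) // threads_per_block

print(blocks_needed(1000, 256))  # 4 -> 1024 threads cover 1000 elements
print(blocks_needed(1024, 256))  # 4 -> an exact fit launches no extra block
```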
#include <stdio.h>
/*
* Refactor firstParallel so that it can run on the GPU.
*/
__global__ void firstParallel()
{
printf("This should be running in parallel.\n");
}
int main()
{
/*
* Refactor this call to firstParallel to execute in parallel
* on the GPU.
*/
firstParallel<<<5,5>>>();
/*
* Some code is needed below so that the CPU will wait
* for the GPU kernels to complete before proceeding.
*/
cudaDeviceSynchronize();
}
#include <stdio.h>
/*
* Refactor firstParallel so that it can run on the GPU.
*/