Commit 0d726dc8 authored by Research Platforms's avatar Research Platforms

Add Tensorflow and Digits

parent 72479b77
BLAST/dbs
BLAST/rat-ests
digits/digits.img
FreeSurfer/buckner_data
FreeSurfer/buckner_data-tutorial_subjs.tar.gz
FSL/intro
## TensorFlow Benchmark Example
This example runs the TensorFlow benchmarks (for v1.8) on the Spartan GPGPU partition. By default it uses ResNet-50, a batch size of 64, and a whole node (4 GPUs and 24 CPUs), but these settings can be varied as needed.
As of 18 July 2018, this particular configuration was achieving about 730 images/second across 4 GPUs.
Benchmark: https://www.tensorflow.org/performance/benchmarks
Source: https://github.com/tensorflow/benchmarks/
benchmarks @ 3b90c14f
Subproject commit 3b90c14fb2bf02ca5d27c188aee878663229a0a7
#!/bin/bash
#SBATCH --nodes 1
#SBATCH --partition gpgpu
#SBATCH --gres=gpu:p100:4
#SBATCH --time 01:00:00
#SBATCH --cpus-per-task=24
module load Tensorflow/1.8.0-intel-2017.u2-GCC-6.2.0-CUDA9-Python-3.5.2-GPU
cd benchmarks/scripts/tf_cnn_benchmarks
python tf_cnn_benchmarks.py --num_gpus=4 --batch_size=64 --model=resnet50 --variable_update=parameter_server
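To vary the run, the same script accepts different flags. A sketch (the flag names are those used by `tf_cnn_benchmarks` at the pinned commit; the 2-GPU, `inception3` values are just illustrative):

```shell
# Same benchmark, different configuration (values are illustrative, adjust to taste):
python tf_cnn_benchmarks.py --num_gpus=2 --batch_size=32 --model=inception3 --variable_update=parameter_server
```

If you request fewer GPUs this way, adjust `--gres=gpu:p100:...` in the Slurm script and `--num_gpus` together so they match.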
## TensorFlow Example
This is a very simple example which shows how to use TensorFlow with the Spartan GPGPU partition. It requests a single CPU and an NVIDIA P100 GPU, multiplies together two small matrices on the GPU, and prints the result. It will also print some debug information showing that the calculation is being performed on the GPU (rather than the CPU).
It can be submitted with the command `sbatch tensor_flow.slurm`.
You'll need access to the GPGPU partition before this example will work, see https://dashboard.hpc.unimelb.edu.au/gpu/ for details.
N.B. If you belong to multiple projects, and the default one doesn't have access to the gpgpu partition, you might have to explicitly specify the project with `sbatch -A <project name> tensor_flow.slurm`.
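If you're not sure which projects you can submit under, one way to list them is via the Slurm accounting database (standard Slurm tooling assumed; this is not specific to this example):

```shell
# List the Slurm accounts (projects) associated with your user:
sacctmgr show associations user=$USER format=Account%30 --noheader
```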
This example is based on: https://www.tensorflow.org/guide/using_gpu
# Based on https://www.tensorflow.org/guide/using_gpu
import tensorflow as tf
# Creates a graph -- force it to run on the GPU
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
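For reference, the product of the two matrices above can be checked with plain Python (no TensorFlow or GPU needed); the job's output should contain the same values:

```python
# Plain-Python check of the 2x3 by 3x2 matmul from the example above.
a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
b = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
c = [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(2)]
     for i in range(2)]
print(c)  # → [[22.0, 28.0], [49.0, 64.0]]
```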
#!/bin/bash
#SBATCH --nodes 1
#SBATCH --partition gpgpu
#SBATCH --gres=gpu:p100:1
#SBATCH --time 00:05:00
#SBATCH --cpus-per-task=1
module load Tensorflow/1.8.0-intel-2017.u2-GCC-6.2.0-CUDA9-Python-3.5.2-GPU
python tensor_flow.py
## DIGITS Spartan Example
DIGITS is a deep-learning package from NVIDIA with a web-based GUI. This example shows you how to run it on Spartan.
1. As DIGITS makes use of GPUs, you'll first need access to our GPGPU partition. See: https://dashboard.hpc.unimelb.edu.au/gpu/
2. Submit the job using `sbatch digits.slurm`. The example uses a whole node with 4 GPUs, with a wall time of 2 hours, but you can adjust to suit your needs.
3. Check the job status using `squeue -u your_username`. Once it starts (which might take some time if the queue is busy), take note of the node your job is running on, e.g. `spartan-gpgpu025`
4. As this is an interactive web application, we need to open a tunnel to the compute node so we can interact with it. You can do this with: `ssh -vNL 5000:spartan-gpgpu025:5000 your_username@spartan.hpc.unimelb.edu.au`
5. Navigate to `localhost:5000` in your browser, and start playing with DIGITS.
N.B. DIGITS runs in a container, which means that the filesystem visible to DIGITS differs from that of the host (i.e. Spartan). Your home directory (i.e. `/home/your_username`) is mapped across, however, so you can access your training data and models from there.
#!/bin/bash
#SBATCH --nodes 1
#SBATCH --cpus-per-task=12
#SBATCH --partition gpgpu
#SBATCH --gres=gpu:4
#SBATCH --time 02:00:00
module load Singularity
singularity exec --nv -B /tmp:/jobs -B /tmp:/scratch digits.img bash -c "export DIGITS_JOBS_DIR=/jobs && python -m digits"