```

## Preparing scripts

### Slurm

We run the IO500 benchmark through Slurm. The batch script, `mytest_10n_16t_ssd.sh`, looks like this:

```bash
#!/bin/bash
# 10 GPGPU nodes, 16 MPI tasks per node (160 ranks in total)
#SBATCH -p debug
#SBATCH -w spartan-gpgpu[001-002,013-014,024-025,035-036,047-048]
#SBATCH --ntasks=160
#SBATCH --tasks-per-node=16
#SBATCH --mem=100G
#SBATCH --cpus-per-task=1
#SBATCH --time=06:00:00

# MPI stack used by the IO500 binaries
module load OpenMPI/3.1.3-GCC-6.2.0-ucx

# io500.sh copy with parameters for the 10-node, 16-task SSD run (see below)
./io500_10n_16t_ssd.sh
```

The job is submitted with `sbatch mytest_10n_16t_ssd.sh`.
The IO500 output is captured in `slurm-${job_id}.out` in the directory the job was submitted from.

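A convenient way to submit a run and watch its output, using standard Slurm tooling (sketch only):

```bash
# Submit the batch script; --parsable makes sbatch print just the job id
job_id=$(sbatch --parsable mytest_10n_16t_ssd.sh)

# Confirm the job is pending or running
squeue -j "${job_id}"

# Follow the IO500 output as the benchmark progresses
tail -f "slurm-${job_id}.out"
```
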
`io500_10n_16t_ssd.sh` is a copy of the provided `io500.sh` with its parameters modified as officially instructed; details below.

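To make the link to the Tests table below concrete, the parameter stanza of `io500_10n_16t_ssd.sh` for the 10n16t SSD run might look roughly like the sketch below. The function and variable names follow the `io500.sh` shipped in `io-500-dev` at the time; the values come from the Tests table, and both names and option strings should be verified against the copy actually used.

```bash
# Hypothetical excerpt -- values from the Tests table, names per the io-500-dev io500.sh
function setup_paths {
  io500_ior_cmd=$PWD/bin/ior
  io500_mdtest_cmd=$PWD/bin/mdtest
  io500_mpirun="mpirun"
  io500_mpiargs="-np 160"                  # 10 nodes x 16 tasks per node
}

function setup_ior_easy {
  io500_ior_easy_size=20480                # 20G per rank, expressed in MiB
  io500_ior_easy_params="-t 1m -b ${io500_ior_easy_size}m -F"  # 1M transfers, file per process
}

function setup_mdt_easy {
  io500_mdtest_easy_files_per_proc=12500   # 12.5K files per rank
}

function setup_ior_hard {
  io500_ior_hard_writes_per_proc=25000     # 25K writes per rank
}

function setup_mdt_hard {
  io500_mdtest_hard_files_per_proc=100000  # 100K files per rank
}
```
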
### Tests

Each test has the following IO500 runs:

```bash
io500_run_ior_easy="True" # does the write phase and enables the subsequent read
io500_run_md_easy="True" # does the creat phase and enables the subsequent stat
io500_run_ior_hard="True" # does the write phase and enables the subsequent read
io500_run_md_hard="True" # does the creat phase and enables the subsequent read
io500_run_find="True"
io500_run_ior_easy_read="True"
io500_run_md_easy_stat="True"
io500_run_ior_hard_read="True"
io500_run_md_hard_stat="True"
io500_run_md_hard_read="True"
io500_run_md_easy_delete="True" # turn this off if you want to just run find by itself
io500_run_md_hard_delete="True" # turn this off if you want to just run find by itself
io500_run_mdreal="False" # this one is optional
io500_cleanup_workdir="False" # this flag is currently ignored. You'll need to clean up your data files manually if you want to.
io500_stonewall_timer=300 # Stonewalling timer, stop with wearout after 300s with default test, set to 0, if you never want to abort...
io500_find_mpi="True"
```

Note: we use the parallel MPI `find` command.

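In `io500.sh` terms, the parallel find is selected in the find setup; a sketch with the variable names from the same `io-500-dev` script (verify against the local copy):

```bash
io500_find_mpi="True"             # use the MPI-parallel find rather than serial find
io500_find_cmd="$PWD/bin/pfind"   # pfind binary, expected under ./bin after running ./utilities/prepare.sh
io500_find_cmd_args=""
```
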
We run each test twice: once on NLSAS and once on SSD. The tests are:

| Tests                   | 2n1t   | 4n8t   | 10n16t  | 32n1t  |
| ----------------------- | ------ | ------ | ------- | ------ |
| Clients                 | 2      | 4      | 10      | 32     |
| Threads per client      | 1      | 8      | 16      | 1      |
| mpirun args             | -np 2  | -np 32 | -np 160 | -np 32 |
| ior_easy_size per t     | 200G   | 20G    | 20G     | 200G   |
| ior_easy_size total     | 400G   | 640G   | 3.2T    | 6.4T   |
| ior_easy bs             | 1M     | 1M     | 1M      | 1M     |
| mdtest easy files per t | 600K   | 12.5K  | 12.5K   | 600K   |
| mdtest easy files total | 1200K  | 400K   | 2000K   | 19200K |
| ior hard writes per t   | 100K   | 10K    | 25K     | 100K   |
| ior hard writes total   | 200K   | 320K   | 4000K   | 3200K  |
| mdtest hard files per t | 500K   | 62.5K  | 100K    | 500K   |
| mdtest hard files total | 1000K  | 2000K  | 16000K  | 16000K |

2n1t was chosen to fit a simple MPI job. 4n8t was requested by the HPC team to match one of their bigger jobs. 10n16t matches certain known IO500 submissions from vendors. 32n1t matches some non-IO500 benchmarks done by vendors.

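The totals in the table are simply the per-thread values multiplied by the MPI rank count. For example, for 10n16t: 10 nodes × 16 tasks = 160 ranks, so 160 × 20G = 3.2T of ior_easy data, 160 × 12.5K = 2000K mdtest-easy files, 160 × 25K = 4000K ior-hard writes, and 160 × 100K = 16000K mdtest-hard files.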