During the 2019-07-08/09 maintenance window for Spartan HPC, we ran extensive benchmarking with IO500 on our CephFS cluster. Here are the details.
## Infrastructure details
### Networking
* Mellanox SN2100 leaf switches for Ceph nodes
* Mellanox SN2700 leaf switches for client GPGPU nodes
* Mellanox SN2700 spine switches
* 2x100G from each leaf to the spines, 4x100G between spines
### Ceph cluster
All nodes run RHEL 7.6 with the elrepo kernel-lt kernel 4.4.135-1.el7.elrepo.x86_64 and Mellanox OFED 4.3-3.0.2.1.
* mon[1-5]: 1x 10-core Xeon v4 2.4GHz, 64GB of RAM, 2x 25GbE Mellanox
* mds[1-3]: 2 active, 1 standby; each has 1x 6-core Xeon v4 3.4GHz, 512GB of RAM, 2x 25GbE Mellanox
* NLSAS data pool:
    * 36 OSD nodes, 16 drives each (576 drives in total), a mix of 8TB and 10TB NLSAS drives
    * Each node has 1x NVMe card (Intel P3700 or Optane 900P) holding the WAL (2GB) and RocksDB (10GB) for each OSD
    * Each node: 1x 10-core Xeon v4 2.4GHz, 128GB of RAM, 2x 25GbE Mellanox
    * Replicated, 3 replicas
    * Fullness: ~60%
* SSD data pool:
    * 16 OSD nodes, each with 8x SanDisk BSSD 8TB drives over 12Gb SAS (IF150 unit), 128 drives in total
    * Each node has 2x NVMe cards (Optane 900P) holding the WAL (4GB) and RocksDB (40GB) for each OSD
    * Each node: 2x 16-core Xeon v4 2.6GHz, 128GB of RAM, 2x 25GbE Mellanox
    * Erasure coded, 4+2 (k=4, m=2)
    * Fullness: ~73%
* Metadata pool:
    * On 10 of the 16 SSD OSD nodes
    * Each node has 1x NVMe (Optane 900P 480GB) partitioned into 4, each partition serving as an OSD (40 NVMe OSDs in total)
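For context, the pool layout above corresponds roughly to the sketch below. This is illustrative only, not the commands actually used on this cluster: the pool names, PG counts and failure domain are assumptions.

```
# Illustrative sketch only -- pool names, PG counts and failure domain are assumptions.

# Replicated NLSAS data pool (3 replicas)
ceph osd pool create cephfs_data_nlsas 4096 4096 replicated
ceph osd pool set cephfs_data_nlsas size 3

# Erasure-coded SSD data pool (k=4, m=2)
ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
ceph osd pool create cephfs_data_ssd 1024 1024 erasure ec-4-2
ceph osd pool set cephfs_data_ssd allow_ec_overwrites true   # needed for CephFS data on an EC pool

# Replicated metadata pool on the NVMe OSDs
ceph osd pool create cephfs_metadata 256 256 replicated

# Attach the additional data pool to the filesystem
ceph fs add_data_pool cephfs cephfs_data_ssd
```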
## Compiling IO500
We run IO500 through Slurm on Spartan and compile it using Spartan's environment modules.
```
cd io-500-dev
./utilities/prepare.sh
```
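As a rough sketch of the build environment, something like the following would be loaded before running prepare.sh; the module names here are placeholders for whichever compiler and MPI modules Spartan provides, not the exact modules we used.

```
# Placeholder module names -- substitute Spartan's actual compiler/MPI modules.
module purge
module load gcc openmpi
cd io-500-dev
./utilities/prepare.sh   # builds the benchmark components (ior/mdtest, etc.)
```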
## Preparing scripts