Benchmarking Gridstore enterprise storage array (2)
This is another post in a series covering performance testing and benchmarking of the Gridstore enterprise storage array.
Gridstore Array components:
6x H-Nodes. Each has 1x Xeon E5-2403 processor at 1.8 GHz with 4 cores (no hyper-threading) and 10 MB L3 cache, 32 GB DDR3 1333 MHz DIMM, 4x 3TB 7200 RPM SAS disks and a 550 GB PCIe Flash card.
One compute node with 2x Xeon E5-2430L CPUs at 2 GHz with 6 cores each (12 Logical processors) and 15 MB L3 cache, 96 GB RAM
Pre-test network bandwidth verification:
Prior to testing array disk IO, I verified the bandwidth available through the Force10 switch used, using the NTttcp tool (version 5.28). One of the array nodes was the receiver:
The HV-LAB-01 compute node was the sender:
I configured the tool to use 4 processor cores only since the Gridstore storage nodes have only 4 cores.
The result was usable bandwidth of 8.951 Gbps (1,118.9 MB/s) – Testing was done using standard 1,500 MTU frames, not 9,000 MTU jumbo frames.
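As a sanity check, the Gbps-to-MB/s conversion can be verified with a few lines of Python (using decimal megabytes, 1 MB = 10^6 bytes):

```python
# Convert the measured line rate from gigabits per second to megabytes per second.
gbps = 8.951                      # measured usable bandwidth
bytes_per_sec = gbps * 1e9 / 8    # 8 bits per byte
mb_per_sec = bytes_per_sec / 1e6  # decimal megabytes
print(f"{mb_per_sec:,.1f} MB/s")  # -> 1,118.9 MB/s
```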
8 vLUNs were configured for this test.
Each vLUN is configured as follows:
- Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to a single node failure)
- Optimized for: IOPS
- QoS: Platinum
- Unmasked: to 1 server
- File system: NTFS
- Block size: 64 KB
- Size: 5 TB (3 segments, 2.5 TB each)
This configuration utilizes all 24 disks in the grid (6 nodes * 4 disks each = 24 disks = 8 vLUNs * 3 disks each). It provides optimum array throughput.
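The disk and capacity accounting above can be double-checked with a quick sketch. The parity model is my assumption: protect level 1 behaving like a 2-data + 1-parity layout across the 3 segments, which is consistent with 5 TB usable coming from 3 segments of 2.5 TB each:

```python
# Disk accounting: every disk in the grid backs exactly one vLUN segment.
nodes, disks_per_node = 6, 4
vluns, segments_per_vlun = 8, 3
assert nodes * disks_per_node == vluns * segments_per_vlun == 24

# Capacity accounting for one vLUN (assumed 2 data + 1 parity per stripe):
segment_tb = 2.5
raw_tb = segments_per_vlun * segment_tb   # 7.5 TB raw per vLUN
usable_tb = raw_tb * 2 / 3                # one segment's worth goes to parity
print(raw_tb, usable_tb)                  # -> 7.5 5.0 (matches the 5 TB size)
```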
Intel’s IOMeter version 2006.07.27
24 workers, each configured to target all 8 vLUNs – 32 outstanding I/Os
IO profile: 50% read/50% write, 10% random, 8k alignment:
Test duration: 10 minutes
IOMeter showed upwards of 17.7k IOPS:
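For rough context, spreading 17.7k IOPS over the 24 spindles works out as below. This is a back-of-the-envelope sketch only; it ignores write amplification from the parity protection, so the actual back-end load per disk is higher:

```python
# Front-end IOPS per spindle, ignoring parity write amplification.
total_iops = 17_700
disks = 24
per_disk = total_iops / disks
print(f"{per_disk:.1f} front-end IOPS per disk")  # -> 737.5
# Far above a 7200 RPM disk's random IOPS capability, which is plausible
# only because the workload profile is just 10% random.
```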
Disk performance details on the compute node:
CPU performance details on the compute node:
Network performance on one of the storage nodes:
Disk performance on one of the storage nodes:
CPU performance on one of the storage nodes:
Overall summary performance on one of the storage nodes:
CPU utilization on the storage nodes as shown from the GridControl snap-in:
Final test result:
and test details.
Conclusion and important points to note:
- Network utilization maxed out the single 10 Gbps NIC used on both the compute and storage nodes. This suggests the array is likely to deliver more IOPS if more network bandwidth is made available. The next test will use 2 teamed 10 Gbps NICs on the compute node, along with 3 storage nodes that also have teamed 10 Gbps NICs.
- CPU was maxed out on the storage nodes during the test. Each storage node has only 4 cores, which suggests that CPU may be a bottleneck on the storage nodes. It also leads me to believe that a) more processing power is needed on the storage nodes, and b) RDMA NICs are likely to enhance performance greatly. The Mellanox ConnectX-3 VPI dual-port PCIe x8 card may be just what the doctor ordered. In a perfect environment, I would couple that with the Mellanox InfiniBand MSX6036F-1BRR 56 Gbps switch.
- Disk IO on the storage nodes during the test showed about 240 MB/s of data transfer per node, or about 60 MB/s for each of the 4 disks in the node. This corresponds to the native IO performance of the 7200 RPM SAS disks, suggesting a minimal/negligible boost from the 550 GB PCIe flash card in each storage node.
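The per-disk figure in that last point is straightforward arithmetic:

```python
# Per-disk throughput on a storage node during the test.
node_mb_s = 240                      # observed data transfer per storage node
disks_per_node = 4
per_disk = node_mb_s / disks_per_node
print(per_disk)                      # -> 60.0 MB/s, in line with what a
                                     # 7200 RPM SAS disk delivers natively
```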