Benchmarking Gridstore enterprise storage array (1)


Gridstore provides an alternative to traditional enterprise storage. Basic facts about Gridstore storage technology include:

  • It provides storage nodes implemented as 1 RU servers that function collectively as a single storage array.
  • Connectivity between the nodes and the storage consumers/compute nodes occurs over one or more 1 or 10 Gbps Ethernet connections.
  • NIC teaming can be set up on the Gridstore nodes to provide additional bandwidth and fault tolerance.
  • It utilizes a virtual controller to present storage to Windows servers.

The IO testing tool and its settings are detailed in this post.

vLUNs can be easily created using the GridControl snap-in. This testing was done with a Gridstore array composed of 6 H-nodes.

Prior to testing array disk IO, I tested the available bandwidth on the Force10 switch used, using NTttcp version 5.28. One of the array nodes was the receiver:

GS-002

 

The HV-LAB-01 compute node was the sender:

GS-003

I configured the tool to use only 4 processor cores (the -m 4,* mapping below), since the Gridstore storage nodes have only 4 cores each.

The result was usable bandwidth of 8.951 Gbps (1,118.9 MB/s). Testing was done using standard 1,500-byte MTU frames, not 9,000-byte jumbo frames.
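
As a sanity check, the Gbps-to-MB/s arithmetic works out as follows (a minimal Python sketch; the MB/s figure uses decimal megabytes, matching NTttcp's Throughput(MB/s) totals below):

# Convert the measured line rate to MB/s and compare with NTttcp's total.
gbps = 8.951
mb_per_s = gbps * 1000 / 8  # 1 Gbps = 125 MB/s (decimal megabytes)
print(f"{gbps} Gbps = {mb_per_s:.1f} MB/s")
# -> 8.951 Gbps = 1118.9 MB/s, in line with the 1118.902 Throughput(MB/s)
#    total in the receiver output below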

Test details:

On the receiver Gridstore storage node (-r for receive mode; -m 4,*,10.5.19.30 maps 4 threads across all CPUs to the 10.5.19.30 interface; -rb 2M sets 2 MB receive buffers; -t 120 runs for 120 seconds):
C:\Support>ntttcp.exe -r -m 4,*,10.5.19.30 -rb 2M -a 16 -t 120
Copyright Version 5.28
Network activity progressing…
Thread  Time(s)  Throughput(KB/s)  Avg B / Compl
======  =======  ================  =============
     0  120.011        311727.158      60023.949
     1  120.011        233765.293      53126.468
     2  120.011        306670.676      56087.990
     3  120.011        293592.705      52626.788

##### Totals: #####

   Bytes(MEG)  realtime(s)  Avg Frame Size  Throughput(MB/s)
=============  ===========  ==============  ================
134280.569568      120.011        1457.709          1118.902

Throughput(Buffers/s)  Cycles/Byte      Buffers
=====================  ===========  ===========
            17902.435        3.864  2148489.113

DPCs(count/s)  Pkts(num/DPC)  Intr(count/s)  Pkts(num/intr)
=============  =============  =============  ==============
    17388.114         46.288      26563.098          30.300

Packets Sent  Packets Received  Retransmits  Errors  Avg. CPU %
============  ================  ===========  ======  ==========
     4634562          96592255          599       0      62.960

On the sender compute node HV-LAB-01 (-s for send mode, same mapping and duration):
C:\Support>ntttcp.exe -s -m 4,*,10.5.19.30 -rb 2M -a 16 -t 120
Copyright Version 5.28
Network activity progressing…
Thread  Time(s)  Throughput(KB/s)  Avg B / Compl
======  =======  ================  =============
     0  120.003        311702.607      65536.000
     1  120.003        233765.889      65536.000
     2  120.003        306669.667      65536.000
     3  120.003        293592.660      65536.000

##### Totals: #####

   Bytes(MEG)  realtime(s)  Avg Frame Size  Throughput(MB/s)
=============  ===========  ==============  ================
134268.687500      120.004        1457.441          1118.868

Throughput(Buffers/s)  Cycles/Byte      Buffers
=====================  ===========  ===========
            17901.895        2.957  2148299.000

DPCs(count/s)  Pkts(num/DPC)  Intr(count/s)  Pkts(num/intr)
=============  =============  =============  ==============
    25915.561          1.504      71032.291           0.549

Packets Sent  Packets Received  Retransmits  Errors  Avg. CPU %
============  ================  ===========  ======  ==========
    96601489           4677580        22698       1       7.228


Test 1:

Compute node(s): 1 physical machine with 2x Xeon E5-2430L CPUs at 2 GHz (6 cores/12 logical processors each, 24 logical processors total), 30 MB total L3 cache, 96 GB RAM, and 2x 10 Gbps NICs

GS-001

vLUN:

  • Protect Level: 0 (no fault tolerance, striped across 4 Gridstore nodes)
  • Optimized for: N/A
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (4 segments, 512 GB each)

GS-A05

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 10.43k IOPS
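
To put that figure in context, a quick conversion into bandwidth at the 32 KB block size shows the test ran well below the network ceiling measured earlier (a back-of-envelope Python sketch):

# Implied bandwidth of the Test 1 result at a 32 KB block size.
iops = 10_430
block = 32 * 1024              # bytes per IO
mb_s = iops * block / 1e6      # decimal MB/s
ceiling = 1118.9               # MB/s, from the NTttcp test above
print(f"{mb_s:.0f} MB/s, {mb_s / ceiling:.0%} of the network ceiling")
# -> ~342 MB/s, about 31% of wire speed, suggesting the disks rather
#    than the 10 Gbps link were the limiting factor here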

GS-A01

 

In the image above, you can see the read/write activity to the 4 nodes that make up this vLUN, listed under Network Activity on the Resource Monitor Network tab.

GS-A02

At the same time, the 4 nodes that make up this vLUN showed average CPU utilization of around 40%, which dropped to 0% right after the test.

GS-A03

The 4 nodes’ memory utilization averaged around 25% during the test. Its baseline is 20%.

GS-A04

 


Test 2: The same single compute node above

vLUN:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive single node failure)
  • Optimized for: IOPS
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (3 segments, 1 TB each)

GS-B04

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 11.32k IOPS

GS-B01

 

GS-B02

 

GS-B03

 


Test 3: The same single compute node above

vLUN:

  • Protect Level: 1 (striped across 5 Gridstore nodes, fault tolerant to survive single node failure)
  • Optimized for: Throughput
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (5 segments, 512 GB each)

GS-C01

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 9.28k IOPS

GS-C02

GS-C03

 

GS-C04


Test 4: The same single compute node above

vLUN:

  • Protect Level: 2 (striped across 6 Gridstore nodes, fault tolerant to survive 2 simultaneous node failures)
  • Optimized for: Throughput
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (6 segments, 512 GB each)

GS-D01

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 4.56k IOPS
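
The sharp drop relative to Test 2 (11.32k IOPS) is consistent with the classic small-write penalty of dual parity. Assuming a RAID-5/6-style parity update (an assumption; this post doesn't document Gridstore's on-disk scheme), each sub-stripe random write costs extra backend IOs:

# Back-of-envelope backend cost per front-end random write, ASSUMING
# RAID-5/6-style parity updates (not confirmed for Gridstore):
#   single parity: read old data + old parity, write new data + new parity
#   dual parity:   the same, plus reading and writing a second parity segment
backend_ios_per_write = {
    "Protect Level 0": 1,  # plain stripe, no parity to maintain
    "Protect Level 1": 4,  # 2 reads + 2 writes
    "Protect Level 2": 6,  # 3 reads + 3 writes
}
for level, cost in backend_ios_per_write.items():
    print(f"{level}: {cost} backend IOs per random write")

With the 50% read/50% write profile only the write half pays this penalty, so it explains part, though not all, of the drop from 11.32k to 4.56k IOPS.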

GS-D02

GS-D03

GS-D04

 


Test 5: The same single compute node above

2 vLUNs:

1. The same Grid Protection Level 1 vLUN from Test 2 above, with the Platinum QoS setting, plus

2. An identical second vLUN, except that QoS is set to Gold:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive 1 node failure)
  • Optimized for: IOPS
  • QoS: Gold
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (3 segments, 1 TB each)

GS-E01

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 10.52k IOPS

GS-E02

GS-E03

GS-E04

 


 

Test 6: The same single compute node above

3 vLUNs:

All three are identical:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive 1 node failure)
  • Optimized for: IOPS
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (3 segments, 1 TB each)

GS-F1

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 9.94k IOPS

GS-F6

GS-F5

GS-F4

GS-F3

GS-F2


Summary:

GS-004
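
GS-004 above summarizes the runs. For convenience, here is the same data recomputed as implied bandwidth at the 32 KB block size used throughout (a Python sketch using only the IOPS figures quoted in this post):

# All six results at a 32 KB block size, converted to decimal MB/s.
BLOCK = 32 * 1024  # bytes per IO

results = [
    ("Test 1: PL0, 4 nodes, Platinum",             10_430),
    ("Test 2: PL1, 3 nodes, IOPS-optimized",       11_320),
    ("Test 3: PL1, 5 nodes, throughput-optimized",  9_280),
    ("Test 4: PL2, 6 nodes, throughput-optimized",  4_560),
    ("Test 5: 2 vLUNs, Platinum + Gold",           10_520),
    ("Test 6: 3 vLUNs, all Platinum",               9_940),
]

for name, iops in results:
    print(f"{name}: {iops:>6,} IOPS = {iops * BLOCK / 1e6:6.1f} MB/s")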
