Posts tagged “disk IO benchmarking”

Get-Busy.ps1 PowerShell script to create files on many PCs and collect metrics


This script uses Busy.ps1, a script I posted earlier. The script can be downloaded from the Microsoft TechNet Gallery. To use it, edit the 4 data entry lines at the top:

GetBusy

  • $WorkFolder = "e:\support" # Folder on each VM where test files will be created
  • $MaxSpaceToUseOnDisk = 20GB # Maximum amount of disk space to be used in each VM's $WorkFolder during testing
  • $VMPrefix = "V-2012R2-LAB" # Used to scope this script (only running VMs with names matching this prefix will be processed)
  • $LocalAdmin = "administrator" # The local admin account on the VMs. You can also use a domain account here if you like.

The script takes one positional parameter that indicates whether to start or stop the testing on the VMs. For example, to start:

.\get-busy.ps1 start

GS-017e40

To end the testing, use:

.\get-busy.ps1 stop

The script will reach out to the VMs being tested, stop the busy.ps1 script, collect the test results, and clean up the $WorkFolder on each VM.
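For context, here is a rough sketch of how the start/stop dispatch can be structured. This is an illustration only, assuming the Hyper-V Get-VM cmdlet, PowerShell remoting via Invoke-Command, and that busy.ps1 takes the work folder and size limit as positional arguments; it is not the actual Get-Busy.ps1 code, which you can inspect in the TechNet download:

# Sketch only: $WorkFolder, $MaxSpaceToUseOnDisk, $VMPrefix and $LocalAdmin are
# the four data-entry lines shown above; result collection and cleanup omitted.
param([Parameter(Mandatory=$true)][ValidateSet('start','stop')][string]$Action)

$Cred = Get-Credential $LocalAdmin

# Scope to running VMs whose names match $VMPrefix
$VMs = Get-VM | Where-Object { $_.Name -like "$VMPrefix*" -and $_.State -eq 'Running' }

foreach ($VM in $VMs) {
    if ($Action -eq 'start') {
        # Launch busy.ps1 on the VM so it starts creating test files in $WorkFolder
        Invoke-Command -ComputerName $VM.Name -Credential $Cred -AsJob -ScriptBlock {
            param($Folder, $MaxSpace)
            & (Join-Path $Folder 'busy.ps1') $Folder $MaxSpace
        } -ArgumentList $WorkFolder, $MaxSpaceToUseOnDisk
    }
    else {
        # Stop any remote busy.ps1 instance; collecting its CSV results and
        # cleaning up $WorkFolder would follow here
        Invoke-Command -ComputerName $VM.Name -Credential $Cred -ScriptBlock {
            Get-Process -Name powershell -ErrorAction SilentlyContinue | Stop-Process -Force
        }
    }
}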

The script generates 2 log files:

  • A general log file that contains the messages displayed on the screen about each action attempted.
  • A CSV file that contains the compiled test results from the CSV files generated by busy.ps1 on each of the tested VMs.

Here’s an example of the compiled CSV file Get-busy_20140714_071428PM.csv
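Compiling the per-VM results is a matter of merging the individual CSV files. Here is a minimal sketch, assuming the per-VM files have already been copied to a local folder (the folder path below is made up for illustration and is not what the published script uses):

# Merge the busy.ps1 CSV files gathered from each VM into one compiled file
$ResultFiles = Get-ChildItem -Path 'C:\Get-Busy\Results' -Filter '*.csv'
$Compiled    = foreach ($File in $ResultFiles) { Import-Csv -Path $File.FullName }
$Stamp       = Get-Date -Format 'yyyyMMdd_hhmmsstt'   # e.g. 20140714_071428PM
$Compiled | Export-Csv -Path ".\Get-busy_$Stamp.csv" -NoTypeInformation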


Benchmarking Gridstore enterprise storage array (1)


Gridstore provides an alternative to traditional enterprise storage. Basic facts about Gridstore storage technology include:

  • It provides storage nodes implemented as 1 RU servers that function collectively as a single storage array.
  • Connectivity between the nodes and the storage consumers/compute nodes occurs over one or more 1 or 10 Gbps Ethernet connections.
  • NIC teaming can be set up on the Gridstore nodes to provide additional bandwidth and fault tolerance.
  • It utilizes a virtual controller to present storage to Windows servers.

The IO testing tool and its settings are detailed in this post.

vLUNs can be easily created using the GridControl snap-in. This testing was done with a Gridstore array composed of 6 H-nodes.

Prior to testing array disk IO, I tested the available bandwidth on the Force10 switch used, using the NTttcp version 5.28 tool. One of the array nodes was the receiver:

GS-002

 

The HV-LAB-01 compute node was the sender:

GS-003

I configured the tool to use only 4 processor cores, since the Gridstore storage nodes had only 4 cores.

The result was a usable bandwidth of 8.951 Gbps (1,118.9 MB/s). Testing was done using standard 1,500-byte MTU frames, not 9,000-byte jumbo frames.
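As a sanity check, the conversion between the ntttcp throughput figure and the link speed is simple arithmetic (treating 1 MB as 10^6 bytes, which is how the 8.951 Gbps figure works out):

# Convert the reported throughput to Gbps and compare against the 10 Gbps line rate
$MBps = 1118.902                     # Throughput(MB/s) from the ntttcp totals below
$Gbps = $MBps * 8 / 1000             # = 8.951 Gbps
$PctOfLineRate = $Gbps / 10 * 100    # ~89.5% of a 10 Gbps link, on standard frames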

Test details:

On the receiver Gridstore storage node:
C:\Support>ntttcp.exe -r -m 4,*,10.5.19.30 -rb 2M -a 16 -t 120
Copyright Version 5.28
Network activity progressing…

Thread  Time(s)  Throughput(KB/s)  Avg B / Compl
======  =======  ================  =============
     0  120.011        311727.158      60023.949
     1  120.011        233765.293      53126.468
     2  120.011        306670.676      56087.990
     3  120.011        293592.705      52626.788

#####  Totals:  #####

    Bytes(MEG)  realtime(s)  Avg Frame Size  Throughput(MB/s)
==============  ===========  ==============  ================
 134280.569568      120.011        1457.709          1118.902

Throughput(Buffers/s)  Cycles/Byte      Buffers
=====================  ===========  ===========
            17902.435        3.864  2148489.113

DPCs(count/s)  Pkts(num/DPC)  Intr(count/s)  Pkts(num/intr)
=============  =============  =============  ==============
    17388.114         46.288      26563.098          30.300

Packets Sent  Packets Received  Retransmits  Errors  Avg. CPU %
============  ================  ===========  ======  ==========
     4634562          96592255          599       0      62.960

On the sender compute node: HV-LAB-05
C:\Support>ntttcp.exe -s -m 4,*,10.5.19.30 -rb 2M -a 16 -t 120
Copyright Version 5.28
Network activity progressing…

Thread  Time(s)  Throughput(KB/s)  Avg B / Compl
======  =======  ================  =============
     0  120.003        311702.607      65536.000
     1  120.003        233765.889      65536.000
     2  120.003        306669.667      65536.000
     3  120.003        293592.660      65536.000

#####  Totals:  #####

    Bytes(MEG)  realtime(s)  Avg Frame Size  Throughput(MB/s)
==============  ===========  ==============  ================
 134268.687500      120.004        1457.441          1118.868

Throughput(Buffers/s)  Cycles/Byte      Buffers
=====================  ===========  ===========
            17901.895        2.957  2148299.000

DPCs(count/s)  Pkts(num/DPC)  Intr(count/s)  Pkts(num/intr)
=============  =============  =============  ==============
    25915.561          1.504      71032.291           0.549

Packets Sent  Packets Received  Retransmits  Errors  Avg. CPU %
============  ================  ===========  ======  ==========
    96601489           4677580        22698       1       7.228


Test 1:

Compute node(s): 1 physical machine with 2x Xeon E5-2430L CPUs at 2 GHz (6 cores / 12 logical processors each), 30 MB L3 cache, 96 GB RAM, and 2x 10 Gbps NICs

GS-001

vLUN:

  • Protect Level: 0 (no fault tolerance, striped across 4 Gridstore nodes)
  • Optimized for: N/A
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (4 segments, 512 GB each)

GS-A05

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 10.43k IOPS

GS-A01

 

In the above image you can see the read/write activity to the 4 nodes that make up this vLUN listed under Network Activity in the Resource Monitor/Network tab.

GS-A02

At the same time, the 4 nodes that make up this vLUN showed average CPU utilization around 40%. This dropped down to 0% right after the test.

GS-A03

The 4 nodes’ memory utilization averaged around 25% during the test; its baseline is 20%.

GS-A04

 


Test 2: The same single compute node above

vLUN:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive single node failure)
  • Optimized for: IOPS
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (3 segments, 1 TB each)

GS-B04

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 11.32k IOPS

GS-B01

 

GS-B02

 

GS-B03

 


Test 3: The same single compute node above

vLUN:

  • Protect Level: 1 (striped across 5 Gridstore nodes, fault tolerant to survive single node failure)
  • Optimized for: Throughput
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (5 segments, 512 GB each)

GS-C01

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 9.28k IOPS
GS-C02

GS-C03

 

GS-C04


Test 4: The same single compute node above

vLUN:

  • Protect Level: 2 (striped across 6 Gridstore nodes, fault tolerant to survive 2 simultaneous node failures)
  • Optimized for: Throughput
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (6 segments, 512 GB each)

GS-D01

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 4.56k IOPS

GS-D02

GS-D03

GS-D04

 


Test 5: The same single compute node above

2 vLUNs:

1. The same Grid Protection Level 1 vLUN from Test 2 above, with the Platinum QoS setting, plus

2. An identical 2nd vLUN, except that QoS is set to Gold:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive 1 node failure)
  • Optimized for: IOPS
  • QoS: Gold
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (3 segments, 1 TB each)

GS-E01

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 10.52k IOPS

GS-E02

GS-E03

GS-E04

 


 

Test 6: The same single compute node above

3 vLUNs:

All three with the same settings:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive 1 node failure)
  • Optimized for: IOPS
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (3 segments, 1 TB each)

GS-F1

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 9.94k IOPS

GS-F6

GS-F5

GS-F4

GS-F3

GS-F2


Summary:

GS-004


Benchmarking enterprise storage


I will be benchmarking and testing different use cases for new and emerging enterprise storage platforms. This post outlines the standardized benchmark testing that will be used.

Benchmark tool: Intel’s IOMeter version 2006.07.27

Settings:

  • Max disk size: 2,048,000 sectors (an 8 GB iobw.tst file is generated at the root of the tested drive)
    x16c
  • Starting disk sector: 0
  • # of Outstanding I/Os: 32 (important)
  • IO profile: 32 KB; 50% read / 50% write
    x16d
  • Run time: 10 minutes
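With this 32 KB access size, any IOPS figure from the tests above maps directly to throughput. A quick sketch of the arithmetic (the 10,430 IOPS value is just Test 1's result, used here as an example):

# Approximate throughput implied by an IOPS result at a 32 KB access size
$IOPS    = 10430                    # e.g. Test 1: 10.43k IOPS
$BlockKB = 32
$MBps    = $IOPS * $BlockKB / 1024  # ~326 MB/s (treating 1 MB as 1,024 KB)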