GridStore

Azure, Veeam, and Gridstore, a match made in heaven!?


Microsoft Azure currently provides the best-quality public cloud platform available. In a 2013 industry benchmark report comparing performance, availability, and scalability, Azure came out on top in read and write performance of Blob storage.

AzurePS18

AzurePS19

Veeam is a fast-growing software company that provides a highly scalable, feature-rich, robust backup and replication solution built from the ground up for virtualized workloads, including VMware and Hyper-V. Veeam has been on the cutting edge of backup and replication technologies with unique features like SureBackup/verified protection, Virtual Labs, Universal Application-Item Recovery, and well-developed reporting, not to mention a slew of 'Explorers' for SQL, Exchange, SharePoint, and Active Directory. Veeam Cloud Connect is a feature added in version 8 in December 2014 that gives independent public cloud providers the ability to offer off-site backup to Veeam clients.

AzurePS20

Gridstore provides a hardware storage array optimized for Microsoft workloads. At its core, the array is composed of standard rack-mount servers (storage nodes) running a Windows OS and Gridstore's proprietary vController, a driver that uses the local storage on each Gridstore node and presents it to storage-consuming servers over IP.

The problem:

Although a single Azure subscription can have 100 Storage Accounts, each holding up to 500 TB of Blob storage, a single Azure VM can have only a limited number of data disks attached to it. Azure VM disks are implemented as Page Blobs, which have a 1 TB limit as of January 2015. As a result, an Azure VM can have a maximum of 32 TB of attached storage.

AzurePS21

Consequently, an Azure VM is currently not fully adequate for use as a Veeam Cloud Connect provider for Enterprise clients who typically need several hundred terabytes of offsite DR/storage.
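For reference, here is a minimal sketch of how data disks are attached with the classic Azure (Service Management) PowerShell cmdlets, one per LUN slot and each capped at 1,023 GB. The cloud service and VM names are placeholders, and an already-selected subscription is assumed.

# Hypothetical example: attach maximum-size (1,023 GB) data disks to an existing Azure VM.
# Assumes the Azure module is installed and a subscription is already selected
# (Add-AzureAccount / Select-AzureSubscription). Service and VM names are placeholders.
Import-Module Azure
$vm = Get-AzureVM -ServiceName "MyCloudService" -Name "VeeamCC01"
for ($lun = 0; $lun -le 15; $lun++) {
    # Each Azure data disk (a Page Blob) is limited to 1 TB (1,023 GB) as of early 2015
    $vm = $vm | Add-AzureDataDisk -CreateNew -DiskSizeInGB 1023 -DiskLabel "data$lun" -LUN $lun
}
$vm | Update-AzureVM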

Possible solution:

If Gridstore is to use Azure VMs as storage nodes, the following architecture may provide the perfect solution to aggregate Azure storage:

(This is a big IF. To my knowledge, Gridstore currently does not offer its product as software, only as an appliance.)

AzurePS24

  • 6 VMs to act as Gridstore capacity storage nodes. Each is a Standard A4 size VM with 8 cores, 14 GB RAM, and 16x 1 TB disks. I picked Standard A4 to take advantage of the higher throttle limit of 500 IOPS per disk, as opposed to 300 IOPS per disk for a Basic tier A4 VM.
  • A single 80 TB Grid Protect Level 1 vLUN can be presented from the Gridstore array to a Veeam Cloud Connect VM. It will be striped over the 6 nodes and will survive a single VM failure.
  • I recommend a maximum of 40 disks per Storage Account, since a Standard Storage Account has a limit of 20k IOPS.
  • One A3 VM to be used for Veeam Backup & Replication 8, its SQL Server 2012 Express instance, the Gateway, and the WAN Accelerator. The WAN Accelerator cache disk can be built as a single simple storage space using 8x 50 GB disks with 8 columns, providing 480 MB/s or 4k IOPS (a rough PowerShell sketch follows this list). This VM can be configured with 2 vNICs, which is a long-awaited feature now available to Azure VMs.
  • Storage capacity scalability: simply add more Gridstore nodes. This can scale up to several petabytes.
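For illustration only, here is a minimal sketch of how such an 8-column simple storage space could be built inside the Veeam VM with the Windows Storage cmdlets, assuming the 8x 50 GB data disks are the only poolable disks; the pool and volume names are placeholders.

# Pool the 8x 50 GB data disks attached to the VM (assumes they are the only poolable disks)
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "WAPool" -StorageSubSystemFriendlyName (Get-StorageSubSystem -FriendlyName "*Spaces*").FriendlyName -PhysicalDisks $disks
# Simple (striped) space across all 8 disks for maximum IOPS - no resiliency, cache data only
New-VirtualDisk -StoragePoolFriendlyName "WAPool" -FriendlyName "WACache" -ResiliencySettingName Simple -NumberOfColumns 8 -ProvisioningType Fixed -UseMaximumSize
# Initialize and format the resulting disk with a 64 KB allocation unit size
Get-VirtualDisk -FriendlyName "WACache" | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -UseMaximumSize -AssignDriveLetter |
    Format-Volume -FileSystem NTFS -AllocationUnitSize 65536 -NewFileSystemLabel "WACache" -Confirm:$false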

Azure cost summary:

In this architecture, Azure costs can be summarized as:

AzurePS25

That's $8,419/month for an 80 TB payload, or $105.24/TB/month. Of that, $4,935 (59%) is for Page Blob LRS storage (96.5 TB raw at $0.05/GB/month) and $3,484 (41%) is for compute resources. The latter could be cut roughly in half if Microsoft offered 16 data disks on Standard A3 VMs instead of the current maximum of 8.

This does not factor in Veeam or Gridstore costs.

Still, highly scalable, redundant, fast storage at $105/TB/month is pretty competitive.


Using Powershell to create many Hyper-V Virtual Machines on many LUNs asynchronously


In a prior post I put out a PowerShell script to create many Hyper-V virtual machines on multiple LUNs using the Create-VM.ps1 script. That script ran in a single thread, creating one VM before moving on to the next. In another post I manually ran 8 scripts in parallel to achieve the same goal. In this script I use Start-Job with -ScriptBlock and -ArgumentList to run multiple jobs concurrently (a sketch of the pattern follows the download link below).

GS-015j

The script can be downloaded from the Microsoft TechNet Gallery.
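For readers who don't want to download it, here is a minimal sketch of the Start-Job pattern (not the published Gallery script): one background job per LUN, with the parameters for Create-VM.ps1 passed in via -ArgumentList. The script folder, CSV location, and drive letters are assumptions.

# Minimal sketch of the parallel pattern: one background job per target LUN.
$Common = @{             # parameters shared by every VM (example values)
    VMG = 2; VMMemoryType = 'Dynamic'; VMStartupRAM = 1GB; VMminRAM = 512MB; VMmaxRAM = 1GB
    vSwitch = 'Gridstore_vSwitch'; VMCores = 2; VLAN = 19
    AdditionalDisksTotal = 2; AdditionalDisksSize = 1TB
    GoldenImageDiskPath = 'E:\Golden\V-2012R2-3-C.VHDX'
}
$Jobs = foreach ($LUN in @('e','g','h','i','j','k','l','m')) {
    Start-Job -ArgumentList $LUN, $Common, 8 -ScriptBlock {
        param($LUN, $Common, $NumberOfVMs)
        Set-Location 'C:\Scripts'                    # folder holding Create-VM.ps1 (assumption)
        for ($j = 1; $j -le $NumberOfVMs; $j++) {
            $VMName = "V-2012R2-LAB$LUN$j"
            .\Create-VM.ps1 -VMName $VMName -VMFolder "$($LUN):\VMs\$VMName" @Common -CSV "C:\Scripts\IOPS_$LUN.csv"
        }
    }
}
$Jobs | Wait-Job | Receive-Job                       # collect each job's output when done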

Hardware:

Compute node (Hyper-V host): 2x Xeon E5-2430L CPUs at 2 GHz with 6 cores each (12 logical processors each) and 15 MB L3 cache, 96 GB RAM, and 2x 10 Gbps NICs that are not RDMA capable, set up in a NIC team (Teaming mode: Switch Independent, Load balancing mode: Hyper-V Port). Storage is 4x LUNs from a 3-node Gridstore array; each LUN is configured as an IOPS (2+1) LUN. Each Gridstore storage node has 1x Xeon E5-2403 processor at 1.8 GHz with 4 cores (no hyper-threading) and 10 MB L3 cache, 32 GB of DDR3 1333 MHz DIMM, 4x 3TB 7200 RPM SAS disks, a 550 GB PCIe Flash card, and 2x 10 Gbps NICs that are not RDMA capable, set up in a NIC team (Broadcom Smart Load Balancing and Failover = switch independent, no LACP support needed on the switch).

The script took 9 minutes and 45 seconds to create the 40 VMs. During that time the Hyper-V host resource monitor showed:

GS-015a

GS-015b

GridControl snap-in showed:

GS-015c

GS-015d

GS-015e

I can also see script logs piling up during script execution:

GS-015f

I started the 40 VMs manually after the script finished:

GS-015g

This Excel file is based on the script's output summary CSV file.

GS-015h

GS-015i


Conclusion and important points to note:

  • Hyper-V performance summary: No CPU or memory bottleneck observed during the test.
  • Array performance summary:
    • Files copied: 40
    • File size: 8.93GB
    • Concurrent operations: 4 copy operations at the same time
    • Total vLUNs used: 4
    • Average file copy duration: 10.18 seconds
    • Average throughput: 902.86 MB/s (≈7.2 Gbps)
    • Using the formula IOPS = BytesPerSec / TransferSizeInBytes
      LUNs are formatted as 64 KB blocks
      Average IOPS = (902.86*1024)/64 = 14.45k IOPS
  • Although 20 Gbps of aggregate bandwidth is available to the compute node and to each of the 3 storage nodes, I have not been able to push network traffic above 10 Gbps.
  • CPU on the storage nodes was around 90% during the copy. The storage nodes can benefit from additional processing capacity.

Using Powershell to create many Hyper-V Virtual Machines in parallel on Gridstore array


In a prior post I put out a PowerShell script to create many Hyper-V virtual machines on multiple LUNs using the Create-VM.ps1 script. That script ran in a single thread, creating one VM before moving on to the next. In this test of the Gridstore array I ran 8 of those scripts in parallel.

GS-013a

# Script to create 8 test VMs on a specific LUN on current HyperV Host at a fixed point in time.
# Uses Create-VM.ps1 from https://superwidgets.wordpress.com/category/powershell/
# Sam Boutros – 7/1/2014
#
$GoTime = 40 # minute
$VMPrefix = "V-2012R2-LABk"
$VMrootPath = "k:\VMs"
$NumberofVMs = 8 # Per target LUN
#
# Common variables
$VMG = 2
$VMMemoryType = "Dynamic"
$VMStartupRAM = 1GB
$VMminRAM = 512MB
$VMmaxRAM = 1GB
$vSwitch = "Gridstore_vSwitch"
$VMCores = 2
$VLAN = 19
$AdditionalDisksTotal = 2
$AdditionalDisksSize = 1TB
$GoldenImageDiskPath = "E:\Golden\V-2012R2-3-C.VHDX"
$CSV = (Get-Location).path + "\IOPS_" + (Get-Date -format yyyyMMdd_hhmmsstt) + ".csv"
#
Write-Output "Waiting to start at the $GoTime minute"
do {Start-Sleep 1} until ((Get-Date).Minute -eq $GoTime) # Wait until GoTime
For ($j = 1; $j -lt $NumberofVMs + 1; $j++) {
    $VMName = $VMPrefix + $j
    $VMFolder = $VMRootPath + "\" + $VMName
    .\Create-VM.ps1 -VMName $VMName -VMFolder $VMFolder -VMG $VMG -VMMemoryType $VMMemoryType -VMStartupRAM $VMStartupRAM -VMminRAM $VMminRAM -VMmaxRAM $VMmaxRAM -vSwitch $vSwitch -VMCores $VMCores -VLAN $VLAN -AdditionalDisksTotal $AdditionalDisksTotal -AdditionalDisksSize $AdditionalDisksSize -GoldenImageDiskPath $GoldenImageDiskPath -CSV $CSV
    Start-VM -Name $VMName
}

The other 7 scripts were identical, except for the last letter of the $VMPrefix and the drive letter of the $VMRootPath.

The line do {start-sleep 1} until ((Get-Date).Minute -eq $GoTime) ensures that all 8 scripts kick off within 1 second of each other, essentially starting at the same time.

GS-013b

GS-013c

The Hyper-V host was clearly stressed:

GS-013d

GS-013e

GS-013f

The GridControl snap-in showed CPU utilization on the Gridstore storage nodes was holding around 60%:

GS-013g

Write process on the storage nodes hits all 6 nodes:

GS-013h

Read requests hit 4 storage nodes:

GS-013i

Here’s a copy of the metrics summary script output.

One of the 8 scripts had an error:

GS-013j

GS-013k

The error was that the script tried to create the VM too fast. The interesting thing here is the magenta error message in this screenshot; it comes from the error-handling code in the Create-VM.ps1 script. The script log file Create-VM_V-2012R2-LABk2_20140701_034214PM.txt showed:

2014.07.01 03:42:14 PM: Starting with paramters:
2014.07.01 03:42:14 PM:
2014.07.01 03:42:14 PM: VMName: V-2012R2-LABk2
2014.07.01 03:42:14 PM: VMFolder: k:\VMs\V-2012R2-LABk2
2014.07.01 03:42:14 PM: vSwitch: Gridstore_vSwitch
2014.07.01 03:42:14 PM: GoldenImageDiskPath: E:\Golden\V-2012R2-3-C.VHDX
2014.07.01 03:42:14 PM: VMG: 2
2014.07.01 03:42:14 PM: VMMemoryType: Dynamic
2014.07.01 03:42:14 PM: VMStartupRAM: 1073741824 – VMStartupRAM (calculated):
2014.07.01 03:42:14 PM: VMminRAM: 536870912 – VMminRAM (calculated):
2014.07.01 03:42:14 PM: VMmaxRAM: 1073741824 – VMmaxRAM (calculated):
2014.07.01 03:42:14 PM: VMCores: 2 – VMCores (calculated):
2014.07.01 03:42:14 PM: VLAN: 19 – VLAN (Calculated):
2014.07.01 03:42:14 PM: AdditionalDisksTotal: 2 – AdditionalDisksTotal (calculated):
2014.07.01 03:42:14 PM: AdditionalDisksSize: 1099511627776 – AdditionalDisksSize (Calculated): 1099511627776
2014.07.01 03:42:14 PM: .
2014.07.01 03:42:14 PM: Creating VM with the following paramters:
2014.07.01 03:42:14 PM: VM Name: ‘V-2012R2-LABk2’
2014.07.01 03:42:14 PM: VM Folder: ‘k:\VMs\V-2012R2-LABk2’
2014.07.01 03:42:14 PM: VM Generation: ‘2’
2014.07.01 03:42:14 PM: VM Memory: Dynamic, Startup memory: 1,024 MB, Minimum memory: 512 MB, Maximum memory: 1,024 MB
2014.07.01 03:42:14 PM: vSwitch Name: ‘Gridstore_vSwitch’
2014.07.01 03:42:14 PM: VM CPU Cores: ‘2’
2014.07.01 03:42:14 PM: Copying disk image k:\VMs\V-2012R2-LABk2\V-2012R2-LABk2-C.VHDX from goldem image E:\Golden\V-2012R2-3-C.VHDX
2014.07.01 03:42:41 PM: Copy time = 0 : 27 : 762 (Minutes:Seconds:Milliseconds) – 00:00:26.7623983 Milliseconds
2014.07.01 03:42:41 PM: File k:\VMs\V-2012R2-LABk2\V-2012R2-LABk2-C.VHDX size: 9,148 MB (9592373248 bytes)
2014.07.01 03:42:41 PM: Throughput: 342 MB/s (341.822877660408 MB/s)
2014.07.01 03:42:41 PM: or: 2,735 Mbps (2734.58302128326 Mbps)
2014.07.01 03:43:32 PM:
2014.07.01 03:43:32 PM: Error: Failed to create VM ‘V-2012R2-LABk2’.. stopping
2014.07.01 03:43:32 PM: Error details: The operation cannot be performed while the virtual machine is in its current state.[The operation cannot be performed while the virtual machine is in its current state..count-1]

Windows event viewer search showed:

GS-013l

So it appears that this "k" script, working to create 8 VMs on drive "K", sent its New-VM request at 3:43:32 (line 303 of Create-VM.ps1). That request took 11 seconds to complete, as shown in event ID 13002 above. In the meantime, the script's subsequent error-checking lines (304, 305) detected that the VM had not been created and aborted execution as designed.

On another note, I'm sure there are better ways to run PowerShell commands or scripts in parallel. I've looked into the [RunspaceFactory] and [PowerShell] accelerators while running PowerShell in multithreaded apartment mode (powershell.exe -mta), as well as the Updated Type Accelerator functions for PowerShell 3.0 and 4.0. I have tried the Start-Job/Get-Job cmdlets with script blocks, but had a hard time passing parameters to Create-VM.ps1 with $ScriptBlock and -ArgumentList. I've also looked into the ForEach -Parallel PowerShell workflow construct. I simply needed to get this testing done quickly and did not have time to find a better way to run tasks in parallel in PowerShell. I welcome any comments or suggestions on better ways to run PowerShell scripts in parallel.


Conclusion and important points to note:

  • This set of scripts attempted to create 64 VMs on 8 Gridstore vLUNs by running 8 scripts in parallel, each attempting to create 8 VMs on a different vLUN. 55 VMs were created successfully. The script failures were tracked down to a busy/overwhelmed physical host.
  • Similar to the last test, each of the 8 vLUNs is configured as an IOPS (2+1) LUN.
  • Network bandwidth utilization was completely maxed out on the testing Hyper-V host (10 Gbps Ethernet). For the next test I will use NIC teams to provide 20 Gbps of Ethernet bandwidth.
  • Storage nodes’ CPU utilization was at around 60% in this test, which is not a bottleneck.
  • This test is essentially a disk copy test as well and not a Hyper-V virtual machine performance test.
  • Summary comparison between this and the last test:

GS-013m

The throughput values under Gbps and GB/Min. are meant to compare parallel versus serial execution, not to serve as metrics for the storage array, because they include the time to create and configure the virtual machines in addition to the time taken by disk operations.


Using Powershell to create 64 Hyper-V Virtual Machines on Gridstore array


This script uses another script, Create-VM.ps1, from a prior post. It creates 64 Hyper-V virtual machines on 8 different LUNs, synchronously (one after the other). The 8 LUNs are on a 6-node Gridstore array using H-nodes. Details of the Gridstore array setup are in this post.

GS-012i

# Script to create 64 test VMs on 8 different LUNs on current HyperV Host
# Uses Create-VM.ps1 from https://superwidgets.wordpress.com/category/powershell/
# Sam Boutros – 7/1/2014
#
$VMPrefix = "V-2012R2-LAB"
$VMG = 2
$VMMemoryType = "Dynamic"
$VMStartupRAM = 1GB
$VMminRAM = 512MB
$VMmaxRAM = 1GB
$vSwitch = "Gridstore_vSwitch"
$VMCores = 2
$VLAN = 19
$AdditionalDisksTotal = 2
$AdditionalDisksSize = 1TB
$GoldenImageDiskPath = "E:\Golden\V-2012R2-3-C.VHDX"
$CSV = (Get-Location).path + "\IOPS_" + (Get-Date -format yyyyMMdd_hhmmsstt) + ".csv"
$VMFirstNumber = 1 # Starting number
$TargetLUNs = @("e","g","h","i","j","k","l","m")
$NumberofVMs = 8 # Per target LUN
#
foreach ($LUN in $TargetLUNs) {
    $VMrootPath = $LUN + ":\VMs"
    For ($j = $VMFirstNumber; $j -lt $NumberofVMs + $VMFirstNumber; $j++) {
        $VMName = $VMPrefix + $j
        $VMFolder = $VMRootPath + "\" + $VMName
        .\Create-VM.ps1 -VMName $VMName -VMFolder $VMFolder -VMG $VMG -VMMemoryType $VMMemoryType -VMStartupRAM $VMStartupRAM -VMminRAM $VMminRAM -VMmaxRAM $VMmaxRAM -vSwitch $vSwitch -VMCores $VMCores -VLAN $VLAN -AdditionalDisksTotal $AdditionalDisksTotal -AdditionalDisksSize $AdditionalDisksSize -GoldenImageDiskPath $GoldenImageDiskPath -CSV $CSV
        Start-VM -Name $VMName
    }
    $VMFirstNumber += $NumberofVMs
}

During the script execution the Hyper-V resource monitor showed the following:

GS-012a

GS-012b

The GridControl snap-in showed:

GS-012c

GS-012d

GS-012e

I can also see the VMs popping up in Hyper-V Manager:

GS-012f

and their files in the file system:

GS-012g

GS-012h

This file is derived from the script's CSV output. It shows:

GS-012j
9 GB file average copy time: 10.83 seconds
GS-012k

9 GB file copy average throughput: 852.5 MB/s

GS-012l

9 GB file copy average throughput: 6,820.1 Mbps (≈6.8 Gbps)


Conclusion and important points to note:

  • The script ran serially, creating 1 VM on 1 LUN, then moving on to the next VM on the same LUN, and then on to the next LUN.
  • Each LUN is configured as an IOPS (2+1) LUN, so every write goes to 3 disks out of the array's 24 total disks (whereas a read comes from 2 disks in the LUN). Additional throughput is likely to be achieved by testing scenarios where we write to all 8 LUNs simultaneously, hitting 18 of the array disks at the same time.
  • Network bandwidth utilization was at about 68.2% of the 10 Gbps Ethernet capacity used in this test. For the next (parallel) test I will use NIC teams to provide 20 Gbps of Ethernet bandwidth.
  • Storage nodes’ CPU utilization was at around 50% in this test, which is not a bottleneck.
  • This test is essentially a disk copy test and not a Hyper-V virtual machine performance test. Hyper-V testing with Gridstore will be shown in a future post.
  • Using the formula IOPS = BytesPerSec / TransferSizeInBytes
    LUNs are formatted as 64 KB blocks
    Average IOPS = (852.5*1024)/64 = 13.64k IOPS

Benchmarking Gridstore enterprise storage array (2)


This is another post in a series on performance testing and benchmarking the Gridstore enterprise storage array.

Gridstore Array components:

6x H-Nodes. Each has 1x Xeon E5-2403 processor at 1.8 GHz with 4 cores (no hyper-threading) and 10 MB L3 cache, 32 GB DDR3 1333 MHz DIMM, 4x 3TB 7200 RPM SAS disks and a 550 GB PCIe Flash card.

GS-009k

Testing environment:

One compute node with 2x Xeon E5-2430L CPUs at 2 GHz with 6 cores each (12 Logical processors) and 15 MB L3 cache, 96 GB RAM

Pre-test network bandwidth verification:

Prior to testing array disk IO, I tested the available bandwidth on the Force 10 switch used, using the NTttcp version 5.28 tool. One of the array nodes was the receiver:

GS-002

The HV-LAB-01 compute node was the sender:

GS-003

I configured the tool to use 4 processor cores only since the Gridstore storage nodes have only 4 cores.

The result was usable bandwidth of 8.951 Gbps (1,118.9 MB/s). Testing was done using standard 1,500 MTU frames, not 9,000 MTU jumbo frames.


vLUNs:

8 vLUNs were configured for this test.

GS-009l

Each vLUN is configured as follows:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive single node failure)
  • Optimized for: IOPS
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 64 KB
  • Size: 5 TB (3 segments, 2.5 TB each)

This configuration utilizes all 24 disks in the grid (6 nodes * 4 disks each = 24 disks = 8 vLUNs * 3 disks each) and provides optimum array throughput.


Testing tool:

Intel’s IOMeter version 2006.07.27

24 workers, each configured to target all 8 vLUNs – 32 outstanding I/Os

GS-009m

IO profile: 50% read/50% write, 10% random, 8k alignment:

GS-009n

Test duration: 10 minutes


Test result:

IOMeter showed upwards of 17.7k IOPS:

GS-009a

Disk performance details on the compute node:

GS-009b

CPU performance details on the compute node:

GS-009c

Network performance on one of the storage nodes:

GS-009d

Disk performance on one of the storage nodes:

GS-009e

CPU performance on one of the storage nodes:

GS-009f

Overall summary performance on one of the storage nodes:

GS-009g

CPU utilization on the storage nodes as shown from the GridControl snap-in:

GS-009h

Bytes received:

GS-009i

Final test result:

GS-009j

and test details.


Conclusion and important points to note:

  • Network utilization maxed out at the 10 Gbps of the single NIC used on both the compute and storage nodes. This suggests that the array is likely to deliver more IOPS if more network bandwidth is available. The next test will use 2 teamed NICs on the compute node, as well as 3 storage nodes with teamed 10 Gbps NICs.
  • CPU was maxed out on the storage nodes during the test. The storage nodes have 4 cores, which suggests that CPU may be a bottleneck on them. It also leads me to believe that a) more processing power is needed on the storage nodes, and b) RDMA NICs are likely to enhance performance greatly. The Mellanox ConnectX-3 VPI dual-port PCIe card may be just what the doctor ordered. In a perfect environment, I would have that coupled with the Mellanox InfiniBand MSX6036F-1BRR 56 Gbps switch.
  • Disk IO performance on the storage nodes during the test showed about 240 MB/s of data transfer, or about 60 MB/s per disk in each node. This corresponds to the native IO performance of the SAS disks, and suggests a minimal/negligible boost from the 550 GB PCIe flash card in the storage node.

 

 


Benchmarking Gridstore enterprise storage array (1)


Gridstore provides an alternative to traditional enterprise storage. Basic facts about Gridstore storage technology include:

  • It provides storage nodes implemented as 1 RU servers that function collectively as a single storage array.
  • Connectivity between the nodes and the storage consumers/compute nodes occurs over one or more 1 or 10 Gbps Ethernet connections.
  • NIC teaming can be setup on the Gridstore nodes to provide additional bandwidth and fault tolerance
  • It utilizes a virtual controller to present storage to Windows servers

The IO testing tool and its settings are detailed in this post.

vLUNs can be easily created using the GridControl snap-in. This testing is done with a Gridstore array composed of 6 H-nodes. Click node details to see more.

Prior to testing array disk IO, I tested the available bandwidth on the Force 10 switch used, using the NTttcp version 5.28 tool. One of the array nodes was the receiver:

GS-002

 

The HV-LAB-01 compute node was the sender:

GS-003

I configured the tool to use 4 processor cores only since the Gridstore storage nodes had only 4 cores.

The result was usable bandwidth of 8.951 Gbps (1,118.9 MB/s). Testing was done using standard 1,500 MTU frames, not 9,000 MTU jumbo frames.

Test details:

On the receiver Gridstore storage node:
C:\Support>ntttcp.exe -r -m 4,*,10.5.19.30 -rb 2M -a 16 -t 120
Copyright Version 5.28
Network activity progressing…
Thread Time(s) Throughput(KB/s) Avg B / Compl
====== ======= ================ =============
0 120.011 311727.158 60023.949
1 120.011 233765.293 53126.468
2 120.011 306670.676 56087.990
3 120.011 293592.705 52626.788
##### Totals: #####
Bytes(MEG) realtime(s) Avg Frame Size Throughput(MB/s)
================ =========== ============== ================
134280.569568 120.011 1457.709 1118.902
Throughput(Buffers/s) Cycles/Byte Buffers
===================== =========== =============
17902.435 3.864 2148489.113
DPCs(count/s) Pkts(num/DPC) Intr(count/s) Pkts(num/intr)
============= ============= =============== ==============
17388.114 46.288 26563.098 30.300
Packets Sent Packets Received Retransmits Errors Avg. CPU %
============ ================ =========== ====== ==========
4634562 96592255 599 0 62.960

On the sender compute node: HV-LAB-05
C:\Support>ntttcp.exe -s -m 4,*,10.5.19.30 -rb 2M -a 16 -t 120
Copyright Version 5.28
Network activity progressing…
Thread Time(s) Throughput(KB/s) Avg B / Compl
====== ======= ================ =============
0 120.003 311702.607 65536.000
1 120.003 233765.889 65536.000
2 120.003 306669.667 65536.000
3 120.003 293592.660 65536.000
##### Totals: #####
Bytes(MEG) realtime(s) Avg Frame Size Throughput(MB/s)
================ =========== ============== ================
134268.687500 120.004 1457.441 1118.868
Throughput(Buffers/s) Cycles/Byte Buffers
===================== =========== =============
17901.895 2.957 2148299.000
DPCs(count/s) Pkts(num/DPC) Intr(count/s) Pkts(num/intr)
============= ============= =============== ==============
25915.561 1.504 71032.291 0.549
Packets Sent Packets Received Retransmits Errors Avg. CPU %
============ ================ =========== ====== ==========
96601489 4677580 22698 1 7.228


Test 1:

Compute node(s): 1 physical machine with 2x Xeon E5-2430L CPUs at 2 GHz with 6 cores each (12 Logical processors) and 30 MB L3 cache, 96 GB RAM, 2x 10 Gbps NICs

GS-001

vLUN:

  • Protect Level: 0 (no fault tolerance, striped across 4 Gridstore nodes)
  • Optimized for: N/A
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (4 segments, 512 GB each)

GS-A05

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 10.43k IOPS

GS-A01

 

In the above image you can see the read/write activity to the 4 nodes that make up this vLUN listed under Network Activity in the Resource Monitor/Network tab.

GS-A02

At the same time, the 4 nodes that make up this vLUN showed average CPU utilization around 40%. This dropped down to 0% right after the test.

GS-A03

The 4 nodes' memory utilization averaged around 25% during the test; its baseline is 20%.

GS-A04

 


Test 2: The same single compute node above

vLUN:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive single node failure)
  • Optimized for: IOPS
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (3 segments, 1 TB each)

GS-B04

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 11.32k IOPS

GS-B01

 

GS-B02

 

GS-B03

 


Test 3: The same single compute node above

vLUN:

  • Protect Level: 1 (striped across 5 Gridstore nodes, fault tolerant to survive single node failure)
  • Optimized for: Throughput
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (5 segments, 512 GB each)

GS-C01

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 9.28k IOPS
GS-C02

GS-C03

 

GS-C04


Test 4: The same single compute node above

vLUN:

  • Protect Level: 2 (striped across 6 Gridstore nodes, fault tolerant to survive 2 simultaneous node failures)
  • Optimized for: Throughput
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (6 segments, 512 GB each)

GS-D01

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 4.56k IOPS

GS-D02

GS-D03

GS-D04

 


Test 5: The same single compute node above

2 vLUNs:

1. The same Grid Protect Level 1 vLUN from Test 2 above with the Platinum QoS setting, plus

2. Identical 2nd vLUN except that QoS is set to Gold:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive 1 node failure)
  • Optimized for: IOPS
  • QoS: Gold
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (3 segments,  1 TB each)

GS-E01

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 10.52k IOPS

GS-E02

GS-E03

GS-E04

 


 

Test 6: The same single compute node above

3 vLUNs:

All the same:

  • Protect Level: 1 (striped across 3 Gridstore nodes, fault tolerant to survive 1 node failure)
  • Optimized for: IOPS
  • QoS: Platinum
  • Unmasked: to 1 server
  • File system: NTFS
  • Block size: 32 KB
  • Size: 2 TB (3 segments,  1 TB each)

GS-F1

Result:

Testing with 24 vCores, 10 Gbps NIC, 1 compute node, 32k block size, 50% read/50% write IO profile => 9.94k IOPS

GS-F6

GS-F5

GS-F4

GS-F3

GS-F2


 Summary:

GS-004


Creating vLUNs on a Gridstore array


Gridstore provides an alternative to traditional enterprise storage. Basic facts about Gridstore storage technology include:

  • It provides storage nodes implemented as 1 RU servers that function collectively as a single storage array.
  • Connectivity between the nodes and the storage consumers/compute nodes occurs over one or more 1 or 10 Gbps Ethernet connections.
  • NIC teaming can be setup on the Gridstore nodes to provide additional bandwidth and fault tolerance
  • It utilizes a virtual controller to present storage to Windows servers

The following is an overview of available vLUN options and features. The lab used consists of 6x Gridstore “H” storage nodes. Gridstore storage nodes are of 2 types: H-nodes and C-nodes. C-nodes are capacity nodes and typically include 4x 3TB 7200 RPM SAS disks. H-nodes are hybrid nodes that include a 550 GB PCIe Flash card. Each node has:

  • CPU: 1x Xeon E5-2403 processor at 1.8 GHz with 4 cores (no hyper-threading) and 10 MB L3 cache
  • Memory: 32 GB DDR3 1333 MHz DIMM
  • Disks (not counting boot/system disk(s)): 4x 3TB 7200 RPM SAS disks and a 550 GB PCIe Flash card

To create a vLUN, in GridControl snap-in, vPools=>(vPool_Name)=>right-click on vLUNs and click Create vLUN
GS-v1
GridProtect level 0: This setting provides no protection against any disk loss in any storage node, or against any node loss in the grid. This option is strongly discouraged.
GS-v2

 

The next step is optional. It includes the selection of QoS (Bronze/Gold/Platinum), which compute node(s) to unmask this vLUN to, and how to format it.
GS-v3

If you skip this step, the GridStore software will create the vLUN but not unmask it to any host:
GS-v4

In this view note:

  1. vLUN protect level 0 is created on 4 storage nodes listed in the Hostname column (this is the node's NetBIOS name)
  2. The "Disk" and "Slot" columns show the actual disks on which this vLUN resides. The following view shows the same information under Storage Nodes:
    GS-v5
  3. vLUNs are thick-provisioned: the vLUN's entire space is dedicated/reserved on the disks.

To unmask the newly created vLUN and present it to a compute node, right-click on it, and click Add vLUN to Server:

GS-v6

Pick the desired server from the drop down list and select the desired QoS level (Bronze/Gold/Platinum):

GS-v7

The unmasked vLUN becomes visible to the selected compute node:

GS-v8

Like any regular disk, we can now bring it online, initialize it (GPT recommended), and format it. I recommend using a 32 KB block size rather than the 4 KB default, and naming the volume the same as the vLUN for consistency. A 32 KB block size enhances IOPS at the expense of potentially wasting disk space when the average file size is under 32 KB; given that most workloads have files larger than 32 KB, the trade-off is worth it.
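As a rough sketch, the same online/initialize/format steps can be done from PowerShell on the compute node using the Windows Storage cmdlets; the disk number, drive letter, and volume label below are example values, not taken from this lab.

# Bring the newly unmasked vLUN online, initialize it as GPT, and format it NTFS
# with a 32 KB allocation unit size. Disk number, drive letter, and label are examples.
Set-Disk -Number 5 -IsOffline $false
Initialize-Disk -Number 5 -PartitionStyle GPT -PassThru |
    New-Partition -UseMaximumSize -DriveLetter E |
    Format-Volume -FileSystem NTFS -AllocationUnitSize 32KB -NewFileSystemLabel "vLUN01" -Confirm:$false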

GS-v9

 


Upgrading Gridstore software – step by step


Gridstore provides an alternative to traditional enterprise storage. Basic facts about Gridstore storage technology include:

  • It provides storage nodes implemented as 1 RU servers that function collectively as a single storage array.
  • Connectivity between the nodes and the storage consumers/compute nodes occurs over one or more 1 or 10 Gbps Ethernet connections.
  • NIC teaming can be setup on the Gridstore nodes to provide additional bandwidth and fault tolerance
  • It utilizes a virtual controller to present storage to Windows servers

Gridstore releases software updates from time to time. The following is a step-by-step overview of upgrading the Gridstore software. This process upgrades the software on all Gridstore storage nodes as well as on all management/compute nodes. All vLUNs must be stopped before upgrading the software on the storage nodes.

  1. Stop all vLUNs. In GridControl snap-in, vPools=>(vPool_Name)=>vLUNs=> right-click on each vLUN and click STOP
  2. You can view your current version in GridControl, Help=> About GridControl – this shows the software version on the local management/compute node
    GS-u2
  3. You can also view the software version on all storage nodes in GridControl under Storage Nodes
    GS-u4
  4. To kick off the process of upgrading the software on all storage nodes in the grid, in GridControl right-click on The Grid and click Upgrade Grid
    GS-u3
  5. Browse to the location of the Gridstore.msi file provided by Gridstore technical support and click Next.
    GS-u5
  6. The installer goes about upgrading the software on each storage node in the grid
    GS-u6
  7. This went quickly, completing all 6 nodes in this configuration in a matter of minutes
    GS-u7
  8. Next we need to upgrade each compute node that uses the Gridstore storage. This can be done centrally from any management node. In GridControl, click vController Manager. On the right side you will see the list of your storage-consumers/compute nodes
    GS-ua
  9. Right-click on each node that you need to upgrade and click Upgrade
    Note: This option is only available for nodes that are Online.

 


Migrating IP settings from one NIC to another using Powershell


Here’s an example scenario where the following script may be particularly useful:

A GridStore array where each node currently uses one 1 Gbps NIC. After adding a 10 Gbps NIC to each node, we'd like to migrate the IP settings from the 1 Gbps NIC to the 10 Gbps NIC on each node. GridStore uses commodity rack-mount server hardware and a robust software driver to present scalable, high-performance, fully redundant vLUNs. More detailed posts on GridStore will follow.

This diagram shows network connectivity before adding the 10 Gbps NICs:

Before

 

After adding the 10 Gbps NICs:

After

Steps:

  1. You will need administrative credentials to the nodes from GridStore technical support
  2. From Server1, using GridControl snapin, stop all vLUNs:
    Stop-vLUNs
  3. RDP to each node.
    logon
  4. Currently the nodes run Windows 7 Embedded, and the RDP session brings up a command prompt.
  5. Run Control to show Control Panel, double-click Network and Sharing Center, click Change Adapter Settings to view/confirm that you have 2 connected NICs:
    NICs
  6. Start Powershell ISE:
    start-ps-ise
  7. Copy/paste the following script and run it on each node:

    GS-2

# Script to move network configuration from one NIC to another on a GridStore node
# Sam Boutros
# 6/16/2014
# Works with Powershell 2.0
#
Set-Location "c:\support"
$Loc = Get-Location
$Date = Get-Date -format yyyyMMdd_hhmmsstt
$logfile = $Loc.path + "\Move-GSNIC_" + $env:COMPUTERNAME + "_" + $Date + ".txt"
function log($string) {
    Write-Host $string; $temp = ": " + $string
    $string = Get-Date -format "yyyy.MM.dd hh:mm:ss tt"; $string += $temp
    $string | Out-File -FilePath $logfile -Append
}
#
log "Switching NIC configuration on $env:COMPUTERNAME"
$ConnectedNICs = Get-WmiObject Win32_NetworkAdapterConfiguration -Filter "IPEnabled = True"
If ($ConnectedNICs.Count -lt 2) {log "Error: Less than 2 connected NICs:"; log $ConnectedNICs}
else {
    if ($ConnectedNICs.Count -gt 2) {log "Error: More than 2 connected NICs:"; log $ConnectedNICs}
    else { # 2 connected NICs
        Stop-Service GridStoreManagementService
        # Storing NIC details in variables for later use
        $NIC0Index = $ConnectedNICs[0].index; log "NIC 0 Index: $NIC0Index"
        $NIC0Desc = $ConnectedNICs[0].description; log "NIC 0 Description: $NIC0Desc"
        $NIC0IPv4 = $ConnectedNICs[0].IPAddress[0]; log "NIC 0 IPv4: $NIC0IPv4"
        $NIC0Mask = $ConnectedNICs[0].IPSubnet[0]; log "NIC 0 Subnet Mask: $NIC0Mask"
        $Nic0 = Get-WmiObject win32_networkadapter -Filter "DeviceId = $NIC0Index"
        $NIC0ConnID = $Nic0.NetConnectionID; log "NIC 0 NetConnectionID: $NIC0ConnID"
        #
        $NIC1Index = $ConnectedNICs[1].index; log "NIC 1 Index: $NIC1Index"
        $NIC1Desc = $ConnectedNICs[1].description; log "NIC 1 Description: $NIC1Desc"
        $NIC1IPv4 = $ConnectedNICs[1].IPAddress[0]; log "NIC 1 IPv4: $NIC1IPv4"
        $NIC1Mask = $ConnectedNICs[1].IPSubnet[0]; log "NIC 1 Subnet Mask: $NIC1Mask"
        $Nic1 = Get-WmiObject win32_networkadapter -Filter "DeviceId = $NIC1Index"
        $NIC1ConnID = $Nic1.NetConnectionID; log "NIC 1 NetConnectionID: $NIC1ConnID"
        # Identify Target NIC (the new NIC, still on an APIPA 169.254.x.x address) and Source NIC
        if ($ConnectedNICs[0].IPAddress[0] -match "169.254") {$TargetNIC = 0} else {$TargetNIC = 1}
        $SourceNIC = 1 - $TargetNIC; log "Source NIC: NIC $SourceNIC"
        $SourceIP = $ConnectedNICs[$SourceNIC].IPAddress[0]
        $SourceMask = $ConnectedNICs[$SourceNIC].IPSubnet[0]
        log "Source IP: $SourceIP"
        # Setting IP address for Source NIC to DHCP
        log "Changing IP address of source NIC to DHCP"
        if ($ConnectedNICs[$SourceNIC].EnableDHCP().ReturnValue -eq 0) {
            log "==> IP setting change was successful"} else {log "==> IP setting change failed"}
        $ConnectedNICs[$SourceNIC].SetDNSServerSearchOrder()
        # Need to disable and re-enable the Source NIC (whichever of the two it is) for the settings to take effect (!?)
        $SourceAdapter = if ($SourceNIC -eq 0) {$Nic0} else {$Nic1}
        $SourceAdapter.Disable(); Start-Sleep -s 2
        $SourceAdapter.Enable(); Start-Sleep -s 2
        # Setting IP address for Target NIC to the same values previously held by the Source NIC
        $TargetIP = $ConnectedNICs[$TargetNIC].IPAddress[0]
        log "Target IP: $TargetIP"
        log "Changing IP address of Target NIC to $SourceIP with subnet mask $SourceMask"
        if ($ConnectedNICs[$TargetNIC].EnableStatic($SourceIP,$SourceMask).ReturnValue -eq 0) {
            log "==> IP address change was successful"} else {log "==> IP address change failed"}
        Remove-Item -Path "HKLM:\SOFTWARE\Wow6432Node\Gridstore\NetworkAdapter"
        Start-Sleep -s 2
        Start-Service GridStoreManagementService
    }
}
Invoke-Expression "$env:windir\system32\Notepad.exe $logfile"