In a prior post I shared a PowerShell script, Create-VM.ps1, that creates many Hyper-V virtual machines across multiple LUNs. That script ran in a single thread, creating one VM before moving on to the next. In another post I manually launched 8 copies of the script to run multiple jobs at the same time. In this script I use Start-Job with -ScriptBlock and -ArgumentList to run multiple jobs concurrently.
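The core pattern looks something like the sketch below: one background job per LUN, each creating its share of the VMs. This is a minimal illustration, not the actual Create-VM.ps1 script; the drive letters, VM naming scheme, per-LUN count, and memory size are all assumptions for the example.

```powershell
# Minimal sketch: one Start-Job per LUN, passing per-job values via -ArgumentList.
# Paths, names, and counts below are illustrative, not from the real script.
$LUNs      = 'V:\', 'W:\', 'X:\', 'Y:\'   # assumed mount points for the 4 LUNs
$VMsPerLUN = 10                           # 40 VMs total across 4 LUNs

$Jobs = foreach ($LUN in $LUNs) {
    Start-Job -ScriptBlock {
        param($Root, $Count)
        for ($i = 1; $i -le $Count; $i++) {
            $Name = 'VM-{0}-{1:D2}' -f $Root.Trim(':\'), $i
            # New-VM comes from the Hyper-V module on the host
            New-VM -Name $Name -Path (Join-Path $Root $Name) -MemoryStartupBytes 2GB
        }
    } -ArgumentList $LUN, $VMsPerLUN
}

$Jobs | Wait-Job | Receive-Job            # block until all 4 jobs finish, then collect output
```

Because each job is a separate PowerShell process, the ScriptBlock cannot see variables from the parent session; -ArgumentList is how the LUN path and VM count get into each job.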
The script can be downloaded from the Microsoft TechNet Gallery.
Compute node (Hyper-V host): 2x Xeon E5-2430L CPUs at 2 GHz (6 cores / 12 logical processors each, 15 MB L3 cache), 96 GB RAM, and 2x 10 Gbps non-RDMA NICs in a NIC team (Teaming mode: Switch Independent, Load balancing mode: Hyper-V Port).

Storage: 4x LUNs from a 3-node Gridstore array, each configured as a 2+1 IOPS LUN. Each Gridstore storage node has 1x Xeon E5-2403 processor at 1.8 GHz (4 cores, no hyper-threading, 10 MB L3 cache), 32 GB DDR3 1333 MHz RAM, 4x 3 TB 7200 RPM SAS disks, a 550 GB PCIe flash card, and 2x 10 Gbps non-RDMA NICs in a NIC team (Broadcom Smart Load Balancing and Failover = switch independent; no LACP support needed on the switch).
The script took 9 minutes and 45 seconds to create the 40 VMs. During that time the Hyper-V host resource monitor showed:
GridControl snap-in showed:
I could also see script logs piling up while the script ran:
I started the 40 VMs manually after the script finished:
Conclusion and important points to note:
- Hyper-V performance summary: No CPU or memory bottleneck observed during the test.
- Array performance summary:
  - File copies: 40
  - File size: 8.93 GB
  - Concurrent operations: 4 copy operations at the same time
  - Total vLUNs used: 4
  - Average file copy duration: 10.18 seconds
  - Average throughput: 902.86 MB/s (about 7.2 Gbps)
  - Using the formula IOPS = BytesPerSec / TransferSizeInBytes, with the LUNs formatted with a 64 KB allocation unit size: Average IOPS = (902.86 × 1024) / 64 ≈ 14,446, or about 14.45k IOPS
- Although 20 Gbps of aggregate bandwidth is available to the compute node and to each of the 3 storage nodes, I have not been able to push network traffic above 10 Gbps.
- CPU utilization on the storage nodes hovered around 90% during the copies; the storage nodes would benefit from additional processing capacity.
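The IOPS arithmetic above is easy to double-check in PowerShell. This snippet is plain unit conversion using only the figures already quoted (902.86 MB/s average throughput, 64 KB transfer size):

```powershell
# IOPS = BytesPerSec / TransferSizeInBytes, working in KB for convenience:
# convert MB/s to KB/s, then divide by the 64 KB transfer size.
$ThroughputMBps = 902.86
$TransferKB     = 64
$IOPS = ($ThroughputMBps * 1024) / $TransferKB
'{0:N0} IOPS (~{1:N2}k)' -f $IOPS, ($IOPS / 1000)   # 14,446 IOPS (~14.45k)
```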