Posts tagged “StorSimple”

StorSimple 8k software release 4.0


Around mid-February 2017, Microsoft released StorSimple software version 4.0 (17820). This release includes firmware and driver updates that require Maintenance mode and the serial console.

Using this PowerShell script to save the Version 4.0 cmdlets and compare them to Version 3.0, I got:

storsimple40-a
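The comparison step itself can be sketched with Compare-Object. This is not the linked script, only a minimal illustration: the two lists below are small hand-typed subsets (an assumption made for brevity), whereas in practice each list would be the saved cmdlet names from a device running that version.

```powershell
# Illustrative subsets only; real lists would come from saved cmdlet names per version
$Version3Cmdlets = 'Get-HcsSystem', 'Get-HcsUpdateAvailability', 'Start-HcsUpdate'
$Version4Cmdlets = $Version3Cmdlets + 'Get-HCSControllerReplacementStatus', 'Get-HcsRehydrationJob', 'Invoke-HcsDiagnostics'

# '=>' marks names present only in the difference (Version 4.0) list
$NewCmdlets = Compare-Object -ReferenceObject $Version3Cmdlets -DifferenceObject $Version4Cmdlets |
    Where-Object { $_.SideIndicator -eq '=>' } |
    Select-Object -ExpandProperty InputObject

$NewCmdlets
```

Swapping the reference and difference lists, or filtering on '<=', would list cmdlets that were removed instead.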

Trying the new cmdlets, the Get-HCSControllerReplacementStatus cmdlet returns a message like:

storsimple40-b

The Get-HCSRehydrationJob cmdlet returns no output (no restore jobs are running).

The Invoke-HCSDiagnostics cmdlet seems pretty useful and returns output similar to:

storsimple40-c

The cmdlet takes a little while to run. In this case it took 14 minutes and 38 seconds:

storsimple40-d

It returns data in several sections, such as:

System Information section:

storsimple40-e

This is output similar to what we get from the Get-HCSSystem cmdlet for both controllers.

Update Availability section:

storsimple40-f

This output is similar to that of the Get-HCSUpdateAvailability cmdlet, although the MaintenanceModeUpdatesTitle property is unexpectedly empty.

storsimple40-g

Cluster Information section:

storsimple40-h

This is newly exposed information. I’m guessing this is the output of some Get-HCSCluster cmdlet, but this is pure speculation on my part. I’m also guessing that this is a list of clustered roles in a traditional Server 2012 R2 failover cluster.

Service Information section:

storsimple40-i

This is also newly exposed information. Get-Service is not an exposed cmdlet.

Failed Hardware Components section:

storsimple40-j

This is newly exposed information. This device is in good working order, so the items in this list may be false warnings.

Firmware Information section:

storsimple40-k

This output is similar to what we get from the Get-HCSFirmwareVersion cmdlet.

Network Diagnostics section:

storsimple40-l

Most of this information is not new, but it’s nicely bundled into one section.

Performance Diagnostics section:

storsimple40-m

Finally, this section provides new information about read and write latency to the configured Azure Storage accounts.

The full list of exposed cmdlets in Version 4.0 is:

Clear-DnsClientCache
Set-CloudPlatform
Select-Object
Restart-HcsController
Resolve-DnsName
Out-String
Out-Default
Set-HcsBackupApplianceMode
Measure-Object
Invoke-HcsmServiceDataEncryptionKeyChange
Invoke-HcsDiagnostics
Get-History
Get-Help
Get-HcsWuaVersion
Get-HcsWebProxy
Invoke-HcsSetupWizard
Set-HcsDnsClientServerAddress
Set-HcsNetInterface
Set-HcsNtpClientServerAddress
Test-HcsNtp
Test-HcsmConnection
Test-Connection
Sync-HcsTime
Stop-HcsController
Start-Sleep
Start-HcsUpdate
Start-HcsPeerController
Start-HcsHotfix
Start-HcsFirmwareCheck
Set-HcsWebProxy
Set-HcsSystem
Set-HcsRemoteManagementCert
Set-HcsRehydrationJob
Set-HcsPassword
Get-HcsUpdateStatus
Trace-HcsRoute
Get-HcsUpdateAvailability
Get-HcsSupportAccess
Enable-HcsRemoteManagement
Enable-HcsPing
Enable-HcsNetInterface
Disable-HcsWebProxy
Disable-HcsSupportAccess
Disable-HcsRemoteManagement
Enable-HcsSupportAccess
Disable-HcsPing
Test-NetConnection
Test-HcsStorageAccountCredential
TabExpansion2
Reset-HcsFactoryDefault
prompt
Get-NetAdapter
Disable-HcsNetInterface
Enable-HcsWebProxy
Enter-HcsMaintenanceMode
Enter-HcsSupportSession
Get-HcsRoutingTable
Get-HcsRemoteManagementCert
Get-HcsRehydrationJob
Get-HcsNtpClientServerAddress
Get-HcsNetInterface
Get-HcsFirmwareVersion
Get-HcsDnsClientServerAddress
Get-HCSControllerReplacementStatus
Get-HcsBackupApplianceMode
Get-Credential
Get-Command
Export-HcsSupportPackage
Export-HcsDataContainerConfig
Exit-PSSession
Exit-HcsMaintenanceMode
Get-HcsSystem
Update-Help


StorSimple 8k series as a backup target?


19 December 2016

In a conference call, the Microsoft Azure StorSimple product team explained:

  •  “The maximum recommended full backup size when using an 8100 as a primary backup target is 10TiB. The maximum recommended full backup size when using an 8600 as a primary backup target is 20TiB”
  • “Backups will be written to array, such that they reside entirely within the local storage capacity”

Microsoft acknowledges the difficulty that results from the maximum provisionable space being 200 TB on an 8100 device. That cap limits the ability to over-provision thin-provisioned tiered iSCSI volumes when significant deduplication/compression savings are expected, for example with Veeam backup copy job files kept for long term retention.

Conclusion

  • When used as a primary backup target, StorSimple 8k devices are intended for SMB clients with backup files under 10TB/20TB for the 8100/8600 models respectively
  •  Compared to using an Azure A4 VM with attached disks (page blobs), StorSimple provides 7-22% cost savings over 5 years

15 December 2016

On 13 December 2016, Microsoft announced the support of using StorSimple 8k devices as a backup target. Many customers have asked for StorSimple to support this workload. StorSimple hybrid cloud storage iSCSI SAN features automated tiering at the block level from its SSD to SAS to Azure tiers. This makes it a perfect fit for Primary Data Set for unstructured data such as file shares. It also features cloud snapshots which provide the additional functionality of data backup and disaster recovery. That’s primary storage, secondary storage (short term backups), long term storage (multiyear retention), off site storage, and multi-site storage, all in one solution.

However, the same features that make the device well suited to the primary data set/unstructured data pose significant difficulties when trying to use it as a backup target, such as:

  • Automated tiering: Many backup software packages (like Veeam) perform operations such as forward incremental, synthetic full, and backup copy jobs for long term retention, all of which scan/access files that are typically dozens of TB each. This causes the device to tier data to Azure and back to the local device in a way that slows things down to a crawl. DPM is even worse, specifically in the way it allocates/controls volumes.
  • The arbitrary maximum allocatable space for a device (200TB for an 8100 device for example) makes it practically impossible to use the device as a backup target for long term retention.
    • Example: 50 TB volume, need to retain 20 copies for long term backup. Even if the change rate is very low and the actual bits after deduplication and compression of 20 copies amount to 60 TB, we cannot provision 20x 50 TB volumes, or a 1 PB volume. This makes the maximum workload size around 3TB if long term retention requires 20 recovery points. 3TB is far too small a limit for enterprise clients who simply want to use Azure for long term backup where a single backup file is 10-200 TB.
  • The specific implementation of the backup catalog and who (the backup software versus StorSimple Manager service) has it.
  • Single unified tool for backup/recovery – now we have to use the backup software and StorSimple Manager, which do not communicate and are not aware of each other
  • Granular recoveries (single file/folder). Currently to recover a single file from snapshot, we must clone the entire volume.

In this article published 6 December 2016, Microsoft lays out their reference architecture for using a StorSimple 8k device as a Primary Backup Target for Veeam:

primarybackuptargetlogicaldiagram

There are a number of best practices for configuring Veeam and StorSimple in this use case, such as disabling deduplication, compression, and encryption on the Veeam side, dedicating the StorSimple device to the backup workload, and so on.

The interesting part comes in when you look at scalability. Here’s Microsoft’s listed example of a 1 TB workload:

ss-backup-target03

This architecture suggests provisioning 5*5TB volumes for the daily backups and a 26TB volume for the weekly, monthly, and annual backups:

ss-backup-target04

This 1:26 ratio between the Primary Data Set and Vol6 used for the weekly, monthly, and annual backups suggests that the maximum supported Primary Data Set is 2.46 TB (maximum volume size is 64 TB) !!!???

ss-backup-target05

This reference architecture suggests that this solution may not work for a file share that is larger than 2.5TB, or one that may need to be expanded beyond 2.5TB.

Furthermore, this reference architecture suggests that the maximum Primary Data Set cannot exceed 2.66TB on an 8100 device, which has 200TB maximum allocatable capacity, after reserving 64TB to be able to restore the 64TB Vol6.

ss-backup-target06

It also suggests that the maximum Primary Data Set cannot exceed 8.55TB on an 8600 device, which has 500TB maximum allocatable capacity, after reserving 64TB to be able to restore the 64TB Vol6.

ss-backup-target07
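The arithmetic behind these limits can be reproduced with a few lines of PowerShell. This is a sketch under an assumption about how the figures were derived: 51TB provisioned per 1TB of primary data (5 x 5TB daily volumes plus one 26TB volume), a 64TB maximum volume size, and 64TB held back on each device for restoring Vol6. The variable names are my own, and rounding may differ slightly from the figures quoted above.

```powershell
# Figures taken from the reference architecture discussed above
$DailyVolumesTB   = 5 * 5      # five 5TB volumes for daily backups
$LongTermVolumeTB = 26         # Vol6: weekly, monthly, and annual backups
$ProvisionedPerTB = $DailyVolumesTB + $LongTermVolumeTB   # 51TB per 1TB of primary data
$MaxVolumeTB      = 64         # maximum size of a single volume
$RestoreReserveTB = 64         # space held back to restore Vol6

# Limit imposed by the 1:26 ratio and the maximum volume size
$LimitByVolumeSize = [math]::Round($MaxVolumeTB / $LongTermVolumeTB, 2)

# Limit imposed by each device's maximum allocatable capacity
$Devices = @{ '8100' = 200; '8600' = 500 }
$LimitByDevice = @{}
foreach ($Model in $Devices.Keys) {
    $LimitByDevice[$Model] = [math]::Round(($Devices[$Model] - $RestoreReserveTB) / $ProvisionedPerTB, 2)
}

"Limit from maximum volume size: $LimitByVolumeSize TB"
"Limit on 8100: $($LimitByDevice['8100']) TB"
"Limit on 8600: $($LimitByDevice['8600']) TB"
```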

Even if we consider cloud snapshots to be used only in case of total device loss – disaster recovery, and we allocate the maximum device capacity, the 8100 and 8600 devices can accommodate 3.93TB and 9.81TB respectively:

ss-backup-target08

Conclusion:

Although the allocation of 51TB of space to back up 1 TB of data resolves the tiering issue noted above, it significantly erodes the value proposition provided by StorSimple.



StorSimple 8k series software version reference


This post lists StorSimple software versions, their release dates, and major new features for reference. Microsoft does not publish release dates for StorSimple updates. The release dates below are from published documentation and/or first hand experience. They may be off by up to 15 days.

  • Version 4.0 (17820) – released 12 February 2017 – see release notes, and this post.
    • Major new features: Invoke-HCSDiagnostics new cmdlet, and heatmap based restores
  • Version 3.0 (17759) – released 6 September 2016 – see release notes, and this post.
    • Major new features: The use of a StorSimple as a backup target (9/9/2016 it’s unclear what that means)
  • Version 2.2 (17708) – see release notes
  • Version 2.1 (17705) – see release notes
  • Version 2.0 (17673) – released January 2016 – see release notes, this post, and this post
    • Major new features: Locally pinned volumes, new virtual device 8020 (64TB SSD), ‘proactive support’, OVA (preview)
  • Version 1.2 (17584) – released November 2015 – see release notes, this post, and this post
    • Major new features: (Azure-side) Migration from legacy 5k/7k devices to 8k devices, support for Azure US GOV, support for cloud storage from other public clouds such as AWS/HP/OpenStack, update to latest API (this should allow us to manage the device in the new portal, yet this has not happened as of 9/9/2016)
  • Version 1.1 (17521) – released October 2015 – see release notes
  • Version 1.0 (17491) – released 15 September 2015 – see release notes and this post
  • Version 0.3 (remains 17361) – released February 2015 – see release notes
  • Version 0.2 (17361) – released January 2015 – see release notes and this post
  • Version 0.1 (17312) – released October 2014 – see release notes
  • Version GA (General Availability – 0.0 – Kernel 6.3.9600.17215) – released July 2014 – see release notes – This is the first Windows OS based StorSimple software after Microsoft’s acquisition of the StorSimple company.
  • When Microsoft acquired the StorSimple company, the StorSimple 5k/7k series ran Linux OS based StorSimple software version 2.1.1.249 – August 2012

StorSimple Software update 3.0 (17759)


This post describes one experience of updating a StorSimple 8100 series device from version 0.2 (17361) to the current (8 September 2016) version 3.0 (17759). It’s worth noting that:

  • StorSimple 8k series devices that shipped in mid 2015 came with software version 0.2
  • Typically, the device checks periodically for updates and when updates are found a note similar to this image is shown in the device/maintenance page: storsimple3-03
  • The device admin then picks the time when to deploy the updates, by clicking the INSTALL UPDATES link. This kicks off an update job, which may take several hours: storsimple3-01
  • This update method is known as updating StorSimple device using the classic Azure portal, as opposed to updating the StorSimple device using the serial interface by deploying the update as a hotfix.
  • Released updates may not show up, in spite of scanning for updates manually several times: storsimple3-04
    The image above was taken on 9 September 2016 (update 3.0 is the latest at this time). It shows that no updates are available even after scanning for updates several times. The reason is that Microsoft deploys updates in a ‘phased rollout’, so they’re not available in all regions at all times.
    storsimple3-05
  • Updates are cumulative. This means that a device running version 0.2, for example, can be upgraded directly to 3.0 without manually updating to any intermediate version first.
  • An update may include one or both of the following 2 types:
    • Software updates: This is an update of the core 2012 R2 server OS that’s running on the device. Microsoft identifies this type as a non intrusive update. It can be deployed while the device is in production, and should not affect mounted iSCSI volumes. Under the covers, the device controller0 and controller1 are 2 nodes in a traditional Microsoft failover cluster. The device uses the traditional Cluster Aware Update to update the 2 controllers. It updates and reboots the passive controller first, fails over the device (iSCSI target and other clustered roles) from one controller to the other, then updates and reboots the second controller. Again this should be a no-down-time process.
    • Maintenance mode updates:

      These are updates to shared components in the device that require down time. Typically we see LSI SAS controller updates and disk firmware updates in this category. Maintenance mode updates must be done from the serial interface console (not Azure web interface or PowerShell interface). The typical down time for a maintenance mode update is about 30 minutes, although I would schedule a 2 hour window to be safe. The maintenance mode update steps are:

      • On the file servers, offline all iSCSI volumes provisioned from this device.
      • Log in to the device serial interface with full access
      • Put the device in Maintenance mode: Enter-HcsMaintenanceMode, wait for the device to reboot
      • Identify available updates: Get-HcsUpdateAvailability, this should show available Maintenance mode updates (TRUE)
      • Start the update: Start-HcsUpdate
      • Monitor the update: Get-HcsUpdateStatus
      • When finished, exit maintenance mode: Exit-HcsMaintenanceMode, and wait for the device to reboot.

 


Powershell script to list StorSimple network interface information including MAC addresses


In many cases we can obtain the IP address of a network interface via one command but must get the MAC address from another. The StorSimple 8k series, which runs a core version of Server 2012 R2 (as of 20 June 2016), is no exception. In this case we can get the IP address information of the device network interfaces via the Get-HCSNetInterface cmdlet. However, to identify MAC addresses we need to use the Get-NetAdapter cmdlet. This PowerShell script merges the information from both cmdlets and presents a collection of PS objects, each of which has the following properties:

  • InterfaceName
  • IPv4Address
  • IPv4Netmask
  • IPv4Gateway
  • MACAddress
  • IsEnabled
  • IsCloudEnabled
  • IsiSCSIEnabled

Script output may look like:

SS-MAC1
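The merge step can be sketched as follows. This is not the script itself: the two input collections are mock data standing in for the output of Get-HCSNetInterface and Get-NetAdapter, the property names on the mock interface objects are assumptions borrowed from the list above, and matching the two on interface name is also an assumption.

```powershell
# Mock data (made-up values) standing in for Get-HCSNetInterface output
$HcsInterfaces = @(
    [PSCustomObject]@{ InterfaceName = 'DATA0'; IPv4Address = '10.0.0.10'; IPv4Netmask = '255.255.255.0'
                       IPv4Gateway = '10.0.0.1'; IsEnabled = $true; IsCloudEnabled = $true; IsiSCSIEnabled = $false }
    [PSCustomObject]@{ InterfaceName = 'DATA1'; IPv4Address = '10.0.1.10'; IPv4Netmask = '255.255.255.0'
                       IPv4Gateway = $null; IsEnabled = $true; IsCloudEnabled = $false; IsiSCSIEnabled = $true }
)

# Mock data standing in for Get-NetAdapter output (Name and MacAddress are real Get-NetAdapter properties)
$NetAdapters = @(
    [PSCustomObject]@{ Name = 'DATA0'; MacAddress = '00-11-22-33-44-50' }
    [PSCustomObject]@{ Name = 'DATA1'; MacAddress = '00-11-22-33-44-51' }
)

# Index MAC addresses by adapter name for a simple lookup
$MacByName = @{}
foreach ($Adapter in $NetAdapters) { $MacByName[$Adapter.Name] = $Adapter.MacAddress }

# Emit one merged object per interface (assumes interface and adapter names match)
$Merged = foreach ($Interface in $HcsInterfaces) {
    [PSCustomObject]@{
        InterfaceName  = $Interface.InterfaceName
        IPv4Address    = $Interface.IPv4Address
        IPv4Netmask    = $Interface.IPv4Netmask
        IPv4Gateway    = $Interface.IPv4Gateway
        MACAddress     = $MacByName[$Interface.InterfaceName]
        IsEnabled      = $Interface.IsEnabled
        IsCloudEnabled = $Interface.IsCloudEnabled
        IsiSCSIEnabled = $Interface.IsiSCSIEnabled
    }
}

$Merged | Format-Table -AutoSize
```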

For more information about connecting to StorSimple via PowerShell see this post.

 


Presenting StorSimple iSCSI volumes to a failover cluster


In a typical implementation, StorSimple iSCSI volumes (LUNs) are presented to a file server, which in turn presents SMB shares to clients. Although the StorSimple iSCSI SAN features redundant hardware, and is implemented using redundant networking paths on both the Internet facing side and the iSCSI side, the file server in this example constitutes a single point of failure. One solution here is to present the iSCSI volume to all nodes in a failover cluster. This post goes over the steps to present StorSimple iSCSI volumes to a failover cluster as opposed to a single file server.

Create volume container, volume, unmask to all cluster nodes

As usual, keep one volume per volume container to be able to restore one volume at a time. Give the volume a name, size, type: tiered. Finally unmask it to all nodes in the failover cluster:

StorSimple-Cluster01

Format the volume:

In Failover Cluster Manager identify the owner node of the ‘File Server for general use‘ role:

StorSimple-Cluster02

In Disk Management of the owner node identified above, you should see the new LUN:

StorSimple-Cluster03

Right click on Disk20 in the example above, click Online. Right click again and click Initialize Disk. Choose GPT partition. It’s recommended to use GPT partition instead of MBR partition for several reasons such as maximum volume size limitation.

Right click on the available space to the right of Disk20 and create Simple Volume. It’s recommended to use Basic Disks and Simple Volumes with StorSimple volumes.

Format with NTFS, 64 KB allocation unit size, use the same volume label as the volume name used in StorSimple Azure Management Interface, and Quick format. Microsoft recommends NTFS as the file system to use with StorSimple volumes. 64KB allocation units provide better optimization as the device internal deduplication and compression algorithms use 64KB blocks for tiered volumes. Using the same volume label is important since currently (1 June 2016) StorSimple does not provide a LUN ID that can be used to correlate a LUN created on StorSimple to one appearing on a host. Quick formatting is important since these are thin provisioned volumes.

For existing volumes, the Windows GUI does not provide a way of identifying the volume allocation unit size. However, we can look it up via PowerShell as in:

@('c:','d:','y:') | ForEach-Object {
 # Win32_Volume reports the allocation unit size in bytes as BlockSize
 $Query = "SELECT BlockSize FROM Win32_Volume WHERE DriveLetter='$_'"
 $BlockSize = (Get-WmiObject -Query $Query).BlockSize/1KB
 Write-Host "Allocation unit size on Drive $_ is $BlockSize KB" -ForegroundColor Green
}

Replace the drive letters in line 1 with the ones you wish to lookup.

Summary of volume creation steps/best practices:

  • GPT partition
  • Basic Disk (not dynamic)
  • Simple volume (not striped, mirrored, …)
  • NTFS file system
  • 64 KB allocation unit (not the default 4 KB)
  • Same volume label as the one in StorSimple
  • Quick Format
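For those who prefer to script these steps, the following sketch applies the same best practices with the Storage module cmdlets that ship with Server 2012 R2. The function name is hypothetical, and it assumes the new LUN shows up as an offline, uninitialized disk on the owner node. Format-Volume performs a quick format unless -Full is specified.

```powershell
function Initialize-StorSimpleLun {
    # Hypothetical helper: brings a new StorSimple LUN online and formats it per the list above
    param(
        [Parameter(Mandatory)][int]$DiskNumber,       # disk number as shown in Disk Management
        [Parameter(Mandatory)][string]$VolumeLabel    # same as the volume name in StorSimple
    )

    # Bring the disk online and initialize it with a GPT partition style
    Set-Disk -Number $DiskNumber -IsOffline $false
    Initialize-Disk -Number $DiskNumber -PartitionStyle GPT

    # One simple volume on a basic disk: NTFS, 64 KB allocation unit, quick format
    New-Partition -DiskNumber $DiskNumber -UseMaximumSize -AssignDriveLetter |
        Format-Volume -FileSystem NTFS -AllocationUnitSize 65536 -NewFileSystemLabel $VolumeLabel -Confirm:$false
}
```

For the example in this post, the call would look like Initialize-StorSimpleLun -DiskNumber 20 -VolumeLabel 'TestSales-Vol'.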

Add the disk to the cluster:

Back in Failover Cluster Manager, under Storage, right click on Disks, and click Add Disk

StorSimple-Cluster05

Pick Disk20 in this example

StorSimple-Cluster06

Right click on the new cluster disk, and select Properties

StorSimple-Cluster07

Change the default name ‘Cluster Disk 3’ to the same volume label and name used in StorSimple

StorSimple-Cluster08

Assign Cluster Disk to File Server Role

In Failover Cluster Manager, under Storage/Disks, right click on TestSales-Vol in this example, and select Assign to Another Role under More Actions

StorSimple-Cluster09

Select the File Server for General Use role – we happen to have one role in this cluster:

StorSimple-Cluster10

Create clustered file shares

As an example, I created 2 folders in the TestSales-Vol volume:

StorSimple-Cluster11

In Failover Cluster Manager, under Roles, right click on the File Server for General Use role, and select Add File Share

StorSimple-Cluster12

Select SMB Quick in the New Share Wizard

StorSimple-Cluster13

Click Type a custom path and type in or Browse to the folder on the new volume to be shared

StorSimple-Cluster14

Change the share name or accept the default (folder name). In this example, I added a dollar sign $ to make this a hidden share

StorSimple-Cluster15

It’s important NOT to select ‘Allow caching of share’ for StorSimple volumes. Enabling access based enumeration is my personal recommendation.

StorSimple-Cluster16

Finally adjust NTFS permissions as needed or accept the defaults:

StorSimple-Cluster17

Click Create to continue

StorSimple-Cluster18

Repeat the above steps for the TestSales2 folder/share in this example
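The share creation steps can also be sketched in PowerShell with New-SmbShare, which is useful when several shares need the same settings. The wrapper function name is hypothetical, and the ScopeName value must be the client access point name of the clustered file server role, which is not shown in this post.

```powershell
function New-StorSimpleClusterShare {
    # Hypothetical wrapper that fixes the two settings recommended above
    param(
        [Parameter(Mandatory)][string]$Name,        # share name, for example 'TestSales$' for a hidden share
        [Parameter(Mandatory)][string]$Path,        # folder on the clustered StorSimple volume
        [Parameter(Mandatory)][string]$ScopeName    # client access point name of the file server role
    )

    # CachingMode None turns off offline caching of the share;
    # FolderEnumerationMode AccessBased turns on access based enumeration
    New-SmbShare -Name $Name -Path $Path -ScopeName $ScopeName `
        -CachingMode None -FolderEnumerationMode AccessBased
}
```

NTFS permissions are still adjusted separately, as in the wizard steps above.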

 


Powershell script to Auto-Expand StorSimple volume based on amount of free space


StorSimple Hybrid Cloud Storage array is an on-premise iSCSI SAN that extends seamlessly to the cloud. iSCSI volumes provisioned from a StorSimple device can be expanded but cannot be shrunk. So, a typical recommendation here is to start a volume small and grow it as needed. Growing a volume is a process that does not require down time. This script grows a StorSimple volume automatically based on set conditions of volume free space and a not-to-exceed value.

The input region of this script is the one that should be edited by the script user:

Expand01

Similar to the script that monitors StorSimple Backups, the values for SubscriptionName, SubscriptionID, and StorSimpleManagerName variables can be found in the classic Azure Management Interface under your StorSimple Manager node Dashboard and Device pages:

Monitor-StorSimple05

and the RegistrationKey:

Monitor-StorSimple06

and the SSDeviceName (StorSimple Device Name)

Monitor-StorSimple07

The value for the SSVolumeName (StorSimple volume name) variable can be found under the device\volume container:

Expand02

The Notify variable can be either $true or $false. This instructs the script whether or not to send an email notification when an expansion is triggered.

Similarly, the Expand variable can be either $true or $false. This instructs the script whether or not to expand the volume when an expansion is triggered. When set to $false (and Notify is set to $true) and an expansion is triggered, the script will send an email notification that an expansion is triggered but will not do the actual expansion.

ExpandThresholdGB and ExpandThresholdPercent variables are used by the script to identify the amount of free space on the volume below which a volume expansion is triggered. Only one of these variables is needed. If both are provided the script will use the larger value.

  • Example 1: If the volume size is 100 GB, and the ExpandThresholdGB is set to 10 (GB) and the ExpandThresholdPercent is set to 15 (%), the script will trigger a volume expansion if the amount of free space is at or below 15 GB
  • Example 2: If the volume size is 100 GB, and the ExpandThresholdGB is set to 10 (GB) and the ExpandThresholdPercent is set to 5 (%), the script will trigger a volume expansion if the amount of free space is at or below 10 GB

Similarly, the ExpandAmountGB and ExpandAmountPercent variables instruct the script on how much to expand the volume once expansion is triggered. Only one of these variables is needed. If both are provided the script will use the larger value.

  • Example 1: If the volume size is 100 GB, and the ExpandAmountGB is set to 10 (GB) and the ExpandAmountPercent is set to 15 (%), the script will expand the volume by 15 GB once expansion is triggered.
  • Example 2: If the volume size is 100 GB, and the ExpandAmountGB is set to 10 (GB) and the ExpandAmountPercent is set to 5 (%), the script will expand the volume by 10 GB once expansion is triggered.

The value assigned to the variable NotToExceedGB is used by the script as the maximum volume size that the script must not exceed. For example, if the prior 4 variables instruct the script to expand a 900 GB volume by an additional 200 GB and the NotToExceedGB variable is set to 1024 (1 TB), the script will expand the volume by only 124 GB, reaching the NotToExceedGB amount without exceeding it.
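The decision logic described above can be sketched as a small function. This is an illustration of the rules as stated in this post, not an excerpt from the script: the function name and the output property names are made up, while the parameter names mirror the script variables.

```powershell
function Get-ExpansionPlan {
    # Hypothetical helper illustrating the threshold, amount, and not-to-exceed rules
    param(
        [Parameter(Mandatory)][double]$VolumeSizeGB,
        [Parameter(Mandatory)][double]$FreeSpaceGB,
        [double]$ExpandThresholdGB = 0,
        [double]$ExpandThresholdPercent = 0,
        [double]$ExpandAmountGB = 0,
        [double]$ExpandAmountPercent = 0,
        [Parameter(Mandatory)][double]$NotToExceedGB
    )

    # If both threshold values are provided, the larger one wins
    $ThresholdGB = [math]::Max($ExpandThresholdGB, $VolumeSizeGB * $ExpandThresholdPercent / 100)
    $Triggered   = $FreeSpaceGB -le $ThresholdGB

    $ExpandByGB = 0
    if ($Triggered) {
        # If both amount values are provided, the larger one wins
        $ExpandByGB = [math]::Max($ExpandAmountGB, $VolumeSizeGB * $ExpandAmountPercent / 100)
        # Cap the expansion so the new size never exceeds NotToExceedGB
        $ExpandByGB = [math]::Min($ExpandByGB, [math]::Max(0, $NotToExceedGB - $VolumeSizeGB))
    }

    [PSCustomObject]@{
        ThresholdGB = $ThresholdGB
        Triggered   = $Triggered
        ExpandByGB  = $ExpandByGB
        NewSizeGB   = $VolumeSizeGB + $ExpandByGB
    }
}

# The 900 GB example above: a requested 200 GB expansion is capped by NotToExceedGB
Get-ExpansionPlan -VolumeSizeGB 900 -FreeSpaceGB 50 -ExpandThresholdGB 100 -ExpandAmountGB 200 -NotToExceedGB 1024
```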

DiskNumber and DriveLetter are values that the script user should obtain from the server’s Disk Management screen of the file server using this iSCSI volume:

Expand05

As of the time of writing this post and script (1 April 2016), there’s no way to correlate a volume on a file server to a volume on a StorSimple device. For example, if you create 3 volumes of the same size on a StorSimple device and call them data1, data2, and data3, and present them to the same file server and format them with the same file system and block size, and use volume labels data1, data2, data3, there’s no way to tell if data1 on the StorSimple device is the volume labeled data1 on the file server. This is why it’s recommended to provision and format StorSimple volumes one at a time and use the same volume label when formatting the volume as the volume name on StorSimple. Long story short, it’s the user’s responsibility to:

  1. Make sure the DriveLetter and DiskNumber correspond to the SSVolumeName, and
  2. Update the DriveLetter and DiskNumber values if they change on the file server due to adding or removing volumes.

One last point here: if this iSCSI volume is presented to a Windows Failover cluster, this script must be run on the owner node.

LogFile is the path to where the script will log its actions – each log line will be time stamped. This could be on a network share.

EmailSender is the name and email address you wish to have the email notification appear to come from. For example: StorSimple Volume Size Monitor <DoNotReply@YourDomain.com>

$EmailRecipients = @(
'Sam Boutros <sboutros@vertitechit.com>'
'Your Name <YourName@YourDomain.com>'
)

is an array that takes one or more email addresses in the format shown above.

SMTPServer is your SMTP relay server. You need to make necessary configuration/white-listing changes to allow your SMTP server to accept and relay SMTP email from the server running the script.

Sample script output:

Expand06

another example:

Expand04

and an example of the email notification:

Expand07

Possible future enhancements to this script include:

  1. Rewrite the script as a function so that it can handle several volumes
  2. Rewrite the script to use Powershell remoting, so that it does not have to run on the file server.
  3. Add functionality to detect if the target file server is a member of a failover cluster, and to automatically target the owner node.

 

 


Troubleshooting StorSimple high latency IO’s blocking low latency IO’s


By design StorSimple hybrid cloud storage tiers off automatically the oldest blocks from the local SSD tier down to the SAS tier as the SSD tier fills up (reaches ~80% capacity). In turn it also tiers down the oldest blocks from the SAS tier to the Azure tier as that fills up (reaches ~80% capacity).

This has the great benefits of:

  1. Automated tiering: This negates the need for data classification and the entirety of the efforts associated with that.
  2. Granular tiering: Tiering happens at the block level not at the file level. That’s 64KB for tiered volumes. So, a file can have some hot blocks in SSD, some older blocks in SAS, and some cold blocks that have been displaced all the way down to the Azure tier by warmer blocks (of the same or other files)

As of the time of writing this post (28 March 2016), tiering is fully automated and not configurable. The exception is the ‘Locally Pinned Volume’ feature that comes with StorSimple software update 2.0 (17673) and above. A locally pinned volume loses the deduplication and compression features of a ‘Tiered Volume’, and always resides on the physical device. Currently no visibility is provided as to which tier a Locally Pinned Volume resides on (SSD or SAS).

In the following scenario – take the example of an 8100 StorSimple device that has 15.8 TB local usable capacity (prior to deduplication and compression):

  1. Customer creates a handful of volumes – about 30 TB provisioned out of 200 TB max allowed on the device, migrates some 25 TB of data:
    Capacity02
    The above ‘Primary’ capacity graph shows about 25 TB of data as it appears to the SMB file servers that consume the iSCSI volumes, while the below ‘Device’ capacity graph shows that about 10 TB of that 25 TB resides on the same device for the same time period.
    Capacity01
  2. Customer does an archive data dump, such as 2 TB of old backup or archive files. Any new data comes in as hot and in a ‘full’ device, it will displace older blocks to Azure. In this case, we have several TB of active production data that got inadvertently displaced to Azure. The following access pattern is observed:
    1. End user attempts to retrieve files. If the file blocks are in Azure, they will be retrieved, but to make room for them in the SSD tier, other blocks have to be tiered down to the full SAS tier, which will have to tier off blocks back to Azure to make room for blocks coming down from SSD. So, a read operation has caused 2 tiering operations including a write operation to Azure. This is described as a high latency IO operation.
    2. If this is taking several minutes, during the period where the device is handling the high latency IO’s described above, if other users are requesting files that RESIDE ENTIRELY LOCALLY on the device (described as low latency IO operations), it has been observed that those read requests are slowed down to a crawl as well. That is, high latency IO’s appear to block low latency IO’s.
    3. So in this scenario, a 2 TB archive data dump on an 8100 device with 10 TB on the device results in the entire 10 TB being shuffled out to Azure and back in, a few blocks at a time, until the 2 TB archive data ends up in Azure, returning the device to its pre-incident status.

In my opinion, this is a situation to be avoided at all costs. Once it occurs, the device may exhibit very slow performance that may last for weeks until the archive data dump has made its way through the rest of the data on the device to Azure.

Best practices recommended to avoid this scenario:

  1. Adhere to the recommended device use cases, particularly unstructured data/file shares. StorSimple is not meant for multi-terabyte high performance SQL databases for example. Another example that is not recommended as a workload on StorSimple is large PST files. They’re essentially database files that are accessed frequently, and get scanned, indexed and accessed in their entirety.
  2. Do not run any workload or process that scans the active data set in its entirety. Anti-virus and anti-malware scans must be configured for incremental use or quick scans only, never for a full scan of all files on a volume. This applies to any process that may try to index, categorize, classify, or read all files on a volume. The exception is a process or application that reads file metadata and properties only, without opening the files and reading their contents. Reading metadata is OK because metadata always resides locally on the device.
  3. Carefully plan your data migration to StorSimple, putting emphasis on migrating the oldest data first. Robocopy can be a very helpful tool in the process.
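One way to approach an oldest-first migration with Robocopy is to run several passes with a decreasing /MINAGE value, so that each pass only picks up files older than the given number of days. The sketch below only builds the command lines for review and does not run them. The source path, destination path, and age cut-offs are all hypothetical, and /MINAGE works on file age rather than on last access.

```powershell
# Hypothetical paths and age cut-offs; adjust to the data set being migrated
$Source      = '\\OldFileServer\Sales'
$Destination = 'S:\Sales'
$MinAgeDays  = 1095, 365, 90, 0     # oldest data first; 0 means a final pass with no age filter

$Commands = foreach ($Days in $MinAgeDays) {
    # /MINAGE:n excludes files newer than n days; the last pass copies whatever remains
    $AgeFilter = if ($Days -gt 0) { "/MINAGE:$Days" } else { '' }
    ('robocopy "{0}" "{1}" /E /R:1 /W:1 {2}' -f $Source, $Destination, $AgeFilter).Trim()
}

$Commands
```

By default Robocopy skips files that already exist unchanged at the destination, so later passes only copy what earlier passes filtered out.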

I’m adding the following enhancements to my wishlist that I hope to see implemented by Microsoft in the next StorSimple software release:

  • Resolving the core issue of high latency IO’s seeming to block/impede low latency IO’s
  • More visibility into the device tiering metrics. Simply put, a storage admin needs to know when a StorSimple device is ‘full’ and is tiering off blocks from the primary data set to Azure. This knowledge is critical to avoid the situation described above. A metric of the amount of space available before the device is full, is even better to help provide predictability before reaching that point.
  • ‘Cloud Pinned Volume’ feature would be very helpful. This should allow the StorSimple storage admin to provision an iSCSI volume that resides always in Azure and does not affect the device heat map.

Powershell script to re-hydrate StorSimple files based on date last accessed


In some rare situations, a StorSimple hybrid cloud storage device can reach a point where a large cold data dump has displaced hot data to the cloud (Azure). This happens if the device local SSD and SAS tiers are full (including reserved space that cannot be used for incoming data blocks from the iSCSI interfaces). In this situation, most READ requests will be followed by Azure WRITE requests. What’s happening is that the device is retrieving the requested data from Azure, and to make room for it on the local tiers it’s displacing the coldest blocks back to Azure. This may result in poor device performance especially in situations where the device bandwidth to/from the Internet is limited.

In the scenario above, if the cold data dump occurred 8 days ago for example, we may be interested in re-hydrating files last accessed in the week prior to that point in time. This PowerShell script does just that. It identifies files under a given directory based on date last accessed, and reads them. By doing so, the StorSimple device brings these files to the top SSD tier. This is meant to run off hours, and is tested to improve file access for users coming online the next day.

To use this script, modify the value of the $FolderName variable. This is where the script looks for files to re-hydrate. The script searches all sub-folders.

Rehydrate2

Also modify the values of the $StartDays and $EndDays variables. As shown in the example above, the selection of 15 StartDays and 9 EndDays will re-hydrate data whose LastAccessTime was 9-15 days ago.

Script output may look like:

Rehydrate1

As usual, a log file is generated containing the same output displayed on the console. This is helpful if the script will be run as a scheduled task or job.
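The core of the approach, selecting files by LastAccessTime and reading them end to end, can be sketched as follows. This is a simplified illustration rather than the script itself: the two function names and the sample folder are hypothetical, logging is left out, and only the $FolderName, $StartDays, and $EndDays names come from the description above.

```powershell
function Get-FilesToRehydrate {
    # Returns files whose LastAccessTime falls between StartDays and EndDays ago
    param(
        [Parameter(Mandatory)][string]$FolderName,
        [int]$StartDays = 15,
        [int]$EndDays = 9
    )
    $Oldest = (Get-Date).AddDays(-$StartDays)
    $Newest = (Get-Date).AddDays(-$EndDays)
    Get-ChildItem -Path $FolderName -Recurse -File |
        Where-Object { $_.LastAccessTime -ge $Oldest -and $_.LastAccessTime -le $Newest }
}

function Read-FileFully {
    # Reads a file from start to end and discards the data
    param([Parameter(Mandatory)][string]$Path)
    $Buffer = New-Object byte[] 1048576    # 1 MB read buffer
    $Stream = [System.IO.File]::OpenRead($Path)
    try {
        while ($Stream.Read($Buffer, 0, $Buffer.Length) -gt 0) { }
    }
    finally {
        $Stream.Dispose()
    }
}

$FolderName = 'D:\Shares\Sales'    # hypothetical folder on a StorSimple volume
if (Test-Path $FolderName) {
    Get-FilesToRehydrate -FolderName $FolderName -StartDays 15 -EndDays 9 |
        ForEach-Object { Read-FileFully -Path $_.FullName }
}
```

The file list is collected before any file is opened, since reading a file may itself update its LastAccessTime.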