Archive for July, 2018

Azure storage – features and pricing – June 2018


In the ever evolving Azure storage list of offerings, it may be hard to fully realize the available Azure storage offerings and their general cost structure at a given point in time. This post lays out a general summary of Azure storage offerings and costs, from the prospective of a consultant trying to make a recommendation to a large client as what storage type/options to use for which workload and why.

Storage account type

Classic storage

  • This is of type ‘Microsoft.ClassicStorage/storageAccounts’
  • This is considered legacy and should be migrated to GPv1 which provides backward compatibility to Azure classic (ASM) services

General purpose v1 (GPv1)

  • This is of type ‘Microsoft.Storage/storageAccounts’, kind ‘Storage’
  • Does not support Cool and Archive access tier attributes
  • Lower read/write transaction cost compared to GPv2
  • Can be used with Azure classic (ASM) services

General purpose v2 (GPv2)

  • This is of type ‘Microsoft.Storage/storageAccounts’, kind ‘StorageV2’
  • Supports Cool and Archive access tier attributes
  • Access Tier attribute (Hot/Cool) is exposed at the account level
  • Cannot be used with Azure classic (ASM) services
  • Compared to Blob Storage account: GPv2 charge for Cool early deletion but not Cool data writes (per GB)

Blob storage

  • This is of type ‘Microsoft.Storage/storageAccounts’, kind ‘BlobStorage’
  • This is a sub-type of GPv2 that supports only Block Blobs – not Page Blobs, Azure files, …
  • Minor pricing difference: charge for Cool data writes (per GB) but not Cool early deletion
  • Features similar to GPv2:
    • Supports Cool and Archive access tier attributes
    • Access Tier attribute (Hot/Cool) is exposed at the account level
    • Cannot be used with Azure classic (ASM) services

GPv1 and Blob Block accounts can be upgraded to a GPv2 account. This change cannot be reversed.

Set-AzureRmStorageAccount -ResourceGroupName <resource-group> -AccountName <storage-account> -UpgradeToStorageV2

Data Access Tiers

Data Access Tiers are Hot, Cool, and Archive. They represent 3 different physical storage platforms.

For the purpose of simplification, I will refer to LRS pricing in US East Azure region in US dollars for the next 450TB/month.

  • Data Access Tiers are supported in GPv2 accounts only (and the Block Blob storage sub-type) not GPv1
  • A blob cannot be read directly from the Archive tier. To read a blob in Archive, it must first be moved to Hot or Cool.
  • Data in Cool tier is subject to minimum stay period of 30 days. So, moving 1 TB of data to Cool tier for example, obligates the client to paying for that storage for 30 days at a minimum even if the data is moved or deleted before the end of the 30 days. This is implemented via the ‘Early deletion charge’ which is prorated.
  • Data in Archive tier is subject to minimum stay period of 180 days.

Hot

  • This is the regular standard tier where data can be routinely read and written
  • Storage cost is 2 cents per GB per month
  • Lowest read and write IO charges (5 cents per 10k write, 0.4 cents per 10k read)
  • No data retrieval charge (This is the cost to copy data to Hot tier from another tier before it can be read – already in Hot tier)

Cool

  • 25% cheaper storage cost down to 1.5 cents per GB per month
  • More costly read and write IO charges (10 cents per 10k write = 200% Hot, 1 cent per 10k read = 250% Hot)
  • 1 cent per GB data retrieval charge (this is the cost to copy the data to Hot tier, which is a required interim step to read data that resides in Cool tier)

Archive

  • 90% cheaper storage cost down to 0.2 cents per GB per month
  • Most costly read and write IO charges (10 cents per 10k write = 200% Hot, 5 dollars per 10k read = 125,000% Hot)
  • 2 cent per GB data retrieval charge (this is the cost to copy the data to Hot tier, which is a required interim step to read data that resides in Archive tier)
  • The archive tier can only be applied at the blob level, not at the storage account level
  • Blobs in the archive storage tier have several hours of latency (Is this tier using tape not disk!?)

Geo-replication

Geo-replication refers to automatically keeping a copy of a given Storage Account data in another Azure region that’s 400 miles or more away from the primary site. The primary Azure site is the Azure region where the Storage account resides and where the usual read/write transactions occur. The choice of a secondary site is a constant determined by Microsoft and is available if the client chooses the geo-replication feature of a Storage Account. The following is the list of the Azure region pairs as of 18 June 2018

Data in the secondary Azure region cannot be accessed. It is only used under the following conditions:

  • Microsoft declares a region-wide outage. For example East US. This is a Microsoft triggered – not client triggered event
  • Microsoft will first try to restore the service in that region. During that time data is not accessible for read or write. It’s unclear how long will Microsoft pursue this effort before moving to geo-failover.
  • Microsoft initiates geo-failover from the primary to secondary region. That’s all data of all tenants in the primary region.
    • This is essentially a DNS change to re-point the storage end points’ fully qualified domain names to the secondary region
    • The RPO (Recovery Point Objective) is 15 minutes. That’s to say up to 15 minutes worth of data may be lost.
    • The RTO (Recovery Time Objective) is unclear. That’s the time between primary site outage to data availability for read and write on the secondary site
    • At the end of the geo-failover, read/write access is restored (for geo-replicated storage accounts only of course)
    • At a later time, Microsoft will perform a geo-failback which is the same process in the reverse direction
  • This is a process that never happened before. No Azure data center ever sustained a complete loss.
  • It’s unclear when failback will be triggered, whether it will include down-time, or another 15 minute data loss.

A storage account can be configured in one of several choices with respect to geo-replication:

LRS (Locally redundant storage)

When a write request is sent to Azure Storage Account, the transaction is fully replicated on three different physical disks across three fault domains and upgrade domains inside the primary location, then success is returned back to the client.

GRS (Geo-redundant storage)

In addition to triple synchronous local writes, GRS adds triple asynchronous remote writes of each data block.  Data is asynchronously replicated to the secondary Azure site within 15 minutes.

ZRS (Zone-redundant storage)

ZRS is similar to LRS but it provides slightly more durability than LRS (12 9’s instead of 11 9’s for LRS over a given year).

It’s ony available to GPv2 accounts.

RA-GRS (Read access geo-redundant storage)

RA-GRS is similar to GRS but it provides read access to the data in the secondary Azure site.