This is a great feature!

I read Gaurav Mantri’s excellent blog post on copying blobs from S3 to Azure Storage and realised that this was the feature we’d been looking for ourselves for a long time. The new 1.7.1 API enables copying from one subscription to another, within a storage account, across data centres or, as Gaurav has shown, between Azure and any other storage repo accessible over HTTP. Before this was enabled the alternative was to write a tool that read the blob down to a local store and then uploaded it to the other account, generating ingress/egress charges and adding an unwanted third wheel. You need to hit Azure on GitHub – as yet it’s not released as a NuGet package – so go here, clone the repo and compile the source.

More on the code later, but for now let’s consider how this is done.

Imagine we have two storage accounts, elastaaccount1 and elastaaccount2, in different subscriptions, and we need to copy a package from one subscription (elastaaccount1) to the other (elastaaccount2) using the new method described above.

The copy is initiated with an HTTP PUT request carrying the new x-ms-copy-source header, which allows elastaaccount2 to specify the endpoint to copy the blob from. In this instance we’re assuming that there is no security on the source and its ACL is open to the public; if that isn’t the case, a Shared Access Signature should be used instead, which can be generated fairly easily in code against the source account and appended to the URL so that the copy can proceed against a non-publicly accessible blob.

PUT http://elastaaccount2.blob.core.windows.net/vanilla/mypackage.zip?timeout=90 HTTP/1.1
x-ms-version: 2012-02-12
User-Agent: WA-Storage/1.7.1
x-ms-copy-source: http://elastaaccount1.blob.core.windows.net/vanilla/mypackage.zip
x-ms-date: Wed, 04 Jul 2012 16:39:19 GMT
Authorization: SharedKey elastaaccount2:<my shared key>
Host: elastaaccount2.blob.core.windows.net

This operation returns a 202 Accepted. The copy itself is queued and performed asynchronously, so the client then polls using the HEAD method to determine its status. The product team state on their blog that there is currently no SLA, so the request can sit in the queue without any acknowledgement from Microsoft, but in all our tests it has been very, very quick within the same data centre.

HEAD http://elastaaccount2.blob.core.windows.net/vanilla/mypackage.zip?timeout=90 HTTP/1.1
x-ms-version: 2012-02-12
User-Agent: WA-Storage/1.7.1
x-ms-date: Wed, 04 Jul 2012 16:32:16 GMT
Authorization: SharedKey elastaaccount2:<mysharedkey>
Host: elastaaccount2.blob.core.windows.net
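
The useful part of the HEAD response is the set of x-ms-copy-* headers returned for the destination blob; x-ms-copy-status moves from pending to success once the copy completes. The response below is illustrative rather than a captured trace:

HTTP/1.1 200 OK
x-ms-copy-id: <copy id returned by the original PUT>
x-ms-copy-status: success
x-ms-copy-source: http://elastaaccount1.blob.core.windows.net/vanilla/mypackage.zip
x-ms-copy-completion-time: Wed, 04 Jul 2012 16:39:27 GMT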

Bless the Fabric – this is an incredibly useful feature. Here is a class that might help save you some time now.


using System;
using System.Threading;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

///<summary>
/// Used to define the properties of a blob which should be copied to or from
/// </summary>
public class BlobEndpoint
{
 ///<summary>
 /// The storage account name
 /// </summary>
 private readonly string _storageAccountName = null;
 ///<summary>
 /// The container name
 /// </summary>
 private readonly string _containerName = null;
 ///<summary>
 /// The storage key which is used to access the storage account
 /// </summary>
 private readonly string _storageKey = null;
 ///<summary>
 /// Used to construct a blob endpoint
 /// </summary>
 public BlobEndpoint(string storageAccountName, string containerName = null, string storageKey = null)
 {
   _storageAccountName = storageAccountName;
   _containerName = containerName;
   _storageKey = storageKey;
 }

///<summary>
/// Used to copy a blob to a particular destination blob endpoint - this is a blocking call
/// </summary>
public int CopyBlobTo(string blobName, BlobEndpoint destinationEndpoint)
{
   var now = DateTime.Now;
   // get a reference to the source blob
   var sourceBlob = GetCloudBlob(blobName, this);
   // get a reference to the destination blob
   var destinationBlob = GetCloudBlob(blobName, destinationEndpoint);
   // start the copy on the destination, which pulls the blob across from the source
   destinationBlob.StartCopyFromBlob(sourceBlob);
   // make this call block so that we can check the time the copy takes
   // a copy within the same data centre should be very quick even though it's queued, but be defensive anyway
   const int seconds = 120;
   int count = 0;
   while (count < (seconds * 2))
   {
      // refresh the attributes so that CopyState reflects the current status of the copy
      destinationBlob.FetchAttributes();
      // if the copy has succeeded we want to drop out straight away
      if (destinationBlob.CopyState.Status == CopyStatus.Success)
         break;
      Thread.Sleep(500);
      count++;
   }
   // calculate the time taken and return
   return (int)DateTime.Now.Subtract(now).TotalSeconds;
}

///<summary> 
/// Used to determine whether the blob exists or not
/// </summary>
public bool BlobExists(string blobName)
{
   // get the cloud blob
   var cloudBlob = GetCloudBlob(blobName, this);
   try
   {
      // FetchAttributes will throw if the blob doesn't exist - this is the only way to test
      cloudBlob.FetchAttributes();
   }
   catch (Exception)
   {
      // we should check for a specific storage exception here but chances are it will be okay otherwise - that's defensive programming for you!
      return false;
   }
   return true;
}

///<summary> 
/// The storage account name
/// </summary>
public string StorageAccountName
{
   get { return _storageAccountName; }
}

///<summary> 
/// The name of the container the blob is in
/// </summary>
public string ContainerName
{
   get { return _containerName; }
}

///<summary>
/// The key used to access the storage account
/// </summary>
public string StorageKey
{
   get { return _storageKey; }
}

///<summary> 
/// Used to pull back the cloud blob that should be copied from or to
/// </summary>
private static CloudBlob GetCloudBlob(string blobName, BlobEndpoint endpoint)
{
   string blobClientConnectString = String.Format("http://{0}.blob.core.windows.net", endpoint.StorageAccountName);
   CloudBlobClient blobClient = null;
   if(endpoint.StorageKey == null)
     blobClient = new CloudBlobClient(blobClientConnectString);
   else
   {
      var account = new CloudStorageAccount(new StorageCredentialsAccountAndKey(endpoint.StorageAccountName, endpoint.StorageKey), false);
      blobClient = account.CreateCloudBlobClient();
   }
   return blobClient.GetBlockBlobReference(String.Format("{0}/{1}", endpoint.ContainerName, blobName));
 }
}

The class itself should be fairly self-explanatory and, as written, should only be used with public ACLs, although modifying it to generate a time-limited Shared Access Signature for the source blob is trivial.
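
For the non-public case, a rough sketch of what that modification might look like with the 1.7 StorageClient library is shown below - the account, container and blob names are purely illustrative and this isn't wired into the class above:

// a minimal sketch: generate a read-only SAS for the source blob and build a URI
// that can be passed as the copy source for a non-public container
var credentials = new StorageCredentialsAccountAndKey("elastaaccount1", "<secret primary key>");
var account = new CloudStorageAccount(credentials, false);
var blobClient = account.CreateCloudBlobClient();
var sourceBlob = blobClient.GetBlockBlobReference("vanilla/mypackage.zip");
// give the SAS an hour of validity, which should be plenty for the copy to complete
string sas = sourceBlob.GetSharedAccessSignature(new SharedAccessPolicy
{
   Permissions = SharedAccessPermissions.Read,
   SharedAccessExpiryTime = DateTime.UtcNow.AddHours(1)
});
// the signature starts with a '?' so it can be appended straight onto the blob URI
string copySource = sourceBlob.Uri.AbsoluteUri + sas;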

A simple test would be as follows:

  var copyToEndpoint = new BlobEndpoint("elastaaccount2", ContainerName, "<secret primary key>");
  var endpoint = new BlobEndpoint("elastaaccount1", ContainerName);
  bool existsInSource = endpoint.BlobExists(BlobName);
  bool existsFalse = copyToEndpoint.BlobExists(BlobName);
  endpoint.CopyBlobTo(BlobName, copyToEndpoint);
  bool existsTrue = copyToEndpoint.BlobExists(BlobName);

There you go. Happy trails etc.
