Tuesday, 11 February 2014

DR: vSphere to AWS using vSphere snapshots and S3 sync


Disaster Recovery has always been a hot topic for business, and for the longest time DR was an exclusive feature for those with huge IT budgets and expensive SANs doing replication at the array level.

Fortunately things have moved on and there are a number of purely software-based technologies which bring disaster recovery functionality within reach of even small users; for VMware-based VMs there are many options including vSphere Replication, Site Recovery Manager, Veeam and Zerto.

I'm a big fan of both VMware and Amazon Web Services and the lack of any useful product to link these technologies together has often been a point of discussion in client meetings; an on-site VMware solution needs somewhere else to replicate to - AWS should be a perfect fit for this!

At $dayjob we've often quoted customers for just this, based around Veeam backups. However, I have often played through a scenario in my head that would allow regular vSphere snapshots to be replicated to a 'pilot light' environment in AWS using only native tools.

Last weekend, I made it happen.


In my head at least, it's quite straightforward. Take an initial snapshot, replicate it to AWS and then make use of vSphere Changed Block Tracking so that with every future snapshot, only the differences between each snapshot will need to be replicated, thus vastly reducing replication overhead when I need to get that data into AWS!

Step One: Take a snapshot

I use VMware's PowerCLI PowerShell cmdlets to connect to the vCenter Server (a single ESXi host should also work) and take a snapshot of the VM I want to protect.

The first time I do this, I pass the 'QueryChangedDiskAreas' function a ChangeId of "*" - vSphere takes this to mean 'give me a list of all the active extents for that disk'.

Many thanks to Scott Herold, the VMguru, for this great post which told me how QueryChangedDiskAreas worked and how to get the ChangeId for each virtual hard disk!

Scott does a great job of writing function-based code that handles errors. For brevity, the necessary code can be distilled right down to this:

 Connect-VIServer -Server ... -User ... -Password ....
 $vm = Get-VM -Name dc02  
 $vmview = $vm | Get-View  
 $newSnap = $vm | New-Snapshot -Name "Karma Sync"  
 $newSnapview = $newSnap | Get-View  
 $changes=$vmview.QueryChangedDiskAreas($newSnapview.MoRef, 2000, 0, "*")  
 $changes.ChangedArea | Format-Table -HideTableHeaders > thefile.txt  
 $thischange=($newsnapview.Config.Hardware.Device | where { $_.Key -eq 2000 }).backing.changeid  
 #$newSnap | Remove-Snapshot -Confirm:$false #run at end of Step Two
To give a little context, 'thefile.txt' will contain something like this after running the PowerShell code:

              0       2097152  
        7340032      15728640  
       31457280       5242880  
      104857600      23068672  
      133169152       8388608  
      145752064      24117248  
      173015040      176160768  
      351272960      49283072  
      401604608      319815680  
      724566016       1048576  
      728760320       5242880  
      735051776     2194669568  
     2930769920      328204288  
     3260022784       8388608  
     3280994304      108003328  
     3536846848     2985295872  
     6525288448       6291456  
     6532628480     1857028096  
     8398045184       8388608  
     8412725248      15728640  
     8431599616      20971520  
     8455716864       3145728  
     8459911168      11534336  
     8474591232       7340032  
     8484028416       5242880  
     10209984512      158334976  
     42947575808       1048576  

This list describes the byte offsets and data lengths of every area of the vmdk which is actually in use - this way we don't need to transfer every byte of the hard disk (most of which will be zeroes) 
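
To get a feel for how much data such a list represents, here is a minimal Python sketch (not part of my original tooling) which parses the 'offset length' pairs and totals them. The function names are my own invention, and the sample is just the first three lines of the list above.

```python
def parse_extents(text):
    """Parse 'offset length' pairs, one per line, ignoring anything else."""
    extents = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            extents.append((int(fields[0]), int(fields[1])))
    return extents


def total_bytes(extents):
    """Sum the lengths of all extents."""
    return sum(length for _, length in extents)


# First three lines of the sample list above
sample = """
              0       2097152
        7340032      15728640
       31457280       5242880
"""

DISK_SIZE = 40 * 1024**3  # the 40GiB disk used in this post

extents = parse_extents(sample)
in_use = total_bytes(extents)
print(f"{len(extents)} extents, {in_use} bytes ({in_use / DISK_SIZE:.4%} of the disk)")
```

Running the same thing over the whole of 'thefile.txt' gives the size of the initial sync; on later runs it gives the size of each delta.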

Step Two: Retrieve the block payload

thefile.txt tells us the parts of the disk we should be interested in and we now need to get those bytes. I amended the 'vixDiskLibSample.cpp' sample program which comes with VMware's Virtual Disk Development Kit.

I edited the 'DoDump' function which allowed you to specify an offset and byte count for it to spit out a hex-dump of that portion of the disk. I removed the hex-dump feature and changed the function so it would instead write a series of files to the current directory.

For every line of 'thefile.txt', the DoDump function will now write a file whose name is the byte offset. The length of that file is, you guessed it, exactly the size of the change.

After running the code, you end up with an 'ls -l' output like this:
-rw-r--r-- 1 root root   2097152 Feb  9 22:21 0
-rw-r--r-- 1 root root  15728640 Feb  9 22:21 7340032
-rw-r--r-- 1 root root   5242880 Feb  9 22:21 31457280
-rw-r--r-- 1 root root  23068672 Feb  9 22:21 104857600
-rw-r--r-- 1 root root   8388608 Feb  9 22:21 133169152
-rw-r--r-- 1 root root  24117248 Feb  9 22:21 145752064
-rw-r--r-- 1 root root 176160768 Feb  9 22:21 173015040
... and so on
You should now remove the vSphere snapshot using the PowerCLI 'Remove-Snapshot' cmdlet
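
If you want to experiment with the extent-to-file idea without building the VDDK sample, the same logic can be sketched in Python against a local raw image. This is an assumption-laden stand-in: it reads from an ordinary file rather than through VixDiskLib, and 'dump_extents' plus the demo paths are hypothetical names.

```python
import os
import tempfile

CHUNK = 64 * 1024  # read in 64KiB pieces, as the patched DoDump does


def dump_extents(image_path, extents, out_dir):
    """Write each (offset, length) extent to a file named after its offset."""
    os.makedirs(out_dir, exist_ok=True)
    with open(image_path, "rb") as image:
        for offset, length in extents:
            image.seek(offset)
            remaining = length
            with open(os.path.join(out_dir, str(offset)), "wb") as out:
                while remaining > 0:
                    data = image.read(min(CHUNK, remaining))
                    if not data:  # hit end of image early
                        break
                    out.write(data)
                    remaining -= len(data)


# Demo against a small throwaway image (256KiB of patterned data)
work = tempfile.mkdtemp()
image_path = os.path.join(work, "disk.raw")
with open(image_path, "wb") as f:
    f.write(bytes(range(256)) * 1024)

out_dir = os.path.join(work, "delta")
dump_extents(image_path, [(0, 65536), (131072, 65536)], out_dir)
print(sorted(os.listdir(out_dir), key=int))
```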

Step Three: Sync to AWS

Amazon Web Services recently launched a unified CLI tool to bring together a ton of individual tools. This is great news since the configuration and usage for each of those tools was different - you'd end up with a whole stack of Python, Ruby and Java and duplicated configs. Oh dear lord so much Java. I digress.

I create an S3 bucket called 'gdh-vm-replication' in which to store the change sets and then run 'aws s3 sync' to push the changes up:

# aws s3 mb s3://gdh-vm-replication/
# aws s3 sync ./ s3://gdh-vm-replication/
upload: ./10266279936 to s3://gdh-vm-replication/10266279936
upload: ./10223616 to s3://gdh-vm-replication/10223616
upload: ./10255335424 to s3://gdh-vm-replication/10255335424
Completed 3 of 508 part(s) with 497 file(s) remaining
... and so on

Step Four: Sync to disk file

Now in AWS, I have an EC2 instance with an initially empty directory /tmp/spool. I'm dealing with a 40GiB VMware disk, which means 40960MiB (i.e. power-of-two mebibytes rather than marketing megabytes), so I created a 40GiB sparse file just to get things going:
# dd if=/dev/zero of=/mnt/dc02.raw bs=1M seek=40959 count=1
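
As a sanity check on that dd arithmetic: seeking past 40959 blocks of 1MiB and writing one more gives 40960MiB. The sketch below does the sum and shows an equivalent way to make a sparse file from Python. 'create_sparse' is a hypothetical helper, and the demo uses a small 64MiB file in a temp directory rather than /mnt/dc02.raw.

```python
import os
import tempfile

MIB = 1024**2

# dd bs=1M seek=40959 count=1 writes block number 40959, so the file ends at:
dd_size = (40959 + 1) * MIB


def create_sparse(path, size_bytes):
    """Create a file of the given logical size without writing any data blocks."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)


demo_path = os.path.join(tempfile.mkdtemp(), "demo.raw")
create_sparse(demo_path, 64 * MIB)
print(dd_size, os.path.getsize(demo_path))
```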
The instance performs two simple tasks in normal operation:

  • Pull down the changes:
    # aws s3 sync s3://gdh-vm-replication/ /tmp/spool/
  • Apply those changes to a local disk file:
    # ls -1 /tmp/spool | awk '{print "dd if=/tmp/spool/" $1 " seek=" $1/65536 " of=/mnt/dc02.raw bs=64k conv=notrunc"}' | sh

The 'conv=notrunc' matters: without it, dd truncates the output file at the seek offset before writing, and 'ls' does not list the offsets in ascending numeric order, so later writes would chop off data written earlier.

Using the same 'thefile.txt' as before, the output that gets piped into 'sh' looks like this:
dd if=/tmp/spool/0 seek=0 of=/mnt/dc02.raw bs=64k conv=notrunc
dd if=/tmp/spool/7340032 seek=112 of=/mnt/dc02.raw bs=64k conv=notrunc
dd if=/tmp/spool/31457280 seek=480 of=/mnt/dc02.raw bs=64k conv=notrunc
dd if=/tmp/spool/104857600 seek=1600 of=/mnt/dc02.raw bs=64k conv=notrunc
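
The 'apply' step boils down to 'write each spool file into the image at the offset given by its name, without truncating the image'. Here is a minimal Python sketch of that idea. It is not what I ran; 'apply_spool' and the demo paths are hypothetical, and the demo uses a 1MiB image in a temp directory.

```python
import os
import tempfile


def apply_spool(spool_dir, image_path):
    """Write every spool file into the image at the byte offset in its name."""
    # Sort numerically so the order is predictable (ls sorts as text)
    names = sorted((n for n in os.listdir(spool_dir) if n.isdigit()), key=int)
    with open(image_path, "r+b") as image:  # r+b updates in place, no truncation
        for name in names:
            image.seek(int(name))
            with open(os.path.join(spool_dir, name), "rb") as chunk:
                while True:
                    data = chunk.read(1024 * 1024)
                    if not data:
                        break
                    image.write(data)
    return names


# Demo: a 1MiB sparse image and two small change files
work = tempfile.mkdtemp()
spool = os.path.join(work, "spool")
os.mkdir(spool)
image = os.path.join(work, "disk.raw")
with open(image, "wb") as f:
    f.truncate(1024 * 1024)
for name, fill in (("65536", b"B"), ("0", b"A")):
    with open(os.path.join(spool, name), "wb") as f:
        f.write(fill * 4096)

applied = apply_spool(spool, image)
print(applied, os.path.getsize(image))
```

The image keeps its full size however many change files are applied, which is the property the raw file needs for the later import.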

    Step Five: Repeat as needed

    Repeat all of the above steps every 15 minutes, or as often as you wish. The only thing of particular note is that back in Step One when you retrieved the value of $thischange, you must ensure that the NEXT time you take a snapshot, you pass that value into the QueryChangedDiskAreas function instead of "*", otherwise you will be syncing the entire disk each time!
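
    One way to make sure the right ChangeId is used on each run is to keep it in a small state file between runs. The sketch below is only an illustration of that bookkeeping in Python: the state file format, the function names and the change ID value are all made up, and the real value would come from $thischange in Step One.

```python
import json
import os
import tempfile

FULL_SYNC = "*"  # what QueryChangedDiskAreas is given on the very first run


def load_change_id(state_path, disk_key):
    """Return the saved change ID for a disk, or '*' if none has been saved."""
    if not os.path.exists(state_path):
        return FULL_SYNC
    with open(state_path) as f:
        return json.load(f).get(str(disk_key), FULL_SYNC)


def save_change_id(state_path, disk_key, change_id):
    """Record the change ID captured from the snapshot just taken."""
    state = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    state[str(disk_key)] = change_id
    with open(state_path, "w") as f:
        json.dump(state, f)


state_path = os.path.join(tempfile.mkdtemp(), "changeids.json")
first = load_change_id(state_path, 2000)               # first run: full sync
save_change_id(state_path, 2000, "example-change-id")  # placeholder value
second = load_change_id(state_path, 2000)              # later runs: delta only
print(first, second)
```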

    Step Six: Invoke disaster recovery

    Ah yes. That nice unified CLI I mentioned earlier? It's not quite completely unified yet, so the older EC2 API tools are still necessary :/
    I hope they will be incorporated into the unified AWS CLI soon - I logged a support case / feature request about that, since the console output of the ec2-import-instance command is an utter fright - look:

    # ec2-import-instance -O AKIAxxxxxxxxxxxxxx  -o AKIAxxxxxxxxxxxxxxx -W cxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxW -w cxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxW -b gdh-vmimport --region eu-west-1 -f RAW /mnt/dc02.raw
    Requesting volume size: 40 GB
    TaskType        IMPORTINSTANCE  TaskId  import-i-fh6yum0c       ExpirationTime  2014-02-16T22:36:58Z    Status  active  StatusMessage   Pending     InstanceID      i-760d4437
    DISKIMAGE       DiskImageFormat RAW     DiskImageSize   42949672960     VolumeSize      40      AvailabilityZone        eu-west-1c      ApproximateBytesConverted   0       Status  active  StatusMessage   Pending
    Creating new manifest at gdh-vmimport/41f1f52b-62e8-45d4-a3f9-9d3de98b5749/image.rawmanifest.xml
    Uploading the manifest file
    Uploading 42949672960 bytes across 4096 parts
    0% |--------------------------------------------------| 100%
    The bucket 'gdh-vmimport' is where ec2-import-instance stores a manifest file detailing the S3 location of each of the 4096 10MiB chunks of data which AWS will use to 'Amazonify' your disk image.
    This takes ages to upload - the 40GiB took approx 45 minutes on an m3.medium. After that, Amazon will boot the instance behind the scenes and install EC2Config and the Citrix PV SCSI and network drivers.

    This takes approximately 15 minutes. That brings the minimum achievable 'Recovery Time Objective' to no less than one hour, which is a bit sucky.
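
    The numbers behind that estimate are easy to check. This is just arithmetic on the figures quoted above (image size and part count from the ec2-import-instance output, my observed timings), written out in Python:

```python
IMAGE_BYTES = 42949672960   # DiskImageSize from the import output
PARTS = 4096                # "Uploading ... across 4096 parts"
UPLOAD_MINUTES = 45         # observed upload time on an m3.medium
CONVERT_MINUTES = 15        # observed conversion time

MIB = 1024**2

part_size_mib = IMAGE_BYTES / PARTS / MIB
upload_mib_per_s = IMAGE_BYTES / MIB / (UPLOAD_MINUTES * 60)
rto_minutes = UPLOAD_MINUTES + CONVERT_MINUTES

print(f"part size: {part_size_mib:.0f} MiB")
print(f"upload rate: {upload_mib_per_s:.1f} MiB/s")
print(f"minimum RTO: {rto_minutes} minutes")
```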

    Instance i-760d4437 (mentioned in the output from ec2-import-instance) will be ready to start - you did remember to open port 3389 in the Windows host firewall, didn't you? ;)

    All comments welcome - this is a Proof of Concept piece of work and while it works, you'll want to add a wrapper for monitoring, reporting, alerting and configuration.

    Code reference

    Here's a patch of the changes I made to VMware's vixDiskLibSample application. I am not a C++ programmer!

     --- vixDiskLibSample.cpp.orig  2014-02-08 01:45:58.289800491 +0000  
     +++ vixDiskLibSample.cpp    2014-02-08 16:04:53.599361218 +0000  
     @@ -17,6 +17,7 @@  
      #include <sys/time.h>  
     +#include <fstream>  
      #include <time.h>  
      #include <stdio.h>  
      #include <stdlib.h>  
     @@ -1300,19 +1301,37 @@  
        VixDisk disk(appGlobals.connection, appGlobals.diskPath, appGlobals.openFlags);  
     -  uint8 buf[VIXDISKLIB_SECTOR_SIZE];  
     +  uint8 buf[65536];  
        VixDiskLibSectorType i;  
     +  std::ifstream infile("thefile.txt");  
     -  for (i = 0; i < appGlobals.numSectors; i++) {  
     -    VixError vixError = VixDiskLib_Read(disk.Handle(),  
     -                      appGlobals.startSector + i,  
     -                      1, buf);  
     -    CHECK_AND_THROW(vixError);  
     -    DumpBytes(buf, sizeof buf, 16);  
     +  using namespace std;  
     +  long long a, b;  
     +  while (infile >> a >> b)  
     +  {  
     +    appGlobals.startSector = a / 512;  
     +    appGlobals.numSectors = b / 65536;  
     +    char outfilename[40];  
     +    int n;  
     +    n=sprintf (outfilename, "delta/%llu", a);  
     +    std::ofstream outputBufferHere(outfilename, std::ios::binary|std::ios::out);  
     +    for (i = 0; i < appGlobals.numSectors; i++) {  
     +      VixError vixError = VixDiskLib_Read(disk.Handle(),  
     +                      appGlobals.startSector + (i * 128),  
     +                      128, buf);  
     +      CHECK_AND_THROW(vixError);  
     +//      DumpBytes(buf, sizeof buf, 16);  
     +      outputBufferHere.write((const char*)buf, 65536);  
     +    }  
     +    outputBufferHere.close();  
     +  infile.close();  

    Friday, 17 January 2014

    Masterless Puppet on Windows in AWS using Elastic Beanstalk

    Sam Bashton gave a great talk at PuppetConf 2013 about how Bashton, Ltd. deploy Linux-based EC2 instances and configure them using Puppet where all the modules and manifests are 'baked in.'

    In the world of Puppet this is a very convenient way to handle the problem of running a puppet-master and its requisite SSL key signing. Puppet's SSL signing is suitable for running a house of pets, but falls some way short when dealing with a farm of cattle.

    During the Christmas break I tried to reimplement an existing customer's AWS Elastic Beanstalk deployment with Puppet using Sam Bashton's talk as the inspiration. I've now been prompted to write it up after reading Masterzen's great article this morning.

    Current State

    The customer needs to run more than a single application on each EC2 instance - the cost to deploy a separate Beanstalk for each tier is simply too high for them, and their code is not yet ready to consume a single 'service layer.'

    Hence I made use of .ebextensions config files in Elastic Beanstalk to configure multiple IIS websites on each instance. I'll save you from the full travesty of this file (e.g. abusing the HOSTS file to control how the services hostname resolves) and provide this potted version of the 'beanstalk.config':

           "c:/scripts/ipfudge": http://admin01/scripts/ipfudge.zip

           "c:/program files/amazon/AWSCLI": http://admin01/scripts/AWSCLI.zip
           "c:/inetpub/newwebservices": http://admin01/sites/custname.newwebservices.zip 
             content: |
              cd \inetpub\mobile\App_Data
              "C:\program files\amazon\awscli\aws" --region eu-west-1 s3 sync s3://cust-mobile-appdata/ . --delete >%TEMP%\s3sync4.log 2>&1

        cwd: c:\windows\system32\inetsrv
        command: appcmd add apppool /name:"CustomerName.NewWebServices"
        ignoreErrors: true
        waitAfterCompletion: 0 
        cwd: c:\windows\system32\inetsrv
        command: appcmd add site /name:"CustomerName.NewWebServices" /id:2 /bindings:http://web.customername.com:80 /physicalPath:"c:\inetpub\newwebservices"
        ignoreErrors: true
        waitAfterCompletion: 0 
        cwd: c:\windows\system32\inetsrv
        command: appcmd set app "CustomerName.NewWebServices/" /applicationPool:"CustomerName.NewWebServices"
        ignoreErrors: true
        waitAfterCompletion: 0
             cwd: c:\windows\system32
             command: schtasks /create /ru SYSTEM /TN Sync_S3_App_Data /TR "cmd /c c:\scripts\s3-appdata-sync.cmd >%TEMP%\s3sync1.log 2>&1" /SC MINUTE /MO 30 /ST 00:18 /F
             ignoreErrors: true 

             waitAfterCompletion: 0 

             cwd: c:\windows\system32\inetsrv
             command: appcmd set config /section:urlCompression /doDynamicCompression:True
             ignoreErrors: true
             waitAfterCompletion: 0

             cwd: c:\windows\system32\inetsrv
             command: cmd /c c:\scripts\ipfudge\ipfudge.cmd >%TEMP%\ipfudge.log 2>&1
             ignoreErrors: true
             waitAfterCompletion: 0 


    There's a whole ton of pain going on there. Every time new code is deployed through Beanstalk (the customer uses the AWS Toolkit for Visual Studio to publish directly from their development workstations) every command in this Beanstalk configuration will be executed.

    That means at least the following things happen:

    • The c:\inetpub\directory containing the 'main' application at 'Default Web Site' will be emptied and populated according to whatever was pushed from Visual Studio
    • The 'sources' section in the beanstalk.config will remove the destination directory, download the ZIP file from the provided URL and unpack it at the destination directory
    • All of the 'container_commands' will be executed in lexicographical order (hence 'zzfinalxx' to ensure they are run last)
    • The commands in 'ipfudge.cmd' are run each time (a long series of 'appcmd set config' commands to allow/deny access to the NewWebServices site based on IP address)
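
    The ordering trick is plain string sorting. A tiny Python illustration, assuming the names sort the same way Beanstalk orders them: the 'newwebservices0x' names appear in the cfn-init.log, while 'zzfinal01', 'task2' and 'task10' are made-up examples.

```python
commands = ["zzfinal01", "newwebservices02", "task10",
            "newwebservices01", "task2", "newwebservices03"]

# Plain string sort: a 'zz' prefix lands last, and 'task10' sorts before 'task2'
ordered = sorted(commands)
for name in ordered:
    print(name)
```

    The 'task10' before 'task2' case is why zero-padded suffixes like '01', '02' are worth the trouble.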

    The repeated use of 'ignoreErrors: true' makes the associated output in 'cfn-init.log' very noisy, e.g.:

    2014-01-17 09:15:55,444 [ERROR] Command newwebservices01 (appcmd add apppool /name:"CustomerName.NewWebServices") failed
    2014-01-17 09:15:55,444 [INFO] ignoreErrors set to true, continuing build
    2014-01-17 09:15:56,490 [ERROR] Command newwebservices02 (appcmd add site /name:"CustomerName.NewWebServices" /id:2 /bindings:http://web.customername.com:80 /physicalPath:"c:\inetpub\newwebservices") failed
    2014-01-17 09:15:56,490 [INFO] ignoreErrors set to true, continuing build
    2014-01-17 09:15:57,648 [INFO] Command newwebservices03 succeeded


    I would far rather define this in terms of a Puppet manifest - that way only deviations from the described state will be changed, rather than churning through every change every time.
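
    To illustrate what 'only deviations get changed' means, here is a toy Python sketch of desired-state comparison. It is not how Puppet is implemented; the resource names and properties are simplified stand-ins and 'plan_changes' is a hypothetical helper.

```python
def plan_changes(desired, actual):
    """Return (names to create, names to update); matching resources are skipped."""
    to_create = sorted(name for name in desired if name not in actual)
    to_update = sorted(name for name in desired
                       if name in actual and desired[name] != actual[name])
    return to_create, to_update


desired = {
    "apppool:CustomerName.NewWebServices": {"runtime": "v4.0"},
    "site:CustomerName.NewWebServices": {"binding": "http/*:80:web.customername.com"},
}
# Pretend the app pool already exists from an earlier deployment
actual = {
    "apppool:CustomerName.NewWebServices": {"runtime": "v4.0"},
}

to_create, to_update = plan_changes(desired, actual)
print("create:", to_create)
print("update:", to_update)
```

    Compare that with the beanstalk.config approach, where 'appcmd add apppool' runs (and fails noisily) on every deployment.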

    I had something like this in mind:


    First we need to take the Windows Server 2012 AMI used with Elastic Beanstalk and install Puppet and other tools. It's common to bake your own AMI; however, I'm quite keen on avoiding work so I prefer to configure the existing Windows Server 2012 AMI (ami-396a784d at time of writing).

    So, first we need a hugely simplified beanstalk.config:

        Puppet: https://downloads.puppetlabs.com/windows/puppet-3.4.0.msi

      "c:/scripts/gopuppet": http://admin01/scripts/gopuppet.zip
      "c:/programdata/puppetlabs/puppet/etc/modules": http://admin01/scripts/puppetmodules.zip

        cwd: c:\windows\system32
        command: cmd /c powershell -executionpolicy remotesigned -file c:\scripts\gopuppet\gopuppet.ps1 >c:\progra~1\amazon\elasticbeanstalk\logs\puppet.log 2>&1
        ignoreErrors: true
        waitAfterCompletion: 0

    This will download 'gopuppet.zip' and puppetmodules.zip from an existing server in the environment called 'admin01' - the source could just as easily be on S3.

    Redirecting the output for the 'gopuppet' command allows us to see what's going on by using Beanstalk's 'Snapshot Logs' command in the web UI.

    What Am I?

    gopuppet.zip contains a single PowerShell script, gopuppet.ps1:

    import-module "C:\Program Files (x86)\AWS Tools\PowerShell\AWSPowerShell\AWSPowerShell.psd1"
    Set-DefaultAWSRegion eu-west-1

    $my_instance=(New-Object System.Net.WebClient).DownloadString("")


    $my_instance2=New-Object 'collections.generic.list[string]'
    $my_instance2.Add($my_instance)

    $filter= New-Object Amazon.EC2.Model.Filter -Property @{Name = 'resource-id'; Value = $my_instance2}

    #Name                                                        Values
    #----                                                        ------
    #resource-id                                                 {i-beec85f2}

    $my_tags= Get-EC2Tag -filter $filter
    # | where { $_.Key -eq "elasticbeanstalk:environment-name" }
    # the piped where clause works on my home laptop but not here - maybe AWS Tools 3.5 versus 3.0 on the AMI?

    #ResourceId                    ResourceType                  Key                           Value
    #----------                    ------------                  ---                           -----
    #i-beec85f2                    instance                      elasticbeanstalk:environme... customername-taupe

    foreach ($tag in $my_tags)
    {
        if ($tag.Key -eq "elasticbeanstalk:environment-name")
        {
            $envname= $tag.Value
        }
    }


    # Now we know who we are and what environment we're part of, we need to grab the current config for this environment

    $tempfile = Read-S3Object -BucketName cust-deployment-data -Key "$envname.pp" -File "$env:temp\$envname.pp"

    # Check that the file was written to less than 10 million ticks ago (1 second; a tick is 100ns)

    if ((((Get-Date) - $tempfile.LastWriteTime).ticks) -lt 10000000)
    {
     write "file $tempfile was written to recently"

     # Well this is a terrible fudge and it's the only way I can make it work at first-run
     # Using the windows_path module in Puppet doesn't work until the *second* Puppet run
     $env:Path += ";C:\Program Files (x86)\Git\cmd"
     # Finally run puppet and apply this config
     & "C:\Program Files (x86)\Puppet Labs\Puppet\bin\puppet" apply --debug $tempfile

     if ($LASTEXITCODE -ne 0)
     {
        write "Execution failed"
     }
    }
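
    A note on the tick threshold in that script: a .NET tick is 100 nanoseconds, so 10,000,000 ticks is one second. The conversion is easy to get wrong by a factor of ten, so here it is spelled out in Python (the helper name is my own):

```python
from datetime import timedelta

TICKS_PER_SECOND = 10_000_000  # one .NET tick is 100 nanoseconds


def ticks_to_timedelta(ticks):
    """Convert a .NET tick count to a timedelta (10 ticks per microsecond)."""
    return timedelta(microseconds=ticks // 10)


threshold = ticks_to_timedelta(10_000_000)     # the value used in gopuppet.ps1
ten_seconds = ticks_to_timedelta(100_000_000)  # what a 10 second window would need
print(threshold, ten_seconds)
```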


    The puppetmodules.zip contains many off-the-shelf modules from the Puppet Forge:

    I had to hack the download_file module around to fix assumptions made about the location of temporary directories, and also to include a new download_unpack_file function which uses the Explorer Shell to unzip files without needing 7-Zip or other binary dependencies.

    Make It So

    Now we're really getting somewhere and we just need the manifest itself. Here's a simplified version, since the real one is an epic 400 lines.

    # Install Windows features Start
      dism { 'IIS-ASPNET':
        ensure => present,
        require => Dism['IIS-ISAPIFilter'],
      }
      dism { 'IIS-ISAPIFilter':
        ensure => present,
        require => Dism['IIS-ISAPIExtensions'],
      }
      dism { 'IIS-ISAPIExtensions':
        ensure => present,
        require => Dism['IIS-NetFxExtensibility'],
      }
      dism { 'IIS-NetFxExtensibility':
        ensure => present,
        require => Dism['NetFx4Extended-ASPNET45'],
      }
      dism { 'NetFx4Extended-ASPNET45':
        ensure => present,
      }
      file {'c:/scripts':
        ensure          => directory,
        require         => Iis_vdir ["CustName.NewWebServices/"]
      }

    # Install Windows features End

    # NewWebServices Start
      file {'c:/inetpub/newwebservices':
        ensure          => directory,
      }

      download_unpack_file { 'custname.newwebservices.zip':
        url => 'http://admin01/custname.newwebservices.zip',
        destination_directory => 'c:\inetpub\newwebservices',
        temp_directory => 'c:\windows\temp',
        require => File['c:/inetpub/newwebservices'],
      }

      iis_apppool {'CustName.NewWebServices':
        ensure                => present,
        managedpipelinemode   => 'Integrated',
        managedruntimeversion => 'v4.0',
      }

      iis_site {'CustName.NewWebServices':
        ensure          => present,
        bindings        => ["http/*:80:web.custname.com"],
      }

      iis_app {'CustName.NewWebServices/':
        ensure          => present,
        applicationpool => 'CustName.NewWebServices',
      }

      iis_vdir {'CustName.NewWebServices/':
        ensure          => present,
        iis_app         => 'CustName.NewWebServices/',
        physicalpath    => 'c:\inetpub\newwebservices',
        require         => Download_unpack_file["custname.newwebservices.zip"]
      }
    # NewWebServices End

    # S3 Mobile Sync Start
      file {'c:/scripts/s3-mobile-appdata-sync.cmd':
        content => 'cd \inetpub\mobile\App_Data
    "C:\program files\amazon\awscli\aws" --region eu-west-1 s3 sync s3://cust-mobile-appdata/ . --delete >%TEMP%\s3sync4.log 2>&1',
        require => File["c:/scripts"]
      }

     scheduled_task { 'Sync_S3_Mobile_App_Data':
         ensure    => present,
         enabled   => true,
         command   => 'c:\scripts\s3-mobile-appdata-sync.cmd',
         trigger   => {
           schedule   => daily,
           start_time => '00:17'
         },
         require   => File["c:/scripts/s3-mobile-appdata-sync.cmd"]
     }

    # This is a weak fudge because Puppet's scheduled_task support 
    # does not include 'every N minutes'

      exec { 'Sync_S3_Mobile_App_Data':
        command => 'c:\windows\system32\schtasks.exe /change /TN "Sync_S3_Mobile_App_Data" /RI 30',
        subscribe   => Scheduled_task["Sync_S3_Mobile_App_Data"],
        refreshonly => true
      }

    # S3 Mobile Sync End

    # AWS CLI Tools Start
      download_unpack_file { 'AWSCLI.zip':
        url => 'http://admin01/scripts/AWSCLI.zip',
        destination_directory => 'c:\program files\amazon\AWSCLI',
        temp_directory => 'c:\windows\temp',
        require => File['c:/program files/amazon/AWSCLI'],
      }

      file {'c:/program files/amazon/AWSCLI':
        ensure          => directory,
      }

    # AWS CLI Tools End

    # IP Security Fudge Start
      dism { 'IIS-IPSecurity':
        ensure => present,
      }

      exec { 'install-ipfudge':
        command   => "c:\\windows\\temp\\ipfudge.cmd",
        logoutput => true,
        refreshonly => true,
        subscribe => Download_file['ipfudge'],
        require => Dism ['IIS-IPSecurity']
      }

      download_file { 'ipfudge':
        url => 'http://admin01/scripts/ipfudge.cmd',
        destination_directory => 'c:\windows\temp',
        temp_directory => 'c:\windows\temp',
        require => File['c:/scripts'],
      }

    # IP Security Fudge End

    # More weakness since there is no IIS module which can manipulate this data, so we break out to a dirty exec()
      exec { 'Dynamic_Compression':
        command => 'c:\windows\system32\inetsrv\appcmd.exe set config /section:urlCompression /doDynamicCompression:True',
        refreshonly => true
      }


    This was an interesting technical challenge and not one I would put into production for the following reasons:

    • Deployment time was far greater during the 'puppet apply' run. This was typically over three minutes for each new run compared with one minute for the 'do it all' beanstalk.config approach.
    • Initial EC2 instance build was far slower since the Puppet MSI took several minutes to install (I appreciate a pre-baked AMI would help here)
    • Puppet's support for Windows is still sketchy:
      • No scheduled_task 'every N minutes'
      • No manipulation of the IIS config structure
      • Much more verbose definition of IIS sites cf. three 'appcmd' lines
      • Most of the modules on the Forge will not work on Windows
    • No obvious way to get a signal to re-run 'puppet apply' other than uploading or redeploying existing code through Beanstalk. 
      • The full config included several Git-based vcsrepo sources - I wanted to be able to 'bump' the config rather than do a full environment update.
    Comments welcome! :)

    Monday, 26 August 2013

    VMFS-formatted USB sticks in VMware vSphere 5.1

    UPDATE: vSphere 5.5 behaves in the same way - still no solution other than trying to find older USB keys on eBay and hoping for the best! :(

    Over the last couple of weeks I've been building a VMware home lab (it'll probably get a post of its own in the near future) to replace 'eddie,' my ageing Core 2 Duo server and one of the constraints I enforced during the build was for the lab to be 'self consuming'.

    By that I mean it mustn't rely on external devices for DHCP/DNS or storage - the lab will be using power all the time so I wanted to minimise the amount of running equipment.

    For now I want to focus on the use of VMFS formatted partitions on USB keys / sticks / thumb drives with ESXi 5.1 because I assumed it would be straightforward. The summary is simple: ESXi is just as fussy with USB devices as it is with everything else.

    Things started off well with a 4GB Kingston DataTraveler G3 (Vendor 0951, Product 1643)

    # cd /dev/disks
    # /etc/init.d/usbarbitrator stop
    This service needs to be stopped, otherwise ESXi will claim the stick as a passthrough device for use by VMs.

    Now insert the 4GB USB stick.
    # esxcfg-scsidevs -c | grep USB
    mpx.vmhba33:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba33:C0:T0:L0                                                      988MB     NMP     Local USB Direct-Access (mpx.vmhba33:C0:T0:L0)
    mpx.vmhba39:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba39:C0:T0:L0                                                      3690MB    NMP     Local USB Direct-Access (mpx.vmhba39:C0:T0:L0)

    Now I can see two USB devices - a 1GB and a 4GB. I am using the 1GB to boot the ESXi host - no problem there since there are no VMFS partitions on the boot device - just the small boot, a few FATs and a VMkernel diagnostic partition. The interesting stuff is on the 4GB device. Here are the commands I used to create a new GPT disk label and a VMFS5 partition to go on it:

    # partedUtil mklabel mpx.vmhba39\:C0\:T0\:L0 gpt
    # partedUtil setptbl mpx.vmhba39\:C0\:T0\:L0 gpt "1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0"
    0 0 0 0 
    1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0
    # vmkfstools -C vmfs5 mpx.vmhba39\:C0\:T0\:L0:1  
    create fs deviceName:'mpx.vmhba39:C0:T0:L0:1', fsShortName:'vmfs5', fsName:'(null)' 
    deviceFullPath:/dev/disks/mpx.vmhba39:C0:T0:L0:1 deviceFile:mpx.vmhba39:C0:T0:L0:1 
    Checking if remote hosts are using this device as a valid file system. This may take a few seconds... 
    Creating vmfs5 file system on "mpx.vmhba39:C0:T0:L0:1" with blockSize 1048576 and volume label "none". 
    Successfully created new volume: 521b9571-a9085b97-0a71-0015176fe0a0

    At this point, you can simply click 'Refresh' in the VMware vSphere Client and the new volume will appear. Rename it to something sensible (or use the -S option in vmkfstools) and we're done.
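
    For reference, the partition line handed to partedUtil is 'number, start sector, end sector, type GUID, attributes'. Assuming 512-byte sectors (my assumption, not something the tool printed), the partition size works out like this in Python:

```python
SECTOR_BYTES = 512      # assumed sector size
START_SECTOR = 2048     # from the partedUtil setptbl line
END_SECTOR = 7550550    # from the partedUtil setptbl line (inclusive)

sectors = END_SECTOR - START_SECTOR + 1
size_bytes = sectors * SECTOR_BYTES
size_mib = size_bytes / 1024**2
offset_mib = START_SECTOR * SECTOR_BYTES / 1024**2

print(f"{sectors} sectors, {size_bytes} bytes, about {size_mib:.1f} MiB")
print(f"partition starts {offset_mib:.0f} MiB into the device")
```

    That fits inside the 3690MB the Kingston reports, with the partition starting on a 1MiB boundary.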

    Great. Now I got on with my life for a week and then came back to this work using the next of the USB sticks that I had lying around. This time it's a 4GB Lexar JumpDrive TwistTurn (Vendor 05dc, Product a815)


    So, same again, right?

    # esxcfg-scsidevs -c | grep USB 
    mpx.vmhba33:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba33:C0:T0:L0                                                      988MB     NMP     Local USB Direct-Access (mpx.vmhba33:C0:T0:L0) 
    mpx.vmhba39:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba39:C0:T0:L0                                                      3690MB    NMP     Local USB Direct-Access (mpx.vmhba39:C0:T0:L0) 
    mpx.vmhba40:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba40:C0:T0:L0                                                      3748MB    NMP     Local USB Direct-Access (mpx.vmhba40:C0:T0:L0)
    Ah OK, the Lexar is a bit bigger. Not all the 4GB sticks are the same. Anyway, let's continue as before:

    # partedUtil mklabel mpx.vmhba40:C0:T0:L0 gpt 
    # partedUtil setptbl mpx.vmhba40:C0:T0:L0 gpt "1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0" 
    0 0 0 0 
    1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0   
    # vmkfstools -C vmfs5 mpx.vmhba40:C0:T0:L0:1  
    create fs deviceName:'mpx.vmhba40:C0:T0:L0:1', fsShortName:'vmfs5', fsName:'(null)' 
    deviceFullPath:/dev/disks/mpx.vmhba40:C0:T0:L0:1 deviceFile:mpx.vmhba40:C0:T0:L0:1 
    Checking if remote hosts are using this device as a valid file system. This may take a few seconds... 
    Creating vmfs5 file system on "mpx.vmhba40:C0:T0:L0:1" with blockSize 1048576 and volume label "none". 
    Usage: vmkfstools -C [vmfs3|vmfs5] /vmfs/devices/disks/vml... or,       vmkfstools -C [vmfs3|vmfs5] /vmfs/devices/disks/naa... or,       vmkfstools -C [vmfs3|vmfs5] /vmfs/devices/disks/mpx.vmhbaA:T:L:P 
    Error: vmkfstools failed: vmkernel is not loaded or call not implemented.


    I had a look in the vmkernel.log (typically through the output of 'dmesg') and found a couple of lines which seemed relevant since they came up at the time when I issued the 'vmkfstools' command:
    2013-08-26T17:40:09.176Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x41240080c0c0, 0) to dev "mpx.vmhba40:C0:T0:L0" on path "vmhba40:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE
    2013-08-26T17:40:09.176Z cpu1:2049)ScsiDeviceIO: 2331: Cmd(0x41240080c0c0) 0x1a, CmdSN 0x1a4b from world 0 to dev "mpx.vmhba40:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

    The sense data seems interesting. The VMware KB article 289902 discusses the sense keys - 0x5 is "ILLEGAL REQUEST" and the linked page on the T10 website goes into further detail that '0x24 0x0' is INVALID FIELD IN CDB. 
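
    If you find yourself staring at a lot of these lines, the status fields can be pulled apart mechanically. A rough Python sketch follows: the regular expression is my own guess at the format based on the two lines above, and the lookup tables only contain the codes seen in this post.

```python
import re

# Only the codes that appear in this post; extend as needed
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ADDITIONAL_SENSE = {(0x24, 0x00): "INVALID FIELD IN CDB"}

PATTERN = re.compile(
    r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+) "
    r"Valid sense data: (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+)"
)


def decode(line):
    """Extract host/device/plugin status and sense data from a vmkernel log line."""
    match = PATTERN.search(line)
    if match is None:
        return None
    host, device, plugin, key, asc, ascq = (int(v, 16) for v in match.groups())
    return {
        "host_status": host,
        "device_status": device,
        "plugin_status": plugin,
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "additional_sense": ADDITIONAL_SENSE.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
    }


line = ('2013-08-26T17:40:09.176Z cpu1:2049)ScsiDeviceIO: 2331: Cmd(0x41240080c0c0) 0x1a, '
        'CmdSN 0x1a4b from world 0 to dev "mpx.vmhba40:C0:T0:L0" failed '
        'H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.')
result = decode(line)
print(result)
```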

    However, this is all a red herring since the DataTraveler G3 is also seeing the same logs and it's working fine:

    2013-08-26T18:53:31.861Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x412400810cc0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba39:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE
    2013-08-26T18:53:31.861Z cpu1:2049)ScsiDeviceIO: 2331: Cmd(0x412400810cc0) 0x1a, CmdSN 0x4b6 from world 0 to dev "mpx.vmhba39:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.


    The conclusion is a very unsatisfactory "ESXi 5.1 is fussy", which is no news at all. I also have no idea how to pre-judge which USB sticks will work and which will fail, so I've just gone and bought another three DataTraveler G3s since I need them to complete the lab build.

    Comments welcome! :)

    <blink>HILARIOUS UPDATE</blink>

    Those three identical 4GB sticks I ordered? They arrived. Look:

    The one which is not in the packet is currently in the first MicroServer.

    It exhibits exactly the same behaviour as the Lexar sticks. 

    There are a number of differences:
    • Product IDs: the working stick is Vendor 0951, Product 1643 whilst the new one is Vendor 0951, Product 1665
    • Model strings: 'G3' in the working and '2.0' in the non-working, despite both bearing the 'G3' moniker on the plastic casing
    • Usable capacity: 3690MB in the working and 3695MB in the non-working
    • SCSI revision: the working stick is natively version 2 and the non-working stick is natively version 4. For the non-working stick there is a kernel message about patching the inquiry data so that version 4 appears as version 2

    The SCSI revision number appears to be the crux of the matter.

    Working Kingston DataTraveler:

    2013-08-29T20:18:13.102Z cpu0:210103)usb-storage: detected SCSI revision number 2 on vmhba32 
    2013-08-29T20:18:13.102Z cpu0:210105)ScsiScan: 888: Path 'vmhba32:C0:T0:L0': Vendor: 'Kingston'  Model: 'DataTraveler G3 '  Rev: '1.00' 
    2013-08-29T20:18:13.102Z cpu0:210105)ScsiScan: 891: Path 'vmhba32:C0:T0:L0': Type: 0x0, ANSI rev: 2, TPGS: 0 (none)

    Non-working Kingston DataTraveler:

    2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: detected SCSI revision number 4 on vmhba38
    2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: patching inquiry data to change SCSI revision number from 4 to 2 on vmhba38
    2013-08-29T19:47:33.357Z cpu1:29885)ScsiScan: 888: Path 'vmhba38:C0:T0:L0': Vendor: 'Kingston'  Model: 'DataTraveler 2.0'  Rev: '1.00' 
    2013-08-29T19:47:33.357Z cpu1:29885)ScsiScan: 891: Path 'vmhba38:C0:T0:L0': Type: 0x0, ANSI rev: 2, TPGS: 0 (none)

    Non-working Lexar JumpDrive:

    2013-08-29T20:30:25.002Z cpu0:210739)usb-storage: detected SCSI revision number 4 on vmhba40 
    2013-08-29T20:30:25.002Z cpu0:210739)usb-storage: patching inquiry data to change SCSI revision number from 4 to 2 on vmhba40 
    2013-08-29T20:30:25.002Z cpu0:210739)usb-storage: patching inquiry data to clear TPGS (3) on vmhba40 
    2013-08-29T20:30:25.002Z cpu0:210740)ScsiScan: 888: Path 'vmhba40:C0:T0:L0': Vendor: 'Lexar   '  Model: 'USB Flash Drive '  Rev: '1100' 
    2013-08-29T20:30:25.002Z cpu0:210740)ScsiScan: 891: Path 'vmhba40:C0:T0:L0': Type: 0x0, ANSI rev: 2, TPGS: 0 (none)
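
With more than a couple of sticks, comparing these excerpts by eye gets tedious. The following is a rough Python sketch, assuming the log format shown above, that summarises each adapter's native SCSI revision and whether ESXi patched it. The function name and the regular expressions are my own and have only been checked against the lines in this post.

```python
import re

DETECTED = re.compile(r"detected SCSI revision number (\d+) on (vmhba\d+)")
PATCHED = re.compile(r"change SCSI revision number from (\d+) to (\d+) on (vmhba\d+)")
MODEL = re.compile(r"Path '(vmhba\d+):[^']*': Vendor: '([^']*)'\s+Model: '([^']*)'")


def summarise(log_text):
    """Map each adapter name to its native SCSI revision, patch status and model."""
    devices = {}
    for line in log_text.splitlines():
        match = DETECTED.search(line)
        if match:
            entry = devices.setdefault(match.group(2), {"patched": False})
            entry["native_rev"] = int(match.group(1))
            continue
        match = PATCHED.search(line)
        if match:
            devices.setdefault(match.group(3), {"patched": False})["patched"] = True
            continue
        match = MODEL.search(line)
        if match:
            entry = devices.setdefault(match.group(1), {"patched": False})
            entry["model"] = f"{match.group(2).strip()} {match.group(3).strip()}"
    return devices


sample_log = """\
2013-08-29T20:18:13.102Z cpu0:210103)usb-storage: detected SCSI revision number 2 on vmhba32
2013-08-29T20:18:13.102Z cpu0:210105)ScsiScan: 888: Path 'vmhba32:C0:T0:L0': Vendor: 'Kingston'  Model: 'DataTraveler G3 '  Rev: '1.00'
2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: detected SCSI revision number 4 on vmhba38
2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: patching inquiry data to change SCSI revision number from 4 to 2 on vmhba38
2013-08-29T19:47:33.357Z cpu1:29885)ScsiScan: 888: Path 'vmhba38:C0:T0:L0': Vendor: 'Kingston'  Model: 'DataTraveler 2.0'  Rev: '1.00'
"""

for adapter, info in summarise(sample_log).items():
    print(adapter, info)
```

On my three sticks, "patched" lines up with "does not work", but that is a correlation from a very small sample rather than proof of the cause.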

    I did find that the 'usb-storage' kernel module being used supports per-device 'quirks':

    ~ # esxcfg-module -i usb-storage
    esxcfg-module module information input file: /usr/lib/vmware/vmkmod/usb-storage 
    License: GPL
    Version: Version 1.0-3vmw, Build: 1065491, Interface: 9.2  
    Built on: Mar 23 2013
    quirks: string   supplemental list of device IDs and their quirks

    Since the ESXi 'usb-storage' module is GPL-licensed and looks to be derived from the Linux driver, I found a clue as to what the various quirks mean: http://lxr.free-electrons.com/source/drivers/usb/storage/usb.c so I tried setting all the available options for a Lexar stick except for 'Ignore Device' (option 'i') with the following command:

    # esxcfg-module -s quirks=5dc:a815:abcdehlnoprsw usb-storage

    I know the module settings took effect because this was logged:

    2013-08-29T21:16:22.137Z cpu0:2489)<6>usb-storage 1-4:1.0: Quirks match for vid 05dc pid a815: 392b1

    However, even after a reboot of that ESXi host, there was no change in the behaviour :( Boo! Let's hope that the kernel with vSphere 5.5 is a little more forgiving!
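
For reference, the quirks string is 'vendor:product:flags' with the IDs in hex. Below is a small Python sketch that splits one entry apart and names the flag letters. The flag names are my reading of the Linux usb.c source linked above; I am assuming the ESXi build uses the same letters, which I have not confirmed, and the function name is my own.

```python
# Flag letters as I read them in the Linux usb-storage source (usb.c).
# ASSUMPTION: the ESXi build of the module uses the same letters.
QUIRK_FLAGS = {
    "a": "SANE_SENSE",
    "b": "BAD_SENSE",
    "c": "FIX_CAPACITY",
    "d": "NO_READ_DISC_INFO",
    "e": "NO_READ_CAPACITY_16",
    "h": "CAPACITY_HEURISTICS",
    "i": "IGNORE_DEVICE",
    "l": "NOT_LOCKABLE",
    "m": "MAX_SECTORS_64",
    "n": "INITIAL_READ10",
    "o": "CAPACITY_OK",
    "p": "WRITE_CACHE",
    "r": "IGNORE_RESIDUE",
    "s": "SINGLE_LUN",
    "w": "NO_WP_DETECT",
}


def parse_quirk(entry):
    """Split 'vid:pid:flags' into (vendor_id, product_id, [flag names])."""
    vid, pid, letters = entry.split(":")
    names = [QUIRK_FLAGS.get(letter, f"unknown({letter})") for letter in letters]
    return int(vid, 16), int(pid, 16), names


vendor, product, flags = parse_quirk("5dc:a815:abcdehlnoprsw")
print(f"vid {vendor:04x} pid {product:04x}")
print(", ".join(flags))
```

The first line printed matches the 'vid 05dc pid a815' in the kernel message, and 'IGNORE_DEVICE' is deliberately absent from the list.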

    Thursday, 22 August 2013

    Shrinking a Windows boot drive in AWS

    I recently deployed a couple of Windows domain controllers from a CloudFormation template, and only after I'd done a reasonable amount of work on them did I realise that the boot drives were 100GB rather than a more sensible 40GB.

    Now, that additional 60GB isn't exactly going to break the bank (it works out at about £50 per year); however, I'm committed to provisioning the correct size.

    I investigated a few different options for shrinking the Windows boot drive in AWS and this is the one that I found to be the clearest.

    In this walkthrough the Windows machine in question has the hostname DC01.

    AWS Console

    • Take a snapshot of DC01

    DC01


    • Start -> Administrative Tools -> Computer Management -> Storage -> Disk Management
    • Right-click on Volume C: -> Shrink Volume so the 'Total size after shrink' is no more than 39000 MB

    • Start -> Shut down. 

    AWS Console

    •  Determine which Availability Zone DC01 is in. In this case it's zone 1a in eu-west:

    • Create a new Amazon Linux EC2 instance in the same zone.
      • I chose an EBS-optimised m1.large to improve the disk speed
    • Detach the 100GB root volume from DC01
    • Attach the 100GB root volume to the new Amazon Linux EC2 instance as device /dev/sdf (the default as of writing)
    • Create a new 40GB standard EBS volume in the same zone and attach it to the Amazon Linux EC2 instance as device /dev/sdg (be careful - it defaults to /dev/sdf)

    Amazon Linux EC2 instance

    • Check that the devices are attached correctly with 'fdisk':

    • Yes, /dev/sdf is our 100G source drive and /dev/sdg is the new 40G destination
    • Use 'dd' to copy the contents of the source to the destination
      • dd if=/dev/sdf of=/dev/sdg bs=1M
      • The 'bs=1M' tells dd to use a block size of one megabyte - this is not only faster but also reduces the number of I/O operations. AWS charges per I/O operation :)
      • BONUS: this also writes to every block of the new volume, which pre-warms it to maximise EBS performance
    • The 'no space left on device' message is perfectly normal - 100GB of data will not fit into 40GB - the important part is that the useful data (the shrunken partition) occupies less than 40GB
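
Two bits of arithmetic sit behind the steps above, sketched here in Python. I am assuming that Disk Management's 'MB' means MiB and that the 40GB volume is 40 GiB. The count is of dd write calls only; it makes no claim about how AWS actually meters I/O.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def write_calls(bytes_to_copy, block_size):
    """Number of write calls dd needs for the given block size (rounded up)."""
    return -(-bytes_to_copy // block_size)


destination = 40 * GIB          # the new, smaller volume
shrunk_partition = 39000 * MIB  # 'Total size after shrink' from Disk Management

# The shrunken partition must fit inside the destination volume.
print("fits:", shrunk_partition < destination)

# dd stops once the destination is full, so only 40 GiB is ever written.
print("512-byte blocks:", write_calls(destination, 512))
print("1 MiB blocks:   ", write_calls(destination, MIB))
```

That is roughly 84 million writes at dd's default 512-byte block size against about 41 thousand at 'bs=1M'.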

    AWS Console

    • Stop the Amazon Linux EC2 instance
    • Detach both the 100G and 40G volumes
    • Attach the 40G volume as device /dev/sda1 of DC01
    • Start DC01
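
I did all of this by clicking around the console, but the same four steps could be scripted. What follows is an untested sketch using the boto3 EC2 client rather than anything I actually ran; the function name and every ID in the usage comment are hypothetical, and '/dev/sda1' is simply the root device name used above.

```python
def swap_root_volume(ec2, target_id, helper_id, old_volume_id, new_volume_id,
                     device="/dev/sda1"):
    """Move the copied volume from the helper instance to the target as its root device.

    'ec2' is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The target instance must already be stopped with its old root volume detached.
    """
    # Stop the Amazon Linux helper instance and wait until it is really stopped.
    ec2.stop_instances(InstanceIds=[helper_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[helper_id])

    # Detach both the source and destination volumes from the helper.
    for volume_id in (old_volume_id, new_volume_id):
        ec2.detach_volume(VolumeId=volume_id, InstanceId=helper_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[old_volume_id, new_volume_id])

    # Attach the smaller volume as the root device of the target and boot it.
    ec2.attach_volume(VolumeId=new_volume_id, InstanceId=target_id, Device=device)
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[new_volume_id])
    ec2.start_instances(InstanceIds=[target_id])


# Hypothetical usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   swap_root_volume(boto3.client("ec2", region_name="eu-west-1"),
#                    target_id="i-dc01example", helper_id="i-linuxexample",
#                    old_volume_id="vol-100gbexample", new_volume_id="vol-40gbexample")
```

The waiters matter: attaching a volume that is still detaching from the helper instance will fail, which is the scripted equivalent of clicking too quickly in the console.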


    DC01

    • Verify that the instance comes up and the root device is 40GB as expected:

    AWS Console

    • Delete the 100GB snapshot
    • Delete the 100GB original volume from DC01
    • Terminate the Amazon Linux EC2 instance