Friday 17 January 2014

Masterless Puppet on Windows in AWS using Elastic Beanstalk

Sam Bashton gave a great talk at PuppetConf 2013 about how Bashton, Ltd. deploy Linux-based EC2 instances and configure them using Puppet where all the modules and manifests are 'baked in.'

In the world of Puppet this is a very convenient way to handle the problem of running a puppet-master and its requisite SSL key signing. Puppet's SSL signing is suitable for running a house of pets, but falls some way short when dealing with a farm of cattle.

During the Christmas break I tried to reimplement an existing customer's AWS Elastic Beanstalk deployment with Puppet using Sam Bashton's talk as the inspiration. I've now been prompted to write it up after reading Masterzen's great article this morning.


Current State


The customer needs to run more than a single application on each EC2 instance - the cost to deploy a separate Beanstalk for each tier is simply too high for them, and their code is not yet ready to consume a single 'service layer.'

Hence I made use of .ebextensions config files in Elastic Beanstalk to configure multiple IIS websites on each instance. I'll save you from the full travesty of this file (e.g. abusing the HOSTS file so the services hostname resolves to 127.0.0.1) and provide this potted version of the 'beanstalk.config':

sources:
  "c:/scripts/ipfudge": http://admin01/scripts/ipfudge.zip
  "c:/program files/amazon/AWSCLI": http://admin01/scripts/AWSCLI.zip
  "c:/inetpub/newwebservices": http://admin01/sites/custname.newwebservices.zip

files:
  "c:/scripts/s3-mobile-appdata-sync.cmd":
    content: |
      c:
      cd \inetpub\mobile\App_Data
      "C:\program files\amazon\awscli\aws" --region eu-west-1 s3 sync s3://cust-mobile-appdata/ . --delete >%TEMP%\s3sync4.log 2>&1

container_commands:
  newwebservices01:
    cwd: c:\windows\system32\inetsrv
    command: appcmd add apppool /name:"CustomerName.NewWebServices"
    ignoreErrors: true
    waitAfterCompletion: 0
  newwebservices02:
    cwd: c:\windows\system32\inetsrv
    command: appcmd add site /name:"CustomerName.NewWebServices" /id:2 /bindings:http://web.customername.com:80 /physicalPath:"c:\inetpub\newwebservices"
    ignoreErrors: true
    waitAfterCompletion: 0
  newwebservices03:
    cwd: c:\windows\system32\inetsrv
    command: appcmd set app "CustomerName.NewWebServices/" /applicationPool:"CustomerName.NewWebServices"
    ignoreErrors: true
    waitAfterCompletion: 0
  schtasks01:
    cwd: c:\windows\system32
    command: schtasks /create /ru SYSTEM /TN Sync_S3_App_Data /TR "cmd /c c:\scripts\s3-appdata-sync.cmd >%TEMP%\s3sync1.log 2>&1" /SC MINUTE /MO 30 /ST 00:18 /F
    ignoreErrors: true
    waitAfterCompletion: 0
  zzfinal01:
    cwd: c:\windows\system32\inetsrv
    command: appcmd set config /section:urlCompression /doDynamicCompression:True
    ignoreErrors: true
    waitAfterCompletion: 0
  zzfinal02:
    cwd: c:\windows\system32\inetsrv
    command: cmd /c c:\scripts\ipfudge\ipfudge.cmd >%TEMP%\ipfudge.log 2>&1
    ignoreErrors: true
    waitAfterCompletion: 0


Problems


There's a whole ton of pain going on there. Every time new code is deployed through Beanstalk (the customer uses the AWS Toolkit for Visual Studio to publish directly from their development workstations), every command in this Beanstalk configuration is executed again.

That means at least the following things happen:

  • The c:\inetpub\directory containing the 'main' application at 'Default Web Site' will be emptied and populated according to whatever was pushed from Visual Studio
  • The 'sources' section in the beanstalk.config will remove the destination directory, download the ZIP file from the provided URL and unpack it at the destination directory
  • All of the 'container_commands' will be executed in lexicographical order (hence 'zzfinalxx' to ensure they are run last)
  • The commands in 'ipfudge.cmd' are run each time (a long series of 'appcmd set config' commands to allow/deny access to the NewWebServices site based on IP address)

The repeated use of 'ignoreErrors: true' makes the associated output in 'cfn-init.log' very noisy, e.g.:

2014-01-17 09:15:55,444 [ERROR] Command newwebservices01 (appcmd add apppool /name:"CustomerName.NewWebServices") failed
2014-01-17 09:15:55,444 [INFO] ignoreErrors set to true, continuing build
2014-01-17 09:15:56,490 [ERROR] Command newwebservices02 (appcmd add site /name:"CustomerName.NewWebServices" /id:2 /bindings:http://web.customername.com:80 /physicalPath:"c:\inetpub\newwebservices") failed
2014-01-17 09:15:56,490 [INFO] ignoreErrors set to true, continuing build
2014-01-17 09:15:57,648 [INFO] Command newwebservices03 succeeded


Desire

I would far rather define this in terms of a Puppet manifest - that way only deviations from the described state will be changed, rather than churning through every change every time.
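
To illustrate the difference, here is a minimal sketch (not taken from the customer's manifest) of the first 'appcmd' command from the beanstalk.config expressed as a guarded Puppet exec. It assumes that 'appcmd list apppool' exits non-zero when no pool matches, so the 'unless' check lets Puppet skip the command once the pool exists. The resource title is made up for this example.

```puppet
# Sketch only: create the app pool just once.
# Assumption: 'appcmd list apppool' returns a non-zero exit code when the pool is missing.
exec { 'create-newwebservices-apppool':
  command => 'c:\windows\system32\inetsrv\appcmd.exe add apppool /name:"CustomerName.NewWebServices"',
  # When this check exits 0 the pool already exists and the command is skipped.
  unless  => 'c:\windows\system32\inetsrv\appcmd.exe list apppool /name:"CustomerName.NewWebServices"',
}
```

Under that assumption, a second run should leave this resource untouched rather than logging a failed command that then has to be ignored.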

I had something like this in mind:



Bootstrap


First we need to take the Windows Server 2012 AMI used with Elastic Beanstalk and install Puppet and other tools. It's common to bake your own AMI; however, I'm quite keen on avoiding work, so I prefer to configure the existing Windows Server 2012 AMI (ami-396a784d at time of writing).

So, first we need a hugely simplified beanstalk.config:

packages:
  msi:
    Puppet: https://downloads.puppetlabs.com/windows/puppet-3.4.0.msi

sources:
  "c:/scripts/gopuppet": http://admin01/scripts/gopuppet.zip
  "c:/programdata/puppetlabs/puppet/etc/modules": http://admin01/scripts/puppetmodules.zip

container_commands:
  gopuppet:
    cwd: c:\windows\system32
    command: cmd /c powershell -executionpolicy remotesigned -file c:\scripts\gopuppet\gopuppet.ps1 >c:\progra~1\amazon\elasticbeanstalk\logs\puppet.log 2>&1
    ignoreErrors: true
    waitAfterCompletion: 0


This will download 'gopuppet.zip' and 'puppetmodules.zip' from an existing server in the environment called 'admin01' - the source could just as easily be on S3.

Redirecting the output for the 'gopuppet' command allows us to see what's going on by using Beanstalk's 'Snapshot Logs' command in the web UI.


What Am I?


gopuppet.zip contains a single PowerShell script, gopuppet.ps1:

import-module "C:\Program Files (x86)\AWS Tools\PowerShell\AWSPowerShell\AWSPowerShell.psd1"
Set-DefaultAWSRegion eu-west-1

$my_instance=(New-Object System.Net.WebClient).DownloadString("http://169.254.169.254/latest/meta-data/instance-id")

#i-beec85f2

$my_instance2=New-Object 'collections.generic.list[string]'
$my_instance2.add($my_instance)

$filter= New-Object Amazon.EC2.Model.Filter -Property @{Name = 'resource-id'; Value = $my_instance2}

#Name                                                        Values
#----                                                        ------
#resource-id                                                 {i-beec85f2}

$my_tags= Get-EC2Tag -filter $filter
# | where { $_.Key -eq "elasticbeanstalk:environment-name" }
# the piped where clause works on my home laptop but not here - maybe AWS Tools 3.5 versus 3.0 on the AMI?

#ResourceId                    ResourceType                  Key                           Value
#----------                    ------------                  ---                           -----
#i-beec85f2                    instance                      elasticbeanstalk:environme... customername-taupe

foreach ($tag in $my_tags)
    {
    if ($tag.Key -eq "elasticbeanstalk:environment-name")
        {
        $envname= $tag.Value
        }
    }

#customername-taupe

# Now we know who we are and what environment we're part of, we need to grab the current config for this environment

$tempfile = Read-S3Object -BucketName cust-deployment-data -Key "$envname.pp" -File "$env:temp\$envname.pp"

# Check that the file was written to less than 10 million ticks ago (1 second, since a tick is 100 nanoseconds)

if ((((Get-Date) - $tempfile.LastWriteTime).ticks) -lt 10000000)
 {
 write "file $tempfile was written to recently"

 # Well this is a terrible fudge and it's the only way I can make it work at first-run
 # Using the windows_path module in Puppet doesn't work until the *second* Puppet run
 $env:Path += ";C:\Program Files (x86)\Git\cmd"
 # Finally run puppet and apply this config
 & "C:\Program Files (x86)\Puppet Labs\Puppet\bin\puppet" apply --debug $tempfile

 if ($LASTEXITCODE -ne 0)
    {
    write "Execution failed"
    }
 }


Modules


The puppetmodules.zip contains many off-the-shelf modules from the Puppet Forge:


I had to hack the download_file module around to fix assumptions made about the location of temporary directories, and also to include a new download_unpack_file function which uses the Explorer Shell to unzip files without needing 7-Zip or other binary dependencies.


Make It So


Now we're really getting somewhere and we just need the manifest itself. Here's a simplified version, since the real one is an epic 400 lines.

# Install Windows features Start
  dism { 'IIS-ASPNET':
    ensure => present,
    require => Dism['IIS-ISAPIFilter'],
  }
  dism { 'IIS-ISAPIFilter':
    ensure => present,
    require => Dism['IIS-ISAPIExtensions'],
  }
  dism { 'IIS-ISAPIExtensions':
    ensure => present,
    require => Dism['IIS-NetFxExtensibility'],
  }
  dism { 'IIS-NetFxExtensibility':
    ensure => present,
    require => Dism['NetFx4Extended-ASPNET45'],
  }
  dism { 'NetFx4Extended-ASPNET45':
    ensure => present,
  }
  file {'c:/scripts':
    ensure          => directory,
    require         => Iis_vdir["CustName.NewWebServices/"]
  }

# Install Windows features End

# NewWebServices Start
  file {'c:/inetpub/newwebservices':
    ensure          => directory,
  }

  download_unpack_file { 'custname.newwebservices.zip':
    url => 'http://admin01/custname.newwebservices.zip',
    destination_directory => 'c:\inetpub\newwebservices',
    temp_directory => 'c:\windows\temp',
    require => File['c:/inetpub/newwebservices'],
  }

  iis_apppool {'CustName.NewWebServices':
    ensure                => present,
    managedpipelinemode   => 'Integrated',
    managedruntimeversion => 'v4.0',
  }

  iis_site {'CustName.NewWebServices':
    ensure          => present,
    bindings        => ["http/*:80:web.custname.com"],
  }

  iis_app {'CustName.NewWebServices/':
    ensure          => present,
    applicationpool => 'CustName.NewWebServices',
  }

  iis_vdir {'CustName.NewWebServices/':
    ensure          => present,
    iis_app         => 'CustName.NewWebServices/',
    physicalpath    => 'c:\inetpub\newwebservices',
    require         => Download_unpack_file["custname.newwebservices.zip"]
  }
# NewWebServices End

# S3 Mobile Sync Start
  file {'c:/scripts/s3-mobile-appdata-sync.cmd':
    content =>
'c:
cd \inetpub\mobile\App_Data
"C:\program files\amazon\awscli\aws" --region eu-west-1 s3 sync s3://cust-mobile-appdata/ . --delete >%TEMP%\s3sync4.log 2>&1
',
    require => File["c:/scripts"]
  }

  scheduled_task { 'Sync_S3_Mobile_App_Data':
    ensure    => present,
    enabled   => true,
    command   => 'c:\scripts\s3-mobile-appdata-sync.cmd',
    trigger   => {
      schedule   => daily,
      start_time => '00:17'
    },
    require   => File["c:/scripts/s3-mobile-appdata-sync.cmd"]
  }



# This is a weak fudge because Puppet's scheduled_task support 
# does not include 'every N minutes'


  exec { 'Sync_S3_Mobile_App_Data':
    command => 'c:\windows\system32\schtasks.exe /change /TN "Sync_S3_Mobile_App_Data" /RI 30',
    subscribe   => Scheduled_task["Sync_S3_Mobile_App_Data"],
    refreshonly => true
  }

# S3 Mobile Sync End

# AWS CLI Tools Start
  download_unpack_file { 'AWSCLI.zip':
    url => 'http://admin01/scripts/AWSCLI.zip',
    destination_directory => 'c:\program files\amazon\AWSCLI',
    temp_directory => 'c:\windows\temp',
    require => File['c:/program files/amazon/AWSCLI'],
  }

  file {'c:/program files/amazon/AWSCLI':
    ensure          => directory,
  }


# AWS CLI Tools End 


# IP Security Fudge Start
  dism { 'IIS-IPSecurity':    
    ensure => present,
  }

  exec { 'install-ipfudge':
    command     => "c:\\windows\\temp\\ipfudge.cmd",
    logoutput   => true,
    refreshonly => true,
    subscribe   => Download_file['ipfudge'],
    require     => Dism['IIS-IPSecurity']
  }

  download_file { 'ipfudge':
    url => 'http://admin01/scripts/ipfudge.cmd',
    destination_directory => 'c:\windows\temp',
    temp_directory => 'c:\windows\temp',
    require => File['c:/scripts'],
  }

# IP Security Fudge End

# More weakness since there is no IIS module which can manipulate this data, so we break out to a dirty exec()
# Note: with refreshonly => true this exec only runs when another resource sends it a refresh event
  exec { 'Dynamic_Compression':
    command     => 'c:\windows\system32\inetsrv\appcmd.exe set config /section:urlCompression /doDynamicCompression:True',
    refreshonly => true
  }


Reflection

This was an interesting technical challenge, but not one I would put into production, for the following reasons:

  • Deployment time was far greater during the 'puppet apply' run. This was typically over three minutes for each new run compared with one minute for the 'do it all' beanstalk.config approach.
  • Initial EC2 instance build was far slower since the Puppet MSI took several minutes to install (I appreciate a pre-baked AMI would help here)
  • Puppet's support for Windows is still sketchy:
    • No scheduled_task 'every N minutes'
    • No manipulation of the IIS config structure
    • Much more verbose definition of IIS sites cf. three 'appcmd' lines
    • Most of the modules on the Forge will not work on Windows
  • No obvious way to get a signal to re-run 'puppet apply' other than uploading or redeploying existing code through Beanstalk. 
    • The full config included several Git-based vcsrepo sources - I wanted to be able to 'bump' the config rather than do a full environment update.
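
One possible workaround, sketched here rather than tested, is to let Puppet schedule its own re-run of gopuppet.ps1, so that a changed manifest in S3 is picked up without a Beanstalk deployment. This polls on a timer rather than reacting to a signal. The task name and start time are made up for this example, and the PowerShell path assumes a default Windows install.

```puppet
# Sketch only: re-run the bootstrap script once a day.
# 'Rerun_Gopuppet' and the 03:00 start time are illustrative assumptions.
scheduled_task { 'Rerun_Gopuppet':
  ensure    => present,
  enabled   => true,
  command   => 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe',
  arguments => '-executionpolicy remotesigned -file c:\scripts\gopuppet\gopuppet.ps1',
  trigger   => {
    schedule   => daily,
    start_time => '03:00',
  },
}
```

Because gopuppet.ps1 downloads the environment's manifest each time it runs, every scheduled run would apply whatever is currently in the bucket.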

Comments welcome! :)