AWS

In my previous blog post on using AWS CloudFormation to provision a CentOS-based environment I mentioned how the JSON syntax used within CloudFormation templates can at first be a little daunting, especially for people with limited scripting experience.

At the time of writing that post I was using Notepad++ with the JSON Viewer plug-in to create and maintain my CloudFormation templates. One of the problems with this approach was that Notepad++ treats the templates as pretty basic text files and the JSON Viewer plug-in only checks the base JSON syntax, so there is no validation of the objects defined in the JSON to ensure that the correct properties are being created to allow the template to form a valid CloudFormation stack.

As an early Christmas present to anyone working with CloudFormation, AWS recently announced the introduction of a new CloudFormation editor as part of the AWS Toolkits for Visual Studio and Eclipse. I have now had a chance to download and experiment with the Visual Studio version and am really impressed with how much easier it makes the creation and maintenance of CloudFormation templates.

As a .NET developer I have grown used to relying on IntelliSense to aid code creation, particularly when it comes to accessing and updating object properties. The CloudFormation editor provides this for the objects defined within a template, as well as code snippets for all the object types you might want to define. This greatly reduces the number of errors caused by ‘finger problems’ that used to occur when creating a template by hand.

The other really useful feature of the editor is the ability to estimate the costs of a stack before creating it. In the past when quoting for customers we have tended to pull together a design, plug the figures into the AWS Simple Monthly Calculator and then, once we have received the go-ahead, provision the environment. With the ability to generate costs from a template we are now looking at producing a base template at the design phase and then costing and building from that, which should help with the speed of environment creation and clarity around the prices for altering the design.

Based on our experiences so far it’s a big thank-you to AWS Santa and his helper elves for the early Christmas present of the CloudFormation editor, and we are all looking forward to further presents from AWS throughout next year :-)

If you have any experience of supporting large-scale infrastructures, whether they are based on ‘old school’ tin and wires, virtual machines or cloud-based technologies, you will know that it is important to be able to create consistently repeatable platform builds. This includes ensuring that the network infrastructure, ‘server hardware’, operating systems and applications are installed and configured the same way each time.

Historically this would have been achieved via the use of the same hardware, scripted operating system installs and, in the Windows application world of my past, application packagers and installers such as Microsoft Systems Management Server.

With the advent of cloud computing the requirements for consistency are still present and just as relevant. However, the methods and tools used to create cloud infrastructures are now much more akin to application code than the shell script / batch job methods of the past (although some of those skills are still needed). The skills needed to support this are really a mix of both development and sys-ops, and have led to the creation of DevOps as a role in its own right.

Recently, along with one of my colleagues, I was asked to carry out some work to create a new AWS-based environment for one of our customers. The requirements for the environment were that it needed to be:

  • Consistent
  • Repeatable and quick to provision
  • Scalable (the same base architecture needed to be used for development, test and production just with differing numbers of server instances)
  • Running CentOS 6.3
  • Running Fuse ESB and MySQL

To create the environment we decided to use a combination of AWS CloudFormation to provision the infrastructure and Opscode Chef to carry out the installation of application software. I focussed primarily on the CloudFormation templates while my colleague pulled together the required Chef recipes.

Fortunately we had recently had a CloudFormation training day delivered by our AWS Partner Solutions Architect, so I wasn’t entering the creation of the scripts cold; at first the JSON syntax and the number of things you can do with CloudFormation can be a little daunting.

To help with script creation and understanding I would recommend the following:

For the environment we were creating, the infrastructure requirements were:

  • VPC based
  • 5 subnets
    • Public Web – To hold web server instances
    • Public Secure – To hold bastion instances for admin access
    • Public Access – To hold any NAT instances needed for private subnets
    • Private App – To hold application instances
    • Private Data – To hold database instances
  • ELB
    • External – Web server balancing
    • Internal – Application server balancing
  • Security
    • Port restrictions between all subnets (i.e. public secure can only see SSH on app servers; see the sketch below)
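
To illustrate the kind of port restriction in the last item, here is a minimal sketch of a VPC security group that only allows SSH into the app servers from the bastion hosts’ security group (the resource and parameter names here are hypothetical, not lifted from our actual templates):

"ResAppServerSecurityGroup" : {
  "Type" : "AWS::EC2::SecurityGroup",
  "Properties" : {
    "GroupDescription" : "App servers - SSH from Public Secure subnet only",
    "VpcId" : { "Ref" : "ParVpcId" },
    "SecurityGroupIngress" : [ {
      "IpProtocol" : "tcp",
      "FromPort" : "22",
      "ToPort" : "22",
      "SourceSecurityGroupId" : { "Ref" : "ParBastionSecurityGroup" }
    } ]
  }
}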

To provision this I decided that rather than one large CloudFormation template I would split the environment into a number of smaller templates:

  • VPC Template – This created the VPC, Subnets, NAT and Bastion instances
  • Security Template – This created the Security Groups between the subnets
  • Instance Templates – These created the required instance types and numbers in each subnet

This then allowed us to swap out different Instance Templates depending on the environment we were creating (i.e. development could have single instances in each subnet, whereas test could have ELB-balanced pairs and production could use features such as auto-scaling).

I won’t go into the details of the VPC and Security Templates here; suffice it to say that with the multiple-template approach the outputs from the creation of one stack were used as the inputs to the next.
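
To give a flavour of how this chaining works (the names below are illustrative rather than taken from our templates), the VPC template declares the IDs it creates as outputs:

"Outputs" : {
  "OutPrivateAppSubnetId" : {
    "Description" : "ID of the Private App subnet",
    "Value" : { "Ref" : "ResPrivateAppSubnet" }
  }
}

The Security and Instance Templates then declare matching parameters, and the output values from the first stack are passed in when the next stack is created:

"Parameters" : {
  "ParPrivateAppSubnetId" : {
    "Type" : "String",
    "Description" : "ID of the Private App subnet from the VPC stack"
  }
}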

For the Instance Templates the requirement was that the instances would be running CentOS 6.3 and that we would use Chef to deploy the required application components onto them. When I started looking into how we would set the instances up to do this, I found that the examples available for CentOS and CloudFormation were extremely limited compared to those for Ubuntu or Windows. Given that, I would recommend working from a combination of the Opscode guide to installing Chef on CentOS and AWS’s documentation on Integrating AWS with Opscode Chef.

Along the way to producing the finished script there were a number of lessons learnt, which I will share to help with your own installation. The first of these was the need to use a CentOS.org AMI from the AWS Marketplace. After identifying the required AMI I tried running up a test template to see what would happen before signing up for it in the Marketplace; in CloudFormation this failed with the slightly misleading error ‘AccessDenied. User doesn’t have permission to call ec2::RunInstances’. Once I’d signed our account up for the AMI this was cured.

The next problem I encountered was really one of my own making / understanding. When looking at AMIs to use I made sure that we had picked one that was cloud-init enabled; in my simplistic view I thought this meant that commands such as cfn-init, which are used within CloudFormation to carry out CloudFormation-specific tasks, would already be present. This wasn’t the case, as the cfn- commands are part of a separate bootstrap installer that needs to be included in the UserData section of the template (see below):

"UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
 "#!/bin/bash -v\n",
 "function error_exit\n",
 "{\n",
 " cfn-signal -e 1 -r \"$1\" '", { "Ref" : "ResFuseClientWaitHandle" }, "'\n",
 " exit 1\n",
 "}\n",<br /> "# Install the CloudFormation tools and call init\n",
 "# Note do not remove this bit\n",<br /> "easy_install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz\n",
 "cfn-init --region ", { "Ref" : "AWS::Region" },
 " -s ", { "Ref" : "AWS::StackName" }, " -r ResInstanceFuse ",
 " --access-key ", { "Ref" : "ResAccessKey" },
 " --secret-key ", { "Fn::GetAtt" : [ "ResAccessKey", "SecretAccessKey" ]},
 " -c set1",
 " || error_exit 'Failed to run cfn-init'\n",
 "# End of CloudFormation Install and init\n", 
 "# Make the Chef log folder\n",
 "mkdir /etc/chef/logs\n",
 "# Try starting the Chef client\n",
 "chef-client -j /etc/chef/roles.json --logfile /etc/chef/logs/chef.log &gt; /tmp/initialize_chef_client.log 2&gt;&amp;1 || error_exit 'Failed to initialise chef client' \n",
 "# Signal success\n",
 "cfn-signal -e $? -r 'Fuse Server configuration' '", { "Ref" : "ResFuseClientWaitHandle" }, "'\n"
]]}}

As cfn-signal, which comes as part of the bootstrap installer, is used for messaging to any wait condition handles defined in the template, forgetting to install it can lead to long breaks at the coffee machine before any feedback is received.
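
For reference, the ResFuseClientWaitHandle that the cfn-signal calls above report to is a wait condition handle paired with a wait condition elsewhere in the template; a minimal sketch (the Timeout value here is illustrative) looks like this:

"ResFuseClientWaitHandle" : {
  "Type" : "AWS::CloudFormation::WaitConditionHandle"
},
"ResFuseClientWaitCondition" : {
  "Type" : "AWS::CloudFormation::WaitCondition",
  "DependsOn" : "ResInstanceFuse",
  "Properties" : {
    "Handle" : { "Ref" : "ResFuseClientWaitHandle" },
    "Timeout" : "1200"
  }
}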

The final lesson was how to deploy the Chef client and its configuration to the instances. Chef is a rubygems package, so it needs this and supporting packages present on the instance before it can be installed. Within CloudFormation, packages can be installed via the packages configuration sections of AWS::CloudFormation::Init, which for Linux supports the rpm, yum and rubygems installers. Unfortunately for the AMI we chose the available repositories didn’t contain all the packages necessary for our build; to get around this I had to install the RBEL repository definitions via rpm before using a combination of yum and rubygems to install Chef:

"packages" : {
 "rpm" : {
 "rbel" : "http://rbel.frameos.org/rbel6"
 },
 "yum" : {
 "ruby" : [],
 "ruby-devel" : [],
 "ruby-ri" : [],
 "ruby-rdoc" : [],
 "gcc" : [],
 "gcc-c++" : [],
 "automake" : [],
 "autoconf" : [],
 "make" : [],
 "curl" : [],
 "dmidecode" : [],
 "rubygems" : []
 },
 "rubygems" : {
 "chef" : [] 
 }
}

Once Chef was installed, the next job was to create the Chef configuration files and validation key on the instance. This was carried out using the "files" section within AWS::CloudFormation::Init:

"files" : {
 "/etc/chef/client.rb" : 
 "content" : { "Fn::Join" : ["", [
 "log_level :info", "\n", "log_location STDOUT", "\n",
 "chef_server_url '", { "Ref" : "ParChefServerUrl" }, "'", "\n",
 "validation_key \"/etc/chef/chef-validator.pem\n",
 "validation_client_name '", { "Ref" : "ParChefValidatorName" }, "'", "\n"
 ]]}, 
 "mode" : "000644",
 "owner" : "root",
 "group" : "root"
 },
 "/etc/chef/roles.json" : {
 "content" : { 
 "run_list" : [ "role[esb]" ]
 },
 "mode" : "000644",
 "owner" : "root",
 "group" : "root"
 },
 "/etc/chef/chef-validator.pem" : {
 "source" : { "Fn::Join" : ["", [{ "Ref" : "ParChefKeyBucket" }, { "Ref" : "ParChefValidatorName" }, ".pem"]]},
 "mode" : "000644",
 "owner" : "root",
 "group" : "root",
 "authentication" : "S3Access"
 }
}

The hardest part of this was the validation key; as we had multiple instances wanting to use the same key, we decided to place it within an S3 bucket and pull it down from there. During the script creation I tried multiple ways of doing this, such as using s3cmd (which needed another repository and set of configuration to run), but found that using the files section worked best.
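
The "authentication" : "S3Access" entry in the files section above refers to an AWS::CloudFormation::Authentication block in the instance’s metadata, which tells cfn-init what credentials to use when fetching the key from the bucket. A minimal sketch, reusing the ResAccessKey resource and ParChefKeyBucket parameter seen earlier, would be:

"AWS::CloudFormation::Authentication" : {
  "S3Access" : {
    "type" : "S3",
    "accessKeyId" : { "Ref" : "ResAccessKey" },
    "secretKey" : { "Fn::GetAtt" : [ "ResAccessKey", "SecretAccessKey" ] },
    "buckets" : [ { "Ref" : "ParChefKeyBucket" } ]
  }
}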

Once Chef was installed, the client was started via the UserData section (basically a shell script). This then handed control of what additional software and configuration is installed on the instance over to the Chef server. How much Chef does at this stage is a bit of a balancing act, as the wait condition within the template will fail the stack creation if its timeout period is exceeded.

As you can probably tell if you have got this far, the creation of the templates took quite a few iterations to get right as I learnt more about CloudFormation. When debugging, it is worth remembering to set the stack not to roll back on failure. This allows you to access the created instances and find out how far they got within the install; as the UserData section is basically a shell script with some CloudFormation hooks, more often than not the faults are the same as you would see on a standard non-AWS Linux install. Also, for a CentOS install, remember that the contents of /var/log are your friend, as both cloud-init and cfn-init create log files there for debugging purposes.

After watching Werner Vogels’ keynote speech from AWS re:Invent it’s clear that treating infrastructure as a programmable resource (i.e. using technologies such as CloudFormation and Chef) is somewhere organisations need to be moving towards, and based on my experience so far I will be recommending this approach for all future AWS environments we get involved with, even the small ones.

Whilst there is a bit of a learning curve, the benefits of repeatable builds, known configuration and the ability to source-control infrastructure far outweigh any shortcomings, such as the current lack of granular template validation, which I’m sure will come with time.

If you have any comments or want to know more please get in touch.

Last week four Smarties, including myself, were lucky enough to attend the Microsoft Tech.Days Windows Azure Bootcamp in London. Usually for days like this Microsoft estimate (and allow for) around a 50% attendance rate; however, 90% of the registered attendees for this event showed up, which shows how much interest in cloud computing is taking off, and required some quick provisioning of additional space from the organisers (akin to provisioning additional storage space in the cloud).

The camp was presented by Steve Plank (http://plankytronixx.com/default.aspx), with the content being a mixture of lecture/demo and try-it-yourself sessions. This provided the audience with an overview of the current Azure platform and enough knowledge to walk away and start creating cloud-based apps capable of exercising a number of basic Azure features.

There was also a quick ten-minute presentation during the break from the recently formed London Windows Azure User Group (http://www.lwaug.net), whose first meeting is on the 6th of December and sounds like it will be well worth attending regularly, both to hear from their guest speakers and to talk through experiences of the platform with other Azure developers. Mention was also given to the upcoming ‘6 Weeks of Azure’ programme (http://www.sixweeksofazure.co.uk/), beginning at the end of January 2012, where six weeks of free help will be given to UK companies wanting to have a look at the platform.

Based on the content presented on the day it is clear that a lot of thought has been put into the features within Azure, and into making sure that these will be easy to integrate into both existing and new developments where appropriate (and not just when using Microsoft development tools or languages).

The main area that I am looking forward to diving deeper into is the Azure App Fabric, particularly the Azure Service Bus (http://msdn.microsoft.com/en-us/library/windowsazure/ee732537.aspx), as this looks like it will be incredibly useful in stitching together dispersed applications and hooking existing on-premise solutions into new cloud-based offerings. The Caching Service (http://msdn.microsoft.com/en-us/library/windowsazure/gg278356.aspx) also looks really useful for both distributed and high-bandwidth apps.

On the day the only part of the App Fabric that was demoed was the Access Control Service, which under the covers uses SAML. Before the event I had dismissed it as being just another way to validate users. However, after seeing how easy it is to implement and use, plus its ability to integrate with Active Directory via ADFS2 (http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=10909) and a number of other authentication providers such as Google, Yahoo or LiveID, I can see it becoming part of most Azure developments.

One question we had been kicking around prior to the event was ‘how production-ready is Azure?’, as with other cloud service providers (such as AWS) we have tended to see our customers start by using these services for test / development or disaster recovery environments rather than production.

Although the question was not fully answered, it was clear that Microsoft are looking for customers to place production as well as development systems in Azure, and have been doing their homework on the problems previously experienced by other providers, architecting the underlying infrastructure accordingly.

Microsoft are offering a 99.95% SLA for external connectivity for compute-hosted services that meet their criteria (at least two host instances configured for a service), which is the same as AWS offers for EC2 instances; however, the Azure terms are measured monthly rather than yearly as with AWS.

Azure also contains additional features such as the recently released SQL Azure Data Sync, which allows data to be synchronised between SQL Azure instances in different locations, and it was hinted that resilience features like this, along with a huge number of other enhancements, are currently under development across the platform.

Based on what was shown and discussed during the event it looks like there are exciting times ahead in the Windows Azure space and I am looking forward to architecting, developing and supporting applications that make use of its features.
