cloud and seaDo the service level agreements (SLAs) offered by public cloud providers hold water? Or are they useless to a customer -and not worth the cyberspace they are written on? We decided to pick the cloud fraternity poster child – Amazon EC2 – review their current SLA offering (which is defined here), and then compare it with ‘traditional’ hosting vendors offerings.

First let’s review the Amazon EC2 SLA. We’re not offering a legal perspective here, but several things strike us as interesting about this:

  • The SLA does provide for some punitive damages in the sense that should the service availability be between 90% and 99.95%, Amazon would still have to pay out 10% of your payments to them.
  • If EC2 was only available 60% of the time, their liability would be limited to 10%.
  • In no way are the damages that Amazon pay out linked to your potential losses. If Amazon had a major outage that could cost you millions, they might only pay out a few thousand in compensation.
  • The definition of ‘unavailable’ that they use here could potentially allow you to claim service credits even when your application is functioning perfectly well if we’re reading the SLA correctly, because ‘region unavailable’ appears to mean that one (not all) of the availability zones within a region is down. For example, if there are two availability zones within your region – your application could be running in both of these zones, such that it fails over between them. You’d get credit if a zone went down even in the event of the application failing over to the backup zone – even though your app is still ‘up.
  • Availability is measured in 5 minute blocks, which implies that they exclude all 4 minute downtimes. 4 minutes is a long time for a critical system to be down.
  • The other angle on this is that the advantage of using EC2 is that it gives you much easier ways to recover from any outage that does occur (provided it doesn’t bring down the whole region, when things get a little trickier).

Having said all of that, how different is this arrangement from what traditional hosting providers offer?

The view from our Head of Service Management is that it looks fairly standard – the hosting providers we work with have similar offerings. One of them offers up to 50% off the monthly fee for poor availability which sounds great but when you read the small print it’s measured in quarterly periods so they can get away with a bad month so long as they have two good months in the same quarter. Amazon’s small print doesn’t look too pleasant from an admin point of view as you must claim within 30 days of the last outage, and have to provide evidence of the outage etc.

Service credits are things that procurement and legal departments like but IT departments can’t be bothered with them, as they cause too much hassle to claim back a small sum that doesn’t really hurt the supplier and doesn’t compensate the customer for the real loss. We have come across some companies who offer a service whereby they take 100% of the service fee if they meet service levels i.e. you don’t have to prove it or claim it, which is obviously more attractive.

Our conclusion is that Amazon have paid lip service to service credits, and have probably done enough to satisfy procurement/legal requirements but nothing to shout about.

Thanks are due to my colleagues Paul Russell and Andy Budd for their input on this material…

Further reading on this subject…