Layer 7 Logo

Last in our series of posts looking at API management platforms is Layer 7.

The Layer 7 API Management Solution evolved from their SOA gateway products, which Smart421 has been tracking for a number of years. Computer Associates (CA) acquired Layer 7 on 22nd April 2013, with the Layer 7 products becoming a key strategic element of CA’s security product portfolio.

The Layer 7 products can be deployed in four ways: as hardware appliances, as VMware Virtual Appliances (i.e. packaged VXDs), as Amazon AWS Machine Images, and as a traditional deployable software package. Different product ranges apply to each deployment approach, but all options use a traditional perpetual licence arrangement with annual support. Exact licence terms and costs vary by deployment approach, but are in general based on the performance of the hardware.

For companies that prefer to use hardware appliances, terms are significantly less onerous than for other appliances (e.g. IBM DataPower), as hardware and software licences are paid for separately, so replacing hardware doesn’t require a new software licence. Equally, software upgrades for appliances are provided as a standard part of annual support for as long as the hardware can support them, rather than being firmware upgrades which are provided for a shorter length of time.

Alongside their core API management products, Layer 7 have a software as a service offering known as APIfy. This proposition is currently in beta, is free to use, and could be an interesting deployment option for customers if a clear upgrade path to the full product becomes available when it leaves beta.

The Layer 7 products support all the features you would expect of an API management platform, but because this platform is based on Layer 7’s mature XML gateway product, it also supports very extensive and flexible features for traffic management, custom security, message encryption, transformation, and routing. The core API management functions have been implemented using the same SOA gateway primitives available to developers, which gives a good indication of the power of the gateway.

Advantages:

  • Long history of providing high security SOA gateway technology is an excellent foundation for deployment in blue chip organisations with stringent security requirements. Supports a wide range of security technologies, e.g. SAML, X.509, LDAP, OAuth, OpenID and Kerberos.
  • Very flexible technology providing support for esoteric/unusual environments common in enterprises. Supports protocol transformation (even down to TCP/IP sockets), complex routing, orchestration and parallel execution.
  • Extensible with Java plugins.
  • Flexible deployment models, on prem and in-cloud.
  • Very strong scoring by both Gartner and Forrester.
  • The only one of the four vendors’ offerings that is available from the AWS Marketplace (but still using a BYOL model).

Disadvantages:

  • Unlike e.g. Apigee, there is no ‘free’ version that can be used for a production pilot with easy migration to the production version. This may change once APIfy leaves beta.
  • Traditional commercial models only – no pay-as-you-go option, although licences are available for trial use.

When would we use it?

  • Enterprises requiring high-security on premises deployment with virtual or hardware appliances.
  • Enterprises wanting to deploy a custom solution within an AWS virtual private cloud (i.e. where all components are hosted within the client’s virtual cloud rather than on the public internet).
  • Enterprises with complex integration requirements (e.g. integration with MQ, databases, TCP/IP sockets etc).

Next in our series of posts looking at API management platforms is Mashery.

 

Mashery scored well in both the Gartner and Forrester reports. Mashery were acquired by Intel in April last year. This both strengthens Mashery, with the backing of a company the size of Intel, and gives Intel a way into the API management marketplace, in line with its recent shift towards the software market (e.g. through the acquisition of McAfee).

The Mashery product provides similar features to the other products, and can be deployed both in the cloud and on-premises. Integration between Mashery and Intel’s Expressway Gateway appliance will also add comfort to those customers who are used to having a physical appliance on premise.

Interestingly, Mashery’s marketing message revolves as much around internal APIs as public ones, something we agree with wholeheartedly.

Advantages:

  • Strong, feature rich product (including protocol translation; SAML, X.509, LDAP, OAuth, OpenID support; policy enforcement etc).
  • On-premise, cloud and hybrid options are available, which provide flexibility when engaging with customers
  • Strong presence in the UK markets with the likes of Argos, TomTom, ASOS, Experian etc using their products
  • Developer portal is strong, and Mashery I/O Docs are a differentiator from other API management systems
  • Backing of Intel likely to lead to significant investment into the Mashery products

Disadvantages:

  • Risk of potential product consolidation as a result of the Intel acquisition, although there is no sign of this occurring yet.
  • Like Apigee, in our opinion the enterprise security story isn’t quite as strong with the core Mashery product as with some other options, although this is bolstered by integration with Intel’s Expressway appliances.
  • Level of sophistication of the integration with Expressway was unclear in our investigation. It might be brilliant, but we’d advise further investigation.

When would we use it?

  • Deployment where quality of portal experience is paramount (including the documenting of APIs – I/O Docs helps with this!).
  • Where a customer is an existing Expressway customer, or has a strong preference for physical appliances and/or Intel networking kit.
  • To utilise the enhanced capabilities such as pre-packaged reporting for internal and/or external use, policy enforcement or protocol translation.

Heartbleed logo

Well if you have had your head in the sand, then you might just have had a chance of missing out on the news of the Heartbleed Bug. I thought that there was quite a good post on what it all meant on Troy Hunt’s blog, but the Codenomicon site that published the exposure is also very good.

The upshot of the bug, which is really a vulnerability in OpenSSL, is that versions 1.0.1 to 1.0.1f (1.0.1g is now released) contain a buffer over-read in the heartbeat extension. It allows attackers to read chunks of server memory, which may contain the secret part of the SSL key (which would allow the attacker to pretend to be you) or even users’ passwords.
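
As a rough illustration of that version range, here is a minimal Python sketch that flags an OpenSSL version string as falling between 1.0.1 and 1.0.1f inclusive. The function name and the parsing approach are assumptions made for this example, not part of any OpenSSL tooling. It only inspects the version string, so it cannot tell whether a distribution has back-ported the fix, and it deliberately rejects strings it does not recognise (such as beta or vendor-suffixed versions) rather than guessing.

```python
import re

# Hypothetical helper for illustration: True if the version string falls
# in the affected range quoted above (1.0.1 up to and including 1.0.1f).
def in_heartbleed_range(version: str) -> bool:
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)([a-z]*)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised OpenSSL version: {version!r}")
    major, minor, patch = (int(g) for g in match.groups()[:3])
    letter = match.group(4)  # "" for plain 1.0.1, "f" for 1.0.1f, etc.
    if (major, minor, patch) != (1, 0, 1):
        return False
    # The empty string sorts before every letter, so plain 1.0.1 is included.
    return letter <= "f"

for candidate in ("1.0.0", "1.0.1", "1.0.1f", "1.0.1g"):
    print(candidate, in_heartbleed_range(candidate))
```

A version check like this is only a first pass; a patched package may still report an old version string, so it is worth confirming against your vendor's advisory as well.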

Services that use the OpenSSL implementation include Apache HTTP Server, commercial offerings based on it (e.g. by Oracle, IBM etc.), and other open-source servers like nginx. Many other commercial products are also based on this code, including firewall appliances, VPN concentrators and load balancers.

The vulnerability has been out in the wild for over two years, so there is a good chance that a website you use has been compromised at some time in the past. Although many sites (Google, Amazon etc.) are patching the vulnerability, you don’t know whether your password has already been compromised.

This is yet another reason to use things like password managers to make sure you have separate passwords for all of your accounts, and for corporations it is yet another reason to use Single Sign-On software. Even though your password will open more doors, the patch, if required, needs applying to far fewer systems. Even if the remote systems that are integrated with your SSO solution have been compromised, they will not have seen any passwords in their traffic, only the session key, which has a limited lifetime.

For example, in the case of ForgeRock’s OpenAM SSO solution, the authentication servers run on JEE platforms. This means that, unless you are running on Tomcat and have configured it to use the native APR libraries, the OpenSSL libraries are not being used, so it will not have been vulnerable. As you will see in other discussions, even if the downstream resources are protected you need to check that upstream resources (load balancers etc.) are not vulnerable if they terminate the SSL session.

The end result is that there will be quite a few bleeding hearts. Most organisations that use SSL will need to check for vulnerabilities, and patch as appropriate. Then they will need to renew all of their SSL certs on the previously compromised components. And if those certs are shared across multiple hosts (via multiple Subject Alternative Names (SANs)), then even the certs on the invulnerable resources will need to be renewed.

On top of that, most slightly paranoid consumers (including me) will want to change their passwords once they are confident that the services that they use have been patched. Personally I would advise everyone to do it. Just because you’re not paranoid does not mean that no-one’s out to get you.

By Liliandecassai (Own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

Impala by Liliandecassai

Impala 1.0 was launched back in July last year, and it’s been supported by AWS EMR since last December, so I’ve been meaning to have a quick play and also to compare it with a classic map-reduce approach to see the performance difference. It’s not like I don’t believe the promises – I just wanted to see it for myself.

So I ran up a small cluster on AWS – with an m1.large for the master node and 2 core nodes, also running m1.large. I used the US-West region (Oregon) – which offers the same cheap price points as US-East but is 100% carbon-neutral as well :). This was all running using spot instances in a VPC. For interest, the total AWS cost for 24 normalised instance hours (I actually ran the cluster for just over 3 hours, including one false cluster start!) was $1.05.  Using developer standard units of cost, that’s nearly the price of half a cup of coffee! (or since we’re using Oregon region, a green tea?)

Impala

As I’m lazy, I used the code and datasets from the AWS tutorial – and decided to just use a simple count of records that contained the string “robin” in the email address field of a 13.3m row table as my comparison. Here’s how you define the basic table structure…

CREATE EXTERNAL TABLE customers (
  id BIGINT,
  name STRING,
  date_of_birth TIMESTAMP,
  gender STRING,
  state STRING,
  email STRING,
  phone STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
LOCATION '/data/customers/';

The output is…

[ip-10-0-0-26.us-west-2.compute.internal:21000] > select count(*) from customers;
Query: select count(*) from customers
+----------+
| count(*) |
+----------+
| 13353953 |
+----------+
Returned 1 row(s) in 1.09s

[ip-10-0-0-26.us-west-2.compute.internal:21000] > select count(*) from customers where customers.email like "%robin%";
Query: select count(*) from customers where customers.email like "%robin%"
+----------+
| count(*) |
+----------+
| 66702    |
+----------+
Returned 1 row(s) in 1.73s

A slight aside – Impala uses run-time code generation to compile the query down to machine code using LLVM. This introduces a compilation overhead of circa 150ms, which more than pays for itself on the majority of queries. So this is where some of our 1.73s is going. More about this here.

Pig comparison

As a glutton for punishment, I decided to use pig rather than the more usual hive for the comparison with Impala. The first thing to say – it was way harder, as the aptly named pig is just a bit more foreign to me than the SQL-like niceness of Impala…so there was some desperate checking of cheatsheets etc to remind me how best to do it…

The basic code for the same source data (already loaded into HDFS) is as follows…

CUST = LOAD 'hdfs://10.0.0.26:9000//data/customers/customers' USING PigStorage('|')
as (id:    chararray,
name:  chararray,
dob:   chararray,
sex:   chararray,
state: chararray,
email: chararray,
phone: chararray);
C2 = FILTER CUST BY REGEX_EXTRACT_ALL(email, '(.*)robin(.*)') IS NOT NULL;
C3 = FOREACH (GROUP C2 ALL) GENERATE COUNT(C2) as cnt;
dump C3;
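
As a point of reference, the logic of the Pig script above boils down to something like the following single-machine Python sketch. It is not how Pig executes the job (there is no MapReduce here), and the function name and sample rows are made up for illustration. It simply assumes the same pipe-delimited layout, with email as the sixth field.

```python
from typing import Iterable

EMAIL_FIELD = 5  # zero-based index: id, name, dob, sex, state, email, phone

def count_matching_emails(lines: Iterable[str], needle: str = "robin") -> int:
    """Count pipe-delimited rows whose email field contains `needle`."""
    count = 0
    for line in lines:
        fields = line.rstrip("\n").split("|")
        # Skip rows that are too short rather than failing on them.
        if len(fields) > EMAIL_FIELD and needle in fields[EMAIL_FIELD]:
            count += 1
    return count

# Hypothetical sample rows, not taken from the real dataset.
sample = [
    "1|Ann Robinson|1980-01-01|F|WA|ann.robinson@example.com|555-0100",
    "2|Bob Smith|1975-05-05|M|OR|bob@example.com|555-0101",
    "3|Robin Hood|1990-09-09|M|CA|rhood@example.com|555-0102",
]
print(count_matching_emails(sample))  # only row 1 has "robin" in the email field
```

Note that row 3 does not count: the match is on the email field only, just as in the FILTER statement.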

As you can see the pig approach ran 8 maps. The output is as follows (with all the INFO messages and some other noise removed)…

HadoopVersion PigVersion UserId StartedAt           FinishedAt          Features
2.2.0         0.11.1.1   hadoop 2014-04-10 12:11:13 2014-04-10 12:12:26 GROUP_BY,FILTER

Success!

Input(s):
Successfully read 13353953 records (9 bytes) from: "hdfs://10.0.0.26:9000//data/customers/customers"

Output(s):
Successfully stored 1 records (9 bytes) in: "hdfs://10.0.0.26:9000/tmp/temp1725123561/tmp-1782422819"

(66702)

Conclusion

I was just trying it out, so this is not a fair test in some ways – and I didn’t try and do any optimisation of either approach. The Impala approach ran about 40x faster, and this was consistent with repeated runs.
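
The rough ratio can be sanity-checked from the single pair of runs shown above. This small Python sketch takes the Pig job's StartedAt and FinishedAt timestamps and the elapsed time reported for the filtered Impala query. It assumes those two figures are comparable wall-clock measures, which is generous to neither side since the Pig figure includes job start-up.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps copied from the Pig job summary above.
pig_started = datetime.strptime("2014-04-10 12:11:13", FMT)
pig_finished = datetime.strptime("2014-04-10 12:12:26", FMT)
pig_seconds = (pig_finished - pig_started).total_seconds()

# Elapsed time reported for the filtered count in the Impala output above.
impala_seconds = 1.73

speedup = pig_seconds / impala_seconds
print(f"Pig: {pig_seconds:.0f}s, Impala: {impala_seconds}s, ratio: {speedup:.1f}x")
```

For this one pair of runs that works out at 73 seconds against 1.73 seconds, a little over 42x.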

ImpalaPigComparisonGraph

I checked out the CPU, IO etc and there was nothing hitting any limits, and CPU consumption when I was alternately using Impala and pig looked like this – load was even across my two core nodes, and the master had its feet up most of the time…

CPU CloudWatch metrics

I haven’t reported the data here, but I also played with some nasty 3-way joins using Impala and the results were really impressive. Obviously though it’s horses-for-courses – MapReduce-based approaches like hive and pig will soldier on when Impala has run out of memory for certain query types, or in the event of a node failure etc. But definitely a great bit of kit to have in the AWS EMR toolbag!

logo-3scale
Next in our series of posts looking at API management platforms is 3scale.

3scale offer a SaaS API management solution which differs from the other API management vendors in the way it handles API traffic. Rather than traffic passing through a centralised proxy, 3scale provide a series of open source plugins, allowing decentralised processing of traffic. These plugins can be installed within individual applications, within existing ESBs, or within on-premises or cloud-hosted proxy servers running Varnish or Apache HTTP Server. 3scale also supports integration with the Akamai Content Distribution Network, allowing authentication, throttling and caching to occur at the network edges.

Regardless of the chosen deployment methodology, API traffic does not traverse or get stored within 3scale’s infrastructure. This eliminates a potential scalability bottleneck and eases any potential concerns about security, particularly given recent revelations about national intelligence agencies’ ability to conduct surveillance on private communication lines.

3scale is a simpler product than many of the others, and therefore does not support e.g. message transformation or routing. Smart421 would therefore recommend 3scale is deployed alongside existing integration infrastructure. 3scale’s plugin architecture should allow 3scale capabilities to be added to an existing ESB technology.

Whilst they didn’t score as highly in the Gartner and Forrester reports, 3scale do have some big-name customers such as Skype, Telegraph Group, The Guardian and JustGiving.

Advantages:

  • Simple, low pricing.
  • Free tier allows POCs and Pilots to be built and deployed cheaply and easily.
  • Clean simple architecture supporting both cloud and on-prem deployment of traffic management components.
  • Solid core product including authentication/authorisation, developer plans, developer portals, forums and billing engine.

Disadvantages:

  • Not as feature rich as some of the competition. In particular doesn’t provide the ability to do protocol or message transformation. Needs to be augmented by a REST-capable ESB product for internal integration.
  • Portal always cloud hosted, which may be a hard barrier for some customers. Also limits ability to integrate with existing user credentials etc.
  • Rated towards the back of the pack by both Gartner and Forrester
  • Smaller company than most other players, which carries some commercial risk.  3scale secured $4.2m private funding in April 2013.

When would we use it?

  • Smaller customers for whom cost is the overriding factor
  • Customers looking for a simple solution to combine with an existing investment in internal REST-capable ESB technology, or green field customers who will expose REST APIs directly from back-end systems

empty pocket

Following on from my post about Google, AWS and then Azure price cuts the other day, there’s an interesting summary of Rackspace’s position covered on TechCrunch. In summary, the Rackspace CTO John Engates explained that they are continuing on the same track of not matching the recent price drops – which is consistent with his blog from July last year where he said…

We at Rackspace don’t aspire to offer the lowest unit prices. We strive instead to offer the best value…

I suspect a key reason is because they can’t afford to play this game of chicken.

Looking at basic storage as it’s easiest to do a like-for-like comparison, Rackspace’s Cloud Files is 10 cents/GB still, so that’s now 3.33x the entry price for AWS S3, and 3.8x the entry cost of Google Cloud Storage. Whilst I firmly believe that agility is typically a stronger driver than cost in the enterprise market, that’s such a huge difference that I don’t see how a customer procurement department can ignore it. Rackspace is having to move up the food chain as the base services get commoditised underneath them, i.e. focusing on service management, OpenStack, DevOps etc – get (a bit more) niche or get out. I get the “focus on value” message, but it’s hard to show much differentiating value on relatively commodity services like storage. It looks like this price drop was one price drop too far for Rackspace’s pockets. And then there were 3…
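
The ratios quoted above can be reproduced with a few lines of Python. Only the Rackspace price of 10 cents/GB comes from the text. The S3 and Google Cloud Storage entry prices used here (3 cents and 2.6 cents per GB) are assumptions inferred from the 3.33x and 3.8x ratios, so treat them as illustrative rather than as quoted price-list figures.

```python
# Storage price per GB in US dollars.
rackspace_cloud_files = 0.10   # stated in the post
aws_s3_entry = 0.03            # ASSUMPTION: inferred from the 3.33x ratio
google_cloud_storage = 0.026   # ASSUMPTION: inferred from the 3.8x ratio

ratio_vs_s3 = rackspace_cloud_files / aws_s3_entry
ratio_vs_gcs = rackspace_cloud_files / google_cloud_storage

print(f"Rackspace vs S3:  {ratio_vs_s3:.2f}x")
print(f"Rackspace vs GCS: {ratio_vs_gcs:.2f}x")
```

Under those assumed prices the second ratio comes out nearer 3.85x, which is in line with the 3.8x quoted.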

PS As an illustration of the positive impact on our customers, we’ve recently re-priced a customer proposal that was already going through the Smart421 sales machine when these price cuts were announced, and it’s resulted in an immediate 17% overall AWS cost reduction. Nice.

 

Couchbase Live London 2014 stacked

I was fortunate enough to attend this year’s Couchbase Live [London] event. Having experience with MongoDB in production with one of our clients, I was keen to see what Couchbase has to offer.

Viber were on the scene to talk about their decision to migrate from a MongoDB backend to Couchbase. They started off using MongoDB for persistence, with Redis in front of the Mongo instances for caching. They found that the setup just wasn’t scaling out as they needed, so they opted for Couchbase and were able to reduce their server count threefold. Switching to Couchbase simplified their persistence architecture and enabled them to have several clusters and a dedicated backup cluster using XDCR (cross data centre replication).

The first session I went to was “Anatomy of a Couchbase app”, where J Chris Anderson (@jchris) and Matt Revell (@matthewrevell) gave a demonstration of a Node.js and Couchbase backed application that enables users to post video clips onto a web page, like a chat room for pre-recorded videos. As a developer, this session was my favourite. After a quick demo of the app they dived straight into the code and showed how to use the APIs (from a Node.js perspective, but other languages would have similar features). They covered auto-expiring documents and binary storage, which were two things I wanted to see how Couchbase handled, as I already knew MongoDB had good support for these. If you have time, look at the application; it’s on their GitHub.

Another session that I found incredibly useful was “Document Modelling” by Jasdeep Jaitla (@scalabl3). As I already have experience working with MongoDB in production, I have a good understanding of how a document should be structured, but I was a little unsure of how this is implemented in Couchbase. For a start, MongoDB uses collections within databases, whereas Couchbase uses buckets, so there is one less layer of abstraction, meaning a bucket can store different types of documents. Also, Couchbase is a key-value document store, so your key could be a string such as “user:1” or even just “1”, and the value itself would be the JSON document (or binary data).

Couchbase also has the concept of document meta-data: every document stored has a corresponding meta-data document that holds things such as the ID, an expiration (for TTL purposes) and the document type. The document itself can be up to 20 MB, as opposed to 16 MB in MongoDB.

Jasdeep then explained various patterns that can be used for storing data, such as a lookup pattern, and counter pattern. This was very useful.
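
To make the two patterns a little more concrete, here is a minimal Python sketch of how the lookup and counter patterns are commonly described. It uses a plain dict as a stand-in for a bucket rather than the Couchbase SDK, and the key names and helper functions are hypothetical. In a real cluster the counter increment would be an atomic server-side operation, which a dict does not give you.

```python
import json

# Stand-in for a bucket: string keys mapping to JSON strings, plain
# strings or integer counters.
bucket = {}

def next_id(counter_key: str) -> int:
    """Counter pattern: keep a running integer under a well-known key."""
    bucket[counter_key] = bucket.get(counter_key, 0) + 1
    return bucket[counter_key]

def create_user(name: str, email: str) -> str:
    """Store the user document plus a small lookup document for its email."""
    user_key = f"user:{next_id('user:count')}"
    bucket[user_key] = json.dumps({"type": "user", "name": name, "email": email})
    # Lookup pattern: a second key maps the email to the primary key.
    bucket[f"email:{email}"] = user_key
    return user_key

def find_user_by_email(email: str):
    """Resolve email -> primary key -> document using two key fetches."""
    user_key = bucket.get(f"email:{email}")
    if user_key is None:
        return None
    return json.loads(bucket[user_key])

key = create_user("Ann", "ann@example.com")
print(key)
print(find_user_by_email("ann@example.com")["name"])
```

The point of the lookup pattern is that finding a user by email becomes two fast key fetches instead of a query, at the cost of keeping the extra lookup document in step with the main one.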

The mobile sessions were not quite as good. I was expecting more of a workshop style whereby we could see some code and get advice on how to implement CBLite; however, there were some very good demos of a todo-list and of syncing data between various devices (Android smartwatch included!). If you’re interested, have a look at the grocery-sync app on GitHub, which is very similar.

The last session worth noting was from David Haikney: “Visualising a Couchbase server in flight”. David discussed (and demonstrated, perfectly) replication, failovers, scaling and XDCR. He had a cluster of servers, and was able to remove and add new nodes and demonstrate how the rest of the cluster reacts to such scenarios. You can get a lot of information from the statistics that are built into the admin console, and the top tip I picked up was to ensure the active docs resident figure is close to 100%, as that shows documents are being served from memory instead of disk.

Some other advice was to take advantage of XDCR, such as creating a backup cluster, or live-hot standby setups, or even using XDCR to replicate to a test environment so that you always have representative live data.

There was a hackathon in the evening. I stayed for this but didn’t participate, as I was too keen to set up bucket shadowing on a demo app I was working on. The beta3 release of the sync gateway introduced this feature, whereby you can configure your sync gateway bucket to automatically sync with a standard bucket. This is fantastic for exposing your application’s data to a mobile tier (you can restrict this, of course, using channels). If you want to read more, have a look here.

A great day, I learned a lot, well worth the trip. I even bagged a free Apple TV for being one of the first through registration…

 

 

