I attended the Big Data Analytics 2012 event at the Victoria Park Plaza, London (organised by Whitehall Media) yesterday along with our CTO, Robin Meehan. We wanted to keep in touch with what some of the big players are saying about “Big Data” and to hear their views on analysis. Cloudera, Oracle, SAP, Informatica, SAS UK, SGI, MapR Technologies, GigaSpaces, MicroStrategy, Pentaho and others were all there (and furnished with masses of pens, notepads, pen drives and stands).
The event itself was good, with the usual mix of CIOs, CTOs and techies in attendance. A number of the keynote speakers (in and amongst the sales pitches) had some interesting stories and facts. One was John O’Donovan, Director of Technical Architecture & Development at the Press Association, who talked about how they analysed the “masses of data” captured during the London 2012 Olympics to deliver “Content as a service” to consumers around the globe (including translation of the content en route), supporting up to 50K TPS. This was followed by a great clip explaining the maths behind Roberto Carlos’s improbable goal; it’s worth a look on YouTube.
Bob Jones, Head of the CERN Openlab project, gave us an insight into some of the “Big Data” challenges they are facing at CERN, and with data being generated at 1 petabyte per second it is clear to see why! Even after throwing away most of the data they still permanently store between 4 and 6 GB of data per second, and they are on course to record 30 PB of data for 2012. They aren’t even running at full steam yet!
Many other companies spoke at the event, but the talk that resonated with me most was the one by David Stephenson, Head of Business Analytics at eBay. It wasn’t the impressive stats, such as 70-80 billion database calls per day, 50+ TB of new data stored per day or the 100+ PB of data processed daily. It was what he called “The prize”:
“using behavioural data to understand our customers’ intent, preferences and decision-making processes”
This resonated so much with me because it is exactly “the prize” that I have been working towards with one of our (Smart421’s) customers: tapping into the rich vein of information available and using it to ensure that they engage with the customer in a more relevant and timely manner.
It really does come down to the four V’s: Volume, Velocity and Variety (i.e. Doug Laney’s “3Vs” construct of Big Data, mentioned in a previous blog post), plus one further crucial point, Value.
And actually the one we all really want to focus on is the fourth: VALUE! Otherwise, why are you doing the first three anyway, right?!
It is the VALUE of the data that we seek: seeing what else we have available to us that allows us to progress, build better applications, and communicate more effectively and more relevantly. Whatever your business is, it is the VALUE that you derive from your data that really counts.
One of the points that David Stephenson made was that 85% of the eBay analytical workload is new or unknown. You don’t need to know all the questions you need answers for when you start a “Big Data” programme. Just look at what you already have, see how this can be supplemented and what is relevant to your market or other areas of business, and take it from there!
You’ll be amazed at what you find and the impact that it can have on your business! It is not all about unstructured data, or installing and using Hadoop. It is about using your data, which will most likely fall into all three of the structured, semi-structured and unstructured camps, and no one tool is going to give you a solution. It is about realising that there is an untapped resource that can give you so much. So remember, it’s all about the four V’s (well… really… it is the last V we are all wanting to get from our “Big Data”).