Open Source


ForgeRock OpenAM version 11.0 highlights PowerPoint presentation slide on 15 October 2013 by John Barco, VP Product Management at ForgeRock. Photo: Kirsten Hurley


Last week I had the pleasure of attending the ForgeRock Open Identity Stack Summit: Europe in a chateau on the outskirts of Paris. The event was a sell-out but, keen not to turn away delegates, the ForgeRock team valiantly sacrificed their rooms to the extra attendees while they themselves stayed in a hotel half an hour away!

There was a real mix of attendees, from the technical community to procurement executives, and of course ForgeRock’s partner community. All were eager to hear product updates and to listen to industry experts such as Eve Maler (Principal Analyst at Forrester) on the trends they are witnessing across the world. Eve’s keynote on modern IAM and trends certainly caused a stir with her statement that XACML (eXtensible Access Control Markup Language) is now in clear decline, and that authorisation standards need to get finer-grained still. UMA (User-Managed Access) is the one to watch, apparently…

A lot of the messaging revolved around how IAM (Identity and Access Management) is moving to ‘IRM’ (Identity Relationship Management). This is largely driven by factors such as internet scale over enterprise scale – an inescapable requirement now that the user base is no longer restricted to employees but includes partners and customers too, accessing not just from on premises but over the internet and in the cloud. And that’s without even mentioning the number of devices each individual expects to be able to access your systems with!

It was also apparent why ForgeRock had taken the radical step to rebrand the latest upgrade from v10.2 to v11 when the new features were revealed (see photo below). ForgeRock has so rapidly developed an already market-leading product to become even simpler to integrate and deploy that the changes certainly justified the leap in nomenclature.

Finally, I cannot sign off without mentioning CEO Mike Ellis’s big announcement of the event – Salesforce and ForgeRock announced a partnership which will see the SaaS vendor integrate the open source software into its Identity Connect product.

If there is anyone out there who still wonders whether open source technology really has a place in the enterprise, surely this news that one of the world’s largest technology vendors sees fit to partner with them must mean that ForgeRock’s position in the IAM (or IRM!) market is confirmed?!

Please share this blog using the social icons, or via the short URL http://bit.ly/1aFumKp

Please Rate and Like this blog.  Our readers want to see YOUR opinion, so please take a moment to leave a Comment.

Today four other Smarties and I attended Norfolk’s first Mobile Development Conference at the Hethel Engineering Centre, which is right next to where they make Lotus Cars.

Conference Room

The tie-up between Hethel and Lotus was obvious: the main presentations were held in the Colin Chapman room (named after the founder of Lotus Cars), where one of Ayrton Senna’s “99T” F1 cars was mounted on the wall!

Mobile development is one of the most exciting and diverse areas in IT at the moment, and this conference covered it widely, from games developers like MonoGame to Tim Ferguson, Head of Digital at one of our customers, AVIVA, with the lessons learnt from their various mobile app innovations and experiments.

The keynote by Neil Garner of @Proxama resonated with me very much, from his memories of tech from past years (the Nokia 7110, the first WAP phone) to his honest assessment of NFC and his rebuttal of the doubters who don’t see NFC taking off now. The ARM TrustZone was highlighted by Neil as a key element in providing security for NFC applications. There are contactless terminals everywhere now, and 9 of the top 10 device manufacturers are signed up to support NFC – Apple is the odd one out, but aren’t they always?

Our own @JamesElsey1986 later showed that NFC is more flexible and powerful than you think using Android. James later tweeted:

Source code/slides from my #NFC workshop http://ow.ly/mDz7A  Feel free to ask questions / give feedback. Thanks for attending! #MobDevCon

Matt Lacey presented two sessions. His first, on tips for developing for Windows 8, included some real gems that will help us tailor our cross-platform apps to work well on the new Windows platforms. I agree with Matt, who worked on PhoneGap’s Windows integration code, that you have to be knowledgeable and experienced in developing native apps to be able to build successful cross-platform apps. Luckily Smart421 has a whole Microsoft practice to help us Java-oriented types out with that. Read Matt’s blog for more info and for his slides from his second presentation, on monetising apps.

I was first on to present after lunch and talked about our work delivering cross-platform mobile experiences with Worklight – my slides are now up on slideshare. There was a general theme at the conference that cross-platform tools are coming of age, and that the compromise in user experience and performance compared to native development is far outweighed by the faster delivery and lower overall cost of app development and maintenance. I just about managed to demo the new Worklight 6 Studio IDE and Console. I am really liking the improved jQuery Mobile integration and want to check out the new App Centre tools and automated testing when I get the chance.

Ruth John (@rumyra) of O2’s “The Lab” gave a kitty- and puppy-tastic presentation on Firefox OS and why Telefonica have taken it up, especially in the emerging South American markets – it’s free and works well on low-end handsets, with the operating system built on top of the Gecko layer much as Android is built on top of Linux. It will be really interesting to see if this will catch on in the UK and European markets in these times of austerity, when people are perhaps not quite ready to splash a few hundred every year on the latest iOS gadgets.

There was also a really enlightening “sponsor presentation” by Basho on reclaiming the terms “web scale”, “big data” and “DevOps”, and on how the NHS is using Basho’s open source Riak technology.

Massive thanks to Naked Element (Paul and Marie) and everyone involved in setting up the event, thanks to Hethel for such a great venue, the sponsors for the delicious lunch and the attendees for their support and kind comments.

P.S. Welcome to twitter @CharlesBSimms :-)

Some things take a bit of unravelling. But to solve mysteries, you don’t have to be Sexton Blake (doubt you remember him?).

With the help of search engines, a few analysts’ reports and a bit of time, the fog quickly clears to reveal (another?) new wave coming in the IT industry.

Only this time, we’re talking databases.

Databases?  Ok – not the sexiest of subjects – I grant you – but we would do well to note the emerging trend in NoSQL and in open source distributed datastores generally.

Fear not. SQL hasn’t suddenly abdicated its crown, or become the object of sordid revelations about its private life. Far from it. SQL has deservedly won its place in the history of computing, especially for transactional databases.

But apparently not all databases were created the same (all the vendors will tell you that… and show you their glossy marketing brochures to back up their assertions – right?).

Mystery solved – NoSQL means “Not Only” SQL

NoSQL doesn’t mean literally “No” SQL. And it is this “not only” aspect that is causing a bit of a stir. NoSQL databases are created in an entirely different way compared to traditional SQL databases.

In fact there are four main kinds:

Technology Landscape: NoSQL
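The four kinds – key-value stores, document stores, column stores and graph databases – can be sketched as toy Python data shapes (illustrative only, not real client APIs):

```python
# Toy sketches of the four main NoSQL data models (data shapes only).

# 1. Key-value store: an opaque value behind a single key.
kv = {"user:42": b'{"name": "Ada"}'}

# 2. Document store: the value is a structured, queryable document.
documents = {"user:42": {"name": "Ada", "tags": ["admin", "beta"]}}

# 3. Column store: each row holds named columns, which may vary per row.
columns = {
    "user:42": {"name": "Ada", "last_login": "2013-06-01"},
    "user:43": {"name": "Bob"},  # this row has no last_login column
}

# 4. Graph database: nodes plus typed relationships between them.
nodes = {"ada": {"kind": "person"}, "smart421": {"kind": "company"}}
edges = [("ada", "WORKS_FOR", "smart421")]
```

The differences look small at this scale; they matter because each shape dictates what the store can index, distribute and query efficiently.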

In his blog on 28 May, our CTO Robin mentioned one such technology, a graph database called Neo4j, which was one of the things that caught his eye at Big Data London.

I first heard Neo4j explained by Ian Robinson back in February this year at SyncConf. I was somewhat riveted by the capability of a graph database, which is regarded by many as a superset of all the others.

Here at Smart421, we have already been working with some of these on customer engagements – for example with Cassandra, one of the leading column stores, and MongoDB, which is arguably the leading document database, overtaking CouchDB.

If you’re a Solution Architect or Technical Architect, you will almost certainly be tracking these and several others.

If you’re a developer, programmer or involved in some capacity in DevOps, you will almost certainly have had a play or done something more serious with NoSQL (if not, why not?).

For what it’s worth, I’ve been quite impressed by some of what I’ve seen. Take Riak, a key-value distributed datastore by Basho which, although a comparatively young business, has an impressive management team exported out of Akamai and has already built a strong user base in the United States. Riak looks like it deserves more prominence over here; I’ll stick my neck out and predict it will rise to become a major name before too long. Basho will be sponsoring MobDevCon this July, where two “Smarties” will be speaking.

Basho will also be organising RICON Europe, a tech-led event for those interested in all things NoSQL, which will be coming to London in October (remember – you heard it here first).

NoSQL is on the up – it’s official

As a collective, NoSQL database management systems are on the move and picking up pace. Market analysts are tracking their progress carefully.

Gartner, for example, has predicted that NoSQL could reach 20 per cent market penetration as early as 2014, which seems rather astonishing until you see how Gartner arrives at its assumptions. Merv Adrian, ex-Forrester and now Research VP at Gartner (@merv), appears to have done his homework on this, and he is seeing NoSQL rise from basically a standing start.

As recently as 2012, Adrian quantified NoSQL Database Management Systems as having a market penetration of 1 per cent to 5 per cent of target audience (Adrian in Lapkin, 2012, pp. 36-38), upgrading his assessment in 2011 of NoSQL having a market penetration of less than 1 per cent of target audience (Adrian in Edjlali and Thoo, 2011, pp. 31-33).

Merv Adrian, and other market watchers, will be well worth listening to both this year and next if you get the chance at a Gartner Event, or if you have a Gartner research subscription perhaps you should request an inquiry call sooner rather than later.

References:

araven07 (2011) Introduction to Graph Databases. [Recording of presentation by E. Eifrem, 14 July 2011]. Available at <https://www.youtube.com/watch?v=UodTzseLh04> [accessed 23 May 2013].

Amazon Web Services (2013) AWS Marketplace: Riak. [Online]. Available at <https://aws.amazon.com/marketplace/pp/B00AMRXCQA/> [accessed 29 May 2013].

Adrian, M. (2012) Who’s Who in NoSQL DBMSs. Gartner. 07 Jun. G00228114.

Aslett, M. (2013) ‘Navigating 451 Research’s revised database landscape map’. 451 Research. 10 January. [Online]. Available <https://451research.com/report-short?entityId=75461> [accessed 25 May 2013].

Aslett, M. (2013) ‘451 Research survey highlights growing adoption of NoSQL databases’. 451 Research. 16 May. [Online]. Available <https://451research.com/report-short?entityId=77136> [accessed 25 May 2013].

De Castro, R. (2012) ‘Why I think Riak is a great NoSQL’ DZone. 30 July. [Online]. Available at <http://architects.dzone.com/articles/why-i-think-riak-great-nosql> [accessed 26 May 2013].

Eagle, L., Brooks, C. and Sadowski, A. (2013) ‘New wave databases in the cloud, part 3: SoftLayer and Basho’. 451 Research. 01 May. [Online]. Available <https://451research.com/report-short?entityId=76917> [accessed 27 May 2013].

Edjlali, R. and Thoo, E. (2011) Hype Cycle for Data Management, 2011. Gartner. 26 Jul. G00213386.

Eifrem, E. (2011) Overview of NoSQL. [Recording of presentation by E. Eifrem]. Available at <https://www.youtube.com/watch?v=sh1YACOK_bo> [accessed 23 May 2013].

Kovacs, K. (2013) Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs Neo4j vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris. [Online]. Available at <http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis> [accessed 08 June 2013].

Lapkin, A. (2012) Hype Cycle for Big Data, 2012. Gartner. 31 Jul. G00235042.

Novet, J. (2013) ‘Basho Technologies takes aim at more enterprises with upgrades’ GigaOM. 21 February. [Online]. Available at <http://gigaom.com/2013/02/21/basho-technologies-takes-aim-at-more-enterprises-with-upgrades/> [accessed 26 May 2013].

Ricon (2013) RICON 2013. Available at <http://ricon.io/>  [accessed 25 May 2013].

Villanovauniversity (2011) A Tour of the NoSQL World. [Recording of lecture by David Cassel, Senior Consultant of MarkLogic at Department of Computer Science at Villanova University, United States of America on 07 Nov 2011]. Available at <https://www.youtube.com/watch?v=nXQsykDfGBk> [accessed 27 May 2013].

Please Rate and Like this blog.  We welcome your Comments.

Tonight’s 18th Big Data London meetup lived up to the title – it was big. Held at the News International offices, it was the best-attended meetup I’ve been to, and really encouraging for the future of the UK economy – nice to see a really strong community in action, with 2000+ members and 300+ attending. The organiser Manu Marchal was saying that this is now the largest big data community in Europe.

Manu Marchal at the stand

And the reason it’s so well attended is because this is just a really hot technology area – hot as you like. It’s the perfect storm (ack) of…

  • really interesting challenges to solve that have not been solved before – this is what makes it really attractive I think, plus
  • more new technologies than you can shake a stick at – the obvious candidates (Hadoop, Pig, Hive, R etc) plus Apache Crunch, Avro, Cascading, etc
  • open source (predominantly)
  • cloud computing
  • a white hot jobs market – almost every speaker (including the hosts News International) was blatantly recruiting
  • and finally a valid reason to need to ask your boss to let you run a cluster of 100+ cores :)

Every presentation was of value. Rik from Neo Technology talked about Neo4j – about NoSQL and graph databases, somewhat tenuously related to big data, but informative all the same. He was clear about the pros and cons of graph databases: whilst maintaining ACID transactions (i.e. immediately rather than eventually consistent), Neo4j fundamentally scales well vertically but poorly horizontally – and there’s significant computer science research underway in this area. It’s horses for courses – graph databases have their place in a polyglot persistence architecture model for certain types of data and queries.

Mike from News International was pretty open about what they are doing in their big data internal “startup” – all running on AWS EMR, interestingly (i.e. not a Cloudera/MapR/Hortonworks distribution on AWS or elsewhere). He made a really good point that they favoured technologies they had deep skills in, and so had stuck with AWS EMR as their Hadoop platform since News International are big AWS users (600 instances or so, I think he mentioned) – moving on to the next shiny thing is all very well but pretty pointless if only Fred knows anything about it (and he’s just buffing up his CV for his next contract). They are processing about 20-30 data sources, with 2-3 of those producing about 20 GB/hour each.

After giving a brief overview of the search market, Costin from ElasticSearch gave a great demo of Kibana, a JavaScript-based frontend that provides an interface to ElasticSearch for real-time log analysis – a use case for ElasticSearch that I wouldn’t have expected. It struck me that a potential weakness of ElasticSearch is one of its strengths: it is so flexible and can be used in so many ways (full-text search, data analytics etc.) that it’s hard to say in a nutshell what it is or where you should use it.

The brief presentation at the end that really caught my attention was from Paul Salazar from Skytree. This is a US-based startup with some really interesting technology and algorithms to bring to bear on the discipline of machine learning. They have a pretty heavyweight set of founders and tech advisors (PhDs all over the place) and have found algorithmic and architectural mechanisms to speed up core machine learning algorithms. For example, taken from here: “head to head comparisons of their offering vs. R and WEKA. Administered using Amazon’s EC2 cloud and Sloan Digital Sky Survey (SDSS) public data. Skytree was faster in the tests displayed, including an all neighbors query in which it completed indexing in 4.2 seconds compared to 2,272 seconds for WEKA and more than 72,000 seconds for R”. They’ve achieved this mainly by finding ways to optimise algorithms that require NxN computations (for a dataset of N objects) down to N or log N computations – which makes Mahout look – well – like it’s riding a lumbering elephant. Of course, it’s not going to be cheap to buy this kind of performance advantage, so the pricing puts it firmly into enterprise territory, e.g. for customers who would get a financial payback from detecting outliers in their datasets that indicate fraudulent activity. Definitely one to watch.
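Skytree’s actual algorithms aren’t public, but the flavour of cutting an NxN computation down to sort-and-scan cost can be shown with a toy one-dimensional all-nearest-neighbours problem: brute force compares every pair, while after sorting each point’s nearest neighbour must be adjacent in the sorted order.

```python
import random

# Brute force: N*N distance computations.
def nn_brute(points):
    return {p: min((q for q in points if q != p), key=lambda q: abs(p - q))
            for p in points}

# Sort once (O(N log N)), then compare each point only with its two sorted
# neighbours -- in one dimension the nearest neighbour is always adjacent.
def nn_sorted(points):
    s = sorted(points)
    result = {}
    for i, p in enumerate(s):
        candidates = [s[j] for j in (i - 1, i + 1) if 0 <= j < len(s)]
        result[p] = min(candidates, key=lambda q: abs(p - q))
    return result

random.seed(0)
pts = random.sample(range(10000), 500)
brute, fast = nn_brute(pts), nn_sorted(pts)
# Both find a nearest neighbour at the same distance for every point.
```

The same trick generalises to higher dimensions via space-partitioning trees (kd-trees and the like), which is the territory the Skytree comparison above is playing in.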

I think there’s probably an interesting big data project to use machine learning to investigate correlations between meetup community size and activity, location and quality of venue, quality of catering (very high at this event), etc. I’ll leave that as an exercise for the reader…

Old record player

Photo: Old record player by Grafphotogpaher (c) Dreamstime Stock Free Images

Those of you accustomed to developing Android applications will be familiar with the mechanisms used for building Android apps, such as having the IDE do the work for you, which is somewhat limited, or breaking out and writing ant scripts to orchestrate a build. Some of the more adventurous amongst you may be using the maven-android-plugin.

It appears that the Google Android build tools team has come to terms with the fact that ant probably isn’t the best option these days, and is moving towards replacing ant with gradle in the SDK in the future.

Last night I was fortunate enough to attend a Meetup.com event with the London Android Group (@londroid) at the SkillsMatter (@skillsmatter) HQ, where Hans Dockter (@hans_d), the CEO of gradleware gave a presentation on what gradle is, and how it can be used for building android projects, enabling developers to manage their builds with groovy based scripts rather than XML.

What is a build framework anyway?

As Hans put it so well, in short, a build framework will “compile stuff, copy things around, and drop out an archive“.

What’s so limiting about the current options?

Ant can be considered imperative, meaning that you have to spoon-feed it instructions via tasks: compile these files that reside here, copy them to this directory, jar it up, etc. It’s not really capable of figuring things out for itself.

Maven, on the other hand, is considered declarative, meaning that you focus more on the outputs of a task, such as “I want a web application”. Providing you have your source in the right place, maven is smart enough to figure out where to find it, how to compile it, and what the output for a web application should look like. Essentially you tell maven what the inputs are and what you expect to get out, and maven figures out the bit in between, avoiding the need to script tasks as you would with ant.

Sounds great, so what does gradle bring to the android party?

Free beer! If only…but we get the next best thing.

Gradle attempts to take the best parts of ant and maven; by using it to build your Android projects you can benefit from:

  • Very light and easy to manage build files, no trawling through humungous XML files. (gradle vs maven).
  • Gradle follows convention over configuration like maven. It knows where your source is for a java project (unless you decide to override it), you can read more about convention over configuration here.
  • Flexible dependency management, integrate with existing maven/ivy repositories. Different versions of dependencies for different build tasks? No problem.
  • It gives you the freedom and flexibility to define your own custom behaviour without needing to write plugins as you would if using maven. Groovy is your friend here.
  • Support for multiple projects. You don’t need to have separate projects for production code and integration tests; you can keep them within the same project and define different source sets. This greatly reduces the parent/child projects that can be a chore to maintain.
  • Don’t have gradle installed? Can’t install gradle easily? No worries, there is gradle wrapper for that. This is particularly useful on CloudBees Jenkins environments where you don’t have access to install gradle directly.
  • You have a free and paid for version of the app, with some common shared library between them? Gradle handles this perfectly via product flavours
  • In addition to free and paid for flavours, you also have builds for different architectures such as ARM and x86? Flavour groups will help you there.
  • You’re not tied to an IDE, the build scripts should be IDE independent so your team can choose their own flavour of IDE, or build from the command line as you would on a jenkins environment.
  • Don’t want to use Eclipse, prefer IntelliJ instead? No worries, apply the idea plugin and run gradle cleanIdea idea. Boom, idea project is setup and ready to go, no messing around with project settings.
  • Easily run instrumentation tests on multiple virtual devices, no need for manual testing each time you make a change.
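As a sketch of the product flavours point above, a hypothetical build.gradle fragment might look like this (the flavour names and package names are made up, and the exact DSL properties vary between versions of the android gradle plugin):

```groovy
// Hypothetical build.gradle fragment: one codebase, free and paid variants.
android {
    productFlavors {
        free {
            packageName "com.example.myapp.free"
        }
        paid {
            packageName "com.example.myapp.paid"
        }
    }
}
```

Each flavour can then layer its own source set (e.g. src/free/java) over the shared code, and the build produces a variant per flavour.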

Getting started?

  1. Watch the presentation
  2. Read through the android build tools page, plenty of information here on how to use the android gradle plugin
  3. Have a read of the gradle documentation, it’s very well documented.
  4. Check out the samples on github
  5. Have a browse of the groovy documentation so you can understand the basic syntax of the language.
  6. code, code, code!

Make sure you join the London Android group on meetup.com and look out for future events like this – they’re free and well worth the train ticket. For those in the south east, be sure to check out SyncIpswich and SyncNorwich for free technology meetups.

As my conference season is fast approaching, I have been looking at what I will be wearing this year. Well, Open Identity, Open Infrastructure and Open Integration are the themes of my Spring collection.

First up is Open Identity with the Gartner Identity and Access Management Summit ( #GartnerIAM ). I’m heading straight for the ForgeRock stand which I think will be buzzing this year as we’re seeing interest taking off in their products – and our partnership is really starting to get into its stride too.

Next up will be Open Infrastructure at the AWS Summit in London. This will be especially interesting following Smart421 winning two major AWS contracts recently and starting to operate in the SIAM role for National Rail Enquiries. My expectation is that this time around many more enterprises will be declaring their AWS credentials – which is closer to my personal experience.

Last but not least, it’s Open Integration, back at Gartner for the Application Architecture, Development & Integration Summit ( #GartnerAADI ) in London on 16-17 May. This time we’re showcasing how Smart421 is turning the Service Factory concept “inside-out” to create the Open Enterprise. If you want to see more then come along and meet us there.

All three themes tie back to how we see the Future of Architecture developing, and that’s getting quite exciting too.

DATABASE at Postmasters, March 2009. Photo: Michael Mandiberg

“NoSQL” is an unfortunate term for the current hype around non-relational database systems. Many of the ideas behind the new wave of databases presented at the NoSQL Roadshow in London are not new. More than one presenter used the term preSQL, and during a break Brian Bulkowski from Aerospike explained that Oracle had created these ideas many times over, but had not considered them commercially worthwhile. What has changed in recent years, though, is the business need. There are new problems to solve, such as online and mobile gaming and advert serving, that require users to trade off consistency, availability, and partition tolerance.

For example, Amazon require a horizontally scalable system (partition tolerance) so that whenever a customer adds an item to their shopping cart – and it can be at any time of day – it is captured in the database (availability). They are less concerned about the time it takes for this to filter through their other data stores (eventual consistency), or how long it takes to fulfil the order. In the mid-2000s Amazon realised that the relational databases of the day were not meeting their needs, and so they created Dynamo, a highly available and scalable technology for key-value storage.
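A minimal sketch of that eventual-consistency trade-off, assuming a toy in-memory model (this illustrates the idea only, not Dynamo’s actual design):

```python
# Toy model of eventual consistency: a write is acknowledged by one
# replica immediately (availability) and pushed to the others later,
# so a read from another replica may briefly see stale data.
class Replica:
    def __init__(self):
        self.data = {}

replicas = [Replica(), Replica(), Replica()]
pending = []  # replication log: writes not yet propagated


def put(key, value):
    replicas[0].data[key] = value   # acknowledged straight away
    pending.append((key, value))    # other replicas catch up later


def anti_entropy():
    # background sync step: push every pending write to all replicas
    while pending:
        key, value = pending.pop(0)
        for replica in replicas:
            replica.data[key] = value


put("cart:42", ["book"])
stale_read = replicas[2].data.get("cart:42")  # still None: not yet synced
anti_entropy()                                # now all replicas converge
```

Real systems replace the pending list with mechanisms such as hinted handoff and read repair; the point is only that availability is bought by deferring propagation.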

Papers by Amazon on Dynamo, and Google on their technology BigTable, were a major contributor to the current early adopter market. Today there are many competitor NoSQL products, including the aggregate-oriented databases (key-value stores, document stores and column stores), and graph databases.

Key-value stores are the simplest store type, with keys mapping to binary objects. They allow low-latency writes and scale easily across multiple servers, but can only offer single key/value access. David Dawson and Marcus Kern of MIG gave an example of using the key-value store Riak as a persistence store for a bespoke queuing system in their SMS gateway product. The biggest difficulty was finding a way to simplify the retrieval of messages in the event of a node failure; their solution involved the use of predictable keys.
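The MIG talk didn’t publish implementation detail, but the predictable-keys idea can be sketched as follows: because a key-value store offers only single-key access, a consumer recovering from a node failure regenerates candidate keys from the queue name and a sequence number rather than querying. The key format and function names below are hypothetical.

```python
# Hypothetical sketch of "predictable keys" on a key-value store (a plain
# dict stands in for the distributed store; this is not MIG's actual code).
store = {}

def key_for(queue, seq):
    # Keys are derived, not discovered, so they can be regenerated later.
    return "%s:%08d" % (queue, seq)

def enqueue(queue, seq, message):
    store[key_for(queue, seq)] = message

def recover(queue, max_seq):
    # After a node failure, re-enumerate the candidate keys and fetch
    # whatever survives -- no range query or secondary index needed.
    return [(seq, store[key_for(queue, seq)])
            for seq in range(max_seq + 1)
            if key_for(queue, seq) in store]

enqueue("sms-out", 0, "hello")
enqueue("sms-out", 2, "world")   # message 1 was lost with a failed node
```

The zero-padded sequence number keeps keys sortable and enumerable, which is what makes recovery by regeneration possible at all.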

Column stores manage structured data, stored in columns, with multiple-attribute access. Apache Cassandra, originally developed by Facebook, is a well known example. These are also optimised for writes and used for high throughput uses such as activity feeds or message queues.

In document stores, keys are unique references to “documents” which encapsulate and encode data in a standard format. These database systems hold hierarchical data structures that reduce the need for table joins and allow for variety and evolving schemas. Akmal B. Chaudhri of IBM presented his investigations into the popularity of the various NoSQL offerings, showing that MongoDB, a document store for JSON documents, is leading the way. Some of our Smart consultants are using MongoDB from an application development perspective and are very positive about their experience so far. It will be interesting to follow these projects and understand the effects on administration, support and future change. What impact will the schemaless nature of the database have?
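Why a schemaless document store eases evolving schemas can be shown with a toy store of JSON documents (a hypothetical shape, not MongoDB’s actual API):

```python
import json

# Toy schemaless document store: each key maps to a JSON document, and
# documents need not share a schema (hypothetical, not MongoDB's API).
docs = {}

def insert(doc_id, document):
    docs[doc_id] = json.loads(json.dumps(document))  # keep plain JSON data

insert("order:1", {"customer": "Ada", "items": ["book"]})
# A later document can carry a field the earlier one lacks -- no migration:
insert("order:2", {"customer": "Bob", "items": ["pen"], "gift_wrap": True})

gift_wrapped = [d["customer"] for d in docs.values() if d.get("gift_wrap")]
```

The flexibility is bought at query time: code must cope with fields that may be absent, which is exactly the administration and change question raised above.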

Graph databases use the concept of nodes and edges to store information about entities and the relationships between them. Jim Webber of Neo Technology gave an example of the use of the open source Neo4j graph database in modelling the relationships in Doctor Who.
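The node-and-edge model can be sketched with a few tuples and a traversal function (Neo4j itself is queried through its own APIs and query language; the example relationships below are invented for illustration):

```python
# Toy adjacency-list model of a graph database: nodes joined by typed,
# directed relationships (example data invented for illustration).
edges = [
    ("The Doctor", "ENEMY_OF", "The Daleks"),
    ("The Doctor", "TRAVELS_WITH", "Amy Pond"),
    ("The Daleks", "CREATED_BY", "Davros"),
]

def related(node, rel_type):
    # follow only outgoing edges of the given relationship type
    return [dst for src, rel, dst in edges if src == node and rel == rel_type]
```

A relational schema would need join tables and self-joins to answer the same question; in the graph model the relationship is the first-class thing you traverse.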

Each type of database system has its own strengths and weaknesses, and the reality is that NoSQL databases will only be used as part of a solution. Choosing different data storage technologies for different persistence needs has been termed polyglot persistence. Every example provided at the roadshow included relational database systems alongside NoSQL technology. Wes Biggs of Adfonics, an independent advertising marketplace, explained the architecture of their solution for buying advertising on mobile devices on behalf of advertising agencies. They use relational MySQL with data on hard drives for long-running information such as campaign details, MySQL Cluster with data on flash drives for aggregate instructions such as user details, and Aerospike with data in RAM for raw instructions such as the in-flight data.
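That polyglot routing can be sketched as a simple dispatcher sending each category of data to the store suited to it (the store objects and names below are stand-ins for MySQL, MySQL Cluster and Aerospike, not Adfonics’ real code):

```python
# Sketch of polyglot persistence: route each category of data to the
# store best suited to it (all names here are illustrative stand-ins).
class Store:
    def __init__(self, description):
        self.description = description
        self.data = {}

disk_rdbms = Store("relational, on hard drives")   # campaign details
flash_rdbms = Store("SQL cluster, on flash")       # aggregates, user details
ram_kv = Store("key-value, in RAM")                # raw in-flight data

routes = {"campaign": disk_rdbms, "user": flash_rdbms, "inflight": ram_kv}

def save(category, key, value):
    routes[category].data[key] = value

save("campaign", "c1", {"budget": 10000})
save("inflight", "bid:77", {"cpm": 0.4})
```

The design choice is the same one Adfonics described: match durability and latency requirements per data category, rather than forcing everything through one store.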

The overall message from the NoSQL Roadshow was that this is still very early days for NoSQL database systems, and it is not yet clear whether the future popularity will be closer to that of OO or relational databases.

Many presenters offered words of caution. Security is a big issue, and many of the major NoSQL database systems must be run in a trusted environment. Wes Biggs discussed the huge number of vendors and the lack of evidence for their claims. At the moment any system choice is basically faith-based, and should only be made if someone else has already used the system for a similar use case; they were burned by a few early choices for a key/value store before they settled on Aerospike.

The Fusion-io presentation on their new directFS filesystem for flash drives was just one example of the obsession with performance, and of the importance of the hardware architecture.

At the moment there is some exciting potential for NoSQL technology, but anyone getting involved at this stage will be making some brave choices. It will be fascinating to see how the market shakes out over the next few months and years, and I’ll be following with interest.

Update: 18 Dec 2012 – interested to learn more about MongoDB ? Click here.

Please Like and Rate this blog. If you can, please leave a Comment.

I attended the Big Data Analytics 2012 event at the Victoria Park Plaza, London (organised by Whitehall Media) yesterday, along with our CTO, Robin Meehan. We wanted to keep in touch with what some of the big players are saying about “Big Data” and their views on analysis, with Cloudera, Oracle, SAP, Informatica, SAS UK, SGI, MapR Technologies, GigaSpaces, MicroStrategy, Pentaho etc. all there (and furnished with masses of pens, notepads, pen drives, stands etc.).

The event itself was good; the usual mix of CIOs, CTOs and techies in attendance. A number of the keynote speakers (in and amongst the sales pitches) had some interesting stories and facts, such as John O’Donovan, Director of Technical Architecture & Development at the Press Association, talking about how they analysed the “masses of data” captured during the London 2012 Olympics to deliver “Content as a service” to consumers around the globe (including translation of the content en route), supporting up to 50K TPS. This was followed by a great clip explaining the maths behind Roberto Carlos’s improbable goal – it’s worth a look – click here to watch on YouTube.

Where do I start?

Bob Jones, Head of the CERN Openlab project, gave us an insight into some of the “Big Data” challenges they are facing at CERN, and with data generated at 1 petabyte per second it is easy to see why! Even after throwing away most of that data they still permanently store between 4 and 6 GB per second, are on course to record 30 PB of data for 2012, and they aren’t even running at full steam yet!

Many other companies spoke at the event but the one that resonated with me most was the one by David Stephenson, Head of Business Analytics at eBay. It wasn’t the impressive stats such as 70-80 billion database calls per day, 50+ TB of new data stored per day or the 100+ PB of data processed daily.  It was what he called “The prize”:

“using behavioural data to understand our customers’ intent, preferences and decision-making processes”

The reason this resonated so much with me is that this is exactly “the prize” that I have been working towards with one of our (Smart421’s) customers – tapping into the rich vein of information available and utilising it to ensure that they engage with the customer in a more relevant and timely manner.

It really does come down to the four V’s (i.e. Doug Laney’s “3Vs” construct of Big Data, mentioned in a previous blog here, plus one further crucial point):

  • VOLUME
  • VARIETY
  • VELOCITY
  • VALUE

And actually the one we all really want to focus on is the fourth: VALUE! Otherwise, why are you doing the first three anyway – right?!

It is the VALUE of the data that we seek: seeing what else we have available to us to allow us to progress, make better applications, and communicate more effectively and more relevantly. Whatever your business is, it is the VALUE that you derive from your data that really counts.

One of the points that David Stephenson made was that 85% of the eBay analytical workload is new or unknown – you don’t need to know all the questions you need answers to when you start a “Big Data” programme. Just look at what you already have, see how it can be supplemented and what is relevant to your market or other areas of the business, and take it from there!

You’ll be amazed at what you find and the impact that it can have on your business! It is not all about unstructured data, or installing and using Hadoop; it is about using your data, and this will most likely fall into all three of the structured, semi-structured and unstructured camps. No one tool is going to give you a solution – it is about realising that there is an untapped resource that can give you so much. So remember, it’s all about the four V’s (well… really… it is the last V we all want from our “Big Data”).

Having recently spent time working on the IBM Worklight platform, I thought it would only be fair if I documented some of my findings. No disrespect to the IBM’ers, but it’s reasonably fair to say that the documentation is a little sparse in places, so let’s give a little back to the community by discussing some of the hurdles. Let’s not dwell on what Worklight is – Andy has already covered this well in a previous post – and instead dive right into some of the technical aspects.

General Thoughts

Development on the whole is a relatively straightforward process, even for someone like myself who often steers well clear of anything involving web presentation technologies (it reminds me of dark nights in the university labs, spending hours trying to get a button to align correctly the night before coursework submission *shudder*).

The Worklight Eclipse plugin provides a good drag & drop GUI builder, but with support only for Dojo. I opted to drop Dojo and go for jQuery. jQuery is very well documented, and it is easy to get help should you require it. One of the main things I like about jQuery is its showcase and examples, which are documented very well, so the learning curve is generally quite small. There is also the ThemeRoller: it becomes incredibly easy to customise the default colour scheme and drop the generated CSS into your app. It always amazes me how excited the marketing guys get if you can add the corporate colour scheme to your app (thanks Joseph!).

Continuous Integration

We’re big fans of CI here, so I was quite keen to understand how easy it would be to have our Worklight apps built from the command line, and ultimately on a Jenkins CI box. The chaps over at IBM have done a fantastic job of exposing an array of Ant tasks that help with building and deploying apps; you’ll almost certainly want to read through module 42 on the Getting Started page, which covers these tasks:

  • adapter-builder – Use this task to build your adapter and create the .adapter file
  • adapter-deployer – Use this to deploy a .adapter file to a Worklight server (very useful for deploying to a remote AWS instance)
  • war-builder – Use this to build the server .war file that you will deploy to the application server (some manual tweaks are required)
  • app-builder – Use this to build the .wlapp files that you will deploy into your Worklight container
  • app-deployer – Use this to deploy your .wlapp files onto a Worklight server (useful again for remote deployments)

Let’s have a closer look at each of those targets, and how we’re using them here at Smart421:

Getting the party started, with init

Firstly, grab the Worklight Ant jar (you’ll need to have purchased the WL Enterprise edition for this) and add it into your Ant context like so:

<target name="init">
  <echo message="Loading ANT Tool"/>
  <taskdef resource="com/worklight/ant/defaults.properties">
    <classpath>
      <pathelement location="./build-config/worklight-ant.jar"/>
    </classpath>
  </taskdef>
  <property environment="env"/>
</target>

Now you’re free to use the ant tasks anywhere in your build script.

Building & Deploying WL Adapters

You need to build each adapter individually, and then deploy each one. You can create the following ant targets to do that for you:

<target name="buildAdapters" depends="init">
  <echo message="Building all adapters"/>
  <adapter-builder
    folder="./adapters/TwitterAdapter"
    destinationfolder="./bin"/>
  <!-- Build your other adapters here, same as above -->
</target>

<target name="deployAdapters" depends="init">
  <property name="WLSERVERHOST" value="http://my_aws_ip_here:8080/SmartConf"/>
  <echo message="Deploying all adapters"/>
  <adapter-deployer
    worklightServerHost="${WLSERVERHOST}"
    deployable="./bin/TwitterAdapter.adapter"/>
  <!-- Deploy your other adapters here, same as above -->
</target>

Building the Server WAR

You can build the server war file using the war-builder task, as shown below. It is important to note, however, that I needed to do some tweaking to the war file to avoid post-installation configuration tasks. According to the Worklight forums, there doesn’t appear to be a way to include files in the WEB-INF when the war is created, which means that once you’ve expanded the war on the application server you’d need to manually replace the default web.xml and context.xml files (to set your datasources). This can be quite frustrating, so in true Blue Peter fashion, I’m updating the war file with files I created earlier.

<target name="warBuilder" depends="init">
  <echo message="Building the war file"/>
  <war-builder
    projectfolder="./"
    destinationfolder="./bin"
    warfile="./bin/SmartConf.war"
    classesFolder="./bin/classes"/>
</target>

<target name="updateWar">
  <echo message="Updating the war file"/>
  <war destfile="./bin/SmartConf.war" update="true" webxml="./build-config/web.xml">
    <metainf dir="./build-config" includes="context.xml"/>
  </war>
</target>

Building & Deploying the WL Apps

You’ll also want to automate the building and deploying of the wlapp files; you can do this with the following:

<target name="buildApps">
  <echo message="Building all WL Apps"/>
  <app-builder
    applicationFolder="./apps/Smartconf"
    nativeProjectPrefix="SmartConf"
    outputfolder="./bin"/>
</target>

<target name="deployApps">
  <property name="WLSERVERHOST" value="http://my_aws_ip_here:8080/SmartConf"/>
  <echo message="Deploying all WL Apps"/>
  <app-deployer
    worklightServerHost="${WLSERVERHOST}"
    deployable="./bin/SmartConf-all.wlapp"/>
</target>

Building the Native Application Distributable Binaries

You’ve survived this far, and I’m thankful to you for that; however, we’re not quite finished yet. Worklight will generate the native projects for you, but it’s your own responsibility to take those project directories and build the Android APK, the iOS IPA, etc. IBM draw the line at this point, so you need to build them yourself. You can do this for all of the environments quite easily using additional Ant tasks; Android is the easiest:

<target name="client-android" depends="buildAndroid">
  <!-- Run the android native build, in its own directory -->
  <ant antfile="./apps/SmartConf/android/native/build.xml" target="release" useNativeBasedir="true"/>
  <!-- Copy up the apk into the bin area, for consistency -->
  <copy file="./apps/SmartConf/android/native/bin/SmartConf-release-unsigned.apk" tofile="./bin/SmartConfSmartConfAndroid.apk" overwrite="true"/>
</target>

Building BlackBerry and iOS apps from the command line is slightly more involved, and I feel they warrant their own blog post; alternatively, get in touch and we’d be glad to offer some assistance. Bear in mind you will need an Apple Mac to build iOS, for which we’ve installed a shared box in our build environment.
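Pulling the pieces above together, a single aggregate target is enough for the Jenkins job to call. This is just a sketch using the target names from the snippets in this post; wire in your own test or signing steps as needed:

```xml
<!-- Hypothetical top-level target for the Jenkins job: chains together the
     targets described in this post, in build order. -->
<target name="ci"
        depends="buildAdapters, warBuilder, updateWar, buildApps, client-android"
        description="Builds adapters, server war, wlapp files and the Android apk"/>
```

Jenkins then just needs to invoke `ant ci` as its build step, with the deploy targets reserved for a separate job or a later pipeline stage.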

Other Gotchas

As with taking on board any emerging technology, there will always be plenty of head-scratching moments where the documentation is thin and Uncle Google doesn’t provide much help. Fortunately for you, we’re a nice bunch of guys here at Smart421, so we’ll share some of the things that had us pondering over a coffee:

  • The trailing “/” in the Worklight server host URL is required, don’t ask why, it just is.
  • The versioning conventions for Worklight are a little strange: 5.0.0.270 = v5.0 GA, but the developer edition is 5.0.2.407-developer-edition = 5.0.0.3.
  • If you have an existing 5.0.0.2 WL server installation, don’t upgrade it to 5.0.0.3 – it fails to upgrade all components and leaves you with some obscure error messages that are hard to trace. The best plan of action is to uninstall and install again, making sure you check for updates at the time of installing, via the wizard.
  • App crashes with Unreachable host? When you build and deploy the app to your device, it has the WL server IP hardcoded into it. The next day when you arrive at the office and hop onto the Wifi, DHCP gives you a different IP address…It’s a classic schoolboy error, but catches us out from time to time. A simple solution if you don’t have a spare box lying around is to install the Worklight server on AWS and deploy to the cloud, bearing in mind that it needs to be open to your mobile devices over the Internet in a real-life installation anyway.
  • Results is undefined on adapter call. A subtle difference here: HTTP adapters use invocationResult.results, whereas SQL adapters use invocationResult.result. That last character makes all the difference.
  • Response cannot be parsed, please contact support; this is an annoying error that you often see in the developer preview, just make sure you set the body onload to WL.Client.init() as mentioned here.
  • Unable to use geolocation services on android? You’re probably seeing Caught security exception registering for location updates from the system, this should only happen in DumpRenderTree. Make sure you have the geolocations permission in your android manifest as detailed here.
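For that last gotcha, the manifest entry itself is a one-liner. A sketch of the relevant fragment of AndroidManifest.xml (the fine-grained permission shown here; use ACCESS_COARSE_LOCATION if network-level accuracy is enough):

```xml
<!-- AndroidManifest.xml: permission required for GPS-based geolocation.
     Goes inside the <manifest> element, alongside any other permissions. -->
<uses-permission android:name="android.permission.ACCESS_FINE_LOCATION"/>
```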

Conclusion

On the whole, I was very impressed with Worklight; it offers a lot of functionality over and above a standard Cordova project. Some of the errors I’ve encountered have been a little frustrating, as often my only source of help was the forums, but I can accept that it is a product in the early stages of adoption, and it will probably go very far. I’m looking forward to working with it in the future.

If you’d like to have a look at some of the apps we’re creating, or generally just want a chat about Worklight and some of its capabilities, or Mobility in general, we’d love to hear from you.

In part 1 I set out my requirements for evaluating and choosing an open source ESB, then promptly fall down a rabbit hole of OSGi and Maven as I discover how much more there is to them than I was previously aware of.

From time to time we get requests on how to get started with middleware technology on the cheap. Here the emphasis is just on connecting service providers and consumers, without getting into anything fancy like orchestration or service repositories. Of course, in an ideal world the solution should not rule the latter out.
So here are the requirements I can filter out of a few of these conversations:

  • Open source – for low cost, and wider options when upgrades appear (i.e. not always being forced onto the latest version).
  • Must handle WS-Security, WS-Addressing
  • Freedom for choosing java-XML binding framework
  • Supports contract-first service design (as opposed to generating the service artefacts (WSDLs, schemata) from java classes).
  • Run-time is ‘light’: i.e. when service-enabling components are deployed on the same machine as an application which will be service enabled, these service-enabling components do not gobble up all of the resources.

Contract-first development is very important in a heterogeneous environment. See the arguments on the object–XML impedance mismatch here. Another way of putting it is: if you are going to stick with just one language (e.g. java) then why bother with XML in the first place – just go with some RMI technology, such as RMI-IIOP. If we are using web services, then interoperability is a big consideration, and for that we have to think contract-first.
One of the reasons for separating the java-XML binding from the web-service end-point binding code is that it is great to use the same pojo to describe an entity, whether it is serialised to XML, persisted to a database or just used as a value object.
On the one hand it is good practice to work with web services contract-first; on the other hand, if you use the code (specifically the domain objects to which the XML is bound) throughout your application then you can introduce a dependency on the generated XML marshalling classes, which is not great either. In an automated build environment, it means building your gen-src directory from the schema before you can write any src code which uses it.
In the past I have got around this by generating class libraries of domain objects using JAXB, and then importing the resulting jar into any code (both client and server side) which manipulated these objects. The compromise at the time was that I ended up writing my own endpoints (servlets) to expose web services – which is OK when there is not much WS-* (e.g. Addressing, Security) going on.
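As an illustration of that approach, the domain-object jar can be produced with the JAXB xjc Ant task. The schema path, package name and output locations below are hypothetical; the taskdef classname is the standard one shipped in the JAXB tools jar:

```xml
<!-- Sketch: generate JAXB domain objects from a schema, compile them, and
     package them as a jar that both client and server code can import.
     Paths and the package name are examples only. -->
<taskdef name="xjc" classname="com.sun.tools.xjc.XJCTask">
  <classpath>
    <fileset dir="./lib/jaxb" includes="*.jar"/>
  </classpath>
</taskdef>

<target name="domain-jar">
  <xjc schema="./schema/customer.xsd" destdir="./gen-src" package="com.example.domain"/>
  <javac srcdir="./gen-src" destdir="./gen-classes"/>
  <jar destfile="./dist/domain-objects.jar" basedir="./gen-classes"/>
</target>
```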

I wanted to see if the latest incarnation of the open source frameworks would enable contract-first development, generation of domain objects first (such that they could also be used in persistence layers and as value objects), and relatively easy handling of WS-Security and WS-Addressing.
The new kids on the block for me are ServiceMix (a.k.a. FUSE), Apache CXF, Spring-WS and Sun’s JAX-WS.

The previous time, the players had been Apache Axis, WSIF and JAX-RPC – oh, I almost forgot Castor. Every one of these had its own java-XML binding code, and none of the produced beans were interoperable between frameworks. Stand-alone java-XML binding frameworks like JAXB (1.x) were not interoperable with the objects generated by the web-service bindings (e.g. JAX-RPC).

Anyway: enough of the background… The first two I wanted to look at were FUSE and Spring-WS, as they both allow contract-first development (Spring-WS will not allow anything else) and they both support Spring and its IoC (another Good Thing, but that’s a different discussion).

I had only got around to looking at FUSE when I fell down the first rabbit hole: OSGi and Maven. I have had a look at the excellent video tutorials by Adrian Trenaman of Progress (formerly IONA) Software (see the demo videos tab at the bottom of this page).
I had been aware of Maven for a while as a ‘slightly better Ant’, but the demo and a bit more digging around reveal there are two big extra features in Maven which move the game on a whole lot more:
Firstly, there is the project template feature. This is the feature whereby you can create a java project with a single Maven command-line invocation. The command builds the appropriate directory structure and even the correct pom.xml (the Maven equivalent of an Ant build.xml file). Although I had been shown this before, it has only really sunk in this time what a big deal it is.

We have in the past put a lot of energy into our automated build system, based around Ant. For it to work well, there is a mandated project directory structure, and a set of files and libraries have to be in the right places relative to each other. There’s a bit of a learning curve on top of Ant to understand what is going on. The template projects from Maven give you all of that in one go. It becomes especially evident when you try a new type of project, for example an OSGi plugin project: you just run Maven with the proper archetype and bingo…
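For a flavour of what an archetype gives you: a freshly generated project contains the standard directory layout plus a minimal pom.xml along these lines (the groupId and artifactId are whatever you supplied on the command line; this is a sketch, not the exact output of any particular archetype):

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>my-app</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>
  <!-- src/main/java, src/test/java etc. are created alongside this file -->
</project>
```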

Secondly, there is the repository system. You can configure a set of remote repositories, and just by specifying a library in your project file (pom.xml) the Maven build will try to fetch that library – in the version you specify – to a local repository on your machine, which is then shared amongst all of your projects. Again, you notice how powerful this is when you download a Maven-enabled project and on the first build it goes and fetches all of the libraries it depends on – unless you already have them locally. A large number of common shared libraries (e.g. most of the well-known apache projects) are available in the default repositories. It is possible to configure which external repositories are trusted and should be used.
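Declaring such a dependency is only a few lines in the pom.xml; on the next build Maven resolves it from the local repository, or downloads it from a remote one first. For example (the library and version shown are just illustrative):

```xml
<dependencies>
  <!-- Maven fetches this jar (and anything it depends on) automatically -->
  <dependency>
    <groupId>commons-lang</groupId>
    <artifactId>commons-lang</artifactId>
    <version>2.6</version>
  </dependency>
</dependencies>
```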

The repository system has become effectively just another resource, to the extent that when installing an OSGi bundle from the OSGi console (more on this next time), ‘mvn:’ is named as the protocol type for a given resource. The resource is then seamlessly retrieved, either from local storage or from one of the configured remote repositories.

All clever stuff.

So, from starting to look at open source middleware, I have fallen down a couple of rabbit holes. The Maven excursion is definitely going to make me sit up and give it a much closer look (talk about being a late adopter!). The second rabbit hole for me was OSGi; more on that next time. Then it will be back on track for the open source middleware.

