I've often said that data analytics and cloud computing were made for each other. In fact, I believe this so strongly that I've included it in talks I've given at various conferences and academic institutions. It's clear that the user community has adopted Hadoop as a de facto standard analytics tool, running on the infrastructure of whichever cloud service provider it happens to use. The question is, how will this evolve in the coming years?
AWS was the first cloud service provider (CSP) to offer a data analytics platform on demand. Soon thereafter, and as Hadoop matured, other CSPs followed suit, such as Google, Rackspace and Greenplum (acquired by EMC in 2010 for $300 million and subsequently spun off under Pivotal and rebranded to HAWQ), among others. I don't think anyone will dispute that this signaled the start of the initial wave of adoption of large-scale data analytics by innovators and early adopters.
Clearly, larger organizations constrained by regulations and laws will opt to build in-house. The corollary to this is that, for smaller-scale use and for the vast majority of adopters, the clear path is to use these low-cost service providers as a test bed for onboarding this new paradigm.
Accenture took a stab at this (video of presentation here, deck here, white paper here) and did a relatively good job (albeit slightly dated, from 2013) of describing the TCO of an on-premises deployment ($21,845). It then uses that figure as the budget for AWS, which yields an estimated number of instances that can be purchased in each of three potential flavors (68x m1.xlarge, 20x m2.4xlarge, 13x cc.8xlarge). This model slightly oversimplifies the acquisition of a number of servers and assumes a refresh cycle of some sort. Take that for what it's worth.
That said, organizations using public CSP services need to do some extra math to figure out if there is an inflection point; in other words, does it become more expensive to grow beyond a certain threshold in an on-demand environment than to deploy in-house on bare metal? There is no easy answer to this question because it's going to depend on the organization's preference for hardware vendor, utilization rates, labor, etc.
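To make that "extra math" concrete, here is a minimal sketch of a break-even calculation. It assumes a deliberately simple linear cost model: on-demand cost scales with node count and utilization, while on-premises cost is a fixed overhead plus a per-node amortized hardware and operating cost. Every function name, parameter and number below is hypothetical and chosen for illustration; none of it comes from the Accenture study or from any provider's price list.

```python
import math

HOURS_PER_MONTH = 730  # assumption: average hours in a month


def monthly_cloud_cost(nodes, hourly_rate, utilization):
    """On-demand cost: you only pay for the hours the nodes actually run."""
    return nodes * hourly_rate * HOURS_PER_MONTH * utilization


def monthly_on_prem_cost(nodes, server_price, refresh_months,
                         per_node_opex, fixed_overhead):
    """In-house cost: fixed overhead (staff, facility) plus, per node,
    the server price spread over its refresh cycle and monthly opex."""
    per_node = server_price / refresh_months + per_node_opex
    return fixed_overhead + nodes * per_node


def break_even_nodes(hourly_rate, utilization, server_price,
                     refresh_months, per_node_opex, fixed_overhead):
    """Smallest cluster size at which on-demand costs at least as much as
    in-house. Returns None if on-demand stays cheaper at every size."""
    cloud_per_node = hourly_rate * HOURS_PER_MONTH * utilization
    on_prem_per_node = server_price / refresh_months + per_node_opex
    if cloud_per_node <= on_prem_per_node:
        return None  # no inflection point under these assumptions
    return math.ceil(fixed_overhead / (cloud_per_node - on_prem_per_node))


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    n = break_even_nodes(hourly_rate=0.50, utilization=0.8,
                         server_price=6000, refresh_months=36,
                         per_node_opex=50, fixed_overhead=10000)
    print("Break-even cluster size:", n)
```

The interesting part is how sensitive the result is to utilization: drop it low enough in this model and the function returns `None`, meaning the on-demand option never becomes the more expensive one. A real comparison would also need to account for storage, data transfer, reserved pricing and labor, which this sketch leaves out.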
So how does this evolve in the coming years? Data analytics adoption will continue to grow and, I think, we'll see:
- More public service providers entering the market with analytics offerings;
- New tools being offered as a service;
- A growing skill gap that organizations will have to scramble to fill.
You decide what comes of these predictions.