By Bob Gourley
January 2, 2013 08:00 AM EST
By Daniel Abadi
Editor’s note: The piece below by Daniel Abadi first appeared on the Hadapt blog and is republished with permission here. The framework presented provides insight into the very dynamic market around “Big Data Innovators” and should be of use for classifying many other firms in this interesting space. -bg
Recently InformationWeek published a piece, authored by Doug Henschen, that listed 13 innovative Big Data vendors. The complete list is reproduced below:
2. Amazon (Redshift, EMR, DynamoDB)
3. Cloudera (CDH, Impala)
11. Neo Technology
These 13 vendors distribute 16 unique data management products (since both Amazon and Cloudera offer multiple distinct data management/processing systems), all of which push the boundary on Big Data management.
In this post I will attempt to subcategorize these 16 products into a competitive grouping, where products placed inside the same group can be considered replacements for each other (and hence are competitive), and each group is complementary to every other group.
Before starting this classification, I will remove three products that, while potentially interesting from a Big Data perspective, are often used outside of what has become known as the “Big Data realm”, and whose primary competitors therefore did not make the InformationWeek list. These three products are Splunk (which typically competes with companies focused on the security, compliance, and IT operations management verticals), Amazon Redshift (which typically competes with traditional MPP database vendors), and Neo Technology (which is usually classified as a “NoSQL database”, but whose focus on graph data sets it apart, from a technology and use-case perspective, from the other NoSQL databases on this list).
The remaining 13 products can be classified into four distinct groups:
1. Operational data stores that allow flexible schemas
2. Hadoop distributions
3. Real-time Hadoop-based analytical platforms
4. Hadoop-based BI solutions
Group 1 (operational data stores that allow flexible schemas)
This group is composed of database products that can be used to manage active data for dynamic applications with hard-to-define (or hard-to-predict) schemas. The database must be optimized for inserting, retrieving, updating, or deleting individual data items in real time (latencies on the order of milliseconds), but should also support some sort of interface for analyzing the data stored within. The dynamic nature of the typical use case for databases in this group implies a NoSQL interface and either a key-value or document-store retrieval model. From the InformationWeek list, MongoDB, DynamoDB, Couchbase, and DataStax all fit in this category. Although there are some significant technical differences between these products, they can nonetheless be roughly described as potential replacements for each other in Group 1 use cases.
Group 2 (Hadoop distributions)
The products in this group are designed for very different situations than those in Group 1. Hadoop is typically used for large-scale data analysis and batch processing. Rather than inserting, retrieving, updating, or deleting individual data items, Hadoop is optimized for scanning through large swaths of data, processing and analyzing the data as it proceeds. Hadoop has become the poster child for “Big Data” due to its proven massive scalability and its ability to handle the “variety” aspect of Big Data (since Hadoop does not require data to fit neatly into rows and columns in order to be analyzed and processed). From the InformationWeek list, Cloudera, Hortonworks, MapR, and Amazon EMR all fit in this category.
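To make the contrast between the Group 1 and Group 2 access patterns concrete, here is a minimal sketch in plain Python. It is a toy illustration only: the in-memory `store`, the sample records, and the function names are all hypothetical, and nothing here reflects the API of any product on the list.

```python
from collections import Counter

# Hypothetical in-memory "store": record id -> document with a flexible schema.
# Note that the documents do not all share the same fields.
store = {
    "u1": {"name": "Ann", "country": "US"},
    "u2": {"name": "Bo", "country": "DE", "plan": "pro"},
    "u3": {"name": "Cy", "country": "US"},
}


def get_item(key):
    """Group 1 style: retrieve one item directly by its key."""
    return store.get(key)


def update_item(key, **fields):
    """Group 1 style: update (or create) one item in place."""
    store.setdefault(key, {}).update(fields)


def count_by_country():
    """Group 2 style: scan every record and aggregate as the scan proceeds."""
    return Counter(doc.get("country", "unknown") for doc in store.values())


if __name__ == "__main__":
    update_item("u3", plan="free")   # touches a single record
    print(get_item("u3"))
    print(count_by_country())        # touches every record
```

The point operations touch one record each, while the aggregate has to read the whole data set; real systems in each group are built around one pattern or the other.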
Group 3 (real-time Hadoop-based analytical platforms)
Group 3 takes Hadoop to the next level, transforming it from a mere batch processing system into a full-fledged analytical platform that can answer queries in real time. Furthermore, by adding a more robust SQL interface to Hadoop (in addition to industry-standard ODBC connectors), Group 3 products help to hide the complexity of Hadoop and reduce the need for Hadoop specialists, since traditional business intelligence and visualization tools are now able to interface directly with data stored inside Hadoop. From the InformationWeek list, Hadapt clearly fits in this category, and so, with certain caveats, does Cloudera Impala. The caveats, as of the time of writing, are that (a) Impala is an extremely young codebase and is still only in beta, and (b) Impala supports only a small subset of SQL and does not support UDFs or other ways to combine structured and unstructured data in the same query, so calling it an “analytical platform” might be a bit of a stretch.
Group 4 (Hadoop-based BI solutions)
Group 4 products are often lumped together with Group 3 products and mistaken for their competitors. However, just as business intelligence tools and analytical database solutions are highly complementary and were often packaged together in the pre-Hadoop world, the same is true in the Hadoop/Big Data world. Therefore, Datameer, Karmasphere, and Platfora, all of which function as a business intelligence layer above Hadoop, are capable of working closely with the Group 3 products (with announcements along these lines already starting to appear).
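The grouping described above can be summarized as a small lookup table. The sketch below is only an illustration of the framework: the `GROUPS` table restates the membership given in this post (with “Cloudera CDH” standing in for Cloudera's Hadoop distribution), and the helper names `group_of` and `relationship` are hypothetical.

```python
# Group membership as described in this post; the three excluded products
# (Splunk, Amazon Redshift, Neo Technology) are intentionally left out.
GROUPS = {
    1: {"MongoDB", "DynamoDB", "Couchbase", "DataStax"},       # operational data stores
    2: {"Cloudera CDH", "Hortonworks", "MapR", "Amazon EMR"},  # Hadoop distributions
    3: {"Hadapt", "Cloudera Impala"},                          # real-time analytical platforms
    4: {"Datameer", "Karmasphere", "Platfora"},                # Hadoop-based BI solutions
}


def group_of(product):
    """Return the group number a product belongs to."""
    for group, members in GROUPS.items():
        if product in members:
            return group
    raise KeyError(f"{product!r} is not in any of the four groups")


def relationship(a, b):
    """Same group means competitive; different groups means complementary."""
    return "competitive" if group_of(a) == group_of(b) else "complementary"


if __name__ == "__main__":
    print(relationship("MongoDB", "Couchbase"))
    print(relationship("Hadapt", "Datameer"))
```

Under this framework, two products compete only when they land in the same group; any pairing across groups is treated as complementary.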
In conclusion, although “Big Data” is an enormous and rapidly growing market, no single data management software product is going to rule it. Rather, there are four major groups of data management solutions within the Big Data space; while there is fierce competition within each group, at the macro level these groups not only can co-exist but are highly complementary. In the long run, it is likely that two or three leaders will emerge in each group and share the Big Data pie.