Big Data Journal: Blog Feed Post

Big Data Top Ten

What do you get when you combine Big Data technologies... like Pig and Hive? A flying pig?

No, you get a “Logical Data Warehouse”.

My general prediction is that Cloudera and Hortonworks are both moving aggressively to fulfill a vision that looks a lot like Gartner’s “Logical Data Warehouse”: namely, “the next-generation data warehouse that improves agility, enables innovation and responds more efficiently to changing business requirements.”

In 2012, Infochimps (now CSC) leveraged its early use of stream processing, NoSQLs, and Hadoop to create a design pattern that combined real-time, ad-hoc, and batch analytics. This concept of combining best-of-breed Big Data technologies will continue to advance across the industry until the entire legacy (and proprietary) data infrastructure stack is replaced with a new (and open) one.

As this is happening, I predict that the following 10 Big Data events will occur in 2014.

1. Consolidation of NoSQLs begins

A few projects have strong commercial companies backing them. These companies have reached “critical mass” and include DataStax with Cassandra, 10gen (now MongoDB, Inc.) with MongoDB, and Couchbase with Couchbase Server. Leading open source projects like these will pull further and further away from the pack of 150+ other NoSQLs, which are either fighting for the same value propositions (with a lot less traction) or solving small niche use cases (and markets).

2. The Hadoop Clone wars end

The industry will begin standardizing on two distributions. Everyone else will become less relevant. (It’s Intel vs. AMD. Let’s not forget the other x86 vendors like IBM, UMC, NEC, NexGen, National, Cyrix, IDT, Rise, and Transmeta.) If you are a Hadoop vendor, you’re either the Intel or the AMD. Otherwise, you had better be acquired or get out of the business by the end of 2014.

3. Open source business model is acknowledged by Wall Street

Because the open source, scale-out, commodity approach is fundamental to the new breed of Big Data technologies, open source now becomes the clear antithesis of the proprietary, scale-up, our-hardware-only, take-it-or-leave-it solutions. Unfortunately for the incumbent vendors, their promises of international expansion, improved traction from sales force expansion, and new products and alliances will all fall on the deaf ears of Wall Street analysts. Time to short the platform RDBMS and Enterprise Data Warehouse stocks.

4. Big Data and Cloud really means private cloud

Many claimed that 2013 was the “year of Big Data in the Cloud”. However, what really happened is that the Global 2000 immediately began their bare-metal projects under tight control. Now that those projects are underway, 2014 will exhibit the next phase: Big Data on virtualized platforms. Open source projects like Serengeti for vSphere, Savanna for OpenStack, and Ironfan for AWS, OpenStack, and VMware combined, as well as venture-backed proprietary solutions like BlueData, will enable virtualized Big Data private clouds.

5. 2014 starts the era of analytic applications

Enterprises are becoming savvy to the new reference architecture of combined legacy and new-generation IT data infrastructure. Now it’s time to develop a new generation of applications that take advantage of both to solve business problems. System integrators will shift resources, hire data scientists, and guide enterprises in their development of data-driven applications. This, of course, realizes concepts like the 360-degree view, the Internet of Things, and marketing to one.

6. Search-based business intelligence tools will become the norm with Big Data

Having a “Google-like” interface that allows users to explore structured and unstructured data with little formal training is where the new generation is going. Just look at Splunk for searching machine data. Imagine a marketer being able to simply “Google Search” for insights on their customers.

7. Real-time in-memory analytics, complex event processing, and ETL combine

The days of ETL in its pure form are numbered. It’s either ‘E’, then ‘L’, then ‘T’ with Hadoop, or it’s EAL (extract, apply analytics, and load) with new real-time stream-processing frameworks. Now that high-speed social data streams are the norm, so are processing frameworks that combine streaming data with micro-batch and batch data, performing complex processing on that data and feeding applications with sub-second response times.
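To make the difference between the two orderings concrete, here is a minimal Python sketch. It is not taken from Hadoop or any stream-processing framework: the function names (`elt_batch`, `eal_stream`), the plain list standing in for a data store, and the running mean used as the "analytic" are all assumptions chosen for illustration.

```python
from typing import Iterable, Iterator

def extract(raw_lines: Iterable[str]) -> Iterator[dict]:
    """Extract: parse raw 'user,value' lines into records (hypothetical format)."""
    for line in raw_lines:
        user, value = line.strip().split(",")
        yield {"user": user, "value": float(value)}

def elt_batch(raw_lines: Iterable[str], store: list) -> dict:
    """E-L-T: load the raw records first, then transform them as a later batch step."""
    store.extend(extract(raw_lines))           # Load untransformed records
    totals: dict = {}
    for rec in store:                          # Transform after loading
        totals[rec["user"]] = totals.get(rec["user"], 0.0) + rec["value"]
    return totals

def eal_stream(raw_lines: Iterable[str], store: list) -> float:
    """E-A-L: apply an analytic to each record in flight, then load the enriched record."""
    count, mean = 0, 0.0
    for rec in extract(raw_lines):
        count += 1
        mean += (rec["value"] - mean) / count          # incremental running mean
        store.append({**rec, "running_mean": mean})    # Load the enriched record
    return mean
```

In the batch variant the store holds raw records and the aggregation runs afterwards; in the streaming variant every stored record already carries the analytic result, which is what lets a downstream application read it without waiting for a batch job.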

8. Prescriptive analytics become more mainstream

After descriptive and predictive comes prescriptive. Prescriptive analytics automatically synthesizes big data, multiple disciplines of the mathematical and computational sciences, and business rules to make predictions, and then suggests decision options that take advantage of those predictions. We will begin seeing powerful use cases of this in 2014. Business users want specific courses of action recommended to them, along with the likely outcome of each decision.
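The predict-then-recommend loop described here can be sketched in a few lines of Python. Everything in the sketch is hypothetical: the linear demand curve inside `predict_revenue` stands in for a real predictive model, and the minimum-price constraint stands in for a business rule.

```python
def predict_revenue(price: float) -> float:
    """Hypothetical predictive model: demand falls linearly as price rises."""
    demand = max(0.0, 100.0 - 2.0 * price)
    return price * demand

def prescribe(options: list, min_price: float) -> list:
    """Score each decision option, apply the business rule, rank by predicted outcome."""
    allowed = [p for p in options if p >= min_price]       # business rule
    scored = [(p, predict_revenue(p)) for p in allowed]    # prediction per option
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Each tuple pairs a candidate decision with its predicted outcome, best first.
recommendations = prescribe([10, 20, 25, 30, 40], min_price=15)
```

The point of the sketch is the shape of the output: the business user gets a ranked list of actions, each with its expected result, instead of a bare forecast.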

9. MDM will provide the dimensions for big data facts

With Big Data, master data management will now cover both the internal data that the organization has been managing over the years (like customer, product, and supplier data) and the Big Data that is flowing into the organization from external sources (like social media, third-party data, and web-log data) and from internal sources (such as unstructured content in documents and email). MDM will support polyglot persistence.
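As a rough illustration of master data supplying the dimensions for Big Data facts, the Python sketch below joins raw web events to a master customer record and rolls them up by a master-data attribute. The customer IDs, names, segments, and event fields are invented for the example and do not come from any particular MDM product.

```python
from collections import defaultdict

# Master data: the curated "dimension" records (hypothetical values).
master_customers = {
    "C001": {"name": "Acme Corp", "segment": "Enterprise"},
    "C002": {"name": "Bob's Bikes", "segment": "SMB"},
}

# Big Data "facts": high-volume events that reference a master key.
web_events = [
    {"customer_id": "C001", "page_views": 12},
    {"customer_id": "C002", "page_views": 5},
    {"customer_id": "C001", "page_views": 3},
    {"customer_id": "C999", "page_views": 7},   # no matching master record
]

def views_by_segment(events: list, master: dict) -> dict:
    """Roll up event facts by a dimension attribute taken from master data."""
    totals = defaultdict(int)
    for event in events:
        dimension = master.get(event["customer_id"])
        segment = dimension["segment"] if dimension else "Unknown"
        totals[segment] += event["page_views"]
    return dict(totals)
```

Events that cannot be matched to a master record land in an "Unknown" bucket here, which is one simple way to surface the data-quality gaps that MDM is meant to close.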

10. Security in Big Data won’t be a big issue

Peter Sondergaard, Gartner’s senior vice president of research, will say of big data and security that “You should anticipate events and headlines that continuously raise public awareness and create fear.” I’m not dismissing the fact that with MORE data come more responsibilities, and perhaps liabilities, for those who harbor the data. However, in terms of the infrastructure security itself, I believe 2014 will end with a clear understanding of how to apply those familiar best practices to your new Big Data platform, including trusted Kerberos, LDAP integration, Active Directory integration, encryption, and overall policy administration.

More Stories By Jim Kaskade

Jim Kaskade is CEO of Infochimps. Before that he served as SVP and General Manager at SIOS Technology, a publicly traded firm in Japan, where he led a business unit focused on developing a private cloud Platform as a Service targeted at Fortune 500 enterprises. He has been heavily involved in all aspects of cloud, meeting with prominent CIOs, CISOs, and datacenter architects of Fortune 100 companies to better understand their cloud computing needs. He also has hands-on cloud domain knowledge from his experience as founder and CEO of a SaaS company, which secured the digital media assets of over 10,000 businesses including Fortune 100 customers such as Lucasfilm, the NBA, Sony BMG, News Corp, Viacom, and IAC. Kaskade is also one of the Top 100 bloggers on Cloud Computing selected by the Cloud Computing Journal.
