@DXWorldExpo: Blog Feed Post

Big Data Top Ten | @CloudExpo [#BigData]

What do you get when you combine Big Data technologies… like Pig and Hive? A flying pig?

No, you get a “Logical Data Warehouse”.

My general prediction is that Cloudera and Hortonworks are both moving aggressively to fulfill a vision that looks a lot like Gartner’s “Logical Data Warehouse”… namely, “the next-generation data warehouse that improves agility, enables innovation and responds more efficiently to changing business requirements.”

In 2012, Infochimps (now CSC) leveraged its early use of stream processing, NoSQLs, and Hadoop to create a design pattern that combined real-time, ad-hoc, and batch analytics. This concept of combining best-of-breed Big Data technologies will continue to advance across the industry until the entire legacy (and proprietary) data infrastructure stack is replaced with a new (and open) one.

As this is happening, I predict that the following 10 Big Data events will occur in 2014.

1. Consolidation of NoSQLs begins

A few projects have strong commercial companies backing them. These companies have reached “critical mass” and include DataStax with Cassandra, MongoDB, Inc. (formerly 10gen) with MongoDB, and Couchbase with Couchbase Server. Leading open source projects like these will pull further and further away from the pack of 150+ other NoSQLs, which are either fighting for the same value propositions (with a lot less traction) or solving small niche use cases (and markets).

2. The Hadoop Clone wars end

The industry will begin standardizing on two distributions. Everyone else will become less relevant (it’s Intel vs. AMD; let’s not forget the other x86 vendors like IBM, UMC, NEC, NexGen, National, Cyrix, IDT, Rise, and Transmeta). If you are a Hadoop vendor, you’re either the Intel or the AMD. Otherwise, you had better be acquired or get out of the business by the end of 2014.

3. Open source business model is acknowledged by Wall Street

Because the open source, scale-out, commodity approach is fundamental to the new breed of Big Data technologies, open source now becomes the clear antithesis of the proprietary, scale-up, our-hardware-only, take-it-or-leave-it solutions. Unfortunately for the incumbent vendors, their promises of international expansion, improved traction from sales force expansion, and new products and alliances will all fall on the deaf ears of Wall Street analysts. Time to short the platform RDBMS and Enterprise Data Warehouse stocks.

4. Big Data and Cloud really means private cloud

Many claimed that 2013 was the “year of Big Data in the Cloud”. However, what really happened is that the Global 2000 immediately began their bare metal projects under tight control. Now that those projects are underway, 2014 will exhibit the next phase of Big Data on virtualized platforms. Open source projects like Serengeti for vSphere, Savanna for OpenStack, and Ironfan for AWS, OpenStack, and VMware combined, along with venture-backed proprietary solutions like BlueData, will enable virtualized Big Data private clouds.

5. 2014 starts the era of analytic applications

Enterprises become savvy to the new reference architecture of combined legacy and new generation IT data infrastructure. Now it’s time to develop a new generation of applications that take advantage of both to solve business problems. System Integrators will shift resources, hire data scientists, and guide enterprises in their development of data-driven applications. This, of course, realizes concepts like the 360-degree view, the Internet of Things, and marketing to one.

6. Search-based business intelligence tools will become the norm with Big Data

Having a “Google-like” interface that allows users to explore structured and unstructured data with little formal training is where the new generation is going. Just look at Splunk for searching machine data. Imagine a marketer being able to simply “Google Search” for insights on their customers.

7. Real-time in-memory analytics, complex event processing, and ETL combine

The days of ETL in its pure form are numbered. It’s either ‘E’, then ‘L’, then ‘T’ with Hadoop, or it’s EAL (extract, apply analytics, and load) with new real-time stream-processing frameworks. Now that high-speed social data streams are the norm, so are processing frameworks that combine streaming data with micro-batch and batch data, performing complex processing on that data and feeding applications with sub-second response times.
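
As a rough illustration of the EAL idea, the sketch below groups a stream of events into micro-batches, applies a simple aggregation to each batch, and loads only the results. The event fields, the one-second window, and the function names are all assumptions made for this example; a real deployment would use a stream-processing framework rather than plain Python lists.

```python
from collections import defaultdict

# Hypothetical event records; the field names are illustrative only.
events = [
    {"user": "a", "ts": 0.2, "value": 3},
    {"user": "b", "ts": 0.7, "value": 5},
    {"user": "a", "ts": 1.1, "value": 4},
    {"user": "a", "ts": 1.6, "value": 1},
]

def micro_batches(stream, window_seconds=1.0):
    """Extract: group events into fixed time windows by timestamp."""
    batches = defaultdict(list)
    for event in stream:
        batches[int(event["ts"] // window_seconds)].append(event)
    return [batches[window] for window in sorted(batches)]

def apply_analytics(batch):
    """Apply analytics: total value per user within one micro-batch."""
    totals = defaultdict(int)
    for event in batch:
        totals[event["user"]] += event["value"]
    return dict(totals)

def eal(stream, sink):
    """Load: append only the aggregated result of each batch to the sink."""
    for batch in micro_batches(stream):
        sink.append(apply_analytics(batch))
    return sink

results = eal(events, sink=[])
print(results)
```

The point of the ordering is that the analytics run before anything is loaded, so the downstream store receives aggregates rather than raw events.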

8. Prescriptive analytics become more mainstream

After descriptive and predictive, comes prescriptive. Prescriptive analytics automatically synthesizes big data, multiple disciplines of mathematical sciences and computational sciences, and business rules, to make predictions and then suggests decision options to take advantage of the predictions. We will begin seeing powerful use-cases of this in 2014. Business users want to be recommended specific courses of action and to be shown the likely outcome of each decision.
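
A minimal sketch of that loop (predict, apply business rules, then rank the decision options by likely outcome) might look like the following. The demand model, the price cap, and every number here are invented for illustration and do not come from any real system.

```python
def predict_demand(price):
    """Hypothetical predictive model: expected units sold at a given price."""
    return max(0.0, 1000 - 40 * price)

def prescribe(candidate_prices, unit_cost, max_price):
    """Return the allowed decision options, best expected outcome first."""
    options = []
    for price in candidate_prices:
        if price > max_price:  # business rule: never exceed the price cap
            continue
        expected_profit = (price - unit_cost) * predict_demand(price)
        options.append({"price": price, "expected_profit": expected_profit})
    return sorted(options, key=lambda o: o["expected_profit"], reverse=True)

options = prescribe([10, 15, 20, 25], unit_cost=5, max_price=20)
for option in options:
    print(option)
```

Each option carries its predicted outcome, which is what lets a business user compare courses of action instead of reading a single forecast.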

9. MDM will provide the dimensions for big data facts

With Big Data, master data management will now cover both the internal data that the organization has been managing for years (like customer, product, and supplier data) and the Big Data that is flowing into the organization from external sources (like social media, third-party data, and web-log data) and from internal sources (such as unstructured content in documents and email). MDM will support polyglot persistence.
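
To make the "dimensions for facts" idea concrete, here is a small sketch in which curated master data supplies the attributes used to slice incoming Big Data records. The customer records, the social-media mentions, and the `enrich` helper are hypothetical and exist only for this example.

```python
from collections import Counter

# Master data (the "dimension"): hypothetical curated customer records.
customer_master = {
    "c1": {"name": "Acme Corp", "segment": "enterprise"},
    "c2": {"name": "Bob's Bikes", "segment": "small business"},
}

# Big Data "facts": hypothetical social-media mentions keyed by customer id.
mentions = [
    {"customer_id": "c1", "sentiment": "positive"},
    {"customer_id": "c2", "sentiment": "negative"},
    {"customer_id": "c1", "sentiment": "negative"},
    {"customer_id": "c9", "sentiment": "positive"},  # no master record
]

def enrich(facts, dimension, key="customer_id"):
    """Attach master-data attributes to each fact; mark unmatched facts."""
    for fact in facts:
        master = dimension.get(fact[key])
        if master is None:
            yield {**fact, "segment": "unknown"}
        else:
            yield {**fact, **master}

mentions_by_segment = Counter(
    row["segment"] for row in enrich(mentions, customer_master)
)
print(mentions_by_segment)
```

Facts that match no master record are kept and labeled rather than dropped, since unmatched external data is often what prompts a new master record.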

10. Security in Big Data won’t be a big issue

Peter Sondergaard, Gartner’s senior vice president of research, will say of big data and security that “You should anticipate events and headlines that continuously raise public awareness and create fear.” I’m not dismissing the fact that with MORE data come more responsibilities, and perhaps liabilities, for those who harbor the data. However, in terms of the infrastructure security itself, I believe 2014 will end with a clear understanding of how to apply familiar best practices to your new Big Data platform, including trusted Kerberos, LDAP integration, Active Directory integration, encryption, and overall policy administration.

More Stories By Jim Kaskade

Jim Kaskade currently leads Janrain, the category creator of Consumer Identity & Access Management (CIAM). We believe that your identity is the most important thing you own, and that your identity should not only be easy to use, but it should be safe to use when accessing your digital world. Janrain is an Identity Cloud servicing Global 3000 enterprises providing a consistent, seamless, and safe experience for end-users when they access their digital applications (web, mobile, or IoT).

Prior to Janrain, Jim was the VP & GM of Digital Applications at CSC. This line of business had over $1B in commercial revenue, included both consulting and delivery organizations, and focused on serving Fortune 1000 companies in the United States, Canada, Mexico, Peru, Chile, Argentina, and Brazil. Prior to this, Jim was the VP & GM of Big Data & Analytics at CSC. In that role, he led the fastest growing business at CSC, overseeing the development and implementation of innovative offerings that help clients convert data into revenue. Jim was also the CEO of Infochimps; Entrepreneur-in-Residence at PARC, a Xerox company; SVP, General Manager and Chief of Cloud at SIOS Technology; CEO at StackIQ; CEO of Eyespot; CEO of Integral Semi; and CEO of INCEP Technologies. Jim started his career at Teradata, where he spent ten years in enterprise data warehousing, analytical applications, and business intelligence services designed to maximize the intrinsic value of data, servicing Fortune 1000 companies in telecom, retail, and financial markets.
