Cloud Expo: Article

Hadoop Moving More Toward Real-Time

Interview with Continuent CEO Robert Hodges

No account of the Red Hat Summit 2014 would be complete without some discussion of Apache Hadoop. The happy elephant has now been pushing data for close to a decade, its distributed file system (HDFS) setting the tone for support of modern-day, highly distributed and very large databases in the cloud.

So I was pleased to have Robert Hodges, CEO of Hadoop-focused Continuent Tungsten, answer a few questions about his company's world.

Roger: What's the scope of the challenge you face in addressing big Hadoop deployments?

Robert: Hadoop is very powerful as a way to concentrate and analyze information, so the key issue is how information from existing transactional data stores gets added to Hadoop without imposing additional load, application changes, or repetitive dump processes.

From our existing customer deployments, we know that the biggest challenge is getting the information into Hadoop as quickly as possible from many different hosts simultaneously. Our customers often have many more transactional hosts running MySQL than they have Hadoop hosts, simply because their transactional needs require so much scale-out and sharding.

Roger: What are the key pain points?

Robert: The key pain points are therefore extracting data from the transactional stores without imposing additional load on servers that are running live, customer-facing websites, while simultaneously loading large quantities of data that need to be merged and analyzed on the Hadoop side.

The replication solution based on Tungsten Replicator handles this simply: extracting the data places very little load on the source servers, and the changes stream continually into Hadoop. Because this can be done on a per-server or per-cluster basis, it is easy to scale up replication into Hadoop by adding more replication streams.
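To make the idea of several replication streams feeding one analytical store more concrete, here is a minimal Python sketch. It is not Tungsten Replicator's API or data format: the ChangeEvent fields, the apply_stream function, and the in-memory dictionary standing in for the Hadoop-side store are all hypothetical, chosen only to illustrate how ordered change events from different source hosts might be applied independently.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change captured from a source database (hypothetical shape)."""
    source: str            # name of the originating transactional host
    seqno: int             # position of the change within that host's stream
    op: str                # "insert", "update", or "delete"
    key: int               # primary key of the affected row
    row: Optional[dict]    # new row contents; None for deletes


def apply_stream(
    target: Dict[Tuple[str, int], dict],
    last_applied: Dict[str, int],
    events: Iterable[ChangeEvent],
) -> None:
    """Apply one source's change events to the merged target store.

    Rows are keyed by (source, key) so that identical primary keys on
    different hosts do not collide. Events at or below the last applied
    sequence number for a source are skipped, so replaying a stream is safe.
    """
    for ev in events:
        if ev.seqno <= last_applied.get(ev.source, 0):
            continue  # already applied; ignore the replayed event
        composite_key = (ev.source, ev.key)
        if ev.op == "delete":
            target.pop(composite_key, None)
        else:
            target[composite_key] = ev.row
        last_applied[ev.source] = ev.seqno
```

Because each stream tracks its own position, adding another source host in this sketch only means calling apply_stream with one more iterable of events; the streams do not need to coordinate with one another.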

Roger: How critical is the real-time aspect of modern IT? How quickly is it growing?

Robert: It's growing very quickly, and in some cases more quickly than company IT departments and the technology they support can cope with. Replication has for a long time been the solution for this scale-out process, but the flows of this replication data are changing.

One of the key drivers behind the adoption of Hadoop, Cassandra and similar databases is the ability to process the data in parallel to get numbers in real time. You can see this in a wide range of markets, from banking through to social networking and online stores.

As we get access to more information, the services supporting it need to deliver it at an ever faster rate. We all want the lowest rate on our plane ticket purchase, while receiving the absolute best benefits and service, and all those different elements rely on real-time analysis.

Roger: What does IT think of this?

Robert: Of course, this also presents a completely different problem for IT departments. They must work out how to get the data into a system where it can be analyzed quickly. Your active transactional dataset does not live in the same place as your analysis tools, and the two may be based on completely different quantities of raw data.

Transactional databases might be conveniently sharded into 50 or 100 different RDBMS of 100GB each, but analysis needs to process all 5,000GB or 10,000GB of data collectively to get meaningful information. That means that the IT infrastructure needs an effective way to combine and transfer this active data.
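The figures in this answer multiply out to 50 x 100GB = 5,000GB and 100 x 100GB = 10,000GB. The short Python sketch below works through that arithmetic and shows one reason the data has to be analyzed collectively: averaging per-shard averages gives a different answer from averaging over all rows. The function names and the sample numbers are hypothetical and are not taken from the interview.

```python
from typing import List, Tuple


def total_size_gb(shard_count: int, gb_per_shard: int) -> int:
    """Total data volume when every shard holds the same amount."""
    return shard_count * gb_per_shard


def combined_average(partials: List[Tuple[int, float]]) -> float:
    """Global average from per-shard (row_count, value_sum) pairs.

    Weighting by row count is what makes the result match an average
    computed over all rows in one place.
    """
    total_rows = sum(rows for rows, _ in partials)
    total_sum = sum(value_sum for _, value_sum in partials)
    return total_sum / total_rows if total_rows else 0.0


def naive_average_of_averages(partials: List[Tuple[int, float]]) -> float:
    """Average of each shard's own average; ignores how many rows each holds."""
    shard_averages = [value_sum / rows for rows, value_sum in partials if rows]
    return sum(shard_averages) / len(shard_averages) if shard_averages else 0.0


if __name__ == "__main__":
    print(total_size_gb(50, 100), total_size_gb(100, 100))
    # Two hypothetical shards: 1 row summing to 10.0, and 3 rows summing to 90.0.
    sample = [(1, 10.0), (3, 90.0)]
    print(combined_average(sample), naive_average_of_averages(sample))
```

In the sample, the row-weighted result differs from the average of shard averages, which is why per-shard results alone are not enough for this kind of analysis.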

It's also clear from recent advancements in querying and processing techniques built on top of Hadoop that Hadoop itself is becoming a more real-time tool. Spark, Storm and similar engines provide very fast query and analysis on very large datasets, taking advantage of the distributed nature of Hadoop and the increasing RAM and CPU power of each new generation of hardware. Compatibility with Spark and similar live query mechanisms in Hadoop will form a key part of the next evolution of all Hadoop deployments.

Roger: How key is the role of Big Data in developing your solutions? How important is the term Big Data to you?

Robert: Big Data has been a significant requirement for our customers for some time, but we have definitely seen a shift recently from the scale-out, sharded nature of the typical RDBMS towards concentrating that information for analysis in Big Data stores. As that movement of data becomes real-time, it will be critical that the tools we develop make the transfer and management of replicated data as easy as possible for our customers.

As the provider of the tools that enable our customers to easily share and transfer data, we find Big Data as important to us as it is to them. Of course, transactional databases are not going away, and we certainly don't expect that to change, but Hadoop and other Big Data solutions are being brought in to work alongside these active data stores. Continuent will certainly be looking to expand our solutions and techniques to bridge the gap between RDBMS and Big Data.

More Stories By Roger Strukhoff

Roger Strukhoff (@IoT2040) is Executive Director of the Tau Institute for Global ICT Research, with offices in Illinois and Manila. He is Conference Chair of @CloudExpo & @ThingsExpo, and Editor of SYS-CON Media's CloudComputing BigData & IoT Journals. He holds a BA from Knox College & conducted MBA studies at CSU-East Bay.
