Big Data Redefined, by Tony Shan (@TonyShan)

Big Data is a loose term for the collection, storage, processing, and sophisticated analysis of massive amounts of data, far larger and from many more kinds of sources than ever before. The definition of Big Data can be traced back to the 3Vs model defined by Doug Laney in 2001: Volume, Velocity, and Variety. A fourth V was added later, variously given as “Value” or “Veracity”.

Interestingly, a conceptualization of Big Data from the beginning of this century is only now gaining wide use, nearly 14 years later. This is a little strange, given how much the world has changed in the meantime. Does the old definition still fit?

A recent report revealed that more than 80% of the executives surveyed thought that the term Big Data was overstated, confusing, or misleading. They liked the concept but hated the phrase. As Tom Davenport pointed out, nobody likes the term, and almost everybody wishes for a better, more descriptive name for it.

The big problem with Big Data is that the V-model describes the phenomenon poorly and is outdated for the new paradigm. Even the original author admitted that he was simply writing about the burgeoning data in the data warehousing and business intelligence world. It is necessary to redefine the term.

Big Data in today’s world is essentially the ability to parse more information, faster and deeper, to provide unprecedented insights into the business world. In the current situation, the concept is more about 4Rs than 4Vs: Real-time, Relevance, Revelation, and Refinery.

  • Real-time: With the maturing and commoditization of distributed file systems and parallel processing, real-time is realistic. Instant response is a must for most online applications, and fast analysis is compulsory for data of any size. Batch mode is becoming history, except where cost constraints or due diligence require it. Anything less than (near) real-time brings a significant competitive disadvantage.

  • Relevance: Data analysis must be context-aware, semantic, and meaningful. Simple string matching or syntactic equality is no longer enough. Unrelated data is worse than useless: it is a distraction. Data analytics must be knowledge-based and must analyze relevant information. Interdisciplinary science and engineering must be leveraged to quantify how relevant the data is to the user’s areas of interest. Simply put, what matters most is not how much data is delivered or how fast, but how applicable and useful the content is to an end user’s needs at the right time and in the right place.

  • Revelation: Previously unknown things are uncovered and disclosed as knowledge that had not been realized before. Hidden patterns are identified to correlate data elements and events at massive scale. Ambiguous, vague, and obscure data sets can be crystallized to provide better views and statistics. Seemingly random data can be mined to signal potential linkages and interlocks. User behaviors are analyzed via machine learning to find and understand collaborative influence and sentiment.

  • Refinery: Raw data is extracted and transformed into relevant and actionable information, effectively on demand. The refined data is timely, clean, aggregated, insightful, and well understood. A data refinery takes the uncertainty out of the data and filters and reshapes it for meaningful analysis and operations. The refined output can be multi-structured to unlock potential value and deepen understanding. Data may be re-refined in a self-improving process based on downstream needs and consumption context.
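
As a rough illustration of how Relevance and Refinery fit together, the following minimal Python sketch cleans raw records, keeps only those relevant to a user's interests, and aggregates what remains. Everything in it is an assumption made for illustration: the record fields (`user`, `text`, `ms`), the sample events, the threshold, and the function names. In particular, the Jaccard token-overlap score is only a stand-in for the knowledge-based, semantic relevance described above, not an implementation of it.

```python
from collections import defaultdict

# Hypothetical raw events; the field names and values are illustrative only.
RAW_EVENTS = [
    {"user": "alice", "text": "  Hadoop cluster latency spike ", "ms": "120"},
    {"user": "bob", "text": "lunch menu", "ms": "15"},
    {"user": "alice", "text": "hadoop job latency regression", "ms": "340"},
    {"user": "carol", "text": "Latency alert on NoSQL store", "ms": None},
]


def clean(event):
    """Normalize one raw event; return None if its measurement is missing."""
    if event["ms"] is None:
        return None
    return {
        "user": event["user"],
        "tokens": set(event["text"].lower().split()),
        "ms": int(event["ms"]),
    }


def relevance(tokens, interests):
    """Jaccard overlap between record tokens and interest terms.

    A simple placeholder for a semantic relevance measure.
    """
    union = tokens | interests
    return len(tokens & interests) / len(union) if union else 0.0


def refine(events, interests, threshold=0.2):
    """Clean, filter by relevance, and average the measurement per user."""
    kept = defaultdict(list)
    for event in events:
        record = clean(event)
        if record is None:
            continue  # drop records that cannot be measured
        if relevance(record["tokens"], interests) < threshold:
            continue  # drop records unrelated to the user's interests
        kept[record["user"]].append(record["ms"])
    return {user: sum(values) / len(values) for user, values in kept.items()}


if __name__ == "__main__":
    print(refine(RAW_EVENTS, {"hadoop", "latency"}))
```

Calling `refine` again with different interest terms or a different threshold corresponds to the re-refinement step: the same raw data is reshaped for a new consumption context.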

Big Data is better characterized by the 4Rs than by the 4Vs in the new era. For more information, please contact Tony Shan. ©Tony Shan. All rights reserved.

Slides: Tony Shan ‘Thinking in Big Data’

Thinking in Big Data effectively requires a methodical framework for dealing with the predicted 50-60% shortage of qualified Big Data resources in the U.S.

This holistic model comprises the scientific and engineering steps that are involved in accelerating Big Data solutions: problem, diagnosis, facts, analysis, hypothesis, solution, prototype and implementation.

In his session at Big Data Expo®, Tony Shan focused on the concept, importance, and considerations for each of these eight components.

He drilled down to the key techniques and methods that are commonly used in these steps, such as root cause examination, process mapping, force field investigation, benchmarking, interviews, brainstorming, focus groups, Pareto charts, SWOT, impact evaluation, gap analysis, POC, and cost-benefit study.

Best practices and lessons learned from real-world Big Data projects were also discussed.

More Stories By Tony Shan

Tony Shan works as a senior consultant and advisor at a global applications and infrastructure solutions firm, helping clients realize the greatest value from their IT. Shan is a renowned thought leader and technology visionary with a number of years of field experience and guru-level expertise in cloud computing, Big Data, Hadoop, NoSQL, social, mobile, SOA, BI, technology strategy, IT roadmapping, systems design, architecture engineering, portfolio rationalization, product development, asset management, strategic planning, process standardization, and Web 2.0. He has directed the lifecycle R&D and buildout of large-scale, award-winning distributed systems on diverse platforms in Fortune 100 companies and the public sector, including IBM, Bank of America, Wells Fargo, Cisco, Honeywell, and Abbott.

Shan is an inventive expert with a proven track record of influential innovations such as Cloud Engineering. He has authored dozens of top-notch technical papers on next-generation technologies and over ten books that won multiple awards. He is a frequent keynote speaker and has served as chair, panelist, advisor, judge, and organizing committee member at prominent conferences and workshops. He is also an editor and editorial advisory board member of IT research journals and books, and a founder of several user groups, forums, and centers of excellence (CoE).
