Welcome!

Big Data Journal Authors: Elizabeth White, Sematext Blog, Liz McMillan, Jason Bloomberg, Roger Strukhoff

Related Topics: Big Data Journal, SOA & WOA, Web 2.0, Cloud Expo, Apache, SDN Journal

Big Data Journal: Article

What Is the Definition of Big Data?

Big Data is data which cannot be handled by traditional technologies

Is Big Data a buzzword with no clear definition? Wikipedia defines Big Data as...

...a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications...

13 More Definitions of Big Data
Here is a collection of 13 (unlucky?) other definitions of "Big Data" - from analyst firms, from government organizations, from technology publication and from technology vendors.

1. Gartner

...is defined as high volume, velocity and variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making...

2. Forrester

...is the frontier of a firm's ability to store, process, and access (SPA) all the data it needs to operate effectively, make decisions, reduce risks, and serve customers...

Big data

3. O’Reilly Media

…is data that exceeds the processing capacity of conventional database systems…

4. IDC (picked from Mary Ludloff’s blog)

…describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis…

5. TechAmerica Foundation Big Data Commission

…is a term that describes large volumes of high velocity, complex, and variable data that require advanced techniques and technologies to enable the capture, storage, distribution, management, and analysis of the information…

6. NIST – US Department of Commerce

…is where the data volume, acquisition velocity, or data representation limits the ability to perform effective analysis using traditional relational approaches or requires the use of significant horizontal scaling for efficient processing…

7. PC Mag

…is the massive amounts of data that collect over time that are difficult to analyze and handle using common database management tools…

8. Tech Target – Search Cloud Computing

…is a general term used to describe the voluminous amount of unstructured and semi-structured data a company creates — data that would take too much time and cost too much money to load into a relational database for analysis…

9. Forbes

…is, Ill-defined, Intimidating, Immediate…

10. Webopedia

…is a buzzword, or catch-phrase, used to describe a massive volume of both structured and unstructured data that is so large that it’s difficult to process using traditional database and software techniques…

11. EMC

…is fundamentally about massively parallel processing using commodity building blocks to manage and analyze the data…

12. IBM

…is Volume, Velocity, Variety, Veracity…

13. Amazon (as stated by John Rauser – picked from Network World)

…any amount of data that’s too big to be handled by one computer…

Looks like there IS a clear consensus!

Big Data is data which cannot be handled by traditional technologies

Whether it is useful, usable, meaningful … that is a different question.

Related Articles

More Stories By Udayan Banerjee

Udayan Banerjee is CTO at NIIT Technologies Ltd, an IT industry veteran with more than 30 years' experience. He blogs at http://setandbma.wordpress.com.
The blog focuses on emerging technologies like cloud computing, mobile computing, social media aka web 2.0 etc. It also contains stuff about agile methodology and trends in architecture. It is a world view seen through the lens of a software service provider based out of Bangalore and serving clients across the world. The focus is mostly on...

  • Keep the hype out and project a realistic picture
  • Uncover trends not very apparent
  • Draw conclusion from real life experience
  • Point out fallacy & discrepancy when I see them
  • Talk about trends which I find interesting
Google

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@BigDataExpo Stories
In their session at @ThingsExpo, Shyam Varan Nath, Principal Architect at GE, and Ibrahim Gokcen, who leads GE's advanced IoT analytics, focused on the Internet of Things / Industrial Internet and how to make it operational for business end-users. Learn about the challenges posed by machine and sensor data and how to marry it with enterprise data. They also discussed the tips and tricks to provide the Industrial Internet as an end-user consumable service using Big Data Analytics and Industrial C...
SYS-CON Media announced that Splunk, a provider of the leading software platform for real-time Operational Intelligence, has launched an ad campaign on Big Data Journal. Splunk software and cloud services enable organizations to search, monitor, analyze and visualize machine-generated big data coming from websites, applications, servers, networks, sensors and mobile devices. The ads focus on delivering ROI - how improved uptime delivered $6M in annual ROI, improving customer operations by minin...
SYS-CON Events announced today that ActiveState, the leading independent Cloud Foundry and Docker-based PaaS provider, has been named “Silver Sponsor” of SYS-CON's DevOps Summit New York, which will take place June 9-11, 2015, at the Javits Center in New York City, NY. ActiveState believes that enterprises gain a competitive advantage when they are able to quickly create, deploy and efficiently manage software solutions that immediately create business value, but they face many challenges that ...
The Industrial Internet revolution is now underway, enabled by connected machines and billions of devices that communicate and collaborate. The massive amounts of Big Data requiring real-time analysis is flooding legacy IT systems and giving way to cloud environments that can handle the unpredictable workloads. Yet many barriers remain until we can fully realize the opportunities and benefits from the convergence of machines and devices with Big Data and the cloud, including interoperability, ...
SYS-CON Media announced that Cisco, a worldwide leader in IT that helps companies seize the opportunities of tomorrow, has launched a new ad campaign in Cloud Computing Journal. The ad campaign, a webcast titled 'Is Your Data Center Ready for the Application Economy?', focuses on the latest data center networking technologies, including SDN or ACI, and how customers are using SDN and ACI in their organizations to achieve business agility. The Cisco webcast is available on-demand.
Datapipe has acquired GoGrid, a provider of multi-cloud solutions for Big Data deployments. GoGrid’s proprietary orchestration and automation technologies provide 1-Button deployment for Big Data solutions that speed creation and results of new cloud projects. “GoGrid has made it easy for companies to stand up Big Data solutions quickly,” said Robb Allen, CEO, Datapipe. “Datapipe customers will achieve significant value from the speed at which we can now create new Big Data projects in the clou...
Dale Kim is the Director of Industry Solutions at MapR. His background includes a variety of technical and management roles at information technology companies. While his experience includes work with relational databases, much of his career pertains to non-relational data in the areas of search, content management, and NoSQL, and includes senior roles in technical marketing, sales engineering, and support engineering. Dale holds an MBA from Santa Clara University, and a BA in Computer Science f...
The Internet of Things (IoT) is rapidly in the process of breaking from its heretofore relatively obscure enterprise applications (such as plant floor control and supply chain management) and going mainstream into the consumer space. More and more creative folks are interconnecting everyday products such as household items, mobile devices, appliances and cars, and unleashing new and imaginative scenarios. We are seeing a lot of excitement around applications in home automation, personal fitness,...
The Internet of Things (IoT) promises to evolve the way the world does business; however, understanding how to apply it to your company can be a mystery. Most people struggle with understanding the potential business uses or tend to get caught up in the technology, resulting in solutions that fail to meet even minimum business goals. In his session at @ThingsExpo, Jesse Shiah, CEO / President / Co-Founder of AgilePoint Inc., showed what is needed to leverage the IoT to transform your business. ...
SYS-CON Events announced today that CodeFutures, a leading supplier of database performance tools, has been named a “Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9–11, 2015, at the Javits Center in New York, NY. CodeFutures is an independent software vendor focused on providing tools that deliver database performance tools that increase productivity during database development and increase database performance and scalability during production.
Things are being built upon cloud foundations to transform organizations. This CEO Power Panel at 15th Cloud Expo, moderated by Roger Strukhoff, Cloud Expo and @ThingsExpo conference chair, addressed the big issues involving these technologies and, more important, the results they will achieve. Rodney Rogers, chairman and CEO of Virtustream; Brendan O'Brien, co-founder of Aria Systems, Bart Copeland, president and CEO of ActiveState Software; Jim Cowie, chief scientist at Dyn; Dave Wagstaff, VP ...
“We help people build clusters, in the classical sense of the cluster. We help people put a full stack on top of every single one of those machines. We do the full bare metal install," explained Greg Bruno, Vice President of Engineering and co-founder of StackIQ, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
"People are a lot more knowledgeable about APIs now. There are two types of people who work with APIs - IT people who want to use APIs for something internal and the product managers who want to do something outside APIs for people to connect to them," explained Roberto Medrano, Executive Vice President at SOA Software, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Performance is the intersection of power, agility, control, and choice. If you value performance, and more specifically consistent performance, you need to look beyond simple virtualized compute. Many factors need to be considered to create a truly performant environment. In his General Session at 15th Cloud Expo, Harold Hannon, Sr. Software Architect at SoftLayer, discussed how to take advantage of a multitude of compute options and platform features to make cloud the cornerstone of your onlin...
Hardware will never be more valuable than on the day it hits your loading dock. Each day new servers are not deployed to production the business is losing money. While Moore's Law is typically cited to explain the exponential density growth of chips, a critical consequence of this is rapid depreciation of servers. The hardware for clustered systems (e.g., Hadoop, OpenStack) tends to be significant capital expenses. In his session at Big Data Expo, Mason Katz, CTO and co-founder of StackIQ, disc...
Software Defined Storage provides many benefits for customers including agility, flexibility, faster adoption of new technology and cost effectiveness. However, for IT organizations it can be challenging and complex to build your Enterprise Grade Storage from software. In his session at Cloud Expo, Paul Turner, CMO at Cloudian, looked at the new Original Design Manufacturer (ODM) market and how it is changing the storage world. Now Software Defined Storage companies can build Enterprise grade ...
Companies today struggle to manage the types and volume of data their customers and employees generate and use every day. With billions of requests daily, operational consistency can be elusive. In his session at Big Data Expo, Dave McCrory, CTO at Basho Technologies, will explore how a distributed systems solution, such as NoSQL, can give organizations the consistency and availability necessary to succeed with on-demand data, offering high availability at massive scale.
Vormetric on Wednesday announced the results of its 2015 Insider Threat Report (ITR), conducted online on their behalf by Harris Poll and in conjunction with analyst firm Ovum in fall 2014 among 818 IT decision makers in various countries, including 408 in the United States. The report details striking findings around how U.S. and international enterprises perceive security threats, the types of employees considered most dangerous, environments at the greatest risk for data loss and the steps or...
CodeFutures has announced Dan Lynn as its new CEO. Lynn assumes the role from Founder Cory Isaacson, who has joined RMS and will now serve as chairman of CodeFutures. Lynn brings more than 14 years of advanced technology and business success experience, and will help CodeFutures build on its industry leadership around its Agile Big Data initiatives. His technical expertise will be invaluable in advancing CodeFutures’ AgilData platform and new processes for streamlining and gaining value from gro...
Cloud and Big Data present unique dilemmas: embracing the benefits of these new technologies while maintaining the security of your organization's assets. When an outside party owns, controls and manages your infrastructure and computational resources, how can you be assured that sensitive data remains private and secure? How do you best protect data in mixed use cloud and big data infrastructure sets? Can you still satisfy the full range of reporting, compliance and regulatory requirements? In...