Welcome!

@DXWorldExpo Authors: Elizabeth White, Yeshim Deniz, Zakia Bouachraoui, Liz McMillan, Pat Romanski

Blog Feed Post

[2b2k] Is big data degrading the integrity of science?

Amanda Alvarez has a provocative post at GigaOm:

There’s an epidemic going on in science: experiments that no one can reproduce, studies that have to be retracted, and the emergence of a lurking data reliability iceberg. The hunger for ever more novel and high-impact results that could lead to that coveted paper in a top-tier journal like Nature or Science is not dissimilar to the clickbait headlines and obsession with pageviews we see in modern journalism.

The article’s title points especially to “dodgy data,” and the item in this list that’s by far the most interesting to me is the “data reliability iceberg,” and its tie to the rise of Big Data. Amanda writes:

…unlike in science…, in big data accuracy is not as much of an issue. As my colleague Derrick Harris points out, for big data scientists the abilty to churn through huge amounts of data very quickly is actually more important than complete accuracy. One reason for this is that they’re not dealing with, say, life-saving drug treatments, but with things like targeted advertising, where you don’t have to be 100 percent accurate. Big data scientists would rather be pointed in the right general direction faster — and course-correct as they go – than have to wait to be pointed in the exact right direction. This kind of error-tolerance has insidiously crept into science, too.

But, the rest of the article contains no evidence that the last sentence’s claim is true because of the rise of Big Data. In fact, even if we accept that science is facing a crisis of reliability, the article doesn’t pin this on an “iceberg” of bad data. Rather, it seems to be a melange of bad data, faulty software, unreliable equipment, poor methodology, undue haste, and o’erweening ambition.

The last part of the article draws some of the heat out of the initial paragraphs. For example: “Some see the phenomenon not as an epidemic but as a rash, a sign that the research ecosystem is getting healthier and more transparent.” It makes the headline and the first part seem a bit overstated — not unusual for a blog post (not that I would ever do such a thing!) but at best ironic given this post’s topic.

I remain interested in Amanda’s hypothesis. Is science getting sloppier with data?

Read the original blog entry...

More Stories By David Weinberger

David is the author of JOHO the blog (www.hyperorg.com/blogger). He is an independent marketing consultant and a frequent speaker at various conferences. "All I can promise is that I will be honest with you and never write something I don't believe in because someone is paying me as part of a relationship you don't know about. Put differently: All I'll hide are the irrelevancies."

DXWorldEXPO Digital Transformation Stories
As the fourth industrial revolution continues to march forward, key questions remain related to the protection of software, cloud, AI, and automation intellectual property. Recent developments in Supreme Court and lower court case law will be reviewed to explain the intricacies of what inventions are eligible for patent protection, how copyright law may be used to protect application programming interfaces (APIs), and the extent to which trademark and trade secret law may have expanded relev...
In a recent survey, Sumo Logic surveyed 1,500 customers who employ cloud services such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). According to the survey, a quarter of the respondents have already deployed Docker containers and nearly as many (23 percent) are employing the AWS Lambda serverless computing framework. It's clear: serverless is here to stay. The adoption does come with some needed changes, within both application development and operations. Th...
Docker and Kubernetes are key elements of modern cloud native deployment automations. After building your microservices, common practice is to create docker images and create YAML files to automate the deployment with Docker and Kubernetes. Writing these YAMLs, Dockerfile descriptors are really painful and error prone.Ballerina is a new cloud-native programing language which understands the architecture around it - the compiler is environment aware of microservices directly deployable into infra...
When Enterprises started adopting Hadoop-based Big Data environments over the last ten years, they were mainly on-premise deployments. Organizations would spin up and manage large Hadoop clusters, where they would funnel exabytes or petabytes of unstructured data.However, over the last few years the economics of maintaining this enormous infrastructure compared with the elastic scalability of viable cloud options has changed this equation. The growth of cloud storage, cloud-managed big data e...
Your applications have evolved, your computing needs are changing, and your servers have become more and more dense. But your data center hasn't changed so you can't get the benefits of cheaper, better, smaller, faster... until now. Colovore is Silicon Valley's premier provider of high-density colocation solutions that are a perfect fit for companies operating modern, high-performance hardware. No other Bay Area colo provider can match our density, operating efficiency, and ease of scalability.
The Japan External Trade Organization (JETRO) is a non-profit organization that provides business support services to companies expanding to Japan. With the support of JETRO's dedicated staff, clients can incorporate their business; receive visa, immigration, and HR support; find dedicated office space; identify local government subsidies; get tailored market studies; and more.
At CloudEXPO Silicon Valley, June 24-26, 2019, Digital Transformation (DX) is a major focus with expanded DevOpsSUMMIT and FinTechEXPO programs within the DXWorldEXPO agenda. Successful transformation requires a laser focus on being data-driven and on using all the tools available that enable transformation if they plan to survive over the long term. A total of 88% of Fortune 500 companies from a generation ago are now out of business. Only 12% still survive. Similar percentages are found throug...
At CloudEXPO Silicon Valley, June 24-26, 2019, Digital Transformation (DX) is a major focus with expanded DevOpsSUMMIT and FinTechEXPO programs within the DXWorldEXPO agenda. Successful transformation requires a laser focus on being data-driven and on using all the tools available that enable transformation if they plan to survive over the long term. A total of 88% of Fortune 500 companies from a generation ago are now out of business. Only 12% still survive. Similar percentages are found throug...
In an age of borderless networks, security for the cloud and security for the corporate network can no longer be separated. Security teams are now presented with the challenge of monitoring and controlling access to these cloud environments, at the same time that developers quickly spin up new cloud instances and executives push forwards new initiatives. The vulnerabilities created by migration to the cloud, such as misconfigurations and compromised credentials, require that security teams t...
AI and machine learning disruption for Enterprises started happening in the areas such as IT operations management (ITOPs) and Cloud management and SaaS apps. In 2019 CIOs will see disruptive solutions for Cloud & Devops, AI/ML driven IT Ops and Cloud Ops. Customers want AI-driven multi-cloud operations for monitoring, detection, prevention of disruptions. Disruptions cause revenue loss, unhappy users, impacts brand reputation etc.