Click here to close now.

Welcome!

BigDataExpo® Blog Authors: Pat Romanski, Elizabeth White, Ian Khan, Liz McMillan, Carmen Gonzalez

Related Topics: CloudExpo® Blog, Industrial IoT, @MicroservicesE Blog, @ContainersExpo Blog, Agile Computing, BigDataExpo® Blog

CloudExpo® Blog: Article

Semantic Interoperability: The Pot of Gold Under the Rainbow

The problem with semantic interoperability is that human communication is inherently vague, ambiguous, and relative

Our ZapThink 2020 poster lays out our complex web of predictions for enterprise IT in the year 2020. You might think that semantic operability is an important part of this story; after all, several groups have been heads down working on the problem of how to teach computers to agree on the meaning of the information they exchange for years now. But look again: we relegate semantics to the lower right corner, where we point out that we don’t believe there will be much progress in this area by 2020. Eventually, maybe, but even though semantic interoperability appears to be within our grasp, it behaves more like the pot of gold under the rainbow. The closer you get to the rainbow, the farther away it appears.

What gives? ZapThink usually takes an optimistic perspective about the future of technology, but we’re decidedly pessimistic about the prospects for semantic interoperability. The problem as we see it comes down to the human understanding of language. All efforts to standardize meanings in order to facilitate semantic interoperability strip out vagueness and ambiguity from data, and presume a single, universal underlying grammar. After all, isn’t the goal to foster precise, unambiguous, and consistent communication between systems? The problem is, human communication is inherently vague, ambiguous, and relative. The way humans understand the world, the way we think, and the way we put our thoughts into language require both vagueness and ambiguity. Without them, we lose important aspects of meaning. Furthermore, how we structure our language is culturally and linguistically relative. As a result, current semantic interoperability efforts will be able to address a certain class of problems, but in the grand scheme of things, that class of problems is a relatively small subset of the types of communication we would prefer to automate between systems.

The Importance of Vagueness
Ironically, to discuss semantics we must first define our terms. A term is vague when it’s impossible to say whether the term applies in certain circumstances, for example, “my face is red.” Just how red does it have to be before we’re sure it’s red? In contrast, a term is ambiguous when it’s possible to interpret it in more than one way. For example, “I’m going to a bank” might mean that I’m going to a financial institution or to the side of a river.

Vagueness leads to knotty problems in philosophy that impact our ability to provide semantic interoperability. So, let’s go back to philosophy class, and study the sorites paradox. If you have a heap of sand and you take away a single grain, do you still have a heap of sand? Certainly. OK, repeat the process. Clearly, when you get down to a single grain of sand remaining, you no longer have a heap. So, when did the heap cease to be a heap?

Philosophers and linguists have been arguing over how to solve the sorites paradox for over a century now (yes, I know, they should find something more useful to do with their time). One answer: put your foot down and establish a precise boundary. 1,000 or more grains of sand are a heap, but 999 or less are not. Our computers will have no problem with such a resolution to the paradox, but it doesn’t accurately represent what we really mean by a heap. After all, if 1,000 grains constitutes a heap, wouldn’t 999? Central to the meaning of the term “heap” is its inherent vagueness.

Another solution: yes, there is some number of grains of sand where a heap ceases to be a heap, but we can’t know what it is. This resolution might satisfy some philosophers, but it doesn’t help our computers make sense out of our language. A third approach: instead of considering “is a heap” and “isn’t a heap” as the only two possible values, define a spectrum of intermediary values, or perhaps a continuum of values. The computer scientists are likely to be happy with this answer, as it lends itself to fuzzy logic: the statement “this pile of sand is a heap” might be, say, 40% true. Yes, we can do our fuzzy logic math now, but we’ve still lost some fundamental elements of meaning.

To bring back our natural language-based understanding of the sorites paradox, let’s step away from an overly analytical approach to the problem and try to look at the paradox from a human perspective. How, for example, would a seven-year-old describe the heap of sand as you take away a grain of sand at a time? They might answer, “well, it’s a smaller heap” or “it’s kinda a heap” or “it’s a little heap” or “it’s not really a heap,” etc. Such expressions are clearly not precise. Our computers wouldn’t be able to make much sense out of them. But these simple, even childish expressions are how people really speak and how people truly understand vagueness.

The important takeaway here is that vagueness isn’t a property relegated to heaps and blushing faces. It’s a ubiquitous property of virtually all human communication, even within the business context. Take for example an insurance policy. Insurance policies have a number of properties (policy holder, underwriter, insured property, deductible, etc.) and relationships to other business entities (policy application, underwriting documentation, claims forms, etc.) Now let’s add or take away individual properties and relationships from our canonical understanding of an insurance policy one at a time. Is it still a policy? Clearly, if we take away everything that makes a policy a policy then it’s no longer a policy. But if we take away a single property, we’re likely to say it’s still a policy. So where do we draw the line? If philosophers and linguists haven’t solved this problem in over a century, don’t expect your semantic interoperability tool to make much headway either.

The Problem of Linguistic Relativity
Another century-long battle in the world of linguistics is the fray over linguistic relativity vs. linguistic universality. Linguistic relativity is the position that language affects how speakers see their world, and by extension, how they think. In the other corner is Noam Chomsky’s universal grammar, the linguistic theory that grammar is hardwired into the brain, and hence universal across all peoples regardless of their language or their culture. Theoretical work on a universal grammar has led to dramatic advances in natural language translation, and we all get to use and appreciate Google Translate and its brethren as a result. But while Google Translate is a miraculous tool indeed (especially for us Star Trek fans who marveled at the Universal Translator), it doesn’t take a polyglot to realize that the state of the art for such technology still leaves much to be desired.

Linguistic relativity, however, goes at the heart of the semantic interoperability challenge. Take for example, one of today’s most useful semantic standards: the Resource Description Framework (RDF). RDF is a metadata data model intended for making statements about resources (in particular, Web-based resources) in the form of subject-predicate-object expressions. For example, you might be able to express the statement “ZapThink wrote this ZapFlash” in the triplet consisting of “ZapThink” (the subject); “wrote” (the predicate); and “this ZapFlash” (the object). Take this basic triplet building block and you can build semantic webs of arbitrary complexity, with the eventual goal of describing the relationships among all business entities within a particular business context.

The problem with the approach RDF takes, however, is that the subject-predicate-object structure is Eurocentric. Non-European languages (and hence, non-European speakers) don’t necessarily think in sentences that follow this structure. And furthermore, this problem isn’t new. In fact, the research into this phenomenon dates back to the 1940s, with the work of linguist Benjamin Lee Whorf. Whorf conducted linguistic research among the Hopi and other Native American peoples, and thus established an empirical basis for linguistic relativity. The illustration below comes from one of his seminal papers on the subject:



In the graphic above, Whorf compares a simple sentence, “I clean it with a ramrod,” where “it” refers to a gun, in English and Shawnee. The English sentence predictably follows the subject-predicate-object format that RDF leverages. The Shawnee translation, however, translates literally to “dry space/interior of hole/by motion of tool or instrument.” Not only is there no one-to-one correspondence between parts of speech across the two sentences, but the entire context of the expression is different. If you were in the unenviable position of establishing RDF-based semantic interoperability between, say, a British business and a Shawnee business, you’d find RDF far too culturally specific to rise to the challenge.

The ZapThink Take
We have tools for semantic interoperability today, of course – but all such tools require the human step of configuring or training the tool to understand the properties and relationships among entities. Once you’ve trained the tool, it’s possible to automate many semantic interactions. But to get this process started, we must get together in a room with the people we want to communicate with and hammer out the meanings of the terms we’d like to use.

This human component to semantic interoperability actually dates to the Stone Age. How did we do business in the Stone Age? Say your tribe was on the coast, so you had fish. You were getting tired of fish, so you and your tribemates decided to pack up some fish and bring the bundle to the next village where they had fruit. You showed up at the village market, only you had no common language. So what did you do? You held up some fish, pointed to some fruit, grunted, and waved your hands. If you established a basis of communication, you conducted business, and went home with some fruit. If not, then you went home empty handed (or you pulled out your clubs and attacked, but that’s another story). Cut to the 21st century, and little has changed. People still have to get together and establish a basis of communication as human beings in order to facilitate semantic interoperability. But fully automating such interoperability is as close as the next rainbow.

More Stories By Jason Bloomberg

Jason Bloomberg is the leading expert on architecting agility for the enterprise. As president of Intellyx, Mr. Bloomberg brings his years of thought leadership in the areas of Cloud Computing, Enterprise Architecture, and Service-Oriented Architecture to a global clientele of business executives, architects, software vendors, and Cloud service providers looking to achieve technology-enabled business agility across their organizations and for their customers. His latest book, The Agile Architecture Revolution (John Wiley & Sons, 2013), sets the stage for Mr. Bloomberg’s groundbreaking Agile Architecture vision.

Mr. Bloomberg is perhaps best known for his twelve years at ZapThink, where he created and delivered the Licensed ZapThink Architect (LZA) SOA course and associated credential, certifying over 1,700 professionals worldwide. He is one of the original Managing Partners of ZapThink LLC, the leading SOA advisory and analysis firm, which was acquired by Dovel Technologies in 2011. He now runs the successor to the LZA program, the Bloomberg Agile Architecture Course, around the world.

Mr. Bloomberg is a frequent conference speaker and prolific writer. He has published over 500 articles, spoken at over 300 conferences, Webinars, and other events, and has been quoted in the press over 1,400 times as the leading expert on agile approaches to architecture in the enterprise.

Mr. Bloomberg’s previous book, Service Orient or Be Doomed! How Service Orientation Will Change Your Business (John Wiley & Sons, 2006, coauthored with Ron Schmelzer), is recognized as the leading business book on Service Orientation. He also co-authored the books XML and Web Services Unleashed (SAMS Publishing, 2002), and Web Page Scripting Techniques (Hayden Books, 1996).

Prior to ZapThink, Mr. Bloomberg built a diverse background in eBusiness technology management and industry analysis, including serving as a senior analyst in IDC’s eBusiness Advisory group, as well as holding eBusiness management positions at USWeb/CKS (later marchFIRST) and WaveBend Solutions (now Hitachi Consulting).

@BigDataExpo Stories
You use an agile process; your goal is to make your organization more agile. But what about your data infrastructure? The truth is, today's databases are anything but agile - they are effectively static repositories that are cumbersome to work with, difficult to change, and cannot keep pace with application demands. Performance suffers as a result, and it takes far longer than it should to deliver new features and capabilities needed to make your organization competitive. As your application an...
As cloud gives an opportunity to businesses to buy services externally – how is cloud impacting your customers? In his General Session at 15th Cloud Expo, Fabio Gori, Director of Worldwide Cloud Marketing at Cisco, provided answers to big questions: Do you see hybrid cloud as where the world is going? What benefits does it bring? And how does Cisco connect all of these clouds? He also discussed Intercloud and Cisco’s investment on it.
SYS-CON Events announced today that MetraTech, now part of Ericsson, has been named “Silver Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9–11, 2015, at the Javits Center in New York, NY. Ericsson is the driving force behind the Networked Society- a world leader in communications infrastructure, software and services. Some 40% of the world’s mobile traffic runs through networks Ericsson has supplied, serving more than 2.5 billion subscribers.
Software is eating the world. Companies that were not previously in the technology space now find themselves competing with Google and Amazon on speed of innovation. As the innovation cycle accelerates, companies must embrace rapid and constant change to both applications and their infrastructure, and find a way to deliver speed and agility of development without sacrificing reliability or efficiency of operations. In her Day 2 Keynote DevOps Summit, Victoria Livschitz, CEO of Qubell, discussed...
The OpenStack cloud operating system includes Trove, a database abstraction layer. Rather than applications connecting directly to a specific type of database, they connect to Trove, which in turn connects to one or more specific databases. One target database is Postgres Plus Cloud Database, which includes its own RESTful API. Trove was originally developed around MySQL, whose interfaces are significantly less complicated than those of the Postgres cloud database. In his session at 16th Cloud...
In their general session at 16th Cloud Expo, Michael Piccininni, Global Account Manager – Cloud SP at EMC Corporation, and Mike Dietze, Regional Director at Windstream Hosted Solutions, will review next generation cloud services, including the Windstream-EMC Tier Storage solutions, and discuss how to increase efficiencies, improve service delivery and enhance corporate cloud solution development. Speaker Bios Michael Piccininni is Global Account Manager – Cloud SP at EMC Corporation. He has b...
Working with Big Data is challenging, especially when decision makers depend on market insights and intelligence from your data but don't have quick access to it or find it unusable. In their session at 6th Big Data Expo, Ian Khan, Global Strategic Positioning & Brand Manager at Solgenia; Zel Bianco, President, CEO and Co-Founder of Interactive Edge of Solgenia; and Ermanno Bonifazi, CEO & Founder at Solgenia, discussed how a revolutionary cloud-based BI along with mobile analytics is already c...
While there are hundreds of public and private cloud hosting providers to choose from, not all clouds are created equal. If you’re seeking to host enterprise-level mission-critical applications, where Cloud Security is a primary concern, WHOA.com is setting new standards for cloud hosting, and has established itself as a major contender in the marketplace. We are constantly seeking ways to innovate and leverage state-of-the-art technologies. In his session at 16th Cloud Expo, Mike Rivera, Seni...
Hardware will never be more valuable than on the day it hits your loading dock. Each day new servers are not deployed to production the business is losing money. While Moore's Law is typically cited to explain the exponential density growth of chips, a critical consequence of this is rapid depreciation of servers. The hardware for clustered systems (e.g., Hadoop, OpenStack) tends to be significant capital expenses. In his session at Big Data Expo, Mason Katz, CTO and co-founder of StackIQ, disc...
The 4th International Internet of @ThingsExpo, co-located with the 17th International Cloud Expo - to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA - announces that its Call for Papers is open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than 20 years ago.
The enterprise market will drive IoT device adoption over the next five years. In his session at @ThingsExpo, John Greenough, an analyst at BI Intelligence, division of Business Insider, will analyze how companies will adopt IoT products and the associated cost of adopting those products. John Greenough is the lead analyst covering the Internet of Things for BI Intelligence- Business Insider’s paid research service. Numerous IoT companies have cited his analysis of the IoT. Prior to joining B...
As the Internet of Things unfolds, mobile and wearable devices are blurring the line between physical and digital, integrating ever more closely with our interests, our routines, our daily lives. Contextual computing and smart, sensor-equipped spaces bring the potential to walk through a world that recognizes us and responds accordingly. We become continuous transmitters and receivers of data. In his session at @ThingsExpo, Andrew Bolwell, Director of Innovation for HP's Printing and Personal S...
All major researchers estimate there will be tens of billions devices - computers, smartphones, tablets, and sensors - connected to the Internet by 2020. This number will continue to grow at a rapid pace for the next several decades. With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo, June 9-11, 2015, at the Javits Center in New York City. Learn what is going on, contribute to the discussions, and ensure that your enter...
SYS-CON Events announced today that BMC will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. BMC delivers software solutions that help IT transform digital enterprises for the ultimate competitive business advantage. BMC has worked with thousands of leading companies to create and deliver powerful IT management services. From mainframe to cloud to mobile, BMC pairs high-speed digital innovation with robust...
The world is at a tipping point where the technology, the device and global adoption are converging to such a point that we will see an explosion of a world where smartphone devices not only allow us to talk to each other, but allow for communication between everything – serving as a central hub from which we control our world – MediaTek is at the heart of both driving this and allowing the markets to drive this reality forward themselves. The next wave of consumer gadgets is here – smart, con...
The multi-trillion economic opportunity around the "Internet of Things" (IoT) is emerging as the hottest topic for investors in 2015. As we connect the physical world with information technology, data from actions, processes and the environment can increase sales, improve efficiencies, automate daily activities and minimize risk. In his session at @ThingsExpo, Ed Maguire, Senior Analyst at CLSA Americas, will describe what is new and different about IoT, explore financial, technological and re...
The Internet of Things is not only adding billions of sensors and billions of terabytes to the Internet. It is also forcing a fundamental change in the way we envision Information Technology. For the first time, more data is being created by devices at the edge of the Internet rather than from centralized systems. What does this mean for today's IT professional? In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists will addresses this very serious issue o...
"People are a lot more knowledgeable about APIs now. There are two types of people who work with APIs - IT people who want to use APIs for something internal and the product managers who want to do something outside APIs for people to connect to them," explained Roberto Medrano, Executive Vice President at SOA Software, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
SYS-CON Events announced today that O'Reilly Media has been named “Media Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9–11, 2015, at the Javits Center in New York City, NY. O'Reilly Media spreads the knowledge of innovators through its books, online services, magazines, and conferences. Since 1978, O'Reilly Media has been a chronicler and catalyst of cutting-edge development, homing in on the technology trends that really matter and spurring their adoption...
There will be 150 billion connected devices by 2020. New digital businesses have already disrupted value chains across every industry. APIs are at the center of the digital business. You need to understand what assets you have that can be exposed digitally, what their digital value chain is, and how to create an effective business model around that value chain to compete in this economy. No enterprise can be complacent and not engage in the digital economy. Learn how to be the disruptor and not ...