Welcome!

Big Data Journal Authors: Trevor Parsons, Lori MacVittie, Cynthia Dunlop, Elizabeth White, Bob Gourley

Related Topics: Cloud Expo, XML, SOA & WOA, Virtualization, Web 2.0, Big Data Journal

Cloud Expo: Article

Semantic Interoperability: The Pot of Gold Under the Rainbow

The problem with semantic interoperability is that human communication is inherently vague, ambiguous, and relative

Our ZapThink 2020 poster lays out our complex web of predictions for enterprise IT in the year 2020. You might think that semantic operability is an important part of this story; after all, several groups have been heads down working on the problem of how to teach computers to agree on the meaning of the information they exchange for years now. But look again: we relegate semantics to the lower right corner, where we point out that we don’t believe there will be much progress in this area by 2020. Eventually, maybe, but even though semantic interoperability appears to be within our grasp, it behaves more like the pot of gold under the rainbow. The closer you get to the rainbow, the farther away it appears.

What gives? ZapThink usually takes an optimistic perspective about the future of technology, but we’re decidedly pessimistic about the prospects for semantic interoperability. The problem as we see it comes down to the human understanding of language. All efforts to standardize meanings in order to facilitate semantic interoperability strip out vagueness and ambiguity from data, and presume a single, universal underlying grammar. After all, isn’t the goal to foster precise, unambiguous, and consistent communication between systems? The problem is, human communication is inherently vague, ambiguous, and relative. The way humans understand the world, the way we think, and the way we put our thoughts into language require both vagueness and ambiguity. Without them, we lose important aspects of meaning. Furthermore, how we structure our language is culturally and linguistically relative. As a result, current semantic interoperability efforts will be able to address a certain class of problems, but in the grand scheme of things, that class of problems is a relatively small subset of the types of communication we would prefer to automate between systems.

The Importance of Vagueness
Ironically, to discuss semantics we must first define our terms. A term is vague when it’s impossible to say whether the term applies in certain circumstances, for example, “my face is red.” Just how red does it have to be before we’re sure it’s red? In contrast, a term is ambiguous when it’s possible to interpret it in more than one way. For example, “I’m going to a bank” might mean that I’m going to a financial institution or to the side of a river.

Vagueness leads to knotty problems in philosophy that impact our ability to provide semantic interoperability. So, let’s go back to philosophy class, and study the sorites paradox. If you have a heap of sand and you take away a single grain, do you still have a heap of sand? Certainly. OK, repeat the process. Clearly, when you get down to a single grain of sand remaining, you no longer have a heap. So, when did the heap cease to be a heap?

Philosophers and linguists have been arguing over how to solve the sorites paradox for over a century now (yes, I know, they should find something more useful to do with their time). One answer: put your foot down and establish a precise boundary. 1,000 or more grains of sand are a heap, but 999 or less are not. Our computers will have no problem with such a resolution to the paradox, but it doesn’t accurately represent what we really mean by a heap. After all, if 1,000 grains constitutes a heap, wouldn’t 999? Central to the meaning of the term “heap” is its inherent vagueness.

Another solution: yes, there is some number of grains of sand where a heap ceases to be a heap, but we can’t know what it is. This resolution might satisfy some philosophers, but it doesn’t help our computers make sense out of our language. A third approach: instead of considering “is a heap” and “isn’t a heap” as the only two possible values, define a spectrum of intermediary values, or perhaps a continuum of values. The computer scientists are likely to be happy with this answer, as it lends itself to fuzzy logic: the statement “this pile of sand is a heap” might be, say, 40% true. Yes, we can do our fuzzy logic math now, but we’ve still lost some fundamental elements of meaning.

To bring back our natural language-based understanding of the sorites paradox, let’s step away from an overly analytical approach to the problem and try to look at the paradox from a human perspective. How, for example, would a seven-year-old describe the heap of sand as you take away a grain of sand at a time? They might answer, “well, it’s a smaller heap” or “it’s kinda a heap” or “it’s a little heap” or “it’s not really a heap,” etc. Such expressions are clearly not precise. Our computers wouldn’t be able to make much sense out of them. But these simple, even childish expressions are how people really speak and how people truly understand vagueness.

The important takeaway here is that vagueness isn’t a property relegated to heaps and blushing faces. It’s a ubiquitous property of virtually all human communication, even within the business context. Take for example an insurance policy. Insurance policies have a number of properties (policy holder, underwriter, insured property, deductible, etc.) and relationships to other business entities (policy application, underwriting documentation, claims forms, etc.) Now let’s add or take away individual properties and relationships from our canonical understanding of an insurance policy one at a time. Is it still a policy? Clearly, if we take away everything that makes a policy a policy then it’s no longer a policy. But if we take away a single property, we’re likely to say it’s still a policy. So where do we draw the line? If philosophers and linguists haven’t solved this problem in over a century, don’t expect your semantic interoperability tool to make much headway either.

The Problem of Linguistic Relativity
Another century-long battle in the world of linguistics is the fray over linguistic relativity vs. linguistic universality. Linguistic relativity is the position that language affects how speakers see their world, and by extension, how they think. In the other corner is Noam Chomsky’s universal grammar, the linguistic theory that grammar is hardwired into the brain, and hence universal across all peoples regardless of their language or their culture. Theoretical work on a universal grammar has led to dramatic advances in natural language translation, and we all get to use and appreciate Google Translate and its brethren as a result. But while Google Translate is a miraculous tool indeed (especially for us Star Trek fans who marveled at the Universal Translator), it doesn’t take a polyglot to realize that the state of the art for such technology still leaves much to be desired.

Linguistic relativity, however, goes at the heart of the semantic interoperability challenge. Take for example, one of today’s most useful semantic standards: the Resource Description Framework (RDF). RDF is a metadata data model intended for making statements about resources (in particular, Web-based resources) in the form of subject-predicate-object expressions. For example, you might be able to express the statement “ZapThink wrote this ZapFlash” in the triplet consisting of “ZapThink” (the subject); “wrote” (the predicate); and “this ZapFlash” (the object). Take this basic triplet building block and you can build semantic webs of arbitrary complexity, with the eventual goal of describing the relationships among all business entities within a particular business context.

The problem with the approach RDF takes, however, is that the subject-predicate-object structure is Eurocentric. Non-European languages (and hence, non-European speakers) don’t necessarily think in sentences that follow this structure. And furthermore, this problem isn’t new. In fact, the research into this phenomenon dates back to the 1940s, with the work of linguist Benjamin Lee Whorf. Whorf conducted linguistic research among the Hopi and other Native American peoples, and thus established an empirical basis for linguistic relativity. The illustration below comes from one of his seminal papers on the subject:



In the graphic above, Whorf compares a simple sentence, “I clean it with a ramrod,” where “it” refers to a gun, in English and Shawnee. The English sentence predictably follows the subject-predicate-object format that RDF leverages. The Shawnee translation, however, translates literally to “dry space/interior of hole/by motion of tool or instrument.” Not only is there no one-to-one correspondence between parts of speech across the two sentences, but the entire context of the expression is different. If you were in the unenviable position of establishing RDF-based semantic interoperability between, say, a British business and a Shawnee business, you’d find RDF far too culturally specific to rise to the challenge.

The ZapThink Take
We have tools for semantic interoperability today, of course – but all such tools require the human step of configuring or training the tool to understand the properties and relationships among entities. Once you’ve trained the tool, it’s possible to automate many semantic interactions. But to get this process started, we must get together in a room with the people we want to communicate with and hammer out the meanings of the terms we’d like to use.

This human component to semantic interoperability actually dates to the Stone Age. How did we do business in the Stone Age? Say your tribe was on the coast, so you had fish. You were getting tired of fish, so you and your tribemates decided to pack up some fish and bring the bundle to the next village where they had fruit. You showed up at the village market, only you had no common language. So what did you do? You held up some fish, pointed to some fruit, grunted, and waved your hands. If you established a basis of communication, you conducted business, and went home with some fruit. If not, then you went home empty handed (or you pulled out your clubs and attacked, but that’s another story). Cut to the 21st century, and little has changed. People still have to get together and establish a basis of communication as human beings in order to facilitate semantic interoperability. But fully automating such interoperability is as close as the next rainbow.

More Stories By Jason Bloomberg

Jason Bloomberg is the leading expert on architecting agility for the enterprise. As president of Intellyx, Mr. Bloomberg brings his years of thought leadership in the areas of Cloud Computing, Enterprise Architecture, and Service-Oriented Architecture to a global clientele of business executives, architects, software vendors, and Cloud service providers looking to achieve technology-enabled business agility across their organizations and for their customers. His latest book, The Agile Architecture Revolution (John Wiley & Sons, 2013), sets the stage for Mr. Bloomberg’s groundbreaking Agile Architecture vision.

Mr. Bloomberg is perhaps best known for his twelve years at ZapThink, where he created and delivered the Licensed ZapThink Architect (LZA) SOA course and associated credential, certifying over 1,700 professionals worldwide. He is one of the original Managing Partners of ZapThink LLC, the leading SOA advisory and analysis firm, which was acquired by Dovel Technologies in 2011. He now runs the successor to the LZA program, the Bloomberg Agile Architecture Course, around the world.

Mr. Bloomberg is a frequent conference speaker and prolific writer. He has published over 500 articles, spoken at over 300 conferences, Webinars, and other events, and has been quoted in the press over 1,400 times as the leading expert on agile approaches to architecture in the enterprise.

Mr. Bloomberg’s previous book, Service Orient or Be Doomed! How Service Orientation Will Change Your Business (John Wiley & Sons, 2006, coauthored with Ron Schmelzer), is recognized as the leading business book on Service Orientation. He also co-authored the books XML and Web Services Unleashed (SAMS Publishing, 2002), and Web Page Scripting Techniques (Hayden Books, 1996).

Prior to ZapThink, Mr. Bloomberg built a diverse background in eBusiness technology management and industry analysis, including serving as a senior analyst in IDC’s eBusiness Advisory group, as well as holding eBusiness management positions at USWeb/CKS (later marchFIRST) and WaveBend Solutions (now Hitachi Consulting).

@BigDataExpo Stories
Dyn solutions are at the core of Internet Performance. Through traffic management, message management and performance assurance, Dyn is connecting people through the Internet and ensuring information gets where it needs to go, faster and more reliably than ever before. Founded in 2001 at WPI, Dyn’s global presence services more than four million enterprise, small business and personal customers.
SimpleECM is the only platform to offer a powerful combination of enterprise content management (ECM) services, capture solutions, and third-party business services providing simplified integrations and workflow development for solution providers. SimpleECM is opening the market to businesses of all sizes by reinventing the delivery of ECM services. Our APIs make the development of ECM services simple with the use of familiar technologies for a frictionless integration directly into web applicat...
Samsung VP Jacopo Lenzi, who headed the company's recent SmartThings acquisition under the auspices of Samsung's Open Innovaction Center (OIC), answered a few questions we had about the deal. This interview was in conjunction with our interview with SmartThings CEO Alex Hawkinson. IoT Journal: SmartThings was developed in an open, standards-agnostic platform, and will now be part of Samsung's Open Innovation Center. Can you elaborate on your commitment to keep the platform open? Jacopo Lenzi: S...
The Internet of Things will greatly expand the opportunities for data collection and new business models driven off of that data. In her session at Internet of @ThingsExpo, Esmeralda Swartz, CMO of MetraTech, will discuss how for this to be effective you not only need to have infrastructure and operational models capable of utilizing this new phenomenon, but increasingly service providers will need to convince a skeptical public to participate. Get ready to show them the money! Speaker Bio: ...
Things are being built upon cloud foundations to transform organizations. This CEO Power Panel at 15th Cloud Expo, moderated by Roger Strukhoff, Cloud Expo and @ThingsExpo conference chair, will address the big issues involving these technologies and, more important, the results they will achieve. How important are public, private, and hybrid cloud to the enterprise? How does one define Big Data? And how is the IoT tying all this together?
The only place to be June 9-11 is Cloud Expo & @ThingsExpo 2015 East at the Javits Center in New York City. Join us there as delegates from all over the world come to listen to and engage with speakers & sponsors from the leading Cloud Computing, IoT & Big Data companies. Cloud Expo & @ThingsExpo are the leading events covering the booming market of Cloud Computing, IoT & Big Data for the enterprise. Speakers from all over the world will be hand-picked for their ability to explore the economic...
SYS-CON Events announced today that Cloudian, Inc., the leading provider of hybrid cloud storage solutions, has been named “Bronze Sponsor” of SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Cloudian is a Foster City, Calif.-based software company specializing in cloud storage. Cloudian HyperStore® is an S3-compatible cloud object storage platform that enables service providers and enterprises to bui...
SAP is delivering break-through innovation combined with fantastic user experience powered by the market-leading in-memory technology, SAP HANA. In his General Session at 15th Cloud Expo, Thorsten Leiduck, VP ISVs & Digital Commerce, SAP, will discuss how SAP and partners provide cloud and hybrid cloud solutions as well as real-time Big Data offerings that help companies of all sizes and industries run better. SAP launched an application challenge to award the most innovative SAP HANA and SAP ...
SYS-CON Events announced today that O'Reilly Media has been named “Media Sponsor” of SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. O'Reilly Media spreads the knowledge of innovators through its books, online services, magazines, and conferences. Since 1978, O'Reilly Media has been a chronicler and catalyst of cutting-edge development, homing in on the technology trends that really matter and spurri...
The Internet of Things (IoT) is going to require a new way of thinking and of developing software for speed, security and innovation. This requires IT leaders to balance business as usual while anticipating for the next market and technology trends. Cloud provides the right IT asset portfolio to help today’s IT leaders manage the old and prepare for the new. Today the cloud conversation is evolving from private and public to hybrid. This session will provide use cases and insights to reinforce t...
Samsung promises to be one of the 800-pound gorillas of the IoT, if its success in recent years with Android devices and other consumer electronics is any guide. Showing its willingness to be a big IoT player, the company recently acquired SmartThings, a recent startup that's developed an open smarthome appliation that currently supports 1,000 devices and 8,000 apps. SmartThings will now work under the auspices of Samsung's Open Innovation Center (OIC). SmartThings Founder and CEO Alex Hawkinson...
What process has your provider undertaken to ensure that the cloud tenant will receive predictable performance and service? What was involved in the planning? Who owns and operates the data center? What technology is being used? How is it being supported? In his session at 14th Cloud Expo, Dave Weisbrot, Cloud Business Manager for QTS, will provide the attendees a look into what it takes to stand up and stand behind a highly available certified cloud IaaS.
SYS-CON Events announced today that Gigaom Research has been named "Media Sponsor" of SYS-CON's 15th International Cloud Expo®, which will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Ashar Baig, Research Director, Cloud, at Gigaom Research, will also lead a Power Panel on the topic "Choosing the Right Cloud Option." Gigaom Research provides timely, in-depth analysis of emerging technologies for individual and corporate subscribers. Gigaom Research'...
I'll be hosting an SAP HANA Cloud webinar at 11am eastern time, Wednesday, October 29. You can sign up now. Featured speakers will be Allan Adler, Managing Partner, Channel Cloud Consulting, and Thorsten Leiduck, VP ISVs & Digital Commerce, SAP. Attendees will learn about • Cloud economics, hybrid cloud strategy, market size and opportunity • Introduction to SAP HANA Cloud Platform and how to: - Build new next-generation applications - Extend on-premise solutions non-disruptively throu...
Join both SAP and Channel Cloud Consulting for our webcast and uncover how you can extend your reach to capture a piece of the US$17 billion cloud application services market with SAP. Learn about SAPs market-leading SAP HANA Cloud Platform and an exciting opportunity to join SAPs growing ecosystem of Application Development partners. When: October 29, 11:00am EST Speakers: Allan Adler, Managing Partner, Channel Cloud Consulting Thorsten Leiduck, Vice President ISVs & Digital Commerce, SAP
Application Performance Management (APM) has been bred with all the right elements to give us the insights we need to see the health of our applications. Similar to your most trusted watch dog, it alerts us to anomalies when events occur, providing awareness to the environment that only they can observe. As enterprises embrace the DevOps philosophy, and the coalescence of the Development and Operations continues, I foresee the conditions ripening to foster innovative methods of making applicati...
SYS-CON Events announced today that IBM is holding a Bluemix Developer Playground on November 5, 10:30 am to 5:30 pm at 15th Cloud Expo. 15th Cloud Expo, co-located with @ThingsExpo, Big Data Expo, and DevOps Summit is taking place Nov. 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. The labs, for developers of all levels, will highlight the ease of use of Bluemix, its services and functionality and provide short-term introductory projects that developers can complete betw...
The Industrial Internet revolution is now underway, enabled by connected machines and billions of devices that communicate and collaborate. The massive amounts of Big Data requiring real-time analysis is flooding legacy IT systems and giving way to cloud environments that can handle the unpredictable workloads. Yet many barriers remain until we can fully realize the opportunities and benefits from the convergence of machines and devices with Big Data and the cloud, including interoperability, da...
Software AG helps organizations transform into Digital Enterprises, so they can differentiate from competitors and better engage customers, partners and employees. Using the Software AG Suite, companies can close the gap between business and IT to create digital systems of differentiation that drive front-line agility. We offer four on-ramps to the Digital Enterprise: alignment through collaborative process analysis; transformation through portfolio management; agility through process automation...
How do you know when a technology has become mainstream? A good clue may be when politicians start talking about it on the campaign trail and with mainstream media. David Cameron, the UK prime minister, was the latest, indicating that the world was now on “fast-forward” with the Internet of Things (IoT) ushering in the new industrial revolution. No mention of IoT targeted at the masses would be complete without the clichéd example of the communicating fridge. While it is easy to get caught up in...