@CloudExpo: Article

Take Control of Your Schemalessness with Dynamic Schemas

Addressing the inflexibility of structured data by enabling schemaless data to be dynamically and logically structured

Static data structures have been at the heart of data processing tools since the dawn of computing, but they have always limited the flexibility of the organizations that rely on the data. Recently, the rise of flexible formats like JSON has led to schemaless data as an attempt to increase agility. However, schemaless data have proven difficult to work with, because they hide rigid structure in the form of implied schemas.

EnterpriseWeb addresses both the inflexibility of structured data and the impracticality of schemaless data by enabling schemaless data to be dynamically and logically structured.

From the fixed-length fields of the 1950s, to the relational structures of modern database management systems, to the semistructured data formats XML and JSON, the structure of our data has always informed code about how it should be processed. Data are defined by their relationships, and we used to hard-code those relationships into rigid structures. That approach allows only one static view, which is difficult to work with, and even more difficult to change. Nevertheless, such rigid data structures - and the models that represent them - are an integral part of enterprise information management.

Traditional relational database management systems (RDBMSs) exemplify this point with their static entity-relationship models (ERMs) and tightly interconnected data structures. XML improves this situation slightly, allowing semi-structured information, but schemas still constrain flexibility and performance. With both approaches, fixed definitions, views, and reports limit the ability of businesses to freely transform information into insight, and they become obstacles to systemwide change.

The Rise of Schemalessness
This challenge of inflexible data structures has given rise to schemaless data. With JSON in particular, we can create whatever data structure we like when we author data. We don't have to shoehorn data into rigid structures, so every record can have its own structure.

But there is a problem with schemaless data. Consider this simple task: how do you create a query for all the addresses in a particular ZIP Code if every record has a different name or format for ZIP Code? Schemalessness, after all, isn't magic. Even schemaless data require some kind of metadata so the code will know how to process them, which is what software development guru Martin Fowler calls an implied schema.

Implied schemas represent the structure inherent in any data record. If each address record has its own format, then that format provides the implied schema for that record. Dealing with implied schemas thus falls to the developer, who must figure out how to code software to process these implied schemas, which are different for each and every record.
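The ZIP Code problem can be made concrete with a minimal Python sketch. The three records, their field names, and the helper functions below are all hypothetical and chosen only for illustration. The sketch shows how a query written against one assumed format silently misses records, and how the knowledge of each implied schema ends up hand-coded by the developer.

```python
# Three hypothetical address records, each with its own implied schema.
records = [
    {"street": "1 Main St", "zip": "10001"},
    {"addr": {"line1": "2 Oak Ave", "postal_code": "10001"}},
    {"address": "3 Elm Rd", "ZipCode": 10001},  # ZIP stored as an integer
]


def naive_zip_query(records, zip_code):
    """Assume every record has a top-level 'zip' string field."""
    return [r for r in records if r.get("zip") == zip_code]


def zip_of(record):
    """Hand-coded knowledge of each implied schema, one branch per format."""
    if "zip" in record:
        return record["zip"]
    if "addr" in record and "postal_code" in record["addr"]:
        return record["addr"]["postal_code"]
    if "ZipCode" in record:
        return str(record["ZipCode"])
    return None  # unknown format: the record is invisible to the query


def schema_aware_zip_query(records, zip_code):
    """Query that works only because zip_of() knows every format in advance."""
    return [r for r in records if zip_of(r) == zip_code]


naive_matches = naive_zip_query(records, "10001")
aware_matches = schema_aware_zip_query(records, "10001")
print(len(naive_matches), len(aware_matches))
```

Every new record format requires another branch in `zip_of`, which is the maintenance burden the paragraph above describes.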

In his tutorial on schemalessness, Fowler explains the pros and cons of implied schemas. Although he acknowledges the power of schemalessness to support more flexible and responsive user experiences, he recommends avoiding both schemalessness and implied schemas for the sake of developer convenience. That is good advice with respect to traditional software, but the world of data is changing. Today we live in an increasingly schemaless world, where more often than not, the structure of our data is fluid or nonexistent.

Raising the Discussion to Dynamic Schemas
Fowler makes it clear that in the past it has been impractical from the developer's perspective to work systematically with schemaless data, because implied schemas are difficult to deal with. After all, structure is itself useful, and isn't the problem per se. Rather, the real challenge is how to avoid the limitations of static structure without falling into the trap of unmanageable schemaless data.

EnterpriseWeb's unique approach to modeling solves this critically important challenge by leveraging dynamic schemas that have flexible, metadata-driven relationships with underlying information. Using metadata this way separates concerns, letting people consider relationships from multiple perspectives, rather than from a single static point of view. In addition, it's now possible to change and extend metadata to meet diverse business needs without disruption.

Instead of settling for complex ERMs with their inflexible, tightly coupled data structures or dealing with the coding complexities of implied schemas, developers can project dynamic schemas from the metadata simply by writing different transformations. As a result, dynamic schemas are both developer-friendly and flexible - a welcome change from the difficult problem of schemalessness.
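One way to picture projecting schemas from metadata is the Python sketch below. It is an illustration under stated assumptions, not EnterpriseWeb's implementation, which the article does not describe. The names `FIELD_METADATA`, `resolve`, `project`, `MAILING_VIEW`, and `REGION_VIEW` are hypothetical. Metadata maps each logical field to the places it may live in a record, and each view is just a different set of transformations over that same metadata.

```python
from typing import Any, Callable, Dict, List, Tuple

Path = Tuple[str, ...]

# Hypothetical metadata: for each logical field, the paths where it may be found.
FIELD_METADATA: Dict[str, List[Path]] = {
    "street": [("street",), ("addr", "line1"), ("address",)],
    "zip": [("zip",), ("addr", "postal_code"), ("ZipCode",)],
}


def resolve(record: Dict[str, Any], paths: List[Path]) -> Any:
    """Return the value at the first path that exists in the record, else None."""
    for path in paths:
        node: Any = record
        for key in path:
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if node is not None:
            return node
    return None


# A view maps an output name to (logical field, transformation).
View = Dict[str, Tuple[str, Callable[[Any], Any]]]


def project(record: Dict[str, Any], view: View) -> Dict[str, Any]:
    """Project one record into the shape a view asks for."""
    out: Dict[str, Any] = {}
    for name, (logical_field, transform) in view.items():
        value = resolve(record, FIELD_METADATA[logical_field])
        out[name] = None if value is None else transform(value)
    return out


# Two different projections over the same underlying records.
MAILING_VIEW: View = {"street": ("street", str), "zip": ("zip", str)}
REGION_VIEW: View = {"zip3": ("zip", lambda z: str(z)[:3])}

records = [
    {"street": "1 Main St", "zip": "10001"},
    {"addr": {"line1": "2 Oak Ave", "postal_code": "10001"}},
    {"address": "3 Elm Rd", "ZipCode": 10001},
]

mailing = [project(r, MAILING_VIEW) for r in records]
regions = [project(r, REGION_VIEW) for r in records]
```

In this sketch, supporting a new record format means adding a path to the metadata, and supporting a new perspective means adding a view. Neither change touches the stored records or the existing views.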

Add an Agent for Performance
So far so good, but how do we build software to process all such data in a general way, freeing ourselves from custom coding for implied schemas? The solution is an intelligent agent.

EnterpriseWeb's intelligent agent, SmartAlex™, is a distributable transaction manager that resolves dynamic schemas for each interaction. Every human or system client interaction is a request for SmartAlex to interpret dynamic schemas (as well as other models and additional metadata) and translate them to a context-specific set of resources in order to construct a custom response.
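The article does not describe SmartAlex's internals, so the following Python sketch is purely illustrative. `ToyAgent`, its `handle` method, and the context names are hypothetical and are not EnterpriseWeb's API. It shows only the general idea of resolving a schema per interaction: the same resources produce different responses depending on the context of the request.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
# A schema maps each output field to a function that extracts it from a record.
Schema = Dict[str, Callable[[Record], Any]]


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent; not EnterpriseWeb's actual design."""

    schemas: Dict[str, Schema]
    resources: List[Record]
    log: List[str] = field(default_factory=list)

    def handle(self, context: str) -> List[Record]:
        """Resolve the schema for this interaction and build a custom response."""
        schema = self.schemas[context]
        self.log.append(f"resolved schema for context={context}")
        return [
            {name: extract(r) for name, extract in schema.items()}
            for r in self.resources
        ]


def any_zip(record: Record) -> str:
    """Read the ZIP Code from either of two assumed field names."""
    return str(record.get("zip") or record.get("ZipCode"))


agent = ToyAgent(
    schemas={
        "shipping": {"zip": any_zip},
        "analytics": {"region": lambda r: any_zip(r)[:3]},
    },
    resources=[{"zip": "10001"}, {"ZipCode": 94105}],
)

shipping = agent.handle("shipping")
analytics = agent.handle("analytics")
```

Because the schemas are plain data held by the agent, a schema can be replaced at run time (for example, `agent.schemas["shipping"] = {...}`) without changing the stored resources.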

This agent-oriented approach maximizes performance for such dynamic computing. In the background, SmartAlex handles all run-time connection and transformation details, so programmers no longer have to integrate resources manually for varied and unanticipated uses. The result is greater IT productivity and greater business agility.

SmartAlex logs all system events, indexes all new and updated resources, and tags all changes in relationships for a detailed and navigable audit history. This practice creates a feedback loop, as SmartAlex leverages the same indexed logs to guide its execution. Data, code, and user interface components, as well as connectors for federated services, systems, databases, and devices, can be updated or replaced without breaking related apps and processes, because SmartAlex is 'aware' of the changes. In this way EnterpriseWeb supports real-time exception and change management for resilient solutions that can evolve naturally.

The EnterpriseWeb Take
Schemalessness was a reaction to the limitations of structured data. People struggled with the constraints of static structure, and figured that if they simply got rid of structure, then the problem would go away. But this move was merely a shell game, as the limitations of fixed schemas shifted to implied schemas, now without the benefits of structure to inform the code responsible for their processing.

The solution is to raise the level of abstraction and, instead of arguing over fixed vs. implied schemas, to work at the dynamic schema level. Such an approach is model-driven, allowing application designers to build models that capture their data structures, and allowing an intelligent agent to use the metadata each model represents to meet the specific needs of each interaction. The real lesson here is that meeting the challenge of schemalessness requires both dynamic schemas and the action of the agent. Stay tuned to my next newsletter for more information.

More Stories By Jason Bloomberg

Jason Bloomberg is the leading expert on architecting agility for the enterprise. As president of Intellyx, Mr. Bloomberg brings his years of thought leadership in the areas of Cloud Computing, Enterprise Architecture, and Service-Oriented Architecture to a global clientele of business executives, architects, software vendors, and Cloud service providers looking to achieve technology-enabled business agility across their organizations and for their customers. His latest book, The Agile Architecture Revolution (John Wiley & Sons, 2013), sets the stage for Mr. Bloomberg’s groundbreaking Agile Architecture vision.

Mr. Bloomberg is perhaps best known for his twelve years at ZapThink, where he created and delivered the Licensed ZapThink Architect (LZA) SOA course and associated credential, certifying over 1,700 professionals worldwide. He is one of the original Managing Partners of ZapThink LLC, the leading SOA advisory and analysis firm, which was acquired by Dovel Technologies in 2011. He now runs the successor to the LZA program, the Bloomberg Agile Architecture Course, around the world.

Mr. Bloomberg is a frequent conference speaker and prolific writer. He has published over 500 articles, spoken at over 300 conferences, Webinars, and other events, and has been quoted in the press over 1,400 times as the leading expert on agile approaches to architecture in the enterprise.

Mr. Bloomberg’s previous book, Service Orient or Be Doomed! How Service Orientation Will Change Your Business (John Wiley & Sons, 2006, coauthored with Ron Schmelzer), is recognized as the leading business book on Service Orientation. He also co-authored the books XML and Web Services Unleashed (SAMS Publishing, 2002), and Web Page Scripting Techniques (Hayden Books, 1996).

Prior to ZapThink, Mr. Bloomberg built a diverse background in eBusiness technology management and industry analysis, including serving as a senior analyst in IDC’s eBusiness Advisory group, as well as holding eBusiness management positions at USWeb/CKS (later marchFIRST) and WaveBend Solutions (now Hitachi Consulting).
