@BigDataExpo: Blog Post

NoSQL: Filling the Gaps By @MapR

Today's web-based apps demand databases that are flexible, high performing, and easily scalable.

NoSQL: Filling the Gaps in Your Traditional Relational Database

With so many NoSQL databases available today, each with different characteristics, it's not always clear how best to categorize them. Typically, though, NoSQL databases are labeled according to their data model, most commonly key-value, wide-column, document, and graph. But more important than the differences between them are the reasons why they are growing in popularity as a whole. In general, NoSQL databases are meant to fill some of the capability gaps found in traditional relational database management systems (RDBMS).

Today's web-based applications are highly demanding, so the databases that hold their vast stores of information are not only expected to be very flexible in nature (supporting various data formats), but also to manage extreme performance and scaling. Enterprise architects have many great technology choices today, and it makes more sense than ever to spend the time to assess application requirements thoroughly before suggesting an appropriate database.

Initially promoted as a departure from, and improvement upon, the traditional RDBMS model, NoSQL is considered an independent and specialized database technology to support complex business needs. But NoSQL advantages don't necessarily represent an opportunity for outright replacement of RDBMSs. An enterprise architect's recommendation of RDBMS versus NoSQL comes down to the nature of the operational requirements, in other words, particular use cases. In some cases, an RDBMS is more appropriate, and sometimes quicker, for specific data management tasks compared to NoSQL, and the use of NoSQL could be problematic for the application as well as for the business. A look at some of the fundamental features of database technologies will lay the groundwork for a general understanding of how NoSQL and RDBMSs differ, and how the former can fill in the gaps of the latter.

RDBMS: transactional, consistent

RDBMSs have been around for decades and have become the foundation for many business applications today. They deliver good performance at volumes of thousands of transactions per second, which decades ago was enough. They also support multi-update transactions, known by computer scientists as "ACID transactions." The ACID transaction principles (atomicity, consistency, isolation, and durability) ensure data integrity even when multiple data elements are updated together. But modern internet applications, especially those operating in real time, such as fraud detection, risk analysis, advertising, and gaming, involve millions of transactions per second and typically don't need multi-update transactional guarantees. RDBMSs struggle to manage the sheer volume of these newer online transaction processing (OLTP) applications. RDBMS vendors are investing time, money, and effort into overcoming these issues, but for now the gap remains.
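As a minimal sketch of what a multi-update ACID transaction buys you, the example below uses Python's built-in sqlite3 module as a stand-in for an RDBMS. The `accounts` table and the `transfer` function are hypothetical names chosen for illustration. The credit is applied first on purpose, so that a failing debit shows the earlier update being rolled back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two rows: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back together with it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds: both rows change
failed = transfer(conn, "alice", "bob", 500)  # fails: neither row changes
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After both calls, the balances reflect only the successful transfer. The failed attempt leaves no partial credit behind, which is the atomicity guarantee described above.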

Another strength of RDBMSs is the implementation of SQL ("Structured Query Language") as the de facto standard for data processing tasks such as data query, data definition, and data manipulation. SQL, and thus RDBMSs, require a predefined tabular structure of rows and columns to ensure that data is stored in a consistent and expected format. The widespread knowledge of SQL, along with the consistency of the data model, makes RDBMSs well suited for business intelligence analysts. Any application that requires multi-element updates, such as those dealing with financial data, and that takes advantage of the power and ubiquity of SQL, should use an RDBMS.
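The predefined structure can be illustrated with a short sketch, again using Python's sqlite3 module as a stand-in for an RDBMS. The `orders` table and its columns are hypothetical. The example shows two things: a write that names a column outside the declared schema is rejected, and a standard SQL aggregate runs directly against the consistent table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: the table's columns are declared before any data arrives.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "region TEXT NOT NULL, "
    "amount REAL NOT NULL)"
)

# Data manipulation: rows must fit the declared columns.
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("west", 120.0), ("east", 80.0), ("west", 30.0)],
)

# A row with an undeclared column is rejected instead of being stored.
try:
    conn.execute(
        "INSERT INTO orders (region, amount, coupon) VALUES ('east', 10.0, 'X1')"
    )
    schema_enforced = False
except sqlite3.OperationalError:
    schema_enforced = True

# Data query: the kind of aggregate a business intelligence analyst might run.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
```

Because every row has the same shape, the aggregate query needs no per-record format checks.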

NoSQL: scalable, flexible

Most NoSQL databases give up multi-row ACID transaction capabilities to achieve cost-effective scalability and performance. For a certain class of use cases, this tradeoff is perfectly acceptable, especially when the data set has records that are independent of one another, so an update to one record never forces an immediate update to another. And with regard to the data model, NoSQL databases, in contrast with RDBMSs, use different formats to store data. This flexibility lets application developers store heterogeneous record formats in the same database, which is often needed for large-scale data sets loaded from numerous distinct sources. The most popular NoSQL data formats are key-value, wide-column, and document (like JSON). Among them, the key-value databases are the simplest, while wide-column databases provide the closest data model to RDBMS tables. NoSQL databases bypass some of the hard constraints used in RDBMS architectures to achieve data storage flexibility, scalability, and performance.

Use cases: NoSQL suitability over RDBMS

There are many applications for which the traditional ACID-driven RDBMS model is not the easiest or best option. Here's another look at what NoSQL can do:

  • A key-value database is ideal if your application requires only the storage of data for the purposes of a quick lookup. In this situation, an RDBMS is unnecessarily complex, and a key-value database will be more than sufficient to meet your application's needs. The value itself can be as simple or as complex as you require.
  • A wide-column database is ideal if your data is structured like an RDBMS table but the columns might be different across rows. This means that different types of records can be stored in the same table, and existing rows can be easily updated to accommodate new columns.
  • A document database is ideal if your data entails hierarchical objects. RDBMSs could accomplish this with the help of object relational mapping (ORM) tools, but the work is often complex, and it's not worth complicating matters when NoSQL solutions are available.
  • A graph database is ideal if your application deals with connected networks of entities or large trees. Graph databases are good at quickly identifying linkages between related entities.
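The four data models in the list above can be sketched with plain Python structures. This is only an in-memory illustration of the shape of the data, not the API of any particular NoSQL product. All names (`sessions`, `users`, `order`, `follows`, `reachable`) are hypothetical.

```python
from collections import deque

# Key-value: a value of any complexity, fetched by its key in one lookup.
sessions = {"session:42": {"user": "dale", "cart": ["sku-1", "sku-9"]}}
cart = sessions["session:42"]["cart"]

# Wide-column: rows live in one table, but each row may carry different columns.
users = {
    "row-1": {"name": "Ana", "email": "ana@example.com"},
    "row-2": {"name": "Raj", "phone": "555-0100"},
}
users["row-1"]["loyalty_tier"] = "gold"  # add a new column to an existing row

# Document: a hierarchical object stored and read as a single unit.
order = {
    "id": 7,
    "customer": {"name": "Ana"},
    "items": [{"sku": "sku-1", "qty": 2}, {"sku": "sku-9", "qty": 1}],
}
total_qty = sum(item["qty"] for item in order["items"])

# Graph: entities plus the links between them.
follows = {"ana": ["raj", "li"], "raj": ["li"], "li": ["sam"], "sam": []}

def reachable(graph, start):
    """Return every node linked to `start`, directly or indirectly (breadth-first)."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {start}

linked_to_raj = reachable(follows, "raj")
```

In each case the access pattern drives the choice: a single lookup by key, rows with varying columns, a nested object read whole, or a traversal across links.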

Major influencing factors: scaling and performance

RDBMSs were not originally designed for horizontal scaling (in other words, adding more servers to the mix), so they get overwhelmed when the load and data volume grow beyond expectations. RDBMSs are not great at distributing data across servers, a process known as "sharding." Either they cannot shard automatically, which requires application-level code to distribute data, or the scale-out architecture is expensive.

Here, NoSQL solutions have an advantage. A NoSQL database does not have to break up records into distinct pieces across servers. Each record is always logically stored in a single place, which simplifies the distribution process. And since no referential integrity is required between these logical entities, a NoSQL database can easily shard data automatically. Though NoSQL solutions have certain limitations when compared to the relational model, they were designed with the intention of providing high scalability. A NoSQL solution can scale horizontally over a distributed environment and support high availability.
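A minimal sketch of key-based sharding, assuming a simple hash-modulo placement, is shown below. The `ShardedStore` class is hypothetical, and its in-memory dictionaries stand in for separate servers. The point it illustrates is that each record is kept whole and lands on exactly one shard, so a read or write touches a single place.

```python
import hashlib

class ShardedStore:
    """Toy store that places each whole record on one shard, chosen by key hash."""

    def __init__(self, shard_count):
        # Each dict stands in for one server in the cluster.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash keeps placement consistent across processes and restarts.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, record):
        # The record is never split: it goes to one shard in its entirety.
        self.shards[self._shard_for(key)][key] = record

    def get(self, key):
        # Only the owning shard is consulted.
        return self.shards[self._shard_for(key)].get(key)

store = ShardedStore(shard_count=4)
for i in range(100):
    store.put(f"user:{i}", {"id": i, "tags": ["demo"]})

# Count how many shards hold each key; every count should be exactly one.
placements = [
    sum(1 for shard in store.shards if f"user:{i}" in shard) for i in range(100)
]
```

Modulo placement is the simplest option and is used here only for brevity. Changing the shard count remaps most keys, which is one reason production systems tend to use other schemes such as consistent hashing or range partitioning.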

Overall performance depends largely on the selection of appropriate technology (NoSQL or RDBMS) for a specific use case. Thus, if NoSQL is selected for the wrong use case, it can hurt the performance of the application. Apart from factors like network, caching, and disk I/O, performance is dictated by the data and its allocation across the distributed storage. NoSQL is capable of handling large volumes of data in a clustered environment, boosting performance as a result.

It's important to remember that NoSQL is not a substitute for traditional RDBMSs, but an alternate solution to address a different set of use cases for which RDBMSs were not designed. Many NoSQL technologies exist on the market today, and if your NoSQL requirements include enterprise-grade reliability, high performance, and fast responsiveness, a production-proven NoSQL database worth investigating is MapR-DB by MapR Technologies.

More Stories By Dale Kim

Dale is Director of Industry Solutions at MapR. His technical and managerial experience includes work with relational databases, as well as non-relational data in the areas of search, content management, and NoSQL. Dale holds an MBA from Santa Clara University and a BA in Computer Science from UC Berkeley.
