Click here to close now.



Welcome!

@BigDataExpo Authors: Liz McMillan, Pat Romanski, Elizabeth White, Scott Allen, Harry Trott

Related Topics: Java IoT, Microservices Expo, Linux Containers, IoT User Interface, Agile Computing, @BigDataExpo

Java IoT: Article

Sync Your Timeouts: When Load Balancers Cause Database Deadlocks

Have you seen this error message before “java.sql.Exception: ORA-00060: deadlock detected while waiting for resource”?

Have you seen this error message before "java.sql.Exception: ORA-00060: deadlock detected while waiting for resource"?

This is caused when parallel updates require locks on either rows or tables in your database. I recently ran into this exception on an instance of an IBM eCommerce Server. The first thought was that there are simply too many people hitting the same functionality that updates Sales Tax Summary information - which was showing up in the call stack of the exception:

Exception stack trace showing that createOrderTaxes ran into the deadlock issue on the database

The logical conclusion would be to blame this on too many folks accessing this functionality or outdated table statistics causing update statements to run too long causing others to run into that lock. It turned out to be caused by something that wasn't that obvious and wouldn't have shown up in any Exception stack traces or log files. A misconfigured timeout setting on the load balancer caused a re-execute of the original incoming web request. While the first app server was still updating the table and holding the lock - as it had a longer timeout specified as the load balancer - the second app server tried to do the same thing causing that exception.

In this article I'll show you the steps necessary to analyze the symptoms (timeouts and client errors) and to identify and fix the root cause of the problem.

Step #1: Identifying Who and What Is Impacted
Identifying failing results is easy - start by looking at HTTP Response Codes (Web Server Access Log), Severe Log Messages (App Server Logs) or problematic Exception objects (either in log files or other monitoring tools). In our case I identified the SQL Lock Exception, a corresponding severe log message and the resulting HTTP 500 and also traced it back to the individual users and their actions that caused these issues.

Linking the errors to the User Action reveals that the problem happens when adding items to the shopping cart

This impacts our business.

Now we know that this problem impacts a critical feature in our app: Users can't add items to their cart.

Step #2: Understanding the Transaction Flow
Before drilling deeper I typically get an overview of the flow of the transaction from the browser all the way back to the database. This high-level view lets me understand which application components are involved and how they are interconnected. The transaction flow in this case highlights some interesting issues with the "Add Item to Cart" click. It appears to execute more than 33k SQL Statements for this single user interaction, causing 45% time executed just in Oracle:

Transaction Flow highlights several hotspots such as 33k SQL Executions in Total and Load Balancer (IHS) splitting up a request

This Is an Architectural Problem
Getting the full end-to-end execution path for a single user interaction (Add Item to Cart) and seeing how it "branches" out makes it really obvious how these individual problems (too many SQL, High Execution Time, ...) end up impacting end users. Just spotting the individual hotspots without having this connection would make it harder to understand the real root cause.

For steps 3 & 4, and for a list of key takeaways, click here for the full article

More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@BigDataExpo Stories
Cloud Expo, Inc. has announced today that Andi Mann returns to 'DevOps at Cloud Expo 2016' as Conference Chair The @DevOpsSummit at Cloud Expo will take place on November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. "DevOps is set to be one of the most profound disruptions to hit IT in decades," said Andi Mann. "It is a natural extension of cloud computing, and I have seen both firsthand and in independent research the fantastic results DevOps delivers. So I am excited t...
IoT offers a value of almost $4 trillion to the manufacturing industry through platforms that can improve margins, optimize operations & drive high performance work teams. By using IoT technologies as a foundation, manufacturing customers are integrating worker safety with manufacturing systems, driving deep collaboration and utilizing analytics to exponentially increased per-unit margins. However, as Benoit Lheureux, the VP for Research at Gartner points out, “IoT project implementers often ...
Unless your company can spend a lot of money on new technology, re-engineering your environment and hiring a comprehensive cybersecurity team, you will most likely move to the cloud or seek external service partnerships. In his session at 18th Cloud Expo, Darren Guccione, CEO of Keeper Security, revealed what you need to know when it comes to encryption in the cloud.
"We work in the area of Big Data analytics and Big Data analytics is a very crowded space - you have Hadoop, ETL, warehousing, visualization and there's a lot of effort trying to get these tools to talk to each other," explained Mukund Deshpande, head of the Analytics practice at Accelerite, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
"SpeedyCloud's specialty lies in providing cloud services - we provide IaaS for Internet and enterprises companies," explained Hao Yu, CEO and co-founder of SpeedyCloud, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Creating replica copies to tolerate a certain number of failures is easy, but very expensive at cloud-scale. Conventional RAID has lower overhead, but it is limited in the number of failures it can tolerate. And the management is like herding cats (overseeing capacity, rebuilds, migrations, and degraded performance). Download Slide Deck: ▸ Here In his general session at 18th Cloud Expo, Scott Cleland, Senior Director of Product Marketing for the HGST Cloud Infrastructure Business Unit, discusse...
Machine Learning helps make complex systems more efficient. By applying advanced Machine Learning techniques such as Cognitive Fingerprinting, wind project operators can utilize these tools to learn from collected data, detect regular patterns, and optimize their own operations. In his session at 18th Cloud Expo, Stuart Gillen, Director of Business Development at SparkCognition, discussed how research has demonstrated the value of Machine Learning in delivering next generation analytics to imp...
The cloud promises new levels of agility and cost-savings for Big Data, data warehousing and analytics. But it’s challenging to understand all the options – from IaaS and PaaS to newer services like HaaS (Hadoop as a Service) and BDaaS (Big Data as a Service). In her session at @BigDataExpo at @ThingsExpo, Hannah Smalltree, a director at Cazena, provided an educational overview of emerging “as-a-service” options for Big Data in the cloud. This is critical background for IT and data profession...
It's easy to assume that your app will run on a fast and reliable network. The reality for your app's users, though, is often a slow, unreliable network with spotty coverage. What happens when the network doesn't work, or when the device is in airplane mode? You get unhappy, frustrated users. An offline-first app is an app that works, without error, when there is no network connection. In his session at 18th Cloud Expo, Bradley Holt, a Developer Advocate with IBM Cloud Data Services, discussed...
A strange thing is happening along the way to the Internet of Things, namely far too many devices to work with and manage. It has become clear that we'll need much higher efficiency user experiences that can allow us to more easily and scalably work with the thousands of devices that will soon be in each of our lives. Enter the conversational interface revolution, combining bots we can literally talk with, gesture to, and even direct with our thoughts, with embedded artificial intelligence, wh...
Whether your IoT service is connecting cars, homes, appliances, wearable, cameras or other devices, one question hangs in the balance – how do you actually make money from this service? The ability to turn your IoT service into profit requires the ability to create a monetization strategy that is flexible, scalable and working for you in real-time. It must be a transparent, smoothly implemented strategy that all stakeholders – from customers to the board – will be able to understand and comprehe...
As organizations shift towards IT-as-a-service models, the need for managing and protecting data residing across physical, virtual, and now cloud environments grows with it. Commvault can ensure protection, access and E-Discovery of your data – whether in a private cloud, a Service Provider delivered public cloud, or a hybrid cloud environment – across the heterogeneous enterprise. In his general session at 18th Cloud Expo, Randy De Meno, Chief Technologist - Windows Products and Microsoft Part...
In his keynote at 18th Cloud Expo, Andrew Keys, Co-Founder of ConsenSys Enterprise, provided an overview of the evolution of the Internet and the Database and the future of their combination – the Blockchain. Andrew Keys is Co-Founder of ConsenSys Enterprise. He comes to ConsenSys Enterprise with capital markets, technology and entrepreneurial experience. Previously, he worked for UBS investment bank in equities analysis. Later, he was responsible for the creation and distribution of life sett...
What does it look like when you have access to cloud infrastructure and platform under the same roof? Let’s talk about the different layers of Technology as a Service: who cares, what runs where, and how does it all fit together. In his session at 18th Cloud Expo, Phil Jackson, Lead Technology Evangelist at SoftLayer, an IBM company, spoke about the picture being painted by IBM Cloud and how the tools being crafted can help fill the gaps in your IT infrastructure.
University of Colorado Athletics has selected FORTRUST, Colorado’s only Tier III Gold certified data center, as their official data center and colocation services provider, FORTRUST announced today. A nationally recognized and prominent collegiate athletics program, CU provides a high quality and comprehensive student-athlete experience. The program sponsors 17 varsity teams and in their history, the Colorado Buffaloes have collected an impressive 28 national championships. Maintaining uptime...
IoT is rapidly changing the way enterprises are using data to improve business decision-making. In order to derive business value, organizations must unlock insights from the data gathered and then act on these. In their session at @ThingsExpo, Eric Hoffman, Vice President at EastBanc Technologies, and Peter Shashkin, Head of Development Department at EastBanc Technologies, discussed how one organization leveraged IoT, cloud technology and data analysis to improve customer experiences and effi...
The idea of comparing data in motion (at the sensor level) to data at rest (in a Big Data server warehouse) with predictive analytics in the cloud is very appealing to the industrial IoT sector. The problem Big Data vendors have, however, is access to that data in motion at the sensor location. In his session at @ThingsExpo, Scott Allen, CMO of FreeWave, discussed how as IoT is increasingly adopted by industrial markets, there is going to be an increased demand for sensor data from the outermos...
The cloud market growth today is largely in public clouds. While there is a lot of spend in IT departments in virtualization, these aren’t yet translating into a true “cloud” experience within the enterprise. What is stopping the growth of the “private cloud” market? In his general session at 18th Cloud Expo, Nara Rajagopalan, CEO of Accelerite, explored the challenges in deploying, managing, and getting adoption for a private cloud within an enterprise. What are the key differences between wh...
Presidio has received the 2015 EMC Partner Services Quality Award from EMC Corporation for achieving outstanding service excellence and customer satisfaction as measured by the EMC Partner Services Quality (PSQ) program. Presidio was also honored as the 2015 EMC Americas Marketing Excellence Partner of the Year and 2015 Mid-Market East Partner of the Year. The EMC PSQ program is a project-specific survey program designed for partners with Service Partner designations to solicit customer feedbac...
"Avere Systems is a hybrid cloud solution provider. We have customers that want to use cloud storage and we have customers that want to take advantage of cloud compute," explained Rebecca Thompson, VP of Marketing at Avere Systems, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.