@BigDataExpo: Article

Balancing the Load

Why you need to constantly monitor application performance

A question that every online application provider will eventually face is: Does my application scale? Can I add an extra 100 users and still ensure the same user experience? If the application architecture is properly designed, the easiest way to scale is to put an additional server behind the load balancer to handle more traffic.

In this article we recount an incident at one of our clients, in which the cause of poor application performance was eventually traced to problems with the load balancing of the application servers.

HTTP Server (500) Errors Go Through the Roof
Around 8 am, the Operations team at Rendoosia Inc. (name changed for commercial reasons) got an alert from the APM tool that one of three SharePoint servers was generating many HTTP Server (500) errors. All three servers were behind a load balancer, so the team decided to analyze the overall performance of all three servers using the report presented in Figure 1.

Figure 1: Overview of the three SharePoint servers behind one load balancer with some KPIs: usage, response time and number of errors; two servers show performance problems

The Operations team noticed the following issues:

  1. The x.x.x.155 server (row marked with the blue box) was under significantly lower load than the other two: 7k operations compared to almost 30k for each of the other servers. Both the load and the number of users were shared equally between the other two servers, x.x.x.154 and x.x.x.156.
  2. Although server x.x.x.155 had the lowest user count, it was reporting the longest processing time.
  3. Server x.x.x.156 was reporting a high number of HTTP 5xx errors (marked with the red box).
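
The first observation, one server carrying far less load than its peers, is the kind of check that can be automated. The following is a minimal sketch, not the APM tool's own logic: the function name, the 50% threshold, and the rounded operation counts are assumptions made for illustration.

```python
from statistics import median


def underloaded_servers(load_by_server, threshold=0.5):
    """Return servers whose load is below `threshold` times the median load.

    `threshold` is a hypothetical tuning parameter, not a value from the article.
    """
    typical = median(load_by_server.values())
    return sorted(
        server
        for server, load in load_by_server.items()
        if load < threshold * typical
    )


# Operation counts rounded from the figures quoted above (7k vs. almost 30k);
# the exact values are assumed.
operations = {"x.x.x.154": 30_000, "x.x.x.155": 7_000, "x.x.x.156": 30_000}

print(underloaded_servers(operations))
```

Comparing against the median rather than the mean keeps a single outlier from shifting the baseline it is measured against.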

The team charted the HTTP server errors and the load, counted as the number of transactions, for all three servers over time (see Figure 2) to get a better understanding of the situation.

Figure 2: Distribution of the number of server errors and transaction counts over time for all three servers; one server shows a lower load

The team's first observation, based on these reports, was that the x.x.x.155 server, with the lowest number of users, was most likely not connected to the load balancer. To determine the cause of the high response time on this server, the team analyzed two reports:

  • The response time for x.x.x.155, broken down into network, server and redirect times, indicated that almost all the time was spent on the server (see Figure 3).
  • A drill-down into the operations report, used to analyze the load on the server (see Figure 4), showed that one particular transaction took a long time to complete, resulting in low application performance and a poor user experience.

Figure 3: Response time breakdown for x.x.x.155: most of the time is spent on the server

Figure 4: Drill-down in the context of the x.x.x.155 server shows the main KPIs per transaction executed on this server; one transaction is affected by performance problems
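
The reasoning behind the response time breakdown can be illustrated with a few lines of code: split the total into its components and see which one dominates. This is a sketch only; the function name is hypothetical and the timings are invented for illustration, not taken from Figure 3.

```python
def dominant_component(breakdown):
    """Return (name, share) for the component with the largest share of total time."""
    total = sum(breakdown.values())
    if total <= 0:
        raise ValueError("breakdown must contain a positive total time")
    name = max(breakdown, key=breakdown.get)
    return name, breakdown[name] / total


# Hypothetical timings in seconds; the real values are not given in the article.
response_time = {"network": 0.2, "server": 5.4, "redirect": 0.1}

name, share = dominant_component(response_time)
print(f"{name} accounts for {share:.0%} of the response time")
```

When the server component dominates like this, the network can be ruled out and the investigation moves to the transactions executed on that server.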

Next, the team analyzed the 5xx errors produced by the x.x.x.156 server. They drilled down to a PurePath of one of the transactions reporting these errors and learned that the problem was caused by a malfunctioning database connection pool (see Figure 5).

Figure 5: Drill-down through PurePaths to the error details reveals that the 5xx errors are caused by the database connection pool usage

The Operations team was also curious how the 5xx errors produced by the x.x.x.156 server were affecting the actual user experience. The team wondered whether user operations were equally distributed between the two servers connected to the load balancer, or whether users who were unlucky enough to be served by the x.x.x.156 server were stuck on that server. This kind of question is hard to answer by looking at a single SharePoint server, so the Operations team used the APM tool to answer it.

Figure 6: Users remain on the server on which they started their session

The report in Figure 6 shows that users were usually served by the same application server. Therefore, those who started their session on the x.x.x.156 server remained there, resulting in a consistently poor experience due to the bad performance of that server.
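
The stickiness question can also be answered from any log that records which server handled each user request. The sketch below assumes such a log is available as (user, server) pairs; the function name and the sample data are hypothetical and do not come from the report in Figure 6.

```python
from collections import defaultdict


def sticky_user_share(requests):
    """Return the fraction of users whose requests all hit a single server."""
    servers_by_user = defaultdict(set)
    for user, server in requests:
        servers_by_user[user].add(server)
    if not servers_by_user:
        return 0.0
    sticky = sum(1 for servers in servers_by_user.values() if len(servers) == 1)
    return sticky / len(servers_by_user)


# Hypothetical request log as (user, server) pairs.
request_log = [
    ("alice", "x.x.x.154"), ("alice", "x.x.x.154"),
    ("bob", "x.x.x.156"), ("bob", "x.x.x.156"),
    ("carol", "x.x.x.154"), ("carol", "x.x.x.156"),
]

print(f"{sticky_user_share(request_log):.0%} of users stayed on one server")
```

A share close to 100% means that users who land on a failing server stay there for the whole session.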

Conclusion
Modern application performance management is not only about making sure that the application and database servers are operating without problems. We also need to set up the load balancer correctly and monitor the network infrastructure for potential problems that affect overall application performance.

Using Compuware dynaTrace Data Center Real User Monitoring (DCRUM), the Operations team at Rendoosia Inc. could get, in just a few clicks, from the alert about HTTP Server (500) errors, through a holistic overview of application server KPIs, to the root cause of the problem.

Based on the unequal load among the three application servers, visible in the request breakdown in Figure 1 and the number of transactions in Figure 2, the team quickly determined that the x.x.x.155 server was not properly connected to the load balancer. Additional analysis showed that this server was also affected by the low performance of one of its operations.

This story shows that even when only one server is experiencing performance problems, here in the form of many HTTP Server errors, the load balancer will not offload that server, because it is not aware of those errors. That is why Operations teams need to constantly monitor for such outliers in application performance, with properly set up alerts, even in load-balanced setups.
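
An alert of the kind described above can be as simple as a per-server error-rate threshold. The following sketch is an illustration under stated assumptions, not the alerting logic of the APM tool: the function name, the 5% threshold, and the request and error counts are all hypothetical.

```python
def servers_over_error_rate(stats, max_error_rate=0.05):
    """Return {server: error_rate} for servers whose 5xx rate exceeds the limit.

    `stats` maps each server to a (total_requests, http_5xx_errors) pair.
    """
    flagged = {}
    for server, (requests, errors) in stats.items():
        if requests == 0:
            continue  # no traffic, so no meaningful error rate
        rate = errors / requests
        if rate > max_error_rate:
            flagged[server] = rate
    return flagged


# Hypothetical counters; the article does not give exact error counts.
server_stats = {
    "x.x.x.154": (30_000, 120),
    "x.x.x.155": (7_000, 35),
    "x.x.x.156": (30_000, 4_500),
}

for server, rate in servers_over_error_rate(server_stats).items():
    print(f"ALERT: {server} has a 5xx error rate of {rate:.1%}")
```

Evaluating the rate per server, rather than across the whole pool, keeps a single failing server from being averaged out by its healthy peers.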

More Stories By Sebastian Kruk

Sebastian Kruk is a Technical Product Strategist, Center of Excellence, at Compuware APM Business Unit.
