By Michael Kopp | February 9, 2013 12:00 PM EST
The other day I was looking at a web application that used MongoDB as its central database. We were analyzing the application for potential performance problems, and within five minutes I detected what I must consider a MongoDB anti-pattern, one that had a 40% impact on response time. The funny thing: it was a Java best practice that triggered it.
Analyzing the Application
The first thing I always do is look at the topology of an application to get a feel for it.

Overall Transaction Flow of the Application
As we see it's a modestly complex web application and it's using MongoDB as its datastore. Overall MongoDB contributes about 7% to the response time of the application. I noticed that about half of all transactions are actually calling MongoDB so I took a closer look.

Flow of Transactions that access MongoDB, showing 10% response time contribution of MongoDB
Those transactions that actually do call MongoDB spend about 10% of their response time in that popular document database. As a next step I wanted to know what was being executed against MongoDB.

Overview of all MongoDB commands. This shows that the JourneyCollection find and getCount contribute the most to response time
One immediately notices the first two lines, which contribute much more to the response time per transaction than all the others. What was interesting was that the getCount on the JourneyCollection had the highest contribution time, but the developer responsible was not aware that he was even using it anywhere.
Things get interesting - the mysterious getCount call
Taking things one level deeper, we looked at all transactions that were executing the ominous getCount on the JourneyCollection.

Transactions that call JourneyCollection.getCount spend nearly half their time in MongoDB
What jumps out is that those particular transactions do indeed spend over 40% of their time in MongoDB, so there was big potential for improvement here. Another click and we looked at all MongoDB calls that were executed within the context of the same transaction as the getCount call we found so mysterious.

All MongoDB Statements that run within the same transaction context as the JourneyCollection.getCount
What struck us as interesting was that the number of executions per transaction of the find and getCount on the JourneyCollection seemed closely connected. At this point we decided to look at the transactions themselves - we needed to understand why that particular MongoDB call was executed.

Single Transactions that execute the ominous getCount call
It's immediately clear that several different transaction types are executing that particular getCount. What that meant for us is that the problem was likely in the core framework of that particular application rather than being specific to any one user action. Here is the interesting snippet:

The Transaction Trace shows where the getCount is executed exactly
We see that the WebService findJourneys spends all its time in the two MongoDB calls. The first is the actual find call to the JourneyCollection. The MongoDB client is good at lazy loading, so the find does not actually do much yet. It only calls the server once we access the result set. We can see the round trip to MongoDB visualized in the call node at the end.
We also see the offending getCount. It is executed by a method called size, which turns out to be the com.mongodb.DBCursor.size method. This was news to our developer. Looking at several other transactions we found that this was a common pattern: every time we searched for something in the JourneyCollection, the getCount would be executed by com.mongodb.DBCursor.size. This always happens before we actually send the find command to the server (which happens in the call method). So we used CompuwareAPM DTM's (a.k.a. dynaTrace) developer integration and took a look at the offending code. Here is what we found:
// Projection: include only the journey ID field, exclude the top-level ID
BasicDBObject fields = new BasicDBObject();
fields.put(journeyStr + "." + MongoConstants.ID, 1);
fields.put(MongoConstants.ID, 0);
Collection<DBObject> locations = find(patternQuery, fields);
// Pre-size the result list to the number of matches
ArrayList<String> results = new ArrayList<String>(locations.size());
for (DBObject dbObject : locations) {
    String loc = dbObject.getString(journeyStr);
    results.add(loc);
}
return results;
The code looks harmless enough; we execute a find, create an array for the result and fill it. The offender is the locations.size() call. MongoDB's DBCursor is similar to the ResultSet in JDBC: it does not return the whole data set at once, but only a subset. As a consequence it doesn't really know how many elements the find will end up with. The only way for MongoDB to determine the final size seems to be to execute a getCount with the same criteria as the original find. In our case that additional, unnecessary round trip made up 40% of the web service's response time!
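The effect can be illustrated without a database. The following is only a sketch under stated assumptions: FakeCursor is a hypothetical stand-in that counts simulated server calls, not the real com.mongodb.DBCursor, and it ignores details such as batched fetching. It shows that pre-sizing from the cursor costs one extra round trip compared with simply iterating.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class CursorSizeDemo {
    // The pattern from the article: ask the cursor for its size first.
    static List<String> presized(FakeCursor cursor) {
        List<String> results = new ArrayList<String>(cursor.size());
        for (String s : cursor) {
            results.add(s);
        }
        return results;
    }

    // Alternative: skip size() and let the list grow while iterating.
    static List<String> growing(FakeCursor cursor) {
        List<String> results = new ArrayList<String>();
        for (String s : cursor) {
            results.add(s);
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("Vienna", "Linz", "Graz");
        FakeCursor a = new FakeCursor(data);
        FakeCursor b = new FakeCursor(data);
        presized(a);
        growing(b);
        System.out.println("pre-sized: " + a.roundTrips + " round trips");
        System.out.println("growing:   " + b.roundTrips + " round trips");
    }
}

// Hypothetical lazy cursor (an assumption for illustration only).
class FakeCursor implements Iterable<String> {
    private final List<String> serverData;
    int roundTrips = 0; // number of simulated calls to the server

    FakeCursor(List<String> serverData) {
        this.serverData = serverData;
    }

    // The cursor has not loaded anything yet, so answering size()
    // is modelled as a separate count query against the server.
    int size() {
        roundTrips++;
        return serverData.size();
    }

    // The find itself is only sent once iteration starts.
    @Override
    public Iterator<String> iterator() {
        roundTrips++;
        return serverData.iterator();
    }
}
```

Both methods return the same list; only the number of simulated server calls differs.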
An Anti-Pattern triggered by a Best Practice
So it turns out that calling size on the DBCursor must be considered an anti-pattern! The really funny thing is that the developer thought he was writing performant code. He was following the best practice of pre-sizing arrays, which avoids unnecessary re-sizing. In this particular case, however, that minor theoretical performance improvement led to a 40% performance degradation!
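To get a feel for how small the saving from pre-sizing is, here is a rough sketch. It assumes a start capacity of 10 and growth by 50% per step, which is modelled on OpenJDK's ArrayList; other implementations may differ, and the class and method names are hypothetical. It counts how many times the backing array would be reallocated while appending n elements to a list that was not pre-sized.

```java
class GrowthCost {
    // Counts reallocations while appending n elements, assuming a start
    // capacity of 10 and 50% growth per step (an assumption modelled on
    // OpenJDK's ArrayList, not a guarantee of the Java specification).
    static int reallocations(int n) {
        int capacity = 10;
        int count = 0;
        while (capacity < n) {
            capacity += capacity >> 1; // grow by half the current capacity
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000, 100000}) {
            System.out.println(n + " elements -> " + reallocations(n) + " reallocations");
        }
    }
}
```

Under these assumptions the number of reallocations grows only logarithmically with the list size, and each one is an in-memory array copy. That is a small price compared with an extra network round trip to the database.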
Conclusion
The takeaway here is not that MongoDB is bad or doesn't perform. In fact the customer is rather happy with it. But mistakes happen and, as with other database applications, we need visibility into a running application to see how much the database contributes to the overall response time. We also need that visibility to understand which statements are called where and why.
In addition, this nicely demonstrates why premature micro-optimization without leveraging an APM solution in production will not lead to better performance. In some cases - like this one - it can actually lead to worse performance.
Published February 9, 2013
Copyright © 2013 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Michael Kopp
Michael Kopp has over 12 years of experience as an architect and developer in the Enterprise Java space. Before coming to CompuwareAPM dynaTrace he was the Chief Architect at GoldenSource, a major player in the EDM space. In 2009 he joined dynaTrace as a technology strategist in the center of excellence. He specializes in application performance management in large-scale production environments, with a special focus on virtualized and cloud environments. His current focus is how to effectively leverage BigData solutions and how these technologies impact and change the application landscape.