Welcome!

@BigDataExpo Authors: Elizabeth White, Talend Inc., Liz McMillan, Ferhat Hatay, Anders Påls

Related Topics: @CloudExpo, Java IoT, Linux Containers, Containers Expo Blog, Cloud Security, SDN Journal

@CloudExpo: Article

Why Intelligent VM Routing Is Critical to Your Private Cloud’s Success

Hosting decisions are far too important to be left to simplistic, best-efforts approaches

Virtualized and private cloud infrastructures are all about sharing resources - compute, storage and network. Optimizing these environments comes down to the ability to properly balance capacity supply and application demand. In practical terms, this means allocating the right amount of resources and putting workloads in the right places. These decisions are critical to ensuring performance, compliance and cost control.

Yet most organizations are using antiquated methods such as home-grown spreadsheets and best guesses to determine which infrastructure to host workloads on and how much capacity to allocate. Not only do these approaches hinder operational agility, but as hosting decisions become more and more complex, they are downright dangerous. The typical strategy employed to stave off risk is to over-provision infrastructure, and the thinking behind this is that having an excess of capacity on hand will ensure that enough resource is available to avoid any performance problems. This is not only expensive, but it actually doesn't prevent key operational issues and many of the performance and compliance issues that are caused by incorrectly combining workloads.

In essence, this management challenge is the same one faced by hotel operators. Hoteliers need to constantly align guest demands with hotel resources and amenities. A hotel could not operate without a reservation system to manage resource availability and match that with guest needs, and yet this is exactly how companies manage their virtual and internal cloud environments. Imagine if a hotel didn't have the operational control provided by their reservation system, and was constantly forced to build more rooms than necessary in order to meet "potential" guest demands, rather than basing their decision on an actual profile of historical and predicted demand. Or if they put clients in rooms without enough beds or required amenities. This should start sounding familiar to anyone who has managed a production virtual environment.

Hotels have had the luxury of a long history to refine their operations, and by using reservations systems to properly place guests and manage current and future bookings, they have gained a complete picture of available resources at any point in time. In doing so, they have optimized their ability to plan for and leverage available capacity, achieving the right balance between supply and demand.

Why Workload Routing and Reservations are Important
By applying the same principles used to manage a hotel's available capacity to their own operations, IT organizations can significantly reduce risk and cost while ensuring service levels in virtual and cloud infrastructures. There are five reasons why the process of workload routing and capacity reservation must become a core, automated component of IT planning and management:

1. Complexity of the Hosting Decision
Hosting decisions are all about optimally aligning supply with demand. However, this is very complex in modern infrastructures, where capabilities can vary widely, and the requirements of the workloads may have a significant impact on what can go where. To make the optimal decision, there are three important questions that must be asked:

  • Do the infrastructure capabilities satisfy the workload requirements? This is commonly referred to as "fit for purpose," and is required to determine whether the hosting environment is suitable for the kind of workload being hosted. This question has not always been top of mind in the past, as the typical process to deploy new applications has been to procure new infrastructure with very detailed specifications. But the increasing use of shared environments is changing this, and understanding the specifications of the currently running hosting environments is critical. Unfortunately, early virtual environments tended to be one-size-fits-all, and early internal clouds tended to focus on dev/test workloads, so fit for purpose decisions rarely extended beyond ensuring the environment has the right CPU architecture.
  • Will the workloads fit? While the fit for purpose analysis is concerned with whether a target environment has the right kind of capacity, this aspect of making hosting decisions is concerned with whether there is sufficient free capacity to host the workloads. This is a more traditional capacity problem, but with a twist, as virtual and cloud environments are by nature shared environments, and the capacity equation is multi-dimensional. Resources such as CPU, memory, disk, I/O, network I/O, storage capacity, etc., must be considered, as well as looking at the levels and patterns of activity to ensure that the new workloads are "dovetailing" with the existing ones. Furthermore, any analysis of capacity must also ensure that the workload will fit at the point in time it will be deployed and it must continue to fit beyond that time.
  • What is the relative cost? While fit and suitability are critical to where to host a workload, in a tiebreaker the main issue becomes relative cost. While many organizations are still not sophisticated enough to have an accurate chargeback model in place, a more precise way to determine cost may be to consider the relative cost of hosting a workload as a function of policy and placement.

2. Capacity Supply and Application Demand are Dynamic
Nothing stands still in virtualized IT environments, and any decisions must be made in the context of ever-changing technologies, hardware specs, service catalogs, application requirements and workloads. This is becoming even more prevalent in the age of the software-defined data center.

Because of this, capacity must be viewed as a pipeline, with inbound demands, inbound supply side capacity, outbound demands and decommissioned capacity all being part of the natural flow of activity. Handling this flow is a key to achieving agility, which is a goal in the current breed of virtual and cloud hosting infrastructure. The ability to efficiently react to changing needs is critical, and the lack of agility in legacy environments is really a reflection of the fact that previous approaches did not operate as a pipeline. If it currently takes two to three months to get capacity, then it is a clear indication that there is no pipeline in place.

3. Meeting Your Customers Expectations
Application owners today have expectations that capacity will be available when required, so it's necessary for IT to have a way to hold capacity for planned workload placements to be available on the date of deployment (like advance booking a hotel room).

Sometimes the concept of a capacity reservation is equated with the draw-down on a pool of resources or a quota that has been assigned to a consumer or internal group. This is dangerous, as it simply ensures that a specific amount of resources will not be exceeded, and does not guarantee that actual resources will be available. This is analogous to getting a coupon from a store that says "limit 10 per customer" - it in no way guarantees that there will be any product left on the shelf. Organizations should beware of these types of reservations, as they can give a false sense of security.

Capacity reservations are extremely useful to those managing the infrastructure capacity. They provide an accurate model of the pipeline of demand, which allows for much more efficient, accurate and timely purchasing decisions. Simply put, less idle capacity needs to be left on the floor. It also allows infrastructure to be managed as a portfolio, and if a certain mix of resources is needed to satisfy the overall supply and demand balance (such as buying servers with more memory), then procurement can factor this in.

4. Even Self-Service Needs Reservations
Self-service can create a highly volatile demand pipeline. But a bigger issue with self-service models is the way organizations perceive them. Many early cloud implementations focus on dev/test users or more grid-type workloads, and the entire approach to delivering capacity takes on a last-minute, unplanned flavor. But these are not the only kinds of workloads - or even the most common - and for a cloud to become a true "next-generation" hosting platform it must also support enterprise applications and proper release planning processes.

The heart of the issue is a tendency for organizations to equate self-service with instant provisioning. Although instant provisioning is useful for dev/test, grid and other horizontal scaling scenarios, it is not the only approach. For example, an online hotel reservation site provides self-service access to hotel rooms, but these rooms are not often being booked for that night. For business trips, conferences and even vacations, you book ahead. The same process must be put into place for hosting workloads.

Rather than narrowly defining self-service as the immediate provisioning of capacity, it is better to focus on the intelligent provisioning of capacity, which may or may not be immediate. For enterprise workloads with proper planning cycles and typical lead times, reservations are far more important than instant provisioning. And deciding where the application should be hosted in the first place is a solution critical decision that is often overlooked. Unless an organization has only one hosting environment, the importance (and difficulty) of this should not be underestimated.

5. Demand Is Global
There is a huge benefit to thinking big when it comes to making hosting decisions. The long-term trend will undoubtedly be to start thinking beyond the four walls of an organization and make broader hosting decisions that include external cloud providers, outsourcing models and other potential avenues of efficiency. But the use of external capacity is still a distant roadmap item in many IT organizations, and the current focus tends to be on making the best use of existing capacity and purchasing dollars.

Operating in scale also allows certain assumptions to be challenged, such as the requirement for an application to be hosted at a specific geographical location. Geographical constraints should be fully understood and properly identified, and not simply assumed based on past activity or server-hugging paranoia. Some workloads do have specific jurisdictional constraints, compliance requirements or latency sensitivities, but many have a significant amount of leeway in this regard, and to constrain them unnecessarily ties up expensive data center resources.

Unfortunately, the manual processes and spreadsheet-based approaches in use in many organizations are simply not capable of operating at the necessary scale, and cannot properly model the true requirements and constraints of a workload. This not only means that decisions are made in an overly narrow context, but that the decisions that are made are likely wrong.

Moving Past Your "Gut"
Hosting decisions are far too important to be left to simplistic, best-efforts approaches. Where a workload is placed and how resources are assigned to it is likely the most important factor in operational efficiency and safety, and is even more critical as organizations consider cloud hosting models. These decisions must be driven by the true requirements of the applications, the capabilities of the infrastructure, the policies in force and the pipeline of activity. They should be made in the context of the global picture, where all supply and demand can be considered and all hosting assumptions challenged. And they should be made in software, not brains, so they are repeatable, accurate and can drive automation.

More Stories By Andrew Hillier

Andrew Hillier is CTO and co-founder of CiRBA, Inc., a data center intelligence analytics software provider that determines optimal workload placements and resource allocations required to safely maximize the efficiency of Cloud, virtual and physical infrastructure. Reach Andrew at [email protected]

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@BigDataExpo Stories
Predictive analytics tools monitor, report, and troubleshoot in order to make proactive decisions about the health, performance, and utilization of storage. Most enterprises combine cloud and on-premise storage, resulting in blended environments of physical, virtual, cloud, and other platforms, which justifies more sophisticated storage analytics. In his session at 18th Cloud Expo, Peter McCallum, Vice President of Datacenter Solutions at FalconStor, will discuss using predictive analytics to ...
Let’s face it, embracing new storage technologies, capabilities and upgrading to new hardware often adds complexity and increases costs. In his session at 18th Cloud Expo, Seth Oxenhorn, Vice President of Business Development & Alliances at FalconStor, will discuss how a truly heterogeneous software-defined storage approach can add value to legacy platforms and heterogeneous environments. The result reduces complexity, significantly lowers cost, and provides IT organizations with improved effi...
The cloud promises new levels of agility and cost-savings for Big Data, data warehousing and analytics. But it’s challenging to understand all the options – from IaaS and PaaS to newer services like HaaS (Hadoop as a Service) and BDaaS (Big Data as a Service). In her session at @BigDataExpo at @ThingsExpo, Hannah Smalltree, a director at Cazena, will provide an educational overview of emerging “as-a-service” options for Big Data in the cloud. This is critical background for IT and data profes...
SYS-CON Events announced today that Men & Mice, the leading global provider of DNS, DHCP and IP address management overlay solutions, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. The Men & Mice Suite overlay solution is already known for its powerful application in heterogeneous operating environments, enabling enterprises to scale without fuss. Building on a solid range of diverse platform support,...
Data-as-a-Service is the complete package for the transformation of raw data into meaningful data assets and the delivery of those data assets. In her session at 18th Cloud Expo, Lakshmi Randall, an industry expert, analyst and strategist, will address: What is DaaS (Data-as-a-Service)? Challenges addressed by DaaS Vendors that are enabling DaaS Architecture options for DaaS
SYS-CON Events announced today that Avere Systems, a leading provider of enterprise storage for the hybrid cloud, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Avere delivers a more modern architectural approach to storage that doesn’t require the overprovisioning of storage capacity to achieve performance, overspending on expensive storage media for inactive data or the overbuilding of data centers ...
With the proliferation of both SQL and NoSQL databases, organizations can now target specific fit-for-purpose database tools for their different application needs regarding scalability, ease of use, ACID support, etc. Platform as a Service offerings make this even easier now, enabling developers to roll out their own database infrastructure in minutes with minimal management overhead. However, this same amount of flexibility also comes with the challenges of picking the right tool, on the right ...
SYS-CON Events announced today that Catchpoint Systems, Inc., a provider of innovative web and infrastructure monitoring solutions, has been named “Silver Sponsor” of SYS-CON's DevOps Summit at 18th Cloud Expo New York, which will take place June 7-9, 2016, at the Javits Center in New York City, NY. Catchpoint is a leading Digital Performance Analytics company that provides unparalleled insight into customer-critical services to help consistently deliver an amazing customer experience. Designed...
SYS-CON Events announced today that Alert Logic, Inc., the leading provider of Security-as-a-Service solutions for the cloud, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Alert Logic, Inc., provides Security-as-a-Service for on-premises, cloud, and hybrid infrastructures, delivering deep security insight and continuous protection for customers at a lower cost than traditional security solutions. Ful...
SYS-CON Events announced today that Interoute, owner-operator of one of Europe's largest networks and a global cloud services platform, has been named “Bronze Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2015 at the Javits Center in New York, New York. Interoute is the owner-operator of one of Europe's largest networks and a global cloud services platform which encompasses 12 data centers, 14 virtual data centers and 31 colocation centers, with connections to 195 ad...
Recognizing the need to identify and validate information security professionals’ competency in securing cloud services, the two leading membership organizations focused on cloud and information security, the Cloud Security Alliance (CSA) and (ISC)^2, joined together to develop an international cloud security credential that reflects the most current and comprehensive best practices for securing and optimizing cloud computing environments.
Companies can harness IoT and predictive analytics to sustain business continuity; predict and manage site performance during emergencies; minimize expensive reactive maintenance; and forecast equipment and maintenance budgets and expenditures. Providing cost-effective, uninterrupted service is challenging, particularly for organizations with geographically dispersed operations.
SYS-CON Events announced today that Commvault, a global leader in enterprise data protection and information management, has been named “Bronze Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Commvault is a leading provider of data protection and information management...
SYS-CON Events announced today that VAI, a leading ERP software provider, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. VAI (Vormittag Associates, Inc.) is a leading independent mid-market ERP software developer renowned for its flexible solutions and ability to automate critical business functions for the distribution, manufacturing, specialty retail and service sectors. An IBM Premier Business Part...
In most cases, it is convenient to have some human interaction with a web (micro-)service, no matter how small it is. A traditional approach would be to create an HTTP interface, where user requests will be dispatched and HTML/CSS pages must be served. This approach is indeed very traditional for a web site, but not really convenient for a web service, which is not intended to be good looking, 24x7 up and running and UX-optimized. Instead, talking to a web service in a chat-bot mode would be muc...
It's easy to assume that your app will run on a fast and reliable network. The reality for your app's users, though, is often a slow, unreliable network with spotty coverage. What happens when the network doesn't work, or when the device is in airplane mode? You get unhappy, frustrated users. An offline-first app is an app that works, without error, when there is no network connection.
With an estimated 50 billion devices connected to the Internet by 2020, several industries will begin to expand their capabilities for retaining end point data at the edge to better utilize the range of data types and sheer volume of M2M data generated by the Internet of Things. In his session at @ThingsExpo, Don DeLoach, CEO and President of Infobright, will discuss the infrastructures businesses will need to implement to handle this explosion of data by providing specific use cases for filte...
Cognitive Computing is becoming the foundation for a new generation of solutions that have the potential to transform business. Unlike traditional approaches to building solutions, a cognitive computing approach allows the data to help determine the way applications are designed. This contrasts with conventional software development that begins with defining logic based on the current way a business operates. In her session at 18th Cloud Expo, Judith S. Hurwitz, President and CEO of Hurwitz & ...
Fortunately, meaningful and tangible business cases for IoT are plentiful in a broad array of industries and vertical markets. These range from simple warranty cost reduction for capital intensive assets, to minimizing downtime for vital business tools, to creating feedback loops improving product design, to improving and enhancing enterprise customer experiences. All of these business cases, which will be briefly explored in this session, hinge on cost effectively extracting relevant data from ...
SYS-CON Events announced today that Pythian, a global IT services company specializing in helping companies adopt disruptive technologies to optimize revenue-generating systems, has been named “Bronze Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2015 at the Javits Center in New York, New York. Founded in 1997, Pythian is a global IT services company that helps companies compete by adopting disruptive technologies such as cloud, Big Data, advanced analytics, and DevO...