Disorder and inefficiency are the stepchildren of variability, as most IT professionals know. For decades, IT operations has faced problems with scaling applications under load and resiliency in the face of faults. Virtualization and cloud computing yielded two dimensions of variability: application elasticity and IT elasticity, the latter of which relates to the host resources for the app. With these interdependent things simultaneously fluctuating, is there any hope for IT capacity management to control costs and still ensure QoE?
Elastic application component deployments scale to meet demand or to increase resilience in the event of outages by spinning up multiple copies of a given component. This scalability in the app design, however, is pointless without sufficient IT resources available to support deployments. Almost a quarter of companies that deploy application components that scale under load fail to define limits for that scaling, and thus the resources needed, according to CIMI Corp. enterprise surveys.
Three steps define IT capacity management in the virtual age with a scalable app:
- Establish the range of scaling required.
- Estimate the effect on resources of scaling within the range.
- Ensure that the resources are available at an acceptable cost.
IT capacity planning and management graphs
To determine how much scalability you should build into an application, measure quality of experience (QoE) with realistic software load testing that tracks response time. Perform this exercise for every application you plan to scale, and when it's completed, plot the response times against deployed component instances to get a curve. At some point, the curve will plateau.
If enough instances of a given component are spun up to deal with the work, further copies won't improve response time, which causes a performance plateau. Another reason for a plateau is that some form of congestion has degraded QoE, and additional scaling cannot improve performance under these conditions. To determine which situation the app is in, measure the resource assignments. Look at hosting resource availability, as well as the capacity of connections to the resources, particularly to public cloud providers. Congestion in the network between hosting resources lowers QoE just as much as running out of hosting capacity does. Determine what level of resources would eliminate the congestion by examining the resource use metrics just before the plateau point. If resources are congested, the app requires new IT capacity decisions. But capacity isn't free.
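As a rough illustration of locating the plateau on the response-time curve, the sketch below walks through load-test results and reports the instance count after which the next scaling step improves response time by less than a chosen fraction. The function name `find_plateau`, the 5% threshold and the sample numbers are all assumptions made for this example, not values from any survey or tool.

```python
from typing import Optional, Sequence


def find_plateau(
    instances: Sequence[int],
    response_ms: Sequence[float],
    min_gain: float = 0.05,
) -> Optional[int]:
    """Return the instance count after which the next scaling step
    improves response time by less than `min_gain` (a fraction).

    Returns None if every step in the data still improves response
    time by at least `min_gain`, meaning the test never plateaued.
    """
    if len(instances) != len(response_ms):
        raise ValueError("instances and response_ms must have equal length")
    for i in range(len(instances) - 1):
        before, after = response_ms[i], response_ms[i + 1]
        # Relative improvement gained from the next scaling step.
        gain = (before - after) / before
        if gain < min_gain:
            return instances[i]
    return None


# Hypothetical load-test results: mean response time per instance count.
instances = [1, 2, 3, 4, 5, 6]
response_ms = [900.0, 520.0, 380.0, 310.0, 305.0, 304.0]

plateau_at = find_plateau(instances, response_ms)
print(f"plateau begins at {plateau_at} instances")
```

With this sample data, the step from four to five instances improves response time by under 2%, so the sketch reports four instances. It only finds where the curve flattens; it cannot say whether the cause is sufficient capacity or congestion, which still requires the resource metrics described above.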
When you've determined the application scaling limits, look at the cost-to-QoE relationship. QoE depends in part on user expectations, and generally speaking, the pace of productivity is related to application response time. A changing response time only matters if it actually affects productivity. As a starting point, use the curves on the graph from software load tests that show response time versus resources. Run tests at various points, starting with that plateau point and moving above and below it, and observe user productivity at these levels.
The goal is to establish a productivity range between the point at which users' work is impaired and the point where further improvements in response time don't garner higher work output. In nearly all cases, the goal of IT capacity management is to have enough resources supporting the app to sustain operation within about 70% of that productivity range: 20% below the impact point and 10% above the no-improvement point. Tally the resource costs to maintain operation within that range. Factor in any economies of scale or volume discounts that cloud providers offer. If resource costs are too high, adjust the application's scalability range, as the business case mandates. If you've set your scaling limits as described in the IT capacity management graphs, there is no QoE benefit to increasing scalability beyond the initial limit.
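One way to read the roughly 70% band is as margins measured in response time: trim 20% of the range from the impaired end and 10% from the no-improvement end. The sketch below works that arithmetic through under this interpretation, which is an assumption, as are the function name `operating_band` and the sample response times.

```python
from typing import Tuple


def operating_band(
    impact_rt: float,
    no_gain_rt: float,
    margin_from_impact: float = 0.20,
    margin_from_no_gain: float = 0.10,
) -> Tuple[float, float]:
    """Return (fastest, slowest) target response times in seconds.

    impact_rt:  response time at which users' work becomes impaired.
    no_gain_rt: response time below which output stops improving.
    Margins are fractions of the width of the productivity range.
    """
    if impact_rt <= no_gain_rt:
        raise ValueError("impact_rt must be slower than no_gain_rt")
    width = impact_rt - no_gain_rt
    fastest = no_gain_rt + margin_from_no_gain * width
    slowest = impact_rt - margin_from_impact * width
    return fastest, slowest


# Hypothetical observations: work is impaired at 4.0 s and stops
# improving below 1.0 s.
fastest, slowest = operating_band(impact_rt=4.0, no_gain_rt=1.0)
share = (slowest - fastest) / (4.0 - 1.0)
print(f"target response time: {fastest:.2f}s to {slowest:.2f}s "
      f"({share:.0%} of the productivity range)")
```

For these sample values, the target band runs from about 1.3 to 3.4 seconds, which is 70% of the 3-second productivity range. The resource cost to tally is then the cost of the instance counts that keep response time inside that band on the load-test curve.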
No application runs in a vacuum
Once you've completed the IT capacity planning on a per-application basis, size the total resource needs of the enterprise. Broaden your projections to account for the fact that many different applications run and compete for resources at a given time. There's another elastic element to IT capacity management: peak demand times. The total resource requirement for a group of applications isn't always the sum of per-application demands; differences in the way users access applications throughout the day change how resource requirements mesh over time.
To see how use varies over time, look at application-level activity logs to determine at what rates and at what times the applications run. To track use, either measure a live operation's resource use or test against data injections at scale, timed to simulate real operation. Plot this application data against the resource usage on a time plot. The resulting graph shows total IT capacity use across all applications over time. At this point, adjust the scaling limits again to ensure that consumption does not exceed the total budget for hosting. Shift the scaling ranges for the applications that won't show resource congestion under the new range. Be sure to include network connections among data centers and to the cloud in the assets under capacity management.
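The time plot described above can be sketched as a simple aggregation. The example below sums hypothetical hourly demand for three applications, compares the peak of the combined demand with the sum of the individual peaks, and flags the hours that exceed a hosting budget. The application names, demand figures and the budget expressed as an instance count are all illustrative assumptions.

```python
from typing import Dict, List

# Hypothetical demand, in instances, for six sample hours per application.
demand: Dict[str, List[int]] = {
    "order-entry": [2, 6, 8, 5, 3, 2],
    "reporting":   [1, 1, 2, 4, 7, 6],
    "batch-sync":  [5, 2, 1, 1, 2, 5],
}

# Assumed hosting budget, expressed as a maximum instance count.
BUDGET_INSTANCES = 12


def total_by_hour(per_app: Dict[str, List[int]]) -> List[int]:
    """Sum demand across all applications for each time slot."""
    series = list(per_app.values())
    if len({len(s) for s in series}) > 1:
        raise ValueError("all applications must cover the same time slots")
    return [sum(slot) for slot in zip(*series)]


totals = total_by_hour(demand)

# Sizing for every application's own peak at once overstates the need
# when the peaks fall at different times of day.
sum_of_peaks = sum(max(s) for s in demand.values())
peak_of_sum = max(totals)

# Time slots where combined demand exceeds the budget; these are the
# slots where scaling limits would need adjusting.
over_budget_hours = [h for h, t in enumerate(totals) if t > BUDGET_INSTANCES]

print("total per hour:", totals)
print("sum of peaks:", sum_of_peaks, "peak of sum:", peak_of_sum)
print("hours over budget:", over_budget_hours)
```

In this sample, sizing for every peak at once would call for 20 instances, while the combined demand never exceeds 13, and only the last hour goes over the assumed budget of 12. A real plot would also need a series for network connections, which this sketch leaves out.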
This graph also helps to plot the effects of total resource economies. If total IT capacity consumption, at the all-application level, hits a better cloud pricing tier that the hosting provider offers, you can increase the higher end of the acceptable scaling range on certain applications, such as those that were set closer to the congestion point.
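To illustrate how a pricing tier can change the decision, the sketch below assumes a volume-discount model in which all consumption is billed at the unit price of the tier reached. The tier thresholds, prices and function names are hypothetical and are not taken from any provider's price list; real pricing schemes differ and should be checked against the provider's terms.

```python
from typing import List, Tuple

# Hypothetical volume tiers: (minimum instance-hours, price per instance-hour).
# Assumption: once a tier is reached, its price applies to all usage.
TIERS: List[Tuple[int, float]] = [(0, 0.10), (1000, 0.08), (5000, 0.06)]


def unit_price(instance_hours: int) -> float:
    """Return the price per instance-hour for the tier reached."""
    if instance_hours < 0:
        raise ValueError("instance_hours must not be negative")
    price = TIERS[0][1]
    for minimum, tier_price in TIERS:
        if instance_hours >= minimum:
            price = tier_price
    return price


def total_cost(instance_hours: int) -> float:
    """Return the total cost with the reached tier's price on all usage."""
    return instance_hours * unit_price(instance_hours)


current = 950   # all-application consumption before any adjustment
raised = 1000   # consumption after raising some upper scaling limits

print(f"{current} instance-hours cost {total_cost(current):.2f}")
print(f"{raised} instance-hours cost {total_cost(raised):.2f}")
```

Under these assumed tiers, consuming 1,000 instance-hours costs less in total than consuming 950, so raising the upper scaling limit on applications set close to the congestion point adds capacity without adding cost.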
The individual graphs that relate application response time to resources and total resource consumption and cost over time are your primary tools for IT capacity management in an architecture with multiple forms of elasticity. The first graph set establishes a safe range for each application to run in, and the second shows the way those ranges combine to create resource demand and, therefore, anticipated cost. You can safely assign resource requirements, finalize the plan initially and then reuse the same techniques to keep capacity levels up to date.