It's a nice problem to have: An application or business service unexpectedly explodes in popularity to the point where it stresses, if not breaks, existing infrastructure.
Business applications are volatile, spiking at different times of the day, month or year, or in response to external events, such as product promotions or favorable media mentions. Operations teams use application load balancing and modern IT platforms to dial up and throttle down capacity in such conditions.
Few organizations have a breakout of Pokémon GO proportions, but any poorly managed resource utilization spike can turn users away as quickly as they arrived. The flip side is equally dangerous: overbuilt infrastructure leads to idle systems and wasted money, whether in dedicated servers or rented cloud instances.
The options for application load balancing differ depending on whether the application runs on dedicated servers or in the cloud; both cases are examined below.
Demand management for owned virtual infrastructure
Server and network resource exhaustion is one of the oldest connectivity problems, and the traditional solution still applies: Scale out resources and distribute the load. On physical servers, resource scaling occurs at the network layer, with servers grouped into clusters and each request sent to the best available node.
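One simple way to picture "distribute the load" is a least-connections policy, in which each new request goes to the server with the fewest requests in flight. The following is a minimal sketch under that assumption; the policy choice, the class name and the backend names (`web-1`, `web-2`, `web-3`) are illustrative and are not taken from any particular load balancer product.

```python
class LeastConnectionsBalancer:
    """Toy load distributor: route each request to the least-busy backend."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # Map of backend name -> number of requests currently in flight.
        self._active = {name: 0 for name in backends}

    def acquire(self):
        """Pick the backend with the fewest active requests and count one more."""
        # On ties, min() returns the first backend in insertion order.
        name = min(self._active, key=self._active.get)
        self._active[name] += 1
        return name

    def release(self, name):
        """Mark one request on the given backend as finished."""
        if self._active[name] > 0:
            self._active[name] -= 1

    def add_backend(self, name):
        """Scale out: a new backend starts with zero active requests."""
        self._active.setdefault(name, 0)


balancer = LeastConnectionsBalancer(["web-1", "web-2"])
first = balancer.acquire()    # goes to web-1
second = balancer.acquire()   # goes to web-2
balancer.add_backend("web-3")
third = balancer.acquire()    # the idle newcomer takes the next request
```

Adding a backend immediately draws new traffic toward it, which is the scale-out behavior the paragraph above describes.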
Sophisticated application delivery controllers (ADCs) route and offload traffic based on data higher up in the network stack, not just the network address. Hardware ADCs remain among the most common load-balancing methods because of their convenience and advanced acceleration features, such as Secure Sockets Layer offload, TCP multiplexing and global distribution.
Hardware appliances work equally well to manage loads across both physical and virtual servers, but VMs in enterprise data centers enable the actual ADC to run as a virtual appliance on an application cluster. Whether a hardware or software load balancer is a better option depends on the capacity, features and support that the enterprise needs.
VMs and cloud instances have displaced fixed-purpose physical servers, yielding the flexibility to rapidly adjust capacity in response to demand and to uniformly spread users across the available resource pool. ADCs solve the load-distribution problem, but they don't affect the predicament of capacity vs. demand: IT operations teams must regulate the number of VMs and the database capacity for volatile loads.
This application load-balancing challenge is the domain of a VM management system, such as VMware vSphere and Microsoft Windows Server with Hyper-V, or a third-party cloud management platform, like IBM Cloud Orchestrator or RightScale Cloud Management.
VMware arguably pioneered high-availability features with vMotion Live Migration technology, which provides the foundation for nonintrusive VM addition and deletion, and Distributed Resource Scheduler (DRS), which automatically allocates workloads across a pool of resources in a vSphere cluster. DRS adjusts resource pools by demand and redirects workloads to VMs as demand changes.
Although DRS is primarily designed to optimize workload placement with fine-tuned algorithms, a sufficiently large VM cluster often leaves it enough spare capacity to rebalance workloads during moderate usage spikes. For general-purpose resource scaling and VM problem repair, VMware vRealize Orchestrator enables a more fully automated private cloud.
Microsoft offers similar features for Windows shops. Windows Server 2016 includes several technologies to improve scalability and uptime: rolling upgrades across a cluster; the ability to add and remove disk, memory and network resources without shutting down a server; and integrated VM load balancing. The Windows Server Performance and Resource Optimization feature monitors resource use in a VM cluster and works with System Center to automatically improve utilization.
Traditional enterprise apps that experience predictable seasonal or weekly variances are best managed with this combination of VM clusters, management automation software and ADCs.
VMware and Windows Server manage VMs running full OSes with static workloads, rather than dynamic cloud stacks that change in a matter of seconds. This approach suits modest application load-balancing needs, such as temporarily adding a few web front ends while the data warehouse batch application is idle. Unfortunately, it won't handle exponential growth or flash-mob usage spikes. Cloud services are better suited to those tasks.
Scale and load balancing for cloud apps
Cloud services also integrate load balancers and automatic scaling, which allow for rapid, seamless changes -- up or down -- to the resource pool. For example, one Google Cloud Platform policy uses a target utilization level as the trigger to scale an instance group. The AWS and Azure auto-scaling features work similarly. Load-balancing and automatic-scaling features typically support public-facing servers, such as web front ends, but work equally well for other application tiers, including mobile back ends and business logic servers.
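A target-utilization trigger can be illustrated with a little arithmetic: if the group is running hotter than the target, grow it in proportion; if cooler, shrink it. The sketch below is a simplified model of that idea, not any provider's actual algorithm; the function name, the percentage inputs and the default `min_instances` and `max_instances` bounds are assumptions made for illustration.

```python
import math


def desired_instances(current_instances, observed_pct, target_pct,
                      min_instances=1, max_instances=10):
    """Return the instance count that would bring utilization back to target.

    observed_pct and target_pct are average utilization percentages (0-100).
    This is a proportional rule of thumb, not a real provider's policy.
    """
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if not 0 < target_pct <= 100:
        raise ValueError("target_pct must be in the range (0, 100]")

    # Total work stays the same, so instances scale with observed / target.
    raw = current_instances * observed_pct / target_pct
    # Round up so the group never lands above the target utilization,
    # then clamp to the configured bounds.
    return max(min_instances, min(max_instances, math.ceil(raw)))


scale_out = desired_instances(4, observed_pct=90, target_pct=60)
scale_in = desired_instances(4, observed_pct=30, target_pct=60)
```

With four instances at 90% against a 60% target, the rule asks for six instances; at 30% it asks for two. Real services typically add cooldown periods and smoothing on top of a rule like this to avoid thrashing.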
CDN and other specialized services
Highly unpredictable consumer-facing applications are best deployed on the public cloud; a hybrid design in which on-premises servers supply baseline capacity and public cloud resources handle bursts is feasible, albeit harder to implement.
A content delivery network (CDN) is often ideal for public-facing applications with variable and globally distributed demand for rich media. A cloud service or a dedicated CDN provider will offload storage servers and improve client performance. Ops can significantly boost the performance of large transactional databases used by public-facing apps via read-only, in-memory caches from Redis or equivalent cloud services.
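The read-only cache idea can be sketched as follows: look in memory first, and go to the database only on a miss or after the entry expires. This is a minimal illustration in which a plain in-process dictionary stands in for Redis or a cloud caching service; the class name, the `load_from_db` callback and the 30-second default TTL are all hypothetical.

```python
import time


class ReadCache:
    """Toy read-only cache with expiry; a dict stands in for Redis here."""

    def __init__(self, load_from_db, ttl_seconds=30.0, clock=time.monotonic):
        self._load = load_from_db      # hypothetical function that queries the database
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit: the database is not touched
        # Cache miss or expired entry: read from the database and remember it.
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value


db_reads = []


def fake_db_lookup(key):
    """Stand-in for an expensive database query."""
    db_reads.append(key)
    return {"sku": key, "price": 42}


cache = ReadCache(fake_db_lookup)
cache.get("sku-1")   # miss: reads from the database
cache.get("sku-1")   # hit: served from memory
```

The second lookup never reaches the database, which is where the performance gain for read-heavy, public-facing apps comes from. The trade-off is that readers may see data up to one TTL old.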
Designing infrastructure for volatile application load balancing still isn't easy, but VMs, automation software, virtual appliances and cloud services make the task easier.
Longer term, refactoring volatile applications into event-driven designs, such as the reactor pattern, can improve performance and resource efficiency. Applications built to dynamically and asynchronously respond to events open up the possibility of relying on cloud functions -- serverless computing -- such as AWS Lambda or Google Cloud Functions, rather than application load balancing.
Serverless computing does not mean workloads do not run on servers. Instead, the connection between workload and hardware is so abstracted that the server drops out of consideration. The application consumes IT resources strictly as needed, usually on a public cloud provider's platform.
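For readers unfamiliar with the reactor pattern mentioned above, the core idea is a single loop that waits for events and dispatches each one to a registered handler, so the application does work only when something arrives. The sketch below uses Python's standard `selectors` module and a local socket pair to show the shape of such a loop; the `Reactor` class, the handler and the `order-created` message are illustrative assumptions, not a production design.

```python
import selectors
import socket


class Reactor:
    """Single-threaded event loop that dispatches ready sockets to handlers."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def register(self, sock, handler):
        """Call handler(sock) whenever sock has data to read."""
        sock.setblocking(False)
        self._selector.register(sock, selectors.EVENT_READ, data=handler)

    def unregister(self, sock):
        self._selector.unregister(sock)

    def run_once(self, timeout=1.0):
        """Wait for readable sockets and invoke each registered handler once."""
        handled = 0
        for key, _events in self._selector.select(timeout):
            key.data(key.fileobj)   # key.data holds the handler we registered
            handled += 1
        return handled

    def close(self):
        self._selector.close()


def demo():
    """Send one message through a socket pair and let the reactor handle it."""
    received = []
    reactor = Reactor()
    producer, consumer = socket.socketpair()

    def on_readable(sock):
        received.append(sock.recv(1024))

    reactor.register(consumer, on_readable)
    producer.sendall(b"order-created")
    reactor.run_once()

    reactor.unregister(consumer)
    reactor.close()
    producer.close()
    consumer.close()
    return received
```

Because handlers run only in response to events, an application structured this way maps naturally onto cloud functions, where the platform plays the role of the loop and invokes the handler on demand.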