Features and automation capabilities in modern cloud platforms make it easy to scale applications. In the cloud, teams can define simple rules that enable the application infrastructure to dynamically expand and contract based on common utilization metrics. Or, with the click of a button, they can manually scale up their compute resources.
Many organizations, however, run workloads in their own on-premises data centers and don't have access to the sophisticated controls that enable app scaling. On top of that, not every application is built in a way that's suitable for cloud scalability. So, which key considerations do IT teams need to remember when it comes to app scaling, both in the cloud and in on-premises data centers?
Manually scaling application infrastructure
In an earlier era, IT teams deployed static application architectures that were built with enough overhead to accommodate two to three years of growth. Sometimes the number of end users grew faster than expected, and IT would add more memory, storage or CPU resources to the existing servers. Other times, they'd need to add one or more servers to the application environment to meet the increase in demand.
These concepts still apply today, and virtualization technology has made it even easier to adapt to these types of changes. In an on-premises data center, administrators can typically add memory, storage or CPU resources to existing virtual machines with ease.
Running servers on public cloud platforms provides a similar option. For example, an Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instance uses a predefined VM size with a set number of virtual CPUs and a fixed amount of memory. If needed, an administrator can change that VM size, though the instance generally has to be stopped first. There's a wide array of VM sizes available, and scaling up to a size with more virtual CPUs and memory requires only a couple of clicks or a single API call from the command line. The same idea goes for storage in the cloud. Administrators can easily add capacity or increase performance for existing storage volumes.
The allure of autoscaling
The ability to automatically scale compute resources has been a powerful tool for many organizations using the public cloud. Though far from a typical user, Netflix provides a useful example. The video streaming giant famously uses AWS cloud services, an arrangement that spares it from running peak-level capacity around the clock. Why? Because autoscaling enables administrators to define rules so that compute resources scale dynamically. This is invaluable for Netflix since many more users access its content in the evening hours than in the daytime.
Autoscaling has changed the game because a business pays only for the compute resources it needs at a given moment. Once demand for an application drops, the additional resources can simply be terminated.
Autoscaling groups were popularized by AWS, but other platforms provide this concept as well. Microsoft Azure has a similar feature called Virtual Machine Scale Sets. On-premises private cloud platforms support this, too. OpenStack has support for autoscaling, and Microsoft's upcoming Azure Stack platform will support Virtual Machine Scale Sets.
In general, each platform enables administrators to define rules to automatically provision new servers. For example, say a web application has a minimum of two web servers sitting behind a load balancer. If CPU usage rises above 70% for five minutes, one or more web servers can be added to the environment to alleviate the pressure. When CPU activity drops back to, say, 30%, one or more of those extra servers terminates automatically.
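The rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not any platform's actual API: the names `ScalingPolicy` and `desired_capacity` are hypothetical, and the sketch assumes one CPU sample per minute, a maximum of 10 servers, and that the scale-in threshold is evaluated over the same five-minute window as the scale-out threshold.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy object; the defaults mirror the example in the text."""
    min_servers: int = 2          # never drop below two web servers
    max_servers: int = 10         # assumption: an upper bound to cap cost
    scale_out_cpu: float = 70.0   # add servers when CPU stays above this
    scale_in_cpu: float = 30.0    # remove servers when CPU stays at or below this
    window: int = 5               # assumption: five one-minute samples
    step: int = 1                 # servers to add or remove per decision


def desired_capacity(current: int,
                     cpu_samples: Sequence[float],
                     policy: ScalingPolicy = ScalingPolicy()) -> int:
    """Return the server count the group should move to."""
    recent = list(cpu_samples[-policy.window:])
    if len(recent) < policy.window:
        # Not enough data yet to judge a sustained trend.
        return current
    if all(sample > policy.scale_out_cpu for sample in recent):
        return min(current + policy.step, policy.max_servers)
    if all(sample <= policy.scale_in_cpu for sample in recent):
        return max(current - policy.step, policy.min_servers)
    return current
```

Requiring every sample in the window to cross the threshold keeps a brief spike from triggering a launch, and the gap between 70% and 30% helps prevent the group from repeatedly adding and removing the same server.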
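It's also possible to conduct app scaling on a schedule. If demand reliably peaks at the same time each day, as it does for a streaming service in the evening, servers can be added shortly before the peak and removed once it passes.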
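As a rough sketch of schedule-based scaling, the Python below looks up a target server count from a table of time windows. The schedule entries, the baseline of two servers and the function name `scheduled_capacity` are all assumptions made for illustration; real platforms expose this through their own scheduling features.

```python
from datetime import time

# Hypothetical schedule: (window start, window end, server count).
# The end time is exclusive.
SCHEDULE = [
    (time(18, 0), time(23, 0), 6),  # assumed evening peak
]

BASELINE_SERVERS = 2  # assumed capacity outside any scheduled window


def scheduled_capacity(now: time,
                       schedule=SCHEDULE,
                       baseline: int = BASELINE_SERVERS) -> int:
    """Return the server count that applies at the given time of day."""
    for start, end, count in schedule:
        if start <= now < end:
            return count
    return baseline
```

For example, `scheduled_capacity(time(19, 30))` falls inside the assumed evening window and returns 6, while `scheduled_capacity(time(9, 0))` returns the baseline of 2.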
Not all workloads are created equal
When evaluating an approach to application scaling, teams need to take a close look at the workload in question. For manual-scale scenarios, this isn't hard. Simply monitor your servers, then add resources or change the VM size to accommodate the additional load as needed.
Autoscaling is typically where people get confused.
Some applications are intended to have a long lifespan. This is common in business operations. Inflexible workloads with a long life expectancy aren't good candidates for automated app scaling. The idea with autoscaling is that servers can come and go as needed, so anything stored only on a server is lost when that server terminates. Stateless servers that persist their data outside of the autoscaling group will be easier to deal with.
The other important consideration for servers in an autoscaling group is launch time. If you want to conduct app scaling based on demand for a particular application, new servers need to come online quickly. If it takes 45 minutes to bootstrap a new server, you’ve defeated the purpose.