IT organizations have been told for years to treat servers like cattle, not pets. Industry leaders espouse this mindset: Automate everything you can; touch as little as possible if you want to scale up. But with advancements in cloud hosting, the server farm stocked with cattle is no longer ideal -- consider instead a fleet of varied-weight vehicles designed to dynamically handle workloads.
When admins treat servers as pets, they manually tweak individual servers rather than apply the same changes across an entire collection of instances. No one enjoys keeping a server as a pet, but it is an easy trap to fall into. It's far more efficient to maintain a herd of standardized, identical servers, which is the genesis of the cattle-not-pets mantra. Admins apply updates and patches via scripts rather than handiwork -- even if the patch is only immediately necessary on one server. To prevent issues ranging from configuration drift to security breaches, admins update all instances simultaneously and quarantine problem servers.
A prime reason to treat servers as cattle is to combat configuration drift, said Edward Haletky, CEO of AstroArch Consulting in Austin, Texas. But it's easy for admins to break up a perfectly uniform herd of cattle into a bunch of pet cows, even when everyone acts with the best intentions. The operations administrator's key concern should be to identify the root cause of a server problem and ensure it doesn't happen again, on any server. If each server is handled individually, it becomes impossible to know what caused a given security or performance issue. "You're trying to solve it for each individual machine, instead of your whole herd of cattle," he said.
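One way to picture the herd-level view Haletky describes is a drift check that compares every server against a single baseline instead of inspecting machines one at a time. The following is a minimal sketch, not a production tool: the setting names, version strings and server names are all hypothetical, and a real fleet would pull these values from an inventory or configuration management system.

```python
from typing import Dict, Optional, Tuple

# Hypothetical baseline: the settings every server in the herd should share.
BASELINE = {"kernel": "5.15", "openssl": "3.0.13", "ntp": "enabled"}

Diff = Dict[str, Tuple[Optional[str], Optional[str]]]


def find_drift(baseline: Dict[str, str], fleet: Dict[str, Dict[str, str]]) -> Dict[str, Diff]:
    """Return, per server, each setting as (expected, actual) where they differ."""
    drift: Dict[str, Diff] = {}
    for server, config in fleet.items():
        diffs: Diff = {
            key: (expected, config.get(key))
            for key, expected in baseline.items()
            if config.get(key) != expected
        }
        # Settings that exist on the server but not in the baseline are drift too.
        for key in config.keys() - baseline.keys():
            diffs[key] = (None, config[key])
        if diffs:
            drift[server] = diffs
    return drift


# Illustrative inventory; web-02 and web-03 have been turned into pet cows.
fleet = {
    "web-01": {"kernel": "5.15", "openssl": "3.0.13", "ntp": "enabled"},
    "web-02": {"kernel": "5.15", "openssl": "3.0.11", "ntp": "enabled"},
    "web-03": {"kernel": "5.15", "openssl": "3.0.13", "ntp": "enabled", "debug_shell": "on"},
}

report = find_drift(BASELINE, fleet)
for server, diffs in sorted(report.items()):
    print(server, diffs)
```

Because the check runs against the whole collection, a fix for one deviation can be rolled into the baseline and applied to every server, rather than solved machine by machine.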
Even in an organization that handles data center servers as cattle rather than pets, manipulating them through automated scripts from startup to spin-down, admins can still personalize instances too much. It's easy to tailor a group of servers dedicated to one application. Even if everything boils down to a single scripting process, the ability to specialize configurations for those servers effectively makes them pet cows.
Hand control to the cloud vendor
Cloud providers push IT admins away from this track.
"It's difficult for [admins] to get used to the idea that, if you're on a platform, it can spin up instances or shut them off very abruptly," said Ken Birman, computer science professor at Cornell University in New York.
An admin may be an expert in a particular type of Linux server, able to efficiently run a specific program. But once that application moves to the cloud, the incumbent perspective must change. Cloud providers strip out personalization options from cloud instances, so IT organizations must compensate for that restriction in their application code, Birman said. "But people who try to overcustomize are defeating the point."
A highly customized cloud instance probably isn't going to work anyway. IT professionals want to squeeze the most performance out of a certain amount of money budgeted in a world of escalating costs. But a minimally configured server is also minimally efficient, and the ability to customize a resource profile, kernel settings or network topology is practically nonexistent in the cloud. If it does work, admins should expect poor performance because they're working against the design of Amazon Web Services (AWS), Microsoft Azure or the other cloud hosting vendors. A modestly overprovisioned but standardized cloud instance will offer better performance and, likely, better savings when compared to the physical costs to run that server in-house.
Public cloud providers' acceptable use policies guarantee a certain level of service "but also require that you not try to game their system," Birman said. The provider generally understands how a given application will act and runs that application on a topology of nodes best suited to fit that profile. If the admin continues to handle these server instances as pets -- or even as prized cows, managed carefully -- it will cause problems.
Here, Birman suggested, the cattle-not-pets metaphor must be updated: not cattle, not pets, but a fleet.
Different-sized servers that host a variety of applications are akin to a fleet of vehicles in a delivery business. Uniformity is still king, but it's inefficient to automate the exact same server for a simple application as a complex one. So, admins likely will choose each application's underlying server images differently. Bigger and more complex applications require heavier-duty servers -- call these semitrailer trucks. Lightweight applications that scale up and down quickly can reside on much smaller servers -- call these delivery cars. Translated to AWS cloud options, these resemble the difference between Elastic Compute Cloud M4 and M3 instances. Administrators must be able to template semitrailer trucks as efficiently as delivery cars.
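The fleet idea can be sketched as a small set of named templates, where every server of a given class is stamped from the same definition. This is an illustrative sketch only: the profile names, vCPU and memory figures, image name and application names are hypothetical and do not correspond to any specific AWS instance type.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ServerTemplate:
    """One vehicle class in the fleet; frozen so instances cannot be hand-tuned."""
    name: str
    vcpus: int
    memory_gib: int
    base_image: str


# Hypothetical profiles: sizes differ, but the base image is shared.
TEMPLATES: Dict[str, ServerTemplate] = {
    "semitrailer": ServerTemplate("semitrailer", vcpus=16, memory_gib=64, base_image="base-2024.1"),
    "delivery_car": ServerTemplate("delivery_car", vcpus=2, memory_gib=4, base_image="base-2024.1"),
}


def provision(app: str, profile: str, count: int) -> List[dict]:
    """Describe `count` identical servers for `app`, all built from one template."""
    template = TEMPLATES[profile]
    return [
        {"host": f"{app}-{i:02d}", "app": app, "template": template}
        for i in range(1, count + 1)
    ]


# A heavy application gets trucks; a lightweight one gets cars.
fleet = provision("billing", "semitrailer", 2) + provision("thumbnailer", "delivery_car", 3)
for server in fleet:
    print(server["host"], server["template"].name, server["template"].vcpus)
```

The sizes vary across the fleet, but each class is still a herd: no server carries settings that its template does not.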
"You're really managing a large collection of machines that you purchased as [an elastic] fleet of computers," Birman said. "It's going to be varied in size automatically."
Enterprises that require this mix of deployment profiles for their applications in a fleet still rely on the benefits of standardization gleaned from the cattle-not-pets mentality.
Don't let cows leave the server farm
Despite the apparent ease of managing a herd or a fleet of servers, neither method is without maintenance. For example, IT operations admins must master the load balancer to manage workload across a dynamic fleet and also learn the back-end storage systems and distributed cache tables that store the application's data. Maintenance should be as hands-off as possible to prevent inconsistencies; changes go through an infrastructure-as-code repository to ensure accurate deployment. If you follow the rules, you get steady, predictable scalability, but you can't ignore the servers.
"No rancher sets and forgets; why should you as an IT person set and forget? There's no win there," Haletky said.
This is where log monitoring and alerting tools come into play. Ideally, policy enforcement and change management tools should prevent variations in server configurations. But if an aberration appears -- say, a cow with a bowtie -- tracking technology provides admins with the necessary information on who made what happen at what time and to which servers.
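As a rough illustration of the "who made what happen at what time and to which servers" question, the sketch below filters a change log for anything that did not come through the automation account. The event fields, the `deploy-bot` account name and the sample entries are assumptions made for the example; real monitoring and alerting tools expose their own schemas.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass
class ChangeEvent:
    when: datetime
    who: str
    server: str
    action: str


def changes_outside_pipeline(events: Iterable[ChangeEvent],
                             pipeline_user: str = "deploy-bot") -> List[ChangeEvent]:
    """Return changes not made by the automation account, oldest first."""
    manual = [event for event in events if event.who != pipeline_user]
    return sorted(manual, key=lambda event: event.when)


# Hypothetical log entries for illustration.
events = [
    ChangeEvent(datetime(2024, 1, 5, 9, 0), "deploy-bot", "web-01", "applied release 42"),
    ChangeEvent(datetime(2024, 1, 5, 14, 30), "alice", "web-02", "edited sshd_config by hand"),
    ChangeEvent(datetime(2024, 1, 5, 11, 15), "bob", "web-03", "installed bowtie package"),
]

suspects = changes_outside_pipeline(events)
for event in suspects:
    print(f"{event.when:%Y-%m-%d %H:%M} {event.who} {event.server}: {event.action}")
```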
With appropriate automation practices and protocols, admins can locate the source of a given issue, update the necessary code, edit or rewrite the deployment script and redeploy the hosted application. Thus, admins who run thousands of machines can solve server problems and protect others from that same deviation. For applications that aren't so easily redeployed, admins must update the install script and the update script separately -- this allows new servers to spin up to the current specification and existing servers to update to match.
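One way to keep an install script and an update script from diverging is to derive both from a single specification. The sketch below assumes a hypothetical spec of package versions and produces plain-text plans rather than running a real package manager; the package names and version numbers are illustrative.

```python
from typing import Dict, List

# Hypothetical current specification for the application's servers.
SPEC = {"nginx": "1.24.0", "app": "2.3.1", "openssl": "3.0.13"}


def install_plan(spec: Dict[str, str]) -> List[str]:
    """Steps for a brand-new server: install everything in the spec."""
    return [f"install {pkg}={ver}" for pkg, ver in sorted(spec.items())]


def update_plan(spec: Dict[str, str], installed: Dict[str, str]) -> List[str]:
    """Steps for an existing server: change only what differs from the spec."""
    steps = []
    for pkg, ver in sorted(spec.items()):
        if pkg not in installed:
            steps.append(f"install {pkg}={ver}")
        elif installed[pkg] != ver:
            steps.append(f"upgrade {pkg} {installed[pkg]} -> {ver}")
    # Anything on the server that the spec does not list is removed.
    for pkg in sorted(installed.keys() - spec.keys()):
        steps.append(f"remove {pkg}")
    return steps


new_server_steps = install_plan(SPEC)
existing_server_steps = update_plan(
    SPEC, {"nginx": "1.24.0", "app": "2.3.0", "openssl": "3.0.13", "telnet": "0.17"}
)
print(new_server_steps)
print(existing_server_steps)
```

Both paths end at the same specification, so a server that was just spun up and one that has run for months should be indistinguishable once their plans are applied.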
Script a DevOps meeting
DevOps is a two-way street: Operations professionals must understand their applications, and developers must understand how those applications deploy.
Both ops and dev teams should convene with the security, compliance and legal teams. Component A might be necessary in application Z for regulatory compliance, but don't assume developers know that. "There's a lot more to creating a well-scripted environment than developers saying, 'We're going to script the install of this application,'" Haletky said.
If a vet says a specific bran mash is vital to the health of your cattle, you're not going to feed them apples instead. If project leaders say the application needs items A, B and C, developers shouldn't instead put in D, E and F. And if legal says the application must be physically segregated from other apps, operations won't deploy it on multi-tenant public cloud, no matter how solid the security measures are. Everybody who has a vested interest in application Z should give input for its base image creation to ensure that the deployment meets all its needs and standards the first time.
Cooperation and communication keep servers working like cattle, not pets. Coupled with the right scripting and monitoring practices, servers can reach the status of a unified fleet.