The IT automation concept is simple to understand. The idea is to have a faucet for computing resources. If you need more resources today than yesterday, you simply turn the faucet on for a longer period. If a business can automate provisioning of and changes to its computing infrastructure, then those resources will always be allocated according to business needs. This concept can be applied to servers, virtual machines, software, network bandwidth, storage capacity, databases or any computing-related resource.
It has taken the computing industry a long time to implement server, software, storage and networking technologies that can be provisioned rapidly. As recently as ten years ago, it could take a company several weeks to deploy an additional server. IT administrators would wait for the purchasing department to buy the new server and software licenses, then wait for the equipment, CDs and installation manuals to be shipped, then manually install the software, manually match the hardware, software, networking and access rights configurations with the existing implementation, and physically deploy the server onto the network.
Today it is possible to fully provision an application server in under an hour. This is possible because of technology advances delivered by blade servers, virtual machines, internet-based purchasing, software downloads, and software configuration template libraries. These advances speed technology acquisition and automate the provisioning process. The mean-time-to-change (MTTC) shrinks dramatically. The computing infrastructure can become like the ocean, vast and constantly in motion.
What are the benefits of IT automation?
The benefits of "tap water computing" are extremely compelling for both business units and IT organizations. IT productivity improves in many ways. Mundane provisioning tasks can be completed rapidly, allowing business needs to directly drive infrastructure changes. For example, spikes in online demand can be rapidly met by increasing resources applied to the service. The business gains more agility as IT resources become more closely matched to business goals.
Similarly, technical expertise is not wasted on those mundane tasks. Operational costs decline as senior technical expertise is applied to more strategic activities than installing servers correctly. More time can be spent on real-time capacity, configuration and resource analysis, so IT organizations can proactively manage business resources. There will be fewer ad-hoc changes made in reaction to a service meltdown, because senior staff will have the time to analyze key performance indicators, identify the preconditions of a service bottleneck, and determine whether adding resources will help without impacting some other application.
Automated tasks are completed the same way every time; therefore, the risk in making a commonly repeated change is lower, which should in turn reduce overall downtime of business services. In addition, tasks completed by software can be clearly logged and tracked, simplifying auditing. This reduces the cost of continuous compliance with multiple industry and governmental regulations.
What are the pitfalls of IT automation?
Companies have been promised IT automation solutions and benefits before, yet IT organizations never seem to get the closed-loop automation that would realize the full benefits. There are several reasons for the lackluster performance of IT automation.
Dynamic infrastructure jeopardizes dynamic provisioning: This is one of the most interesting paradoxes of IT management. The easier it becomes to change the infrastructure, the harder it becomes to control changes to the infrastructure.
For example, when server deployments occur only once a month, an administrator can run a discovery scan in the morning to determine exactly which physical machines are running a particular version of Windows and schedule an overnight patch installation. However, when the server environment is virtualized, new virtual servers can be provisioned at will. The server list discovered during the morning scan may no longer be accurate that night. Without an accurate list, the patching automation is likely to make many failed attempts, and administrators will fail to realize the full benefits of an automated patching solution.
It is difficult to realize the full benefits of automation if IT organizations speed up the MTTC without also giving IT administrators a way to speed up their other administrative tasks.
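The staleness problem described above can be made concrete. In this minimal sketch (all host names and function names are hypothetical, not from any real discovery or patching product), a patch job re-validates the morning's discovery snapshot against the current inventory just before acting, instead of trusting the stale list:

```python
# Hypothetical sketch: re-validate a stale discovery snapshot before patching.

def plan_patch_targets(morning_scan, current_inventory):
    """Return hosts safe to patch now, plus hosts that vanished or appeared
    since the morning discovery scan."""
    morning = set(morning_scan)
    current = set(current_inventory)
    return {
        "patch_now": sorted(morning & current),      # still present: safe to patch
        "vanished": sorted(morning - current),       # moved or retired: skip, avoid failed attempts
        "new_unscanned": sorted(current - morning),  # provisioned since the scan: discover first
    }

# The morning scan found these servers...
morning_scan = ["web01", "web02", "db01"]
# ...but by the overnight window, web02 was retired and web03 was provisioned.
current_inventory = ["web01", "db01", "web03"]

plan = plan_patch_targets(morning_scan, current_inventory)
print(plan)
```

The point of the sketch is only that the validation step must run at execution time, not hours earlier, once the environment can change at will.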
Lack of coordination becomes a serious flaw: Another common pitfall is taking a piecemeal approach to automation, in which enterprises buy separate automation tools for every IT task. This approach is flawed because there are so many sources of change, and those changes often clash with each other.
For example, consider a service with a capacity-related bottleneck whose performance is suffering. Resolving this problem typically involves adding a software stack to a cluster. A problem resolution tracking solution or IT helpdesk solution is the source of this change request. If an installation script was created previously, the IT administrator must find it, apply it, and hope that it does not automatically deploy an outdated configuration to the cluster. Simultaneously, a different IT administrator is tasked with responding to another source of change, such as the compliance office requiring the immediate implementation of a software patch, and uses different automation tools.
Simply put, the potential for configuration conflicts is higher when there is no coordination between the different change tasks and the tools that automate those tasks. Piecemeal IT automation shortens the time between change request and implementation and makes each of the multiple sources of change more efficient. Because there are so many sources of change that can be implemented rapidly, it is not surprising that infrastructure configuration changes are the leading cause of business service availability and performance problems.
Scripting approach is too fragile: Another problem is that once you strip away the slick interfaces of many of these task-specific automation products, you are left with a scripting engine. In many cases this scripting engine must recreate the same workflow over and over because scripts are fragile. Many things can break a script: heterogeneous technology, upgrades, patches, specialized configurations, task sequence changes, location changes, or password changes. Building something from scratch every time is not very cost-effective.
Scripting is also a black box. There is minimal documentation. It cannot be easily visualized, cataloged or audited, which makes it difficult to reuse. For example, many manually-built automation scripts and many automation tools used by administrators assume that a device's physical address does not change. However, that is no longer the case. VMware's VMotion capabilities give virtual machines mobility. HP's Insight Dynamics VSE uses the concept of logical servers to make it easier to move software stacks between physical servers. This mobility will break many of IT's existing automation systems.
The scripting approach works best in homogeneous environments that do not change very much. There are many beautifully written scripts that are as fragile as ice sculptures in May, because location information, task sequencing information, configuration information and device-specific information are all mixed together, with no easy way to untangle them when something changes. No administrator has time to fully document a script to simplify that untangling. Therefore, you end up rewriting everything, and if the script has been reused in another script, you have to rework both of them.
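One way to reduce that fragility is to keep environment-specific details in data, separate from the task sequence, so a location, version or path change means editing a record rather than rewriting logic. A minimal sketch, with entirely hypothetical host names and directories:

```python
# Hypothetical sketch: environment-specific data kept apart from task logic,
# so a location or version change does not break the "script" itself.

ENVIRONMENTS = {
    "prod": {"host": "db-prod.example.com", "version": "11.2", "install_dir": "/opt/db"},
    "test": {"host": "db-test.example.com", "version": "11.4", "install_dir": "/srv/db"},
}

def deploy_steps(env_name):
    """Build the same task sequence for any environment; only the data differs."""
    env = ENVIRONMENTS[env_name]
    return [
        f"copy package to {env['host']}:{env['install_dir']}",
        f"install version {env['version']} on {env['host']}",
        f"verify service on {env['host']}",
    ]

for step in deploy_steps("prod"):
    print(step)
```

When the production database moves to a new host, only the `ENVIRONMENTS` record changes; every script that reuses `deploy_steps` keeps working.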
Using automation to devalue IT staff: Many IT administrators have approached automation with fear and distrust. The fear is that automation will replace their jobs. This fear is understandable, given enterprises' history of viewing IT only as a cost center and the fact that IT productivity improvement is a strongly stated benefit of every automation solution. Enterprises that do not back up their "business-IT alignment" strategies by shifting staff activities to proactive decision-making once automation is in place will find their automation efforts undermined at every turn.
The distrust is over whether the solution can accurately identify and handle the myriad false-positive situations that may occur while attempting to complete technology-specific tasks. This distrust is rooted in the ideas that closed-loop automation completely removes all human involvement in a process, and that once a process is automated it never changes. Since administrators must handle a changing environment with competing change requests, it is easy to see why they take a dim view of automation solutions that cannot demonstrate their adaptability or allow for human intervention.
Why will it be different this time?
Shifts are occurring within IT organizations and solution providers that increase the potential for IT automation that works.
There has been a distinct change in administrator attitude towards automation, from fear and distrust towards "I wish it would actually work." The infrastructure has become so large and complex that many IT administrators are looking to software to help them get through their daily to-do lists, because their enterprises are simply not hiring additional staff. For example, a database administrator must complete performance management tasks on a growing number of databases after spending time deploying new ones. The administrator's daily activity list is growing too long to complete without automation.
Secondly, IT management vendors have started to understand that "just script it" is not a workable approach to automation for dynamic infrastructure. Management vendors are spending more time developing technologies that give IT automation a different set of characteristics.
Robust automation is automation in which tasks can be cleanly separated from the details of specific technology instances. Administrators and tools should not have to create a new deployment script for every golden image, or every time the golden image changes with updates or patches. If administrators have put together a sequence of database provisioning tasks, it should not matter what type of database it is applied to, what the version is, what the patch level is, or where the databases are located. The automation solution should make it work.
It is not the ability to drag and drop task icons and connect them with workflow lines that makes a good IT automation solution. Instead, what makes a good IT automation solution is the knowledge behind that interface, maintained for each technology, version and patch level, that allows the automation to work well in today's environments.
Another idea is to replace scripting with building a catalog of flexible workflows and processes. To be flexible, automation should be easy to find, easy to understand, easy to see where it has been reused, easy to change, and any changes should not adversely affect other places that it has been reused.
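A cataloged-workflow approach could be sketched like this (all step and workflow names are hypothetical): steps are registered once, workflows reference steps by name, and a catalog query shows where each step is reused before anyone changes it:

```python
# Hypothetical sketch of a workflow catalog: reusable named steps, plus a
# reuse report so a change to one step can be assessed before it is made.

STEPS = {}       # step name -> callable
WORKFLOWS = {}   # workflow name -> ordered list of step names

def register_step(name, func):
    STEPS[name] = func

def register_workflow(name, step_names):
    WORKFLOWS[name] = list(step_names)

def where_used(step_name):
    """List every workflow that reuses a given step."""
    return sorted(wf for wf, steps in WORKFLOWS.items() if step_name in steps)

def run(workflow_name, context):
    """Execute a workflow's steps in order against a shared context."""
    for step_name in WORKFLOWS[workflow_name]:
        STEPS[step_name](context)
    return context

register_step("snapshot", lambda ctx: ctx.setdefault("log", []).append("snapshot"))
register_step("patch",    lambda ctx: ctx.setdefault("log", []).append("patch"))
register_step("verify",   lambda ctx: ctx.setdefault("log", []).append("verify"))

register_workflow("monthly_patching", ["snapshot", "patch", "verify"])
register_workflow("emergency_fix",   ["patch", "verify"])

print(where_used("verify"))  # both workflows reuse the verify step
```

The `where_used` query is the part scripts lack: it makes reuse visible, so changing a shared step becomes a deliberate decision rather than a silent breakage.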
The automation should be usable by a range of people. Junior administrators should be able to implement complex workflows designed by senior engineers. Administrators should be able to distribute automated workflows to other administrators who are technical but not silo experts; for example, allowing security managers to update database passwords every thirty days. Similarly, non-technical people like business managers can be given interfaces to specific IT workflows, allowing administrators to focus on exception cases.
Automation that is dynamic can interact with the multiple sources of change, including IT workflows that cross multiple technology silos. Therefore, integration across multiple silo-specific management products has to work. Integration is not just products understanding each other's APIs; it means that information, data, custom forms and so on can be translated from one tool to another. It does IT staff no good to have a screen of pretty icons representing automated tasks with lines connecting them if the automation cannot pass all the necessary information from one icon to the next.
Finally, the automation must be intelligent, to support better decision making and more fluid interaction between IT staff. Many technologies interact with each other to deliver a business service; therefore, IT organizations now have many different people involved in making provisioning and change decisions. Coordinating decision making and activities across all these different people is difficult. The increasing adoption of process standards like ITIL and COBIT is aimed at making the interaction between people clearer and more efficient.
Automation solutions become involved because they often must provide the information that people need to make those decisions. Remember the example of running a discovery scan of moving virtual machines so that patches can be deployed. In that case, the automation solution doing the virtual machine moves should report those moves to the automation solution used for patching, so that people can make scheduling decisions and the patching activities do not conflict with other planned deployments, moves or anything else.
In addition, different people will be involved in different parts of a management process. For example, the people planning the changes can be different from the change approval staff, who can be different from the implementers. A truly closed-loop process involves both people and automation. Intelligent automation solutions streamline not only the implementation of a specific change workflow, but also help people understand the impact of what was changed.
IT automation can work
When the computing infrastructure is extremely easy to change, automating the control of those changes requires a more sophisticated approach. IT organizations must have dynamic provisioning (tap water infrastructure) that does not collapse because the infrastructure is too dynamic (an ocean constantly in motion).
IT automation that works must be robust enough to deal with multiple variants of a particular technology, dynamic enough to handle the multiple sources of change to that environment, flexible enough for many different types of people to find and use, and intelligent enough to help IT people work with each other better and make good decisions.
ABOUT THE AUTHOR: Jasmine Noel is founder and partner of Ptak, Noel & Associates. With more than 10 years experience in helping clients understand how adoption of new technologies affects IT management, she tries to bring pragmatism (and hopefully some humor) to the alignment of business and IT operations discussion. Send any comments, questions or rants to firstname.lastname@example.org.