
Scale up DevOps processes against the odds

DevOps is the change everyone wants, until it means new processes. The difficult but rewarding shift to a DevOps process flow reimagines everything from releases to incident response.

Sometimes, the numbers tell the story. By 2020, Gartner predicts at least 80% of large enterprises will use DevOps or have a pilot project, up from 38% today. And 87% of the organizations that implement DevOps processes today are already disappointed.

"There's going to be a lot of disillusionment [with DevOps] in the next three or four years," due to pilots that don't scale, vendors that DevOps-wash products and other difficulties, said Ian Head of Gartner, presenting at the firm's IT Operations Strategies & Solutions Summit in Orlando, Fla., in May 2017.

In Gartner's survey of 400 organizations, 50% of respondents blamed people and culture issues for holding DevOps back, and 37% said they could not find existing processes that worked for DevOps. Only 8% reported technological roadblocks.

To avoid the all-too-common fate of DevOps victims, organizations must understand DevOps processes and design both those processes and teams to scale. IT shops must ask: What are my processes for, what are my process objectives and how will I measure the outcome?

Define DevOps processes and platforms

At least half of organizations don't hit their goals with DevOps, Head said. Part of the blame lies with operational processes that are unsuitable for DevOps: too much control in the wrong areas and a lack of trust and shared responsibility between development and operations.

Head recommended that ops borrow the Agile concept of a minimum viable product and create its own minimum viable processes. DevOps processes should dismantle overly complex tool sets, provide just enough protection so that speed doesn't lead to disaster and fight unmanaged changes.

A minimum viable set of DevOps processes makes the platform as iterative as the products it supports. "Your endpoint can be just a small win," said Larry Herz, senior data center engineer at AHEAD LLC, a cloud adoption consulting firm with headquarters in Chicago.

Start with absolute buy-in from all the teams involved and a good architectural footprint. Embrace the minimum viable product, and build on it. For example, if you build servers manually, develop processes to create and deploy a golden image in your virtualization platform of choice. The next step: Implement a secure and compliant base image across all Windows systems and another across all Linux systems, then generalize the application stack. Entrenched organizations can have 50,000 servers with as many different configurations, Herz pointed out, so iterative platform changes must happen before DevOps processes can translate to, "I can push a button and get my application stack."

"Ultimately, you want to get [to end-to-end automation], but don't go in expecting it," said Herz. "Do a little bit, make that little bit better and move up the stack." As DevOps processes build upon each other, resistance melts away.

Stick to the source, and empower it

Rather than force people along a workflow, let them make judgments based on information that automated processes generate. DevOps processes keep decisions closely tied to the information.

The highest-paid people don't make all the decisions; the people responsible for the product do. Release managers should decide on a course of action and fix any failure caused by the decision.

"Don't create a new process -- integration into the standard working practice is really important," said Jon Williams, CTO of Niu Solutions, an IT consultancy and cloud infrastructure services provider in the U.K. Keep within normal work queues to encourage DevOps adoption.

Change management means release early and often

DevOps processes for release flip the waterfall mentality on its head. In waterfall processes, big changes with complex relationships increase risks due to dependencies, so updates are rolled up and assigned a person to manage them in one big package, ferreting out as much risk as possible.

Instead, DevOps demands that teams excel at releases and do them often. Business product owners should define risk and service acceptance criteria, Head said. DevOps processes favor small sprints in infrastructure and operations over big projects, so issues are easily found and remediated.

"We don't want to design a big perfect process that applies everywhere, [and] that's consistent across the entire organization," Head said. "We may design a new process just for this one product that we're working on; get it right, and understand it and allow the process to expand in scale and duty."

In other words, emulate the Agile Scrum mentality in operations.

Not all changes are equal, so change impact analysis is necessary in a DevOps process flow. Create a process wherein product teams can change the software product without operations' approval, provided they affect no infrastructure configurations. Developers inform operations of these code changes and associated dangers as part of the process, and then changes come through a tool chain with governed scripts. The infrastructure and operations teams, meanwhile, build a common platform without customization for code dependencies. To understand, expose and mitigate risk of changes, release automation and strategic deployment techniques, such as canary and blue-green deployment, come into play.
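The approval rule described above can be sketched as a small routing function. This is a minimal illustration, not a process Head or Gartner prescribes: the `Change` record, the `INFRA_PREFIXES` path list and the route names are all hypothetical, and a real tool chain would decide what counts as an infrastructure change in its own way.

```python
from dataclasses import dataclass, field

# Hypothetical path prefixes that mark infrastructure configuration.
# A real organization would define its own list.
INFRA_PREFIXES = ("infra/", "terraform/", "network/")


@dataclass
class Change:
    summary: str
    files: list[str]
    # Dangers the developers declare to operations as part of the process.
    risks: list[str] = field(default_factory=list)


def touches_infrastructure(change: Change) -> bool:
    """Return True if any changed file sits under an infrastructure path."""
    return any(path.startswith(INFRA_PREFIXES) for path in change.files)


def route_change(change: Change) -> str:
    """Decide whether a change needs operations' approval.

    Code-only changes go straight to the governed release pipeline;
    anything that touches infrastructure configuration goes to ops review.
    """
    if touches_infrastructure(change):
        return "ops-review"
    return "auto-release"


def notify_ops(change: Change) -> str:
    """Build the message operations receives for every change, approved or not."""
    risks = ", ".join(change.risks) if change.risks else "none declared"
    return f"{change.summary} -> {route_change(change)} (risks: {risks})"
```

For instance, `route_change(Change("fix typo", ["app/views.py"]))` returns `"auto-release"`, while a change that includes `infra/firewall.yaml` returns `"ops-review"`. Operations is informed in both cases; only the second one waits for approval.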

Niu applied the DevOps approach of small, constant updates to its audit process. The service provider specializes in financial cloud and, due to a large legacy footprint, found compliance to be a better DevOps starting point than infrastructure automation. Niu used the Chef InSpec tool to map out build standards and industry standards and to test the compliance of the technologies it delivers.

"The massive win is continuous," Williams said. Before, audits involved a frenzy of preparatory activity and a sense of relief when they were completed -- until the next one.

DevOps processes for compliance drove down rework and unplanned downtime for the operations team, said Gary Bright, infrastructure developer at Niu, who presented at ChefConf in May 2017.

Niu used the blueprint from compliance to start automation with delivery code. "We were knocking out Windows machines [and] firewalls and then were a little upset to find they were still Windows machines [that needed constant tending]," Williams said. Niu built upon the already written compliance checks to improve deployment, configuration, integration and test and to get rid of lots of handoffs. Then, the team built on that to create DevOps processes for detection and correction, with the goal of automated remediation.
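Niu wrote its checks with Chef InSpec, which has its own Ruby-based syntax. The Python sketch below is not InSpec and does not reproduce Niu's controls; it only illustrates the general idea of compliance as code, where each standard becomes a named, repeatable check that can run continuously and later feed detection and remediation. The control IDs, the host fields and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Control:
    control_id: str
    description: str
    check: Callable[[dict], bool]  # takes collected host facts, returns pass/fail


# Hypothetical controls; a real set would come from build and industry standards.
CONTROLS = [
    Control(
        "ssh-01",
        "SSH root login is disabled",
        lambda host: host.get("ssh_permit_root_login") == "no",
    ),
    Control(
        "fw-01",
        "Host firewall is enabled",
        lambda host: host.get("firewall_enabled") is True,
    ),
]


def audit(host: dict, controls=CONTROLS) -> dict:
    """Run every control against one host's facts and map control ID to result."""
    return {c.control_id: c.check(host) for c in controls}


def failed_controls(host: dict, controls=CONTROLS) -> list:
    """List the IDs of controls the host fails, as input for correction work."""
    return [cid for cid, passed in audit(host, controls).items() if not passed]
```

Run on every build rather than once before an audit, a check set like this turns compliance into a constant, low-effort signal, and the list of failed controls is a natural starting point for automated correction.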

What you monitor matters

DevOps demands new metrics. In every decision, DevOps organizations must also consider business metrics, such as customer retention and profit per customer.

"Historically, we focused on the metrics of mean time between failure, number of incidents per month," Head said. That won't do for the DevOps world, where measures such as mean time to detect and mean time to restore service affirm quality of performance.
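As a rough illustration of the two metrics named above, the following sketch computes them from incident timestamps. The `Incident` fields are hypothetical, and the choice to measure time to restore from the start of the fault (rather than from detection) is an assumption; organizations define these measures differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    started: datetime   # when the fault began
    detected: datetime  # when monitoring or a person noticed it
    restored: datetime  # when service was back to normal


def mean_delta(deltas) -> timedelta:
    """Average a collection of timedeltas; an empty collection is an error."""
    deltas = list(deltas)
    if not deltas:
        raise ValueError("at least one incident is required")
    return sum(deltas, timedelta()) / len(deltas)


def mean_time_to_detect(incidents) -> timedelta:
    """Average gap between a fault starting and someone noticing it."""
    return mean_delta(i.detected - i.started for i in incidents)


def mean_time_to_restore(incidents) -> timedelta:
    """Average gap between a fault starting and service being restored."""
    return mean_delta(i.restored - i.started for i in incidents)
```

Both numbers reward teams for finding and fixing problems quickly, which is the behavior small, frequent releases are meant to encourage. Mean time between failure, by contrast, rewards avoiding change.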

Some notional measurements helped Niu's staff prove the value of DevOps processes over manual work. A manual firewall configuration took approximately 15 minutes apiece in the best-case scenario, which added up to 3.125 days for a single run. With this baseline, the team could tell whether pipelines and infrastructure as code improved operations speed.
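The article gives the per-configuration time and the total, but not how many configurations make up a run or how long a "day" is. The sketch below works backward from the two stated figures under two assumed day lengths; the resulting counts are inferences for illustration, not numbers Niu reported.

```python
MINUTES_PER_CONFIG = 15  # stated: best-case manual time per firewall configuration
REPORTED_DAYS = 3.125    # stated: total for a single run


def implied_configs(days: float, hours_per_day: float,
                    minutes_each: float = MINUTES_PER_CONFIG) -> float:
    """Number of configurations a run would contain for a given day length."""
    return days * hours_per_day * 60 / minutes_each


def run_days(configs: float, hours_per_day: float,
             minutes_each: float = MINUTES_PER_CONFIG) -> float:
    """Length of one run in days for a given number of configurations."""
    return configs * minutes_each / 60 / hours_per_day


# Assumption A: a "day" is an 8-hour working day.
print(implied_configs(REPORTED_DAYS, hours_per_day=8))   # 100.0
# Assumption B: a "day" is 24 elapsed hours.
print(implied_configs(REPORTED_DAYS, hours_per_day=24))  # 300.0
```

Either reading gives a whole number of configurations, so the stated figures are internally consistent. The same `run_days` function can then be fed an automated per-configuration time to see how much a pipeline shortens the run against this baseline.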

Niu also estimated lost productivity on the back end for every critical issue to demonstrate how controlled processes could reduce unplanned work for everyone -- engineers, service and account managers, help desk support and executives. The team is now investigating how to measure the deleterious effect on business revenue and profit from an operations issue.

"We've got multiple audiences for monitoring data because the application teams are customers of the monitoring functions," Head said. Product teams should live and die by the architecture they've defined, which means they need to know what's happening. Mature DevOps teams promulgate monitoring data through a combination of self-instrumentation in the app and shared analytics via common tools.

Measured risk and responsibility replace blame

DevOps processes tolerate the inevitability of risk and protect value against acceptable levels of risk.

Failed changes cause most incidents in pre-DevOps production deployments, despite a focus from infrastructure and operations on resilient architectures. Many companies -- even those that subscribe to a service methodology, such as COBIT or ITIL -- put in a tool and follow its workflow without critical analysis. They also surround ops with process-choking governance to avoid failure.

Incident management processes in DevOps shops reshape the service desk process. An incident call leads to triage rather than a first-round fix attempt. The business product owner and scrum master decide how an incident drives the queue of work: resolve now or wait for the next release? If possible, stop immediately, gather the team and figure out how to detect that problem sooner next time -- and resolve it.

Head warned against moral hazard -- a risk that's borne by somebody else -- since that somebody else always seems to be ops. It's an all-too-familiar scenario: The project comes out of design late, delays occur in dev, test takes longer than expected and responsibility for an on-time launch falls to ops.

What happens and when

Organizations should list all the activities along the DevOps process flow from product plan, code creation and verification to preproduction and release through configuration and monitoring, back to planning, Head recommended. This exercise will clarify whose skills belong on the team and where automation will enable accelerated, scalable changes.

Are people willing to own app code, storage and databases, or does ownership simply mean exposure to blame? DevOps is about built-to-run products, but not because teams avoid failure.

"The longer period of time you go between failures, the bigger the failure is going to be," Head said. "Small failures are good, even in our infrastructures." Create failure-resilient infrastructures, but shun overly complicated platforms.

Integrated teams and integrated tools get the right information into the right format at the right time. Death by process can absolutely kill an organization, so understand the goals and outcomes you want to deliver, Williams advised.

Next Steps

Better automation is essential for better DevOps. A human should only do a task if there's no capable tool for the job. With processes in place, apply shrewd controls to automation scripts and tools, as well as software-defined infrastructure components. Automation is more auditable and reliable than manual work, but only if it's kept in line by thoughtful managers.
