IT ops pros evolve into SREs with the right DevOps skill set

IT ops pros often hear that DevOps calls for them to script, code and serve up automated infrastructure for developers. But where should aspiring SREs begin?

NEW YORK -- IT ops pros must learn to script and code to have the right DevOps skill set. Those who've made the leap from sys admin to SRE have advice on where to get started.

Sys admins who don't adapt their DevOps skill set risk extinction, but site reliability engineers (SREs) are in heavy demand. At progressive organizations, SREs are a "force multiplier" for DevOps teams that want to move fast, said Jonah Horowitz, a site reliability engineer at San Francisco-based cloud payments provider Stripe, in a session at Velocity Conference here in early October.

At their best, SREs advise development teams on application design to smooth production deployments and create development frameworks and configuration templates to ensure new apps are production-ready, Horowitz said.

Scripting and coding for SRE wannabes

SRE can be a great career path for IT ops specialists, but it requires extensive professional development -- from the ability to write scripts that automate infrastructure to, eventually, the ability to code.

In the largest enterprises that embrace DevOps, SREs spend relatively little time troubleshooting and more time advising development teams at the start of the application design process, Horowitz said. Ideally, SREs directly contribute features to applications that help with reliability.

For those at the beginning of the process, configuration management tools are a good place to learn to code. Chef, for example, is written in the Ruby programming language. The ability to write good Bash or other shell scripts is a leg up on learning such a language, but it can be helpful to dive into the deep end with configuration management.

"I would work backward and learn configuration management, and where it doesn't fit [the application], learn Bash," said Bryan Liles, principal engineer for Capital One Financial Corp., the largest digital bank in the U.S., headquartered in McLean, Va., in a Q&A session following his Velocity presentation.

Liles recommended Google's online Shell scripting guide as a place to get started with scripting. Once a sys admin learns one language or scripting mode, they can learn other languages by rewriting familiar scripts in them, he said.
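As an illustration of that rewrite-a-familiar-script approach, here is a minimal sketch that recasts a typical disk-space check in Python. A sys admin might already do this in shell with something like `df -P / | awk 'NR==2 {print $5}'`. The function names `percent_used` and `check_disk` and the 90% threshold are assumptions made for this example; they do not come from Liles or the other speakers.

```python
import shutil


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total space."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def check_disk(path: str = "/", threshold: float = 90.0) -> bool:
    """Return True when usage at `path` is at or above `threshold` percent.

    The 90% default is an illustrative assumption, not a recommendation.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return percent_used(usage.total, usage.used) >= threshold


if __name__ == "__main__":
    # "." keeps the example portable across operating systems.
    if check_disk("."):
        print("WARNING: filesystem is nearly full")
    else:
        print("OK: filesystem has room to spare")
```

Splitting the arithmetic into its own function keeps it testable without touching a real filesystem, a habit that carries over to whichever language comes next.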

For those who don't want to take on Chef and Ruby right away, the Ansible configuration management tool, built on its own relatively straightforward, YAML-based domain-specific language, may be more accessible, said David Phruksukarn, senior software engineer at an online retailer in San Francisco.

DevOps skill set includes new ways to think

SREs also deeply understand distributed systems architectures, according to some Velocity attendees.

"To deploy things correctly, you really have to understand how the application works," Phruksukarn said.

The most important thing about learning to script and code is to develop logical thinking about how an application should function, and then translate that into code that a machine will understand, said Kevin Beaudreau, manager of SaaS operations for a major business software maker in New York. SaltStack, which uses Python, is a good configuration management tool to start with, he said.

Online coursework is available for many programming languages, but, ultimately, team collaboration contributes most to professional development and solid skills for DevOps, said Jon Moore, chief software architect and senior fellow at Comcast Cable, in a Q&A after his Velocity presentation.

"We have a strong pairing culture," he said. "A benefit of pairing with others is cross-training, especially when it's coupled with rotations."

Everyone on Moore's team at Comcast rotates through various DevOps roles, and that helps them meet new people, form new relationships and pick up new skills, he said.

SREs within the most advanced organizations eventually move beyond configuration management and scripting to build globally distributed systems that can autoscale and heal themselves, Horowitz said in his session. Advanced SREs may also move into more of a consulting role for release teams staffed by developers, rather than being involved in production rollouts day to day.

Beth Pariseau is senior news writer for TechTarget's Data Center and Virtualization Media Group. Follow @PariseauTT on Twitter.
