IT organizations hear plenty of myths about isolation and security when they compare containerization vs. virtualization. One DevOps and containerization consultant sets the record straight in this podcast.
Containers enable consistent deployment from development through production, as well as lighter-weight, scalable applications. But containerization can also tighten security, because container isolation exposes a far smaller attack surface for a given application.
Security myths about containerization vs. virtualization arise from the misconception that containers are simply smaller, more plentiful versions of VMs, says Will Kinard in this podcast interview with SearchITOperations. Kinard is the CTO of BoxBoat Technologies, a DevOps and containerization consultancy that works with diverse businesses to modernize IT estates. Containers exercise isolation on a host, not virtualization of the host assets.
"Whether there's a hypervisor there or not, regardless, a container runs on operating system primitives," he says. Multiple containers share an OS kernel but use their own file system. "When a process is running in a container, it's actually in the process tree of the operating system itself. It's merely been isolated," he says in the podcast.
Container isolation pares down the file system for each container to exactly what it needs to run the application code and nothing more. It also contributes to a slimmer OS, rather than a general-purpose OS with unused components that still require patching. "The host operating system that is running all of these containers only needs the binaries -- in this case, Docker and maybe containerd -- and only needs the components of that file system to run containers," Kinard explains.
Another misconception of containerization vs. virtualization is application suitability. Kinard suggests that IT organizations examine the payoff from container isolation and orchestration, not simply whether an app fits the mold of stateless and cloud-native microservices where container adoption thrives.
"There's no reason you can't run a database [with containers]; it's done all the time," he says. The data goes to a persistent store on disk or elsewhere. Likewise, containerization is a fit for stateful applications that need faster development cycles, portability or other benefits, as long as the app is architected to access its data mounted external to the containers.
"At the end of the day, if we're implementing something that's not allowing you to either save money or become more efficient ... then we're not doing the right thing," Kinard says.
Listen to the podcast for Kinard's explanation of cattle vs. pets, virtualization and container isolation, or read the transcript below.
Transcript - Container isolation compels IT orgs to rewrite security practices
Will Kinard: My name is Will Kinard. I'm the CTO at a new startup technology consulting firm out of Washington, D.C., called BoxBoat Technologies. The name is a play off of container ships, which is part of this whole new market trend toward containerization and implementing mainly Docker containers and CI [continuous integration] workflows and helping modernize corporate infrastructures.
I'm the tech lead of the company; I run a crew of a little over 10 engineers, and we're growing right now. We target almost anybody and everybody from small business to large business to help them with their CI workflows and putting these modern tool sets in -- especially containerization.
Meredith Courtemanche: So, let me ask you a basic question: What do you do with containers?
Kinard: That can be a pretty broad-based answer. But I think, all in all, we help organizations leverage containers to make those workflows more efficient. At the end of the day, if we're implementing something that's not allowing you to either save money or become more efficient ... then we're not doing the right thing. That can take many forms; that can look like modernizing your current development processes. Everyone has a development workflow, whether it's formalized or not. It could be homegrown, or it could have been some type of formal process that was charted up in the beginning. But we like to come in and assess that, and we'll use containerization to wrap up all of your development products and artifacts into single Docker container images that can be deployed throughout your pipeline. It will increase speed and improve efficiency and allow for a lot more organization and sanity.
I would say that's where we implement it the most. We also bring containerization to increase security, depending on the client's needs -- to decrease their attack surface. Those are the two main places where we implement containerization.
Courtemanche: The phrase that we hear a lot with containers is that containers are just a smaller version of a VM. And security is going to be harder because, instead of hundreds of VMs, now you've got thousands of them, because you've got containers. What's wrong with that statement?
Kinard: Right. There's certainly a lot wrong with that statement. I think that [misconception] comes out of trying to bring the idea of containerization up to a higher level to make it easily explainable, although technically it's very incorrect. A virtual machine by its nature ... has the hypervisor, right? It's a translation of the underlying hardware or virtualization of the underlying hardware to an operating system. In terms of something like VMware, that would be using ESX -- a hypervisor to abstract the underlying hardware so you can run multiple operating systems. So, the operating systems themselves think that they're all running on the same hardware, where, in fact, they're just looking at an abstraction of it. With a virtual machine, you're actually running on the hypervisor, but you still have direct access to all the components -- the underlying hardware components. You have your own file system, your own operating system. It is an isolated operating system for all intents and purposes, but it's isolated at the hypervisor level.
When we look at a container, it's a very different paradigm. It's not taking the same components and making them smaller, in any regard. Containerization is simply an isolation of a process on the same operating system. Whether there's a hypervisor there or not, regardless, a container runs on operating system primitives. In other words, when you have multiple containers running in the same operating system, yes, they do have their own file system, but they're all operating within the same kernel. There's no virtualization there at all. When a process is running in a container, it's actually in the process tree of the operating system itself. It's merely been isolated. It has cgroups and some other kernel mechanisms, and it also has its own file system. So, in a way, because they both have their own file systems, in a VM and a container, the parallels can be drawn. Because, when you jump into a container, you see that it looks like its own workspace -- and it is -- but it's actually technically, under the hood, very different.
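The kernel mechanisms Kinard mentions are visible from inside any Linux process. The following is a minimal sketch, not something from the podcast: it assumes a Linux host that exposes the standard /proc interface, and the function names are hypothetical. It reads the cgroup membership and namespace identifiers that the kernel records for a process, which is the bookkeeping behind "it's merely been isolated."

```python
import os
from pathlib import Path


def parse_cgroup_file(text: str) -> dict[str, str]:
    """Parse the contents of /proc/<pid>/cgroup into {controllers: path}.

    Each line has the form "hierarchy-ID:controller-list:cgroup-path".
    On cgroup v2 the controller list is empty, so the key is "".
    """
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        _hierarchy_id, controllers, path = line.split(":", 2)
        groups[controllers] = path
    return groups


def namespace_ids(pid: str = "self") -> dict[str, str]:
    """Return namespace identifiers for a process, keyed by namespace type.

    Linux only: reads the symlinks under /proc/<pid>/ns.
    Returns an empty dict where that directory does not exist.
    """
    ns_dir = Path("/proc") / pid / "ns"
    if not ns_dir.is_dir():
        return {}
    result = {}
    for entry in ns_dir.iterdir():
        try:
            result[entry.name] = os.readlink(entry)
        except OSError:
            # Some entries may be unreadable without extra privileges.
            continue
    return result
```

Running namespace_ids() once on the host and once inside a container on that host would typically show different identifiers for namespaces such as pid and mnt, even though both processes are scheduled by the same kernel.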
Courtemanche: You also could use a smaller operating system for containers than you would use running a physical machine workload or virtualization. Does a smaller operating system affect security?
Kinard: Yes, absolutely. I have to tease that question apart in a couple places. The container doesn't actually run its own operating system. Where [with] a virtual machine obviously ... you can run a Windows virtual machine and a Linux 3.10 kernel on the same hypervisor on the same bare-metal machine. A container just uses the operating system of the host. So, if I'm running 10 containers on Red Hat [Enterprise Linux] 7.2, which is running Linux kernel 3.10, they're all running Linux kernel 3.10. Containers just don't share the same file system. There's quite a difference there. I can run an Ubuntu container or a Red Hat container, CentOS container ... all on that same Red Hat operating system, but they're all going to share the same kernel.
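The split between a shared kernel and a per-container file system can be checked from inside a container. The sketch below is illustrative only and assumes a Linux environment with an os-release file at the conventional /etc/os-release path; the function names are hypothetical. The kernel release comes from the host kernel, while the distribution name comes from whatever file system the container was built with.

```python
import platform
from pathlib import Path


def parse_os_release(text: str) -> dict[str, str]:
    """Parse os-release style KEY=value lines, stripping optional quotes."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def describe_environment(os_release_path: str = "/etc/os-release") -> dict[str, str]:
    """Report the kernel release and the userland distribution name."""
    path = Path(os_release_path)
    userland = parse_os_release(path.read_text()) if path.is_file() else {}
    return {
        # Reported by the running kernel, which every container on the host shares.
        "kernel": platform.release(),
        # Read from the file system, which each container brings for itself.
        "userland": userland.get("PRETTY_NAME", "unknown"),
    }
```

Called inside an Ubuntu container and a CentOS container on the same host, describe_environment() would be expected to return different "userland" values but the same "kernel" value.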
Certainly, because containers are not running their own operating system -- they're simply running their own file system -- you only need in that file system for that container exactly what that container needs for its single software that it's running. What this means is -- let's say I just have [a] single web service, like Apache, for instance. You only need the library, the binaries and, of course, your application that you're running within that file system that is that container. From a hack surface perspective, it's dramatically reduced.
If you take that a level deeper, now the host operating system that is running all of these containers only needs the binaries -- in this case, Docker and maybe containerd -- and only needs the components of that file system to run containers. So, now we've removed a lot of the excess from the underlying host operating system as well. So, certainly the attack surface is smaller.
Courtemanche: One of the questions that we have for IT operations is: How do you manage containers in production? How do you keep them secure? How do you monitor them? And it's going to be very different than what you're doing with virtual machines, right?
Kinard: It is. We at BoxBoat have a lot of different ways that we try to transfer that knowledge, and we've come up with a stack of technologies that tries to educate the enterprise, as well as empower them to handle this new stack of components. But, in many ways, it's not going to be that different. It's just simply a new paradigm of thinking. A lot of IT operators and administrators are very familiar with running virtualized environments now and using those tools, like VMware vSphere -- that's almost a given skill set. You need to know how to run vSphere for a VMware shop. With containerization, it's very similar, where there [are] several technologies to choose from in terms of running a containerized platform.
If you're actually running enterprise container orchestration, like Kubernetes, like Docker Swarm, it would be analogous to running vSphere, if you're trying to keep up with many different virtual machines. It's a matter of learning that [containerization] tool set and learning how containers interoperate with each other and with the structure. It's certainly a new set of tools, it's a new set of skills and it's also a new way of thinking. There's a lot of different ways to go about it; we've developed an opinionated way to help enterprises tackle that issue.
Courtemanche: So, you're using a lot of the same concepts and applying them in a different way?
Kinard: You are, in effect, but containers themselves are a paradigm shift, and I can dig into that. When we launch a virtual machine and we launch an application on it, it has some degree of making that application up and running. A metaphor that's used a lot -- and I hate to overuse it, but it does apply here -- is pets vs. cattle. We treat our machines -- and even our virtual machines -- like pets. When we bring one up and we raise it and now it's mature and it's running in production -- you have to keep your pet alive. You don't have a lot of them. It's your pet; you care for it. And so, you're going to continue to do whatever you need to do to keep that pet happy and up and running. With containers, we take what is referred to as a cattle mentality. You have hundreds of cattle; they're not your pets. If one was to unfortunately die, it's not the end of the world. You have many of them, and you'll bring another up to replace it. You look at containers as cattle: We have lots of them. They go down. It's not a problem; containers can spin right back up.
It's a paradigm shift for not only running production apps for high availability, but it even changes how we look at our service levels in terms of what we need to do to meet those. Again, it's maintenance, and with containers, if they go down, it's not that big of an issue. Our orchestrator is either going to schedule new containers for us on healthy nodes, or we're going to do that manually. In most cases, it's automation. It's very different [from VMs] in that regard.
Data is persisted outside the containers; it is persisted within a VM or database. VMs will take a long time to start back up; containers can take seconds. It is a different way of thinking, as in that pet vs. cattle metaphor.
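The cattle approach boils down to a reconciliation loop: compare the number of replicas that should be running with the number that are, and start replacements on healthy nodes. The toy model below is a sketch of that idea only. It is not how Kubernetes or Docker Swarm is implemented, and the Node class and reconcile function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True
    containers: list[str] = field(default_factory=list)


def reconcile(nodes: list[Node], app: str, desired: int) -> int:
    """Bring the number of running replicas of `app` back up to `desired`.

    Containers on unhealthy nodes are treated as lost. Each replacement is
    placed on the healthy node currently running the fewest containers.
    Returns the number of replicas started. Scaling down is not handled.
    """
    for node in nodes:
        if not node.healthy:
            node.containers.clear()

    healthy = [node for node in nodes if node.healthy]
    if not healthy:
        return 0  # nowhere to schedule; a real orchestrator would keep retrying

    running = sum(node.containers.count(app) for node in healthy)
    started = 0
    while running + started < desired:
        target = min(healthy, key=lambda node: len(node.containers))
        target.containers.append(app)
        started += 1
    return started
```

With three healthy nodes and a desired count of three, the first call starts one replica per node. If one node is then marked unhealthy, the next call starts a single replacement on one of the remaining nodes, with no attempt to revive the lost container.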
Courtemanche: What technology do you have the most concerns about right now in containerization? What's the most unstable/least secure/still evolving thing that you are keeping an eye on before you implement in production? And -- possibly the same answer -- what are you most excited about that's coming down the pipe that you are really looking forward to bringing into container environments?
Kinard: Yeah, great question. We at BoxBoat have run into just about everything in terms of: Can it be containerized, from a difficulty standpoint and the easy applications? Usually, when we look at containerization, it's a stateless application that's the most readily containerized. Think of your web servers or if you have other homegrown apps that tend to be intuitive without worrying about their last state. Because of the ease in doing this and because of the fact that containers are ephemeral -- meaning that they might go away [and] their data does not persist, where, in fact, you can store data external to the container and have it remount -- there's kind of a misnomer out there that containerization is only for stateless apps, but this is not true at all. There are many tried-and-true techniques to retaining state, whether pushing that off to its own database or using some type of foreign construct.
I'm saying this to answer that question: When you start tackling heavy and monolithic stateful apps, certainly this can be a problem. You have to discover all of the places where that application wants to store its data. You have to determine if the application is going to be happy running in an updated kernel environment, because containers right now -- the kernel for containerization -- are only stable in a Linux 3.10 or higher environment. That's the biggest obstacle we see when we look at legacy applications. There isn't anything that can't be containerized. Containerization is just a process running on a host OS -- which is what the app was before. There's absolutely nothing that can't be containerized, but it really comes down to the older applications that are not already running on the latest OS. And these applications usually look like old, homegrown Java apps or even some of these middle-tier enterprise applications, web servers like JBoss [WildFly], even though there are certainly newer technologies, like WebSphere [from IBM], that certainly can be containerized.
The trick with these types of technologies is that they support a lot of their own primitive scaling and high availability and resource management, so when we enter into environments where they want to containerize these types of frameworks, there's a lot of overlap between what a container can provide and the primitive that they've written in place. And, at that point, you're looking at refactoring code, which can be difficult. That's the most difficulty we see: when apps have to be refactored because of a framework primitive. It's not that you can't just completely containerize the entire framework, which you can -- and we've done -- but it's whether you really want to. What are you gaining? So, those are generally the most difficult.
I'll also throw out at the same time ... There's also a large misnomer about databases running in containers -- they absolutely can. A process in a container is the same as a process on a host. There's no reason you can't run a database; it's done all the time! It's just going to persist its data to a persistent store on disk or elsewhere, just like it would normally. So, there is a large misnomer that databases shouldn't be containerized, but that's not the case at all. Many of the production containerized ... that you'll deploy store their space within containers. Most people think that that's the most difficult, but in actuality, it's not.
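The pattern Kinard describes, a throwaway process whose data lives in a store outside the container, can be sketched in a few lines. This is an illustrative model rather than a real database or container runtime: the EphemeralCounter class and demo function are hypothetical, and a temporary directory stands in for a volume mounted from outside the container.

```python
import json
import tempfile
from pathlib import Path


class EphemeralCounter:
    """Stand-in for a stateful process running in a disposable container.

    All durable state lives in `data_dir`, which plays the role of a
    volume mounted from outside the container.
    """

    def __init__(self, data_dir: Path):
        self._state_file = Path(data_dir) / "state.json"
        if self._state_file.exists():
            # A replacement instance picks up where the previous one stopped.
            self.count = json.loads(self._state_file.read_text())["count"]
        else:
            self.count = 0

    def increment(self) -> int:
        self.count += 1
        self._state_file.write_text(json.dumps({"count": self.count}))
        return self.count


def demo() -> int:
    """Replace one instance with another and return the recovered count."""
    with tempfile.TemporaryDirectory() as volume:
        first = EphemeralCounter(Path(volume))
        first.increment()
        first.increment()
        del first  # the "container" goes away

        replacement = EphemeralCounter(Path(volume))  # same volume, new instance
        return replacement.count
```

The replacement instance recovers the count because nothing durable was ever kept in the process itself, which is the property that lets an orchestrator discard and restart a stateful container.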
In terms of what I'm excited about, coming down the pike, it's actually an even lower-level construct than containers: unikernels. It's probably a whole different thread of conversation, but it's compiling a kernel -- a Linux kernel itself -- a file system and then the application that's necessary to run into a single binary that can be run as the host operating system itself. This is very promising technology. It's going to drastically reduce the footprint of what's running in production, but it's a little more of a topic for a different day.