
Docker persistent storage is the next frontier in containerization

Persistent storage is a final hurdle for many enterprises hoping to deploy containers for traditional applications, and new plug-ins for Docker are emerging to help.

SEATTLE -- Docker persistent storage is emerging as the next bridge to cross for enterprises deploying containers.

Persistent storage refers to storage volumes -- usually associated with stateful applications, such as databases -- that remain available beyond the life of individual containers, as opposed to the more common ephemeral storage volumes that live and die with containers and are associated with stateless apps.

Containers live and die quickly and can be densely packed onto hosts, so time-honored problems in server virtualization, such as hot spots in disk arrays and the need to rapidly create and tear down volumes, are rearing their ugly heads again with the Docker craze.

For example, Vitacost.com Inc., a division of Kroger Inc. that sells vitamins and supplements through an e-commerce site, is looking to containerize Oracle's Endeca e-commerce search engine.

"[Containerizing Endeca] is going to be a big issue in e-commerce," said Gary Davidson, senior solution architect for Vitacost, based in Boca Raton, Fla. "We can put the brains of the app in a container, but what do we do with the indexes?"

Docker is counting on its partners to offer persistent storage plug-ins, and a host of those partners here at DockerCon 2016 paraded products meant to simplify Docker persistent storage. Vendors, including EMC {code}, Robin Systems, ClusterHQ, Portworx, CoreOS and Nutanix, all touted their approaches to the issue, each with a slightly different angle.

However, enterprise users predicted Docker will flesh out its own persistent storage offerings, similar to how it built Swarm orchestration into Docker Engine.

"It would be nice to have storage discovery like they have service discovery," Davidson said.

Persistent storage is the last big hurdle to containerizing traditional apps at some enterprises.

"It's the scary part," said Mario Cruz, director at Watsco Ventures, a heating, ventilation and air conditioning distributor based in Coconut Grove, Fla. "Today, we don't do anything with containers that requires persistent storage."

In part, this is a security issue, as it's difficult to apply storage policies to disappearing and reappearing containers.

"Who has access to those volumes? Where do they sit in the data center? And how do you attach an encrypted volume to a particular container?" Cruz said.

At General Electric, engineers also struggle with these questions, senior developer Andy Lim said in a presentation here this week. The company is testing ClusterHQ's Flocker, an open source container data volume manager, as a potential answer to a big set of challenges: GE has more than 9,000 traditional applications to refactor or rewrite to embrace microservices, and it is also grappling with how to apply containerization to database applications.

So-called unicorn organizations, such as Netflix, have spoken publicly in great detail about their container infrastructure and even issued some open source software releases to help other organizations achieve similar goals. However, information about how persistent storage might work in such an environment is scarce, according to Nirmal Mehta, senior lead technologist for the strategic innovation group at Booz Allen Hamilton Inc., a consulting firm based in McLean, Va. Mehta works with government organizations to establish a DevOps culture.

"There hasn't been enough detail on data repositories, except that usually it's a hardcore distributed database system like [Apache] Cassandra," Mehta said.

Enterprises wrestling with Docker persistent storage immaturity should take a step back and question whether containers are the answer for everything, said Donnie Berkholz, an analyst with 451 Research.

Enterprises are increasingly containerizing traditional apps, Docker CEO Ben Golub said on the keynote stage here. But the variety of plug-ins that partners have proposed shows that the answer to the problems with stateful apps is not yet clear.

For example, CoreOS recently launched a project called Torus that uses the etcd distributed database to manage cluster coordination as a "single source of truth" for large-scale stateful clusters. At the other end of the spectrum, Portworx touts scale-out block and file services optimized for containers as the way to go.

"There are a lot of different approaches," Berkholz said. "That indicates to me we haven't found the right one yet."

Beth Pariseau is senior news writer for TechTarget's Data Center and Virtualization Media Group. Follow @PariseauTT on Twitter.
