
Docker persistent storage startup beamed up to the mother ship

Docker persistent storage and stateful applications are the next front in the container wars, and Docker Inc. has just fired a major salvo with its acquisition of Infinit.

Already synonymous with containerization, Docker has made an acquisition that could help it support the stateful applications that make up most enterprise workloads.

Docker Inc. said it will acquire Paris-based Infinit for an undisclosed sum. Infinit is in the early stages of creating a distributed Docker persistent storage system that could be a breakthrough in containerizing the stateful apps that enterprises typically use. Stateful apps include Elasticsearch, WebLogic and SQL databases.

Until now, containers have largely been suited to 12-factor applications, which are stateless, meaning they place no application state in the container itself. Stateful applications, on the other hand, need configuration files and other information on the container host to start successfully. It's early enough in this market that there is some debate as to whether stateful applications should be containerized at all.
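The distinction can be illustrated with a minimal Python sketch. It is not taken from Docker or Infinit; the class names are hypothetical, and a temporary directory stands in for a volume that outlives the container. Creating a new object simulates a container restart.

```python
import tempfile
from pathlib import Path


class EphemeralCounter:
    """Keeps its count in process memory, so a restart resets it."""

    def __init__(self) -> None:
        self.count = 0

    def increment(self) -> int:
        self.count += 1
        return self.count


class VolumeBackedCounter:
    """Keeps its count in a file under a directory that outlives the process."""

    def __init__(self, data_dir: Path) -> None:
        self.path = data_dir / "count.txt"

    def increment(self) -> int:
        # Read the previous value from disk, if any, then write the new one.
        current = int(self.path.read_text()) if self.path.exists() else 0
        self.path.write_text(str(current + 1))
        return current + 1


if __name__ == "__main__":
    # Hypothetical stand-in for a persistent volume mounted into the container.
    with tempfile.TemporaryDirectory() as volume:
        EphemeralCounter().increment()
        print("ephemeral after restart:", EphemeralCounter().increment())
        VolumeBackedCounter(Path(volume)).increment()
        print("volume-backed after restart:",
              VolumeBackedCounter(Path(volume)).increment())
```

The second counter only keeps its value if the directory it writes to survives the container, which is the gap that persistent storage systems aim to fill.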

This acquisition has Docker planting a definitive stake in the ground in favor of containerized stateful apps, experts said.

As an early-stage startup whose technology has not yet been open-sourced, Infinit is a bit of an unknown quantity, "but storage is still a huge hurdle for enterprise to overcome when moving to containers," said Chris Riley, DevOps analyst at Fixate IO, a content strategy consulting firm based in Livermore, Calif., and a TechTarget contributor. "There has been a lot more this year on supporting persistent storage, and Docker has built more support for the volume API."

However, while the container market is still in its infancy, as are stateful containerized apps, Infinit and Docker are not alone in this pursuit. ClusterHQ has made hay for some time with its Flocker platform for Docker persistent storage, and the open-source project Ceph, now owned by Red Hat, is already used by large companies, such as Bloomberg, to support OpenStack private cloud and Google Kubernetes container orchestration implementations.

"We are big supporters of distributed storage, [and] here at Bloomberg we've been very vocal supporters of the Ceph distributed storage system," said Justin Erenkrantz, head of compute architecture for the global finance, media and tech company, based in New York.

Infinit seems to be more of a globally distributed storage system while Ceph is aimed at data-center-sized deployments, Erenkrantz said.

Docker persistent storage is an emerging market where new technologies will keep popping up, Erenkrantz said. Ceph has worked well for Bloomberg, and Kubernetes also has an upstream project called Pet Sets that aims to create a persistent storage offering for containers, which Erenkrantz plans to investigate.

"It'll be interesting to see if this Infinit stuff presents itself in Kubernetes as well," Erenkrantz said.

Infinit container storage nitty-gritty

Infinit's platform aggregates container nodes' local storage into a single virtual pool and provides several interfaces on top of it, which will include a POSIX mount point that all nodes can share as a file system; a REST API compatible with Amazon's S3; and a block storage interface that lets users bring their own file system. Further interface projects include iSCSI and NBD, according to a Docker corporate blog post.

As a "truly distributed" storage system, there is no specific node management; members "can come and go at will," said Infinit CTO Quentin Hocquet in a presentation in October. A customer could theoretically spin up a new storage system with every container created, without slowing the system down, Hocquet said.

Nodes in the Infinit platform are linked by an overlay network, though in this case it's not an encapsulation-based network layer. Rather, it is a "family of algorithms," Hocquet said, which can be used to pool tens of thousands of nodes together, while creating a distributed key-value store of hashes that evenly distributes blocks among nodes in the network. For each block, a consensus algorithm handles permissions for read and write access, so there is no need for a cluster authority or leader, which would create a performance bottleneck and a single point of failure.
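How a hash-based store can spread blocks evenly without a leader can be sketched in a few lines of Python. This is an illustration only: it uses rendezvous hashing, which is an assumption and not necessarily the algorithm Infinit uses, and the node names, function names and replica count are hypothetical.

```python
import hashlib
from collections import Counter
from typing import List


def block_address(data: bytes) -> str:
    """Content address: the SHA-256 hex digest of the block's bytes."""
    return hashlib.sha256(data).hexdigest()


def owners(address: str, nodes: List[str], replicas: int = 3) -> List[str]:
    """Pick `replicas` nodes for a block via rendezvous hashing.

    Each (node, address) pair gets a hash score and the highest scores win.
    Any member can compute this locally, so no central authority is needed.
    """

    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}:{address}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)[:replicas]


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(10)]
    placement = Counter()
    for i in range(10_000):
        addr = block_address(f"block-{i}".encode())
        placement.update(owners(addr, nodes))
    # Each node should hold roughly the same number of block replicas.
    for node, count in sorted(placement.items()):
        print(node, count)
```

In this sketch, a node that leaves only forces the blocks it held to move, because the relative ranking of the remaining nodes for every other block is unchanged. That property fits a system where members "can come and go at will," though the consensus step Hocquet describes for read and write permissions is not modeled here.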

This system could potentially be used to create a default volume plug-in for stateful apps; distributed storage for images in the Docker Trusted Registry; or hyper-converged infrastructure support for Docker Swarm, according to an Infinit blog post.

"It looks like Infinit's approach is abstracted to the software layer, and volumes are less of a concern [there]," said Fixate IO's Riley. "But the problem being solved is the same, and really the only way containers will ever work for back-end applications, which are needed for fully container driven pipelines."

Beth Pariseau is senior news writer for TechTarget's Data Center and Virtualization Media Group. Follow @PariseauTT on Twitter.
