Docker storage is the hottest new battleground in the container wars as competitors clash over enterprise data...
Early adopters have debated whether persistent storage and stateful applications are appropriate for containerization, but enterprise IT pros will require these features as they adopt containers. Legacy apps won't disappear, they say, and data portability through Docker has become a multicloud management dream, especially for multinational companies.
"People are going to need reliable persistent storage that can handle massive amounts of information," said Barry Libenson, global CIO of the Experian credit bureau. "Those capabilities are going to be necessary for people to build viable large-scale applications."
Experian already orchestrates stateless applications that can reside on either side of a hybrid cloud environment managed with Red Hat's Kubernetes-based OpenShift product, and eventually it will need the same capability with data attached. For example, if Experian sets up a data center in a country that changes its data residency laws, it's onerous to pick up applications and move them from one facility to another.
Analysts say such container use cases are important to the enterprise audience that Docker and Google look to capture with their respective Swarm and Kubernetes offerings.
"All the players have seen the writing on the wall and a lot of this has to do with enterprise interest," said Jay Lyman, analyst at 451 Research. "They're interested not only in the net-new cloud-native applications, but in migrating existing applications to the cloud using containers as the vehicle."
If it's early days for containers at large enterprises, it's prehistoric times for stateful containers with persistent storage. Two years ago, finding developers who could work with the Hadoop platform was a challenge for organizations such as Experian. Today, demand for container orchestration skills outstrips the IT job market's supply.
"Finding developers that are familiar with Kubernetes, containers and OpenShift in general is not the easiest thing in the world," Libenson said. "Hopefully that'll ease up like it did around the Hadoop space, but it continues to be a bit of a challenge."
Experian is no stranger to modernized database platforms, having already replaced an IBM DB2 environment that runs on pSeries hardware with Cloudera's Apache Hadoop, which runs on x86 boxes. But security is paramount for such a financial institution, and encryption product offerings in the Docker storage world still lag behind those available for traditional virtual machines.
"This is an ecosystem that's evolving very quickly, but there are some places where the choices around the technology are not as rich as we would like them to be," Libenson said. "When you encrypt petabytes of information and need to retrieve the information quickly, we have some concerns around what the performance implications would be."
Kubernetes takes lead in Docker storage
Storage orchestration features in the Docker and Kubernetes container management platforms still lag behind the state of the art in the VM world overall. For example, Red Hat OpenShift just this month rolled out automated tiered storage capabilities, which have been available with regular servers for over a decade. Likewise, VM storage live migration runs circles around what's possible with containers now.
"With VMware you can live-migrate the storage and the RAM across the country and have the VM perfectly functional the whole time," said Michael Bishop, CTO of Alpha Vertex, a fintech startup that uses Kubernetes to manage a multicloud environment. "It's hard to top that as a trick."
Kubernetes also just added dynamic storage provisioning in version 1.4, which means storage administrators no longer have to pre-provision volumes before compute units can consume them. That feature also made it into OpenShift this month. As with tiered storage, it has been available elsewhere for years, particularly in object-based cloud storage systems.
Docker captured headlines with its acquisition of distributed file system startup Infinit in December, but Kubernetes loyalists have since trumpeted the advanced storage orchestration features the platform has offered since 2015 and painted Docker as playing catch-up in this space as it works to release Infinit's technology.
Tiered storage, for example, is not built into Docker natively, unless the user has ZFS. Dynamic provisioning is available with Docker Compose. But Kubernetes can also provision persistent Docker storage so that container environments can run across multiple clouds, as Experian and Alpha Vertex are doing, a feature Docker does not natively have today.
Still, it's debatable how meaningful the Kubernetes storage orchestration lead really is, or how long it will last.
"Docker is still the center of gravity for application containers -- this all starts with Docker," Lyman said. "That keeps them in a prominent position in the market and helps Docker Swarm."
Most of Lyman's enterprise clients use or have investigated Amazon's EC2 Container Service rather than Docker Swarm or Kubernetes, he said. Amazon ECS abstracts host clusters and storage completely from the enterprise administrator.
Slow and steady may win the Docker storage race
Being ahead of one's time in a technology field that tends to move relatively slowly, as data storage does, can be a disadvantage, too -- just ask ClusterHQ, a Docker storage startup that closed its doors in December despite the early popularity of its Flocker product.
"The Flocker product is useful, but it is a product that in the best case scenario was never going to generate much revenue because what it does will be commoditized by the orchestration systems over time," said Mark Davis, ClusterHQ's now-former CEO.
Meanwhile, the Docker storage ecosystem is fast and furious, which in some ways runs counter to enterprise storage priorities such as reliability and stability.
"The Docker ecosystem wants storage to keep up with fast growth, and that's a challenge that will take patience," Davis said. "All of these things are rapidly evolving, and if you change software a lot you're going to create a lot of bugs -- that's just a law of the universe that you can't get around."
Kubernetes does have a functionality lead for the moment, Davis said, but he called Docker the gold standard in container orchestration ease of use.
"Their catch-up to Docker is, how do we make this thing so you don't have to be a rocket scientist to make it work?" he said.
Still, Docker has its work cut out for it with Infinit, in Davis' estimation.
"I don't believe that anyone, and I don't care how brilliant you are, can build an enterprise-grade reliable scalable distributed file system with half a dozen people in a couple years," Davis said. "That's not a criticism of them, it's just that's how long it takes."
In the meantime, enterprises won't want to relive storage growing pains that were only recently resolved with VMs, said Nuno Pereira, CTO of iJET International, a risk management company based in Annapolis, Md., which uses containers in development but has stuck to VMs in production so far.
"What I see a lot is people standing on the sidelines and waiting until that whole thing is over and to see who the winner is," Pereira said. "There are significant differences of opinion, and the folks that jump in early could be left holding the bag in terms of being left holding on things like backward compatibility."