Enterprise IT pros should get ready for Kubernetes storage tools, as the Cloud Native Computing Foundation seeks ways to support stateful applications.
The Cloud Native Computing Foundation (CNCF) began its quest to develop container storage products this week when it approved an inception-level project called Rook, which connects Kubernetes orchestration to the Ceph distributed file system through the Kubernetes operator API.
The Rook project's approval illustrates the CNCF's plans to emphasize Kubernetes storage.
"It's going to be a big year for storage in Kubernetes, because the APIs are a little bit more solidified now," said CNCF COO Chris Aniszczyk. The operator API and a Container Storage Interface API were released in the alpha stage with Kubernetes 1.9 in December. "[The CNCF technical board is] saying that the Kubernetes operator API is the way to go in [distributed container] storage," he said.
Rook project gave Prometheus a seat on HBO's Iron Throne
HBO wanted to deploy Prometheus for Kubernetes monitoring, and it ideally would have run the time-series database application on containers within the Kubernetes cluster, but that didn't work well with cloud providers' persistent storage volumes.
"You always have to do this careful coordination to make sure new containers only get created in the same availability zone. And if that entire availability zone goes away, you're kind of out of luck," said Illya Chekrygin, who directed HBO's implementation of containers as a senior staff engineer in 2017. "That was a painful experience in terms of synchronization."
Moreover, when containers that ran stateful apps were killed and restarted in different nodes of the Kubernetes cluster, it took too long to unmount, release and remount their attached storage volumes, Chekrygin said.
Rook was an early conceptual project on GitHub at that time, but HBO engineers put it into a test environment to support Prometheus. Rook uses a storage overlay that runs within the Kubernetes cluster and configures the cluster nodes' available disk space as a giant pool of resources, which is in line with how Kubernetes handles CPU and memory resources.
Rather than synchronize data across multiple specific storage volumes or locations, Rook uses the Ceph distributed file system to stripe the data across multiple machines and clusters and to create multiple copies of data for high availability. That overcomes the data synchronization problem, and it avoids the need to unmount and remount external storage volumes.
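The idea of striping data across machines while keeping multiple copies can be pictured with a toy sketch. This is not Ceph's actual placement algorithm or Rook's code; the function names, the stripe size and the replica count of three are all hypothetical choices made for illustration, and a simple round-robin layout stands in for whatever placement logic a real system uses.

```python
from typing import Dict, List, Set, Tuple

Placement = Dict[str, List[Tuple[int, bytes]]]


def place_stripes(data: bytes, nodes: List[str],
                  stripe_size: int = 4, replicas: int = 3) -> Placement:
    """Cut data into stripes and copy each stripe to `replicas` distinct nodes.

    Round-robin placement is an assumption for illustration only.
    """
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement: Placement = {node: [] for node in nodes}
    stripes = [data[i:i + stripe_size] for i in range(0, len(data), stripe_size)]
    for index, stripe in enumerate(stripes):
        for r in range(replicas):
            # Consecutive nodes (wrapping around) hold the copies of one stripe.
            node = nodes[(index + r) % len(nodes)]
            placement[node].append((index, stripe))
    return placement


def read_back(placement: Placement, failed: Set[str]) -> bytes:
    """Reassemble the data using only nodes that are still alive."""
    # The stripe count is treated as metadata known for the whole pool.
    total = 1 + max((i for stripes in placement.values() for i, _ in stripes),
                    default=-1)
    recovered: Dict[int, bytes] = {}
    for node, stripes in placement.items():
        if node in failed:
            continue
        for index, stripe in stripes:
            recovered.setdefault(index, stripe)
    missing = [i for i in range(total) if i not in recovered]
    if missing:
        raise RuntimeError(f"stripes lost: {missing}")
    return b"".join(recovered[i] for i in range(total))
```

In this sketch, a pool of four nodes holding three copies of each stripe can lose any two nodes and still return the original bytes, which is the property that removes the need to move a volume along with a rescheduled container.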
"It's using existing cluster disk configurations that are already there, so nothing has to be mounted and unmounted," Chekrygin said. "You avoid external storage resources to begin with."
At HBO, a mounting and unmounting process that took up to an hour was reduced to two seconds, which was suitable for the Kubernetes monitoring system in Prometheus that scraped telemetry data from the cluster every 10 to 30 seconds.
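A rough back-of-the-envelope calculation shows why that difference matters for monitoring. The sketch below uses only the figures quoted above (up to an hour, two seconds, and a 10- to 30-second scrape interval) and makes the simplifying assumption that every scrape falling inside the remount window is lost; the function name is hypothetical.

```python
def missed_scrapes(downtime_seconds: float, scrape_interval_seconds: float) -> int:
    """Count whole scrape intervals that fit inside a storage outage.

    Assumes each scrape during the outage is lost, which is a simplification.
    """
    return int(downtime_seconds // scrape_interval_seconds)


for interval in (10, 30):
    hour_long = missed_scrapes(3600, interval)  # remount of up to an hour
    two_second = missed_scrapes(2, interval)    # remount of two seconds
    print(interval, hour_long, two_second)
# prints:
# 10 360 0
# 30 120 0
```

Under those assumptions, an hour-long remount would cost between 120 and 360 scrapes, while a two-second remount fits inside a single scrape interval.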
However, Rook never saw production use at HBO, which, by policy, doesn't put prerelease software into production. Instead, Chekrygin and his colleagues set up an external Prometheus instance that received a relay of monitoring data from an agent inside the Kubernetes cluster. That worked, but it required an extra network hop for data and made Prometheus management more complex.
"Kubernetes provides a lot of functionality out of the box, such as automatically restarting your Pod if your Pod dies, automatic scaling and service discovery," Chekrygin said. "If you run a service somewhere else, it's your responsibility on your own to do all those things."
Kubernetes storage in the spotlight
Illya Chekrygin, founding member, Upbound
The CNCF is aware of the difficulty organizations face when they try to run stateful applications on Kubernetes. As of this week, it owns the intellectual property and trademarks for Rook, which currently lists Quantum Corp. and Upbound, a startup in Seattle founded by Rook's creator, Bassam Tabbara, as contributors to its open source code. As an inception-level project, Rook isn't a sure thing; it's more akin to a bet on an early-stage idea. It has about a 50-50 chance of panning out, CNCF's Aniszczyk said.
Inception-level projects must update their presentations to the technical board once a year to continue as part of CNCF. From the inception level, projects may move to incubation, which means they've collected multiple corporate contributors and established a code of conduct and governance procedures, among other criteria. From incubation, projects then move to the graduated stage, although the CNCF has yet to even designate Kubernetes itself a graduated project. Kubernetes and Prometheus are expected to graduate this year, Aniszczyk said.
The upshot for container orchestration users is that Rook will be governed by the same rules and foundation as Kubernetes itself, rather than held hostage by a single for-profit company. The CNCF could potentially support more than one project similar to Rook, such as Red Hat's Gluster-based Container Native Storage Platform, and Aniszczyk said the companies behind such projects are welcome to present them to the CNCF technical board.
Another Kubernetes storage project that may find its way into the CNCF, and potentially complement Rook, was open-sourced by container storage software maker Portworx this week. The Storage Orchestrator Runtime for Kubernetes (STORK) uses the Kubernetes orchestrator to automate operations within storage layers such as Rook to respond to applications' needs. However, STORK needs more development before it is submitted to the CNCF, said Gou Rao, founder and CEO at Portworx, based in Los Altos, Calif.
Kubernetes storage seems like a worthy bet to Chekrygin, who left his three-year job with HBO this month to take a position as an engineer at Upbound.
"Kubernetes is ill-equipped to handle data storage persistence," he said. "I'm so convinced that this is the next frontier and the next biggest thing, I was willing to quit my job."