Manage disk image files wisely in the face of DevOps sprawl

There are strategies to prevent obsolete disk images from clogging up valuable space, especially in a rapid-change DevOps shop. However, not all VMs will benefit from the same plan.

A disk image is simply a file, but that seemingly innocuous file contains a complete structure that represents applications, storage volumes and even entire disk drives.

The concept of disk image files evolved from cloned backup images to a complete application instance as it exists and runs in a system's memory space -- a VM. Every time the IT team deploys a new application or takes a snapshot to protect a VM's last state, it adds another disk image file to its inventory.

Disk images are fast and easy to create, but they're not free and they cause administrative problems. Each image file, such as a .vhdx or .vmdk file, requires storage space. Virtualization or IT administrators must constantly manage disk image files to ensure each image is appropriate and relevant to the enterprise. They must archive or delete obsolete or unneeded disk image files to ease management and recover costly storage space.

These problems are multiplied in a modern DevOps environment. DevOps teams rely on continuous delivery models to speed development and experiment with creative and competitive capabilities that could yield the business an advantage. Continuous delivery of new software iterations results in a raft of new VMs that IT operations teams or DevOps team members deploy, track and protect. If left unmanaged, those disk image files can choke valuable storage and confuse busy IT professionals.

VM disk image file management depends upon retention, protection and deployment policies, as well as effective tool use, to counteract sprawl.

Identify and implement retention policies

One of the biggest oversights developers and IT operations staff make is neglecting the importance of lifecycles for software iterations and the VMs that host them. It's not necessary, desirable or even practical to retain all content forever, and there is little value in keeping disk image files for software iterations that become useless and obsolete thanks to continuous development.

New disk image files enter production and then are eventually displaced by new iterations, but the older file versions are never really removed. Even worse, administrators have almost no insight into the relative importance of each disk image to the business. Over time, the number of disk image files proliferates because there is no mechanism in place to constrain or remove unneeded disk images.

The fight against disk image sprawl usually starts with a well-conceived retention policy agreed upon by developers, operations staff and business leaders. Such a policy covers how to submit new instances to operations, how to provision resources to new iterations, how long to maintain and support each iteration, and how to decommission and remove expired iterations. A sound retention policy not only improves IT operations but also contributes to proper business governance and compliance. Make an effort to codify lifecycle and retention policies into the lifecycle management platform implemented as part of the IT administrator's tool set.

One size does not fit all. Different VMs can and usually do have radically different lifecycle and retention policies. Consider implementing one policy for new software releases coming from developers and another policy for enterprise applications such as Microsoft SQL Server and Exchange, for example.
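
As a rough illustration of per-workload retention, the sketch below checks whether a superseded disk image has outlived its retention period. The workload classes, the retention periods and the `DiskImage` record are all hypothetical; in practice they would come from the policy the organization agrees on and from whatever lifecycle management platform it uses.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical retention periods per workload class, in days.
RETENTION_DAYS = {
    "dev-build": 30,        # short-lived iterations from developers
    "enterprise-app": 365,  # long-lived enterprise applications
}


@dataclass
class DiskImage:
    name: str
    workload_class: str
    superseded_on: Optional[date]  # None while the image is still current


def is_expired(image: DiskImage, today: date) -> bool:
    """Return True when a superseded image has outlived its retention period."""
    if image.superseded_on is None:
        return False  # the current iteration is never a removal candidate
    limit = timedelta(days=RETENTION_DAYS[image.workload_class])
    return today - image.superseded_on > limit
```

An image flagged this way is only a candidate for archiving or removal; the decision itself still belongs to the agreed policy.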

Use tools to identify idle instances

Time, such as a lifecycle period, is not always the sole measure of a disk image file's value. VM activity or utilization can better indicate an instance's value to the business. Application performance management and monitoring tools track VM usage and find instances that may have slipped through deployment without a proper lifecycle policy.

Performance metrics provide feedback -- a tenet of the DevOps methodology -- for future development work and deployments. Tools such as Veeam ONE offer VM performance monitoring, alerting, optimization and configuration tracking. Performance metrics reveal underutilized or unused VMs, allowing administrators to manually decommission those VMs and remove their disk image files, along with extraneous or miscataloged file copies that needlessly consume storage and compute resources.
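
The idea of flagging idle instances from utilization data can be sketched in a few lines. This example assumes the readings have already been exported from a monitoring tool as simple (CPU percent, network KBps) pairs per VM; it does not call any vendor API, and the thresholds and the `find_idle_vms` function are hypothetical.

```python
from statistics import mean

# Hypothetical thresholds; tune them to the environment.
CPU_IDLE_PCT = 5.0
NET_IDLE_KBPS = 10.0


def find_idle_vms(samples: dict) -> list:
    """Return names of VMs whose average CPU and network use are both low.

    samples maps a VM name to a list of (cpu_percent, network_kbps) readings.
    """
    idle = []
    for vm, readings in samples.items():
        if not readings:
            continue  # no data: do not guess whether the VM is idle
        avg_cpu = mean(cpu for cpu, _ in readings)
        avg_net = mean(net for _, net in readings)
        if avg_cpu < CPU_IDLE_PCT and avg_net < NET_IDLE_KBPS:
            idle.append(vm)
    return sorted(idle)
```

A VM on this list is a prompt for review rather than an automatic deletion; some workloads are legitimately quiet for long stretches.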

Detach backups and snapshots

VMs are included in snapshot and backup regimens to periodically capture system states and save those memory images to files on disk to protect workloads and data. Multiple snapshots are periodically rolled into larger backup jobs, eventually resulting in myriad copies of workload image states in storage long after the actual disk image file is obsolete.

Multiple levels of disk image protection suit established enterprise-class workloads, but the disk images produced for testing and development tasks can often forgo such copious protection. Just recreate the test workload from its original disk image rather than attempting to create or restore various snapshot or backup copies. Even mitigating the movement of disk image files around the storage environment can reduce the potential for failed copies that result in broken files -- VM fragmentation -- and wasted storage.

An IT organization that strategically reduces and streamlines the data protection processes for temporary DevOps tasks can save time, simplify operations management and reduce storage demands. As needed, disk image files can be manually copied directly to archival storage to protect one iteration of the VM.

Identify and implement deployment policies

The close interaction between developers and IT operations staff in DevOps shops isn't a gimmick -- operations needs a clear picture of the frequency and scope of each new build coming down the line to plan adequate capacity, provision necessary resources in a timely manner, and accommodate current and older builds still in the infrastructure.

Lifecycle and retention policies help define these issues, but lifecycle policies don't replace day-to-day collaboration. For example, developers need a clear service-level agreement from operations that outlines how long it will take to deploy a new build for testing, while operations should enforce established limits for developer resource usage. Well-defined deployment guidelines and strict policy enforcement will limit extraneous use of compute resources and lower the overall volume of content for operations to manage.

Another best practice is to follow sound policies when cataloging and managing the components of each new deployment. For example, commonsense naming conventions for each build component -- VM disk images, data files, folders -- can make content easier to find and associate with responsible departments or projects, preventing sprawl. To further limit disk image proliferation, regular inventory audits should identify obsolete, unused, miscataloged or expired components as candidates for archiving or removal.
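
A naming convention pays off most when an audit can check it automatically. The sketch below assumes a hypothetical convention of `<department>-<project>-b<build number>.<vhdx|vmdk>`; the pattern and both functions are illustrative only. It separates files that follow the convention from miscataloged ones, then lists older builds as candidates for archiving or removal.

```python
import re

# Hypothetical convention: <department>-<project>-b<build number>.<vhdx|vmdk>
NAME_PATTERN = re.compile(
    r"^(?P<dept>[a-z]+)-(?P<project>[a-z0-9]+)-b(?P<build>\d+)\.(?:vhdx|vmdk)$"
)


def audit_names(filenames):
    """Split filenames into cataloged builds and miscataloged names."""
    cataloged, miscataloged = {}, []
    for name in filenames:
        match = NAME_PATTERN.match(name)
        if match is None:
            miscataloged.append(name)  # does not follow the convention
            continue
        key = (match["dept"], match["project"])
        cataloged.setdefault(key, []).append(int(match["build"]))
    return cataloged, miscataloged


def superseded_builds(cataloged, keep=2):
    """List (dept, project, build) for all but the newest `keep` builds."""
    result = []
    for (dept, project), builds in sorted(cataloged.items()):
        ordered = sorted(builds)
        cutoff = max(len(ordered) - keep, 0)
        for build in ordered[:cutoff]:
            result.append((dept, project, build))
    return result
```

Because the department and project are encoded in the name, every candidate the audit produces can be traced to an owner before anything is archived or deleted.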

Track and manage supporting content

A constant influx of new builds from busy software developers fuels disk image sprawl, but new build images are not the only source of sprawl. Most VMs work in concert with configuration files, databases and other content. No operations professional would spin up a new disk image and connect it to actual production content -- they test new builds with dummy or duplicate content instanced from production. During testing, the build produces detailed test and log data that developers use to troubleshoot bugs, improve performance and enhance future builds.

All of this companion content causes another, oft-overlooked, problem: As disk images proliferate, copies of the databases and other content used by those builds pile up. Add in backups, snapshots and other forms of data protection, and storage comes under heavy strain. When administrators focus on tracking and managing disk images, they also must account for any companion testing and validation content and detail the policies and processes for removing that content when the build goes live.
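
One way to keep companion content from being forgotten is to record it alongside the build it belongs to. The following is a minimal sketch under that assumption; the `Build` record, its fields and the example paths are hypothetical. It lists the test databases, logs and similar content that become removal candidates once a build goes live, along with the space they occupy.

```python
from dataclasses import dataclass, field


@dataclass
class Build:
    name: str
    live: bool
    # Companion content (test database copies, logs and so on): path -> bytes
    companions: dict = field(default_factory=dict)


def cleanup_candidates(builds):
    """Return (paths, total_bytes) of companion content for live builds."""
    paths, total = [], 0
    for build in builds:
        if not build.live:
            continue  # content for builds still in testing is still needed
        for path, size in sorted(build.companions.items()):
            paths.append(path)
            total += size
    return paths, total
```

Reporting the total alongside the paths makes the storage cost of leftover test content visible in the same audit that covers the disk images themselves.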

No end to sprawl

Management tools, sound deployment policies and regular usage audits mitigate wasted resources, but development practices are still changing. Just as virtualization spawned VM disk image sprawl, the recent renaissance in container-based virtualization technology promises to complicate development, deployment and management even further. Containers change the software development paradigm by breaking traditionally monolithic projects into independent functional components that communicate across the network through application programming interfaces. Tomorrow's software builds may involve countless containers for operations staff to provision and manage. Now is the time to consider sprawl, evaluate its impact on your infrastructure and take proactive measures to combat it.
