IT generates a lot of operational data. Could big data make it more meaningful?
There's nothing quite like managing an environment where the concept of real time is measured by the infinitesimally small standards of electronics. In any one of those instants, things can go from smooth sailing to disaster. And, always, there is data to be moved and managed and data -- lots of it -- about how the data center itself is behaving.
Into that challenging environment comes a new idea: applying new styles of processing and analysis borrowed from the world of big data technologies (e.g., Hadoop and NoSQL databases such as Cassandra) and business analytics to help decision makers better master the challenges of IT management.
Indeed, applying big data concepts to the reams of data created by IT operations tools allows IT management software vendors to address a wide range of business decisions, said Brett Sheppard, director for big data at Splunk, which provides log data intelligence software. IT systems, applications and technology infrastructure generate data every second of every day. All that raw, unstructured or polystructured data represents a categorical record of "all user behaviors, service levels, cybersecurity risks, fraudulent activities and more," he said.
However, Web servers, applications and network devices -- all of the technology infrastructure running an enterprise or organization -- generate massive streams of data in such an array of unpredictable formats that it can be difficult to apply IT analytics using traditional methods, or to do so in a timely manner. Yet, Sheppard noted, this machine data is valuable.
With big data analytics offerings aimed at IT management, companies can combine real-time analysis of streaming data with correlation and analysis of terabytes of historical data to detect patterns that can, in turn, help predict and prevent future outages or performance issues. Furthermore, they can use big data to understand usage patterns and geographical trends and gain insights about their heaviest users. They can also track and record Web activity and identify business impacts; accelerate profitable growth with insights into service utilization; and gather data across multiple systems to develop an IT services catalog.
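To make the idea of checking live data against a historical baseline concrete, here is a minimal sketch in Python. It is an illustration only, not how any vendor mentioned in this article works: the function name `flag_anomalies`, the three-standard-deviation threshold and the sample response times are all assumptions made for the example.

```python
from statistics import mean, stdev


def flag_anomalies(history, live, threshold=3.0):
    """Return (index, value) pairs from `live` that sit more than
    `threshold` standard deviations away from the historical mean.

    Hypothetical helper for illustration; real products use far
    richer models than a single mean and standard deviation.
    """
    baseline = mean(history)
    spread = stdev(history)  # sample standard deviation of the history
    if spread == 0:
        # A perfectly flat history: any different value is unusual.
        return [(i, v) for i, v in enumerate(live) if v != baseline]
    return [
        (i, v)
        for i, v in enumerate(live)
        if abs(v - baseline) / spread > threshold
    ]


# Assumed sample data: response times in milliseconds.
history = [100, 102, 98, 101, 99, 100, 103, 97]
live = [101, 99, 180, 100]

print(flag_anomalies(history, live))  # prints [(2, 180)]
```

In this toy data the historical mean is 100 ms with a standard deviation of 2 ms, so the 180 ms reading stands out while the others pass. A production system would maintain the baseline continuously and across many metrics at once, which is where the scale of big data tooling comes in.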
At a bottom-line level, those activities can provide transparency for appropriate cost tracking and chargeback. In short, big data approaches have the potential to be game-changing.
Old methods haven't kept up
Torsten Volk, research director for system management at analyst firm Enterprise Management Associates, also sees great potential in big data for IT. As IT environments grow in complexity, scale and heterogeneity, they generate more and more metadata, forcing many organizations to take a purely reactive approach to IT management.
For example, with thousands of virtual machines to monitor, it can be difficult to even determine what you need to know when you put a new application into an environment in terms of security, performance or compliance, Volk noted. Even 'simple' things like assessing how many virtual machines ought to run on a single physical machine can be hard to determine using existing tools.
Taking a new big data approach to IT analytics can provide insights not readily achievable with traditional monitoring and management tools, Volk said. CloudPhysics, for instance, is one company taking this approach. "It is like an internal search engine in the mold of Google that can correlate internal metrics and external data, too," he explained.
Volk said the analysis is not necessarily all focused on unstructured data, but it can correlate actions that might not be expected to have a correlation. For example, particularly with cloud resources, it can be difficult to anticipate how applications and data movement will affect each other. CloudPhysics allows cross-checking of logs and other indicators in real time to anticipate those effects.
This new approach is "leading edge, not bleeding edge," Volk said. Its value to an organization will depend on the maturity and complexity of a given data center. Small and medium-sized businesses and organizations without much complexity will benefit, he said, "but companies with large and heterogeneous data centers will benefit even more."
For example, organizations with middleware and lots of platforms, with a tendency to add or change applications frequently, will see the most benefit. "The more dynamic the provisioning capability and the more heterogeneity, the more you will need tools like this," Volk said.
Possessing these types of capabilities will become increasingly important, he said, because organizations can't just continually add more staff. Although there are system manufacturers trying to reduce complexity by converging infrastructure, ultimately "you are fighting windmills, because your business and your developers will be coming up with new stuff all the time, so you will be challenged one way or the other," Volk said.
CloudPhysics CEO John Blumenthal said that for his company, the longer-term focus is on "collecting all the world's VMware operational data and aggregating it into a cloud service" that can support analysis or simulations -- so that users can then mine it for insights that they couldn't get anywhere else.
The inspiration for CloudPhysics came, in part, from the practices of hyperscale IT operations, such as Google and Facebook, Blumenthal said. Those organizations possess relatively homogeneous infrastructure that has been heavily instrumented and studied, with massive deposits of data about operations available for analysis. For customers of CloudPhysics, his hope is to capture data about virtualized infrastructures and then apply IT analytics to glean insights that were not previously available.
Blumenthal said one of CloudPhysics' primary insights is that "storage is the root of all evil." Storage tends to be the least virtualized element in virtual environments and is often the origin of performance problems, Blumenthal said. Because of a paucity of the right tools and a "division of labor that has some people focusing on security and some on virtualization, but often no one in particular focusing on storage, you rarely get to root causes," he said.
New take on an old problem
At the same time, the hype around big data is putting a fresh spin on established technologies.
"This whole idea of big data analytics, especially in IT operations, isn't necessarily something that we invented but we have been doing it for a while," said Graham Gillen, marketing director at Netuitive, which provides IT predictive analytics software. "Gartner has been looking at this topic and basically they have said that if IT were starting fresh today they would throw out the old approaches and develop a fresh IT operational analytics platform," he said.
In Gillen's view, monitoring and metrics got short shrift in the past, with lots of "eyeballing" and "guesstimating." The need to do more was, for the most part, not anticipated. But with growing complexity and interdependence, tracking issues like user response time on a website -- potentially driven by things deep in the infrastructure -- has now become mission critical and has "hit a hockey stick" in terms of its perceived importance, according to Gillen.
"Now IT management isn't just asking if a server is up or down but how an application is running and is it ready for something like Black Friday. They are interested in how response time affects business results," Gillen said.
And, for the first time, IT is looking at what it can learn from the business intelligence and business activity monitoring space. "They now understand that they have been like the blind men each trying to describe the elephant," Gillen said. "What has been missing is a holistic picture of the entire operation."