Suddenly, everybody is talking about machine learning, AI bots and deep learning. These technologies are showing up in new products that analyze "call home data," in cloud-hosted optimization services and even built into new storage arrays!
So what's really going on? Is this something brand new or just the maturation of ideas spawned out of decades-old artificial intelligence research? Does deep learning require conversion to some mystical new church to understand it, or did our computers suddenly get way smarter overnight? Should we sleep with a finger on the power-off button? But most importantly for IT folks, are advances in machine learning becoming accessible enough to readily apply to actual business problems -- or is it just another decade of hype?
There have been plenty of highly visible machine learning applications in the press recently, both positive and negative. Microsoft's Tay AI bot, designed to actively learn from 18- to 24-year-olds on Twitter, Kik and GroupMe, unsurprisingly achieved its goal. Within hours of going live, it became a badly behaved young adult, both learning and repeating hateful, misogynistic, racist speech. Google's AlphaGo beat a world champion at the game of Go by learning the best patterns of play from millions of past games, since the game can't be solved through brute force computation with all the CPU cycles in the universe. Meanwhile, Google's self-driving car hit a bus, albeit at slow speed. It clearly has more to learn about the way humans drive.
Before diving deeper, let me be clear: I have nothing but awe and respect for recent advances in machine learning. I've been directly and indirectly involved in applied AI and predictive modeling in various ways for most of my career. Although my current IT analyst work isn't yet very computationally informed, there are many people working hard to use computers to automatically identify and predict trends for both fun and profit. Machine learning represents the brightest opportunity to improve life on this planet -- today leveraging big data, tomorrow optimizing the Internet of Things (IoT).
Do machines really learn?
First, let's demystify machine learning a bit. Machine learning is about finding useful patterns inherent in a given historical data set. These patterns usually take the form of correlations between input values that you can observe and output values that you'd eventually like to predict. Although precise definitions depend on the textbook, a model is typically a particular algorithm whose parameters are tuned, or "learned," until it captures useful patterns in the data.
There are two broad kinds of machine learning: supervised and unsupervised. The difference is that supervised learning happens over data sets that have also captured the desired output value as training data, while unsupervised methods try to distill out inherent information, such as natural clusters, hierarchies or association rules. When a trained model is shown new input data, it can predict an output value, numeric or categorical, directly or as a probability.
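To make the distinction concrete, here is a minimal Python sketch rather than anything production-grade. The server metrics, the labels and the function names (`predict_nearest`, `two_means`) are all invented for illustration. The supervised half uses a nearest-neighbor lookup over labeled examples; the unsupervised half uses a bare-bones two-cluster k-means that never sees the labels.

```python
# Toy data, entirely made up: (hours_of_high_cpu, errors_logged) per server.
labeled = [
    ((1.0, 2.0), "healthy"),
    ((1.5, 1.0), "healthy"),
    ((8.0, 9.0), "failing"),
    ((9.0, 7.5), "failing"),
]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_nearest(training, point):
    """Supervised: return the label of the closest labeled training example."""
    _, label = min(training, key=lambda pair: squared_distance(pair[0], point))
    return label

def two_means(points, rounds=10):
    """Unsupervised: split unlabeled points into two clusters (k-means, k=2)."""
    centers = [min(points), max(points)]  # crude but deterministic starting centers
    for _ in range(rounds):
        groups = [[], []]
        for p in points:
            nearest = min((0, 1), key=lambda i: squared_distance(centers[i], p))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return groups

# Supervised: predict a category for a server the model has not seen.
print(predict_nearest(labeled, (7.5, 8.0)))  # → failing

# Unsupervised: the same points with labels stripped; clusters emerge on their own.
unlabeled = [features for features, _ in labeled]
print(two_means(unlabeled))
```

Note that the clustering step recovers two groups but has no idea what to call them. Attaching meaning to the clusters is still up to a person.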
This all seems great, but machine learning AI robots haven't taken over the world just yet. Why not? Well, one of the things to keep in mind is that if you set out to look for a pattern in a given data set, you will find one. This tendency to find patterns may be the basis for human intelligence -- being able to generalize usefully predictive patterns out of things we've observed. But perhaps the key problem with this type of pattern-finding intelligence, both with real humans and AI computers, is judging whether the patterns we inevitably find are meaningful, true and useful.
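It is easy to see this "you will always find a pattern" effect for yourself. The following Python sketch is a toy experiment under assumed settings (20 samples, 200 candidate inputs, all drawn as independent random noise; the function names are hypothetical). By construction nothing in the data is related to anything else, yet searching across enough inputs reliably turns up one that looks meaningfully correlated with the target.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def strongest_noise_correlation(n_samples=20, n_features=200, seed=0):
    """Correlate many pure-noise inputs with a pure-noise target; return the strongest."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    correlations = []
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        correlations.append(pearson(feature, target))
    return max(correlations, key=abs), correlations

best, all_correlations = strongest_noise_correlation()
print(f"strongest correlation found in pure noise: {best:+.2f}")
```

The winning "pattern" here is an artifact of looking in many places at once, which is exactly why a found correlation needs to be judged before it is trusted.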
A black eye for machine learning, AI
The broader field of AI has long strived to create some human equivalent in judging what is meaningful and true, but this is a hard, if not impossible, problem that has given AI-labeled research a black eye for decades. The more practical machine learning field generally only focuses on the useful part, which is -- well, actually very useful. But keep in mind that machine learning tools are not necessarily finding meaning or truth in data. Correlation is not causality, but finding the absolute scientific truth doesn't matter as much as building a generally effective model.
Useful means creating a model that predicts a future unknown outcome with enough accuracy given some newly observed input data. Previously, I've described some ways to evaluate how good a given model is at solving a specific problem. Here, I'll just restate that to be effective, you must have a solid understanding of the full problem you are trying to solve and be very clear up front about what you actually need to predict. These are good reasons why data scientists are in high demand. Just about anyone can use automated cloud-hosted machine learning services (e.g., in Microsoft Azure, Amazon Web Services, Google Cloud Platform) to quickly generate seemingly accurate models from any available data set, but it takes an experienced data scientist to truly create, understand and validate models that optimally solve the actual predictive problem.
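One basic validation habit is to score a model on data it never saw during training. The Python sketch below is only an illustration under made-up assumptions: a synthetic data set where 30% of the labels are deliberately flipped, and a "model" that just memorizes its training data (hypothetical helpers `make_data`, `nearest_neighbor_predict` and `accuracy`). It shows how a model can look perfect on its own training data while doing far worse on new observations.

```python
import random

def make_data(n, rng, noise=0.3):
    """Label is 1 when x > 0.5, but each label is flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data

def nearest_neighbor_predict(training, x):
    """A model that simply memorizes: copy the label of the closest training point."""
    return min(training, key=lambda pair: abs(pair[0] - x))[1]

def accuracy(training, evaluation):
    """Fraction of evaluation points whose predicted label matches the recorded one."""
    hits = sum(nearest_neighbor_predict(training, x) == y for x, y in evaluation)
    return hits / len(evaluation)

rng = random.Random(42)
train = make_data(200, rng)    # data used to build the model
holdout = make_data(200, rng)  # fresh data from the same process, never used in training

train_accuracy = accuracy(train, train)
holdout_accuracy = accuracy(train, holdout)
print(f"accuracy on data the model has seen:     {train_accuracy:.2f}")
print(f"accuracy on data the model has not seen: {holdout_accuracy:.2f}")
```

The training score is perfect only because the model has memorized the noise along with the signal; the holdout score is the more honest estimate of how useful the model would be in practice.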
Faced with ever-bigger data sets these days, some data scientists suggest that we stop striving so hard for artificial intelligence. With big data and IoT, we could eventually capture every data point, and, with big data methods, produce dynamic intelligence based on a complete historical data trail -- making the best current guess of what is going on now, given everything that has happened. But I'm not sure we are there just yet, and I'm also a fan of Nassim Nicholas Taleb's The Black Swan, which cautions the reader that models trained on past data can't predict rare, high-impact events that have no precedent in that data.
Machine learning, AI impact IT
Full-scale artificial intelligence-driven machine learning tools aren't ready just yet, but data center infrastructure management (DCIM) tools provide IT teams with a way of collecting and making sense of data from across an IT infrastructure and facilities. Some experts propose that future DCIM tools will use machine learning not only to make sense of the massive amounts of data that modern, complex infrastructures create, but also to help find and fix data center equipment failures before they happen. Machine learning tools will also start to power predictive analytics that may help IT ops teams with infrastructure monitoring and management, offering predictive models for processes and strategic decisions.