Small World Big Data
Published: 15 Jun 2017
Algorithms control our lives in many and increasingly mysterious ways. While machine learning algorithms change IT, you might be surprised at the algorithms at work in your nondigital life as well.
When I pull a little numbered ticket at the local deli counter, I know with some certainty that I'll eventually get served. That's a queuing algorithm in action -- it preserves the expected first-in, first-out ordering of the line. Although wait times vary, it delivers a predictable average latency to all shoppers.
Now compare that to when I buy a ticket for the lottery. I'm taking a big chance on a random-draw algorithm, which is quite unlikely to ever go my way. Winning is not only uncertain, but improbable. Still, for many folks, the purchase of a lottery ticket delivers a temporary emotional salve, so there is some economic utility -- as you might have heard in Economics 101.
People can respond well both to algorithms that offer guaranteed certainty and to those built on arbitrary randomness, as long as each is used in the appropriate situation. But imagine flipping those scenarios. What if your deli only randomly selected people to serve? With enough competing shoppers, you might never get your sliced bologna. What if the lottery just ended up paying everyone back their ticket price minus some administrative tax? Even though this would improve almost everyone's actual lottery return on investment, that kind of game would be no fun at all.
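To make the flipped deli scenario concrete, here is a minimal Python sketch that compares a numbered-ticket line with a counter that picks the next shopper at random. Everything in it is an assumption made for illustration: the function name simulate_deli, the starting line of 10 shoppers, and the rule that one shopper arrives and one is served on every tick. It is a toy model, not a description of any real deli or queuing system.

```python
import random


def simulate_deli(policy, ticks=10_000, initial_line=10, seed=0):
    """Return the wait, in ticks, of every shopper who gets served.

    Hypothetical setup: `initial_line` shoppers are already waiting, and on
    every tick one new shopper arrives and exactly one shopper is served.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    # Each entry is (shopper id, tick on which the shopper arrived).
    line = [(shopper, 0) for shopper in range(initial_line)]
    next_id = initial_line
    waits = []
    for tick in range(1, ticks + 1):
        line.append((next_id, tick))  # a new shopper joins the back of the line
        next_id += 1
        if policy == "fifo":
            index = 0  # numbered tickets: serve whoever arrived first
        elif policy == "random":
            index = rng.randrange(len(line))  # serve anyone, chosen uniformly
        else:
            raise ValueError(f"unknown policy: {policy!r}")
        _, arrived = line.pop(index)
        waits.append(tick - arrived)
    return waits


fifo_waits = simulate_deli("fifo")
random_waits = simulate_deli("random")
print("FIFO longest wait:  ", max(fifo_waits))
print("Random longest wait:", max(random_waits))
```

In this setup, the numbered-ticket line never makes anyone wait longer than 10 ticks, which is the length of the starting line. The random counter serves just as many shoppers per tick, but it has no such ceiling, so a few unlucky shoppers end up waiting far longer than anyone does under first-in, first-out.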
Without getting deep into psychology or behavioral economics, there are clearly appropriate and inappropriate uses of randomization. When we know we are taking a long-shot chance at a big upside, we might grumble if we lose. But our reactions are different when the department of motor vehicles closes after we've already spent four hours waiting.
Now imagine being subjected to opaque algorithms in various important facets of your life, as when applying for a mortgage, a car loan, a job or school admission. Many of the algorithms that govern your fate are seemingly arbitrary. Without transparency, it's hard to know whether any of them are actually fair, much less to predict your individual prospects. (Consider the fairness concept the next time an airline randomly bumps you from a flight.)
Machine learning algorithms overview -- machines learn what?
So let's consider the supposedly smarter algorithms designed at some organizational level to be fair. Perhaps they're based on some hard, rational logic leading to an unbiased and random draw, or more likely on some fancy but operationally opaque big data-based machine learning algorithm.
With machine learning, we hope things will be better, but they can also get much worse. In too many cases, poorly trained or designed machine learning algorithms end up making prejudicial decisions that can unfairly affect individuals.
This is a growing -- and significant -- problem for all of us. Machine learning is influencing a lot of the important decisions made about us and is steering more and more of our economy. It has crept in behind the scenes as so-called secret sauce or as proprietary algorithms applied to key operations.
But with easy-to-use big data machine learning tools like Apache Spark and the increasing streams of data from the internet of things wrapping all around us, I expect that every data-driven task will be optimized with machine learning in some important way. Soon enough, machine learning algorithms will be part of the checklist for almost every application.
Optimization is by definition good, and machine learning offers a way to optimize almost anything better and faster than before. I'm not exaggerating when I predict that machine learning will touch every facet of human existence. We could be on the edge of a true information-driven renaissance.
But we need to be cautious. I don't believe that artificial intelligence will soon emerge and take over the universe, but the reckless application of naïvely built machine learning algorithms is already happening. Examples include financial-trading fiascos, racist loan denials and prison-parole injustices.
What can't be foreseen
Some of the biggest societal problems with the rise of ready-to-roll machine learning include the following:
- Unintended consequences. These are likely when we optimize for only specific parameters -- or without truly relevant data. When you train a machine learning algorithm to optimize one thing and it gets really good at that, you are most likely going to lessen the optimization of something else.
- Inevitable replacement of human operators. At first, machine learning will accelerate and enhance the work of fewer, more expert people and reduce the need for less-skilled operators to support them. Then, once the machine learning is deemed mature enough, we can expect those experts to be replaced in turn by less-skilled -- i.e., cheaper -- workers. Some people will say this frees up human capital to do better things or produces more leisure time. But so far, such a machine learning trickle-down theory doesn't seem to have worked out any better than the disproven trickle-down economic theory.
- Blind reliance on patterns discovered by machine learning. There has been plenty of reporting -- for example, in the book Weapons of Math Destruction -- about how some people have naïvely applied machine learning algorithms to propagate inherent racial, gender and economic prejudices. We simply can't afford to build opaque algorithms that will be used as an excuse to ignore injustice and inequity. If you ever hear someone justify an action or decision because "the algorithm made me do it," without offering any data transparency or actual basis of reasoning, then something is likely fishy.
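As one small illustration of what data transparency could look like, the Python sketch below compares how often an algorithm approves applicants from different groups. The helper approval_rates, the group labels "A" and "B" and the toy records are all invented for this example; they are not real data, and a gap in approval rates is only a prompt to ask for the reasoning behind the decisions, not proof of unfairness in itself.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Map each group to the share of its applicants that were approved.

    `decisions` is an iterable of (group, approved) pairs, where `approved`
    is True or False.
    """
    totals = defaultdict(int)
    approved_counts = defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        approved_counts[group] += int(approved)
    return {group: approved_counts[group] / totals[group] for group in totals}


# Invented toy records: 10 applicants in each of two hypothetical groups.
toy_decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 4 + [("B", False)] * 6
)

rates = approval_rates(toy_decisions)
# Ratio of the lowest approval rate to the highest; 1.0 would mean equal rates.
ratio = min(rates.values()) / max(rates.values())
print(rates, ratio)
```

With these made-up records, group A is approved 80% of the time and group B 40% of the time, so the ratio is 0.5. A check like this doesn't explain why the rates differ, but it is the kind of basic evidence to ask for before accepting "the algorithm made me do it."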
We often talk about how supposedly blind and fair machine learning algorithms will help the world design more optimal and efficient processes. In practice, algorithms can be both mysterious and malevolent to those they affect most. We need solid data science to go hand-in-hand not just with business acumen and ethics training, but also some deeper study of unintended consequences, perverse incentives and irrational decision-making (or Behavioral Econ 101).
As we continue to scale up big data sources and deploy web-scale machine learning algorithms, those of us with the big powers of processing simply can't forget that there are consequences for actual people -- ourselves included.