Machine learning and artificial intelligence seem to be the hot topics these days. From bots that can interact with people to systems that learn and grow as they process more data, it seems that science fiction is becoming reality, at least in limited ways. Autonomous cars, perhaps the highest-profile example, are advancing and being tested in a few locations around the world, but I think we are a long way from having human-controlled and autonomous cars interacting freely at any scale. There are still plenty of issues to work out, and the consequences of mistakes require serious consideration.
I was thinking of this as I read an interesting question: Whose black box do you trust? It's a look at algorithms and machine learning, and the impact they have on the world around us, despite many of us not understanding how they work. The main examples in the piece are in the area of journalism as it relates to social media (primarily Google and Facebook), but it also touches on autonomous vehicles, both cars and planes. The latter was a bit of a shock to me, as I assumed humans always handled takeoff and landing, something the author says doesn't happen at SFO. Some searching around pilot sites suggests that automated landing is done regularly to test systems, but is used in only a minority of cases.
The question is, do we trust the black boxes that run our systems, and really, does it matter? In the piece, Tim O’Reilly says he has four tests for trusting an algorithm:
- the outcome is clear
- success is measurable
- the goals of the creators align with the goals of the consumer
- the algorithm leads creators and users to better long-term decisions
Those are interesting ways to evaluate a system, though I think the problem is that the last two are a bit nebulous. One of the things that I see more and more as I get older is that the same data or the same facts can lead two different people (or groups) to two different results. Our goals, our interpretation of events, even the weights we place on the various factors in a complex system vary dramatically from person to person. In such a world, can we truly evaluate what the goals of a creator are? Forget about consumers; assume a single person building a software system. They will have multiple goals, and do we really think those goals can be easily listed, or weighted and ranked appropriately? What about when the goals change?
I really think that the black boxes need more disclosure, though I freely admit there isn't a good way I know of to do this. However, I do know one thing that can be better disclosed: data. We can have more openness and analysis of data from software systems, along with some accountability by creators for the impacts of their software. Again, I don't know how to enforce accountability, especially at scales that encompass millions of consumers and easily cross country borders. That is a problem I think we need to find ways to tackle, at least at some manageable level. Perhaps we apply the 80/20 rule, calling the outcome acceptable when 80% of consumers and creators find it to be a good one.
The world of technology and software is advancing and growing extremely quickly. Hardware certainly advances, but it seems the last 5-10 years have been more about new and different software applications that fundamentally alter the way humans interact in social, business, and government situations. Underpinning all these changes is data: new data, more data, and novel ways of working with it that were unheard of 20 years ago. It's an amazing time to be a data professional.
The Voice of the DBA Podcast