The Blog of Brian Stanton

Good Robot, Bad Robot: The Challenge of Building a Moral Machine


The human brain is often compared, somewhat crudely, to a supercomputer. If we buy this analogy for a moment, human beliefs can be pictured as lines of computer code - vast strings of data that govern our thoughts and behaviors in the real world.


Our beliefs inform what we value, and these values in turn guide our morality. Killing is wrong, morally speaking, because we value human life. And we value human life (at least in part) because we believe that this life - a mere blip on the geological time scale - is all we have. We might lose sight of this, of course, in the midst of rush hour traffic with an F-150 glued to our bumper.


Yet my focus here will not be on humans. Questions of hominid values are as old as philosophy itself, and throngs of scholars have covered this ground with varying degrees of skill. Instead, I want to explore how these ancient questions might apply to minds made of silicon.


Already these minds are among us. Driverless cars, for instance, must navigate true moral quandaries on the road. Even a simple hypothetical - should the car sacrifice its human payload to save a group of bystanders? - makes one squirm. In fact, when people were asked this question in the abstract, they said yes. Yes, the onboard computer should save the maximum number of lives, even at the expense of the passenger's own. But when asked to imagine it was their car, people weren't so utilitarian. Apparently, we don't want our vehicle hurtling off a cliff to save a few strangers.


As computers approach and surpass human-level intelligence, the questions don’t get any easier. Is it ethical to create and destroy beings that are, by all behavioral measures, sentient? What rights should they have? And how can we trust, much less control, a superintelligent AI?* The HBO series Westworld depicts a future in which the creators steam past all of these questions. 


Practically speaking, the ability to trust and control a superintelligent AI will depend on the goals, beliefs, and values of the artificial being - a being that, at some point, will be orders of magnitude smarter than the smartest person in human history. Homo sapiens may not get the respect we desire.


“Just think about how we relate to ants,” warned author and neuroscientist Sam Harris at a recent TED conference. “We don't hate them. We don't go out of our way to harm them. In fact, sometimes we take pains not to harm them. We step over them on the sidewalk. But whenever their presence seriously conflicts with one of our goals, let's say when constructing a building like this one, we annihilate them without a qualm. The concern is that we will one day build machines that, whether they're conscious or not, could treat us with similar disregard.”


Humanity must somehow prevent this from happening. As always, the how is the hard part. We know, more or less, what we want. We want an AI (assume there’s just one) that promotes human flourishing. If we get the specifics wrong, however, this seemingly benign objective could backfire in sinister ways. For example, the AI might decide to maximize human flourishing by replacing our brains with blissful goo. And it might not ask for permission first.


We want an AI with the moral compass to avoid this scenario. We want it to be good in ways that preserve our lives and well-being. But since we can't agree on what human goodness is (much less understand the mechanisms underlying it), the value-loading problem** - the challenge of programming AI morality - is a difficult problem indeed. Perhaps flesh-and-blood ethics are merely starting points for discussion:


“It may not be necessary,” writes Oxford philosopher Nick Bostrom, “to give the AI exactly the same evaluative dispositions as a biological human. That may not even be desirable as an aim. Human nature after all is flawed, and all too often reveals a proclivity to evil, which would be intolerable in any system poised to obtain a decisive strategic advantage. Better, perhaps, to aim for a motivation system that departs from the human norm in systematic ways [...]”


An example will help drive Bostrom’s point home. Imagine dialing up the intelligence of any human - let’s say, an especially lovely human - 1,000,000 times. Could she be trusted with this preternatural power? On a good day, she might cure Alzheimer's. But if she awoke one morning in a mood, all of civilization might suffer for it. In dealing with a creature this fantastically smart, we’d need to know she’d do the right thing. And not just some of the time.


The question of how, exactly, to achieve this in silico remains open. “We need something like a Manhattan Project on the topic of artificial intelligence,” said Harris. “Not to build it, because I think we'll inevitably do that, but to understand how to avoid an arms race and to build it in a way that is aligned with our interests.”


In theory, we could build a perfectly moral machine. Whether or not this will happen, we'll find out soon enough.

* "Superintelligent" is Nick Bostrom's term (see citation).
** "The value-loading problem" is Nick Bostrom's term (see citation).


Print sources
Bostrom, N. (2016). Superintelligence: Paths, dangers, strategies. Oxford: Oxford University Press.
