Measuring the artificial intelligence quotient

Big Data Evangelist, IBM

If we’re going to do the Turing Test right, we might as well use the same intelligence tests we apply to humans. Only by doing so will we ever have a sound basis for claiming that machines are more (or less) intelligent than people.

What intelligence do we measure?

Classic intelligence tests measure something called “intelligence quotient,” a term that, as noted in this article, was coined a century ago by German psychologist William Stern. As the article states, the foundation of IQ testing is to gauge human pattern-matching aptitudes in the following three areas:

  • Logical: identifying patterns in sequences of concepts
  • Mathematical: identifying patterns in sequences of numbers
  • Linguistic: identifying patterns in words, primarily focused on semantic patterns such as analogies, classifications, synonyms and antonyms
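To make the "mathematical" category concrete, here is a minimal sketch of the kind of number-sequence question an IQ test poses, assuming only two simple pattern types (constant difference and constant ratio). The function name `next_term` is hypothetical and chosen for illustration; it is not taken from any IQ test or from the research discussed here, and real test items cover many more pattern types.

```python
from fractions import Fraction
from typing import Optional, Sequence


def next_term(seq: Sequence[int]) -> Optional[Fraction]:
    """Guess the next term of an arithmetic or geometric integer sequence.

    Hypothetical helper for illustration only. Returns None when
    neither of the two simple patterns fits.
    """
    if len(seq) < 3:
        return None  # too few terms to call anything a pattern

    # Arithmetic: every consecutive difference is the same.
    diffs = {b - a for a, b in zip(seq, seq[1:])}
    if len(diffs) == 1:
        return Fraction(seq[-1] + diffs.pop())

    # Geometric: every consecutive ratio is the same (zero terms excluded).
    if all(x != 0 for x in seq):
        ratios = {Fraction(b, a) for a, b in zip(seq, seq[1:])}
        if len(ratios) == 1:
            return seq[-1] * ratios.pop()

    return None


if __name__ == "__main__":
    for question in ([2, 4, 6, 8], [3, 6, 12, 24], [1, 4, 9, 16]):
        print(question, "->", next_term(question))
```

The third sequence (the squares) fits neither rule, so the sketch gives up on it. That gap is the point: a hand-written rule set covers only the patterns its author anticipated, whereas the learning approaches discussed in this post aim to find such regularities from data.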

These cognitive tasks may not be the only types of intelligence worth considering. But if the standard Turing test is to have any validity as a measure of machine intelligence, we must at least use the same core yardstick we apply to those indisputably brainy creatures called Homo sapiens.

As we consider advances in machine learning algorithms, we recognize that it’s only a matter of time before computers will ace practically any standard IQ test we could throw at them. With their multilayered ability to find linguistic, mathematical and other patterns within complex content corpora, deep-learning and other cognitive-computing algorithms will indeed grow smarter, and smarter, and smarter still.

As a case in point, take reports on advances by researchers in China, who have developed a deep-learning implementation that outperforms average humans in answering verbal reasoning questions. They measured their algorithm’s performance against that of a crowdsourced set of humans participating via Amazon Mechanical Turk. You can read the article for more details about how they did it.

The implementation details of these and other artificial intelligence (AI) features are less important than the fact that they succeeded in outshining humans on tasks that fall into standard IQ categories. And given that the other types of intelligence that most people recognize, such as visual, musical, spatial, emotional and social, are increasingly mimicked by deep learning, it will be only a matter of time before machines master these as well.

Clearly, big data is fundamental to this promise. Machines—such as IBM Watson—can perform these amazingly intelligent feats only if they have access to the right data at the right time, and only if their deep-learning algorithms have been retrained and adapted to continually updated corpora.

How smart are our machines becoming?

We may choose to narrow the Turing test to those tasks that depend entirely on humans’ ability to leverage knowledge beyond the reach of any machine, such as details in our lived experience that have never entered any digital corpus. For example, I have commented on how algorithmically gaming such a Turing test gets harder when you consider intelligence as the situationally contextual expression of a specific individual’s unique experience. Much of our intelligence flows from unique shared understandings within specific relationships and can’t easily be counterfeited. It’s much easier, for example, to fool my wife into thinking she is conversing with an unspecified human (rather than with a machine imitating a human) than it would be to hoodwink her into believing that she was interacting with the man she’s been living with for several decades.

But if we rethink the Turing test to focus on the patterns we can identify from personal experience, we do it at our peril. Doing so would mean that we no longer focus IQ testing on the cognitive tasks—reading, writing, calculating, inferring and so forth—that our society considers the most valid yardstick for making human-to-human comparisons. And if we apply one group of IQ tests to people (for example, considering our ability to factor lived experience into some cognitive tasks) and a different group to machines (such as old-fashioned standard IQ tests no longer applied to humans), we sacrifice human-to-machine comparability.

We all know that machine intelligence is real. And we’re all interested in whether that intelligence is worse than, better than or equivalent to human intelligence in various scenarios. But we also know that machines will grow smarter in our lifetimes, whereas humans generally do so only over long evolutionary timescales. So let’s not be squeamish about this. Let’s develop a testing foundation for measuring how much smarter our machines are becoming than we are in so many ways.

Watson solutions aim to enhance, scale and accelerate human expertise, leveraging increasingly powerful artificial intelligence and cognitive computing capabilities. To explore how Watson Analytics can boost your own very human intelligence, start the journey.