
What we need to talk about when we talk about AI

December 1, 2023 | 12:03 am

By Dave Lee

Douglas Rain in 2001: A Space Odyssey (1968)

EVER since Alan Turing’s “imitation game,” we’ve been acutely aware of the importance of measuring the capabilities of computers against our own miraculous brains. The British pioneer’s method, outlined in 1950, looks primitive today, but it sought to answer a persistent question: How will we tell when a machine has become as intelligent as (or more intelligent than) a human being?

Defining such progress is imperative for productive conversations about artificial intelligence (AI). Specifically, the question of what can be considered artificial general intelligence (AGI), a “mind” as adaptable as our own, needs to be considered using a set of shared parameters. Currently, the term lacks a precise definition, which leaves predictions of AGI’s arrival and impact looking either unnecessarily alarmist or insufficiently concerned.

Consider the hopeless spread of predictions on AGI. Earlier this year, the preeminent AI researcher Geoffrey Hinton predicted “without much confidence” that AGI could be present within five to 20 years. One attempt to collate the views of approximately 1,700 experts produced timing estimates ranging from next year to never. One reason for the chasm is that we haven’t decided collectively what we’re even talking about. “If you were to ask 100 AI experts to define what they mean by ‘AGI,’ you would likely get 100 related but different definitions,” notes a recent paper from a team at DeepMind, the AI unit within Google.

One of the paper’s co-authors, Shane Legg, is credited with popularizing the AGI term. Now he and his team are seeking to set up a sensible framework with which to measure and define the technology — a taxonomy that can be used to help assuage or heighten fears and offer straightforward context to non-experts and legislators.

The effort is modeled on the system for describing the capabilities of self-driving cars. In 2014, SAE International (formerly the Society of Automotive Engineers) defined six distinct levels of autonomous capability, from Level 0 — human driver in full control of vehicle’s operation — to Level 5 — full automation of all the vehicle’s functions in all conditions. The scale has proved useful for lawmakers to set rules of the road and for the public to understand their cars’ capabilities. A car with Level 2 automation — steering, lane changes, acceleration and deceleration, in some settings, mostly on highways — can be legally driven on the road today on the condition that a human is sitting alert to take over immediately. But Level 4 or 5 cars, such as Alphabet’s Waymo cars on trial in San Francisco, need special permission to be used in public and are subject to additional oversight on their performance.

Classifying AGI will be much more complex than classifying autonomous vehicles because the latter are merely a subset of the former. But the leveling system is useful for AI, too. In assessing capabilities, the DeepMind team split AI into two groups: narrow and general. A narrow AI, for instance, could have superhuman capability for one application, such as protein folding, but be incapable of writing a simple short story. To be considered AGI, according to DeepMind, a system must be able to perform a “wide range of non-physical tasks, including metacognitive abilities like learning new skills.”

A system’s level is determined by how its capabilities compare with those of humans. At Level 1, “Emerging,” an AGI should be “equal to or somewhat better than an unskilled human.” That’s where the famous chatbots like ChatGPT are today, just about. Level 2, “Competent,” would require performing at least as well as half of skilled adults. No AGI has yet achieved Level 2, the DeepMind team determined. From there, it envisioned “Expert” (more capable than 90% of skilled humans), “Virtuoso” (99%), and “Superhuman” (100%).
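For readers who think in code, the performance ladder can be sketched as a simple lookup. This is only an illustration of the thresholds quoted above, not DeepMind’s own method: the function name `classify`, its two inputs, and the `BELOW_EMERGING` rung are assumptions made for the example, and a real assessment would rest on a broad suite of tests rather than a single percentile.

```python
from enum import IntEnum


class PerformanceLevel(IntEnum):
    """Performance levels as described in this column."""
    BELOW_EMERGING = 0  # hypothetical placeholder, not a level named in the column
    EMERGING = 1        # equal to or somewhat better than an unskilled human
    COMPETENT = 2       # at least as capable as 50% of skilled adults
    EXPERT = 3          # more capable than 90% of skilled humans
    VIRTUOSO = 4        # more capable than 99% of skilled humans
    SUPERHUMAN = 5      # more capable than 100% of humans


def classify(skilled_percentile: float,
             matches_unskilled_human: bool = True) -> PerformanceLevel:
    """Map a (hypothetical) benchmark summary onto a performance level.

    skilled_percentile: share of skilled adults the system matches or
        outperforms, from 0 to 100 (an assumed input for this sketch).
    matches_unskilled_human: whether the system is at least on par with
        an unskilled human (also an assumed input).
    """
    if not 0 <= skilled_percentile <= 100:
        raise ValueError("skilled_percentile must be between 0 and 100")
    if skilled_percentile >= 100:
        return PerformanceLevel.SUPERHUMAN
    if skilled_percentile >= 99:
        return PerformanceLevel.VIRTUOSO
    if skilled_percentile >= 90:
        return PerformanceLevel.EXPERT
    if skilled_percentile >= 50:
        return PerformanceLevel.COMPETENT
    if matches_unskilled_human:
        return PerformanceLevel.EMERGING
    return PerformanceLevel.BELOW_EMERGING


if __name__ == "__main__":
    # A system that beats roughly a third of skilled adults is still "Emerging".
    print(classify(30).name)
```

The ordering of the checks matters: the function tests the highest threshold first, so a system is always assigned the top rung it qualifies for.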

But these levels alone wouldn’t be sufficient to determine the capability, or the danger, of AGI. One distinct fear among those who worry about existential risk is the possibility that a smart-enough machine could act autonomously, possibly against humans: the “I’m sorry Dave, I’m afraid I can’t do that” scenario. For this reason, the DeepMind team applies an accompanying classification for levels of AI autonomy. It runs from Level 1, in which a human is in full control and the AI automates mundane tasks, up to Level 5, a fully autonomous AI capable of working without human oversight.

Understanding these levels helps us better classify risk and react accordingly. A company developing an AGI with Level 1 autonomy (like ChatGPT) is of relatively little regulatory concern. But expert AGI, with Level 4 autonomy, is the point at which researchers foresee mass labor displacement and the “decline of human exceptionalism.”

As well as protecting against societal harm, an agreed-upon standard will be particularly useful in dispelling disingenuous attempts to overplay the capabilities of an AI as a marketing gimmick. It has helped, for example, that under the SAE standard for autonomy, Tesla’s “Autopilot” can be more accurately described as merely Level 2 automation.

For the system to work, relevant and rigorous tests will be needed to determine the appropriate level on this scale. The nature of these tests is still to be determined, but, researchers said, they should cover mathematical tasks, spatial reasoning, social intelligence, and more: an AI pentathlon of sorts.* These benchmarks must be iterated and improved with the same vigor and regularity as the AI systems they seek to measure.

Proper classification will settle some nerves and bring some much-needed composure to the AGI conversation. It serves everyone to have clear definitions in that space between “benign” and “annihilation of the human race.”

Bloomberg Opinion

* Our imagination could run wild with these. My favorite is an idea for a test generally attributed to Apple co-founder Steve Wozniak, though it’s not clear if he’s paraphrasing others. His “coffee test” would assess if an AI-powered robot was smart enough — without specific programming — to enter a typical American home, find a coffee machine and make a cup of coffee. Now that’s the kind of progress I can get behind.
