In 2020, the American poet Andrew Brown gave a student the following assignment: write a poem from the point of view of a cloud looking down on two warring cities. The student came up with the following poem:
“I think I’ll start to rain,
Because I don’t think I can stand the pain,
Of seeing you two,
Fighting like you do.”
Impressive, isn’t it?
Well, Brown’s ‘student’ turned out to be a computer program, not a human.
The program, called GPT-3, is one of the most powerful AI language models ever made. Created in 2020 by the research firm OpenAI, its development has cost tens of millions of dollars. Trained on 200 billion words from books, articles, and websites, GPT-3 can generate fluent streams of text on any topic you can imagine.
Nowadays, algorithms are everywhere.
Companies like Amazon, Netflix, Spotify, and LinkedIn feed our personal preferences into them to create targeted recommendations. But their power doesn’t stop there. Google got an AI to design computer chips in just under six hours. For a human, that would take months.
It was an AI called AlphaFold that cracked the ‘protein folding problem’, one of the greatest unsolved challenges in biology. More recently, DeepMind introduced its own language model, called RETRO, which, according to the company, can match the performance of models 25 times its size.
We’ve even seen the emergence of hundreds of AI artists: algorithms that paint psychedelic images, create pieces of music, and compose poetry, like Andrew Brown’s ‘student’.
So, does this mean that we’re heading toward the field’s holy grail: human-level intelligence, also called AGI (artificial general intelligence)? To answer that, let’s take a closer look at the power and limits of artificial intelligence.
Math, music and algorithms are closely connected
Music and computers share a surprising common ground: the mathematical language of algorithms.
In his book The Creativity Code: How AI Is Learning to Write, Paint and Think, mathematician Marcus du Sautoy explains that classical composers often use algorithms to create musical complexity. They start with a simple melody or theme and transform it according to mathematical rules. Using math, they create variations and additional voices to build the composition.
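The kinds of rule-based transformations du Sautoy describes can be sketched in a few lines of code. The note numbers below are hypothetical (0 = C, counting in semitones); the point is only that transposition, inversion, and retrograde are simple mathematical operations on a melody.

```python
def transpose(melody, interval):
    """Shift every note up or down by a fixed interval."""
    return [note + interval for note in melody]

def invert(melody):
    """Mirror the melody around its first note."""
    pivot = melody[0]
    return [pivot - (note - pivot) for note in melody]

def retrograde(melody):
    """Play the melody backwards."""
    return list(reversed(melody))

theme = [0, 4, 7, 4]            # C, E, G, E
print(transpose(theme, 5))      # up a fourth: [5, 9, 12, 9]
print(invert(theme))            # mirrored around C: [0, -4, -7, -4]
print(retrograde(theme))        # reversed: [4, 7, 4, 0]
```

Chaining a handful of such operations over a simple theme is enough to generate the variations and additional voices a composition is built from.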
Composers with a strong signature style prefer certain mathematical patterns over others. Mozart, for instance, often used the Alberti bass pattern: the three notes of a chord played in the repeating sequence 1-3-2-3 (13231323…).
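The Alberti bass itself is a tiny algorithm: take a three-note chord and replay its notes in the fixed order 1-3-2-3. A minimal sketch:

```python
def alberti_bass(chord, repeats=2):
    """Expand a three-note chord into the Alberti pattern 1-3-2-3."""
    pattern = [0, 2, 1, 2]  # list indices for notes 1, 3, 2, 3 of the chord
    return [chord[i] for i in pattern] * repeats

c_major = ["C", "E", "G"]  # notes 1, 2, 3 of the triad
print(alberti_bass(c_major))
# ['C', 'G', 'E', 'G', 'C', 'G', 'E', 'G']
```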
Vision and language: AI’s blind spots
A year after GPT-3 made its debut, OpenAI’s team strove for a bigger goal: building a neural network that could work with both images and text. This required two new systems, named DALL-E and CLIP.
DALL-E is a neural network that, according to OpenAI’s chief scientist Ilya Sutskever, can make an image out of any piece of text. It can produce the desired result even if it hasn’t encountered a particular concept before in training.
Two anthropomorphic daikon radishes walking their pet dogs. Image generated by OpenAI’s DALL-E model. Credit: OpenAI
CLIP, meanwhile, can take any set of visual categories described in plain text and reliably classify images into them, without being trained for that specific task.
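The idea behind that kind of zero-shot classification can be sketched very simply: encode the image and each candidate text label as vectors, then pick the label whose vector is closest to the image’s. The vectors below are made up for illustration; in the real system they come from trained neural networks.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

image_embedding = [0.9, 0.1, 0.2]            # hypothetical encoding of a photo
label_embeddings = {                          # hypothetical text encodings
    "a photo of a cat":    [0.8, 0.2, 0.1],
    "a photo of a bridge": [0.1, 0.9, 0.3],
}

best = max(label_embeddings,
           key=lambda lbl: cosine(image_embedding, label_embeddings[lbl]))
print(best)  # -> 'a photo of a cat'
```

Because the categories are just text, swapping in a new set of labels requires no retraining, only new text encodings.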
However, like GPT-3, the new models are far from perfect. DALL-E, in particular, depends on the exact phrasing of the prompt to generate a coherent image. In fact, the flexibility of language remains a blind spot for artificial intelligence.
Take this phrase: ‘The children won’t eat the grapes because they’re old’.
Unlike humans, a program cannot figure out who is ‘old’: the children or the grapes. How we interpret a word or phrase depends on context, but also on a level of understanding that goes beyond the simple definitions of words. A human mind can do this, but not an algorithm.
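To see why surface rules fail here, consider a toy heuristic (not how any real language model works): resolve a pronoun to the nearest preceding noun. It mechanically picks an answer without any understanding of children or grapes.

```python
def resolve_pronoun(words, nouns, pronoun_index):
    """Naive rule: the pronoun refers to the nearest preceding noun."""
    for i in range(pronoun_index - 1, -1, -1):
        if words[i] in nouns:
            return words[i]
    return None

sentence = "the children won't eat the grapes because they 're old".split()
# 'they' sits at index 7; the rule always lands on the closest noun
print(resolve_pronoun(sentence, {"children", "grapes"}, 7))  # -> 'grapes'
```

The rule gives the same answer no matter what the sentence means; a human resolves the pronoun by reasoning about the world, not word positions.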
Besides language, algorithms have difficulty seeing the ‘big picture’.
An AI that recognizes images does so by asking questions about each individual pixel that makes up the image. However, the specific combination of pixels that we intuitively recognize as a cat, for example, is different each time. The program must therefore ‘learn’ to correlate different pixels with each other to decide whether the photo contains a cat.
This is why many websites use a reCAPTCHA system to prevent automated spam. We can easily identify the cars or bridges in random photographs, but programs can’t.
Machines’ limits and Moravec’s paradox
In a paper titled Why AI Is Harder Than We Think, Melanie Mitchell, a computer science professor at Portland State University, discusses the most common misconceptions around AI.
Thanks to machine learning, modern AI exceeds many of our previous expectations, but its creativity and intelligence remain narrow: its skills come from large datasets provided by humans in a specific given context, not from any deeper, genuine understanding.
That’s another reason why human language cannot describe AI ‘intelligence’. As Mitchell points out, we use words like ‘reading’, ‘understanding’, and ‘thinking’ to describe AI, but these words don’t give us an accurate depiction of how the AI actually functions.
Hans Moravec is an Austria-born American roboticist and computer scientist, and presently, an adjunct faculty member of the Robotics Institute of Carnegie Mellon University.
In the 1980s, Moravec and other pioneering scientists like Marvin Minsky and Rodney Brooks made an insightful observation: high-level reasoning is comparatively easy for machines, but simple human actions, sensorimotor skills, and menial physical tasks are incredibly difficult for them.
Fast forward to today, and Moravec’s observation holds strong.
So, does that mean that a four-year-old child is smarter than a superpowered, million-dollar AI? The answer is yes, if we recognize the complexity of the everyday tasks we perceive as easy.
Children can easily figure out cause-and-effect relationships based on their interactions with the world around them and accumulated experience. Machines often fail to make basic causal inferences because they lack that context.
The so-called cognitive revolution of abstract thinking and mathematical reasoning is a relatively recent development. Our visual perception, hearing, and motor skills, however, are ‘embedded code’: the result of millions of years of evolution.
These human capabilities are not easily replicated by narrow AI, which relies primarily on rigid mathematical reasoning to work out solutions.
Intelligence doesn’t lie solely in our heads
As Mitchell points out, intelligence isn’t all about our heads: it also requires a physical body.
In Fyodor Dostoyevsky’s 1868 novel The Idiot, Myshkin tells the Yepanchin ladies the story of a man who is condemned to die but who is pardoned just before the execution. But it wasn’t just a story: Dostoyevsky was putting his own lived experiences to paper. Through a fictional character, he described what his own mind and body experienced one cold day in 1849: the uniquely human feeling of his own impending death.
According to Ben Goertzel, an expert on artificial general intelligence,
“Humans are bodies as much as minds, and so achieving human-like AGI will require embedding AI systems in physical systems capable of interacting with the everyday human world in nuanced ways.”
Just the idea of AGI takes us to fascinating places. But so long as we don’t fully comprehend intelligence itself, or even our own minds, belief in AGI is like a belief in magic: something that dazzles us, that makes great headlines, but which we fundamentally cannot understand.