To an untrained AI, the world is a blur of confusing data streams. Most humans have no problem making sense of the sights and sounds around them, but algorithms tend to acquire this skill only if those sights and sounds are explicitly labelled for them.
Now DeepMind has developed an AI that teaches itself to recognise a range of visual and audio concepts just by watching tiny snippets of video. This AI can grasp the concept of lawn mowing or tickling, for example, but it hasn’t been taught the words to describe what it’s hearing or seeing.
“We want to build machines that continuously learn about their environment in an autonomous manner,” says Pulkit Agrawal at the University of California, Berkeley. Agrawal, who wasn’t involved with the work, says this project takes us closer to the goal of creating AI that can teach itself by watching and listening to the world around it.