Despite dazzling advances in AI, robots are still horribly ham-fisted.
Increasingly, researchers and companies are turning to machine learning to make them more adaptive and dexterous. This typically means feeding the robot a video of what’s in front of it and asking it to work out how it should move in order to manipulate an object in view. For instance, researchers at OpenAI, a nonprofit in San Francisco, taught a robotic hand to manipulate a child’s block in this way.
But humans, of course, use more than just their eyes to learn how to handle objects. Vision is combined with a sense of touch—and we learn, early on, that objects positioned unstably will probably fall over.
That is what inspired a new robot, developed by Nima Fazeli and his colleagues at MIT, that has been given a fundamental understanding of the real world’s physics—and a usable sense of touch.
It proved how nimble-fingered it is by mastering Jenga, a game that involves removing blocks from a precariously assembled tower, ideally without causing it to topple over. The robot also displayed a kind of ingenuity that is crucial for human players: judging which block it can remove without making the tower fall down.
The research draws on several key ideas developed by Josh Tenenbaum, of the Department of Brain and Cognitive Sciences at MIT, and on his research on human cognition. These include the ideas that humans develop an intuitive understanding of physics from an early age and that probability is key to reasoning about the world. This differs from a lot of AI research today, which revolves around feeding as much data as possible to very large, or “deep,” neural networks.
The robot, equipped with force sensors as well as cameras, learns to play Jenga by poking and prodding blocks and using visual and tactile feedback to train a physics model of the world.
Then, when faced with a new tower of blocks, it uses the model to infer, probabilistically, which block it should try to poke out of the tower next. You can see how well it plays in the video above.
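To make the poke-then-infer loop concrete, here is a toy sketch in Python. It is not the MIT team's model: it assumes a much simpler setup in which each exploratory poke yields one resistance reading and a success flag, and it keeps a Beta belief about success for just two hypothetical categories, "loose" and "stuck." The 0.5-newton threshold, the function names, and the data format are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta(alpha, beta) belief over the chance that pushing a block out succeeds."""
    alpha: float = 1.0  # prior pseudo-count of successes (uniform prior)
    beta: float = 1.0   # prior pseudo-count of failures

    def update(self, success: bool) -> None:
        # Conjugate update: add one count to the matching side.
        if success:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def mean(self) -> float:
        # Posterior mean probability of success.
        return self.alpha / (self.alpha + self.beta)


def force_bin(force_newtons: float, threshold: float = 0.5) -> str:
    """Hypothetical discretization of a tactile reading (threshold is made up)."""
    return "loose" if force_newtons < threshold else "stuck"


def train(pokes):
    """Learn from exploratory pokes given as (resistance_force, succeeded) pairs."""
    beliefs = {"loose": BetaBelief(), "stuck": BetaBelief()}
    for force, succeeded in pokes:
        beliefs[force_bin(force)].update(succeeded)
    return beliefs


def pick_block(beliefs, probe_forces):
    """Choose the block whose gentle-probe reading has the highest estimated success chance."""
    return max(probe_forces, key=lambda b: beliefs[force_bin(probe_forces[b])].mean)


if __name__ == "__main__":
    # Made-up training data: light resistance usually means the block slides out.
    pokes = [(0.1, True), (0.2, True), (0.3, False),
             (0.9, False), (1.2, False), (0.8, True)]
    beliefs = train(pokes)
    # Probe readings for three blocks in a new tower (also made up).
    print(pick_block(beliefs, {"A": 1.0, "B": 0.2, "C": 0.7}))
```

The real system combines camera images with force readings and a far richer physics model, but the basic shape is the same: gather experience by poking, update a probabilistic belief, then act on the most promising block.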
By combining vision, touch, and this model of real-world physics, the robot can learn to play Jenga more efficiently than would be possible otherwise. The intuitive physics model also lets the robot understand quickly that a block hanging over an edge will most probably fall. In testing, the approach outperformed conventional machine-learning methods. The research is published today in the journal Science Robotics.
This more humanlike learning technique could help make factory and warehouse robots far more capable. If that fails, they could at least challenge you to a fun party game.