Berkeley's Visual Foresight Technology: Robots That "See the Future" Could Help Predict Self-Driving Accidents

Computer scientists at the University of California, Berkeley have developed robot technology that can "foresee the future." Using visual foresight, and learning entirely on their own, these robots can predict what outcome a given action will produce. The current prototype is still relatively simple and can only look a few seconds ahead; the researchers demonstrated the technology at NIPS 2017.

For babies and toddlers, playing with toys is more than fun and games: it is how they learn and understand how the world works. Inspired by this, the Berkeley researchers built a robot that, like a baby, learns about the world from scratch by experimenting with objects and working out how best to move them. In this way, the robot can "see" what will happen next.

The robot, called Vestri and shown in the accompanying video, imagines how to accomplish tasks by playing with objects the way a baby would. The learning technique behind it lets robots imagine the future consequences of their actions, so they can figure out how to handle objects they have never encountered before. In the longer term the technology could help self-driving cars anticipate road conditions, or enable smarter household robot assistants, but this initial prototype focuses on learning simple manual skills entirely from autonomous play.

The technique is called visual foresight. With it, a robot can predict what its camera will see if it carries out a particular sequence of movements. These robotic "imaginations" are still relatively simple, looking only a few seconds into the future, but that is enough for the robot to work out how to move objects around on a table without bumping into obstacles. Crucially, the robot learns to do this without any help from humans and without prior knowledge of physics, its environment, or the objects involved: the visual imagination is learned from scratch through unsupervised exploration, with the robot simply playing with objects on a table. After this play phase, the robot has built a predictive model of the world and can use it to manipulate new objects it has never seen before.

"In the same way that we can imagine how our actions will move objects in our environment, this method lets a robot visualize how different behaviors will affect the world around it," said Sergey Levine, an assistant professor in UC Berkeley's Department of Electrical Engineering and Computer Sciences, whose laboratory developed the technology. "This can enable intelligent planning of highly flexible skills in complex real-world situations." The research team demonstrated visual foresight at the NIPS 2017 conference.

At the core of the system is a deep learning technique based on convolutional recurrent video prediction, or dynamic neural advection (DNA). DNA-based models predict how the pixels in an image will move from one frame to the next based on the robot's actions. Recent improvements to this class of models, together with greatly improved planning capabilities, have allowed video-prediction-based robot control to perform increasingly complex tasks, such as sliding toys around obstacles and repositioning multiple objects.
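The dynamic neural advection idea can be pictured concretely: for every pixel of the next frame, the network outputs a small kernel describing where that pixel's value should be pulled from in the current frame, conditioned on the robot's action. The sketch below applies such kernels to warp one frame into a predicted next frame. It is a minimal NumPy illustration, not the authors' code; the function name and array shapes are assumptions, and in the real model the kernels come from a recurrent convolutional network and are chained over time to produce the several-second "imagined" video.

```python
import numpy as np

def predict_next_frame(frame, kernels):
    """Warp `frame` into a predicted next frame using per-pixel kernels.

    frame:   (H, W) grayscale image with values in [0, 1].
    kernels: (H, W, k, k) array; kernels[i, j] is a normalized weight map
             over the k x k neighbourhood of pixel (i, j), saying where
             that pixel's new value is transported from.
    """
    H, W, k, _ = kernels.shape
    pad = k // 2
    padded = np.pad(frame, pad, mode="edge")           # replicate borders
    nxt = np.empty((H, W), dtype=float)
    for i in range(H):
        for j in range(W):
            patch = padded[i:i + k, j:j + k]           # neighbourhood of (i, j)
            nxt[i, j] = np.sum(patch * kernels[i, j])  # expected source value
    return nxt

# Tiny usage check: identity kernels reproduce the input frame.
H, W, k = 4, 4, 3
frame = np.random.rand(H, W)
kernels = np.zeros((H, W, k, k))
kernels[..., k // 2, k // 2] = 1.0                     # each pixel copies itself
assert np.allclose(predict_next_frame(frame, kernels), frame)
```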
"In the past, robots have learned skills with a human supervisor helping and providing feedback. What makes this work exciting is that the robots can learn a whole range of visual object manipulation skills entirely on their own," said Chelsea Finn, a doctoral student in Levine's lab and inventor of the original DNA model.

With the new technology, the robot pushes objects around on a table and then uses its learned prediction model to choose actions that move an object toward a desired location (a minimal sketch of this planning loop appears at the end of this article). Working only from raw camera observations, the robot learns to avoid obstacles and to push objects around them.

"Humans learn object manipulation skills without any teacher, through millions of interactions with a variety of objects during their lifetime. We have shown that it is also feasible to build a robotic system that uses large amounts of autonomously collected data to learn widely applicable manipulation skills, in this case object pushing," said Frederik Ebert, a graduate student in Levine's lab who worked on the project.

Because control through video prediction relies only on observations the robot can collect autonomously, such as camera images, the method is general and broadly applicable. Unlike conventional computer vision methods, which require humans to hand-label thousands or even millions of images, building a video prediction model requires only unannotated video, which the robot can gather entirely on its own. Indeed, video prediction models have already been applied to datasets ranging from human activity to driving, with convincing results.

"Children can learn about their world by playing with toys, moving them around, grasping, and so on. Our goal is to have robots do the same: to learn how the world works through autonomous interaction," Levine said. "This robot's capabilities are still limited, but its skills were learned entirely automatically, which allows it to predict complex physical interactions with objects it has never seen before by building on previously observed patterns of interaction."

The Berkeley team will continue to study robot control through video prediction, focusing on further improving video prediction and prediction-based control, and on developing more sophisticated methods by which robots can collect more focused video data for complex tasks such as picking and placing objects, manipulating soft and deformable objects like cloth or rope, and assembly.
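The article describes this action selection only at a high level. The sketch below shows one common way such a learned predictor can be used for control, a random-shooting model-predictive loop: sample candidate action sequences, roll each one out through the model, score the imagined outcome by how close the tracked object pixel ends up to the goal, and execute the first action of the best sequence before replanning. The `predict_pixel` stand-in for the learned video-prediction model, the function names, and the simple random search are illustrative assumptions; the Berkeley system uses a more elaborate sampling-based optimizer over full predicted images.

```python
import numpy as np

def plan_action(predict_pixel, current_pos, goal_pos,
                horizon=5, n_samples=100, rng=None):
    """Pick the first action of the sampled action sequence whose predicted
    rollout brings the tracked object pixel closest to the goal location
    (random-shooting model-predictive control).

    predict_pixel(pos, action) -> predicted next (row, col) of the designated
    object pixel; stands in for the learned video-prediction model.
    """
    rng = np.random.default_rng() if rng is None else rng
    best_cost, best_first_action = np.inf, None
    for _ in range(n_samples):
        # Candidate sequence of planar pusher motions (dy, dx).
        actions = rng.uniform(-1.0, 1.0, size=(horizon, 2))
        pos = np.asarray(current_pos, dtype=float)
        for a in actions:                              # roll out the model
            pos = predict_pixel(pos, a)
        cost = np.linalg.norm(pos - np.asarray(goal_pos, dtype=float))
        if cost < best_cost:
            best_cost, best_first_action = cost, actions[0]
    return best_first_action                           # execute, then replan
```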
