A few “glimpses” are enough for this AI to infer a complete 360-degree view
The new artificial intelligence agent takes a few "glimpses" of its environment, representing less than 20% of the 360-degree view, and infers the rest of the scene. Credit: David Steadman / Santhosh Ramakrishnan / UT Austin
Computer scientists have developed an artificial intelligence agent that has learned to see more like a human does. They taught it to take a few quick glimpses of its surroundings and then infer the whole environment from them.
This skill is needed to develop effective search-and-rescue robots, which could one day improve the outcome of dangerous missions.
Most artificial intelligence agents (computer systems that could give robots or other machines intelligence) are trained for very specific tasks, such as recognizing an object in an environment they have already experienced.
The new agent is designed to be general-purpose: it gathers visual information that can then be used for a wide range of tasks.
“We need generally equipped agents that can enter any environment and be ready for new perception tasks as they arise,” says Kristen Grauman, a professor in the computer science department at the University of Texas at Austin. “This new algorithm behaves in a versatile way and is able to succeed at different tasks because it has learned useful models of the visual world.”
The researchers used deep learning, a type of machine learning inspired by the brain's neural networks, to train their agent on thousands of 360-degree images of different environments.
When presented with a scene it has never seen before, the agent uses its experience to pick a few glimpses (like a tourist standing in the middle of a city and taking a few snapshots in different directions) that together represent less than 20% of the complete scene.
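To make the "less than 20%" figure concrete, here is a small arithmetic sketch. The article does not say how the panorama is divided, so the 4 x 8 grid of views and the budget of 6 glimpses below are illustrative assumptions, and `glimpse_coverage` is a hypothetical helper name.

```python
def glimpse_coverage(n_elevations: int, n_azimuths: int, n_glimpses: int) -> float:
    """Fraction of a viewing grid covered by n_glimpses distinct views."""
    total_views = n_elevations * n_azimuths
    if not 0 <= n_glimpses <= total_views:
        raise ValueError("n_glimpses must be between 0 and the number of views")
    return n_glimpses / total_views


# Assumed setup: 4 camera elevations x 8 azimuths = 32 views, 6 glimpses allowed.
coverage = glimpse_coverage(4, 8, 6)
print(f"{coverage:.2%}")  # 18.75%
```

Under these assumed numbers, the agent would see fewer than one in five views and would have to infer the remaining 26.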
What makes this system effective is that it does not simply take pictures in random directions: after each glimpse, it chooses the next shot (the location to observe) that it expects will add the most new information about the whole scene. The study was published in the journal Science Robotics.
It is a bit like being in an unfamiliar store: if you see oranges, you expect to find other fruits nearby, but to locate the vegetables, for example, you would probably look the other way.
Based on its glimpses, the agent infers what it would have seen had it looked in all the other directions, thus reconstructing a complete 360-degree image of its environment.
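The glimpse-then-infer loop can be illustrated with a toy sketch. This is not the researchers' method, which uses a trained deep network. As a stand-in, it uses a simple nearest-neighbor heuristic: find the training scenes that best match what has been seen so far, look next where those scenes disagree most, and finally fill in the unseen views from the best match. Scenes are reduced to lists of one number per view, and every name here (`rank_scenes`, `look_around`, `TRAIN`, `SCENE`) is hypothetical.

```python
from statistics import pvariance


def rank_scenes(train, observed):
    """Sort training scenes by squared distance to the views observed so far."""
    def distance(candidate):
        return sum((candidate[i] - value) ** 2 for i, value in observed.items())
    return sorted(train, key=distance)


def look_around(scene, train, budget, k=2, start=0):
    """Take `budget` glimpses of `scene`, then guess the views never looked at."""
    observed = {start: scene[start]}  # the first glimpse is at a fixed position
    while len(observed) < budget:
        neighbors = rank_scenes(train, observed)[:k]
        unseen = [i for i in range(len(scene)) if i not in observed]
        # Look where the best-matching training scenes disagree the most.
        next_view = max(unseen, key=lambda i: pvariance([s[i] for s in neighbors]))
        observed[next_view] = scene[next_view]
    best_match = rank_scenes(train, observed)[0]
    # Keep what was actually seen; borrow the rest from the closest training scene.
    reconstruction = [observed.get(i, best_match[i]) for i in range(len(scene))]
    return sorted(observed), reconstruction


# Toy data: 12 views per scene, one number per view (assumed for illustration).
TRAIN = [
    [1] * 6 + [2] * 6,
    [1] * 6 + [8] * 6,
    [9] * 6 + [2] * 6,
]
SCENE = [1, 1, 1, 1, 1, 2, 8, 8, 8, 8, 8, 7]  # a scene not in the training set

glimpses, reconstruction = look_around(SCENE, TRAIN, budget=2)
print(glimpses)        # [0, 6]
print(reconstruction)  # [1, 1, 1, 1, 1, 1, 8, 8, 8, 8, 8, 8]
```

With two glimpses out of twelve views (about 17%), the second glimpse lands on the half of the scene where the matching training scenes differ, which is enough to settle which one the new scene resembles.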
“Just as you bring in prior information about the regularities in previously experienced environments (like all the grocery stores you have ever visited), this agent searches in a non-exhaustive way,” explains Grauman. “It learns to make intelligent guesses about where to gather visual information to perform perception tasks.”
One of the main challenges for the researchers was to design an agent capable of working under tight time constraints. That ability would be essential in a search-and-rescue application. In a burning building, for example, a robot would need to quickly locate people, flames and hazardous materials, and relay this information to firefighters.
For the moment, the new agent cannot move, although it can point a camera in any direction. Equivalently, the agent can look at an object it is holding and decide how to turn it to inspect another side. As a next step, the researchers plan to develop the system so that it can be used on a mobile robot.
“Using additional information that is only present during training helps the (primary) agent learn faster,” explains Ramakrishnan.