In 2018, autonomous driving picked up speed. In the United States, dozens of companies have put a flood of test vehicles on the roads. One consequence has been a rise in accidents involving these cars. Some, such as the pedestrian run over by an Uber vehicle or the crash of a Tesla running on Autopilot, have been fatal.
The technology is still improving, and these tests contribute to that. One aspect that needs polishing is the databases, and within them one factor is enormously important: the classification, or labeling, of the images.
Labeling is part of the inner workings of autonomous driving systems. It can be compared to the cognitive capacity that any vehicle meant to drive itself must have.
Image labeling is therefore one of the key parts of the vehicle that cannot fail. The deep learning specialist Lucas García, of MathWorks, who develops analytical software for autonomous cars, highlighted the importance of this factor in a talk at the Big Data Spain event. In conversation with EL PAÍS, this mathematician, who has also been a researcher at the Complutense University of Madrid, summed up the issue: “Poor labeling of the data can result in an algorithm that is unable to solve problems correctly.”
Combined with a difficult situation on the road, this makes the vehicle more prone to an accident. If the cameras of an autonomous car are its eyes, the way it perceives reality, the labeling of the database is its cognitive capacity. By comparing what it sees with the database, the vehicle understands its environment.
For the car to make good decisions, two basic conditions must be met. “If we want to create an algorithm that detects pedestrians, cyclists and other vehicles very well, we must first have gone through a process of collecting the data and signals provided by the sensors,” García points out, adding immediately: “And we also have to label them correctly.”
Labeling objects in an image is a finer job than it may seem. The classification is passed to the deep learning algorithm, or neural network, that makes the autonomous car's decisions, so the information has to be as accurate as possible. “One way of labeling is that, in an image where a car appears, we draw a rectangle over it to indicate that this is the car,” says the mathematician. “Another would be to say exactly which pixels of the image correspond to the car.”
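The difference between the two labeling styles García describes can be shown with a small sketch. The tiny image, the `BoxLabel` class and the helper functions below are hypothetical and exist only for illustration; they are not taken from MathWorks or from any labeling tool. The sketch counts how many background pixels a rectangle marks as "car" compared with a pixel-by-pixel label.

```python
from dataclasses import dataclass


@dataclass
class BoxLabel:
    """Rectangle label: class name plus pixel bounds (max values exclusive)."""
    name: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def box_to_mask(box, width, height):
    """Turn a rectangle into a per-pixel grid (True = pixel claimed by the label)."""
    return [
        [box.x_min <= x < box.x_max and box.y_min <= y < box.y_max
         for x in range(width)]
        for y in range(height)
    ]


def count_pixels(mask):
    """Count the pixels marked True in a per-pixel grid."""
    return sum(cell for row in mask for cell in row)


# Hypothetical 6x4 image; True marks the pixels that really belong to the car.
WIDTH, HEIGHT = 6, 4
car_pixel_mask = [
    [False, False, False, False, False, False],
    [False, False, True,  True,  False, False],
    [False, True,  True,  True,  True,  False],
    [False, True,  True,  True,  True,  False],
]

# The tightest rectangle that still contains every car pixel.
car_box = BoxLabel("car", x_min=1, y_min=1, x_max=5, y_max=4)
box_mask = box_to_mask(car_box, WIDTH, HEIGHT)

# Pixels the rectangle calls "car" that the pixel-level label does not.
extra = count_pixels(box_mask) - count_pixels(car_pixel_mask)
print(extra)  # 2
```

Even the tightest rectangle here includes two pixels of background, which is why pixel-level labels are more precise and also slower to produce.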
The relationship is clear: the more precise the labeling, the better the algorithms that feed on it. Like any system based on artificial intelligence, that of the autonomous car is probabilistic. “The analysis carried out by all these deep learning models is based on a probability,” he says. “Although the systems try to be very robust, if the machine learning model predicts with a probability of 99.95% that what is in front of it is a car, obviously there is a 0.05% chance of it being something else.”
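The arithmetic behind the quoted figures is simple, and a short sketch makes the scale visible. The function names are hypothetical, and the second calculation assumes, purely for illustration, that every detection carries the same 99.95% confidence and that this confidence matches the true hit rate, which real systems do not guarantee.

```python
def residual_probability(confidence):
    """Probability that the object is something other than the predicted class."""
    return 1.0 - confidence


def expected_errors(confidence, n_detections):
    """Expected number of wrong calls, assuming equal, well-calibrated confidence."""
    return residual_probability(confidence) * n_detections


car_confidence = 0.9995  # the 99.95% from the quote

print(f"{residual_probability(car_confidence):.2%}")      # 0.05%
print(round(expected_errors(car_confidence, 1_000_000)))  # 500
```

Under those assumptions, a 0.05% residual chance would still mean about 500 wrong calls in a million detections.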
All these artificial intelligence models have their failure rates. These are design errors that engineers naturally try to reduce, and the only way to do so is to devote time and specialized staff to the labeling. There are solutions that automate the process, such as the ones García works on at MathWorks, but a human must always be there to validate the results.
The neural networks that autonomous cars use need millions of labeled images to work. It is an enormous job, with an unavoidable manual component. And that is only the images. Other sensors complement the camera, such as radar, proximity detectors and lidar. The latter is much more expensive to label correctly.