Deep neural networks still struggling to match human vision
Deep neural networks cannot fully explain the neural responses observed in human subjects when viewing images of objects such as animals and faces, new research has revealed. Researchers say the results have significant implications for using deep learning models in real-world scenarios like self-driving vehicles.
In her report, titled Deep Neural Networks and Visuo-Semantic Models Explain Complementary Components of Human Ventral-Stream Representational Dynamics, Marieke Mur of Western University in Canada aims to identify the aspects of human vision that deep learning cannot emulate.
According to Mur, computers can process incoming data, for example to recognise faces and cars, using a form of artificial intelligence called deep neural networks. This machine-learning approach uses interconnected nodes arranged in layers, a structure loosely resembling the human brain. However, while computers may be faster than humans at identifying familiar objects, they are not always as accurate in the real world.
Mur's team used magnetoencephalography (MEG), a non-invasive medical test that measures the magnetic fields produced by the brain's electrical currents. By using MEG data acquired from human observers during object viewing, the team detected a key point of failure in deep learning: readily nameable parts of objects, such as “eye,” “wheel,” and “face,” can account for variance in human neural dynamics over and above what deep learning can deliver.
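To make the phrase "variance over and above" concrete, the short Python sketch below shows one common way to quantify it: fit a model using one set of predictors, fit a second model that adds another set, and take the difference in explained variance. This is an illustration only. The article does not describe the team's analysis pipeline, so the ordinary least-squares approach, the function names and the synthetic data here are all assumptions, and the published analysis may differ.

```python
import numpy as np

def r_squared(predictors, target):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = target - design @ coef
    total = target - target.mean()
    return 1.0 - (residual @ residual) / (total @ total)

def unique_variance(base, extra, target):
    """Variance explained by `extra` over and above what `base` explains."""
    full = np.column_stack([base, extra])
    return r_squared(full, target) - r_squared(base, target)

# Synthetic stand-ins (hypothetical, NOT the study's data).
rng = np.random.default_rng(0)
n_items = 200
# Stand-in for predictors derived from a deep neural network.
dnn_features = rng.normal(size=(n_items, 3))
# Stand-in for binary labels such as "eye" or "wheel" (present / absent).
part_labels = rng.integers(0, 2, size=(n_items, 2)).astype(float)
# Simulated neural signal that depends on both predictor sets plus noise.
neural_signal = (
    dnn_features @ np.array([1.0, 0.5, -0.5])
    + part_labels @ np.array([1.5, -1.0])
    + rng.normal(scale=0.5, size=n_items)
)

gain = unique_variance(dnn_features, part_labels, neural_signal)
print(f"Variance explained by labels beyond DNN features: {gain:.3f}")
```

Because the simulated signal was built to depend on the labels, the printed gain is positive. In real data, a gain close to zero would suggest the labels add little beyond what the network features already capture.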
Real-world implications for deep learning models
In her report, Mur emphasises the importance of understanding the limitations of deep learning and identifying the aspects of human visual recognition that computers cannot replicate. The ability of the human brain to quickly identify and place familiar objects in context is crucial for real-world applications, such as autonomous driving or facial recognition technology. As the use of deep learning and artificial intelligence continues to grow, understanding their limitations will be essential for developing more accurate and reliable systems.
“These findings suggest that deep neural networks and humans may in part rely on different object features for visual recognition and provide guidelines for model improvement,” says Mur.
The findings indicate that deep neural networks cannot fully explain the neural responses observed in human subjects when viewing images of objects, including animals and faces. This has significant implications for using deep learning models in real-world scenarios like self-driving vehicles.
“This discovery provides clues about what neural networks are failing to understand in images, namely visual features that are indicative of ecologically relevant object categories such as faces and animals,” says Mur. “We suggest that neural networks can be improved as models of the brain by giving them a more human-like learning experience, like a training regime that more strongly emphasises behavioural pressures that humans are subjected to during development.”
For instance, the ability to rapidly differentiate between an approaching animal and other objects, and to predict its next action, is crucial for human survival. By incorporating these factors into the training process, the performance of deep learning methods in modelling human vision may be improved.
“While promising, deep neural networks are far from being perfect computational models of human vision,” says Mur.