In a classroom, students are often encouraged to “show your work” to help teachers understand the reasoning that led to their decisions. Can this approach also work for machine learning in artificial intelligence (AI)? Putting this old-school strategy to the test, Duke University researchers have created a new deep neural network method for AI computer vision that provides greater transparency, and they published the study in last month’s Nature Machine Intelligence.
The Duke University trio of computer science professor Cynthia Rudin, Zhi Chen, and Yijie Bei trained a deep neural network to show its work and express the concepts gained throughout the learning process, rather than afterwards.
Deep learning’s neural network architectures have enabled state-of-the-art computer vision, but it is not evident what neural networks encode in their latent, or hidden, space. In machine learning, the latent space is a compressed representation of data. The latent space of a typical convolutional neural network is not directly disentangled, meaning that its individual axes do not line up with separate concepts.
Latent variables are hidden variables that are inferred from directly observed variables and are used to reduce the dimensionality (number of features) of data. Dimensionality reduction, or feature extraction, is the transformation of data from a high-dimensional space to a low-dimensional space to get closer to its intrinsic dimension. Dimensionality reduction may be used for brain-computer interfaces, neuro-informatics, speech recognition, genomics, signal processing, neurotech, bioinformatics, and neurocomputing technologies.
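As a rough illustration of dimensionality reduction, the sketch below uses principal component analysis (PCA), a common technique that the study is not stated to use, to project synthetic three-dimensional points that mostly vary along one hidden direction down to a single dimension. The data, the random seed, and the helper name `pca_reduce` are assumptions made for this example.

```python
import numpy as np

def pca_reduce(data, n_components):
    """Project rows of `data` onto the top principal components (illustrative helper)."""
    centered = data - data.mean(axis=0)
    # SVD of the centered data; rows of vt are principal directions, sorted by variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    reduced = centered @ vt[:n_components].T
    return reduced, explained

rng = np.random.default_rng(0)
# Synthetic data: 200 points in 3-D that mostly vary along one hidden direction.
hidden = rng.normal(size=(200, 1))
direction = np.array([[2.0, -1.0, 0.5]])
data = hidden @ direction + 0.05 * rng.normal(size=(200, 3))

reduced, explained = pca_reduce(data, n_components=1)
print(reduced.shape)       # (200, 1)
print(explained.round(3))  # share of variance captured by each component
```

Because the points lie close to a line, the first component captures nearly all of the variance, so one number per point stands in for three with little loss.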
The team built a module called concept whitening that is inserted into a neural network and constrains the latent space so that its axes are aligned with concepts. In data science, whitening refers to the pre-processing step in machine learning where the covariance matrix of random input vectors is transformed by a linear transformation to become the identity matrix. Concept whitening shapes the latent space via training, and for a given layer of the network, it shows how a concept is represented. This approach is more flexible than providing complete details of each computation, which requires more constraints, and it can be used in any layer of the neural network without impacting prediction accuracy.
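To make the idea of whitening concrete, here is a minimal sketch of one standard variant, ZCA whitening, applied to synthetic correlated data. It illustrates only the general pre-processing step described above, not the researchers' module. The helper name `whiten`, the mixing matrix, and the small constant `eps` are assumptions made for this example.

```python
import numpy as np

def whiten(data, eps=1e-8):
    """ZCA-whiten rows of `data` so the sample covariance is approximately identity."""
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / (len(data) - 1)
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # W = V diag(1/sqrt(lambda)) V^T is the inverse square root of the covariance.
    whitening_matrix = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ whitening_matrix, whitening_matrix

rng = np.random.default_rng(0)
# Correlated synthetic features; the mixing matrix is arbitrary.
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, -0.3, 0.7]])
data = rng.normal(size=(1000, 3)) @ mixing.T

whitened, W = whiten(data)
print(np.cov(data, rowvar=False).round(2))      # correlated: off-diagonals are nonzero
print(np.cov(whitened, rowvar=False).round(2))  # close to the identity matrix
```

After whitening, every feature has unit variance and no feature is correlated with any other, which is what it means for the covariance matrix to become the identity.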
What further differentiates the Duke researchers’ solution from other approaches is that their “whitening matrix is multiplied by an orthogonal matrix and maximizes the activation of known concepts along the latent space axes.” The researchers optimize the orthogonal matrix using Cayley transform-based curvilinear search algorithms in their concept whitening module.
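The reason a Cayley transform is useful here is that it turns a skew-symmetric matrix into an orthogonal one, so each update to the rotation stays orthogonal. The sketch below shows one such update in the general style of Cayley-based curvilinear search. It is not the researchers' implementation: the helper name `cayley_step`, the step size, and the random placeholder gradient `G` are all assumptions made for illustration.

```python
import numpy as np

def cayley_step(Q, G, step_size):
    """One Cayley-transform update that keeps Q orthogonal (illustrative helper).

    Q: current orthogonal matrix; G: gradient of some objective with respect to Q.
    """
    # A is skew-symmetric (A.T == -A), which is what keeps the update orthogonal.
    A = G @ Q.T - Q @ G.T
    identity = np.eye(Q.shape[0])
    # Solve (I + step/2 * A) Q_new = (I - step/2 * A) Q rather than forming an inverse.
    return np.linalg.solve(identity + 0.5 * step_size * A,
                           (identity - 0.5 * step_size * A) @ Q)

rng = np.random.default_rng(0)
Q = np.eye(3)                  # start from the identity rotation
G = rng.normal(size=(3, 3))    # placeholder gradient, random for illustration only
Q_new = cayley_step(Q, G, step_size=0.1)

print(np.allclose(Q_new @ Q_new.T, np.eye(3)))  # True: Q_new is still orthogonal
```

Multiplying whitened data by an orthogonal matrix only rotates it, so its covariance stays at the identity. That leaves the rotation free to be chosen so that known concepts line up with the axes.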
According to the researchers, their concept whitening module can replace the batch normalization step in convolutional neural networks (CNNs). The module is easy to use, given that it needs only one additional epoch of training. Its advantage over batch normalization is superior interpretability, with accuracy on par with standard convolutional neural networks.
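For a sense of how batch normalization differs from whitening, the sketch below applies only the standardization at the core of batch normalization, leaving out its learnable scale and shift, to two synthetic correlated features. The data and variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two strongly correlated synthetic features.
x = rng.normal(size=(2000, 1))
features = np.hstack([x, 0.8 * x + 0.2 * rng.normal(size=(2000, 1))])

# Batch-norm-style standardization: each feature gets zero mean and unit variance.
standardized = (features - features.mean(axis=0)) / features.std(axis=0)

correlation = np.corrcoef(standardized, rowvar=False)
print(standardized.std(axis=0).round(3))  # each feature now has unit variance
print(correlation.round(2))               # off-diagonal entries remain far from zero
```

Standardizing each feature separately leaves the correlation between features untouched, whereas whitening also removes it.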
Machine learning algorithms in artificial intelligence share a fundamental problem: no one really knows exactly what led to their predictions and decisions. This inherent opacity in AI’s black box presents potential ethical, safety, and security issues, especially in areas such as autonomous vehicles, radiology, and medical diagnostics. With this new proof of concept of adding a concept whitening module to a convolutional neural network, there is now a new way to improve the transparency of AI machine learning without sacrificing prediction accuracy.
Copyright © 2021 Cami Rosso All rights reserved.