(By @Shahab Bakhtiari, @SNAIL, September 30, 2024)
Neural networks, both biological and artificial, don’t always distribute neurons evenly across input features. Some features may be encoded by more neurons than others. Such representational asymmetries are found all over the cortex: cardinal orientations, centrifugal motion stimuli, horizontal disparities, expansive optic flow, etc.
As discussed in a recent paper by Andrew Lampinen and colleagues, several factors can influence this asymmetric representation of features (Figure 1). These include the ease of extracting a feature from the input, its prevalence in the training data, and the order in which features are learned. Of these, the effect of feature prevalence in the training data has been studied most extensively.
Figure 1. The effect of task difficulty and order of learning on biased representations. (From Lampinen et al., 2024)
Now that we know biased representations exist in both the brain and artificial neural networks (ANNs), the next question is: how do these biases influence downstream readout and behavior? In the last figure of their paper, Lampinen et al. explore this question (Figure 2). Interestingly, they show that in a downstream task where both over- and under-represented features are equally and highly predictive of the output label, the readout relies primarily on the over-represented feature to solve the task. This happens only when the features predict the correct output with high probability.
Figure 2. In a binary classification task where both over-represented (here, easy) and under-represented (here, hard) features are equally informative, the downstream readout relies more heavily on the over-represented feature when both are highly predictive of the output. This can be seen in the divergence of the two curves as the probability increases. (From Lampinen et al., 2024)
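To build intuition for this readout effect, here is a minimal toy sketch (my own construction, not Lampinen et al.’s actual setup; the unit counts, noise levels, and predictiveness value are arbitrary assumptions): feature A is carried by four times as many units as feature B, both features are equally and highly predictive of a binary label, and a simple logistic readout is trained on the population response. Conflict trials, where the two features disagree, then probe which feature the readout actually follows.

```python
# Toy sketch: an over-represented feature dominates a learned linear readout.
# All numbers here are illustrative assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_a, n_b = 4000, 80, 20   # feature A is encoded by 4x more units than B

labels = rng.integers(0, 2, n_trials)
p = 0.99                            # both features are highly predictive of the label
feat_a = np.where(rng.random(n_trials) < p, labels, 1 - labels)
feat_b = np.where(rng.random(n_trials) < p, labels, 1 - labels)

def responses(fa, fb):
    """Each unit noisily encodes its feature's value as +1/-1 plus Gaussian noise."""
    n = len(fa)
    return np.hstack([
        (2 * fa[:, None] - 1) + rng.normal(0, 1, (n, n_a)),
        (2 * fb[:, None] - 1) + rng.normal(0, 1, (n, n_b)),
    ])

X = responses(feat_a, feat_b)

# Logistic readout trained by plain gradient descent
w = np.zeros(n_a + n_b)
for _ in range(500):
    prob = 1 / (1 + np.exp(-(X @ w)))
    w -= 0.02 * X.T @ (prob - labels) / n_trials

# Probe with conflict trials: feature A says "1", feature B says "0"
ones, zeros = np.ones(1000, dtype=int), np.zeros(1000, dtype=int)
X_conflict = responses(ones, zeros)
frac_follow_a = ((X_conflict @ w) > 0).mean()
print(f"fraction of conflict trials resolved in favor of feature A: {frac_follow_a:.2f}")
```

Because many units redundantly carry feature A, the readout can average away their noise, and the learned solution resolves conflicts between the two features largely in favor of A.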
But does this over-reliance on over-represented features also occur in the brain? In our recent paper, we explored this very question: we investigated how cortical representational asymmetry (or bias) influences visual perception and learning. We studied visual learning in an optic flow discrimination task, building on extensive research on area MST of the monkey visual cortex, an area responsible for processing complex motion. This body of research has shown that expansive optic flow (the kind experienced when moving straight ahead) is over-represented in area MST: more neurons in this area prefer expansive optic flow. Interestingly, when we tested human observers on a noisy version of the optic flow stimuli, they tended to report the absence of expansive optic flow in favor of contractive flow, despite the former being the over-represented feature (Figure 4). This puzzling result was ultimately explained by a mechanism similar to Lampinen et al.’s: human visual decisions are primarily driven by neurons tuned to the over-represented feature.
Figure 3. Top: A demonstration of area MST in the monkey visual cortex. Bottom: Schematics of expansive/contractive optic flow. (From Wurtz, 1998)
Figure 4. Perceptual bias measured in humans during the optic flow discrimination task. Across all three different stimulus durations, human observers showed a consistent bias toward contractive optic flow. (From Laamerad et al., 2024)
So, how does this over-reliance on the over-represented feature explain our observed behavioral bias? Under natural conditions with a high signal-to-noise ratio (SNR), in which visual representations and their readouts primarily develop, human behavior, like the readout of ANNs (à la Lampinen et al.), relies mostly on the over-represented feature (here, neurons that encode expansive optic flow). However, in lab conditions, where we test humans on a low-SNR version of the optic flow stimuli, this same strategy backfires and leads to the observed behavioral bias (more details in our paper).
Figure 5. The binary classification task used for the simulations in Figure 16 of Lampinen et al. The latent representations of a trained ANN are used in a classification scenario where both over-represented and under-represented features are predictive of the output with high probability. The trained classifier is then tested on the same task, but with the two features being predictive of the output with different probabilities. As the test probability (i.e., feature signal-to-noise ratio) decreases, the output exhibits a bias toward the absence of the overrepresented feature.
Does the same behavioral (or output) bias also appear in the model and tasks used by Lampinen et al.? I decided to investigate: I implemented a simplified version of our experimental condition using their task. In the training condition (analogous to the natural developmental condition of the visual system), I set the output category to be highly and equally predictable from both the over- and under-represented input features. Training the downstream classifier on this condition led the model to rely heavily on the over-represented feature, as Lampinen et al. had shown. Then, I simulated our lab experiment condition by testing the same trained model in a scenario where both features were predictive of the output category with low probability. This revealed exactly the bias we anticipated from our paper (Figure 5): an output biased towards the absence of the over-represented feature!
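The logic of this train/test mismatch can be sketched in a few lines of NumPy (a simplified stand-in for the actual simulations, not the model from either paper; the detection-style tuning, unit counts, and SNR values are my assumptions): train a logistic readout on a population where the over-represented category drives more units, then test the same readout at a much lower SNR.

```python
# Toy sketch: a readout trained at high SNR, with one category over-represented,
# becomes biased toward the OTHER category when tested at low SNR.
# All parameter values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_a, n_b = 8, 2                               # "expansion" units over-represented 4:1

def make_trials(n, snr):
    labels = rng.integers(0, 2, n)            # 1 = expansion shown, 0 = contraction
    a_drive = snr * labels[:, None]           # expansion units fire for expansion
    b_drive = snr * (1 - labels)[:, None]     # contraction units fire for contraction
    X = np.hstack([
        a_drive + rng.normal(0, 1, (n, n_a)),
        b_drive + rng.normal(0, 1, (n, n_b)),
    ])
    return X, labels

# Train a logistic readout (with intercept) at high SNR
X_tr, y_tr = make_trials(4000, snr=1.0)
w, b = np.zeros(n_a + n_b), 0.0
for _ in range(2000):
    prob = 1 / (1 + np.exp(-(X_tr @ w + b)))
    err = prob - y_tr
    w -= 0.1 * X_tr.T @ err / len(y_tr)
    b -= 0.1 * err.mean()

train_acc = (((X_tr @ w + b) > 0) == (y_tr == 1)).mean()

# Test the same readout at much lower SNR
X_te, y_te = make_trials(2000, snr=0.3)
frac_expansion = ((X_te @ w + b) > 0).mean()
print(f"train accuracy at high SNR: {train_acc:.2f}")
print(f"fraction of 'expansion' reports at low SNR: {frac_expansion:.2f} "
      f"(true fraction of expansion trials: {y_te.mean():.2f})")
```

Because more units signal expansion, the learned intercept is negative: the readout demands strong aggregate evidence before reporting expansion. When the test-time signal weakens, the decision variable shrinks toward that negative intercept, and the readout defaults to reporting contraction, i.e., the absence of the over-represented feature.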
This exemplifies the shared principles that govern both artificial neural networks (ANNs) and biological brains. Certain features are over-represented in the brain, meaning that a greater number of neurons respond to these features when they are present in the input. This over-representation leads to a significant reliance on these neurons in behavior (or downstream readout) when their preferred feature is highly predictive of the decision. In essence, the brain transforms a discrimination task (i.e., distinguishing between two conditions) into a detection task of a single over-represented feature. While this strategy, as shown in our paper, can result in biased behavior under low SNR conditions, such scenarios are relatively infrequent in our daily experiences.
Given these findings, and considering the prevalence of asymmetric representations in the cortex, I speculate that the ANNs most similar to the brain are those that capture the same representational asymmetries and biases. Or at least, this is an additional, complementary alignment criterion that should be included in our alignment toolbox. While current representational similarity analyses consider two neural networks similar if the relative distances between stimuli in their representational spaces are comparable, I propose that we should also account for the representational asymmetries of the networks. As demonstrated above, these asymmetries significantly influence behavior in various situations, and may even lead to behavioral bias.
There are two nuances related to the concept of alignment based on representational asymmetry that must be considered:
First, as Lampinen et al. discuss and Andrew highlights in this Twitter/X thread, representational similarity analysis (RSA)—or at least some of its metrics—can be quite sensitive to representational asymmetry and the over-representation of certain features. However, it’s still unclear which specific RSA metrics are most affected and how we can separate the impact of representational asymmetry from other factors that influence representational similarity.
I ran a simple simulation of a population of orientation-tuned neurons. The results show that Representational Dissimilarity Matrices (RDMs; computed with Euclidean distance) for populations with symmetric versus asymmetric representations of orientation can differ substantially. This difference reflects how much the representational space is warped by the over-represented orientation (Figure 6). Therefore, while RDMs, as typically calculated in neuroscience and deep learning, are sensitive to representational asymmetry (as also expected from the previously shown relationship between RDMs and tuning functions), it is not entirely clear how such biases affect comparisons between the RDMs of different networks.
Figure 6. Top: Representational Dissimilarity Matrices for an unbiased (left) and a biased (right) population of orientation-tuned neurons. In the simulated biased population, 70% of neurons were tuned to the 45-degree orientation. Bottom: Multidimensional scaling (MDS) of the two RDMs, which shows the warping of the representational space towards the over-represented orientation.
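A sketch of this kind of simulation (a reimplementation from the description above; the von Mises tuning form and its parameters are my assumptions, not necessarily those behind Figure 6):

```python
# Toy sketch: Euclidean-distance RDMs for an unbiased vs. a biased
# population of orientation-tuned neurons (70% of biased units prefer 45 deg).
# Tuning-curve form and parameters are illustrative assumptions.
import numpy as np

def tuning(prefs, stims, kappa=2.0):
    """von Mises tuning over orientation (180-deg period).
    Returns an (n_stims, n_neurons) response matrix."""
    d = np.deg2rad(2 * (stims[:, None] - prefs[None, :]))  # double angle for 180-deg period
    return np.exp(kappa * (np.cos(d) - 1))

stims = np.arange(0, 180, 15.0)    # probe orientations
n = 200

# Unbiased population: preferred orientations tiled uniformly
prefs_unbiased = np.linspace(0, 180, n, endpoint=False)
# Biased population: 70% of neurons prefer ~45 deg
rng = np.random.default_rng(0)
n_bias = int(0.7 * n)
prefs_biased = np.concatenate([
    rng.normal(45, 5, n_bias) % 180,
    np.linspace(0, 180, n - n_bias, endpoint=False),
])

def rdm(prefs):
    R = tuning(prefs, stims)
    diff = R[:, None, :] - R[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))    # Euclidean-distance RDM

rdm_u, rdm_b = rdm(prefs_unbiased), rdm(prefs_biased)

# Distances involving stimuli near the over-represented orientation inflate
near = np.abs(stims - 45) <= 15
print(f"mean distance in rows near 45 deg, unbiased: {rdm_u[near].mean():.2f}")
print(f"mean distance in rows near 45 deg, biased:   {rdm_b[near].mean():.2f}")
```

In the biased population, distances involving stimuli near 45 degrees are inflated relative to the unbiased population, which is the warping that an MDS projection makes visible.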
Second, if we agree that the most aligned ANN model of the brain should capture the same representational asymmetries (or at least that this criterion should be included in our alignment analysis), then studies like Lampinen et al. offer insights into the factors that must be considered when building these models. As explained by Andrew and colleagues, the over-representation of a feature is influenced by the difficulty of extracting that feature from the input, and by the training regimen of the network, including the temporal order of tasks and the statistical properties of the input. These factors must be taken into account when building an ANN that closely aligns with the brain based on representational asymmetry. The repertoire of tasks, with varying levels of input-output complexity, should not only match between brains and ANNs but also respect the order in which they are acquired by the brain through development and/or evolution. Additionally, the environmental statistics, and how they are shaped by the movement of the animal within the environment, affect representational asymmetries and must be similar between the two networks. If representational asymmetry is key to brain-ANN alignment and is significantly influenced by these factors, building the most aligned ANN to the brain will require an immense amount of work, meaning we won’t be out of a job anytime soon.
Thanks to Andrew Lampinen, Patrick Mineault, and Chris Pack for their comments on the initial version of this post.