
In the quest to understand how our brains process and adapt to complex environments, computational models have long served as powerful tools for simulating neural circuits. This has certainly been the case for vision, starting with Hubel and Wiesel's receptive field models and extending to state-of-the-art deep neural network models that capture primate core object recognition. To go beyond recognition and into higher cognitive capacities, including reasoning and inference, multiple groups have been using recurrent neural networks (RNNs), which can solve tasks that require maintaining and manipulating information over time. These approaches have taught us a great deal about population coding and about how various computational processes, such as short-term memory and integration, can be instantiated in neural hardware. Our lab has been interested in the brain's inductive biases: the built-in assumptions that allow us to learn rapidly and adapt to complex environments. This interest has led us to ask why the brain is wired the way it is. Why has evolution decided that we should have a thalamus in the middle of the forebrain, reciprocally connected to all cortical areas, including the reasoning and inferential machine we call the prefrontal cortex?

These questions have led us on a largely empirical journey over the last decade, recording from the thalamus and prefrontal cortex in animals solving various tasks. In collaboration with a number of colleagues, we have also tried to summarize our results and our understanding of them in the form of neural models. I would like to highlight the latest of these: the paper "Rapid contextual inference by a thalamocortical model using recurrent neural networks," led by Wei-Long Zheng, a former postdoc in the lab and now a professor at Shanghai Jiao Tong University.

Wei-Long's paper, published in Nature Communications, uses a biologically inspired architecture that models the prefrontal cortex as an RNN and the mediodorsal thalamus as a single-layer feedforward network. The resulting hybrid network also includes several biologically inspired mechanisms, such as convergent inputs, a special cortico-thalamic learning rule, and a winner-take-all mechanism in the thalamus (to capture the function of the thalamic reticular nucleus). All told, this hybrid structure outperforms standard RNNs on a number of decision-making tasks, mainly in its ability to infer that the task has changed and to solve multiple tasks sequentially without interference. The thalamus-like feedforward network partitions the prefrontal cortex, reducing interference across task representations. There was very nice press coverage of this by Shanghai Jiao Tong University, which I encourage people to read: https://news.sjtu.edu.cn/jdzh/20241008/202588.html (it's in Mandarin, but Google Translate does a very good job). I should also emphasize that this paper was possible because of the very talented collaborator we had, Robert Yang, who was amazing to work with and contributed many insights throughout. The two trainees involved, Zhongxuan Wu (student at UT Austin) and Ali Hummos (postdoc at MIT), did a terrific job helping Wei-Long with this study.
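To make the architecture concrete, here is a minimal sketch of a PFC-like RNN whose dynamics are gated by a thalamus-like single-layer feedforward readout with winner-take-all competition. All dimensions, weight scales, the number of surviving thalamic units, and the multiplicative gating form are my own illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper)
n_pfc, n_md, n_in = 64, 8, 10

# PFC modeled as a rate-based RNN
W_rec = rng.normal(0, 1 / np.sqrt(n_pfc), (n_pfc, n_pfc))
W_in = rng.normal(0, 1 / np.sqrt(n_in), (n_pfc, n_in))

# MD thalamus: a single feedforward layer reading out PFC activity
W_pfc_md = rng.normal(0, 1 / np.sqrt(n_pfc), (n_md, n_pfc))
# Thalamocortical feedback multiplicatively gates PFC (an assumption)
W_md_pfc = rng.normal(0, 1 / np.sqrt(n_md), (n_pfc, n_md))

def winner_take_all(x, k=2):
    """Keep the k most active MD units and silence the rest,
    a stand-in for thalamic reticular nucleus inhibition."""
    out = np.zeros_like(x)
    top = np.argsort(x)[-k:]
    out[top] = x[top]
    return out

def step(h, u, dt=0.5):
    """One Euler step of the hybrid PFC-MD dynamics."""
    r = np.maximum(h, 0)                  # PFC firing rates
    md = winner_take_all(W_pfc_md @ r)    # thalamic readout + WTA
    gate = 1.0 + W_md_pfc @ md            # context-dependent gain on PFC
    h = h + dt * (-h + np.tanh(gate * (W_rec @ r + W_in @ u)))
    return h, md

h = np.zeros(n_pfc)
for t in range(100):
    u = rng.normal(size=n_in)
    h, md = step(h, u)

print("active MD units:", np.count_nonzero(md))
```

The sparse MD activity acts as a low-dimensional context label: because only a few thalamic units are active at a time, different contexts engage different gains on the PFC recurrent network, which is one way a thalamic signal could partition cortical representations.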

The broader context: task optimized networks as tools in systems neuroscience

The 2024 Nobel Prize in Physics was awarded to John Hopfield and Geoff Hinton for foundational work on neural networks, including associative memory networks and learning methods. These ingredients laid the foundation for many of the neural network tools we have today, including those we use to model the brain. RNNs, with their ability to maintain information over time, have been critical in advancing AI, from language models to reinforcement learning systems. But their potential in neuroscience is equally profound: they are uniquely positioned to answer 'why' questions, such as the fundamental question of why the brain is wired the way it is.

The narrower context: RNN-based PFC-thalamus models

Wei-Long's paper follows and builds on two earlier papers from our lab that I would like to highlight. The first is Rikhye, Gilra and Halassa (2018), where Aditya Gilra developed a similar model (in spirit) and showed, for the first time, a role for the thalamus in partitioning the PFC and mitigating catastrophic forgetting. The second is Hummos et al. (2022), who derived the cortico-thalamic learning rule and showed that it could compress cortical activity into a task-context signal carried by the thalamus. Importantly, the Hummos model was not only capable of solving complex human tasks, but also of reproducing neural activity patterns seen in the scanner.
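As a rough illustration of how such a rule can compress cortical activity into a thalamic context signal, here is a generic competitive Hebbian-style sketch. This is a textbook stand-in, not the actual rule derived in Hummos et al. (2022): the thalamic unit responding most strongly to the current cortical pattern pulls its input weights toward that pattern, so recurring cortical contexts become associated with distinct thalamic units.

```python
import numpy as np

rng = np.random.default_rng(1)

n_pfc, n_md = 64, 4
W = rng.normal(0, 0.01, (n_md, n_pfc))   # cortico-thalamic weights

def update(W, pfc, lr=0.05):
    """Competitive Hebbian-style step: the winning MD unit's
    weights move toward the current PFC activity pattern."""
    md = W @ pfc
    win = int(np.argmax(md))
    W[win] += lr * (pfc - W[win])
    return W, win

# Two artificial 'contexts' as distinct PFC patterns plus noise
ctx = [rng.normal(size=n_pfc), rng.normal(size=n_pfc)]
for t in range(200):
    pfc = ctx[t % 2] + 0.1 * rng.normal(size=n_pfc)
    W, win = update(W, pfc)

# One more explicit step to show the weights converge on the pattern
pfc = ctx[0]
w_before = W.copy()
W, win = update(W, pfc)
```

Each update is a convex step toward the presented pattern, so the winner's weights strictly approach the cortical activity that drives it; over many presentations, each recurring context is summarized by one thalamic unit's activity.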

A place for NeuroAI in systems neuroscience

The intersection between biological and artificial intelligence is quite exciting. Many institutions are now investing in what some call "NeuroAI," with dedicated training programs that teach computer scientists about the brain and neuroscientists about computational methods. The future is quite exciting, and just as causal tools (e.g., optogenetics) have become a hallmark of systems neuroscience studies (particularly in small animals), NeuroAI may well become an equally prevalent approach at the interface between systems and cognitive neuroscience.

References:

  • Rikhye RV, Gilra A, Halassa MM (2018). “Thalamic regulation of switching between cortical representations enables cognitive flexibility.” Nature Neuroscience.
  • Hummos A, et al. (2022). "Thalamic regulation of frontal activity in human decision making." PLoS Computational Biology.
  • Zheng W-L, et al. (2024). "Rapid contextual inference by a thalamocortical model using recurrent neural networks." Nature Communications.