
A Thousand Brains Theory: A Review

A new theory of intelligence and how the neocortex works.

Christophe Pere
Towards Data Science
8 min read · Sep 26, 2021


Introduction

For a long time, I have been following Numenta, a startup of neuroscientists whose goal is to understand the neocortex and reproduce its mechanisms in learning algorithms.

The founder, Jeff Hawkins, wrote the book A Thousand Brains: A New Theory of Intelligence. An attractive title for someone like me who loves neuroscience and artificial intelligence. In the book, the author gives a history of theories about the brain and intelligence, and explains through anecdotes and personal experiences how he arrived at his own theory.

His hypothesis is interesting and challenges the way we design today's machine learning algorithms:

The hypothesis I explore in this chapter is that the brain stores all knowledge using reference frames, and thinking is a form of moving. Thinking occurs when we activate successive locations in reference frames. p.71

In this blog post, I will only talk about the first two parts of the book: 1) A New Understanding of the Brain, and 2) Machine Intelligence. The third part is about Human Intelligence, but I'm not going to cover it; I think readers should form their own opinion about that part rather than rely on the biased view I would give them.

Let’s go.

A New Understanding of the Brain

The author builds his theory on Mountcastle (1978)[1]. In that paper, Mountcastle proposes that the whole neocortex runs a single, general cortical algorithm, without going into the details of what that algorithm is.

The neocortex can be seen as a tensor: the same computational unit cloned over a hundred thousand times. Indeed, the neocortex contains about 150,000 cortical columns (think of each one as a neural layer containing a fixed number of neurons). Each cortical column is composed of 100 mini-columns. The best way to visualize this architecture is to think of a box of spaghetti: the box is the cortical column, and the strands of spaghetti are the mini-columns.

Well, now you have a grasp of what a cortical column is. Next, imagine that a cortical column is a Lego block: to build the neocortex, you place many cortical columns against each other and connect them to create a 3D network. That's where the beauty lies. Each column is almost identical no matter where you look, apart from a few details, and each column can process any stimulus or sense (hearing, smell, touch, taste, and vision).
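
To make this Lego-block picture concrete, here is a minimal Python sketch of the structure (my own toy illustration; the class names and sizes are simplifications, not Numenta's code):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MiniColumn:
    """One 'spaghetti strand': a thin vertical stack of neurons."""
    n_neurons: int = 100  # illustrative size, not an anatomical constant

@dataclass
class CorticalColumn:
    """One repeated computational unit (the 'spaghetti box')."""
    mini_columns: List[MiniColumn] = field(
        default_factory=lambda: [MiniColumn() for _ in range(100)]
    )

    def process(self, signal):
        # Every column runs the same (unknown) algorithm on any kind of input.
        return f"prediction for {signal!r}"

# The real neocortex clones this unit ~150,000 times; fewer here for the demo.
neocortex = [CorticalColumn() for _ in range(1_000)]
print(neocortex[0].process("touch on fingertip"))
```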

Now, think of an individual column. It can analyze and process the input signal, no matter the type of signal, and so each column can also predict an output. That means 150,000 predictions at every instant: the neocortex predicts thousands of realities in a continuous stream. This is anticipation; consciousness doesn't notice it unless a prediction turns out to be wrong. You learn by analyzing the errors.

But what does the brain learn? If all signals are perceived similarly, and if the brain areas are hard to tell apart, how does the brain learn?

All signals are processed and computed in the same way. So how can the brain determine distances? How does it determine that a sound comes from a particular side? How does it know which motor commands are needed to reach a specific point in space and grasp an object? What does the brain learn from its perception of reality?

We learn a model of the world.

The brain, through its cortical columns, predicts the next few moments in order to adapt gradually. Learning emerges from the mistakes it makes, which provide the updates to its model of the world.

The brain also has to make assumptions about what it perceives. The environment that surrounds it carries no labels. It must first assume, in an unsupervised way, emitting hypotheses and validating them afterwards, the outcome acting as the label.

The neocortex learns the shapes, or global characteristics, of objects. This allows it to recognize them regardless of additions or modifications to their features. It therefore learns rules.

It must also learn mechanisms for retrieving information. The information it learns cannot all be permanently available; the constant flow of information would simply overwhelm it. It must therefore learn by association: a smell will recall a memory, a texture brings back images, and so on.

The neocortex therefore acts like a knowledge graph. A particular input gives access to the memories associated with a similar event. I visualize it as a database of understanding.
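
As a rough picture of that "database of understanding" (my own toy sketch, not how the book formalizes it), retrieval by association could look like this: any cue stored with an event can later bring the whole event back.

```python
from collections import defaultdict

# Toy associative memory: every cue stored with an event can later retrieve it.
memory = defaultdict(set)

def store(event, cues):
    for cue in cues:
        memory[cue].add(event)

def recall(cue):
    return memory.get(cue, set())

store("grandmother's kitchen", cues=["cinnamon smell", "checkered tablecloth"])
store("a summer at the beach", cues=["salt smell", "rough sand texture"])

print(recall("cinnamon smell"))  # {"grandmother's kitchen"}
```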

Unlike current mainstream machine learning methods, the brain must learn continuously, and in a doubly dynamic space: the world moves, and the brain moves within it. We could see it as continuous reinforcement learning, or continuous exploration.

This dynamism allows the neocortex to create connections between events and sensations, and to learn the effects of the movements it generates. This is sensorimotor learning.

The brain needs to predict the continuous flow of events to be able to move and act in the environment. But it needs to make two types of predictions: one about the environment itself, and one about its own position within that environment.

So the brain learns and creates reference frames tied to location. A reference frame is like a grid laid over the environment and its objects. The brain does not store an image of each thing but a representation of its points of interest. It's as if you had multiple mesh grids of the world in your head, and each cortical column can learn hundreds of objects.

For the brain, the world is a sequence of memories (dynamics). Location is needed to associate position with memory, because it is what allows you to find your way and move.
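
A minimal sketch of what storing an object in a reference frame might look like (my simplification of the idea, not Numenta's actual implementation): an object is a set of features at locations relative to the object itself, and recognition is checking whether what is sensed at the sensed locations fits a stored model.

```python
# Toy reference frame: each object maps a location (relative to the object)
# to the feature expected at that point of interest.
coffee_mug = {
    (0, 0, 0): "flat bottom",
    (5, 0, 4): "curved handle",
    (0, 0, 10): "circular rim",
}

def consistent_with(model, observations):
    """Do the sensed (location, feature) pairs fit the stored object model?"""
    return all(model.get(loc) == feat for loc, feat in observations)

# Moving a finger over an unknown object and sensing two points:
observed = [((0, 0, 0), "flat bottom"), ((5, 0, 4), "curved handle")]
print(consistent_with(coffee_mug, observed))  # True -> the mug model stays a candidate
```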

A rough current equivalent in the AI domain is the RNN family and attention mechanisms. The world model created by the neocortex is a combination of sensory input, reference frames, and location. The neocortex isn't responsible for movement and map creation on its own; it relies on the old brain for that, mostly the hippocampus and the entorhinal cortex.

Each neuron searches for the corresponding map among hundreds, working as an associative memory with the help of its dendrites. In the quote given in the introduction, the author hypothesizes that knowledge is encoded in reference frames, and that location, i.e. the ability to connect the dots, creates thinking. Thinking is the selection of the corresponding cortical column. This is possible because reference frames learn a representation of the world, not just of objects; location allows the cortical columns to choose the corresponding map.

Where does the name of the theory come from? As explained before, the neocortex is built from cortical columns, themselves built from mini-columns, and each column models the world in parallel: hence, a thousand brains.

In this theory, cortical columns are largely similar but can each retain only a fixed amount of knowledge. So knowledge is spread across the 150,000 columns, with overlap: if one part is injured, the information isn't lost.

But there is a problem here. If hundreds of columns retain the same information, or a version of it, how does the brain perceive only one thing? This is called the binding problem.

How could the cortical columns resolve the binding problem? The author's simple answer is: by voting. As I understand it, each column predicts a piece of information with some probability, and if a hypothesis reaches a majority, that reference frame is selected.

The magic part is that the neurons providing the vote are always the same within the cortical columns: every time a vote is needed, the same neurons are activated across cortical columns.

If two maps happen to have the same probability of appearing, the neurons must still reach a consensus; this is why attention is mandatory.
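
Here is how I picture the voting step (a toy sketch of my reading of the mechanism, not the author's implementation): each column outputs its best candidate, and the majority becomes the single percept.

```python
from collections import Counter

# Each cortical column makes its own guess from its own partial evidence.
column_votes = ["coffee mug"] * 80 + ["soup bowl"] * 15 + ["flower pot"] * 5

def resolve_binding(votes):
    """Return the single percept the columns agree on (simple majority)."""
    (winner, count), = Counter(votes).most_common(1)
    # No clear majority would call for more evidence, i.e. attention.
    return winner if count > len(votes) / 2 else None

print(resolve_binding(column_votes))  # coffee mug
```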

Machine Intelligence

Well, a great title. But there is no intelligence in artificial intelligence; in the current methods used in AI, there is only the A.

The future of AI will be based on principles that mimic the brain. p.117

Why?

We, as humans, learn continuously; our brain updates its world model every second. Current neural networks need to be fully trained in one go and struggle to learn new things without losing too much information when retrained (a problem called catastrophic forgetting[2]). A new field called Continual Lifelong Learning[3][4] uses a continuous stream of data to train a neural network without losing what it already knows.
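
A minimal sketch of one common mitigation, rehearsal with a replay buffer (my own illustration of the continual learning idea, not a method from the book): keep a capped buffer of old examples and mix them into every new training batch so old tasks keep being refreshed.

```python
import random

REPLAY_CAPACITY = 1_000
replay_buffer = []  # capped store of examples from past tasks

def remember(examples):
    """Add past examples, randomly overwriting old ones once the buffer is full."""
    for ex in examples:
        if len(replay_buffer) < REPLAY_CAPACITY:
            replay_buffer.append(ex)
        else:
            replay_buffer[random.randrange(REPLAY_CAPACITY)] = ex

def make_batch(new_examples, k=32):
    """Mix fresh data with replayed old data before each gradient step."""
    replayed = random.sample(replay_buffer, min(k, len(replay_buffer)))
    return list(new_examples) + replayed
```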

But why is there no I? Because today's systems are specialized to do one thing, while humans can do many things. Current AI is not flexible.

Currently, AI researchers cannot program a system that approaches the intelligence of a five-year-old child. They haven't yet found a way to build everyday knowledge into a machine. This is called the knowledge representation problem, and it is considered the central one.

The brain learns a model of the world, not of words or images. As we saw in the first part, it learns map-like reference frames.

So when can a system be considered intelligent? What are the criteria?

The author provides four criteria to form the basis of machine intelligence.

  1. The machine needs to learn continuously (Continual Lifelong Learning). It needs to learn from its mistakes to update its world model, and to create new connections to acquire new knowledge without replacing or deleting the old ones.
  2. The machine needs to learn through movement (embodiment). Motion leads to location, and the world representation will be biased if movement is left out.
  3. The machine needs to create many models. Each cortical column of the neocortex learns a model of thousands of objects, and the binding problem (a single, unified perception) is resolved by voting. A machine needs to reproduce the same process.
  4. The machine needs to use reference frames to store knowledge. Thinking is a kind of movement: it arises from connecting the dots within reference frames. If a machine cannot use motion, it cannot think. Cortical columns rely on cells such as grid cells and place cells for this (a toy sketch follows this list).
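
Grid cells and place cells encode location with periodic modules at different scales. A toy version of that idea (my own illustration, not the biological model) shows how a few small periodic codes combine into a position that is unique over a much larger range:

```python
# Toy 'grid cell' code: a 1-D position is represented by its phase within several
# modules of different periods; the combination stays unique up to the least
# common multiple of the periods (here 3 * 5 * 7 = 105 cm).
PERIODS = [3.0, 5.0, 7.0]  # arbitrary illustrative spatial scales, in cm

def encode(position_cm):
    return tuple(round(position_cm % p, 2) for p in PERIODS)

print(encode(4.0))   # (1.0, 4.0, 4.0)
print(encode(50.0))  # (2.0, 0.0, 1.0) -- a different position, a different code
```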

When Machines Are Conscious

Consciousness depends on continuously forming memories of our recent thoughts and experiences, and on the ability to play them back.

Awareness seems to be intertwined with consciousness. Memories and attention are its basis: if I can remember what I was doing yesterday and replay each moment, I am aware of them, and I am conscious.

A machine intelligence needs goals and motivations. But these arise in different ways: either they are encoded in the machine (the way our DNA programs our body to eat, breathe, etc.), or they are learned. Machines also need safety measures. For us, one built-in safety measure is that it is impossible to die simply by holding your breath; for a machine, Isaac Asimov's three laws of robotics[5] could be a good start.
Goals and motivations are not a consequence of intelligence, and will not appear on their own.

Conclusion

One downside of the neocortex: its wiring is largely fixed, so if a zone isn't active early on (at a young age), it may never become active. This is why kids who hear many languages when they are young have a better chance of being fluent in multiple languages as adults. The goal is to avoid this limitation in a machine.

The author doesn't talk about quantum processes or quantum phenomena that may be present in the brain (chemical processes or the teleportation of information).

The appearance of more and more powerful quantum computers will probably change how we think about the brain.

A few last words: it's a great book that I recommend to everyone. It gives a new perspective on the neocortex and how we learn, and the author's approach is very interesting, raising ideas and questions all along the way.

What I appreciate even more is that the author proposes his vision, his theory, without criticizing those already in place. Even when he discusses intelligent machines, he only points out limits that are already well established.

References
