Saturday, November 14, 2009

Review of Towards a Mathematical Theory of Cortical Micro-circuits

This is a report on my first reading of "Towards a Mathematical Theory of Cortical Micro-circuits" [Dileep George & Jeff Hawkins, PLoS Computational Biology, October 2009, Volume 5, Issue 10]. There were some bits of it that I had to treat as black boxes, but on the whole I found it comprehensible.

The core concern of the paper is to map a particular computer implementation of the Memory-Prediction framework of human intelligence onto what is known about how nerve cells actually function in the cortex of the human brain. The specific machine methodology is called Hierarchical Temporal Memory (HTM). The article begins by explaining how HTM operates, then proceeds to map HTM processes to the cortex.

HTM structures are called nodes, but it should be noted that the nodes have extensive internal structure. They can have multiple inputs and outputs. They can have multiple "coincidence pattern matrices" and Markov chain processes. There are four major sets of equations needed to calibrate the nodes. I suggest you take a look at the HTM Technology Overview if you have not already. In particular, the authors assume the readers understand how Markov chains work, which is not obvious in this context.
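To make the internal structure concrete, here is a rough sketch of what a node might look like in code. The names, shapes, and pooling rule are my own illustration, not the paper's actual equations:

```python
import numpy as np

# Hypothetical sketch of an HTM node's internals; the names and the
# max-pooling rule are illustrative, not the paper's actual equations.
class HTMNode:
    def __init__(self, coincidences, markov_groups):
        # Each row of the coincidence matrix is one learned pattern of
        # child-node outputs (a "coincidence").
        self.coincidences = np.asarray(coincidences, dtype=float)
        # Each Markov group is a set of coincidence indices that tend to
        # follow one another in time (transition matrix omitted here).
        self.markov_groups = markov_groups  # list of index lists

    def feed_forward(self, child_output):
        # Bottom-up step: score each coincidence against the input...
        scores = self.coincidences @ child_output
        # ...then pool scores within each Markov group; the vector of
        # group beliefs is the node's output to its parent.
        return np.array([scores[idx].max() for idx in self.markov_groups])
```

Even in this toy form you can see why a node is far more than a point in a graph: it stores patterns, groups them temporally, and transforms messages on the way up.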

It should be noted that HTM nodes are typically arranged in a hierarchy, and that information can move both up and down the hierarchy. The bottom of the hierarchy would typically correspond to sensory input, the top of the hierarchy to conceptual models.
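The upward half of that flow can be caricatured as a stack of pooling steps, each level abstracting its input a little further. This is purely schematic (real HTM nodes exchange belief-propagation messages in both directions):

```python
# Toy illustration of information flowing up a node hierarchy: each
# level pools its children's outputs and passes a summary upward.
# Purely schematic; the functions and data here are made up.
def feed_up(levels, sensory_input):
    """levels: list of pooling functions, bottom level first."""
    signal = sensory_input
    for pool in levels:
        signal = pool(signal)  # each level abstracts its input further
    return signal

# Example: three levels that progressively summarize a "sensory" list.
result = feed_up(
    [lambda xs: [max(xs[i:i + 2]) for i in range(0, len(xs), 2)],  # local pooling
     lambda xs: [sum(xs)],                                         # global pooling
     lambda xs: "object" if xs[0] > 1 else "nothing"],             # top-level label
    [0, 1, 1, 0],
)
```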

In fact, I found it easier to understand how the HTM worked internally when the authors started explaining the details in terms of neuron processes. The Markov chain likelihood circuit (Figure 4), in addition to mapping neurons to an aspect of an HTM node, makes it clearer that a Markov chain, in this context, is a set of information that has (or could have) predictive value. Markov chains are learned expectations about probabilities that events will occur in a time sequence.
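A two-state example shows the sense in which a Markov chain is both stored memory and a prediction machine: the transition matrix is a learned record of which event tends to follow which, and multiplying a current belief by it yields a one-step prediction. The states and probabilities below are made up for illustration:

```python
import numpy as np

# A Markov chain as a learned temporal expectation. The transition
# matrix is "memory"; matrix multiplication is prediction.
transitions = np.array([
    [0.1, 0.9],   # after state A: rarely A again, usually B
    [0.8, 0.2],   # after state B: usually back to A
])

belief = np.array([1.0, 0.0])      # current belief: we are in state A
prediction = belief @ transitions  # predicted distribution over the next state
# prediction is [0.1, 0.9]: the chain "expects" state B next
```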

This is a good example of a premise of the Memory-Prediction framework. The Markov chains are a sort of memory, yet they also are used to process information to make predictions. In computer science we tend to separate out memory from logical operations, but in the brain they are mostly inseparable.

As to the nervous system side of the mirror, I was not surprised to see a lot of the more complex work being performed by pyramidal cells. This neuron type appears to be quite differentiated from the more ordinary neurons that mainly relay information between muscles or sensory cells and the brain. Their very size and location in the cortex (particularly in layers 2, 3, and 5) should arouse anyone's curiosity. Whether they really play the roles assigned to them here is not generally known, but hypotheses are made that should be testable.

The paper also covers the use of HTMs to recognize visual objects. Since much of the work done by Numenta and others using HTMs is in this field, I won't comment further here except to say that I found the work on the subjective contour effect particularly intriguing (example of subjective contours: Kanizsa triangle).

Merely categorizing visual objects is a more difficult problem than computer scientists working on Artificial Intelligence (AI) originally thought. It is not the successes of HTMs at pattern recognition that excite me. It is the way it is done, and the promise that implies. I want to understand how the brain works in detail, but the ultimate goal is to understand intelligence, awareness, and understanding itself. The way HTMs process information, including the flow of information both up and down the hierarchy, seems like a necessary condition for attacking these problems of higher intelligence. I don't think anyone thinks HTMs will prove to be entirely sufficient, but they seem like a good starting place for further physical and philosophical investigations.