Principles Of Hierarchical Temporal Memory - Foundations Of Machine Intelligence
Created: November 15, 2015 / Updated: November 2, 2024 / Status: finished / 4 min read (~741 words)
- The cortex uses a common learning algorithm
- The cortical algorithm is incredibly adaptable
- It benefits from network effects: hardware and software efforts will focus on the most universal/generic solution
Sensory systems:
- Retina
- Cochlea
- Somatic
All of these produce patterns of action potentials: the firing of neural fibers.
The brain deals with patterns.
The neocortex learns a model from fast-changing sensory data. With this model, it can generate
- predictions
- anomalies
- actions
The neocortex learns what is called a sensory-motor model of the world.
The neocortex is a remarkably uniform sheet of cells.
It is organized as a hierarchy.
Within a level of the hierarchy are cellular layers.
Within those layers, cells are organized into mini-columns.
At the lowest level are individual neurons.
Learning is about modifying synaptic weights, but also about the formation and destruction of synaptic connections.
- Hierarchy of identical regions
- Each region learns sequences (time-based patterns)
- Stability increases going up the hierarchy if input is predictable
- Sequences unfold going down
- What does a region do?
- What do the cellular layers do?
- How do neurons implement this?
- How does this work in hierarchy?
The neocortex has 6 cellular layers: feedforward information flows through layers 2, 3 and 4, and feedback through layers 5 and 6.
Each layer implements a variation of a common sequence memory algorithm.
Layers 2-3-4 are doing inference
Layer 5 is doing motor behavior
Layer 6 is doing attention
The input to a particular region arrives at layer 4 (L4) and then projects to L3, which in turn projects to the next region up the hierarchy.
Motor behavior is passed in at the same time.
Thus, the region receives both the information that is perceived and the recent behaviors of the body.
L4: Learns sensory-motor sequences
If the layer can properly predict the sequence, it forms a stable representation that is passed on to L3.
If it is unable to predict, it passes the change through to L3.
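A minimal sketch of this L4 rule, with hypothetical names and a toy "melody" stream (none of this is Numenta code):

```python
def l4_forward(input_pattern, predicted_pattern, stable_code):
    """What L4 passes up to L3 for one time step."""
    if input_pattern == predicted_pattern:  # sequence was predicted
        return stable_code                  # -> stable representation
    return input_pattern                    # -> pass the unpredicted change through

melody = ["C", "D", "E", "C", "D", "E"]
predictions = [None, "D", "E", "C", "D", "E"]  # as if learned on a first pass
for note, predicted in zip(melody, predictions):
    print(note, "->", l4_forward(note, predicted, stable_code="melody-1"))
```

Once the sequence is predictable, L3 sees one unchanging code, which is why stability increases going up the hierarchy.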
L3: Learns high-order sequences
Proximal synapses (the ~10% of synapses close to the cell body):
- Receive feedforward input
- Are summed linearly
- Generate spikes
The remaining, distal synapses lie in 2 regions:
- Basal dendrites (bottom, close to the cell body)
- Apical dendrites (top, far from the cell body)
These dendrites are non-linear.
Dendritic action potentials depolarize the soma.
In the HTM model neuron:
- Feedforward (proximal) synapses: linear summation, binary activation
- Distal synapses: modeled as a set of threshold coincidence detectors
- A detected coincidence puts the cell in a predicted state
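A sketch of this model neuron; the class name and threshold values are my assumptions, not from the talk:

```python
class ModelNeuron:
    def __init__(self, proximal_threshold=15, segment_threshold=10):
        self.proximal = set()   # indices of connected feedforward synapses
        self.segments = []      # distal segments: sets of presynaptic cell ids
        self.proximal_threshold = proximal_threshold
        self.segment_threshold = segment_threshold

    def compute(self, feedforward_bits, prev_active_cells):
        # Proximal dendrite: linear summation of feedforward input,
        # binary activation once the threshold is crossed.
        active = len(self.proximal & feedforward_bits) >= self.proximal_threshold
        # Distal segments: threshold coincidence detectors. Enough active
        # presynaptic cells on one segment fire a dendritic action potential
        # that depolarizes the soma: the cell enters a predicted state.
        predicted = any(len(segment & prev_active_cells) >= self.segment_threshold
                        for segment in self.segments)
        return active, predicted
```

Each segment acts as one coincidence detector, which is how a single cell can recognize many unique patterns, one per dendritic branch.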
Learning is mostly the formation of new synapses.
Synapses are low fidelity:
- Scalar permanence
- Binary weight (connected or not)
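A sketch of this synapse model; the threshold and increment values are illustrative assumptions:

```python
CONNECTED_THRESHOLD = 0.5  # assumed value

def learn(permanence, presynaptic_was_active, inc=0.05, dec=0.03):
    """Reinforce synapses whose presynaptic cell was active; decay the rest."""
    permanence += inc if presynaptic_was_active else -dec
    return min(1.0, max(0.0, permanence))

def weight(permanence):
    """Low fidelity: scalar permanence, but a binary effective weight."""
    return 1 if permanence >= CONNECTED_THRESHOLD else 0
```

A synapse whose permanence crosses the threshold is, in effect, newly formed; one that decays below it is removed.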
Sparse distributed representations (SDRs) are called the language of intelligence:
- Many bits (thousands)
- Few 1's, mostly 0's
- Each bit has semantic meaning
- Learned
- Similarity: shared bits = semantic similarity
- Store and compare: store the indices of active bits; subsampling is OK
- Union membership: can ask "is this pattern part of the union?"
A cell can recognize many unique patterns on a single dendritic branch
A cell activates from dozens of feedforward patterns
It predicts its activity in hundreds of contexts
- Feedforward activation
- Inhibition, which generates sparse cell activation (sketched after this list)
- Formation of connections with nearby cells that were previously active (giving the cell the ability to predict its future activity)
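The inhibition step can be sketched as a local k-winners-take-all; the 2% sparsity is an assumption:

```python
import numpy as np

def inhibit(overlaps: np.ndarray, sparsity: float = 0.02) -> np.ndarray:
    """Keep the most strongly driven cells; inhibit the rest."""
    k = max(1, int(len(overlaps) * sparsity))
    winners = np.argsort(overlaps)[-k:]     # indices of the k largest overlaps
    active = np.zeros(len(overlaps), dtype=bool)
    active[winners] = True
    return active
```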
If a pattern is input, many cells will indicate that they predict being activated next.
Such a network can predict A-B, A-C, A-D.
This is known as a first-order sequence memory.
It cannot learn A-B-C-D vs X-B-C-Y.
Mini-columns turn this into a high-order sequence memory
If there is no prediction, all cells within a column become active
If there is a prediction, only the predicted cells will become active (the others will be inhibited)
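A sketch of this per-column rule (the names are hypothetical):

```python
def column_activation(column_cells, predicted_cells):
    """Which of a column's cells fire when its input arrives."""
    winners = column_cells & predicted_cells
    if winners:
        return winners            # predicted cells fire, the others are inhibited
    return set(column_cells)      # no prediction: the whole column bursts
```

Because a different subset of cells fires for B-after-A than for B-after-X, the active cells encode the sequence context, which is what makes the memory high-order.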
The resulting memory:
- Converts input to sparse activation of columns
- Recognizes and recalls high-order sequences
- Continuous learning
- High capacity
- Local learning rules
- Fault tolerant
- No sensitive parameters
- Semantic generalization
```mermaid
graph LR;
0[Data]
0 --> 1[Encoder]
1 --SDR--> 2[HTM high-order sequence memory]
2 --> 3[Predictions<br/>Anomalies]
```
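An end-to-end sketch of this pipeline, with a toy scalar encoder and a deliberately naive predictor standing in for the HTM sequence memory (these are stand-ins of my own, not NuPIC's API):

```python
def encode_scalar(x, lo=0.0, hi=100.0, n=400, w=21):
    """Encode a scalar as an SDR: w contiguous active bits out of n."""
    start = int((min(max(x, lo), hi) - lo) / (hi - lo) * (n - w))
    return set(range(start, start + w))

def anomaly_score(active, predicted):
    """Fraction of active bits that were not predicted (1.0 = fully novel)."""
    return 1.0 - len(active & predicted) / len(active)

predicted = set()
for value in [10, 11, 12, 13, 90]:   # a smooth stream, then a surprise
    sdr = encode_scalar(value)
    print(value, round(anomaly_score(sdr, predicted), 2))
    predicted = sdr                   # naive "predict more of the same" stub
```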
Three approaches to machine intelligence:
- Cortical (HTM)
- ANNs (deep learning)
- AI (Watson)
| | Cortical (HTM) | ANNs (deep learning) | AI (Watson) |
|---|---|---|---|
| Premise | Biological | Mathematical | Engineered |
| Data | Spatial-temporal, behavior | Spatial-temporal | Language, documents |
| Capabilities | Prediction, classification, goal-oriented behavior | Classification | NL Query |
| Valuable? | Yes | Yes | Yes |
| Path to machine intelligence? | Yes | Probably not | No |