Standard neural networks often suffer from catastrophic forgetting, where learning new tasks degrades performance on previously learned tasks. In contrast, the human brain integrates new and old memories through two complementary memory systems: the hippocampus and neocortex.
1. Objectives
The authors aim to build a model that captures:
- Pattern separation: distinct encoding of similar experiences,
- Pattern completion: reconstructing full representations from partial inputs,
to support continual learning without loss of previously acquired skills.
2. Problem Statement
Catastrophic forgetting manifests as a significant drop in performance on earlier tasks after training on new datasets.
3. Complementary Learning Systems Theory
CLS theory posits:
- A fast system (hippocampus) for encoding new episodic information with high resolution (pattern separation),
- A slow system (neocortex) for long-term consolidation and retrieval (pattern completion).
4. Proposed Model
4.1. Variational Autoencoder (VAE)
The VAE learns a latent distribution $z$ for inputs $x$ by maximizing the Evidence Lower Bound (ELBO):
$$ \mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big] - D_{KL}\big(q_\phi(z|x) \,\|\, p(z)\big) $$
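As an illustration of the two ELBO terms, here is a minimal numpy sketch (not the paper's implementation): a single-sample Monte Carlo estimate of the reconstruction term for a Bernoulli decoder, plus the closed-form KL divergence between a diagonal-Gaussian posterior and the standard-normal prior. The toy `decode` function and its weights are stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def elbo_terms(x, mu, log_var, decode):
    """Single-sample Monte Carlo estimate of the two ELBO terms.

    mu, log_var: parameters of the diagonal-Gaussian posterior q(z|x).
    decode: maps a latent z to Bernoulli means over the input dims.
    """
    # Reparameterisation trick: z = mu + sigma * eps, eps ~ N(0, I)
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps
    # Reconstruction term: Bernoulli log-likelihood log p(x|z)
    p = np.clip(decode(z), 1e-7, 1 - 1e-7)
    recon = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return recon, kl

# Toy check: 4-dim input, 2-dim latent, a fixed sigmoid "decoder"
x = np.array([1.0, 0.0, 1.0, 1.0])
W = rng.standard_normal((4, 2))
decode = lambda z: 1.0 / (1.0 + np.exp(-W @ z))
recon, kl = elbo_terms(x, mu=np.zeros(2), log_var=np.zeros(2), decode=decode)
elbo = recon - kl
```

With `mu = 0` and `log_var = 0`, the posterior equals the prior, so the KL term vanishes and the ELBO reduces to the reconstruction term alone.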
4.2. Modern Hopfield Network (MHN)
MHN serves as an associative memory, storing $N$ patterns $\{w_i\}_{i=1}^{N}$ and recalling the closest stored pattern for a given input by minimizing an energy function $E$.
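The retrieval dynamics can be sketched with the standard modern Hopfield update, $\xi \leftarrow W\,\mathrm{softmax}(\beta W^\top \xi)$, where the columns of $W$ are the stored patterns. This is a generic numpy illustration of that update rule, not the paper's specific configuration; the inverse temperature `beta` and the dimensions are arbitrary choices.

```python
import numpy as np

def mhn_recall(W, xi, beta=8.0, steps=1):
    """Iterated retrieval in a modern Hopfield network.

    W: (d, N) matrix whose columns are the N stored patterns w_i.
    The update xi <- W softmax(beta * W.T @ xi) moves the query
    toward the stored pattern most similar to it.
    """
    for _ in range(steps):
        scores = beta * W.T @ xi              # similarity to each stored pattern
        p = np.exp(scores - scores.max())     # numerically stable softmax
        p /= p.sum()
        xi = W @ p                            # convex combination -> recalled pattern
    return xi

# Store three random patterns and recall from a noisy cue of the first one
rng = np.random.default_rng(1)
W = rng.standard_normal((16, 3))
cue = W[:, 0] + 0.3 * rng.standard_normal(16)
out = mhn_recall(W, cue, beta=8.0, steps=2)
```

For a sufficiently large `beta` the softmax is nearly one-hot, so retrieval snaps the noisy cue back onto the closest stored pattern (pattern completion in a single update).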
4.3. Integration of VAE and MHN
- The VAE provides a continuous latent space representation,
- The MHN stores and discriminates critical patterns,
together offering both generalization and separation capabilities.
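To make the division of labor concrete, here is a toy sketch of the combined pipeline under stated assumptions: a linear stub stands in for the VAE encoder, and the MHN slots hold stored latent codes. All names and dimensions here are hypothetical illustrations, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stand-ins: a linear "encoder" into latent space and a
# memory of latent prototypes (one slot per consolidated experience).
d_in, d_z, n_mem = 8, 3, 4
Enc = rng.standard_normal((d_z, d_in))   # stub for the VAE encoder
M = rng.standard_normal((d_z, n_mem))    # MHN slots: stored latent codes

def recall_latent(x, beta=10.0):
    """Encode an input, then let the MHN snap the continuous code
    onto a stored latent prototype (pattern completion in latent space)."""
    z = Enc @ x                          # continuous VAE-style code
    scores = beta * M.T @ z
    p = np.exp(scores - scores.max())
    p /= p.sum()
    return M @ p                         # completed latent code

x = rng.standard_normal(d_in)
z_hat = recall_latent(x)
```

The VAE side supplies a smooth code for generalization; the MHN side pulls that code toward a discrete stored memory, which is what keeps similar experiences separated.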
5. Experiments
The model is evaluated on the Split-MNIST benchmark, dividing MNIST into five sequential tasks. Metrics include:
- Average accuracy across tasks,
- Forgetting measure.
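The two metrics above can be computed from an accuracy matrix $A$, where $A[i, j]$ is the accuracy on task $j$ measured after finishing training on task $i$. The numbers in the toy matrix below are illustrative, not the paper's results; the forgetting definition used here (best past accuracy minus final accuracy, averaged over earlier tasks) is the common convention.

```python
import numpy as np

def continual_metrics(A):
    """Average accuracy and forgetting from an accuracy matrix.

    A[i, j] = accuracy on task j after finishing training on task i.
    Forgetting for task j is the gap between its best past accuracy
    and its accuracy after the final task.
    """
    T = A.shape[0]
    avg_acc = A[T - 1, :].mean()   # mean accuracy over all tasks at the end
    forgetting = np.mean([A[:T - 1, j].max() - A[T - 1, j]
                          for j in range(T - 1)])
    return avg_acc, forgetting

# Toy 3-task run: each task is learned well, then partially forgotten
A = np.array([[0.98, 0.00, 0.00],
              [0.90, 0.97, 0.00],
              [0.85, 0.92, 0.96]])
avg_acc, forgetting = continual_metrics(A)
# avg_acc ≈ 0.91, forgetting = ((0.98-0.85) + (0.97-0.92)) / 2 ≈ 0.09
```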
6. Results
The model achieves approximately 90% average accuracy, outperforming traditional approaches without CLS mechanisms.
The VAE-MHN integration leads to:
- Reduced forgetting,
- Effective generalization,
- Potential extension to more complex continual learning settings.
7. Conclusion
This architecture represents a promising step towards neural networks capable of continual learning, bridging neurobiological insights and practical algorithms.
Links
- Based on the publication arXiv:2507.11393