Standalone implementation of the TitanMAC architecture, combining Titans-style neural long-term memory with Nested Learning optimization.
| Paper | Link | Key Contribution |
|---|---|---|
| Titans: Learning to Memorize at Test Time | arXiv:2501.00663 | Neural long-term memory module, MAC/MAG/MAL variants, surprise-based updates |
| Titans Revisited | arXiv:2510.09551 | Lightweight reimplementation, empirical validation across domains |
| Titans & MIRAS Blog | Google Research | Unified framework viewing attention as associative memory |
| It's All Connected (MIRAS) | arXiv:2504.13173 | MIRAS framework, attentional bias, retention regularization, MONETA/YAAD/MEMORA variants |
| Nested Learning | Google Research | DMGD, CMS, multi-frequency updates, continual learning paradigm |
- Windowed Attention: O(T·w) memory instead of O(T²)
- Persistent Tokens: Global context via bidirectional attention
- Neural Long-Term Memory: Gradient-based memory with surprise updates (see the sketch after this list)
- MAC/MAG/MAL Variants: Three memory integration strategies
- Deep Momentum Gradient Descent (DMGD): Learned momentum
- Continuum Memory System (CMS): Multi-frequency updates
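
The surprise-based update behind the neural long-term memory is worth seeing concretely: the memory is a small MLP trained at test time to map keys to values, with gradient momentum playing the role of "surprise" and weight decay playing the role of forgetting. Below is a minimal, self-contained sketch of that rule; the class name, layer sizes, and hyperparameters are illustrative and do not mirror the repo's actual `NeuralMemory` API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralMemorySketch(nn.Module):
    """Test-time-trainable MLP memory with momentum ("surprise") and decay."""

    def __init__(self, d_model: int, lr: float = 0.1,
                 momentum: float = 0.9, forget: float = 0.01):
        super().__init__()
        self.mem = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.SiLU(),
            nn.Linear(d_model, d_model),
        )
        self.lr, self.momentum, self.forget = lr, momentum, forget
        # One "surprise" (momentum) buffer per memory parameter.
        self.surprise = [torch.zeros_like(p) for p in self.mem.parameters()]

    def update(self, keys: torch.Tensor, values: torch.Tensor) -> None:
        # Associative loss: the memory should reproduce v_t given k_t.
        loss = F.mse_loss(self.mem(keys), values)
        grads = torch.autograd.grad(loss, list(self.mem.parameters()))
        with torch.no_grad():
            for p, s, g in zip(self.mem.parameters(), self.surprise, grads):
                s.mul_(self.momentum).add_(g, alpha=-self.lr)  # past + momentary surprise
                p.mul_(1.0 - self.forget).add_(s)              # forget, then write

    def retrieve(self, queries: torch.Tensor) -> torch.Tensor:
        return self.mem(queries)
```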
```bash
pip install -e .
```

```python
import torch
from titans_core import TitanMACConfig, TitanMAC

# Create config
config = TitanMACConfig(
    vocab_size=50000,
    d_model=512,
    n_heads=8,
    n_layers=12,
    window_size=256,
    n_persistent=16,
)

# Create model
model = TitanMAC(config)

# Forward pass
input_ids = torch.randint(0, 50000, (2, 512))
outputs = model(input_ids=input_ids, labels=input_ids)
loss = outputs["loss"]
```
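
As a quick sanity check on the returned loss: for a cross-entropy language-modeling loss, perplexity is just its exponential.

```python
# Perplexity from the quick-start loss above (exp of cross-entropy).
ppl = torch.exp(outputs["loss"]).item()
print(f"loss={outputs['loss'].item():.3f}  perplexity={ppl:.1f}")
```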
Train on generated math problems:

```bash
python examples/train_math.py --steps 1000 --batch-size 4 --device cuda
```

With neural memory enabled:

```bash
python examples/train_math.py --steps 1000 --use-neural-memory --memory-capacity 256
```

```
TitanMAC
├── Token Embeddings + Positional Embeddings
├── Persistent Tokens (learnable, bidirectional attention)
├── N × TitanBlock
│   ├── RMSNorm → WindowedAttention → Residual
│   └── RMSNorm → MLP → Residual
├── Optional: NeuralMemory (gradient-based)
├── Final RMSNorm
└── LM Head (optionally tied to embeddings)
```
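
Each `TitanBlock` is the usual pre-norm residual pair. A rough sketch of that layout, with a minimal `RMSNorm` standing in for `blocks/norms.py` (the repo's actual `TitanBlock` signature may differ; the attention and MLP submodules are injected here for brevity):

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    # Minimal RMSNorm: rescale by the root-mean-square, then a learned gain.
    def __init__(self, d_model: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(d_model))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.weight

class TitanBlockSketch(nn.Module):
    def __init__(self, d_model: int, attn: nn.Module, mlp: nn.Module):
        super().__init__()
        self.norm1, self.attn = RMSNorm(d_model), attn
        self.norm2, self.mlp = RMSNorm(d_model), mlp

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(self.norm1(x))  # RMSNorm → WindowedAttention → Residual
        x = x + self.mlp(self.norm2(x))   # RMSNorm → MLP → Residual
        return x
```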
Key parameters in TitanMACConfig:
| Parameter | Default | Description |
|---|---|---|
| `d_model` | 640 | Model dimension |
| `n_heads` | 10 | Attention heads |
| `n_layers` | 16 | Transformer layers |
| `window_size` | 512 | Local attention window |
| `n_persistent` | 16 | Global persistent tokens |
| `use_neural_memory` | `False` | Enable gradient-based memory |
| `titans_variant` | `"MAC"` | Memory integration (MAC/MAG/MAL) |
```
titans_core/
├── config.py                    # TitanMACConfig dataclass
├── blocks/
│   ├── norms.py                 # RMSNorm
│   ├── mlp.py                   # MLPBlock
│   └── titan_block.py           # TitanBlock
├── attn/
│   └── windowed_attention.py
├── memory/
│   ├── memory_bank.py           # Key-value retrieval
│   └── neural_memory.py         # Gradient-based memory
├── models/
│   ├── titanmac.py              # Main model
│   └── titanmac_wrapper.py      # HuggingFace wrapper
└── opt/
    ├── continuum_optimizer.py   # Nested optimizer
    ├── nested_controller.py     # LR modulation MLP
    ├── param_groups.py          # Parameter grouping
    ├── dmgd.py                  # Deep Momentum GD
    └── cms.py                   # Continuum Memory System
```
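
The `opt/` modules implement the Nested Learning pieces. The core idea of DMGD is that the momentum rule itself is learned: instead of linearly accumulating gradients, a small MLP maps each (momentum, gradient) pair to the next momentum. Below is a hedged, element-wise sketch of that idea; all names are illustrative, and `opt/dmgd.py` may structure this differently.

```python
import torch
import torch.nn as nn

class DMGDSketch:
    """Learned momentum: new_m = MLP(m, g) element-wise, then p -= lr * new_m."""

    def __init__(self, params, lr: float = 1e-3, hidden: int = 8):
        self.params = list(params)
        self.lr = lr
        # Learned momentum rule, shared across all parameters.
        self.rule = nn.Sequential(nn.Linear(2, hidden), nn.SiLU(), nn.Linear(hidden, 1))
        self.momenta = [torch.zeros_like(p) for p in self.params]

    @torch.no_grad()
    def step(self) -> None:
        for p, m in zip(self.params, self.momenta):
            if p.grad is None:
                continue
            pair = torch.stack([m.flatten(), p.grad.flatten()], dim=-1)  # (N, 2)
            m.copy_(self.rule(pair).squeeze(-1).view_as(m))  # learned momentum
            p.add_(m, alpha=-self.lr)
```

In the Nested Learning setup, the rule's own weights are trained in an outer loop, and the CMS extends the scheme by updating different parameter groups at different frequencies.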
- PyTorch >= 2.1
- Transformers >= 4.35 (needed only for the HuggingFace wrapper)

Licensed under the GPL.