A curated collection of systems, benchmarks, papers, and more on memory mechanisms for Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs), exploring how different approaches enable long-term context, retrieval, and efficient reasoning.
Open-source resources (e.g., papers with reproducible code publicly available on GitHub) are marked in bold and ranked higher.
Table of Contents
- Products
- Tutorials
- Surveys
- Benchmarks
- Papers - Nonparametric Memory
  - Text Memory
  - Graph Memory
  - Multimodal Memory (for Understanding)
  - Multimodal Memory (for Generation)
- Papers - Parametric Memory
- Papers - Memory for Agent Evolution
- Papers - Memory in Cognitive Science
- Articles
- Workshops
If you find this project helpful, please give us a ⭐ on GitHub to follow the latest updates.
Ordered by the number of GitHub stars.
- TeleMem: A Drop-in Replacement for Mem0 [code]
  `import telemem as mem0` Newly released. Rising star. The tech report will be on arXiv soon. Stay tuned!
- Letta (formerly MemGPT) [code] [paper] [research] [blog]
- MemMachine [code] [blog]
- MemoryBear [code] [paper]
- Memories.ai [research] [paper] [blog]
- ACM SIGIR-AP 2025 Tutorial: Conversational Agents: From RAG to LTM [paper] [code]
- Daily Dose of DS: A Practical Deep Dive Into Memory Optimization for Agentic Systems [Part-A] [Part-B] [Part-C]
- Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions [code]
- From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs
- Human-inspired Perspectives: A Survey on AI Long-term Memory
- Beyond a Million Tokens: Benchmarking and Enhancing Long-Term Memory in LLMs (The BEAM Paper) [code] [data]
- MOOM: Maintenance, Organization and Optimization of Memory in Ultra-Long Role-Playing Dialogues (The ZH-4O Paper) [code] [data]
- Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale (The PersonaMem and ImplicitPersona Paper) [code] [data1] [data2]
- Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions (The MemoryAgentBench Paper) [code] [data]
- LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners [code] [data]
- NoLiMa: Long-Context Evaluation Beyond Literal Matching [code] [data]
- MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems [code] [data]
- HaluMem: Evaluating Hallucinations in Memory Systems of Agents [code] [data]
- LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks [code]
- Minerva: A Programmable Memory Test Benchmark for Language Models [code]
- MemBench: Towards More Comprehensive Evaluation on the Memory of LLM-based Agents [code]
- Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory
- OdysseyBench: Evaluating LLM Agents on Long-Horizon Complex Office Application Workflows
- LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory [data]
- Evaluating Very Long-Term Conversational Memory of LLM Agents (The LoCoMo Paper) [code] [data]
- ∞Bench: Extending Long Context Evaluation Beyond 100K Tokens [code]
- LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding [code]
- TeleEgo: Benchmarking Egocentric AI Assistants in the Wild [code] [proj]
- LVBench: An Extreme Long Video Understanding Benchmark [code]
- Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis [code]
- MovieChat+: Question-aware Sparse Memory for Long Video Question Answering [code]
- CinePile: A Long Video Question Answering Dataset and Benchmark [code]
- LongVideoBench: A Benchmark for Long-Context Interleaved Video-Language Understanding [code]
- EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding [code]
- LvBench: A Benchmark for Long-form Video Understanding with Versatile Multi-modal Question Answering
- ARE: Scaling Up Agent Environments and Evaluations (The Gaia2 Paper) [code]
- LightMem: Lightweight and Efficient Memory-Augmented Generation [code]
- Nemori: Self-Organizing Agent Memory Inspired by Cognitive Science [code]
- O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents
- Omne-R1: Learning to Reason with Memory for Multi-hop Question Answering
- In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents
- MemoRAG: Boosting Long Context Processing with Global Memory-Enhanced Retrieval Augmentation
- Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations [code]
- MemoryBank: Enhancing Large Language Models with Long-Term Memory [code]
- Toward Conversational Agents with Context and Time Sensitive Long-term Memory [data]
- InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
- From RAG to Memory: Non-Parametric Continual Learning for Large Language Models [code]
- MIRIX: Multi-Agent Memory System for LLM-Based Agents [code]
- Hierarchical Memory Organization for Wikipedia Generation [code]
- From Experience to Strategy: Empowering LLM Agents with Trainable Graph Memory
- Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning
- HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models [code]
- AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents [code]
- MemVerse: Multimodal Memory for Lifelong Learning Agents [code] [blog]
- MGA: Memory-Driven GUI Agent for Observation-Centric Interaction [code]
- Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory [code]
- HippoMM: Hippocampal-inspired Multimodal Memory for Long Audiovisual Event Understanding [code]
- Episodic Memory Representation for Long-form Video Understanding
- Multi-RAG: A Multimodal Retrieval-Augmented Generation System for Adaptive Video Understanding
- Contextual Experience Replay for Self-Improvement of Language Agents
- VideoAgent: Long-form Video Understanding with Large Language Model as Agent [code]
- VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling [code]
- LongVLM: Efficient Long Video Understanding via Large Language Models [code]
- KARMA: Augmenting Embodied AI Agents with Long-and-short Term Memory Systems [code]
- StoryMem: Multi-shot Long Video Storytelling with Memory [code]
- MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives [code]
- MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation [code]
- VideoRAG: Retrieval-Augmented Generation over Video Corpus [code]
- Pretraining Frame Preservation in Autoregressive Video Memory Compression
- EgoLCD: Egocentric Video Generation with Long Context Diffusion
- Pack and Force Your Memory: Long-form and Consistent Video Generation
- Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval
- MLP Memory: Language Modeling with Retriever-pretrained External Memory [code]
- Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models [code]
- Nested Learning: The Illusion of Deep Learning Architectures
- R3Mem: Bridging Memory Retention and Retrieval via Reversible Compression
- May the Memory Be With You: Efficient and Infinitely Updatable State for Large Language Models
- MeMo: Towards Language Models with Associative Memory Mechanisms
- EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts
- Disentangling Memory and Reasoning Ability in Large Language Models
- InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory [code]
- MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding [code]
- MemoryLLM: Towards Self-Updatable Large Language Models [code]
- WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models [code]
- Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
- MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool
- WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models
- ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning [code]
- Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution [code]
- Learning on the Job: An Experience-Driven, Self-Evolving Agent for Long-Horizon Tasks [code]
- Mem-α: Learning Memory Construction via Reinforcement Learning [code]
- Memento: Fine-tuning LLM Agents without Fine-tuning LLMs [code]
- Goal-Directed Search Outperforms Goal-Agnostic Memory Compression in Long-Context Memory Tasks [code]
- AgentEvolver: Towards Efficient Self-Evolving Agent System [code]
- FLEX: Continuous Agent Evolution via Forward Learning from Experience [code]
- Beyond Heuristics: A Decision-Theoretic Framework for Agent Memory Management
- Nested Learning: The Illusion of Deep Learning Architectures [blog]
- Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory
- ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory
- MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents
- ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
- MARC: Memory-Augmented RL Token Compression for Efficient Video Understanding
- Task-Core Memory Management and Consolidation for Long-term Continual Learning
- Everything is Context: Agentic File System Abstraction for Context Engineering [code]
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
- Neural Population Activity for Memory: Properties, Computations, and Codes
- How Prediction Error Drives Memory Updating: Role of Locus Coeruleus–Hippocampal Interactions
- Towards Large Language Models with Human-Like Episodic Memory
If you find this project helpful, please give us a ⭐.
Made with ❤️ by the Ubiquitous AGI team at TeleAI.
