Agent Memory: Why Your AI Has Amnesia and How to Fix It
TL;DR
AI agents forget everything between conversations. This article shows why larger context windows don't solve the problem — and how four memory types from cognitive science form the foundation for persistent agent memory.
Reasoning Seed
A Reasoning Seed is a structured prompt you can copy into your AI reasoning tool (Claude, ChatGPT, Obsidian, Notion). It contains the article's thesis, its core tension, and our lab context — ready for your own analysis.
Tension: If agents manage their own memory and decide what to forget — who controls what counts as knowledge?
Lab context: Agent memory is the missing bridge between isolated sessions and long-term continuity. In the lab, we are building exactly this bridge — with structured context instead of ephemeral chat history.
Key Insights
1 — Four Memory Types, Straight from Cognitive Science
The CoALA framework (Princeton, 2023) defines four memory types for AI agents, derived from the SOAR architecture of the 1980s: Working Memory (current conversation), Procedural Memory (system prompts, decision logic), Semantic Memory (accumulated knowledge, preferences), and Episodic Memory (past interactions, experience logs). The analogy to human memory is no accident — every major framework in the field builds on this taxonomy. Lilian Weng’s formula captures it: Agent = LLM + Memory + Planning + Tool Use.
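The four-part taxonomy can be made concrete as a simple data structure. This is an illustrative sketch only — the class and field names are ours, not taken from CoALA or any framework's API:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the CoALA memory taxonomy.
# Field names and types are our own, chosen for clarity.
@dataclass
class AgentMemory:
    working: list[str] = field(default_factory=list)          # current conversation turns
    procedural: dict[str, str] = field(default_factory=dict)  # system prompts, decision logic
    semantic: dict[str, str] = field(default_factory=dict)    # accumulated facts, preferences
    episodic: list[dict] = field(default_factory=list)        # logs of past interactions

memory = AgentMemory()
memory.procedural["style"] = "answer concisely"
memory.semantic["user.language"] = "German"
memory.episodic.append({"session": 1, "summary": "discussed RAG vs memory"})
```

Only `working` lives and dies with the context window; the other three must be persisted somewhere outside the model for the agent to have memory at all.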
2 — Context Windows Are Not Memory
The expansion of context windows to hundreds of thousands or millions of tokens has created an “illusion of memory.” But models degrade well before their limits (a 200K-token model often becomes unreliable around 130K). Every token is weighted equally — no prioritization, no relevance filtering. And once the session closes, everything is gone. More space on the Post-it doesn’t make it memory.
3 — RAG and Memory Solve Different Problems
RAG brings external knowledge into the prompt at inference time — great for fact-based answers. But RAG is stateless: no awareness of previous interactions, no user identity, no connection between queries. Memory provides continuity. RAG helps an agent answer better. Memory helps it learn and adapt. You need both, but they solve fundamentally different problems.
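The statelessness distinction can be shown in a few lines. This is a hypothetical sketch — `retrieve` and `llm` are stand-ins for real components, and the write-back logic is deliberately naive:

```python
# Hypothetical contrast: stateless RAG call vs. stateful memory agent.
# retrieve() and llm() are placeholder callables, not a real API.

def answer_with_rag(query: str, retrieve, llm) -> str:
    # RAG: fetch external knowledge per query — nothing survives the call.
    docs = retrieve(query)
    return llm(query, context=docs)

class MemoryAgent:
    """Memory: state persists across calls, so later answers can adapt."""

    def __init__(self, llm):
        self.llm = llm
        self.facts: dict[str, str] = {}  # per-user semantic memory

    def answer(self, user_id: str, query: str) -> str:
        known = self.facts.get(user_id, "")
        reply = self.llm(query, context=known)
        # Naive write-back: a real system would extract facts, not log raw text.
        self.facts[user_id] = known + f"\n{query}"
        return reply
```

The point is structural: `answer_with_rag` could run on any server for any user interchangeably; `MemoryAgent` only makes sense if `self.facts` outlives the process.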
4 — Two Strategies: Hot Path vs. Background Memory
LangChain distinguishes two approaches to memory updates: Hot path — the agent explicitly decides what to remember before responding (higher latency, immediate availability). Background — a separate process extracts and stores memories during or after the conversation (no latency hit, but delayed availability). Add to this the distinction between programmatic memory (developer defines what gets stored) and agentic memory (the agent decides itself). The field is moving toward the latter — agents that manage their own memory.
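The latency trade-off between the two strategies is easy to sketch. Function names here are illustrative, not LangChain's API; `extract` and `reply` are placeholder callables:

```python
import queue

# Illustrative sketch of hot-path vs. background memory updates.
# Names are our own; this is not LangChain's API.
store: dict[str, str] = {}
background_q: queue.Queue = queue.Queue()

def respond_hot_path(user: str, message: str, extract, reply) -> str:
    # Hot path: memory write happens BEFORE the reply.
    # Cost: extra latency per turn. Benefit: immediately available.
    fact = extract(message)
    if fact:
        store[user] = fact
    return reply(message, store.get(user, ""))

def respond_background(user: str, message: str, extract, reply) -> str:
    # Background: reply first, defer extraction to a worker.
    # Cost: delayed availability. Benefit: no latency hit.
    background_q.put((user, message))
    return reply(message, store.get(user, ""))

def drain_background(extract) -> None:
    # A separate worker would run this during or after the conversation.
    while not background_q.empty():
        user, message = background_q.get()
        fact = extract(message)
        if fact:
            store[user] = fact
```

Note the failure mode of the background path: until `drain_background` runs, the agent answers without the new fact — acceptable for preferences, risky for corrections.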
5 — Forgetting Is a Feature, Not a Bug
Effective forgetting relies on decay functions: a relevance score multiplies semantic similarity by an exponential decay over the time since last retrieval. Memories that haven’t been recalled recently gradually lose salience — analogous to biological memory. An alternative: old facts are invalidated but never deleted — for audit trails and historical accuracy. The four core operations of every memory system are ADD, UPDATE, DELETE, and SKIP. Modern systems delegate these decisions to the LLM itself rather than to if/else logic.
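The scoring formula is compact enough to write out. A minimal sketch, assuming a similarity score in [0, 1] and a tunable half-life we chose for illustration (one week):

```python
# Sketch of a decay-based relevance score:
#   relevance = semantic_similarity * exponential time decay
# half_life is an illustrative tuning parameter, not a standard value.

WEEK = 7 * 86400  # seconds

def relevance(similarity: float, last_retrieved: float,
              now: float, half_life: float = WEEK) -> float:
    age = now - last_retrieved
    decay = 0.5 ** (age / half_life)  # halves once per half_life elapsed
    return similarity * decay
```

A highly similar memory untouched for a week scores the same as a half-as-similar memory retrieved just now — which is exactly the prioritization a raw context window lacks. Retrieving a memory resets `last_retrieved`, so frequently used knowledge stays salient.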
6 — A Knowledge OS Is Already Living Agent Memory
What the article describes as enterprise infrastructure already exists in simplified form in any well-structured Knowledge OS. The translation:
- Procedural Memory: CLAUDE.md, workflow.md, Skills — codified working methods, conventions, decision logic. Manually maintained, git-versioned.
- Semantic Memory: auto memory (user/feedback files), vault contents (knowledge/, business/) — accumulated knowledge about the user and the domain.
- Episodic Memory: Session issues in Linear, auto memory (project/reference), git history — logs of past interactions and decisions.
- Working Memory: The context window of the current Claude Code session, including plan files and loaded documents.
The gaps become visible: no relevance scoring in the vault, no systematic forgetting, fragmented recall across episodic sources. The taxonomy makes these gaps nameable — and therefore actionable.
Critical Assessment
What holds up
- The CoALA taxonomy is academically grounded (Princeton 2023, building on SOAR from the 1980s) and has established itself as the field’s lingua franca
- The distinction between RAG and memory is practically relevant and frequently confused in the industry — the article clears this up cleanly
- Treating forgetting as an explicit feature rather than a failure — a perspective shift most implementations ignore
- The frameworks overview (LangChain, Letta, Zep, Mem0) is current and provides useful orientation
What needs context
- Oracle vendor perspective: The article culminates in “converged database” as the answer to everything. The memory taxonomy is valid; the conclusion (“you need Oracle”) is marketing
- Enterprise bias: ACID transactions across memory types, row-level security, multi-tenancy — relevant for large corporations, irrelevant for 95% of practitioners working with file-based memory
- Blind spot for pragmatism: Not a word about simple, file-based memory systems like Claude Code’s auto memory or Markdown-based knowledge bases. The article ignores the fact that most functioning agent memory systems today run on files, not databases
- Sleep-time computation: Presented as a future vision, but without original data — OpenAI and Letta are cited, Oracle has nothing of its own to show here
Discussion Questions for the Next Lab
01 Knowledge OS as memory architecture: When we view our vault system through the CoALA lens — where are the structural gaps? Working and procedural memory are strong, but semantic and episodic memory are fragmented across different systems (auto memory, Linear, git). What would a coherent architecture look like?
02 Forgetting in the vault: Git never forgets — every change stays in history. But a Knowledge OS that never forgets accumulates noise. How do we implement “forgetting” in a system based on version control? Decay scores on Markdown files? Archival automation? Or is git-based “keep everything” a feature?
03 Programmatic vs. agentic: Claude Code’s auto memory decides on its own what to store — that’s agentic memory. CLAUDE.md and workflow.md are programmatic — we define what the agent should know. Where do we shift the boundary? More agentic control means less maintenance but also less predictability.
04 Client communication: “We’re building agent memory” sounds like science fiction. “Your chatbot forgets everything after every conversation” is immediately understood. How do we explain to clients the difference between RAG (which they probably already have) and persistent memory (which they need)?
Sources
- Original: Casius Lee — Agent Memory: Why Your AI Has Amnesia and How to Fix It (Oracle, Feb 2026)
- Sumers et al. — CoALA: Cognitive Architectures for Language Agents (Princeton, 2023)
- Lilian Weng — LLM Powered Autonomous Agents (2023)
- Packer et al. — MemGPT: Towards LLMs as Operating Systems (2023)
- LangChain — Memory Concepts
- Mem0 — Memory in Agents: What, Why, and How
Glossary
Agent Memory A persistent, evolving state that gives AI agents context across sessions. Not to be confused with the context window (volatile) or RAG (stateless).
CoALA (Cognitive Architectures for Language Agents) Framework from Princeton (2023) defining four memory types for AI agents — derived from the cognitive SOAR architecture. The field’s lingua franca.
Working Memory The current conversation context — what the agent is actively “thinking” about. Corresponds to the context window. Fast but volatile.
Procedural Memory Codified behavioral rules: system prompts, tool definitions, decision logic. An agent’s “muscle memory.”
Semantic Memory Accumulated factual knowledge: user preferences, extracted facts, knowledge bases. Grows over time.
Episodic Memory Logs of past experiences: conversation logs, action sequences, few-shot examples. The agent’s “autobiographical memory.”
Decay Function A mathematical function that reduces a memory’s relevance score over time — imitating biological forgetting. Semantic similarity × exponential time decay.
PARA (Projects, Areas, Resources, Archives) Organization method by Tiago Forte for personal knowledge management. Four categories based on actionability, not topic. Foundation of many knowledge management systems now being combined with LLM agents.
Tiago Forte Author of “Building a Second Brain” and developer of the PARA method. One of the most influential thinkers in personal knowledge management — his framework is increasingly being adapted as a basis for AI-augmented knowledge systems.
Curated by David Latz · Panoptia March 2026