✳︎ Panoptia Labs


LLM Knowledge Bases: Why Everyone Lands on the Same Stack

April 3, 2026 · David Latz

TL;DR

Andrej Karpathy describes his setup for LLM-powered knowledge work — and it sounds familiar. Markdown, Git, Obsidian, an LLM as operator. Practitioners independently discover the same architecture. That's not coincidence — it's convergent evolution.

Reasoning Seed

A Reasoning Seed is a structured prompt you can copy into your AI reasoning tool (Claude, ChatGPT) or knowledge tool (Obsidian, Notion). It contains the article's thesis, its core tension, and our lab context, ready for your own analysis.

Copy it as Markdown; the discussion questions below offer more ways to interact with this content.

Tension: If everyone independently arrives at the same stack — is that a sign of the right architecture, or a blind spot nobody sees?

Lab context: convergent evolution. When everyone independently arrives at the same stack (Markdown, Git, Obsidian), that is not coincidence; it is a signal that the architecture is right.

Key Insights

1 — Convergent Evolution: Same Pattern, Different People

Karpathy describes his workflow for LLM-powered knowledge bases: collect source documents in a raw/ directory, have an LLM compile a Markdown wiki from them, use Obsidian as the frontend, run Q&A against the knowledge base, and lint and health-check the data.
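The lint-and-health-check step needs nothing beyond the standard library. A minimal sketch, assuming a flat wiki/ directory of Markdown pages using Obsidian-style [[wiki-links]] (the layout and regex are illustrative assumptions, not Karpathy's actual scripts):

```python
import re
from pathlib import Path

# Matches [[Page]], [[Page|alias]], and [[Page#section]]; captures the page name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def lint_vault(wiki_dir: Path) -> dict[str, list[str]]:
    """Report wiki-links that point to pages that don't exist yet."""
    pages = {p.stem for p in wiki_dir.glob("*.md")}
    broken: dict[str, list[str]] = {}
    for page in wiki_dir.glob("*.md"):
        targets = WIKILINK.findall(page.read_text(encoding="utf-8"))
        missing = [t.strip() for t in targets if t.strip() not in pages]
        if missing:
            broken[page.name] = missing
    return broken
```

Run periodically (or as a Git pre-commit hook), this catches dangling links before they accumulate, one small instance of the "health checks" Karpathy mentions.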

What stands out: this setup is appearing everywhere at the same time. Simon Willison has been building on Markdown and SQLite for years. Practitioners in the Claude Code community structure git-versioned vaults with layered context architectures. Tiago Forte’s PARA method gets combined with LLM agents. Nobody coordinated this — yet everything converges on the same stack.

In biology, this is called convergent evolution: different species independently develop the same solutions to the same problem. Here, the problem is: How does a human organize knowledge so an LLM can use it operationally?

2 — The Four Building Blocks of the Convergent Stack

Beneath the surface of different implementations lies the same architecture: four layers that appear in every practitioner's setup, in different flavors but with the same logic:

Data Ingest — Getting knowledge into the system. Karpathy uses Web Clipper and a raw/ folder. Others work with Readwise, Obsidian Clipper, or manual curation. The result is always: source documents as local files, preferably Markdown.

Knowledge Structure — Organizing knowledge. Karpathy has the LLM compile a wiki with categories, backlinks, and summaries. Others work with explicit conventions, index files, and layered context systems. The principle is the same: structure that’s navigable for both humans and LLMs.

Agent Operation — LLM as operator, not oracle. Q&A against the knowledge base, health checks, consistency validation, incremental improvement. The LLM doesn’t just read — it works within the system.

Output Rendering — Making results visible. Markdown files, slides (Marp), visualizations (matplotlib), search interfaces. Outputs flow back into the system and enrich it.
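As a concrete picture of the first layer, here is one hypothetical way the ingest step might land on disk. The directory names and frontmatter fields are assumptions for illustration, not a standard:

```python
from datetime import date
from pathlib import Path

# One hypothetical vault layout for the four layers:
#   raw/   - ingest: captured source documents
#   wiki/  - structure: LLM-compiled pages with backlinks
#   out/   - output: slides, charts, exports that flow back in

def ingest(vault: Path, title: str, body: str, source_url: str) -> Path:
    """Layer 1: file a captured document into raw/ with minimal YAML frontmatter."""
    raw = vault / "raw"
    raw.mkdir(parents=True, exist_ok=True)
    slug = "-".join(title.lower().split())
    path = raw / f"{slug}.md"
    frontmatter = (
        "---\n"
        f"title: {title}\n"
        f"source: {source_url}\n"
        f"captured: {date.today().isoformat()}\n"
        "---\n\n"
    )
    path.write_text(frontmatter + body, encoding="utf-8")
    return path
```

The point of the frontmatter is the Knowledge Structure layer downstream: provenance and dates are what let an LLM (or a human) decide what a page is and how current it is.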

3 — Why Markdown and Git Win

Karpathy could have chosen any format. Notion, Roam, a database. Instead: Markdown files in a directory. This isn’t accidental — it’s the result of a silent selection process.

Markdown is the format LLMs natively understand. No parsing required, no API, no vendor lock-in. Git provides versioning, branching, and collaboration — free, robust, battle-tested for decades. Obsidian is a viewer on the file system, not a proprietary system. The data belongs to you, not the tool.

The consequence: a knowledge system on this stack outlives every single one of its components. Claude Code gets replaced? The Markdown files remain. Obsidian disappears? The directory structure works with any editor. This isn’t a theoretical consideration — it’s a design decision with real consequences for the durability of knowledge work.
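That durability can be demonstrated directly: because the vault is plain files, a few lines of standard-library code stand in for any given tool. A sketch of tool-agnostic full-text search over an assumed Markdown vault, with no Obsidian, no database, and no API:

```python
from pathlib import Path

def search(vault: Path, term: str) -> list[tuple[str, int, str]]:
    """Case-insensitive full-text search; returns (file, line number, line)."""
    hits = []
    for page in sorted(vault.rglob("*.md")):
        for lineno, line in enumerate(
            page.read_text(encoding="utf-8").splitlines(), start=1
        ):
            if term.lower() in line.lower():
                hits.append((page.name, lineno, line.strip()))
    return hits
```

If Obsidian disappears tomorrow, this (or plain grep) still works, which is the durability argument in executable form.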

4 — The Critical Split: Wiki vs. Operating System

This is where it gets interesting. Karpathy describes a system where the LLM maintains a wiki and answers questions on demand. That’s read-heavy — knowledge gets collected, structured, queried. The human asks questions, the LLM researches.

But the pattern can also be write-heavy. When the LLM doesn’t just write summaries but actively operates within the system — propagating changes, checking consistency, structuring tasks, generating outputs for different channels — it’s no longer a wiki. It’s an operational work system.

The difference isn’t gradual — it’s qualitative. A wiki is a knowledge archive with intelligent access. A Knowledge OS is a work environment where human and LLM operate together. Both use the same stack — but the ambition is different.

Karpathy hints at this direction when he writes: “Often, I end up ‘filing’ the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always ‘add up’ in the knowledge base.” That’s the beginning of a feedback loop that goes beyond pure Q&A.
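The filing half of that loop is easy to sketch: each answer becomes a page of its own, linked from an index so queries "add up" over time. The naming conventions here are hypothetical, not Karpathy's:

```python
import re
from pathlib import Path

def file_answer(wiki: Path, question: str, answer: str) -> Path:
    """File an LLM answer back into the wiki as its own page,
    linked from a QA index so future queries can build on it."""
    wiki.mkdir(parents=True, exist_ok=True)
    slug = re.sub(r"[^a-z0-9]+", "-", question.lower()).strip("-")[:60]
    note = wiki / f"qa-{slug}.md"
    note.write_text(f"# {question}\n\n{answer}\n", encoding="utf-8")
    index = wiki / "QA-Index.md"
    entry = f"- [[{note.stem}]] {question}\n"
    existing = index.read_text(encoding="utf-8") if index.exists() else "# QA Index\n"
    if entry not in existing:  # filing the same question twice stays idempotent
        index.write_text(existing + entry, encoding="utf-8")
    return note
```

Once answers are pages, they are subject to the same linting, linking, and Q&A as everything else, which is exactly where "wiki" starts shading into "operating system."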

5 — What’s Missing: Team Scale and Standards

Karpathy’s setup is — like most of these systems — a solo project. That works as long as one person controls the context. The open question: What happens when multiple people work in the same system?

Collaborative context architectures are largely unsolved. How do you share conventions? How do you prevent context drift across multiple agents? How do you scale maintenance? Karpathy himself says: “I think there is room here for an incredible new product instead of a hacky collection of scripts.”

Maybe. But maybe the product isn’t an app — it’s a set of conventions. A standard for the architecture of LLM-powered knowledge systems. Not the tool, but the practice.

Critical Assessment

What holds up

  • The convergence is real and empirically observable — different communities independently arrive at the same fundamental decisions
  • Markdown + Git as a foundation for LLM interaction has proven robust in practice
  • The four-layer architecture (Ingest → Structure → Operation → Output) is a useful model, even though boundaries are fluid
  • Karpathy’s observation that RAG is often unnecessary at moderate scale aligns with practical experience

What needs context

  • Survivorship bias: we see the successful setups — not those who tried Notion, Roam, or proprietary tools and gave up
  • Karpathy’s perspective is that of an ML researcher with high technical competence — accessibility for non-technical users remains an open question
  • “Convergent evolution” is a strong metaphor, but the sample is small and self-selected (tech-savvy early adopters)
  • The scaling question (team, enterprise) isn’t just open — it may be a fundamentally different problem than the solo setup

Discussion Questions for the Next Lab

01 Convergence or echo chamber: Do practitioners genuinely arrive at the same stack independently — or does the visibility of certain setups (Karpathy, Willison) influence others’ decisions? How much is real convergence, how much social imitation?

02 Wiki vs. OS: Where exactly is the boundary between an LLM-maintained knowledge archive and an operational Knowledge OS? Is the transition gradual — or is there a qualitative leap where requirements fundamentally change?

03 Standards vs. products: Karpathy sees room for “an incredible new product.” Is the answer actually a product — or rather a set of conventions and patterns that different tools can implement? What would be the “HTTP for Knowledge OS”?

04 Accessibility: These systems require Markdown literacy, Git understanding, and LLM experience. How do you democratize that — without losing the advantages of the simple stack?


Glossary

Convergent Evolution (here: technological) Independent actors develop the same solution to the same problem — not through coordination, but through identical selection pressures. In biology: wings in birds and bats. Here: Markdown + Git + LLM across different practitioners.

Knowledge OS A structured, git-versioned knowledge repository that goes beyond an archive — the LLM actively operates within the system (consistency checks, task structuring, output generation) rather than just answering questions.

Context Engineering Designing the persistent information environment in which an LLM operates — beyond individual prompts. Encompasses file structures, conventions, dependency systems, and layered context architectures.

Compiled Wiki (Karpathy) A Markdown wiki that an LLM generates from raw data and incrementally maintains — with summaries, categories, and backlinks. Read-heavy, with the LLM as maintainer.

Curated by David Latz · Panoptia April 2026
