
Notes that compound: a wiki the LLM maintains for itself

Written with Claude. I wrote the ideas and structure; Claude helped refine the prose.

The unit of memory for an LLM, as currently practiced, is the conversation. You open a chat, you ask questions, the model answers. Tomorrow you open a new chat and the model knows nothing about the work you did the day before — not because it forgot, but because there is nothing to remember. The session is the artifact; the artifact is gone.

This was tolerable when sessions were the unit of work. They aren’t anymore: agents now run for hours, projects span months, and the conversation as memory unit is showing the strain.

Anything you actually want to keep — the cross-reference you formed between two articles, the synthesis the model wrote at the end of a long thread, the contradiction you flagged between sources — has to live somewhere durable, or it doesn’t compound. Chats are ephemeral. RAG is retrieval. Memory features are personalization. None of them is accumulation.

| System | What persists between sessions |
| --- | --- |
| Chat | Nothing |
| RAG | The raw documents, re-summarized on every query |
| Memory features (mem0, Letta, Claude/ChatGPT memory) | Facts about you — preferences, projects, history |
| LLM wiki | Synthesized understanding of a topic |

Andrej Karpathy posted a gist in April sketching what the fourth row should look like.


The proposal

The idea is a folder of markdown files the model writes for itself. As you feed it sources, it produces summary pages, entity pages, and concept pages — with cross-links between them, contradictions flagged inline, an index that catalogs the whole. When you ask a question, the model reads its own notes first and synthesizes. If the synthesis is worth keeping, it files the answer back as a new page.

“The LLM is rediscovering knowledge from scratch on every question. There’s no accumulation.” — Karpathy, LLM Wiki

The structural insight is that the wiki has three layers, each with strict ownership:

  • raw/ — the source documents you’ve curated. The model reads from here but never writes.
  • wiki/ — markdown pages the model writes. The model owns this layer entirely; you read.
  • A schema file — a rulebook describing how the wiki is organized. The model writes it once during init. The human edits it as conventions evolve.
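On disk, the three layers might look like this — using the espresso example from later in the post; the exact placement of the schema file and the per-page names are illustrative, not prescribed by the skill:

```
espresso-wiki/
├── WIKI.md                      # the schema: taxonomy, conventions, log format
├── raw/                         # human-curated sources; the model reads, never writes
│   └── espresso-wikipedia.md
└── wiki/                        # model-written pages; the human reads
    ├── index.md
    ├── sources/
    │   └── espresso-wikipedia.md
    └── concepts/
        └── crema.md
```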

Three-layer architecture of an LLM wiki. Top: WIKI.md, the schema file (the contract). It declares page taxonomy, frontmatter contract, linking convention, log format. Written once during init; governs every operation that follows. Dashed arrows point down from the schema to two layers below. Bottom left: raw/ — source documents (articles, papers, PDFs, pasted text, captured discussions). The human curates this layer; the LLM reads only, never writes. Bottom right: wiki/ — LLM-written pages (source summaries, concept pages, index, log). The LLM writes; the human reads and edits if needed.

The schema is the layer that does the most work, and it does almost none of it on the day it’s written.

The gist sketches the idea but doesn’t ship code. A few weeks ago I packaged it as a Claude Code skill.


Four operations, one that’s actually different

The skill exposes four operations: init, ingest, query, lint. Three of them do work on the wiki. One of them does something else — and it’s the one most likely to be dismissed as setup.

Init writes the schema. This is the single most important step, because once the schema exists, every future session — even sessions that don’t have the skill loaded, and in principle any LLM capable of following a written contract — can keep the wiki consistent by reading it. The skill’s leverage compounds outward from this one file.
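For concreteness, here is a hypothetical excerpt of what such a schema file might declare. The wording is mine, not the skill's actual output, but the four sections mirror what the article says the contract has to cover — page taxonomy, frontmatter contract, linking convention, log format:

```markdown
# WIKI.md — schema (illustrative excerpt)

## Page taxonomy
- wiki/sources/<slug>.md  — one summary page per raw document
- wiki/concepts/<slug>.md — one page per entity or concept

## Frontmatter contract
Every page carries: type, title, created, updated, sources.

## Linking convention
Every claim cites a source page via a relative markdown link.
Contradictions are flagged inline, never silently overwritten.

## Log format
Each operation appends one line: date, operation, pages touched.
```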

Ingest is where the work happens. The model reads a source, copies it verbatim into raw/, writes a summary page in wiki/sources/, then updates every entity or concept page the source touches. A meaningful ingest modifies somewhere between five and fifteen pages. If only the source page is touched, the cross-reference pass was skipped and the ingest is incomplete.
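The cross-reference pass is the part worth pinning down. In the skill itself this logic lives in prompts, not code, but a minimal sketch of the check — assuming pages are plain markdown strings in memory, with function and parameter names of my own invention — looks like this:

```python
def pages_needing_update(source_slug, source_terms, concept_pages):
    """Which existing concept pages mention terms from a new source
    but do not yet cite it? Those are the cross-reference pass's targets.

    source_slug:   slug of the newly ingested source, e.g. "espresso-machine-wikipedia"
    source_terms:  key terms extracted from the new source ("pressure", "bar", ...)
    concept_pages: dict mapping concept slug -> page text (frontmatter + body)
    """
    targets = set()
    citation = f"sources/{source_slug}.md"  # how a claim links back to its source page
    for slug, text in concept_pages.items():
        mentions = any(term.lower() in text.lower() for term in source_terms)
        already_cited = citation in text
        if mentions and not already_cited:
            targets.add(slug)
    return targets
```

An ingest that returns a non-empty set here but touches only the source page is exactly the incomplete ingest described above.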

Query reads the wiki’s own notes before answering. The model loads the index, opens the most relevant pages, and synthesizes with citations that link back. If the answer is a new map of the territory — a comparison, an analysis, a synthesis — the model offers to file it back as a new page.

Lint is the health check. Contradictions across pages, stale claims newer sources have superseded, orphan pages with no inbound links, concepts mentioned three times that should have their own page by now. A categorized report; fixes offered one category at a time.
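The orphan-page check is the easiest of these to make concrete. A sketch under simplifying assumptions — pages held as a dict of path to markdown text, links resolved naively by filename (a real lint would also whitelist the index, which legitimately has no inbound links):

```python
import re

# Markdown link targets ending in .md, e.g. [Crema](crema.md)
LINK = re.compile(r"\]\(([^)#]+\.md)\)")

def orphan_pages(pages):
    """Pages with no inbound links from any other page.

    pages: dict mapping path -> markdown text.
    """
    linked = set()
    for path, text in pages.items():
        for target in LINK.findall(text):
            name = target.rsplit("/", 1)[-1]
            if name != path.rsplit("/", 1)[-1]:  # ignore self-links
                linked.add(name)
    return {p for p in pages if p.rsplit("/", 1)[-1] not in linked}
```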

Three of these do what they say. Init looks like setup. It isn’t.


Watch the schema do its job

Words can argue this; the artifact has to show it. I ran the skill against two sources — the Wikipedia articles on Espresso and Espresso machine — and committed the wiki after each ingest. The intermediate state is recoverable from the git history.

Ingest #1 — the basics

The skill read the espresso article, copied it to raw/espresso-wikipedia.md, and created four pages:

wiki/sources/espresso-wikipedia.md       (the summary page)
wiki/concepts/espresso.md                (new)
wiki/concepts/crema.md                   (new)
wiki/concepts/extraction.md              (new)

Here is crema.md after the first ingest, exactly as it sat on disk:

---
type: concept
title: Crema
created: 2026-05-16
updated: 2026-05-16
sources: [espresso-wikipedia]
---

# Crema

The reddish-brown foam that sits on top of a well-pulled espresso shot.
Crema is an emulsion of coffee oils and dissolved CO₂ in water — the oils
are emulsified into a colloid, and the CO₂ is gas released during roasting
that the brewing pressure dissolves into the liquid (per [Espresso
(Wikipedia)](../sources/espresso-wikipedia.md)). Crema is unique to
espresso and does not appear in drip, pour-over, or other brewing methods.

## See also
- [Espresso](espresso.md)
- [Extraction](extraction.md)

One paragraph, one citation, two cross-references. A notebook page.

Ingest #2 — the machines

Then I gave it the article on espresso machines. The skill copied that to raw/, wrote a new espresso-machine.md page, and — without being asked — opened crema.md and added a section to it. Here is the diff against the file above:

---
type: concept
title: Crema
created: 2026-05-16
updated: 2026-05-16
sources: [espresso-wikipedia, espresso-machine-wikipedia]   # ← added
---

# Crema

(opening paragraph from ingest #1 — unchanged)

## How crema is produced

Crema forms because of sustained high pressure on the puck. Pump-driven
and lever-driven [espresso machines](espresso-machine.md) operate at
8–10 bar (per [Espresso machine (Wikipedia)](../sources/espresso-machine-wikipedia.md)),
which is enough to dissolve the roasting CO₂ and emulsify the oils
described above. Steam-driven machines generate lower pressure and produce
thinner, lower-quality crema for exactly this reason (per [Espresso
machine (Wikipedia)](../sources/espresso-machine-wikipedia.md)).

## See also
- [Espresso](espresso.md)
- [Espresso machine](espresso-machine.md)   # ← added
- [Extraction](extraction.md)

Three things happened here that wouldn’t have happened in a folder of notes I wrote by hand.

The wiki as a graph of pages, before and after the second ingest. Top panel — after ingest #1: one source page (espresso-wikipedia) at the top, connected by dotted citation lines to three concept pages below (espresso, crema, extraction). The three concepts are linked to each other by solid cross-reference lines. Five edges total. Bottom panel — after ingest #2: a new source page (espresso-machine-wikipedia, in purple) joins the original source at the top. A new concept page (espresso-machine, in purple) joins the original three concepts at the bottom. New citation edges from the new source point not only to the new concept but back into existing concepts (crema, extraction) — these are the cross-page updates the second ingest performed. New cross-reference edges connect espresso-machine to crema and extraction. The graph hasn't just grown wider; it has grown more interconnected, with the new source's information woven back into pages that already existed.

The crema.md page now knows the mechanism behind crema, not just its description. The opening paragraph — the part written from the first source — was left alone; the new claim was appended, not overwritten. The sources: list, the See also block, and the separate extraction.md page all updated in the same pass.

And look closely at the new section. The first sentence pulls a fact from the second source (the pressure range). The rest of the paragraph stitches it together with the CO₂ mechanism from the first. Neither source on its own contains that synthesis. The model wrote it because both pages were already in the wiki when the second source came in.

A query

I asked: “What’s the relationship between pressure and crema?”

The skill read wiki/index.md first, opened the two most relevant pages, and answered:

Crema is a pressure-dependent phenomenon. Pump-driven and lever-driven espresso machines force water through the coffee puck at 8–10 bar (concepts/espresso-machine.md), which dissolves the CO₂ released during roasting and emulsifies the coffee oils into the colloid described on concepts/crema.md. Steam-driven machines, which cannot sustain that pressure, produce thinner and lower-quality crema. Brewing methods without pressure — drip, pour-over — do not produce crema at all.

Two citations, both real, both pointing to pages I can open and verify. The answer itself was offered back as a candidate page — wiki/concepts/pressure-and-crema.md — because syntheses are exactly what the wiki is biased to preserve.

Now notice what didn’t happen. No vector search. No reloading the source documents into context. The model read the index, opened two pages, and wrote a paragraph. The work the schema is doing — declaring that the index exists, declaring that pages have a sources frontmatter field, declaring that every claim links back to a source page — is what made any of that possible. The wiki’s structure is what’s compounding, not just the wiki’s content.


The schema is the discipline

Three things distinguish an LLM wiki from a folder of markdown files you’d write yourself. They are not equally important.

The first matters most. The wiki is self-describing. A fresh model session, in a year, opening this folder with no skill loaded — and in principle, with a different LLM behind it — can read WIKI.md and pick up exactly where the last session left off. The schema declares the page taxonomy, the frontmatter contract, the linking conventions, the log format. Any model that reads it before performing an operation produces work indistinguishable from work the skill produced. This is what makes the discipline portable. The skill is a bootstrap. The schema is the protocol.

Schema portability across sessions over time. The WIKI.md schema sits as a persistent column on the right of the diagram — containing the page taxonomy, frontmatter contract, linking convention, log format, and ingest/query/lint rules. It is read before every operation and edited by the human as conventions evolve. On the left, four sessions stack from top to bottom across time. T0: Session 1, with the skill loaded, writes the schema during init (solid arrow). T1: Session 2, no skill loaded, reads the schema and performs an ingest (dashed arrow). T2: Session 3, with a different LLM in principle, reads the schema and performs a query (dashed arrow). T_N: Session N, a year later, reads the schema and performs a lint (dashed arrow). The footer reads: the schema outlives the session that wrote it.

The other two are mechanism. Cross-references happen on every ingest, not when the human remembers, because the model walks the index before writing. Contradictions get flagged in place with a > ⚠ Contradiction block, not silently overwritten, because the schema says so. Both behaviors fall out of the schema; neither requires the skill to be loaded.
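What a flagged contradiction looks like on a page is whatever the schema says it looks like. A hypothetical example in the convention named above — the second source and its claim are invented for illustration:

```markdown
Pump machines operate at 8–10 bar (per [Espresso machine
(Wikipedia)](../sources/espresso-machine-wikipedia.md)).

> ⚠ Contradiction: [Another source](../sources/another-source.md) claims
> commercial machines standardize on 9 bar exactly. Both claims kept
> until a further source settles it.
```

The point of the block is that the disagreement stays visible on the page, next to the claim it disputes, instead of one version silently winning.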


The downside compounds too

The mechanism that makes a wiki valuable also makes it dangerous. A bad synthesis, once written into a page during ingest, becomes part of the wiki’s “knowledge” of the topic — and the next query cites it as if it were established. The flywheel that integrates real information across sources will integrate a hallucinated claim with equal confidence. The longer the wiki runs, the harder a bad claim is to spot, because more downstream pages have cited it.

The schema constrains structure, not truth. Linting catches contradictions between pages; it doesn’t catch a smooth, plausible, wrong synthesis that disagrees with no one. The mitigation is the immutable raw/ layer — every wiki claim links back to a source page, every source page links back to a raw document the human curated. That trace is the audit trail. A wiki without source-page hygiene is a wiki drifting away from its sources, whether anyone has noticed yet or not.
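That hygiene is mechanically checkable. A sketch of the audit, assuming frontmatter has already been parsed into dicts (function and parameter names are mine): every slug a page cites in its sources field must correspond to a curated raw document.

```python
def broken_audit_trails(wiki_pages, raw_slugs):
    """Wiki pages whose frontmatter cites a source with no raw document behind it.

    wiki_pages: dict mapping page path -> parsed frontmatter dict (parsing elided).
    raw_slugs:  set of slugs for the raw/ documents the human curated.
    Returns {page path: missing slugs}; anything non-empty means the wiki
    holds claims that can no longer be traced back to a curated source.
    """
    broken = {}
    for path, meta in wiki_pages.items():
        missing = [s for s in meta.get("sources", []) if s not in raw_slugs]
        if missing:
            broken[path] = missing
    return broken
```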


The skill writes a contract and steps back

The skill lives at github.com/alainbrown/skills/tree/main/skills/llm-wiki. Four operations, three layers, one file that does the heavy lifting.

The pattern generalizes. For skills that need state to outlive a single session — wikis, scaffolded projects, decision trackers, anything that has to stay coherent across many invocations — the shape that works is this: write a durable artifact into the repo on the first run, then step out of the way. The artifact is the value. The skill is the bootstrap. Anything a skill has to keep doing, session after session, is a sign that the artifact wasn’t expressive enough.

Run init against any topic you’re learning — a research area, a book, a domain you’re moving into — and ingest as you go. Ten sources in, the model will know things about your topic that you’d have to re-read everything to recover. The conversation isn’t the unit of memory anymore.

The wiki is.