Explainer

The brain comparison: useful, but not literal.

People compare retrieval-augmented generation (RAG) and language models to attention, memory recall, and working memory because the analogy helps explain how they work. It becomes misleading when the explanation is mistaken for identity.

Mar 9, 2026 · 4 min read · Analogy
[Figure: Diagram comparing prompt, memory recall, and context focus]

The analogy exists because the flow feels familiar. A person hears a question, focuses attention, recalls relevant memories, keeps the most useful details in working memory, and then answers. A RAG pipeline follows a similar pattern: it takes a prompt, retrieves relevant material, holds it in a limited context window, and generates a response.

What maps well

Prompt = attention cue: Your question tells the system what to focus on right now.
Retrieval = memory recall: The system looks up the most relevant information instead of relying on vague latent memory alone.
Context window = working memory: Only a limited amount of information can stay active for the current answer.
Generated response = spoken answer: The final output is the model turning that temporary context into language.
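
The four mappings above can be sketched as a toy pipeline. This is a minimal illustration, not a real RAG implementation. Every name here (DOCUMENTS, retrieve, build_context, generate) is hypothetical, retrieval is plain word overlap rather than embedding search, the "context window" is a simple word budget, and generate is a stand-in for an actual language model call.

```python
import re

# Hypothetical toy knowledge base; a real system would use a document store.
DOCUMENTS = [
    "The context window limits how much text a model can read at once.",
    "Retrieval looks up documents that are relevant to the question.",
    "Bananas are rich in potassium.",
]


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Memory recall: rank documents by word overlap with the question."""
    query_words = tokenize(question)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    # Keep only documents that share at least one word, best match first.
    relevant = [(score, doc) for score, doc in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:top_k]]


def build_context(question, passages, max_words=40):
    """Working memory: keep passages only while they fit a fixed word budget."""
    kept, used = [], 0
    for passage in passages:
        size = len(passage.split())
        if used + size > max_words:
            break  # The budget is full, so later passages are dropped.
        kept.append(passage)
        used += size
    return "\n".join(kept + ["Question: " + question])


def generate(context):
    """Spoken answer: placeholder for a language model call (assumption)."""
    return "Answer based on:\n" + context


# The prompt is the attention cue that drives every later step.
question = "What does the context window limit?"
passages = retrieve(question, DOCUMENTS)
context = build_context(question, passages)
print(generate(context))
```

The word budget in build_context is the part that mirrors working memory: lowering max_words drops passages that were retrieved but no longer fit, so they never reach the final answer.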

Where the analogy breaks

The important caveat is that this is an analogy, not a literal claim that LLMs think like brains. Human memory is biological, emotional, embodied, and shaped by lived experience. LLMs are mathematical systems operating on token sequences.

Use the analogy for explanation, not for identity. It is helpful because it explains recall and focus. It becomes unhelpful when it is treated as proof that a model is conscious or human-like.