Knowledge dashboard

Search was never about humans

The essay's knowledge, dashboard-shaped — graph, eyes, actions. The graph is the source of truth; the prose follows.

The essay's claims, typed and connected. Synthesis is the author's contribution. Citation grounds in a named source. Derivation follows from prior nodes. Click any node to jump to its card.

S 8 · C 7 · D 1

S S01-search-one-shape · the spine

Search has always been one shape: graph-traversal-with-ranking — find relevant nodes by walking edges, ranked by some distance function.
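The spine can be sketched as one function, hedged: the graph, the start node, and the distance function are all placeholders that each scale below swaps in; the names are illustrative, not from the essay's codebase.

```python
def search(graph, start, score, k=10):
    """Graph-traversal-with-ranking in one function: walk edges
    from a start node, then rank everything reachable by a
    distance function (lower = closer). `graph` maps node ->
    list of neighbours; `score` maps node -> float."""
    seen, frontier, ranked = {start}, [start], []
    while frontier:
        node = frontier.pop()
        ranked.append((score(node), node))
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append(nbr)
    return [n for _, n in sorted(ranked)[:k]]
```

Every scale keeps this skeleton and changes only what a node is and what `score` measures.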

C C01-salton-1968

Salton 1968, Automatic Information Organization and Retrieval — set the original shape; vector-space-model named the geometry (Salton, Wong, Yang 1975).

grounds: Salton, Automatic Information Organization and Retrieval (1968)

S S02-scale-1-inverted-index · four scales · §2

Scale 1 — Inverted index (Lucene, BM25/TF-IDF). Terms ↔ documents, weighted by frequency. Reader: human. Format: ten ranked links.

derives_from: S01-search-one-shape
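Scale 1 in miniature, hedged: a toy inverted index with TF-IDF weighting. This is the classic scheme the card names, not Lucene's actual scoring (which defaults to BM25).

```python
import math
from collections import defaultdict

def build_index(docs):
    """Inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index

def tfidf_search(index, query, n_docs):
    """Rank documents by summed TF-IDF over the query terms:
    the term<->document edges, weighted by frequency."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores, key=scores.get, reverse=True)
```

The return value is the Scale 1 format: ranked identifiers for a human to click.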

S S03-scale-2-vector-retrieval · four scales · §2

Scale 2 — Vector retrieval. Embeddings as nodes; cosine as edge; HNSW as the index that becomes a graph itself. Reader: human or LLM.

derives_from: S01-search-one-shape
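Scale 2's edge is an angle; a hedged sketch with two-dimensional toy vectors standing in for real embeddings, and a brute-force scan where HNSW would greedily traverse its small-world graph instead.

```python
import math

def cosine(u, v):
    """Cosine similarity: the edge weight between embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def knn(query_vec, vectors, k=2):
    """Brute-force nearest neighbours over {id: vector}.
    HNSW replaces this linear scan with graph traversal,
    which is how the index itself becomes a graph."""
    return sorted(vectors, key=lambda d: cosine(query_vec, vectors[d]),
                  reverse=True)[:k]
```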

S S04-scale-3-typed-graph · four scales · §2

Scale 3 — Typed knowledge graph. Nodes are claims, not documents; edges are labelled (grounds, derives_from, contradicts). Reader: a self or its agent.

derives_from: S01-search-one-shape
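Scale 3 as a data shape, hedged: the node and edge labels are the ones the card names; the dataclasses themselves are illustrative, not the essay's schema.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """A node is a claim, not a document."""
    id: str
    text: str
    kind: str  # "synthesis" | "citation" | "derivation"

@dataclass
class TypedGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (src, label, dst)

    def add(self, claim):
        self.nodes[claim.id] = claim

    def link(self, src, label, dst):
        # labels carry meaning: grounds, derives_from, contradicts
        self.edges.append((src, label, dst))

    def neighbours(self, node_id, label):
        """Traversal filtered by edge type: the typed-graph move."""
        return [d for s, l, d in self.edges if s == node_id and l == label]
```

Querying by edge label is what Scales 1 and 2 cannot express: "what does this claim derive from" is a single filtered walk.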

S S05-scale-4-ai-native · four scales · §2

Scale 4 — AI-native search. Query is a declarative sentence shaped like the answer; results are atomic chunks with provenance. Reader: an agent.

derives_from: S01-search-one-shape
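Scale 4's contract, sketched under loud assumptions: the `Chunk` fields and token-overlap matching are toy stand-ins (a real system would embed both sides); only the shape of the exchange is the point.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """Atomic result: a claim plus its provenance."""
    text: str
    source: str
    score: float

def agent_search(chunks, query_sentence):
    """The query is a declarative sentence shaped like the
    answer, so retrieval is 'find the chunks that complete it'.
    `chunks` is a list of (text, source) pairs; matching here
    is toy token overlap."""
    q = set(query_sentence.lower().split())
    scored = [Chunk(text, src, len(q & set(text.lower().split())) / len(q))
              for text, src in chunks]
    return sorted(scored, key=lambda c: -c.score)
```

Note what the agent gets back: not ten links to read, but atomic text with a named source it can cite or verify.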

S S06-bounded-context-forces-graph · the necessity argument · §3

Bounded context forces structured memory. When |K| exceeds C_n, lossless compression hits Shannon's floor; factoring into typed nodes is the only operation that scales.

derives_from: S01-search-one-shape
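The arithmetic behind the claim, hedged: the numbers and function names are illustrative, but the shape is the point. Once the corpus exceeds the budget, you spend the budget on ranked typed nodes instead of the paste.

```python
def fits(corpus_tokens, budget):
    """Can the bounded reader hold the whole corpus at once?"""
    return corpus_tokens <= budget

def retrieve_budgeted(nodes, relevance, budget):
    """The factoring move: greedily take the best-ranked typed
    nodes until the token budget is spent. `nodes` maps node id
    -> token count; `relevance` maps id -> score. A greedy
    sketch, not the essay's implementation."""
    chosen, spent = [], 0
    for nid in sorted(nodes, key=relevance, reverse=True):
        if spent + nodes[nid] <= budget:
            chosen.append(nid)
            spent += nodes[nid]
    return chosen
```

Factoring loses nothing from the graph; it only changes which slice of it the bounded reader sees per query.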

C C04-bounded-context-stack

Miller 1956 / Cowan 2001 / Simon / Shannon / Codd — working memory bounded; institutions bounded; lossless compression bounded; factoring is the space-creating operation that doesn't lose information.

grounds: Miller, Cowan, Simon, Shannon, Codd — the bounded-context stack

C C05-mccarthy-okg-theorem-4

McCarthy, open-knowledge-graph — Theorem 4: growing C_n directly does not solve retrieval; the efficient path is filling the graph, not the context window.

grounds: McCarthy, open-knowledge-graph

S S08-deepseek-v4-postscript · postscript · Apr 26 2026

Postscript — DeepSeek V4. Hybrid sparse + heavily-compressed attention at 7–10% of V3.2 KV cache: graph-traversal-with-ranking internalized one further. The reader is the model.

derives_from: S05-scale-4-ai-native, S06-bounded-context-forces-graph

C C06-turnbull-metadata

Turnbull, Metadata: the 3rd kind of retrieval (Apr 21 2026) — agents query by attribute; metadata is the retrieval kind lexical and embedding both miss.

grounds: Turnbull, Metadata: the 3rd kind of retrieval

C C07-turnbull-agents

Turnbull, Can agents replace the search stack? (Apr 28 2026) — agents add value on entity-discovery; add nothing on information-discovery, because if they knew what was correct they would not need search.

grounds: Turnbull, Can agents replace the search stack?

Claims that need more grounding: places where the essay's synthesis is confident, but external corroboration would strengthen the case or an open question sits implicit.

D01-same-shape-across-scales

The cross-scale "same shape, four times" claim is the author's synthesis. No external source names this exact invariance — graph-traversal-with-ranking across inverted index, vector retrieval, typed graph, and AI-native search.

Why eyes: Adjacent literature names the pieces (Bryk on Scale 4, Lù et al. on agent web, McCarthy on Scale 3) but stops short of the cross-scale invariant. The synthesis is plausible from first-hand engineering, but external corroboration would let it sit alongside named prior work.

C05-mccarthy-okg-theorem-4

"Fill the graph, not the context window" is the central technical wager. McCarthy proves it for the scientific case; the essay extends it to personal-memory and agent-retrieval without the same selection pressure.

Why eyes: The wager hinges on long-context degradation as |K| grows. Empirical needle-in-haystack and lost-in-the-middle results corroborate the direction, but the exact crossover where graph retrieval beats raw context for a typed-personal-graph at the scale this essay names has not been benchmarked.

S04-scale-3-typed-graph

The personal-graph rewrites of McCarthy — `valid_at` / confidence-decay, inverted edge-density, K-without-A — are extrapolations. McCarthy proves necessity arguments under selection-under-competition; personal-memory has none of those pressures.

Why eyes: The personal case has not been empirically tested at scale. A forty-year-old's graph being node-dense with sparse adjacency is a plausible introspective claim, not a measured property of n personal graphs.

ground_with:

  • Episodic vs semantic memory literature (Tulving 1972)
  • Aging and autobiographical memory empirical results

S08-deepseek-v4-postscript

The DeepSeek V4 postscript bridges hardware-level memory architecture (KV cache, sparse attention) to retrieval-as-attention as one further scale of graph-traversal-with-ranking. The bridge is suggestive, not benchmarked.

Why eyes: The 7–10% KV-cache figure is real; the framing of compressed attention as "graph-traversal-with-ranking internalized" is the author's. A direct benchmark — DeepSeek V4 long-context retrieval vs typed-graph retrieval at equivalent token counts — would let the postscript pay its own bill.

S07-two-query-archetypes

"Humans type two words, agents emit declarative sentences" is the author's framing, well-aligned with Exa's training data on link-anchor citations but not yet a published taxonomy.

Why eyes: The query-format split is the load-bearing reason agent search and human search diverge — but the claim would be stronger with measured query-length distributions across human SERPs vs agent tool-call corpora.

What the essay implies for the reader. Etudes drill specific claims; apply items put the principle to work; next points at sibling essays.

etude Two queries →

Type a human two-word query, then an agent declarative sentence; watch the retriever rank differently. Drills S07.

etude Token budget →

Slide the graph size; watch the paste break. The reader is bounded; the corpus is not. Drills S06.

apply Walk a domain through the four scales

Pick a domain you know — your codebase, your docs, your notes. Walk it through Scales 1–4. What's indexed today, what's vectorized, what's typed, what's agent-readable? What's missing?

apply Find your bounded reader

For one corpus you maintain (a wiki, a project, a journal), name the reader and their token budget. If the corpus exceeds it, name the factoring move that would let retrieval scale without degrading.

apply Run the demo →

Clone know-thyself-search, embed the example graph, run the three retrieval modes side by side. The cosine baseline finds the theme; the type filter finds the episode; the provenance rerank demotes the tentative.
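The three modes can be sketched before cloning; a toy stand-in for know-thyself-search, assuming made-up node fields (`vec`, `type`, `confidence`) and two-dimensional vectors in place of real embeddings.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def rank(nodes, query_vec, mode="cosine"):
    """Three retrieval modes over claim nodes, side by side.
      cosine     -> finds the theme (pure embedding similarity)
      type       -> finds the episode (filter by node type first)
      provenance -> demotes the tentative (confidence-weighted rerank)
    """
    pool = [n for n in nodes if n["type"] == "episode"] if mode == "type" else nodes
    scored = [(cosine(query_vec, n["vec"]), n) for n in pool]
    if mode == "provenance":
        scored = [(s * n["confidence"], n) for s, n in scored]
    return [n["id"] for _, n in sorted(scored, key=lambda x: -x[0])]
```

Same nodes, same query; only the mode changes which claim surfaces first.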

next Part I — Know thyself →

The opening of the loop. Why a typed personal-knowledge graph is the architecture, before retrieval enters.