Coda + Methodology

Part I — The Next 24 Months

The book's claim is not that Anthropic has the right answers. The book's claim is that Anthropic is asking the right questions — and that watching how it answers them over the next two years will tell us more about the field's direction than any single paper or product announcement.

Three questions are worth watching specifically, not because they are the only ones, but because they are structurally load-bearing. If you follow them, the rest of the picture falls into place.

The first: Does the Responsible Scaling Policy hold at ASL-4? The RSP has been tested at the margins — updated, revised, applied to ASL-3 systems — and the company appears to have honored its commitments. The commitments have not yet been expensive. ASL-4 triggers, when they arrive, will require either deployment gates the company has not yet built or a pause in deployment that no frontier lab has yet voluntarily taken. The credibility of everything Anthropic has claimed about governance depends on what it does when the cost of the commitment becomes real.

The second: Does interpretability produce a result that changes a deployment decision? The circuits program has been running for years. It has produced real knowledge — about how specific capabilities emerge, about the internal structure of specific behaviors. What it has not yet produced, to public knowledge, is a case where an interpretability finding caused a model to be pulled from deployment or a training run to be modified in a specific way. That case, when it comes, will be the program's first full test as a governance instrument rather than a research one.

The third: What happens when a major strategic investor's interest and Anthropic's mission diverge visibly — not hypothetically, but in a specific deployment or partnership decision? The Amazon and Google relationships are structured to be beneficial to both parties under normal conditions. Normal conditions don't test structural commitments. The test comes when a large customer wants a deployment that the RSP doesn't permit, or when a strategic investor's competitive interests push against a research agenda the company is committed to. That moment is coming. The structure's behavior in that moment will be definitive.

While formal S-1 paperwork has not been officially filed with the Securities and Exchange Commission, strategic maneuvering—including the high-profile retention of specialized outside counsel deeply associated with public offerings—strongly indicates that Anthropic is aggressively positioning for an IPO in late 2026. An IPO, currently projected by analysts to raise upwards of $60 billion, serves multiple critical strategic imperatives for the firm: The pursuit of Artificial General Intelligence requires sustained, potentially limitless capital expenditure on energy and physical infrastructure that even unprecedented $30 billion private funding rounds cannot indefinitely support. — Boris Cherney, Claude Code, Anthropic › The Architecture of Autonomy: Boris Cherny, Claude Code, and Anthropic's Trajectory Toward AGI and IPO › The Financial Horizon: Hyper-Scaling, Revenue, and the Imminent IPO › The Path to IPO

The book doesn't have answers to these questions. It has a corpus of primary source material that was assembled to track them. The methodology section explains what that corpus is and how it was used.

Part II — Methodology

This book was built on a retrieval system. Every quoted passage in the preceding chapters exists as a verbatim substring of a real document in a personal research corpus. None were paraphrased. None were constructed after the fact. If a cj query (a query against the retrieval system described below) can't find a phrase, the phrase doesn't appear in the book. That constraint is the load-bearing trust mechanism.

Here is what's underneath it.

The corpus. Approximately 727 documents, accumulated over twelve months of deep-research sessions using Claude and Gemini. The sessions covered Anthropic's published research, its policy documents, its public statements, secondary analyses, and my own analytical writing about the company: in total, several million words of material organized around a consistent set of questions. The corpus is not archival journalism. It is a structured body of primary-source-anchored research and analysis, built specifically to support the kind of book you have just read. The honest disclosure: most of the documents in the corpus are themselves model-generated, synthesized from sources by Claude or Gemini during research sessions rather than transcribed from physical records. The methodology chapter of a book built on AI-generated primary sources has to say this plainly, because leaving it unsaid would invite the easiest critique of the book.

The retrieval architecture. The system is called cj-retriever — Context Jamming Retriever. It runs locally: SQLite for document and chunk storage, sqlite-vec for the vector index, FTS5 for BM25 keyword retrieval. Embeddings use Voyage AI's voyage-3-large model at 1024 dimensions. Reranking uses Voyage's rerank-2. Query planning and synthesis use Claude Sonnet with an agentic tool-use loop — the model decides when it has retrieved enough material and stops, rather than running a fixed retrieval depth. The public repo is BretKerrAI/thought-molecules-rag.
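One way to picture how a keyword result list and a vector result list could be merged before reranking is reciprocal rank fusion. This is a sketch under stated assumptions, not a description of cj-retriever: the text above does not say how the FTS5 and sqlite-vec results are combined, and the function name, the chunk IDs, and the constant k=60 are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one ranked list.

    Each chunk earns 1 / (k + rank) from every list it appears in, so a
    chunk that ranks well in both keyword and vector search rises to the top.
    ASSUMPTION: the fusion method and k=60 are illustrative choices only.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists, best match first.
bm25_hits = ["c7", "c2", "c9"]     # stand-in for FTS5 / BM25 keyword results
vector_hits = ["c2", "c4", "c7"]   # stand-in for vector-index results

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

In a pipeline shaped like the one described, a fused list of this kind would be the candidate set handed to the reranker.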

The verbatim contract. Every quote in this book was produced by the retrieval pipeline and then passed through a substring verifier before publication. The verifier is not fuzzy-matching or semantic matching — it is a literal substring check against the stored chunk content. If the retrieved passage exists word-for-word in the corpus, it passes. If it was hallucinated or modified in synthesis, it fails. This is the difference between summarization and citation. The book claims citation.

The verifier is the inventive step that makes the system usable as a trust mechanism rather than a research assistant. Without it, the retrieval system is a powerful drafting aid. With it, the retrieval system becomes a substrate for attested quotation — for building arguments whose source material is permanently auditable against a versioned corpus.
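The verbatim contract described above reduces to a very small check. The following is a minimal sketch, assuming chunk content is available as a plain string; the function name and the sample text are hypothetical and are not taken from the published repo.

```python
def verify_quote(quote: str, chunk_content: str) -> bool:
    """Return True only if `quote` occurs character-for-character in the chunk.

    No normalization, fuzzy matching, or semantic comparison is applied.
    """
    # An empty string is a substring of everything, so reject it explicitly.
    return bool(quote) and quote in chunk_content


# Sample chunk text invented for this example.
chunk = "The verifier performs a literal check against stored content."

exact = verify_quote("a literal check against stored content", chunk)
reworded = verify_quote("a literal check of stored content", chunk)
respaced = verify_quote("a literal  check against stored content", chunk)
print(exact, reworded, respaced)
```

The third case shows how strict a literal check is: a single doubled space is enough to fail, which is the behavior that separates citation from summarization.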

The provenance schema. Every document in the corpus carries a YAML front-matter block specifying the model that generated it, the model version, the generation timestamp, and a human_edits field whose value is none, substantive, or (for legacy documents) unknown. This schema is the audit trail. A reader who doubts a quote can request the chunk ID, trace it to the source document, and verify the provenance metadata. The spec is published in FRONT_MATTER.md.
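A front-matter check along these lines might look like the sketch below. Only the human_edits field name and its three values come from the text; the other key names (model, model_version, generated_at), the placeholder values, and the helper functions are assumptions for illustration, and FRONT_MATTER.md remains the authority on the real schema. The parser handles only flat key: value lines and is not a general YAML parser.

```python
ALLOWED_HUMAN_EDITS = {"none", "substantive", "unknown"}
# ASSUMPTION: every key name except human_edits is hypothetical.
REQUIRED_KEYS = ("model", "model_version", "generated_at", "human_edits")


def parse_front_matter(text):
    """Read a flat `key: value` block delimited by `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document has no front-matter block")
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    else:
        raise ValueError("front-matter block is not closed")
    return meta


def provenance_problems(meta):
    """Return a list of human-readable problems; empty means the block passes."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if not meta.get(k)]
    edits = meta.get("human_edits")
    if edits and edits not in ALLOWED_HUMAN_EDITS:
        problems.append(f"invalid human_edits value: {edits}")
    return problems


# Placeholder document; the values are invented for this example.
doc = """---
model: example-model
model_version: example-version
generated_at: 2025-01-15T10:30:00Z
human_edits: none
---
Body text of the research document.
"""

meta = parse_front_matter(doc)
print(meta["human_edits"], provenance_problems(meta))
```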

Licensed Memory as a Service (LMaaS). The architectural pattern this book demonstrates has a name. LMaaS is the pattern in which a personal or institutional corpus of LLM-generated research — accumulated over time, structured with provenance metadata, indexed for retrieval — is made safely re-injectable into new LLM-generated outputs through a hard verbatim verifier at retrieval time. The verifier converts the corpus from unattestable memory into citable source material. The pattern works because large language models are good at generating consistent, structured, retrievable knowledge over extended research sessions — better than humans at this task if the sessions are well-structured. The bottleneck is not generation; it is attestation. LMaaS solves the attestation problem at the infrastructure layer, not the prompting layer.
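To make the attestation step concrete, the sketch below shows one way a verified quote could be turned into a record that a reader can audit later against a versioned corpus. Everything here is an assumption for illustration: the AttestedQuote record, the attest and audit functions, the chunk ID format, and the use of a SHA-256 digest to pin the chunk version are hypothetical and are not taken from the LMaaS protocol specification.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AttestedQuote:
    chunk_id: str       # which stored chunk the quote came from
    quote: str          # the verbatim text
    offset: int         # character position of the quote inside the chunk
    chunk_sha256: str   # digest pinning the exact chunk version


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def attest(quote: str, chunk_id: str, chunks: Dict[str, str]) -> Optional[AttestedQuote]:
    """Return an attestation record, or None if the quote is not verbatim."""
    content = chunks.get(chunk_id)
    if content is None or not quote:
        return None
    offset = content.find(quote)  # literal substring search, no normalization
    if offset == -1:
        return None
    return AttestedQuote(chunk_id, quote, offset, _digest(content))


def audit(record: AttestedQuote, chunks: Dict[str, str]) -> bool:
    """Re-check a record: same chunk version, same text at the same offset."""
    content = chunks.get(record.chunk_id, "")
    end = record.offset + len(record.quote)
    return _digest(content) == record.chunk_sha256 and content[record.offset:end] == record.quote


# Hypothetical corpus with one chunk; the text is invented for this example.
corpus = {"doc1#3": "Attestation, not generation, is the bottleneck."}

accepted = attest("not generation", "doc1#3", corpus)
rejected = attest("generation is not", "doc1#3", corpus)
print(accepted, rejected)
```

A synthesis step built this way would drop any passage for which attest returns None, so that only text with a surviving record reaches the output.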

The protocol specification, the prior art this builds on (SCP/IPP and related standards-body work), and the patent filing will be linked at [lmaas.click] after publication. For readers from publishing, from labs working on grounded generation, from regulatory bodies thinking about AI-sourced citation standards, and from standards organizations thinking about machine-readable provenance, the contact channel is at that link.

The book ends here. The work begins here.

Pre-publish gates: (1) provisional patent filed — do not publish before confirmation; (2) spec URL live; (3) repo flipped to public; (4) patent attorney has reviewed this section's specific language. These gates are in book/README.md. This chapter is DRAFTS-ONLY until all four are clear.