The Open Questions

The questions Anthropic has not answered publicly are more informative than the ones it has — and this chapter is an attempt to name them precisely, not to resolve them.

This is not a predictions chapter. Predictions in AI have a short half-life and a long embarrassment tail. What the corpus contains, underneath its analyses and arguments, is a set of unresolved tensions — places where the evidence runs in two directions simultaneously, where the company's stated positions and observable behaviors are not fully reconciled, where the next two years will produce evidence that bears directly on whether the founding wager was correct. These are the questions worth watching.

On governance under scale. The founding argument, as laid out in chapter 2, was that structure-first governance would hold under the pressures that culture-first governance had not withstood at OpenAI. That argument was made when Anthropic was an 11-person company. It is now a company with several hundred researchers and engineers, $7.3 billion raised from Amazon and Google, and a commercial roadmap that requires continued capability advancement to justify its valuation.

The structural question is not whether Anthropic has so far honored its RSP commitments; it appears to have. The structural question is whether the governance mechanisms designed for a small, capital-light research organization can hold their shape as the company scales into a large, capital-intensive platform business. The mechanisms have never been tested at this scale. They have never needed to hold against a competitor with Google's compute or Amazon's distribution. They are being tested now.

While founded on safety, Anthropic's Series B pitch deck utilized the exact same aggressive scaling logic ("train the best 2025/26 models... too far ahead for anyone to catch up") to attract investors. This reveals that Anthropic's leadership accepts the "winner-takes-all" dynamic of scaling laws. The schism was not about whether to build the singularity, but who should control the "kill switch" when it arrives. — Analyzing Anthropic GTM Strategy › THE KAPLAN SINGULARITY AND THE THERMODYNAMICS OF STRATEGY: A High-Entropy Analysis of Anthropic's Geopolitical and Economic Manifold (2025-2027) › 8. The Corporate Architecture: Governance and Capital › 8.1 The OpenAI Schism and the "Series B Irony"

On interpretability's tractability. The circuits program's early results — monosemantic neurons, edge detectors, curve detectors, small circuits with nameable functions — were produced on small-to-medium scale models. The working hypothesis of the interpretability program is that the same methodology scales. The open question is whether the complexity of frontier-scale models (hundreds of billions of parameters, emergent behaviors that appear discontinuously at scale) is compatible with the circuits approach's reductionist commitment — whether you can still find the invariants when the system is large enough that the invariants may not exist.

Despite these triumphs, mechanistic interpretability has collided with severe scaling limits. The historical trajectory of the field shows that techniques capable of dissecting small, toy models consistently break down when applied to frontier models like Claude 4.5 or GPT-5.2. The most glaring metric of this limitation is the Attribution Graph Failure. Anthropic's circuit tracing tools and attribution graphs, released in March 2025, can successfully map the full computational paths for only about 25% of complex prompts. Data indicates a notable 75% failure rate where the reasoning pathways remain entirely opaque to researchers. — Deep Research on AI Interpretability Gap › DEEP RESEARCH REPORT: The Interpretability Gap as a Cross-Domain Civilizational Risk › 3. The Mechanistic Interpretability Research Landscape (2024–2026) › 3.2 The Hard Walls: Scaling and Complexity Limits

This matters beyond its research implications. The RSP's current deployment gates rely primarily on behavioral capability evaluations, not interpretability results. Anthropic has made a public commitment that interpretability will become load-bearing as the program matures. If interpretability does not scale, that commitment becomes a different kind of commitment — a promise whose conditions were never met, requiring either a renegotiation of the RSP or a concession that the safety verification method the doctrine was built around was not the right method.

On strategic investment and commercial dependency. The founding was, in part, a bet that a different ownership and funding structure would be more capable of holding safety commitments than OpenAI's capped-profit structure had been. Anthropic chose a public benefit corporation structure. It has held to that structure.

What it has also done is raise several billion dollars from Amazon, which has strategic reasons to prefer a world where Anthropic's models run on AWS, and from Google, which has strategic reasons to prefer a world where Anthropic's models are available on Google Cloud and don't compete directly with Gemini. These are not passive investors. They have preferred infrastructure terms, deployment agreements, and commercial interests in Anthropic's success, and those interests will not always align with Anthropic's mission.

Anthropic, facing astronomical compute costs, has deliberately pursued a 'multi-cloud' strategy, funding its operations by selling equity stakes to its largest cloud providers. This has resulted in the unprecedented situation where its two main strategic investors, Alphabet and Amazon, are also its chief infrastructure providers and existential rivals. — Google DeepMind Acquiring Anthropic Feasibility › 3.1 A Tangled Web: Alphabet and Amazon as Co-Investors

The question the corpus does not settle is whether the PBC structure is adequate to govern these relationships, or whether the commercial dependency replicates — in a different legal form — the structural problem the founding was meant to avoid. This is not a cynical argument. It is a structural one. The founding bet was that institutional architecture matters more than individual intentions. The same lens applies to the current architecture.

On the competitive frame. The OpenAI story Anthropic's founders told — that the institution was structurally inadequate, that the commercial incentives and safety commitments were in a relationship the governance structure couldn't govern — has not played out cleanly as a validation of Anthropic's thesis. OpenAI has had real governance crises, a board that fired and then reinstated its CEO, a public benefit corporation dissolution process, and continued commercial pressure. It has also continued shipping at the frontier, attracting capital, and expanding its market share. The argument that the structure is inadequate has not produced the outcome the argument implied — that the structure would fail.

the reality of the free market and geopolitical competition forced a severe reckoning. In early 2026, Amodei and Anthropic instituted a highly controversial pivot with the release of RSP v3.0. The updated policy formally and publicly abandoned the commitment to a unilateral pause. The rationale behind this dramatic shift is rooted in the brutal game-theoretic reality of the AI arms race. The talent wars had escalated to unprecedented levels, with competitors like Meta offering $100 million signing bonuses to poach top researchers, creating an environment where any hesitation could be fatal to a company's leadership position. — Dario Amodei AI Insights Analysis › The Trajectory of Artificial General Intelligence: A Decade of Dario Amodei's Evolving Philosophy (2017–2026) › The Evolution of AI Safety Philosophy: From Optimism to Institutional Dread › The Game Theory of Safety: The Responsible Scaling Policy v3.0

What this means for Anthropic is genuinely unclear. If OpenAI's governance problems don't produce a catastrophic failure, the market evidence for "governance matters" becomes ambiguous. If OpenAI's governance problems do produce a catastrophic failure, Anthropic benefits — but a catastrophic failure in frontier AI may not be the kind of event from which any company benefits cleanly.

On the critical threads the corpus contains. The corpus includes material — investigated and labeled in front-matter as speculative — on Anthropic bias claims, alleged strategic maneuvers, and analyses that the company would likely dispute. The editorial decision about these threads matters: they can be carved out into a chapter that owns their speculative status, or they can be excluded from the book's source pool with that exclusion stated. Trying to have it both ways — implicitly drawing on them while not acknowledging the sourcing — would be the worst option.

Editorial aside: The methodology chapter (ch08) owns the corpus-is-model-generated disclosure. But the editorial position on the critical threads is this chapter's responsibility. The honest version is: the corpus contains analysis that hasn't been independently corroborated, that bears on questions of Anthropic's actual behavior versus its stated commitments, and that a reader who cares about those questions deserves to know exists — even if the book can't vouch for it. The carve-out is the right move.

The open questions are not evidence that the founding wager was wrong. They are evidence that the founding wager is still in play — that Anthropic has not yet been tested by the scenarios the founders were most worried about, and that the test, when it comes, will be harder to grade than anyone would like.

What this book was built on, how it was built, and what the methodology reveals about the questions it can and cannot answer — those belong to the coda, which also holds the one claim the earlier chapters couldn't make without giving the method away.