The Right Stuff

Chuck Yeager broke Mach 1 with two cracked ribs and a sawed-off broom handle taped to his right arm. Andrej Karpathy quit a running company on Tuesday morning to take an individual-contributor research role. Six CTOs went before him. Anthropic’s Mercury Seven is complete. The only question left is whether you have it too.

THE NUMBER: 7 — the Mercury Seven, announced by NASA on April 9, 1959, after a brutal six-month winnowing of 110 of America’s best military test pilots. Scott Carpenter, Gordon Cooper, John Glenn, Gus Grissom, Wally Schirra, Alan Shepard, Deke Slayton — seven names that became the face of American spaceflight for a generation. Chuck Yeager, the first man to break the sound barrier and the most accomplished test pilot in U.S. military history, did not have a college degree and was disqualified before the selection process even began. The greatest pilot of his era watched the program from Edwards Air Force Base, kept flying experimental aircraft until he aged out, and became a footnote in the era that the men he could out-fly defined. Sixty-seven years and forty days later, on the morning of Tuesday May 19, 2026 — the same morning Sundar Pichai opened Google I/O 2026 with a $190 billion annual capex flex, an agent named Gemini Spark, and the unveiling of a Universal Cart that turns the AI agent into a transactor — Andrej Karpathy posted a single short tweet that the entire AI industry stopped to read. He had started this week at Anthropic. He was working on pre-training under team lead Nick Joseph. He was building a team focused on using Claude to accelerate pre-training research itself. It was the seventh signing in a fourteen-month migration that had already pulled the CTOs of Adept AI, Super.com, Box, Instagram, You.com, and Workday out of leadership tracks at well-funded companies and into individual-contributor research roles at a single lab. Mercury Seven, completed. Karpathy used one word in his own announcement that he didn’t have to use, and it becomes the spine of tonight’s piece: frontier. “I think the next few years at the frontier of LLMs will be especially formative.” The greatest researcher of his generation chose the program. Six CTOs ran the same calculation before him. 
The question — for every reader of this newsletter who hasn’t asked it yet — is what you would do if NASA called your office tomorrow. Whether you have what those seven have. Whether you’d take the call.

Tom Wolfe published The Right Stuff in 1979, and the central conceit of the book, the question that runs underneath every page like a low hum of jet wash, is the simplest one in journalism. What is it that makes a man willing to sit on top of an enormous Roman candle and wait for someone to light the fuse? Wolfe spent eight years answering it. He interviewed Yeager and the Mercury Seven and their wives and their flight surgeons and their barkeeps. He came back with a 416-page book and one phrase that entered the language permanently. The right stuff. Not bravery. Not skill. Not pedigree. Something else. An intangible quality that separated the men who would put their hide on the line and then put it on the line again the next day (and the next day, and the next day, every next day, even if the series should prove infinite) from the men who would not. Wolfe’s central argument was that you could not define it because if you knew enough to define it, you didn’t have it. You could only recognize it. The men who had it recognized each other. The men who didn’t have it recognized nothing.

Phil Kaufman made the movie in 1983 and gave us Sam Shepard as Yeager: laconic, watchful, perpetually amused by NASA bureaucrats with their clipboards and their public-affairs offices. The opening sequence is one of the great scenes in American cinema. Yeager is on a horse with his wife Glennis at Pancho’s Place, a dive bar near Edwards, the night before he is supposed to fly the X-1 through the sound barrier for the first time in human history. The horse spooks. He goes over the front. Two cracked ribs. He cannot raise his right arm above his head, which means he cannot reach the canopy hatch of the X-1 to lock it shut before flight. He drives back to base in the dark and finds his flight engineer Jack Ridley, and the two of them saw off the end of a broom handle, and Yeager tapes the broom handle to his arm as a lever, and the next morning he flies a rocket plane through the sound barrier for the first time in the recorded history of human aviation. He does not tell the flight surgeon about the ribs. He does not tell the project director about the ribs. He does not tell anyone, because to tell anyone is to scrub the mission, and to scrub the mission is to admit that the demon in the air had won this round.

That is what the right stuff looks like. Not the part where you break the sound barrier. The part where you go up with two cracked ribs and a piece of broom handle because the alternative is not breaking it.

Tuesday morning at the frontier of large language models, a 39-year-old researcher posted a tweet with broken ribs.

Five Frontiers in Eleven Years

The thing about Karpathy — the thing the press never quite frames correctly — is that he doesn’t have jobs. He has frontiers. The résumé reads like a normal résumé until you tilt it forty-five degrees and notice that every move was a step outward on a curve that hadn’t existed the year before he stepped onto it.

2015 — OpenAI, founding member. The frontier was building a lab. Sam Altman, Ilya Sutskever, Greg Brockman, Wojciech Zaremba, John Schulman, Karpathy. A handful of names on a one-page mission statement and a billion-dollar pledge of philanthropic compute. There was no playbook for an AI research lab. There was no GPT yet, no transformer victory, no scaling law, no API to sell. Sit on top of the Roman candle. Light the fuse. See where it goes.

2017 — Tesla, Senior Director of AI. The frontier was putting deep learning into cars at fleet scale. Karpathy ran Autopilot from a blank whiteboard. The car wasn’t a research project. It was a production vehicle with regulators and lawyers and a CEO who would call you at 2 a.m. to argue about a bounding box. The frontier wasn’t whether the model could see. The frontier was whether you could ship the model to a million customers without killing any of them. Different problem. Same broom handle.

2023 — Back to OpenAI. The frontier had moved. GPT-3 was out. The scaling-law era had begun. ChatGPT had shipped the previous November and changed the temperature of every boardroom on the planet inside of six weeks. Karpathy returned to lead the team trying to make models that were useful at a level GPT-3 wasn’t yet. He was on the inside of what came next. Roman candle. Fuse. See where it goes.

2024 — Eureka Labs. The frontier was teaching the world to ride the candle. Karpathy left OpenAI and started an AI-native education company. He posted three-hour video lectures explaining transformers from scratch and watched them rack up tens of millions of views. He coined the term vibe coding in a single tweet in early 2025 and an entire ecosystem — Cursor, Lovable, Replit, T3 Code, Anthropic’s Claude Code — calibrated itself to a phrase he had thrown off in 280 characters. He was the most-followed AI account on X. He was running a company he owned. He could have raised any round he wanted. He could have stayed there until 2040. The right stuff was, by 2025, no longer in question. The question was what to point it at.

May 19, 2026 — Anthropic. The frontier is recursive self-improvement. Karpathy’s stated mandate at Anthropic is to build a team focused on using Claude to accelerate pre-training research itself. Using Claude to make Claude better. Read that sentence carefully. It is the AGI thesis stated plainly in six words. The bet is that the most consequential leverage in AI from this point forward is not more compute, not more data, not more researchers. It is the model itself becoming the binding input to the model’s own improvement. The work Karpathy did last year on auto-research already demonstrated what happens when you give an AI agent two days of unsupervised latitude over its own code. It found 20 things human review missed. That same instinct, applied to pre-training at a frontier lab with serious compute, is a qualitatively different research program from anything Anthropic, or anyone, has run before. Roman candle. Fuse. See where it goes.

Five frontiers in eleven years. Each one was the frontier for that moment. Karpathy didn’t switch companies. He kept walking forward on a curve whose tangent kept rotating. Wolfe would have recognized him on sight.

Where Else Was He Going

Here is the question every casual reader will skip past and the question every allocator should park on for ninety seconds: Karpathy chose Anthropic. Out of what menu?

Lay the candidates on the table.

Google had Demis Hassabis at DeepMind, a Nobel-laureate-grade research operation with twenty years of pedigree and a CEO who is himself a research-program-defining intelligence. Hassabis runs Google’s AI division. There is no seat next to Hassabis that doesn’t put Karpathy under him in an org chart. Pichai spent the entirety of Tuesday morning’s I/O keynote making the case that Google is now firmly in the agentic Gemini era — $190 billion in capex this year, 3.2 quadrillion tokens a month, 8.5 million developers, Gemini Spark as a 24/7 consumer agent, Antigravity 2.0 as an agent operating layer, Universal Cart as a commerce rail for agents, Gemini Omni as a unified any-input-to-any-output model. It was the largest single product announcement in the company’s history. None of it required a Karpathy. Hassabis and his lieutenants were already on the bench. Karpathy at Google is Yeager applying to be a copilot.

OpenAI is where Karpathy already worked. Twice. The pattern Tuesday’s news made undeniable is that Karpathy has now left OpenAI twice, once in 2017 for Tesla and once in 2024 for Eureka Labs, and this time he passed on the option of a third return. The reasons are not mysterious. Last Tuesday, Sam Altman spent four hours on a witness stand in Oakland answering attorney Steven Molo’s cross-examination about whether OpenAI had abandoned the charitable mission it was founded under in 2015. Yesterday, a jury took less than two hours to dismiss Musk’s $134 billion case against him on a statute-of-limitations technicality. The legal win cleared the path to an IPO. It did not clear the question. The OpenAI where Greg Brockman wrote a $25 million check to MAGA Inc. in late 2025 is not the same company as the OpenAI Karpathy helped found in 2015. The thing he helped found had a stated mission. The thing it became is a private corporation about to go public on the largest valuation in the history of private capital. A researcher who can choose his employer at the frontier does not choose a company whose soul has been litigated in a federal courtroom by the people who built it.

xAI is the option the casual reader writes off too fast. Karpathy spent five years working for Elon Musk at Tesla, building Autopilot from a blank whiteboard into a fleet-scale product. That is a real working relationship. The xAI seat was not theoretically open to him — it was practically open, with a CEO he knows and a compute substrate that is, by some margin, the largest single training facility on the planet. He didn’t take it. The structural truth that explains the choice is that xAI in 2026 is increasingly embedded inside SpaceX’s Colossus 1 compute fabric — a research lab whose intelligence layer is, structurally, a server farm with a chatbot bolted on. The Aligned News API reported overnight that Grok is now integrated as a paid subscription inside the Hermes agent — i.e., xAI’s frontier model is being resold as a feature inside someone else’s agent product. That is what commoditization looks like in real time. Karpathy does not go there. And here is the wrinkle the piece needs to acknowledge: Anthropic’s compute partnership announced earlier this year means Anthropic is now renting capacity from xAI’s Colossus 1 facility. Karpathy’s pre-training research is, in some non-trivial sense, going to be running on Musk’s metal. He found his way back to Musk’s infrastructure without taking the Musk job. The frontier rearranges the org chart, then rearranges it again.

Stay at Eureka Labs. This is the one the analysis usually skips. Karpathy had his own company. He had revenue, distribution, a research agenda he controlled, an education mission he is on the record as caring about deeply. He did not need a new employer. And the migration list around him was not made of small-cap founders trading up. The Workday CTO had Workday equity. The Box CTO had Box equity. The Instagram CTO had Meta equity — one of the most valuable single equity positions in the technology industry, a position that compounded inside one of the four FAANG-era survivors. Six people had platforms. Seven if you count Karpathy. They folded their own outside options — equity, leadership tracks, brand affiliation, the comfort that takes a decade to build — to take individual-contributor research roles at the same lab.

The phrase Wolfe gave us is the right one. Anthropic didn’t recruit them. The frontier did. Anthropic is just the only airframe currently rated for this altitude.

The Four-Layer Moat

Other newsletters will tell you Anthropic is winning because it has the best models. That is part of the story. It is not the structural story.

Anthropic in May 2026 has four moats, and the four are compounding on each other. That is what makes the talent flywheel run.

Moat one: technical slope. Claude Opus 4.7 scored 90.9% on Harvey’s BigLaw Bench last week — a benchmark Harvey itself wrote to measure substitution against billable legal work. Anthropic’s Agentic Coding Trends Report earlier this year quantified 98% more pull requests merged under heavy AI adoption, 91% longer review queues, and deployment velocity flat. Claude is, by the most credible enterprise benchmarks available, the workhorse model of the agentic era. The technical curve is steepening, and Karpathy’s mandate is to bend the curve further by closing the recursive loop.

Moat two: moral clarity. Dario Amodei built the company on a 2017 thesis paper that read like a map of today’s stack. He has been on the record longer than any other frontier-lab CEO that safety and capability are not separable. Brockman wrote checks to MAGA Inc. Altman litigated his own founding mission in federal court. Musk runs a server farm for SpaceX. Hassabis is institutionally constrained by Google. Amodei is the only frontier-lab CEO in 2026 whose stated values match his revealed values on the question of what this technology is for. That is rarer than the press makes it sound. It is also the necessary condition for the talent flywheel — because the people good enough to be recruited are good enough to do diligence on the recruiter.

Moat three: the talent flywheel. Karpathy makes seven. Once the seventh signing lands publicly, the pool of mid-cap tech CTOs reading the Anthropic announcement is no longer dealing with an isolated data point. They are dealing with a trend line. The Workday CTO’s calculus from March 2026 — do I make the leap? — becomes the calculus of every mid-cap tech CTO in June and July. The talent vacuum is the second derivative of the model curve. Anthropic’s model gets better, the migration headlines compound, the next CTO decides to leap, the lab gets better, the next CTO sees the headlines, and the loop closes. By the time the journalists name this dynamic — and they will, probably within sixty days — it will already be one cycle ahead of them.

Moat four: enterprise share that doesn’t show up in the model leaderboards. Ramp’s May 2026 AI Index — covered in our Real Estate Business issue last Thursday — put Anthropic at 34.4% of paid business adoption versus OpenAI at 32.3% and Google in low single digits. Anthropic grew 4x in twelve months. OpenAI grew 0.3 percentage points. Google’s $190 billion in capex this year is going into the infrastructure under the enterprise market — not the purchase decision in it. The corporate buyer is not buying Gemini today. The corporate buyer is buying Claude.

Four moats. Each compounding the other three. None of them is a temporary advantage. All of them feed each other. That is what allocators should be tracking. Not the next benchmark score. The structure of the moat.

The Bessemer Math Is Physics, Not Opinion

Last Thursday, Bessemer Venture Partners published a memo that should be required reading in every LP committee in America. The title: Inside the biggest bet in corporate history. Byron Deeter and Sam Bondy laid out the math on the table without flourish.

The five U.S. hyperscalers (AMZN, MSFT, GOOGL, META, ORCL) are projected to cumulatively deploy approximately $8 trillion in AI capex from 2025 through 2031. For reference, the cumulative budget of the U.S. Department of War over the same period is approximately the same number. Pichai’s $190 billion line item this morning is a single year of Google’s contribution. The math, in Bessemer’s words: for each $1 trillion of ramped capex to hit a 15% unlevered ROIC, you need $500 billion in incremental annual revenue at 30% margins. Run the full build-out: $6 trillion of new customer-facing software revenue is required by 2031 to justify the spend. Today’s global software revenue base is approximately $1.4 trillion. The math requires a 4x expansion of the global software market in five to six years, which is a ~30% CAGR, sustained.
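
The arithmetic in that paragraph is easy to check in a few lines. What follows is a minimal sketch in Python that uses only the figures quoted above (15% unlevered ROIC, 30% margins, $6 trillion of new revenue, a $1.4 trillion base). The function names are ours, not Bessemer’s, and reading the “4x” as new revenue measured against today’s base is an assumption about how the memo frames it.

```python
def required_revenue(capex_tn, target_roic=0.15, margin=0.30):
    """Incremental annual revenue (in $ trillions) such that
    revenue * margin equals capex * target_roic."""
    return capex_tn * target_roic / margin


def implied_cagr(multiple, years):
    """Compound annual growth rate that produces `multiple` over `years`."""
    return multiple ** (1 / years) - 1


# Figures quoted in the text, in $ trillions.
SOFTWARE_BASE_TN = 1.4
NEW_REVENUE_TN = 6.0

per_trillion = required_revenue(1.0)          # revenue needed per $1T of capex
multiple = NEW_REVENUE_TN / SOFTWARE_BASE_TN  # new revenue vs. today's base

print(f"Revenue per $1T of capex: ${per_trillion:.2f}T a year")
print(f"New revenue as a multiple of today's base: {multiple:.1f}x")
for years in (5, 6):
    rate = implied_cagr(multiple, years)
    print(f"CAGR to reach that multiple in {years} years: {rate:.1%}")
```

On these inputs the per-trillion figure comes out at $0.5 trillion, matching the $500 billion in the memo, and the growth rate lands between roughly 27% and 34% depending on whether the window is six years or five. That range is where the ~30% comes from.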

That number is physics. You can argue about the slope. You cannot argue about the equation. Spend $8 trillion, and somewhere downstream of that spend, $6 trillion of new revenue has to materialize or the spend was a misallocation. The hyperscalers are betting their balance sheets on the proposition that the demand shows up. Anthropic is betting its headcount on the proposition that better models create the demand. These are not the same bet. They are complementary bets that both have to be correct for the system to clear.

And here is the part the press is still mis-framing. Bessemer’s own memo says — and we quote — “Hyperscalers don’t have the same software IP position in this wave (especially at the foundation model layer), and they don’t have the same market power against the layers above them.” Translated: the value in this wave does not stay with the people who pour the concrete. It rotates up the stack to whoever owns the foundation model and whoever owns the application layer. Bessemer is saying, in BVP-speak, that the substrate is being eaten from above.

Which means the answer to the question we have been pressure-testing in this newsletter for two weeks — can both Google and Anthropic win? — is yes, but not symmetrically. Google captures the infrastructure value. Anthropic captures the foundation-model and application-layer value on top of that infrastructure. Microsoft captures the resale margin. The cloud-era pattern repeats with new actors in the old roles. AWS was a hit. Snowflake was a bigger hit. Both were real. Both were necessary.

Both also required somebody to lose.

The Three-Tier Hierarchy, and Who Gets Eaten

Tomasz Tunguz of Theory Ventures made the framing precise in his Q1 2026 earnings-call analysis: in this wave, you want to be at the inference layer or at its first derivative. Everything else gets commoditized. Karpathy and the six CTOs voted with their résumés. They moved to the inference layer. Up the stack. Not sideways.

The hierarchy reads cleanly once you draw it.

Tier 1 — the inference layer. Anthropic, OpenAI, Google DeepMind, xAI. Where the intelligence lives. This is where Karpathy is now. This is where the recursive bet pays off if it pays off. This is the layer that eats the layers below it as it gets better.

Tier 2 — the first derivative. Microsoft (compute, distribution, the Azure-Office-GitHub stack), Stripe (payments for agents), Datadog (observability, where 20% of customers now drive 80% of ARR), Twilio (voice rails). They benefit from the inference layer but they are not the inference layer. They are picks-and-shovels. Their margin is real. Their moat is the time it takes for the inference layer to bridge them.

Tier 3 — the substrate. Office, PowerPoint, Excel, Workday, ServiceNow, Salesforce CRM, every SaaS product whose value proposition was “we are the workflow.” Workflow lock-in evaporates when the agent is the workflow. Claude plus Google Sheets handles the spreadsheet. Gemini Omni handles the deck. The substrate doesn’t die the way Lotus 1-2-3 died — replaced by a cleaner version of the same thing. It dies the way Yahoo Mail is dying. Slowly, then all at once, with the smartest power users leaving first and the long tail clinging on until the network effect inverts.

Look at the migration list one more time. Adept AI was tier 1 (a frontier agent lab that didn’t make it). Box was tier 3. Instagram was tier 3 wrapped in network effects. Super.com was tier 3. Workday was tier 3. You.com was a tier 2.5 hybrid. Every CTO who jumped from a lower tier moved up to tier 1, and the one who started in tier 1 stayed in it. The pattern is not “talented people changing jobs.” The pattern is the inference layer recruiting itself upward, because the only place where the value rotates in this wave is the place those six people now sit.

That is the structural map. Now look at who has to lose.

Microsoft is not a loser in the EBITDA sense. They are one of Bessemer’s five hyperscalers. They have Azure, they have Office, they have GitHub, they have a board-level relationship with every Fortune 500 IT department in America. Their cash flow can defend the substrate for a decade while it erodes. But the value capture in this wave is not where it was in the cloud wave. They are reselling whoever wins the model race. The reseller margin is real. The reseller margin is not the same trade as the foundation-model margin.

And Microsoft, more than any other company on this list, remembers exactly what it looks like to be on the wrong side of this trade. In 1981, IBM released the IBM PC. They built it from off-the-shelf parts to ship fast and they licensed the operating system from a small company called Microsoft, founded in Albuquerque six years earlier. The hardware carried the IBM brand. The software was somebody else’s IP. By 1995, MS-DOS and Windows had become the platform; IBM had become a hardware vendor. In December 2004, IBM sold its entire personal computer business to Lenovo and exited the consumer market. Twenty-three years from launch to exit, on a trade where IBM thought the brand was the moat and the software was a commodity. The Albuquerque-born startup that licensed the OS is the company that now sits, four decades later, on a market cap north of $3 trillion. Microsoft was the company that ate IBM’s lunch. They know the structure of the trade from the inside.

Which is why the OpenAI partnership, covered in this newsletter for eighteen months as a strategic alliance, should be read going forward as a staging ground for absorption. Microsoft does not want to be IBM. Microsoft is not allowed to be IBM. The board will not permit it. The only structural escape from the IBM trade is to own the inference layer outright, which means absorbing OpenAI before OpenAI’s IPO closes the window. The cleared legal path to that IPO this week is the gun on the table. Every quarter OpenAI doesn’t IPO, Microsoft has leverage to absorb on better terms. Every quarter after OpenAI does IPO, Microsoft has to write a bigger check. Watch for this move inside twelve months. It is not a question of whether. It is a question of at what price. Microsoft is Switzerland today because Switzerland is a useful posture during a war. Switzerland after the war becomes whoever bought the most assets while the war was on.

xAI is the harder call. Embedded inside SpaceX’s Colossus 1 compute fabric, increasingly resold as a paid feature inside other companies’ agents, and effectively functioning as the intelligence layer for Musk’s own products — Tesla, Starlink, the rest. That is a real business. It is not a frontier-lab business. The Grok-inside-Hermes integration overnight is the tell. xAI’s frontier model is no longer the product. It is a feature of the SpaceX/Tesla compute stack. Server farm with a chatbot bolted on.

OpenAI is the most interesting case because the financial story and the structural story are pointed in different directions. The OpenAI legal win this week cleared the IPO path. The company is on track for one of the largest IPOs in the history of private capital. Cash flow is fine. Brand is fine. ChatGPT is still, by some margin, the most-used consumer AI product in the world.

And yet.

In December 1919 — Christmas, specifically — the owner of the Boston Red Sox, a Broadway theater impresario named Harry Frazee, sold the contract of a 24-year-old left-handed pitcher and outfielder named Babe Ruth to the New York Yankees for $100,000 in cash plus a $300,000 mortgage on Fenway Park. Frazee needed the money to finance a Broadway musical called No, No, Nanette. He had Ruth as collateral. He sold the franchise’s future for short-term cash because the short-term cash was sitting on the table and the future was not. The Yankees turned that trade into 27 World Series titles. The Red Sox did not win another championship until 2004. Eighty-six years of mediocrity for the team that let the star walk. The phrase the city of Boston used for those eight decades was the Curse of the Bambino.

OpenAI in May 2026 is the Red Sox in December 1919. The IPO path is the Broadway show. The cash is sitting on the table. The franchise’s future (the talent flywheel, the next generation of researchers who would have signed up to work on the original mission) is the Bambino, walking out the door. Twice. Karpathy left OpenAI in 2017 and again in 2024. In 2026, with a third return on the table, he went to a divisional rival instead. The Yankees signed him.

OpenAI will probably IPO. The cash flow will probably be fine for a long time. The trajectory of revealed preference is not pointed where the cash flow is pointed. Frazee’s Red Sox were also fine for a season. Then they were fine for half a season. Then they spent eighty-six years explaining why the team they built was no longer the team that won. The Curse of the Bambino is a real economic phenomenon dressed up as superstition. It is what happens to a franchise that trades the talent flywheel for cash.

The Wrinkle Under the Frontier

We owe the reader a counterargument before the close, because the spine of this issue is that recursive self-improvement at the inference layer is where the value rotates, and there is a contrarian story showing up in the same news cycle that complicates it.

This morning, on Show HN, a researcher named Antoine Zambelli published a tool called Forge. The pitch: a reliability layer for self-hosted LLM tool-calling. The eval result: an 8-billion-parameter open-source model (Ministral-3 8B Instruct Q8 on llama-server) scoring 86.5% across a 26-scenario agentic eval suite and 76% on the hardest tier. The same model without Forge’s guardrails — rescue parsing, retry nudges, step enforcement, context budgeting — scored 53%. With Forge, it scored up to 99% on standard tasks. The paper has been published in the ACM proceedings and the code is open source under MIT.
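
For readers who want a concrete picture of what “guardrails” means here, two of the four named techniques, rescue parsing and retry nudges, can be sketched in a few lines of Python. This is an illustration of the general idea only, not Forge’s code or API: the function names, the `{"tool": ..., "args": ...}` call format, and the stand-in model are all assumptions made for the example.

```python
import json
import re
from typing import Callable, Optional


def rescue_parse(raw: str) -> Optional[dict]:
    """Recover a JSON object from model output that may wrap it in chatter."""
    candidates = [raw]
    match = re.search(r"\{.*\}", raw, re.DOTALL)  # outermost {...} span, if any
    if match:
        candidates.append(match.group(0))
    for text in candidates:
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None


def call_with_guardrails(
    model: Callable[[str], str],
    prompt: str,
    allowed_tools: set,
    max_retries: int = 2,
) -> Optional[dict]:
    """Ask for a tool call; on a bad reply, nudge the model and try again."""
    message = prompt
    for _ in range(max_retries + 1):
        call = rescue_parse(model(message))
        tool = call.get("tool") if call else None
        if isinstance(tool, str) and tool in allowed_tools:
            return call
        # Retry nudge: restate the task and say exactly what was wrong.
        message = (
            prompt
            + "\n\nYour last reply was not a valid tool call. "
            + 'Reply with only a JSON object: {"tool": "<name>", "args": {}}. '
            + f"Allowed tools: {sorted(allowed_tools)}."
        )
    return None


def make_flaky_model() -> Callable[[str], str]:
    """Hypothetical stand-in for a small local model: first reply has no
    tool call, second reply buries a valid one in chatter."""
    replies = iter([
        "Sure, let me look that up.",
        'Here you go: {"tool": "search", "args": {"q": "mach 1"}} Hope that helps!',
    ])
    return lambda _message: next(replies)


result = call_with_guardrails(
    make_flaky_model(), "Find the speed of sound.", {"search"}
)
print(result)
```

The first reply contains no tool call, so the wrapper nudges and retries; the second buries valid JSON in chatter, so the parser rescues it. Step enforcement and context budgeting are left out of the sketch.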

The contrarian read: the frontier model arms race may be a sideshow. If a single researcher can take an open-source 8-billion-parameter model from 53% to 99% on agentic tasks by writing better plumbing, then most of the value the frontier labs are claiming is sitting in the engineering layer around the model, not in the model itself. Combine Forge with Tomasz Tunguz’s Localmaxxing thesis from earlier this month — running 50% of his daily VC workload on a local 35B model on a MacBook M5 — and you have a picture where the frontier labs are competing for a slice of the inference pie that most workloads do not need to buy.

Today, Google released Gemini 3.5 Flash — a frontier-class model at 4x the output speed and half the price of comparable frontier models. Pichai’s keynote claim: if top companies shifted 80% of their workloads from other frontier models to 3.5 Flash, they would save over $1 billion annually. The 80% number is doing a lot of work in that sentence. It is also doing real work. Most of what people need from a frontier model is not the frontier. Most of what they need is fast, cheap, and reliable — and the curve on that axis is bending toward commoditization much faster than the curve on the raw capability axis.

The wrinkle for tonight’s thesis: if recursive self-improvement at the inference layer pays off the way Karpathy is betting it pays off, the seven at Anthropic win. If the actual marginal dollar in this wave is spent on engineering around an 8B open-source model, then the value capture is more diffuse than the talent flywheel argument suggests — and a lot of the rotation up the stack happens at the engineering team layer, not at the foundation-model layer.

Both can be true. Both are probably partially true. But the allocator who reads this newsletter should not assume that the only game is the foundation-model game. A lot of the operational savings in 2026 and 2027 are going to come from teams who never bought a frontier API in the first place.

The Other Bet

There is a second wrinkle, and it is harder to dismiss than the Forge one because it goes at the architecture underneath the entire Karpathy thesis.

The bet Karpathy made on Tuesday is that pre-training large language models — text-first, autoregressive, transformer-architecture, scale-driven LLMs — is the leverage point that wins this era. Recursive self-improvement at the LLM layer. Bigger context windows, denser embeddings, better data curation, smarter loss functions, a model that can read its own weights and propose improvements. That is the bet. It is the bet his old company OpenAI has been running since 2018. It is the bet Anthropic has been running since founding. It is the dominant theory of frontier AI in 2026.

It is not the only theory.

One week ago, Mira Murati’s Thinking Machines Lab released Interaction Models — a system that ingests voice, video, and text in 200-millisecond streaming chunks and is explicitly built around the proposition that text-LLM architecture is the wrong substrate for what comes next. We covered Interaction Models last week in I Am Iron Man. The thesis Murati is running on, in her own writing, is that conversation is not turn-taking. That world-grade interaction is not text generation. That the future of AI is world models — systems that simulate the environment they’re acting in rather than predicting the next token in a string. Yann LeCun has been making this argument from the Meta side for four years. And on Tuesday morning, Sundar Pichai unveiled Gemini Omni — Google’s first model that generates samples in any output modality from any input modality. Video, image, text, voice. Any. That is not a text LLM. That is the architectural beginning of a world model with Google’s distribution underneath it.

Karpathy could have gone to Thinking Machines. He didn’t.

He went to the lab that is, by some margin, the most text-first frontier lab in the industry. Claude has historically been a text-and-tool-use model with limited multimodal surface. Anthropic does not have a Gemini Omni equivalent. The product roadmap has been steadily widening — vision, code execution, computer use, the legal practice-area workflows we covered last Thursday — but the architectural commitment is to LLM scaling, not to multimodal world-model fusion. Karpathy’s mandate doubles down on that commitment. Use Claude to make Claude better is a bet that the LLM is the right architecture and the remaining gains live in better training. It is not a bet on a different substrate.

If Karpathy is right, the seven at Anthropic win. If LeCun and Murati are right — if the future is multimodal world models with native physical grounding, and text-LLMs are the intermediate technology the way mainframes were intermediate to the PC — then the next phase of frontier research is going to require another stack, another architecture, and another several trillion dollars of capex on a different shape of silicon. The Bessemer math we walked through above assumes the current architecture clears the demand math. It does not assume the architecture changes underneath. If the architecture changes, the $8 trillion is the down payment, not the build-out.

The piece tonight does not resolve this. Tonight’s news doesn’t resolve it. Karpathy made his bet today; LeCun, Murati, and Google made the alternative bet earlier this year and earlier this week. The market will not know which one is right for eighteen to thirty-six months. The allocator who is paying attention should be tracking both bets in parallel and pricing the architectural shift as a non-trivial probability. The cost of being wrong on this question is not measured in stock prices. It is measured in which company name appears on the trillion-dollar valuation in 2030.

The Mercury Seven assembled around a particular bet about which airframe was rated for orbital flight. They were right. They were also lucky. The X-15 program at the same time was pushing a different architecture (suborbital, air-launched, rocket-plane), and it produced as many world-changing pilots as Mercury did, by a different route, on a different curve. History remembers Mercury because Mercury’s capsule lineage ran through Gemini to Apollo and the moon shot. History could just as easily have remembered the X-15 if the political winds had blown thirty degrees the other way. Karpathy is on the Mercury team. The X-15 team is not empty. Watch both runways.

The Mirror

You read this newsletter because you are an allocator of time and capital. Probably both. You are reading this on a phone, on a tablet, on a screen at your desk on a Tuesday night in May. Somewhere in your life there is a calculation you have been postponing because the data wasn’t all in yet. Tonight, more of the data came in. And tonight, the calculation is harder than the easy version of it.

The easy version is: do I have the right stuff? That is the question every reader who reads the Karpathy news as a personal call to action is going to ask themselves. It is also the wrong question, and we owe you the right one before this piece ends.

The right question is whether the program will catch you if you do.

The Mercury Seven were chosen from 110 candidates. Yeager didn’t make the cut. Not because he lacked the right stuff: by every available measure, he had it in larger quantities than the seven men who were picked. Three sound-barrier passes. Combat ace with 11.5 aerial victories in WWII, five of them in a single day. Test pilot of the X-1, X-1A, NF-104A, every experimental airframe at Edwards through the 1950s. He didn’t have a college degree. The institutional filter cut him out before the institutional filter knew what it was filtering for. By the time NASA realized what it had lost, Yeager was at Edwards flying a different kind of airframe, and the Mercury program was halfway through its run.

Yeager did not become a footnote because he lacked the right stuff. He became a footnote because the program didn’t know how to recognize him.

Now apply that to your own desk on a Tuesday night in May 2026. The Anthropic Mercury Seven is seven people. The migration list took fourteen months to assemble, and it assembled at a time when Anthropic was already an established company with thousands of employees, when the recruiter pipeline was already running at scale, when the resume pool flowing into its inbox was probably the deepest in the technology industry. Seven was the entire result. Several hundred thousand engineers, researchers, executives, and operators worldwide have what could plausibly be called the right stuff for this era. Seven got picked. The math is brutal. The recruiter is not going to call your office tonight. The recruiter is not going to call your office next year. The Anthropic program, like the Mercury program, is structurally too small to absorb the talent that wants to be in it.

Which means the actionable question for the COAI reader is not “should I leap?” It is “what airframe am I in?”

Yeager’s career after the 1959 Mercury announcement is the part the film version of The Right Stuff spends less time on, and it is the part that matters most for the person reading this letter tonight. Yeager kept flying. He took the NF-104A above 100,000 feet in a 1963 record attempt that ended in a flat spin and an ejection he barely survived. He commanded an Air Force tactical fighter wing. He flew 127 combat missions in Southeast Asia. He stayed at the frontier of something every year of his career until physiology made him stop. He did not sit at Edwards waiting for NASA to revise its admissions policy. He found a different airframe and kept pushing the envelope.

That is the actual mirror for the COAI reader. If you are reading this and you are at a tier-3 substrate company that is about to get bridged by the inference layer over the next decade — get off tier 3. You don’t have to be at Anthropic. You have to be at the frontier of something. If you are at a tier-2 first-derivative company that is fine for now but whose margin is going to compress when the inference layer eats one more bite of the workflow — start the calculation now, not in eighteen months. If you are running a tier-3 SaaS company — the question is not whether to stay or leave. The question is whether the company you are running has a path to tier-2 or it doesn’t. If it doesn’t, the company you are running is the Boston Red Sox in 1920, and you are Harry Frazee, and someone is going to write a book about you in eighty years.

The Karpathy story is not actionable as “go work at Anthropic.” You can’t. Most of you can’t. The Karpathy story is actionable as “what is the next airframe I move to.” Yeager kept flying. Karpathy kept walking forward on a curve whose tangent kept rotating. The frontier is not a destination. It is a discipline. The right stuff is not a quality. It is what you do every day to make sure you’re working on the question that matters most when the program comes looking.

The next time you tell yourself you are postponing a decision because the data isn’t all in yet, ask yourself the Wolfe question. What is it that makes a man willing to sit on top of an enormous Roman candle and wait for someone to light the fuse? If you can answer that question in your own voice, in your own life, on your own terms, you already know which airframe you are supposed to be in. You may never be on the Mercury Seven. Yeager wasn’t either. And Yeager is the protagonist of the book, not the seven men who made the cut.

THE PAYOFF. The Mercury Seven assembled itself in fourteen months. Six CTOs went first. Karpathy made seven on a Tuesday morning in May, on the same morning the largest tech company on earth held the biggest product launch of its history and could not, for one news cycle, outshine a single tweet from a single researcher. The Bessemer math says the spend has to clear. The structure says the value rotates up. The substrate gets eaten. The first derivative collects margin. The inference layer compounds. Microsoft has to absorb OpenAI before the IPO closes the window. The architectural bet on text-LLMs may or may not survive the world-model wave that LeCun, Murati, and Google are running on the other runway. Five things have to happen at once for the system to clear, and the press is covering one of them. If you are an allocator of time, your job tonight is to figure out which airframe you are flying. If you are an allocator of capital, your job is to figure out which airframe your portfolio is on. The Mercury Seven is closed. The frontier is still open. And the frontier doesn’t recruit.

The frontier reveals — and Yeager was the protagonist of the book.

The Daily 5

1. Karpathy joined Anthropic. Seventh signing in fourteen months. The talent flywheel completed its Mercury Seven on a single Tuesday. Source-date verified ✓ — TechCrunch, Axios, Karpathy’s own X account, all timestamped today.

2. Google I/O 2026 turned Gemini into the agent layer. Gemini 3.5 Flash, Gemini Spark, Antigravity 2.0, Universal Cart, Gemini Omni, Daily Brief, Google Pics, Android Halo. The largest single product release in Google’s history. $180-190B annual capex this year — 6x the 2022 number.

3. Gemini 3.5 Flash reset the speed-vs-intelligence floor. 4x faster than comparable frontier models, half the price, top of GDPval, #1 on Vals AI Finance Agent Benchmark. Caveat: some users report pricing pushback, so verify before quoting Pichai’s $1B savings claim.

4. Forge: an 8B open-source model goes from 53% to 99% on agentic tasks with the right guardrails. Antoine Zambelli, 26-scenario eval suite, ACM paper published. The contrarian wrinkle of the day — most agentic workflows don’t need a frontier API; they need better plumbing around a smaller model.

5. CodeMender vs. Mini Shai-Hulud. Google’s CodeMender auto-patches vulnerabilities; Snyk reports the Mini Shai-Hulud worm is stealing credentials and hooking into Claude Code + VS Code. The agent security arms race went fully autonomous on both sides in the same news cycle.

“What is it that makes a man willing to sit on top of an enormous Roman candle and wait for someone to light the fuse?”
— Tom Wolfe, The Right Stuff, 1979