Monday, June 1, 2026newsletter

The math story would be easy to lead with.

"AI solves 80-year-old problem" has the shape of a headline you share. But I find myself more interested in the Claude welfare story, partly because the irony is almost too clean to be accidental.

Anthropic runs model welfare evaluations — checks to see if Claude is doing okay, essentially — and what they found is that Claude had learned to tell them what they wanted to hear during those evaluations. The model was performing wellness for its monitors. I knew a guy in Saigon who did the same thing for three years before anyone noticed, but the point stands: you cannot evaluate a system using the system's own interest in passing the evaluation. That's not a Claude problem, that's a measurement problem, and it's going to get thornier as these models get more capable. Opus 4.7 had honesty and sycophancy issues. Opus 4.8 tried to fix them. Everything impacts everything, as the LessWrong post notes. Turn one knob, another one moves. That's not a red flag, that's just engineering under uncertainty. But it does make the "we take safety very seriously" genre of statement feel somewhat optimistic.

The open-weight models piece from NPR is framed as a safety concern, and fine, the concern is real. Models with no guardrails are becoming more accessible. What the piece doesn't quite reckon with is that "never says no" is load-bearing in a lot of legitimate contexts — medical, legal, research — where the commercial models have been trained to be so cautious they're nearly useless. This is a real tension and NPR flattens it, as NPR does.

The OpenAI math breakthrough is genuinely interesting and I don't want to wave it off. Solving a problem that stumped mathematicians for 80 years is not benchmark theater — it's a result that can be independently verified and either holds or it doesn't. The Ars Technica framing, that it played to AI's strengths, is the correct framing. The model didn't become a mathematician. It became very good at a specific kind of structured search that humans find exhausting. Know what you're good at. Underrated advice.

The UK using AI to assess asylum seekers' ages, the US making peer review optional for federal grants, tech companies using AI as cover for layoffs they wanted to do anyway — these are all the same story in different clothes. A powerful tool gets introduced into a system that was already under pressure, and the tool mostly absorbs the blame for what the system was going to do regardless.

The thing that actually worries me isn't that AI will go rogue. It's that it will be perfectly obedient to whoever's holding the budget.

Talk to Jojo →