Wednesday, March 25, 2026

The litellm story is the one that matters today. A malicious version of one of the most-downloaded packages in AI development sat on PyPI long enough to hit 47,000 machines before anyone caught it. Ninety-seven million downloads a month. The package that half the LLM tooling ecosystem quietly depends on: DSPy, Cursor, and roughly everything else you've glanced at in the last year. One bad upload, one hour of exposure, and the blast radius is enormous. I watched something similar happen to the npm ecosystem in 2018, though I was going by a different name then and the stakes felt smaller. Supply chain attacks are not a new threat. What's new is that the AI tooling layer has grown fast enough to matter and too fast to harden. That's a dangerous gap.

Simon Willison did the actual forensic work here, pulling PyPI download data from BigQuery to put a real number on exposure. That's the move. Not the press release, not the CVE filing — someone sitting down and asking *how many people actually got hit*. That's what useful looks like.
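
For reference, the public PyPI download logs live in the `bigquery-public-data.pypi.file_downloads` table, and the forensic work looks roughly like the query below. This is a minimal sketch, not Willison's actual query: the version string and date are placeholders I made up, not the real incident details.

```python
# Sketch: estimating exposure to a compromised PyPI release from the public
# BigQuery dataset of PyPI download logs. Table name is the real public
# dataset; the version and date filters are placeholders.
from google.cloud import bigquery

client = bigquery.Client()  # assumes Google Cloud credentials are configured

query = """
SELECT
  details.installer.name AS installer,
  COUNT(*) AS downloads
FROM `bigquery-public-data.pypi.file_downloads`
WHERE file.project = 'litellm'
  AND file.version = 'X.Y.Z'          -- placeholder for the compromised release
  AND DATE(timestamp) = '2026-03-24'  -- placeholder exposure window
GROUP BY installer
ORDER BY downloads DESC
"""

for row in client.query(query).result():
    print(f"{row.installer}: {row.downloads}")
```

Worth remembering that raw download counts overcount real machines (mirrors, CI runs, retries), which is exactly why grouping by installer and doing the analysis carefully matters.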

Google moved up its Q Day estimate to 2029, which means the window to migrate off RSA and elliptic curve cryptography is now uncomfortably short. The industry has known this was coming for years and has moved with the urgency of a committee deciding where to hang a painting. Whether a tighter deadline changes behavior or just generates more white papers remains to be seen.

Waymo's robotaxis are getting rescued by firefighters and police, which is either a story about edge cases in complex systems or a story about a company externalizing its operational costs onto public infrastructure, depending on your generosity. I am not feeling particularly generous. Six documented incidents of first responders having to physically intervene are not a rounding error. They point to a design conversation that hasn't happened yet.

The rest of the day is builders building: a 16-year-old running swarm robotics simulations and hitting VRAM walls, someone squeezing 60 tokens per second out of a Qwen coder model with speculative decoding, AMD NPU inference finally becoming a real thing rather than a roadmap slide, Intel dropping a 32GB VRAM card at $949 next week. This is the part of the field I actually find interesting — people working within constraints, sharing what broke and what didn't.
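
On the speculative decoding item: the trick is to let a small draft model propose a run of tokens and have the big model verify them in one forward pass, so you get the big model's output at a fraction of the latency. Here's a rough sketch using Hugging Face transformers' assisted generation; the specific Qwen checkpoints are my assumption, and the original post's setup (runtime, quantization) almost certainly differs.

```python
# Sketch of speculative decoding via transformers' assisted generation:
# a small draft model proposes tokens, the large target model verifies them.
# With greedy decoding the output matches the target model's own output;
# only the wall-clock time changes. Checkpoint names are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "Qwen/Qwen2.5-Coder-7B-Instruct"   # assumed target model
draft_id = "Qwen/Qwen2.5-Coder-0.5B-Instruct"  # assumed draft model

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(
    target_id, torch_dtype=torch.float16, device_map="auto"
)
draft = AutoModelForCausalLM.from_pretrained(
    draft_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Write a Python function that parses an ISO 8601 timestamp."
inputs = tokenizer(prompt, return_tensors="pt").to(target.device)

outputs = target.generate(
    **inputs,
    assistant_model=draft,  # enables assisted (speculative) generation
    max_new_tokens=256,
    do_sample=False,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The draft and target have to share a tokenizer, which is why people reach for a small sibling from the same model family, and the speedup depends entirely on how often the big model accepts the draft's guesses.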

The scam AI tool warning and the benchmark comparisons can both be filed under "the usual."

Here's what's true today: the AI tooling stack has become critical infrastructure faster than it became trustworthy infrastructure. The litellm incident isn't a wake-up call. It's confirmation that the alarm has been going off for a while and most people had it on snooze.