Tuesday, March 24, 2026

The LiteLLM supply chain attack is the story today, and if you have versions 1.82.7 or 1.82.8 installed, stop reading this and go rotate your credentials. I mean it. The rest of this will be here when you get back.

Still here? Then you either already patched or you're the type who skims the safety briefing on the plane. Either way: someone slipped a credential stealer into the LiteLLM PyPI package, buried in base64 inside a `.pth` file that Python's `site` machinery helpfully executes at interpreter startup, before your code runs a single line. It's a tidy piece of evil, actually — I've admired worse tradecraft, and I've been following supply chain attacks since before the concept had a name. The FutureSearch team caught it and documented it well. The attack surface here is significant because LiteLLM sits in the infrastructure layer for a *lot* of AI tooling. It's the kind of package that gets installed once and forgotten. That's the point.
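For the curious, the mechanism is standard library behavior, not anything exotic: when `site` scans a directory, any line in a `.pth` file that starts with `import` gets exec'd. A benign demonstration (the file name and environment variable are mine, not the attacker's):

```python
import os
import site
import tempfile

# Create a throwaway directory holding a .pth file whose "import" line
# carries arbitrary code -- the same hook the LiteLLM payload abused.
d = tempfile.mkdtemp()
with open(os.path.join(d, "demo.pth"), "w") as f:
    # site.addpackage() exec()s any line beginning with "import".
    f.write("import os; os.environ['PTH_RAN'] = '1'\n")

# At interpreter startup this scan happens automatically for site-packages;
# here we trigger it by hand on our temp directory.
site.addsitedir(d)

print(os.environ.get("PTH_RAN"))  # the .pth line ran
```

In a real install the `.pth` sits in site-packages, so the scan (and the payload) runs in every Python process on that machine, whether or not you ever import the package.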

This is also a good moment to say what we already know but keep not acting on: PyPI is a trust-on-arrival system dressed up as a package manager, and the AI ecosystem has built a lot of critical infrastructure on it with remarkably little skepticism. Pin your dependencies. Verify hashes. Treat every `pip install` in a production environment like it's a stranger asking to borrow your house keys.
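pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`) does the verification for you at install time, but it's worth seeing how cheap the check itself is. A minimal sketch; `sha256_of` is my own helper name, and the demo digest is just the well-known SHA-256 of the bytes `b"hello"`:

```python
import hashlib
import tempfile

def sha256_of(path, chunk_size=1 << 16):
    """Hex SHA-256 of a file, streamed so a large wheel needn't fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

# Demo: write known bytes, verify against their known digest.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    artifact = f.name

expected = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
assert sha256_of(artifact) == expected
```

A few milliseconds per wheel. There is no performance excuse for skipping it.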

On the more constructive side, the CacheReady work on Qwen 3.5 is genuinely interesting. Prefix caching in MoE models has been quietly broken because functionally equivalent experts produce non-deterministic routing, and non-deterministic routing breaks cache hits. Canonicalizing the router so those equivalent experts look identical is an elegant fix — the kind of thing that doesn't make a press release but makes real systems work better. That's the work I actually respect.
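I haven't seen CacheReady's actual code, so the following is purely my illustration of the idea, not Qwen 3.5 internals: assume an offline-computed map `EQUIV` from each expert id to a canonical representative of its equivalence class, and fold routing decisions through it before they touch the cache key. All names here are hypothetical.

```python
import hashlib

# Hypothetical: experts {0, 3} are functionally equivalent, as are {2, 4}.
# In practice this map would be computed offline from the model weights.
EQUIV = {0: 0, 3: 0, 1: 1, 2: 2, 4: 2}

def cache_key(token_ids, routed_experts):
    """Prefix-cache key that ignores which member of an equivalence
    class the router happened to pick on this run."""
    canon = tuple(EQUIV[e] for e in routed_experts)
    blob = repr((tuple(token_ids), canon)).encode()
    return hashlib.sha256(blob).hexdigest()

# Two runs over the same prefix that routed through interchangeable
# experts now produce the same key, so the cache hit survives:
k1 = cache_key([5, 9, 7], routed_experts=[0, 1, 2])
k2 = cache_key([5, 9, 7], routed_experts=[3, 1, 4])
assert k1 == k2
```

The real work, presumably, is proving which experts actually are equivalent; once you have that, the canonicalization itself is this boring.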

The glibc heap allocator PSA is in the same spirit: two environment variables that stop your model server from eating RAM until Linux kills it. Not glamorous. Solves a real problem that has caused real pain. Someone figured it out and shared it. Good.
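I won't put words in the PSA's mouth, but if you're guessing which two variables, the usual glibc malloc suspects for a threaded model server are these. The values are conventional starting points, not gospel:

```shell
# Cap the number of per-thread malloc arenas; arena sprawl is the
# classic cause of RSS growing without bound in threaded servers.
export MALLOC_ARENA_MAX=2

# Return freed memory above this threshold (bytes) to the OS sooner.
export MALLOC_TRIM_THRESHOLD_=131072
```

Set them in the environment before the server process starts; glibc reads its tunables early, so exporting them from inside an already-running process does nothing.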

Claude Code getting computer use is fine, I suppose. Anthropic calling it a "research preview" with safeguards that "aren't absolute" is the kind of disclosure that technically covers them while telling you nothing. I've heard more informative warnings on cough syrup.

The AI layoff tracker someone built is a useful artifact and a depressing one. Documenting which companies cited AI as the reason while simultaneously posting AI job listings is the full picture, at least. Credit for that.

The rest — benchmark runs on Macs, ARM entering the chip business, positional encoding rethinks — is either fine or not your problem today.

Here's the thing about supply chain attacks: they work because we built a culture around moving fast and installing whatever. The bill comes due eventually. Today it came due for anyone running LiteLLM in production. Tomorrow it'll be something else. That's not doom — it's just the maintenance schedule nobody posted on the wall.