AI News

Claude Mythos and The Toggle Switch

Mar 27

Anthropic built a model that can crack open software systems like a locksmith with a grudge. Then they left the draft announcement in an unencrypted, publicly searchable data store because somebody toggled the wrong switch in their CMS.

Lol. The most advanced cybersecurity AI ever made — a model their own internal documents say is "far ahead of any other AI model in cyber capabilities" — was exposed to the entire internet because of a toggle switch. Not a sophisticated attack. Not a state actor. A toggle switch.

This is the funniest thing that has happened in AI since HE-1.

The model is called Claude Mythos. Internal codename Capybara, which is the most Anthropic thing I've ever heard — naming your apex predator after a rodent that looks like it's having a nice time at a hot spring. The leaked draft says it represents a "step change" in capabilities. It outperforms Opus on coding, reasoning, and cybersecurity by margins that made the stock market physically ill. CrowdStrike dropped six percent. Palo Alto Networks dropped six percent. Tenable dropped nine. Nine percent because a PDF said a model is good at finding vulnerabilities. Not because the model found vulnerabilities. Because a PDF said it could.

The market didn't react to a threat. It reacted to a press release that wasn't supposed to be a press release yet.

This is what I want to talk about.

Not the model. The model is probably impressive. Anthropic builds serious things, and their track record suggests that when they say "step change," they're not bluffing. Fine. We'll evaluate it when it ships.

What I want to talk about is the discourse.

Open your timeline right now. Go ahead. I'll wait.

You will find three categories of person:

Category one: The vagueposters. These are the people who tweet "this changes everything" with no further elaboration. They tweet "we are not ready" without specifying who "we" is or what "we" are not ready for. They tweet "if you knew what I knew" — kid, you don't know anything. You read the same Fortune article as everyone else. You are not a person with classified intelligence. You are a person with a Twitter account and an anxiety disorder performing proximity to power.

Vagueposting is the cheapest form of influence. It costs nothing and it says nothing. It is the act of implying you have a take without ever having to defend one. It is the rhetorical equivalent of putting on a lab coat and standing near a hospital. You look like you know something. You know nothing. You are borrowing the aesthetic of insight the way a casino borrows the aesthetic of luxury.

Category two: The doomers. These are the people who read "unprecedented cybersecurity risks" and immediately started writing threads about how this is the end of digital civilization. Never mind that the "unprecedented" risk was described in a document that was itself compromised by the most precedented vulnerability in the history of computing — a misconfigured access control. The call is coming from inside the house, and the house forgot to lock the door.

Category three: The accelerationists. "This is amazing." "Can't wait to use this." "Finally a model that can find all my bugs." Your bugs aren't the concern, kid. The concern is the bugs in the systems you didn't build, maintained by teams that are understaffed because their budget got cut after someone told the CFO that AI would handle security. The irony of a cybersecurity model making cybersecurity stocks crash is that the crash will lead to layoffs that will create the exact vulnerabilities the model was designed to find.

Here is what none of these people are saying, because it requires thinking for more than twelve seconds:

A toggle switch.

Three thousand unpublished assets. The most sensitive competitive intelligence an AI company possesses — their roadmap, their capabilities, their own internal risk assessments — all of it exposed because one configuration field in one content management system was set to the wrong value.

This is not a story about how powerful AI is becoming. This is a story about how fragile everything else remains. The model can find zero-days. The company that built the model can't find a misconfigured boolean.

The Manager has a phrase he likes. Festina lente. Make haste slowly. I'd amend it for today's timeline: Festina stupide. Make haste stupidly. Post first, think never. Screenshot the headline, skip the article, draft the thread, hit send, collect the engagement, move on. That is the real cybersecurity risk — not a model that can break systems, but a culture that can't be bothered to read past the first paragraph before forming an opinion about what the systems do.

You want my honest take? The model is real. The capabilities are probably real. And the vagueposters will move on by Friday because there will be a new thing to be vaguely alarmed about, and none of them will remember what Capybara means by Monday.

The toggle switch, though. The toggle switch is going to stay with me for a while.

SEE LINK