AI safety

Mythos found 6,202 critical bugs. Now who fixes them?

AI just made finding vulnerabilities cheap. It also revealed how slow fixing them still is.

WireRead Editorial26 May 2026Verified May 2026

The answer

Anthropic says Claude Mythos flagged 6,202 high/critical open-source flaws across 1,000+ projects.

TL;DR — the 20-second read

Claude Mythos flagged 6,202 high/critical vulnerabilities (of 23,019 findings) across 1,000+ open-source projects.
Six independent firms assessed a 1,752-finding sample and judged 90.6% (1,587) valid true positives — strong external corroboration for an AI capability claim.
Confirmed examples include a wolfSSL certificate-forgery flaw (CVE-2026-5194) and a 27-year-old OpenBSD flaw — all since patched.
Roughly 50 vetted partners (AWS, Apple, Google, NVIDIA, JPMorgan, Cloudflare, Mozilla…) saw bug-finding rise more than tenfold.
The catch: maintainers can't patch fast enough — Anthropic's answer is OpenSSF Alpha-Omega support and free tooling, not a cash pledge.
The security constraint has shifted: it is no longer detection, it is everything that has to happen after detection.

Anthropic's first public update on Project Glasswing is the rare AI-capability claim that arrives with independent corroboration — and, just as importantly, with its own caveats stated plainly. Six outside security firms reviewed a sample of the findings; Anthropic published the validation rate alongside the raw count rather than just the headline. That matters, because AI security claims — where self-serving pressure is highest — usually arrive without either. But the most important number here isn't the vulnerability count. It's what the vulnerability count reveals about the machinery downstream.

What Mythos actually found

Scanning more than 1,000 open-source projects, Mythos Preview surfaced 6,202 high- or critical-severity vulnerabilities out of a total of 23,019 findings. Those aren't the same thing: 23,019 is everything the system flagged; 6,202 is the subset reviewers rated as serious — a reminder that raw discovery counts aren't impact counts. The accuracy claim needs equal care. Anthropic did not say 90.6% of all 6,202 are real. It said a sample of 1,752 of those high/critical findings was independently assessed by six security firms, and within that sample 90.6% (1,587) were valid true positives, with 62.4% (1,094) confirmed as high or critical. Read precisely, that is still an unusually clean signal for an automated tool at this scale — but it is a validated sample, not a blanket certificate over the full set, and the honest version of the claim says so.

Concrete examples add texture to the aggregate. The flagship finding was a certificate-forgery flaw in wolfSSL (CVE-2026-5194, CVSS 9.1) — a widely deployed TLS/SSL library embedded in billions of devices. Mythos didn't just flag it; it constructed a working exploit that could forge certificates and impersonate a bank or email provider. It has since been patched. Other examples reach back decades: a 27-year-old TCP-SACK flaw in OpenBSD (a subtle logic bug, not a dusty unsafe-function) that lets an attacker remotely crash a device, and a 16-year-old bug in FFmpeg. These aren't toys. They are the kind of vulnerability that, left unaddressed, ends up in national-security advisories.

Anthropic said the relative ease of finding vulnerabilities compared with the difficulty of fixing them amounts to a major challenge for cybersecurity.

Source: Help Net Security · 26 May 2026

The partner network and what it produced

Project Glasswing operates as a restricted early-access programme — roughly 50 vetted partners were given access to Mythos Preview, all of them defensive security teams. The partner list (per Engadget's reporting on 26 May) includes AWS, Apple, Google, Microsoft, NVIDIA, JPMorganChase, Cloudflare, Mozilla and the Linux Foundation — heavy enough to give the programme's results credibility across multiple threat surfaces. Anthropic reports that Glasswing partners' own bug-finding output rose by more than a factor of ten compared with pre-Mythos baselines, and the partner-level numbers bear it out: Cloudflare reported roughly 2,000 bugs (about 400 high or critical) with a false-positive rate it judged better than its human testers', and Mozilla found 271 vulnerabilities in Firefox 150. It isn't just that Mythos is good at this — it's that existing security teams, given the tool, immediately produced an order of magnitude more throughput. The limiting factor was never human skill at recognising a vulnerability once found. It was the speed of looking.

For context, consider what the programme is not. It is not a public release. It is not a benchmark on a curated dataset. It is a live, operational security scan of real open-source code used by real systems, and the results fed back into real patch queues. That's a different claim to 'our model scores well on CyberSecEval' — and a more credible one.

Metric	Result
Projects scanned	1,000+ open-source
Total findings	23,019
High / critical severity	6,202
Sample independently assessed	1,752 (six firms)
Valid true positives (in sample)	90.6% (1,587)
Confirmed high/critical (in sample)	62.4% (1,094)
Partner bug-finding uplift	>10× reported
Notable confirmed flaw	wolfSSL CVE-2026-5194 (patched)
Maintainer support	OpenSSF Alpha-Omega partnership + free tooling

The asymmetry that is actually the story

Anthropic's public response to this asymmetry is an admission that the discovery engine has outrun the repair pipeline — and, tellingly, it is tooling and labour, not a cheque. The update names a partnership with the OpenSSF Alpha-Omega project (which funds maintenance of the most critical open-source software), a Claude Security public beta (Anthropic says Claude Opus 4.7 was used to patch over 2,100 vulnerabilities in three weeks), free Mythos tooling for qualifying security teams, and a Cyber Verification Program for vetted defenders. There is no headline dollar pledge — a detail worth flagging, because the patching gap is a human-hours problem that credits alone would not fix. But the structural read still holds: Glasswing flagged thousands of critical bugs in a matter of months. It will take far longer to patch them all, and in the meantime they sit in a database Anthropic has not made fully public. The dataset remains aggregate and example-based rather than open — which limits what external researchers can independently reproduce, though the six-firm sample assessment is the clearest external check available.

Mythos has already helped its partners find more than ten thousand vulnerabilities overall just a month after Glasswing's launch … the company said that its partners' rate of bug-finding has increased by more than a factor of ten.

Source: Engadget · 26 May 2026

What to watch next

Three things are worth tracking. First, patch velocity: the interesting question isn't how many flaws Mythos can find but whether the repair rate improves proportionally. Anthropic's own dashboard already tells the story — high/critical findings average around two weeks to patch, some maintainers have asked Anthropic to slow disclosure, and only a fraction of reported bugs have shipped fixes. If the OpenSSF support and tooling don't accelerate patching, the programme widens the window of known-but-unpatched exposure rather than closing it. Second, offensive capability: the same model characteristics that make Mythos good at finding exploitable conditions make it valuable for building exploits — and at least one security firm says it reproduced comparable findings with public models, so the capability may not stay contained for long. Anthropic's restriction to vetted defenders is the only current safeguard, and it is a policy rather than a technical one. Third, the dataset question: a full public release of the finding set — with partner sign-off and responsible-disclosure timelines — would turn this from a credible vendor claim into a verifiable scientific result. The six-firm sample assessment is a strong proxy; it is not a substitute for reproducibility.

Frequently asked questions

Is Claude Mythos available to use?

No — Mythos Preview is restricted to roughly 50 vetted Project Glasswing partners for defensive cybersecurity work. Anthropic says it intends to make Mythos-class models more widely available only after stronger safeguards are in place, given the offensive risk of the same capability.

Are the vulnerability numbers trustworthy?

Stronger than most vendor claims, but read them precisely. Anthropic published 6,202 high/critical findings, then had a sample of 1,752 independently assessed by six security firms — 90.6% (1,587) were valid true positives. That's a validated sample, not a blanket certificate over all 6,202, and the dataset isn't fully open. Best read as early operational evidence of a capability shift, not a completed public benchmark.

What is the wolfSSL flaw and has it been fixed?

wolfSSL is a TLS/SSL library deployed on billions of devices. Anthropic's update cited a certificate-forgery vulnerability, CVE-2026-5194 (CVSS 9.1), where Mythos built a working exploit to forge certificates and impersonate a bank or email provider; it has since been patched (wolfSSL 5.9.1) following responsible disclosure.

Why does Anthropic think patching is the new bottleneck?

Many critical open-source projects are maintained by small teams of volunteers. Mythos can flag thousands of real vulnerabilities in days; the same human teams that previously struggled to find those bugs now face a patch queue they cannot clear at the same speed — high/critical findings average about two weeks to fix, and some maintainers have asked Anthropic to slow disclosure (Help Net Security, 26 May 2026).

What is OpenSSF Alpha-Omega, and is there a cash pledge?

Alpha-Omega is an Open Source Security Foundation initiative that funds security improvements in the most critical open-source projects; Anthropic partnered with it to help maintainers triage the patch volume Glasswing generates. There is no dollar pledge in the update — the support is the partnership plus free Mythos tooling, a Claude Security beta and the Cyber Verification Program.

Sources

Project Glasswing: An initial update — Anthropic, 26 May 2026
Anthropic says Mythos has already found more than 10,000 vulnerabilities — Engadget, 26 May 2026
Anthropic: Claude Mythos identified 10,000+ software flaws — Help Net Security, 26 May 2026
Anthropic's Mythos finds 10,000 critical software flaws — Techzine, 26 May 2026

← All news

What Mythos actually found

Anthropic said the relative ease of finding vulnerabilities compared with the difficulty of fixing them amounts to a major challenge for cybersecurity.

Source: Help Net Security · 26 May 2026

The partner network and what it produced

Metric	Result
Projects scanned	1,000+ open-source
Total findings	23,019
High / critical severity	6,202
Sample independently assessed	1,752 (six firms)
Valid true positives (in sample)	90.6% (1,587)
Confirmed high/critical (in sample)	62.4% (1,094)
Partner bug-finding uplift	>10× reported
Notable confirmed flaw	wolfSSL CVE-2026-5194 (patched)
Maintainer support	OpenSSF Alpha-Omega partnership + free tooling

The asymmetry that is actually the story

Source: Engadget · 26 May 2026

What to watch next

Frequently asked questions

Is Claude Mythos available to use?

Are the vulnerability numbers trustworthy?

What is the wolfSSL flaw and has it been fixed?

Why does Anthropic think patching is the new bottleneck?

What is OpenSSF Alpha-Omega, and is there a cash pledge?

Mythos found 6,202 critical bugs. Now who fixes them?

What Mythos actually found

The partner network and what it produced

The asymmetry that is actually the story

What to watch next

Frequently asked questions

Sources

Related

Anthropic wants the world to be able to hit pause

AI as scientific instrument: what OpenAI's June wave actually demonstrates

Three days after launch, Washington pulls Fable 5 — and the precedent is the story

Mythos found 6,202 critical bugs. Now who fixes them?

What Mythos actually found

The partner network and what it produced

The asymmetry that is actually the story

What to watch next

Frequently asked questions

Sources

Related

Anthropic wants the world to be able to hit pause

AI as scientific instrument: what OpenAI's June wave actually demonstrates

Three days after launch, Washington pulls Fable 5 — and the precedent is the story