One AI model found 10,000 critical vulnerabilities in a month. Another wrote a working exploit overnight — with no human in the loop. Here's exactly how these attacks work, and what actually stops them.
By Leigh Bruce, Marketing Director, Ridge IT Cyber · 12 min read
AI has made four types of attack dramatically cheaper and faster: AI-generated phishing that personalizes at scale, autonomous zero-day vulnerability discovery (Claude Mythos found 10,000+ high- and critical-severity vulnerabilities in one month[2]), automated exploit writing without human guidance, and deepfake-powered social engineering. Traditional defenses (signature antivirus, monthly patch cycles, perimeter firewalls) were built for a world where attackers needed time and expertise. AI removes both. The defenses that still work are behavioral detection, Zero Trust segmentation, and 24/7 triage-level SOC monitoring. Here is exactly how each attack type works and what stops it.
There's a version of this story where AI is a distant, theoretical threat. That version ended in April 2026, when Anthropic published the Frontier Red Team report on Claude Mythos.
The report documented something that should land harder than it has: a single AI model, in one month, autonomously discovered over 10,000 high- and critical-severity vulnerabilities, including a 27-year-old flaw in OpenBSD that no human researcher had ever found.[1] It then wrote a working exploit for that vulnerability. Overnight. Without a human writing a line of code.
That's not a test-environment curiosity. That's a preview of what commodity AI tooling will enable for anyone with a target and an API key.
"Traditional attackers needed time, expertise, and iteration. AI removes all three constraints simultaneously. What used to take a dedicated team weeks now takes an autonomous tool hours — and the attacker doesn't need to understand how the exploit works."
The mechanics of traditional attacks were shaped by human limits: it takes time to research a target, find a vulnerability, source or write an exploit, test it, and execute. AI compresses every phase of that chain. The attack surface your organization presents hasn't changed. The speed and scale at which attackers can probe it has changed completely.
Four categories of attack are most affected. Here's how each one works.
Old phishing was a volume game with a quality problem. You could tell a phishing email from a real one because it was generic, often grammatically broken, obviously templated. Security awareness training worked reasonably well because the tells were visible.
AI phishing inverts that. Large language models can now generate personalized, contextually accurate lure content at scale: emails that reference your actual vendor relationships, mimic writing styles scraped from public sources, and adapt in real time based on your responses. Threat actors are already using AI agents to automate social engineering at this level, according to Trend Micro's 2026 threat intelligence.[6]
What it looks like in practice: Your CFO gets an email that appears to be from your payroll provider. It references an actual service you use, uses the provider's actual formatting, and asks for a credential update due to "a recent security review." The email is grammatically perfect. The sender display name is correct. The urgency is plausible.
The only reliable detection signals left are sender domain anomalies, header analysis, and — critically — a culture where any unexpected credential request triggers a human verification step rather than an immediate click.
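To make the header signals concrete, here is a minimal Python sketch of two checks: a display name that claims a known brand but arrives from a domain that brand doesn't use, and a Reply-To that points somewhere other than the visible sender. The `KNOWN_SENDER_DOMAINS` allowlist, the `acmepayroll.example` domain, and the function names are all hypothetical, chosen for illustration. A real email gateway layers far more on top (SPF, DKIM, DMARC results, lookalike-domain scoring), so treat this as a sketch of the idea rather than a working filter.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical allowlist: brand display name -> domains that brand really sends from.
KNOWN_SENDER_DOMAINS = {
    "acme payroll": {"acmepayroll.example"},
}

def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address ('' if none)."""
    return address.rpartition("@")[2].lower() if "@" in address else ""

def header_anomalies(raw_message: str) -> list[str]:
    """Return human-readable anomalies found in the From / Reply-To headers."""
    msg = message_from_string(raw_message)
    display_name, from_addr = parseaddr(msg.get("From", ""))
    _, reply_addr = parseaddr(msg.get("Reply-To", ""))
    from_domain = domain_of(from_addr)
    findings = []

    # The display name claims a known brand, but the domain is not one of that brand's.
    expected = KNOWN_SENDER_DOMAINS.get(display_name.strip().lower())
    if expected is not None and from_domain not in expected:
        findings.append(
            f"display name '{display_name}' sent from unexpected domain '{from_domain}'"
        )

    # Replies would be routed to a different domain than the visible sender.
    if reply_addr and domain_of(reply_addr) != from_domain:
        findings.append(
            f"Reply-To domain '{domain_of(reply_addr)}' differs from From domain '{from_domain}'"
        )

    return findings
```

A message that passes both checks can still be malicious, which is why the out-of-band verification habit matters more than any single header rule.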
What stops it is a stack, not a single product: email gateway filtering with behavioral analysis; security awareness training focused on process (verify out-of-band) rather than content quality (look for typos); MFA on every account, so credential theft alone isn't enough; and identity hardening via Okta or Microsoft Entra Conditional Access, so even a successful phish doesn't hand over the keys.
Zero-day vulnerabilities — flaws that exist in software but haven't been publicly discovered or patched — were historically rare and expensive. Finding one required significant researcher expertise and time. That scarcity kept them mostly in the hands of nation-state threat actors and sophisticated criminal organizations.
Claude Mythos collapsed that model.
In Anthropic's own Frontier Red Team testing, Mythos discovered a 27-year-old bug in OpenBSD, a vulnerability that had been present in widely used software for nearly three decades without any human researcher finding it.[1] More significant than the single finding is the scale. Mythos identified 10,000+ high and critical zero-days in one month.[2] An independent firm reviewed 23,019 candidate findings and confirmed a 90.8% true-positive rate, which the firm's review puts above the accuracy of most human pentesters.[3]
Anthropic explicitly noted that these capabilities were not intentionally trained into the model. They emerged. That's the part of this story that matters most for your security posture: this wasn't a purpose-built offensive tool. It was a general-purpose AI system that turned out to be capable of this at scale.
The patch bottleneck makes this worse. Over 99% of the vulnerabilities Mythos identified were unpatched at the time of announcement.[4] Open-source maintainers, overwhelmed by the volume, publicly requested that AI security research slow down. The finding rate outpaced the human capacity to respond.
You can't patch what isn't published yet. What you can do: deploy behavioral endpoint detection that catches anomalous process behavior regardless of whether a signature exists. Run continuous exposure validation (Qualys, for example) so you know your attack surface before an AI tool maps it for an attacker. And maintain 24/7 SOC monitoring that catches the behavioral precursors to exploitation — the reconnaissance, the lateral probing, the privilege escalation attempts — before a critical alert fires.
Finding a vulnerability and turning it into a working exploit are two different problems. Historically, the second was harder. You could find a bug in a piece of software, but translating that into code that reliably produces remote code execution — especially across different system configurations — required real expertise and iteration.
Mythos demonstrated autonomous exploit development at a level that closed that gap. The Anthropic Frontier Red Team documented overnight RCE discovery with no human involvement.[1] More technically: Mythos built a four-vulnerability exploit chain for a browser target that included JIT heap spray and broke both the renderer and the OS sandbox. That's not script-kiddie territory. That's the kind of multi-stage exploit chain that takes experienced researchers significant time to construct — done autonomously, overnight.
CVE-2026-5194 offers a concrete example: a wolfSSL certificate forgery vulnerability with a CVSS score of 9.3, discovered and documented with an exploit by AI-assisted analysis, with no formal training on the specific target.[3]
The implication for defenders: the window between vulnerability existence and working exploit has compressed dramatically. The old assumption — "we have time to patch before anyone can exploit this" — doesn't hold when AI can close that gap in hours.
Behavioral detection is the only category of defense that works against exploits that have no existing signature. CrowdStrike Falcon's behavioral engine, for example, flags anomalous process behavior — a renderer doing something it should never do, a process spawning unexpected child processes — regardless of whether the specific technique has ever been catalogued. Pair that with Zero Trust network segmentation, which limits what an attacker can reach even after successful exploitation. See how Ridge IT's SASE architecture enforces Zero Trust for distributed environments →
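The "process spawning unexpected child processes" idea can be sketched in a few lines of Python. This is not how CrowdStrike Falcon or any commercial engine is implemented. It is an illustrative rule under stated assumptions: the `ProcessEvent` shape, the process names, and the `EXPECTED_CHILDREN` baseline are all hypothetical. The point is that the rule never asks which exploit was used, only whether the behavior fits what that process normally does.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProcessEvent:
    parent: str  # image name of the parent process
    child: str   # image name of the process it spawned

# Hypothetical baseline: the children each monitored parent is expected to spawn.
EXPECTED_CHILDREN = {
    "browser.exe": {"renderer.exe", "gpu-process.exe"},
    "renderer.exe": set(),  # a sandboxed renderer should spawn nothing
    "winword.exe": set(),   # a document editor should not launch shells
}

def unexpected_spawns(events):
    """Return events whose parent has a baseline and whose child falls outside it."""
    flagged = []
    for event in events:
        allowed = EXPECTED_CHILDREN.get(event.parent.lower())
        # Parents without a baseline are ignored here; a real system would learn one.
        if allowed is not None and event.child.lower() not in allowed:
            flagged.append(event)
    return flagged

events = [
    ProcessEvent("browser.exe", "renderer.exe"),   # normal
    ProcessEvent("renderer.exe", "cmd.exe"),       # renderer launching a shell
    ProcessEvent("explorer.exe", "notepad.exe"),   # no baseline, not evaluated
]
for hit in unexpected_spawns(events):
    print(f"ALERT: {hit.parent} spawned {hit.child}")
```

Because the rule keys on behavior, a brand-new exploit chain that ends with a renderer launching a shell trips it just as an old one would.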
The fourth attack category doesn't require a single line of exploit code. It works by impersonating people your organization trusts.
Deepfake audio and video, driven by generative AI, have reached a quality threshold where real-time impersonation in video calls is feasible with off-the-shelf tooling. Business email compromise — already the highest-volume financial fraud vector — gets dramatically more effective when attackers can clone a voice in seconds and generate an audio message that sounds exactly like your CEO.
The attack pattern: a finance employee receives a message — email, then a follow-up voicemail, then a video call confirmation — that appears to be from a senior executive, requesting an urgent wire transfer or credential handoff. Each touchpoint feels independently verifiable. Together, they create a convincing pressure chain.
Threat actors are already deploying AI agents to automate this social engineering at scale.[6] The economics have flipped: it now costs roughly the same to run 500 targeted deepfake social engineering attempts as it does to run one.
Process beats technology here. Establish out-of-band verification protocols for any financial transaction or credential request — regardless of how convincing the inbound request appears. Identity hardening via MFA means even a successful impersonation can't move money or access systems without a second factor that the attacker doesn't control. And build a culture where "the CEO asked me to do this urgently" is a trigger for verification, not execution.
Legacy defenses aren't failing because they're bad products. They're failing because they were designed around assumptions that AI has invalidated.
Signature antivirus was built for a world where malware had to be catalogued before it could be stopped. It works by matching files or behaviors against a library of known threats. AI-generated exploits are novel by design — they create techniques with no existing signature, targeting vulnerabilities no one has published yet. You cannot signature-match something that doesn't exist in your library.
Monthly patch cycles were designed for a world where the window between vulnerability announcement and exploitation was measured in weeks. When AI can compress both vulnerability discovery and exploit writing to a single night, monthly patching is structurally too slow. A flaw that surfaces just after a patch run can sit exploitable for most of a month before the next run closes it.
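A back-of-the-envelope calculation shows why the cadence matters. The Python sketch below assumes a flaw becomes known at a uniformly random point in the patch cycle and that a working exploit appears a fixed number of days later. Both are simplifying assumptions made for illustration, not figures from the Mythos report. Under them, the average time a flaw is both exploitable and unpatched is (T - E)² / 2T for a patch interval T and an exploit lead time E.

```python
def mean_exposure_days(patch_interval_days: float, exploit_ready_days: float) -> float:
    """Average days a flaw is exploitable but still unpatched.

    Assumes the flaw becomes known at a uniformly random point in the patch
    cycle, so the wait W until the next patch run is uniform on [0, T].
    Exposure is max(0, W - E); its expected value is (T - E)^2 / (2T).
    """
    t, e = patch_interval_days, exploit_ready_days
    if e >= t:
        return 0.0  # the patch always lands before the exploit exists
    return (t - e) ** 2 / (2 * t)

# Monthly patching, exploit takes three weeks (the "old" assumption).
print(round(mean_exposure_days(30, 21), 2))
# Monthly patching, exploit ready in half a day (the AI-speed assumption).
print(round(mean_exposure_days(30, 0.5), 2))
# Weekly patching against the same half-day exploit.
print(round(mean_exposure_days(7, 0.5), 2))
```

With these illustrative numbers, a monthly cycle goes from a little over one day of average exposure to roughly two weeks once exploit development drops to hours, and a weekly cycle brings it back down to about three days.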
Perimeter firewalls assume a meaningful boundary between inside and outside. That assumption was already strained by cloud and remote work. AI-speed attacks that use legitimate credentials or exploit trusted internal processes don't trip perimeter controls. Once an attacker has initial access — via phishing, a zero-day, or a deepfake social engineering play — perimeter tools are watching the wrong border.
The UK AI Security Institute's analysis of Mythos testing did find one piece of good news: well-hardened environments were consistently harder to breach.[8] The fundamentals still work. The problem is that most SMB environments aren't hardened. They're running legacy AV, patching monthly, and relying on perimeter controls that assume an attacker is still outside.
| Defense Layer | Works Against AI? | Why |
|---|---|---|
| Behavioral endpoint detection (CrowdStrike Falcon) | Works | Detects anomalous process behavior regardless of exploit signature — the only category that catches novel techniques |
| Zero Trust network segmentation (Zscaler ZIA/ZPA) | Works | Limits lateral movement after initial access: even AI-speed attacks can't reach systems that segmentation has walled off |
| MFA + identity hardening (Okta / Microsoft Entra) | Works | Makes credential escalation expensive — AI phishing success doesn't hand over access without the second factor |
| Full-triage 24/7 SOC (every alert, not just criticals) | Works | AI attacks generate medium-severity precursors before a critical fires — full triage catches the chain before damage is done |
| Continuous exposure validation (Qualys) | Works | Know your attack surface before the attacker's AI maps it — prioritize the exposure that matters |
| Signature antivirus only | Fails | Can't match signatures for AI-generated, novel exploit code — fundamental architectural mismatch |
| Monthly patch cycles | Fails | Exposure window compressed to hours by AI, while a monthly cadence can leave a known flaw open for weeks |
| Perimeter firewall alone | Fails | Assumes the attacker is outside — AI attacks using valid credentials or living-off-the-land techniques are already inside |
| Security awareness training | Partial | Useful if focused on process verification (not content quality) — AI phishing is too convincing for "look for typos" to work |
No single control is sufficient. The combination that works: behavioral endpoint detection + Zero Trust segmentation + full-triage SOC + identity hardening. The UK AI Security Institute confirmed that well-hardened environments consistently resisted Mythos-level attacks.[8]
Inc. 5000 #1 MSSP. CrowdStrike, Zscaler, and Okta under one roof. Full-triage SOC on every alert, 24/7.
Traditional attacks required human skill, time, and iteration. An attacker had to manually research a target, find a vulnerability, write or source an exploit, and execute, a process that took days to weeks. AI compresses every phase of that chain simultaneously. The Anthropic Frontier Red Team documented Claude Mythos discovering a 27-year-old OpenBSD vulnerability and writing a working exploit overnight, with no formal training on the target.[1] That's not a story about one AI model. It's a preview of what commodity AI tools will soon enable for anyone with a target and an API key. See our full breakdown: What Claude Mythos Means for Your Business Security.
Yes — and the Anthropic Frontier Red Team documented it directly. Claude Mythos achieved autonomous overnight RCE discovery with no human guidance and built a four-vulnerability browser exploit chain including JIT heap spray and sandbox escapes.[1] These capabilities weren't explicitly trained into the model — they emerged. Trend Micro's 2026 threat intelligence also confirms that threat actors are already using AI agents to automate vulnerability discovery and exploit generation at operational scale.[6] The "AI can't really hack" assumption is no longer defensible. See how Ridge IT's managed security is built for this threat environment →
AI-assisted phishing uses large language models to generate personalized, contextually accurate lure emails at volume. Unlike old phishing — generic, often grammatically broken, obviously templated — AI phishing reads like it came from someone who knows you. It references your actual vendor relationships, mimics writing styles scraped from public sources, and adapts based on your responses. The tell is increasingly context rather than content quality. Be suspicious of any unexpected request to change credentials, transfer funds, or share access — even if the email looks perfect. Sender domain spoofing and subtle header anomalies remain the most reliable technical detection signals. Process is your best defense: verify any sensitive request out-of-band before acting. Ridge IT's managed security includes email gateway analysis and identity hardening →
Signature AV requires a known signature to match. It works by comparing files or behaviors against a library of previously documented malware. AI-generated exploits are novel by design — they create code with no existing signature, exploit vulnerabilities no one has patched yet (10,000+ high-critical zero-days were identified by Mythos in one month[2]), and adapt techniques on demand. You cannot match a signature for something that has never been seen before. Behavioral detection — watching what a process does, not what it is — is the only endpoint defense category that works against this class of attack. See how CrowdStrike Falcon's behavioral engine works with Ridge IT →
Behavioral anomaly detection at the endpoint is the fastest technical signal. But the fastest organizational signal is full-triage SOC monitoring on every alert — including medium-severity indicators. AI-speed attacks don't announce themselves with a critical alert first. They generate a chain of medium-severity precursors: anomalous PowerShell execution, unusual network scanning from a user account, a process calling out to an unexpected IP. By the time a critical alert fires, lateral movement may already be complete. The teams that catch AI-speed attacks early treat medium-severity alerts as seriously as critical ones. Ridge IT's SOC runs full triage — PowerShell inspection, persistence checks, C2 analysis — on every alert, 24/7. See the full AI defense playbook →
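The precursor-chain idea above can be sketched as a simple correlation rule in Python: escalate a host when several different medium-severity alert types land on it inside a short window, even though no single alert is critical. The alert tuple layout, the alert type names, the one-hour window, and the threshold of three are all assumptions for illustration, not Ridge IT's actual triage logic.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def hosts_with_precursor_chain(alerts, window=timedelta(hours=1), min_distinct=3):
    """Find hosts showing a chain of distinct medium-severity alerts.

    alerts: iterable of (timestamp, host, severity, alert_type) tuples.
    Returns the set of hosts where at least `min_distinct` different
    medium-severity alert types occurred within `window` of each other.
    """
    by_host = defaultdict(list)
    for ts, host, severity, alert_type in alerts:
        if severity == "medium":
            by_host[host].append((ts, alert_type))

    escalate = set()
    for host, items in by_host.items():
        items.sort()  # chronological order
        for i, (start, _) in enumerate(items):
            # Distinct alert types seen from this alert up to `window` later.
            types = {kind for ts, kind in items[i:] if ts - start <= window}
            if len(types) >= min_distinct:
                escalate.add(host)
                break
    return escalate

t0 = datetime(2026, 1, 1, 9, 0)
alerts = [
    (t0, "ws-01", "medium", "anomalous_powershell"),
    (t0 + timedelta(minutes=10), "ws-01", "medium", "network_scan_from_user"),
    (t0 + timedelta(minutes=25), "ws-01", "medium", "unexpected_outbound_ip"),
    (t0, "ws-02", "medium", "anomalous_powershell"),
    (t0 + timedelta(hours=5), "ws-02", "medium", "network_scan_from_user"),
]
print(hosts_with_precursor_chain(alerts))
```

In this sample only `ws-01` is escalated: three different medium alerts in 25 minutes form a chain, while the two alerts on `ws-02` are five hours apart. A team that only looks at critical alerts would see neither host.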
Perry has spent 20+ years in the security trenches — reverse-engineering attacks, building SOCs, and managing security for organizations ranging from 50-person law firms to global hotel chains. As CSO at Ridge IT Cyber, he oversees the security strategy for 700+ organizations across six continents. He's been watching AI-assisted attacks develop for two years. This post is what he'd tell you over coffee.