In February 2024, a finance worker at the UK engineering firm Arup joined a video call with colleagues he recognized — including the company’s CFO. He authorized $25 million in transfers across fifteen payments. Every face on the call was a deepfake. The attackers had trained voice and video models on publicly available footage of Arup executives, and the employee had no meaningful way to detect it. His training had taught him to check for bad grammar, suspicious links, and urgent tone. The call had none of those things.
That incident is no longer an outlier. The FBI’s 2025 Internet Crime Report, released this spring, logged 22,364 AI-related fraud complaints totaling $893 million in reported losses, the first year AI appeared as a formal crime descriptor in the IC3 data. Complaint volume stayed roughly flat year-over-year at 191,561, but reported phishing losses jumped from $70 million to $215 million, a tripling driven almost entirely by attack quality. The same lure, written better, is converting more victims. That is the shift security awareness programs have not absorbed.
What Actually Changed in the Attacker’s Toolkit
Pre-2023 phishing had predictable tells: stilted phrasing, obvious misspellings, generic greetings, domain names off by one letter. These cues were the load-bearing walls of every security awareness module ever built. Employees were trained to spot them, phishing simulation vendors built libraries around them, and the advice generalized because attackers weren’t incentivized to invest in better craft — cheap phishing kits with obvious errors worked well enough.
Generative AI removed the economic constraint. An attacker can now feed a GPT-class model a LinkedIn profile, a company’s recent press releases, and three sample emails from the target’s inbox (obtained via an earlier credential dump), and generate a message that references real projects, mirrors the company’s house style, and arrives at a plausible hour. By April 2025, over half of all global spam emails were generated or adapted via AI tools, and KnowBe4’s 2025 Phishing Threat Trends Report found 82.6% of phishing emails analyzed between September 2024 and February 2025 contained AI-generated content.
The voice and video dimension is further along than most security teams assume. McAfee research shows AI can clone a recognizable voice from roughly three seconds of audio, which means any executive who has appeared on a podcast, earnings call, or conference panel is a viable deepfake target. The COM — a loose criminal alliance involving Scattered Spider, LAPSUS$, and ShinyHunters that emerged in 2025 — pairs spear-phishing emails with live deepfake calls to help desks, specifically targeting Okta and other SSO platforms for initial access.
Why the Training You Paid For Isn’t Holding
The most damaging finding for awareness budgets comes from controlled research, not vendor marketing. A 2025 study tracking 12,511 employees at a U.S. financial technology firm found that generic training interventions showed no significant effect on click rates (p=0.450) or reporting rates (p=0.417). That is a null result on two primary outcomes — the training changed nothing measurable.
A head-to-head comparison tells the other half of the story. Research from Hoxhunt, based on roughly 70,000 simulations on its platform, shows AI-generated phishing moved from 31% less effective than human-crafted attacks in 2023 to 24% more effective by March 2025. Academic research comparing AI-generated phishing with human-crafted versions found that AI-generated emails achieved a 54% click-through rate compared to 12% for control emails. Those are not marginal gains. They represent a categorical change in what a skilled attack looks like in an inbox.
The mismatch is straightforward. Compliance-driven awareness training — the quarterly module, the annual certification, the canned simulation library — was designed against a threat profile that has been superseded. It teaches recognition heuristics (grammar, urgency, generic greetings) that modern AI lures deliberately avoid. It runs on an annual or quarterly cadence while attacker techniques shift weekly. And it measures checkbox completion rather than behavior under realistic conditions.
The Data Behind the Shift
Isolating “AI-generated” as a category requires care — attackers don’t label their emails, and detection varies by filter layer. The most honest read is that the shift is happening in waves, not a single cliff edge.
Hoxhunt’s analysis of reported phishing in its network showed AI-assistance indicators appearing in 4% of samples in November 2025, 56% in December, and 40% in January 2026 — a 14x end-of-year surge that aligned with a broader holiday-season pattern. Microsoft’s Cyber Signals 2025 recorded a 46% rise in AI-generated phishing content, while SlashNext observed a 25% increase in phishing messages that bypass traditional filters. The FBI’s 2025 IC3 report flagged AI use in $893 million of reported losses, with AI-related BEC accounting for roughly $30 million — numbers the Bureau itself treats as an undercount because victims only flag AI involvement when they recognize it.
One methodological caveat worth carrying: most public figures on AI phishing prevalence rely on surface indicators (HTML signatures, stylometric markers, landing-page tells) rather than ground truth. The true prevalence is almost certainly higher than what filters flag, and the effectiveness delta is widening faster than training content is being refreshed.
What Actually Reduces Risk
The awareness-training industry is not uniformly failing, but what survives is narrow. Research from the same Hoxhunt dataset that documented AI’s rising effectiveness found that behavior-based training conferred clear protection from both AI and human-generated attacks, with especially pronounced protection against AI. Employees in continuous behavior-change programs improved their reporting of social engineering attacks sixfold within six months and cut malicious clicks by 87%, whereas quarterly compliance-driven security awareness training (SAT) programs see reporting rates of roughly 10%.
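To make the two metrics in that comparison concrete, here is a minimal sketch of how a team might compute click rate and reporting rate from one simulation campaign. The `SimulationResult` record and the `campaign_rates` function are hypothetical names invented for illustration; they do not correspond to any vendor's export format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationResult:
    """One employee's outcome in one simulated phishing campaign (hypothetical schema)."""
    employee_id: str
    clicked: bool    # employee followed the simulated lure
    reported: bool   # employee reported the message to security


def campaign_rates(results: list[SimulationResult]) -> dict[str, float]:
    """Return click rate and reporting rate as fractions of all recipients."""
    total = len(results)
    if total == 0:
        # No recipients: define both rates as zero rather than dividing by zero.
        return {"click_rate": 0.0, "report_rate": 0.0}
    clicks = sum(r.clicked for r in results)
    reports = sum(r.reported for r in results)
    return {"click_rate": clicks / total, "report_rate": reports / total}
```

Tracking both numbers per campaign, rather than module completion, is what lets a program show whether reporting behavior is actually moving over time.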
A separate line of research from the University of Bari, published in late 2025, ran two controlled studies with 480 participants. It found that participants in all groups improved — they became better at distinguishing genuine from fake messages — but personalized LLM-generated content did not outperform generic content. Frequency and measurement mattered more than clever personalization.
The practical stance that emerges:
- Replace checkbox training with continuous simulation. Cadence and realism matter more than module length.
- Stop teaching grammar-and-typo heuristics as primary detection signals. They no longer identify skilled attacks.
- Measure reporting rates, not click rates. Reporting is a trainable behavior; detection is an arms race.
- Treat verification protocols — callback on a known-good number, out-of-band confirmation for wire transfers, code phrases for high-value requests — as the actual control. The Arup transfer would have failed a working callback policy.
- Assume voice and video are spoofable. The CFO on the call may not be the CFO.
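The verification bullets above can be sketched as a payment-release gate. This is an illustrative sketch only: the `TransferRequest` fields, the `may_release` function, and the $10,000 threshold are all assumptions made for the example, not a description of Arup's controls or of any real treasury system.

```python
from dataclasses import dataclass

# Hypothetical threshold; a real policy would set this per organization.
CALLBACK_THRESHOLD_USD = 10_000


@dataclass
class TransferRequest:
    """A wire-transfer request as seen by the approver (hypothetical schema)."""
    amount_usd: float
    requester: str
    channel: str                       # e.g. "video_call", "email", "phone"
    callback_confirmed: bool = False   # confirmed on a known-good number from the directory
    code_phrase_ok: bool = False       # pre-agreed phrase verified out of band


def may_release(req: TransferRequest) -> tuple[bool, str]:
    """Decide whether a transfer may proceed.

    The request channel is deliberately ignored: a convincing voice or
    video call grants no trust on its own.
    """
    if req.amount_usd < CALLBACK_THRESHOLD_USD:
        return True, "below verification threshold"
    if not req.callback_confirmed:
        return False, "callback on a known-good number required"
    if not req.code_phrase_ok:
        return False, "code phrase not verified"
    return True, "verified out of band"
```

Under these assumptions, a large request that arrives on a video call with no callback is refused no matter how familiar the faces are, because the decision never consults what the approver saw or heard.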
None of this is news to mature security teams. What’s new is the evidence that the alternative, the awareness module that ships with your cyber insurance policy, is producing no statistically significant behavior change in the face of the threats that matter now.
The Uncomfortable Conclusion
“Obsolete” is a strong word for awareness training, and it’s not quite right. What’s obsolete is the specific pedagogy — the emphasis on surface cues, the annual cadence, the assumption that an informed employee is a protected employee. The content hasn’t kept up with the attacks, and the attacks are compounding faster than quarterly refreshes can accommodate.
The honest framing for security leaders is this: your awareness program is now a compliance artifact, not a control. If the regulatory check is what you need, the existing module still clears it. If measurable reduction in AI-generated phishing risk is what you need, you are buying the wrong thing, and the evidence that you are buying the wrong thing has been in the peer-reviewed literature for at least eighteen months. The Arup employee didn’t fail his training. His training failed him.






