At DevWorld Conference 2026, Checkmarx delivered one of the most sobering keynotes of the event: “Is Scanning with AI the Answer?” The short answer — no, not by itself. The longer answer is far more interesting.

The Hidden Story: AI Scanning and Noise
The talk opened with a provocative question: “Will AI-only scanning be our saviour? We need to talk about noise.”
The speaker referenced recent work on LLM-based zero-day hunting with Claude Code’s Opus 4.8 — AI models that can autonomously discover and exploit real vulnerabilities at scale. Impressive on paper. Terrifying in practice.

Discovery Got Easier. Exploitation Got Faster.
The central thesis was stark:
- Anthropic’s Claude Opus/Mythos can discover and exploit real vulnerabilities, autonomously, at scale
- Patch-to-exploit window has collapsed from weeks to minutes
- Cyber insurance, regulators, and boards are repricing accordingly
- The constraint is no longer discovery — it is remediation capacity
Gartner confirmed: “finding vulnerabilities is now cheap and fast… the bottleneck has shifted from discovery to remediation.”
Forrester was even more direct: “Patch Tuesday is dead.”

What AI Means for Application Security
The Checkmarx analysis was refreshingly balanced. AI scanning has clear benefits — and clear blind spots.
What AI enables:
- Discover and exploit real vulnerabilities autonomously at scale
- Patch-to-exploit window collapsed from weeks to minutes
- “Find now, fix eventually” is no longer an option
What AI does not fix:
- AI-generated code is still insecure
- LLMs miss old and new vulnerabilities
- Inconsistent results (AI is probabilistic)
- Scaling, cost, and performance issues
- May expose zero-day vulnerabilities to attackers
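The inconsistency point is easy to measure for yourself. A minimal sketch, assuming you have run the same AI scan several times and collected each run's findings as a set of identifiers (the function names are illustrative, not from the talk):

```python
from itertools import combinations


def stability(runs):
    """Mean pairwise Jaccard similarity of finding sets from repeated scans.

    1.0 means every run reported exactly the same findings.
    """
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 1.0  # zero or one run: nothing to disagree with

    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 1.0

    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def stable_findings(runs):
    """Findings that appear in every run."""
    return set.intersection(*runs) if runs else set()
```

A deterministic scanner should score 1.0 on unchanged code. How far a probabilistic scanner falls below that is something to measure on your own codebase rather than assume.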

The Cost of Depth
The pricing table told a powerful story about the trade-off between scan depth and cost:
- Claude Haiku — $1/$5 per 1M tokens (1x cost) → Light scan
- Claude Sonnet 4.8 — $3/$15 (3x) → Standard scan
- Claude Opus 4.7 — $5/$25 (5x) → Deep analysis
- Claude Mythos — $25/$125 (25x) → Agent-level / autonomous
Going from light scan to autonomous agent-level analysis costs 25 times more. For enterprise codebases with millions of lines, that math gets expensive fast.
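As a back-of-the-envelope sketch of that math, the snippet below prices one full pass over a codebase at each tier. The prices are the ones listed above. Reading each pair as input/output price per 1M tokens, and the `tokens_per_line` and `output_ratio` defaults, are my assumptions rather than figures from the talk.

```python
# Prices per 1M tokens as (input, output), read off the list above.
# Treating each pair as input/output pricing is an assumption.
PRICES_PER_MTOK = {
    "Claude Haiku": (1.0, 5.0),
    "Claude Sonnet 4.8": (3.0, 15.0),
    "Claude Opus 4.7": (5.0, 25.0),
    "Claude Mythos": (25.0, 125.0),
}


def estimate_scan_cost(lines_of_code, price_in, price_out,
                       tokens_per_line=10, output_ratio=0.1):
    """Estimate the dollar cost of one full pass over a codebase.

    tokens_per_line and output_ratio are hypothetical placeholders;
    measure them on your own code before trusting the result.
    """
    input_tokens = lines_of_code * tokens_per_line
    output_tokens = input_tokens * output_ratio
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def cost_table(lines_of_code):
    """Map each tier name to its estimated single-pass cost."""
    return {
        name: estimate_scan_cost(lines_of_code, p_in, p_out)
        for name, (p_in, p_out) in PRICES_PER_MTOK.items()
    }


if __name__ == "__main__":
    for name, cost in cost_table(5_000_000).items():
        print(f"{name}: ${cost:,.2f} per full pass")
```

The absolute figures move with whatever token counts you plug in, but the ratio between the cheapest and most expensive tier stays at 25x. An agent may read the same code more than once, so treat a single-pass figure as a floor.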

The Numbers Validate the Model
Checkmarx presented a six-stage funnel showing how their hybrid approach reduces noise from raw scan findings to actionable remediation. Each percentage is cumulative, measured against the 10,000 raw findings:
- Raw Findings: 10,000 (all SAST, SCA, IaC scan results)
- AI Findings Analysis: 4,000 remaining → 60% reduction (false positives removed)
- Attack Vector Grouping: 1,500 remaining → 85% reduction (duplicates grouped)
- Severity Focus: 800 remaining → 92% reduction (filtered by criticality)
- AI Triage: 600 remaining → 94% reduction (irrelevant findings dropped)
- Autonomous Remediation: 500 remaining → 95% reduction
From 10,000 raw findings to 500 actionable items. That is the difference between drowning in alerts and actually fixing things.
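A few lines of arithmetic reproduce the funnel's percentages from its counts, with every stage measured against the 10,000 raw findings. The stage names and counts are taken from the list above; the function name is mine.

```python
FUNNEL = [
    ("Raw Findings", 10_000),
    ("AI Findings Analysis", 4_000),
    ("Attack Vector Grouping", 1_500),
    ("Severity Focus", 800),
    ("AI Triage", 600),
    ("Autonomous Remediation", 500),
]


def cumulative_reduction(funnel):
    """Return (stage, remaining, percent removed relative to the first stage)."""
    baseline = funnel[0][1]
    return [
        (name, remaining, round(100 * (baseline - remaining) / baseline))
        for name, remaining in funnel
    ]


if __name__ == "__main__":
    for name, remaining, pct in cumulative_reduction(FUNNEL):
        print(f"{name}: {remaining:,} remaining ({pct}% removed)")
```

Measured this way, the biggest single drop comes from the first AI analysis step (6,000 findings), while the last three stages together remove only another 1,000.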

The Hybrid Approach
The conclusion was clear: neither AI-only nor traditional scanning alone is sufficient.
AI-Only (Probabilistic):
- Fast contextual reasoning
- Natural language guidance
- Learns new coding patterns quickly
Traditional AppSec (Deterministic):
- Pattern matching and policy enforcement
- Strong governance
- Reliable detection of known vulnerabilities
The Hybrid (AI + Deterministic Security):
- Focus on what is exploitable
- Combine trusted detection with AI reasoning
- Prevent issues in the IDE pre-commit
- Prioritise risk with context and exploitability
- Visibility of AI components (AI-SBOM)
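One way to picture the hybrid model, as a minimal sketch rather than Checkmarx's actual implementation: findings come only from deterministic scanners, and an AI-derived exploitability score is used purely to order them. `Finding`, `prioritise`, `stub_scorer`, and the `reachable` flag are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

SEVERITY_WEIGHT = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass(frozen=True)
class Finding:
    rule_id: str     # emitted by a deterministic scanner (SAST, SCA, IaC)
    severity: str    # one of SEVERITY_WEIGHT's keys
    reachable: bool  # hypothetical context flag, e.g. from call-graph analysis


def prioritise(findings: List[Finding],
               exploitability: Callable[[Finding], float]) -> List[Finding]:
    """Rank deterministic findings by severity weighted by exploitability.

    The scorer only reorders; it never drops a finding, so detection
    itself stays deterministic.
    """
    def risk(f: Finding) -> float:
        score = min(max(exploitability(f), 0.0), 1.0)  # clamp model output
        return SEVERITY_WEIGHT[f.severity] * score

    return sorted(findings, key=risk, reverse=True)


def stub_scorer(f: Finding) -> float:
    """Stand-in for an LLM call; a real scorer would be probabilistic."""
    return 0.9 if f.reachable else 0.2
```

With this stub, a reachable high-severity finding outranks an unreachable critical one, which is the "focus on what is exploitable" idea in miniature. Whether that ordering is acceptable for critical findings is a policy decision for each team.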

Three Questions to Take With You
The talk closed with three questions every security team should answer:
- How can we remediate at scale and at machine speed? Shift the conversation from vulnerability count to remediation throughput. If your programme reports on findings discovered, you are measuring the wrong thing.
- Do we know what AI is in our codebase? LLMs, agent frameworks, MCP servers, open-source models from Hugging Face: if you cannot answer this, you have a supply chain blind spot. That is your first gap to close.
- Where does security feedback reach our developers? If the answer is “after the PR,” you are already too late. The cost of fixing after code review is an order of magnitude higher than fixing in the IDE.
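For the second question, a first inventory can be very simple. The sketch below checks a requirements-style dependency list against a small watchlist of AI-related Python packages. The watchlist and its category labels are my own illustrative picks, not an authoritative catalogue and not something shown in the talk.

```python
import re

# Illustrative watchlist only; extend it for your own stack.
AI_PACKAGES = {
    "openai": "LLM API client",
    "anthropic": "LLM API client",
    "langchain": "agent framework",
    "transformers": "open-source model runtime",
    "mcp": "MCP SDK",
}

# Leading package name, before any extras or version specifier.
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def find_ai_components(requirements_text):
    """Return {package: category} for watchlisted packages in the text."""
    found = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        match = _NAME.match(line)
        if not match:
            continue
        name = match.group(0).lower().replace("_", "-")
        if name in AI_PACKAGES:
            found[name] = AI_PACKAGES[name]
    return found
```

This only sees direct dependencies in one file. A real AI-SBOM would also need to cover transitive dependencies, model files, and MCP server configuration, which this sketch does not attempt.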

My Take
This was one of the most grounded AI security talks I have seen in 2026. No hype about AI replacing security teams. No fear-mongering about AI-powered attacks. Just a clear-eyed assessment: the patch-to-exploit window collapsed, remediation is the bottleneck, and the answer is combining deterministic and probabilistic approaches.
The AI-SBOM concept — knowing exactly what AI components live in your codebase — is going to become as important as software composition analysis was five years ago. If you are shipping agents, MCP servers, or LLM-powered features, you need visibility into that supply chain.
The OWASP Top 10 for LLM Applications covers many of the same risks from a different angle. Together they give a fuller picture of AI application security in 2026.
