Checkmarx at DevWorld 2026: AI Is Not Enough for AppSec

At DevWorld Conference 2026, Checkmarx delivered one of the most sobering keynotes of the event: “Is Scanning with AI the Answer?” The short answer — no, not by itself. The longer answer is far more interesting.

Checkmarx title slide: Is Scanning with AI the Answer?

The Hidden Story: AI Scanning and Noise

The talk opened with a provocative question: “Will AI-only scanning be our saviour? We need to talk about noise.”

The speaker referenced recent work on LLM-based zero-day hunting with Claude Code’s Opus 4.8 — AI models that can autonomously discover and exploit real vulnerabilities at scale. Impressive on paper. Terrifying in practice.

Discovery Got Easier. Exploitation Got Faster.

The central thesis was stark:

Anthropic’s Claude Opus/Mythos can discover and exploit real vulnerabilities, autonomously, at scale
Patch-to-exploit window has collapsed from weeks to minutes
Cyber insurance, regulators, and boards are repricing accordingly
The constraint is no longer discovery — it is remediation capacity

Gartner confirmed: “finding vulnerabilities is now cheap and fast… the bottleneck has shifted from discovery to remediation.”

Forrester was even more direct: “Patch Tuesday is dead.”

Discovery Got Easier Exploitation Got Faster with Gartner and Forrester quotes

What AI Means for Application Security

The Checkmarx analysis was refreshingly balanced. AI scanning has clear benefits — and clear blind spots.

What AI enables:

Discover and exploit real vulnerabilities autonomously at scale
Patch-to-exploit window collapsed from weeks to minutes
“Find now, fix eventually” is no longer an option

What AI does not fix:

AI-generated code is still insecure
LLMs miss old and new vulnerabilities
Inconsistent results (AI is probabilistic)
Scaling, cost, and performance issues
Exposes zero-day vulnerabilities to attackers (maybe)

Checkmarx AI model pricing comparison: Claude Haiku to Claude Mythos

The Cost of Depth

The pricing table told a powerful story about the trade-off between scan depth and cost. All prices are per 1M tokens, and each pair presumably lists the input and output rates:

Claude Haiku — $1/$5 per 1M tokens (1x cost) → Light scan
Claude Sonnet 4.8 — $3/$15 (3x) → Standard scan
Claude Opus 4.7 — $5/$25 (5x) → Deep analysis
Claude Mythos — $25/$125 (25x) → Agent-level / autonomous

Going from light scan to autonomous agent-level analysis costs 25 times more. For enterprise codebases with millions of lines, that math gets expensive fast.
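To make that math concrete, here is a rough sketch in Python. The prices are the ones from the slide. Reading each pair as input/output rates, the token volumes, and the function name scan_cost are all assumptions for illustration, not Checkmarx figures.

```python
# Prices per 1M tokens as shown on the slide, read here as (input, output).
# That reading, and the token volumes below, are assumptions for illustration.
PRICES_PER_MILLION = {
    "Claude Haiku": (1.0, 5.0),
    "Claude Sonnet 4.8": (3.0, 15.0),
    "Claude Opus 4.7": (5.0, 25.0),
    "Claude Mythos": (25.0, 125.0),
}


def scan_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars of one scan pass."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical codebase: 50M tokens read, 5M tokens of findings written.
    for model in PRICES_PER_MILLION:
        print(f"{model}: ${scan_cost(model, 50_000_000, 5_000_000):,.2f}")
```

Under these assumed volumes a single pass costs $75 with Claude Haiku and $1,875 with Claude Mythos. That is the same 25x ratio as the price list, and it is before any re-scans on new commits.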

The Numbers Validate the Model

Checkmarx presented a six-stage funnel showing how their hybrid approach reduces noise from raw scan findings to actionable remediation:

Raw Findings: 10,000 (all SAST, SCA, IaC scan results)
AI Findings Analysis: 4,000 remaining → 60% cumulative reduction (false positives removed)
Attack Vector Grouping: 1,500 remaining → 85% cumulative reduction (duplicates grouped)
Severity Focus: 800 remaining → 92% cumulative reduction (filtered by criticality)
AI Triage: 600 remaining → 94% cumulative reduction (irrelevant findings dropped)
Autonomous Remediation: 500 remaining → 95% cumulative reduction in vulnerabilities

From 10,000 raw findings to 500 actionable items. That is the difference between drowning in alerts and actually fixing things.
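The percentages on the slide are measured against the original 10,000 findings, not against the previous stage. A short Python sketch makes that visible by recomputing both views from the reported counts. The stage-over-stage column is my own derivation and was not on the slide.

```python
# Stage names and remaining counts as reported on the slide.
FUNNEL = [
    ("Raw Findings", 10_000),
    ("AI Findings Analysis", 4_000),
    ("Attack Vector Grouping", 1_500),
    ("Severity Focus", 800),
    ("AI Triage", 600),
    ("Autonomous Remediation", 500),
]


def funnel_report(stages):
    """Return (name, remaining, cumulative %, stage-over-stage %) per stage."""
    raw = stages[0][1]
    previous = raw
    rows = []
    for name, remaining in stages:
        cumulative = round(100 * (raw - remaining) / raw, 1)
        step = round(100 * (previous - remaining) / previous, 1)
        rows.append((name, remaining, cumulative, step))
        previous = remaining
    return rows


if __name__ == "__main__":
    for name, remaining, cumulative, step in funnel_report(FUNNEL):
        print(f"{name:<24}{remaining:>7}  cumulative {cumulative:>5}%  step {step:>5}%")
```

Seen stage by stage, the first two steps do most of the work: each removes roughly 60% of what reaches it, while the last two remove 25% and about 17%.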

The Numbers Validate the Model: 10000 findings reduced to 500 through six-stage pipeline

The Hybrid Approach

The conclusion was clear: neither AI-only nor traditional scanning alone is sufficient.

AI-Only (Probabilistic):

Fast contextual reasoning
Natural language guidance
Learns new coding patterns quickly

Traditional AppSec (Deterministic):

Pattern matching and policy enforcement
Strong governance
Reliable detection of known vulnerabilities

The Hybrid (AI + Deterministic Security):

Focus on what is exploitable
Combine trusted detection with AI reasoning
Prevent issues in the IDE pre-commit
Prioritise risk with context and exploitability
Visibility of AI components (AI-SBOM)
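As a rough illustration of "combine trusted detection with AI reasoning", the Python sketch below takes findings from a deterministic scanner and ranks them using an exploitability score. The Finding fields, the score, the threshold, and the rule names are hypothetical. The talk did not describe how Checkmarx implements this.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str           # identifier of the deterministic scanner rule that fired
    severity: int          # 1 (low) to 4 (critical), assigned by the scanner
    exploitability: float  # 0.0 to 1.0, hypothetical score from an AI triage step


def prioritise(findings, min_exploitability=0.5):
    """Keep findings judged exploitable and return them highest risk first."""
    kept = [f for f in findings if f.exploitability >= min_exploitability]
    return sorted(kept, key=lambda f: (f.severity, f.exploitability), reverse=True)


if __name__ == "__main__":
    sample = [
        Finding("sql-injection", 4, 0.9),
        Finding("weak-hash", 2, 0.2),
        Finding("path-traversal", 3, 0.7),
        Finding("hardcoded-secret", 4, 0.6),
    ]
    for finding in prioritise(sample):
        print(finding.rule_id, finding.severity, finding.exploitability)
```

The design point is that the deterministic scanner decides what counts as a finding, and the probabilistic score only decides the order and the cut-off. A wrong score then costs you priority, not detection.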

The Hybrid Approach: AI plus deterministic security

Three Questions to Take With You

The talk closed with three questions every security team should answer:

How can we remediate at scale and at machine speed? — Shift the conversation from vulnerability count to remediation throughput. If your programme reports on findings discovered, you are measuring the wrong thing.
Do we know what AI is in our codebase? — LLMs, agent frameworks, MCP servers, open-source models from Hugging Face — if you cannot answer this, you have a supply chain blind spot. That is your first gap to close.
Where does security feedback reach our developers? — If the answer is “after the PR,” you are already too late. The cost of fixing after code review is an order of magnitude higher than fixing in the IDE.
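For the first question, one way to report on throughput instead of counts is sketched below in Python. The metric definitions (fixes per week and mean days to fix) and the function name are my own assumptions about what "remediation throughput" could mean in practice. The talk did not prescribe a formula.

```python
from datetime import date
from statistics import mean


def remediation_metrics(findings, period_days):
    """Compute throughput and mean time to fix.

    findings: list of (opened, fixed) date pairs; fixed is None if still open.
    period_days: length of the reporting period in days.
    Returns (fixes per week, mean days from open to fix or None).
    """
    closed = [(opened, fixed) for opened, fixed in findings if fixed is not None]
    fixes_per_week = len(closed) / (period_days / 7)
    mean_days_to_fix = (
        mean((fixed - opened).days for opened, fixed in closed) if closed else None
    )
    return fixes_per_week, mean_days_to_fix


if __name__ == "__main__":
    sample = [
        (date(2026, 1, 1), date(2026, 1, 3)),
        (date(2026, 1, 2), date(2026, 1, 8)),
        (date(2026, 1, 5), None),  # still open, excluded from both metrics
    ]
    print(remediation_metrics(sample, period_days=14))
```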

Three Questions to Take With You closing slide at DevWorld 2026

My Take

This was one of the most grounded AI security talks I have seen in 2026. No hype about AI replacing security teams. No fear-mongering about AI-powered attacks. Just a clear-eyed assessment: the patch-to-exploit window collapsed, remediation is the bottleneck, and the answer is combining deterministic and probabilistic approaches.

The AI-SBOM concept — knowing exactly what AI components live in your codebase — is going to become as important as software composition analysis was five years ago. If you are shipping agents, MCP servers, or LLM-powered features, you need visibility into that supply chain.
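A first, very partial step towards that visibility can be sketched in a few lines of Python: scan a requirements.txt-style dependency list for AI-related packages. The package list here is illustrative and deliberately incomplete, and the function name is hypothetical. A real AI-SBOM would also need to cover models, MCP servers, and agent configuration, which this does not attempt.

```python
import re

# Illustrative, deliberately incomplete set of AI-related PyPI package names.
AI_PACKAGES = {"openai", "anthropic", "langchain", "transformers", "mcp"}


def find_ai_components(requirements_text: str) -> list[str]:
    """Return AI-related package names found in requirements.txt-style text."""
    found = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # The package name is everything before a version specifier or extras.
        name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0].lower()
        if name in AI_PACKAGES:
            found.add(name)
    return sorted(found)


if __name__ == "__main__":
    sample = "requests\nopenai>=1.0\n# internal tools\ntransformers[torch]\nflask\nmcp\n"
    print(find_ai_components(sample))
```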

The OWASP Top 10 for LLM Applications covers many of the same risks from a different angle. Read together, the two give a much fuller picture of AI application security in 2026.

Wide shot of Checkmarx presentation at DevWorld Conference 2026 theater
