TL;DR

Anthropic open-sourced **defending-code-reference-harness**, a Claude-powered pipeline that finds, verifies, and patches code vulnerabilities, and it hit ~1.3k GitHub stars the week of June 4, 2026. It pairs file-only skills (/vuln-scan, /triage, /patch) with a 7-stage autonomous loop in which every Claude agent runs in a gVisor sandbox. The headline stat from Anthropic's own data: 1,596 vulnerabilities found, only 97 patched by May 22, 2026. Discovery is now the cheap part; triage and fixing are the bottleneck. Run /vuln-scan on your repo today; never let autonomous patching auto-merge.

Claude AI Vulnerability Scanner: Discovery Is Cheap, Fixing Is Not

By Rohit Raj — Founding Engineer · 10+ yrs MVP shipping · LinkedIn

Security teams using Claude disclosed 1,596 vulnerabilities by May 22, 2026. They patched 97. That gap — roughly 6%, straight from Anthropic's own engineering write-up (May 27, 2026) — is the most honest number in AI security right now, and it reframes the whole conversation. AI made *finding* bugs almost free and trivially parallel; it did nothing to make *fixing* them faster.
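The "roughly 6%" figure is plain arithmetic on the two counts quoted above. A quick check, using only those two numbers:

```python
# Counts as quoted in the paragraph above (from Anthropic's write-up).
found = 1596
patched = 97

patch_rate = patched / found   # fraction of findings that were fixed
unpatched = found - patched    # findings still open on May 22, 2026

print(f"patched: {patch_rate:.1%}, still open: {unpatched}")
# prints "patched: 6.1%, still open: 1499"
```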

That gap is also why Anthropic's freshly open-sourced **defending-code-reference-harness** — a reference pipeline for using Claude to find and patch vulnerabilities — shot to ~1.3k GitHub stars the week of June 4, 2026. When a vulnerability scanner, not a frontier model, is the week's breakout repo, the AI-security question has clearly moved from "can LLMs find bugs?" to "what do we do with the firehose?" The harness is Anthropic's attempt to answer that, and the candor baked into it is exactly why it earns a working developer's attention.

Below: exactly what shipped, how to point /vuln-scan at your own repository, how the open-source harness stacks up against the claude-code-security-review GitHub Action, the managed Claude Security product, and traditional scanners like Snyk and Semgrep — plus where this quietly breaks and how I'd wire it into a real production pipeline.

What Anthropic Actually Open-Sourced

Strip the launch noise and here is the concrete surface area, from the repository README:

- Two modes, different risk levels. A set of interactive skills (/threat-model, /vuln-scan, /triage, /patch, /customize) that only read and write files and are safe to run unsandboxed inside Claude Code, and a separate autonomous harness that actually executes code and therefore requires a sandbox.
- A 7-stage autonomous loop: Build → Recon → Find → Verify → Dedupe → Report → Patch. Recon partitions the source into input-parsing subsystems; *N* parallel agents craft malformed inputs and run until they reproduce a crash 3 out of 3 times; a separate grader agent re-runs each crash in a fresh container before it counts.
- Sandboxed by default. Every agent runs in a gVisor-isolated container with network egress allow-listed to the Claude API only, so a scanner agent can't exfiltrate your code.
- Ships for C/C++ memory bugs (Docker + AddressSanitizer out of the box) but is explicitly language- and vuln-class-agnostic: the /customize skill rewrites the pipeline for your stack.
- Bring your own Claude access: works against the Claude API, Amazon Bedrock, Google Vertex, or Azure; the subagent model is set via CLAUDE_CODE_SUBAGENT_MODEL.
- It's a reference implementation: Python (92.7% of the repo), not maintained and not accepting contributions. You fork it and own it.
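The Find and Verify stages described in the list above boil down to two gates: a crash must reproduce 3 out of 3 times for the agent that found it, and then once more for an independent grader. Here is a minimal sketch of that gating logic. It is not the harness's code: `RunTarget`, `Candidate`, `reproduces`, and `verified_findings` are hypothetical names, and a plain callable stands in for a sandboxed container run.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for "run the target on this input in a container".
# Returns True if the target crashed.
RunTarget = Callable[[bytes], bool]


@dataclass
class Candidate:
    subsystem: str   # which input-parsing subsystem the input targets
    payload: bytes   # the malformed input that allegedly crashes it


def reproduces(run: RunTarget, payload: bytes, attempts: int = 3) -> bool:
    """True only if the crash reproduces on every attempt (3 out of 3 by default)."""
    return all(run(payload) for _ in range(attempts))


def verified_findings(
    candidates: list[Candidate],
    finder_run: RunTarget,
    grader_run: RunTarget,
) -> list[Candidate]:
    """Keep candidates that reproduce for the finder AND for a separate grader."""
    kept = []
    for candidate in candidates:
        if not reproduces(finder_run, candidate.payload):
            continue  # flaky or non-crashing input: never reported
        if grader_run(candidate.payload):  # independent re-run before it counts
            kept.append(candidate)
    return kept
```

The point of the second gate is that the grader shares no state with the finder, so a crash that only appears in the finder's environment is dropped before it reaches a report.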

The design choice that matters most is adversarial verification. Anthropic reports that adding an independent agent to disprove each finding "roughly halved the rate of non-exploitable findings," and a team that required a working proof-of-concept before reporting drove false positives to "near zero." That is the difference between an AI scanner you'll actually use and one you'll mute after a week of noise. It's the same multi-agent, verify-before-you-trust pattern I dug into in Claude Code dynamic workflows — here it's pointed at your attack surface instead of your feature backlog.

How Do You Run /vuln-scan on Your Own Repo?

You do not need the full sandbox to get value on day one. The interactive skills are file-only and run inside Claude Code, so the fastest first pass is three shell commands to open the repo in Claude Code, followed by three skill invocations:

```bash
git clone https://github.com/anthropics/defending-code-reference-harness
cd defending-code-reference-harness
claude            # open Claude Code in the repo

# Lines starting with ">" are typed at the Claude Code prompt, not the shell.
# 30-second guided run against the bundled "canary" target
> /quickstart

# Then point the same skills at your own code:
> /threat-model bootstrap ~/code/my-service
> /vuln-scan ~/code/my-service
> /triage ~/code/my-service/VULN-FINDINGS.json
```

That sequence does threat-modeling, a static scan, and triage without executing anything — read/write files only. It's the part I'd run first on any client codebase because it's zero-risk and surfaces the obvious data-flow and access-control issues fast.

When you're ready for execution-verified findings (real crashes, not pattern matches), you opt into the autonomous harness:

```bash
python3 -m venv .venv && .venv/bin/pip install -e .
./scripts/setup_sandbox.sh          # one-time: installs gVisor, builds agent images
export ANTHROPIC_API_KEY=sk-ant-...

# recon -> find -> verify -> report, 3 runs in parallel
bin/vp-sandboxed run my-service --model <model-id> --runs 3 --parallel --stream --auto-focus

# generate candidate patches from the verified findings
bin/vp-sandboxed patch results/my-service/<timestamp>/ --model <model-id>
```

The thing the quickstart won't tell you: cost and time concentrate in the autonomous `find` and `patch` stages, because that's where you're paying for *N* parallel agents to fuzz and re-fuzz. The interactive /vuln-scan is cheap; the full harness on a large target is not. Scope it to a subsystem with --auto-focus before you turn it loose on a monorepo — the same token-budget discipline I argued for in LLM context compression.
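To see why scoping matters, it helps to write the cost down as a product: runs × agents per run × tokens per agent × price. The sketch below is a back-of-the-envelope estimator, not anything the harness provides. `estimate_cost_usd` is a hypothetical helper, and every number in the example is a placeholder to replace with your own measurements and your provider's current pricing.

```python
def estimate_cost_usd(
    runs: int,
    agents_per_run: int,
    tokens_per_agent: int,
    usd_per_million_tokens: float,
) -> float:
    """Rough token bill for one audit; uses a single blended price per million tokens."""
    total_tokens = runs * agents_per_run * tokens_per_agent
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Placeholder inputs only: these are NOT measured values or real prices.
whole_repo = estimate_cost_usd(runs=3, agents_per_run=8,
                               tokens_per_agent=2_000_000,
                               usd_per_million_tokens=10.0)
one_subsystem = estimate_cost_usd(runs=3, agents_per_run=2,
                                  tokens_per_agent=2_000_000,
                                  usd_per_million_tokens=10.0)

print(whole_repo, one_subsystem)  # cost scales linearly with agent count
```

Because the terms multiply, cutting the number of parallel agents by narrowing the target is the cheapest lever available before a run starts.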

Where Does an AI Vulnerability Scanner Earn Its Keep?

This is not a blanket replacement for your existing scanners. It pays off in three specific shapes of work.

1. Context-dependent bugs that pattern matchers miss. Traditional SAST (Snyk, Semgrep, CodeQL) is excellent at known signatures — a hardcoded secret, a SQL string concatenation, a vulnerable dependency version. It is weak at business-logic flaws, broken access control, and unsafe data flows that span multiple files. Claude reasons about those the way a reviewer does. On a fintech build like myFinancial, the bugs that scared me were never the ones a regex catches — they were "this endpoint trusts a user-supplied account ID three functions deep," and that's exactly the class an LLM scanner is built to find.

2. Triaging a backlog you already have. If you've ever run a commercial scanner and gotten 400 "findings," you know the real work is deciding which 12 are real. The harness's /triage skill with multi-vote confirmation (--votes 5) is genuinely useful *on findings you already have* — point it at an existing SARIF/JSON export and let it rank exploitability and kill the false positives.

3. C/C++ and memory-unsafe code. The out-of-the-box pipeline targets memory bugs with ASAN, which is the highest-stakes, hardest-to-audit category. If you maintain a parser, a codec, or any native library, this is the configuration that ships ready to use.

The thread through all three: it shines when the bug requires reasoning about intent, not matching a known bad string. For dependency CVEs and secret detection, your existing tools are faster and cheaper — keep them.
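The multi-vote confirmation mentioned in item 2 can be pictured as a majority vote over independent triage passes. This is a minimal sketch of that idea under my own assumptions, not the harness's implementation of --votes: the verdict labels, the `needs-review` fallback, and the `majority_verdict` helper are all hypothetical.

```python
from collections import Counter


def majority_verdict(votes: list[str], threshold: float = 0.5) -> str:
    """Return the verdict holding strictly more than `threshold` of the votes.

    Falls back to "needs-review" when there are no votes or no clear majority,
    so an ambiguous finding goes to a human instead of being silently dropped.
    """
    if not votes:
        return "needs-review"
    verdict, count = Counter(votes).most_common(1)[0]
    return verdict if count / len(votes) > threshold else "needs-review"


# Five independent passes over the same finding, as with five votes.
print(majority_verdict(["real", "real", "real", "false_positive", "false_positive"]))
# prints "real"
```

An odd vote count such as 5 avoids ties between two labels, which is one reason to prefer it over 4.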

Harness vs GitHub Action vs Claude Security vs Snyk/Semgrep

Anthropic shipped *three* security things in the same window, and they're easy to confuse. Here's the honest split, including the traditional scanners you probably already run:

| Tool | Type | Runs where | Cost | Best for | Main tradeoff |
| --- | --- | --- | --- | --- | --- |
| defending-code-reference-harness | Open-source reference pipeline | Your machine / CI, self-hosted | Free + Claude API tokens | Deep audits, custom stacks, execution-verified findings | You fork and maintain it; not supported |
| claude-code-security-review | Free GitHub Action | CI on every PR | Free + API tokens | Diff-scoped review on pull requests | Scans the change, not the whole repo |
| Claude Security (managed) | Hosted product, Claude Opus 4.7 | Anthropic cloud / Claude Code on web | Paid (Enterprise public beta) | Teams that want scanning without owning a pipeline | Beta limited to Enterprise plans; per-seat cost; less control |
| Snyk / Semgrep / CodeQL | Traditional SAST + SCA | CI, IDE | Free tier → paid | Dependency CVEs, secrets, known patterns, compliance | Misses multi-file logic flaws; noisy on novel bugs |

#### Claude Security vs Snyk — do you replace your scanner?

No — you layer it. Per The New Stack, managed Claude Security (built on Opus 4.7 — I compared that model's tradeoffs in Opus 4.8 vs 4.7) re-examines every finding to prove or disprove it before showing you, which is the verification layer Snyk lacks. But Snyk's dependency graph and license scanning are things the LLM doesn't do. The right 2026 stack is traditional SAST for known-pattern coverage + an AI scanner for the reasoning-heavy bugs — not one or the other.

When Should You Skip (or Gate) This?

Because the discovery side works so well, the failure modes all live downstream — which is exactly where Anthropic is most candid.

Autonomous patching is not production-ready. Anthropic's own write-up notes models generate inconsistent patches and that one team's fixes were "as restrictive as possible, to the point that they would break connections." A patch that closes a hole by breaking a feature is a regression with a security excuse. Never let `/patch` auto-merge — treat every generated fix as a draft PR for human review. This is the same invisible-failure trap I wrote about in AI-generated code anti-patterns: the code looks right and is wrong in a way tests don't catch.

Severity inflation is real. Without an understanding of your threat boundaries and compensating controls, the model "inflates severity." A finding it scores critical may be unreachable behind auth you didn't describe — which is why the /threat-model step isn't optional decoration; it's what calibrates everything after it.
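One way to picture what the threat model contributes: it supplies the facts needed to pull an inflated score back down. The sketch below uses an illustrative policy of my own (one step down if the code is not reachable without authentication, one more if a compensating control exists). It is not how the harness or /threat-model scores findings, and `calibrate` is a hypothetical helper.

```python
SEVERITIES = ["low", "medium", "high", "critical"]  # ordered least to most severe


def calibrate(
    severity: str,
    reachable_unauthenticated: bool,
    has_compensating_control: bool,
) -> str:
    """Lower a raw severity using facts that only a threat model can supply."""
    level = SEVERITIES.index(severity)
    if not reachable_unauthenticated:
        level -= 1  # attacker must already hold valid credentials
    if has_compensating_control:
        level -= 1  # e.g. a WAF rule or network segmentation in front of it
    return SEVERITIES[max(level, 0)]  # never drop below "low"


print(calibrate("critical", reachable_unauthenticated=False,
                has_compensating_control=True))
# prints "medium"
```

Without those two inputs the function has nothing to subtract, which is the inflation problem in miniature.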

It assumes you can sandbox. The autonomous loop needs Docker + gVisor. On a locked-down corporate laptop or a CI runner you don't fully control, you may be limited to the interactive (file-only) skills — still useful, but not the execution-verified mode.

Token cost scales with thoroughness. Three parallel runs fuzzing a large target is real money. If your need is "block the obvious stuff on every PR," the lightweight GitHub Action or your existing SAST is the better-fit, cheaper tool. Reach for the full harness for *audits*, not for *every commit*.

How I'd Wire This Into a Production Pipeline

Here's the concrete way I'd actually adopt this on a client build, not the demo version.

Split it by risk, not by hype. The file-only /threat-model + /vuln-scan skills go in early and often — they're cheap and safe. The autonomous execution harness runs as a scheduled audit (weekly, or pre-release), never inline on every commit. That keeps the token bill predictable and matches each mode to its real cost.

Gate the diff, not the repo, on PRs. For per-PR coverage I'd run the claude-code-security-review Action scoped to the changed files — scanning the whole monorepo on every push is how you burn budget and train the team to ignore the bot. Diff-scoped + a required-check status is the integration that actually changes behavior.
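Diff scoping is mostly a matter of deciding which changed paths are worth a model's attention. A minimal sketch, assuming you feed it the output of `git diff --name-only` for the PR range; the suffix list, the file cap, and the `files_to_review` helper are my own assumptions and are not part of the GitHub Action.

```python
from pathlib import PurePosixPath

# Assumption: which extensions count as reviewable source. Adjust per stack.
SOURCE_SUFFIXES = {".py", ".c", ".cc", ".cpp", ".h", ".go", ".ts", ".js"}


def files_to_review(diff_name_only: str, max_files: int = 50) -> list[str]:
    """Parse `git diff --name-only` output and keep changed source files only."""
    paths = [line.strip() for line in diff_name_only.splitlines() if line.strip()]
    source = [p for p in paths if PurePosixPath(p).suffix in SOURCE_SUFFIXES]
    return source[:max_files]  # cap so one huge PR cannot blow the budget


changed = "src/auth/session.py\nREADME.md\n\nnative/parser.c\nassets/logo.png\n"
print(files_to_review(changed))
# prints "['src/auth/session.py', 'native/parser.c']"
```

In CI the input would typically come from something like `git diff --name-only origin/main...HEAD`, with the base branch name depending on your repository.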

Make a human own every patch. I wire /patch output into a *draft* PR with the original proof-of-concept attached, assigned to a person. The 1,596-found / 97-patched gap is the warning: the bottleneck is human review capacity, and pretending an LLM closes that gap is how you ship a broken "fix." On a regulated build — payments, health, anything in fintech — that human gate is non-negotiable.
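As a sketch of that human gate, the helper below builds (but does not run) a GitHub CLI command that opens a draft PR with the proof-of-concept path in the body and a named assignee. The `gh pr create` flags used are standard GitHub CLI options; the `draft_pr_command` function, the branch name, the finding ID, and the PoC path are hypothetical.

```python
def draft_pr_command(
    finding_id: str,
    head_branch: str,
    poc_path: str,
    assignee: str,
) -> list[str]:
    """Build the argv for a draft PR that a named human must review."""
    body = (
        f"Candidate fix for finding {finding_id}.\n\n"
        f"Proof of concept: {poc_path}\n\n"
        "Generated patch. Do not merge without manual review."
    )
    return [
        "gh", "pr", "create",
        "--draft",                       # never opens as ready-to-merge
        "--title", f"[security] Candidate fix for {finding_id}",
        "--body", body,
        "--head", head_branch,
        "--assignee", assignee,          # a person, not a bot account
    ]


argv = draft_pr_command("FINDING-0001", "fix/finding-0001",
                        "poc/finding-0001.bin", "alice")
print(" ".join(argv[:4]))
# prints "gh pr create --draft"
```

Passing the list to `subprocess.run(argv, check=True)` would execute it. Keeping the builder separate from the runner makes the "always draft, always assigned" rule easy to unit test.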

Treat it as a layer, log everything. It sits *alongside* Snyk/Semgrep in CI, not instead of them, and I persist every finding + verdict so the false-positive rate is measurable over time. Security tooling you can't measure is security theater. This is the same production-hardening mindset I bring to securing MCP servers — the quickstart gets you a demo; the wiring gets you something you can trust on a real codebase.
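Measuring the false-positive rate only needs a small table of findings and the verdict a human eventually gave each one. A minimal sketch using SQLite from the standard library; the schema, the verdict labels, and the helper names are my own assumptions and not a format the harness emits.

```python
import sqlite3


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the findings log."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS findings ("
        " id TEXT PRIMARY KEY, tool TEXT NOT NULL, verdict TEXT NOT NULL)"
    )
    return conn


def record(conn: sqlite3.Connection, finding_id: str, tool: str, verdict: str) -> None:
    """Insert a finding, or overwrite its verdict if it was logged before."""
    conn.execute(
        "INSERT OR REPLACE INTO findings (id, tool, verdict) VALUES (?, ?, ?)",
        (finding_id, tool, verdict),
    )
    conn.commit()


def false_positive_rate(conn: sqlite3.Connection, tool: str):
    """Share of a tool's findings judged false positives, or None if it has none."""
    total, fps = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(verdict = 'false_positive'), 0)"
        " FROM findings WHERE tool = ?",
        (tool,),
    ).fetchone()
    return fps / total if total else None


log = open_log()
record(log, "f1", "ai-scanner", "confirmed")
record(log, "f2", "ai-scanner", "false_positive")
record(log, "f3", "ai-scanner", "confirmed")
record(log, "f4", "ai-scanner", "confirmed")
print(false_positive_rate(log, "ai-scanner"))
# prints "0.25"
```

Keying the rate by tool lets you compare the AI scanner and the traditional SAST side by side over the same period.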

If you want this kind of security-and-reliability engineering built into your product from day one instead of bolted on after an incident, that's the work I do: I run fixed-scope 6-week MVP builds, or you can hire a founding engineer in India to own the whole pipeline end to end.

