Claude Code Review: a team of agents reviewing every PR
Launch: March 9, 2026 · Availability: research preview for Team and Enterprise plans
Claude Code Review is designed to solve a bottleneck that Claude Code itself helped create. With code output per engineer inside Anthropic reportedly up 200% in one year, pull requests began piling up faster than humans could review them. The response is a reviewer that scales where humans cannot: a team of AI agents that analyzes each pull request in parallel, verifies its own findings, and posts comments directly in GitHub. In this article, we break down what it is, how it works, pricing, setup, and the tradeoffs teams should understand before enabling it in production.
What is Claude Code Review?
Claude Code Review analyzes GitHub pull requests and posts findings as inline comments on the exact lines where issues are detected. A fleet of specialized agents inspects code changes in the context of the full codebase, looking for logic errors, security weaknesses, broken edge cases, and subtle regressions.
It is deeper, and more expensive, than the existing Claude Code GitHub Action, which remains open source and available. The goal is not to replace human reviewers but to arrive first: reduce noise, prioritize high-impact issues, and free humans for what only humans can evaluate: business context, architectural direction, and strategic tradeoffs.
Why now?
The rise of “vibe coding” (using AI tools that turn natural-language instructions into large volumes of code, quickly) changed how engineers work. While this accelerates delivery, it also introduces new bug patterns, security risks, and code that is harder to reason about.
Modern teams often no longer struggle with code generation speed. They struggle with review quality. AI can generate diffs faster than most teams can evaluate them, and traditional review tools still focus mostly on syntax, style, and narrow static patterns.
Anthropic’s internal numbers illustrate this shift: before Code Review, 16% of PRs received substantive review comments; after rollout, 54% did.
How does Claude Code Review work internally?
When a pull request is opened or updated, multiple agents analyze the diff and surrounding code in parallel on Anthropic infrastructure. Each agent looks for a different class of issue, then a verification stage checks candidate findings against real code behavior to filter false positives. Results are deduplicated, ranked by severity, and posted as inline comments on the exact relevant lines. If no issue is found, Claude posts a short confirmation comment in the PR.
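Conceptually, the flow above (parallel specialized agents, a verification pass, deduplication, severity ranking) looks like the following sketch. Everything here is a hypothetical illustration: the agent names, the `Finding` shape, and the verification rule are invented for the example, not Anthropic's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    line: int       # diff line the finding points at
    kind: str       # e.g. "logic", "security", "edge-case"
    severity: int   # higher = more severe
    message: str

# Toy stand-ins for the specialized agents; each scans the diff for one
# class of issue. Note both agents report the same logic finding.
def logic_agent(diff):
    return [Finding(3, "logic", 2, "loop may skip last element")]

def security_agent(diff):
    return [Finding(3, "security", 3, "unsanitized input reaches query"),
            Finding(3, "logic", 2, "loop may skip last element")]

AGENTS = [logic_agent, security_agent]

def verified(finding, diff):
    # Stand-in for the verification stage that re-checks each candidate
    # claim against actual code behavior to filter false positives.
    return True

def review(diff):
    # 1. Run each specialized agent in parallel.
    with ThreadPoolExecutor() as pool:
        per_agent = list(pool.map(lambda agent: agent(diff), AGENTS))
    candidates = [f for findings in per_agent for f in findings]
    # 2. Keep only verified findings, 3. deduplicate identical reports,
    # 4. rank by severity before posting as inline comments.
    unique = {(f.line, f.kind, f.message): f
              for f in candidates if verified(f, diff)}
    return sorted(unique.values(), key=lambda f: -f.severity)
```

On this toy input, `review(...)` collapses the duplicate logic report and ranks the security finding first, mirroring the deduplicate-then-rank behavior described above.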
The system labels severity with color coding: red for highest severity, yellow for potential issues worth manual review, and purple for issues tied to pre-existing code or historical bugs.
Findings include collapsible extended reasoning so reviewers can inspect why Claude flagged the issue and how it validated the claim.
Reviews scale with PR size:
| PR size | % with findings | Avg issues per PR |
|---|---|---|
| Large (>1,000 lines) | 84% | 7.5 |
| Small (<50 lines) | 31% | 0.5 |
Engineers reportedly agree with most findings: under 1% are marked incorrect.
When and where is Claude Code Review available?
Code Review launched on March 9, 2026 as a research preview for Claude Team and Claude Enterprise customers. Setup path: an admin installs the Anthropic GitHub app, connects repositories, and enables review for selected branches.
Important restriction: Code Review is not available to organizations with Zero Data Retention enabled.
Current integration: GitHub only. GitLab CI/CD appears in documentation as a separate alternative for teams that prefer running on their own CI infrastructure.
Cost expectations per PR
Code Review optimizes for depth and is more expensive than lighter options like the Claude Code GitHub Action. Reviews are token-priced and typically cost $15-25 each, scaling with PR size and complexity.
A practical reference point: a team running 20 medium PRs per week can expect around $300-500/week on review alone, enough to exceed team subscription cost in many cases.
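To turn the per-PR range into a budget number, a back-of-the-envelope model is enough. The $15-25 figure is the one quoted above; the four-week month is a simplifying assumption of the sketch.

```python
# Rough monthly spend model using the $15-25 per-review range quoted above.
# A four-week month is assumed for simplicity.
def monthly_review_cost(prs_per_week, cost_low=15.0, cost_high=25.0, weeks=4):
    """Return a (low, high) monthly cost estimate in dollars."""
    prs = prs_per_week * weeks
    return prs * cost_low, prs * cost_high

low, high = monthly_review_cost(prs_per_week=20)
print(f"20 PRs/week -> ${low:.0f}-${high:.0f}/month")  # $1200-$2000/month
```

This matches the $300-500/week reference point above when multiplied out over a month.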
Admin controls for spend:
Admins can cap monthly organization spend, scope review to selected repositories, and use analytics dashboards to track reviewed PR volume, acceptance rates, and total review cost.
Comparison with alternatives:
| Solution | Cost per PR | Depth | Control |
|---|---|---|---|
| Claude Code Review | $15-25 | High (multi-agent, full-codebase context) | Medium (REVIEW.md) |
| Claude Code GitHub Action | Low (direct tokens) | Medium | High (self-hosted) |
| SonarQube Cloud (paid) | ~$0.02-0.10 | Low (static rules) | High |
| Senior human review | ~$50-200/hour | Very high (contextual) | Full |
How to configure Claude Code Review
Prerequisites
- Active Team ($25/user/month annual, or $30 monthly) or Enterprise Claude plan
- GitHub repository (public or private)
- Claude organization admin permissions
1. Enable in admin settings
Open claude.ai/admin-settings/claude-code, find the Code Review section, and click “Enable”. The flow prompts installation of Anthropic’s GitHub app.
2. Install the GitHub app and select repositories
Install Anthropic’s GitHub app for your org and choose the repositories that should receive automatic reviews. Start with a single lower-risk repo to calibrate.
3. Create customization files
Split responsibilities clearly: CLAUDE.md explains repository/system context. REVIEW.md defines review priorities.
CLAUDE.md (repo context):
This is a Node.js billing service with PostgreSQL.
Critical rule: all financial operations must use explicit transactions.
Public endpoints are in /src/api; business logic is in /src/domain.
REVIEW.md (review priorities):
# REVIEW.md
Prioritize:
- Authorization regressions in admin vs customer paths
- Idempotency issues in webhook handlers
- Missing transaction boundaries in billing writes
- Async jobs that may trigger duplicates (emails, refunds, notifications)
Deprioritize:
- Formatting and import order
- Naming comments without runtime impact
- Style nits already covered by linting
4. Validate first review
Open a test PR. In roughly 20 minutes you should see inline comments with color severity and collapsible reasoning. Use early reviews to tune REVIEW.md; early false positives are often fixed with one or two precise instruction lines.
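To confirm the first review landed without clicking through the PR, you can list inline review comments via GitHub's standard REST endpoint and filter for the reviewing account. The bot login used below is a placeholder, since the actual account name may differ in your org.

```python
import json
from urllib.request import Request, urlopen

def bot_comments(comments, bot_login):
    """Keep only inline review comments posted by the review bot."""
    return [c for c in comments if c["user"]["login"] == bot_login]

def fetch_pr_comments(owner, repo, pr_number, token):
    # GitHub REST API: list review comments on a pull request.
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}/comments"
    req = Request(url, headers={"Authorization": f"Bearer {token}",
                                "Accept": "application/vnd.github+json"})
    with urlopen(req) as resp:
        return json.load(resp)

# Offline example with a canned payload (shape matches the API response).
# "claude-review[bot]" is a placeholder login, not a confirmed account name.
sample = [
    {"user": {"login": "claude-review[bot]"}, "path": "src/api/webhooks.js",
     "line": 42, "body": "Possible duplicate refund on retry."},
    {"user": {"login": "alice"}, "path": "src/api/webhooks.js",
     "line": 10, "body": "nit: rename this handler"},
]
print(len(bot_comments(sample, "claude-review[bot]")))  # 1
```

A count of zero well past the ~20-minute window is a signal to recheck app installation and repository scoping.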
5. Monitor costs
Use the analytics dashboard at claude.ai/admin-settings to track cost per repository, acceptance rate, and weekly review volume. Start with a conservative monthly cap before expanding coverage.
Highest ROI use cases
Based on Anthropic internal data and early public reports:
Large and complex PRs (>1,000 lines): 84% show findings with an average of 7.5 issues per review; $15-25 is easy to justify if one bug would be expensive in production.
Critical repositories without dedicated reviewers: useful for small teams with scarce senior bandwidth and shallow review coverage.
AI-generated code: particularly useful for logic correctness issues rather than style, exactly where LLM-generated code can look plausible but be semantically wrong.
External contributor PRs in open source: in a TrueNAS middleware ZFS crypto refactor PR, Code Review reportedly surfaced a pre-existing adjacent bug: a type mismatch silently clearing encryption-key cache on sync.
Tradeoffs and operational caveats
Cost can surprise high-volume teams
A $15-25 review seems reasonable in isolation. For a 10-engineer team opening 3 PRs/week each, that is roughly 120 reviews a month, or $1,800-3,000/month on review alone, often above the total Team-plan subscription cost. Teams with many small, frequent PRs should model spend before a broad rollout.
Practical approach: start with high-criticality repositories only; repository scoping and monthly caps are intended for exactly this.
20-minute latency changes workflow dynamics
If findings arrive ~20 minutes after a PR opens, contributors must decide whether to wait for results or move on to the next task and circle back. The system does not auto-approve or block; results arrive as PR comments and inline findings.
Not a replacement for Claude Code Security
Code Review includes lighter security checks. Claude Code Security is positioned for deeper security analysis. For compliance-heavy environments, Code Review should be treated as complementary.
No native GitLab/Bitbucket support yet
Native integration is GitHub-only. GitLab users can run Claude Code inside their own CI/CD pipelines, but without the managed multi-agent review service.
Zero Data Retention blocks access
Organizations with ZDR enabled cannot use Code Review because review execution requires processing code on Anthropic infrastructure.
Research preview implies moving behavior
The product is very new. The sub-1% incorrect-finding metric is promising but based on internal Anthropic data and may not generalize across stacks and domains.
Claude Code Review vs Claude Code GitHub Action
For teams already using Claude Code, this is the core comparison:
| Criteria | Code Review (new) | GitHub Action (existing) |
|---|---|---|
| Infrastructure | Anthropic managed | Self-hosted in your CI |
| Architecture | Multi-agent parallel | Single agent |
| Cost | $15-25/review | Direct tokens (cheaper) |
| Customization | CLAUDE.md + REVIEW.md | CLAUDE.md + custom prompt |
| Setup | Admin UI + GitHub App | Workflow YAML |
| Context | Full codebase | Configurable |
| Open source | No | Yes |
| ZDR compatibility | No | Depends on config |
The primary tradeoff is depth versus cost. For routine low-risk PRs, the GitHub Action is usually cheaper. For high-stakes PRs where one miss is expensive, Code Review can justify itself.
FAQ
Is Claude Code Review available to individual Pro users?
No. It is limited to Team and Enterprise plans.
Can agents auto-approve or auto-block PRs?
No. Findings are severity-labeled comments; merge decisions remain human.
Does it work with monorepos?
Public docs do not define explicit monorepo limits. CLAUDE.md can still provide monorepo context for better guidance.
Can I trigger reviews only above a PR-size threshold?
Current UI does not expose size-based filtering. Available control is repository-level enablement.
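Teams that want size-based gating today can approximate it with the self-hosted GitHub Action: compute the diff size in a CI step and skip the review when it falls under a threshold. The helper below is a minimal sketch; the 50-line threshold and the CI wiring are choices you make, not product features.

```python
def should_review(additions, deletions, min_changed_lines=50):
    """Gate an automated review on total changed lines in the PR."""
    return additions + deletions >= min_changed_lines

# additions/deletions are available in the pull_request webhook payload
# that GitHub sends to CI.
print(should_review(additions=12, deletions=3))    # False: tiny PR, skip
print(should_review(additions=800, deletions=340)) # True: large PR, review
```

Given the size-vs-findings table above (31% of small PRs yield findings vs 84% of large ones), skipping sub-threshold PRs is where most of the savings lie.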
What about organizations with Zero Data Retention?
Code Review is unavailable. The practical alternative is running Claude Code GitHub Action on your own infrastructure.
Summary
| Criteria | Assessment |
|---|---|
| Maturity | Research preview; calibrate before scaling |
| Best fit | Enterprise teams with high PR volume on critical repos |
| Avoid for now if | ZDR required, GitLab-only workflows, or tight AI budget |
| Real differentiator | Multi-agent verification + full-codebase context |
| Primary risk | Cost scales with volume; ~20 min latency shifts workflow |
| Lower-cost alternative | Claude Code GitHub Action (open source, self-hosted) |
Sources: Anthropic Blog - Claude Code Docs - TechCrunch - The New Stack - Help Net Security - DEV.to - VentureBeat - TrueNAS PR on GitHub