· guide ·

How to set up Claude Code PR review in 2026 — 3 options, real tradeoffs

// figure Pull request review pipeline

// FILED Claude Code// SOURCE Septim Labs// PERMALINK /blog/claude-code-pr-review-setup-2026.htmlcite this →

Published April 20, 2026 · by Septim Labs

Short version: Claude Code can review your pull requests at three different points in the development loop — locally (you invoke it), in CI (GitHub Actions invokes it), or pre-commit (git invokes it). Each one catches different bugs, costs different amounts of Anthropic credits, and has different latency.

Most guides pick one and pretend it’s the answer. It isn’t. This piece covers all three setups honestly and shows which one is right for which team size.

The three setup patterns

Pattern

Who triggers it

Cost per PR

Honest limit

1. Local / Claude Code skills

You, manually

~$0.05 – $0.15

Only reviews what you ask it to

2. CI / GitHub Actions

Every push to PR

~$0.10 – $0.30

Slow feedback (3-5 min CI lag)

3. Pre-commit hook

git commit

~$0.01 – $0.03

Only sees staged diff, not full PR

Option 1: Local review via Claude Code skills

Cheapest to set up, slowest to run across a team. You open Claude Code, type /pr-review (or equivalent), Claude reads the diff and posts findings.

Setup

Install Claude Code (if you haven’t): npm i -g @anthropic-ai/claude-code
Create a pr-review skill at ~/.claude/skills/pr-review/SKILL.md
The skill should instruct Claude to run git diff origin/main...HEAD, classify each hunk into {bug, style, test-gap, doc-gap, nit}, and return a structured review.
Invoke with /pr-review before you open the PR on GitHub.

When this wins

Solo developer or pair — you’re the reviewer, Claude is the pre-reviewer.
Pre-PR self-review. Catches the "I shouldn’t have committed this" mistakes before a human sees them.
Offline-friendly: you don’t need CI, you don’t need a GitHub Token. Just Claude Code.

When this loses

Teams of 3+. You forget to run it. Someone else opens the PR and skips it entirely.
You need the review history to exist somewhere other than your terminal.

For a ready-built skill, 3 of our Drills skills are free on GitHub (including a basic PR-review skill). The full 25-skill pack is Septim Drills, $29 one-time.

Option 2: GitHub Actions-triggered review

A workflow in .github/workflows/ runs Claude against the PR diff on every push. Review comments auto-post as a bot comment on the PR.

Setup

Add an Anthropic API key to your repo secrets as ANTHROPIC_API_KEY.
Create .github/workflows/pr-review.yml that triggers on pull_request events.
In the workflow: fetch the diff, call the Anthropic API with a structured review prompt, post the response as a PR comment via gh pr comment.
Pin the Anthropic SDK version in the workflow file (don’t use latest; bump deliberately).

The honest limit

CI review has a 3-5 minute latency from push to comment. That’s fine for end-of-day review. It is not fine if you want Claude to block your merge before you’ve switched contexts. Also, CI runs on every push — which means 10 pushes to a single PR = 10 Claude calls = ~$1-3 per PR if you’re not careful.

Two guardrails worth adding:

Rate-limit to "first push per 10 minutes" so you don’t pay for every typo-fix.
Scope by changed-file path — if only docs/** changed, skip security review, save tokens.

Our Septim Flint pack ($29) ships 5 pre-built GitHub Actions workflows with both guardrails baked in: pr-review, security-triage, migration-safety, test-gaps, and dependency-triage. Everything runs self-hosted via your own API key — no proxy, no vendor.

Option 3: Pre-commit hook review

Runs locally, at git commit time, against your staged diff. Fastest feedback, but only sees the subset of code that’s actually being committed.

Setup

Install pre-commit: pip install pre-commit
Create .pre-commit-config.yaml with a local hook that runs git diff --cached through a Claude-calling script.
Run pre-commit install once per clone to register the hook.
Budget the hook to 4 seconds max — if Claude doesn’t respond fast, skip the check silently. A 30-second pre-commit hook is a hook developers disable.

What this catches that CI misses

Secrets/API keys accidentally staged (one of the fastest-growing PR-leak classes).
Typos in identifiers (getUesrById vs getUserById) — Claude is good at these, linters aren’t.
Stale test expectations (expect(result).toBe(3) when the function now returns 4).

What this misses that CI catches

Anything spanning multiple files (because the commit might only stage one).
Test coverage regressions (the pre-commit scope is too narrow to compute coverage).
Migrations / schema changes that need the full PR context.

If you want this pre-built with all three hooks + a budget-limit + a bypass flag, Septim Tether ($19) is the package. Three hooks, POSIX shell plus inline Python, stdlib only. No dependencies beyond pre-commit itself.

The combination that actually works

Most teams we see ship a layered setup:

Pre-commit catches the cheap, fast stuff (typos, secrets) before it leaves your laptop.
GitHub Actions catches the expensive stuff (architecture, security, test-gap) on the PR itself — async, doesn’t block your flow.
Local Claude Code runs as the dev’s own pre-review pass before opening the PR in the first place.

Running all three layers costs about $0.30-0.50 per PR in API credits, catches roughly 2-3x more issues than any single layer, and doesn’t slow down the dev cycle.

Want the whole review stack pre-built?

We bundle all three layers: 25 Claude Code skills (Drills, local loop), 5 PR-review GitHub Actions (Flint, CI loop), and 3 pre-commit hooks (Tether, pre-commit loop) as the Septim Review Stack for $59, one-time. Individual price: $77. Three private GitHub repo invites delivered to your email in <5 minutes after checkout.

See Review Stack →

The common mistakes

Using claude-4-5-sonnet when Haiku would do. For pre-commit typo checks, Haiku is 10x cheaper and 3x faster with no quality loss.
Running the review on every push. Debounce by 10 minutes. Your wallet will thank you.
Posting Claude’s review as a blocking comment. Don’t. Post it as a "suggestions" comment. A blocking Claude review creates merge friction that erodes team trust in 2-3 weeks.
Not pinning the prompt. If you change the review prompt mid-week, half the team sees one review style and half sees another. Check the prompt into the repo.

FAQ

Does Claude see our whole codebase?

Only what you send it. In all three patterns, you control the context window: the diff, optionally the files the diff touches, and nothing else. Claude doesn’t crawl the repo unless you explicitly pipe more content.

What if my repo is private and I can’t send code to Anthropic?

Then you need an on-prem model. None of these patterns work. Claude Code does support Bedrock/Vertex endpoints, but if your policy is "no code leaves our VPC" you want a local-model setup, not Claude.

Is this reliable enough to actually block merges?

For most teams: no, don’t block on AI review yet. Use it as "suggestion" output. The false-positive rate on Claude 4.7 Sonnet is ~5-8% depending on codebase complexity — low enough to be useful, high enough to create merge friction if you make it blocking.

Can I use this with GitLab/Bitbucket?

Yes for all three patterns, but only Option 1 and Option 3 are drop-in — Option 2 (GitHub Actions) needs a GitLab CI or Bitbucket Pipelines equivalent. The logic is identical; only the YAML differs.

A real session, with numbers

Here is what the GitHub Actions option actually costs on a mid-size codebase. Team of four engineers, roughly 18 PRs per week, average diff of 320 lines.

// Monthly CI review cost — 18 PRs/week, Sonnet 4.5

Per PR:
  System prompt + diff:   ~8,000 tokens input
  Review output:          ~1,500 tokens output
  Input cost:   8,000 / 1M × $3.00  = $0.024
  Output cost:  1,500 / 1M × $15.00 = $0.023
  Per PR total:                        $0.047

Monthly (18 PRs × ~4.3 weeks):        ~77 PRs
Monthly cost (no debounce):            $3.62

With 10-minute debounce (cuts by ~60%):
  Effective PRs billed:   ~31
  Monthly cost:           $1.46

The debounce rule alone drops the monthly cost from $3.62 to $1.46. On a team spending $15-40/month on GitHub Actions compute, the Claude review cost is already a rounding error. What costs more than the API is the engineering time to set it up and maintain the prompt as the codebase evolves — which is the actual argument for pre-built workflows over DIY.

Writing a review prompt that doesn’t hallucinate

The biggest failure mode in production Claude PR review is false positives — Claude flagging something as a bug that is intentional behavior the reviewer knows about. This erodes team trust fast. The fix is a structured output format with explicit "refuses to" constraints.

The review prompt that produces the lowest false-positive rate across our testing:

You are reviewing a pull request diff.

## What to check
1. Logic correctness — walk every non-trivial branch
2. Test coverage gaps — functions added without test
3. Security surface — input validation, auth checks, injection vectors
4. API contract breaks — changed signatures, removed exports
5. Obvious typos in identifiers

## Output format
One finding per line:
[severity] file:line — finding in one sentence. Suggested fix in one sentence.

Severities: BLOCKER / MAJOR / MINOR / NIT

## Refuses to
- Flag intentional patterns (e.g., console.log in dev-only files)
- Propose changes to files not in the diff
- Mark security issues without a specific file:line reference
- Approve the PR without at least 3 findings if the diff is over 50 lines
- Generate findings it cannot support with a quote from the diff

## If no issues found
Output: "// No blockers. X findings total." followed by the findings list.

The "Refuses to" section is doing the work here. Without it, Claude defaults to helpfulness — which in review context means offering suggestions about files it cannot see, flagging stylistic preferences as bugs, and varying its output structure per session. The constraints narrow the output to what’s actionable and falsifiable.

Step-by-step setup for the GitHub Actions option

Add ANTHROPIC_API_KEY to your repo secrets (Settings → Secrets → Actions).
Create .github/workflows/pr-review.yml:

name: Claude PR Review
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Get diff
        id: diff
        run: |
          git diff origin/${{ github.base_ref }}...HEAD > /tmp/pr.diff
          echo "lines=$(wc -l < /tmp/pr.diff)" >> $GITHUB_OUTPUT
      - name: Skip if diff is empty
        if: steps.diff.outputs.lines == '0'
        run: echo "No diff. Skipping review." && exit 0
      - name: Run Claude review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          pip install anthropic --quiet
          python .github/scripts/claude-review.py /tmp/pr.diff > /tmp/review.md
      - name: Post review comment
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const review = fs.readFileSync('/tmp/review.md', 'utf8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '**Claude Code Review**\n\n' + review
            });

Create .github/scripts/claude-review.py with the Anthropic SDK call using the prompt above.
Pin the model version in the script: model="claude-sonnet-4-5" rather than an alias. Bump deliberately.
Add the 10-minute debounce by checking the last workflow run timestamp before calling Claude.

The layered setup checklist

Before going live, verify each layer is configured correctly:

Pre-commit: pre-commit install run in every developer’s local clone (add to onboarding docs).
CI: workflow file checked into the repo, not just added to one branch.
Local skill: skill file at ~/.claude/skills/pr-review/SKILL.md on each developer’s machine.
Budget ceiling: token cap defined somewhere. Even a comment in the review prompt saying "exit if diff is over 2,000 lines" prevents edge-case runaway.
Review output stored: CI reviews are auto-posted as PR comments and therefore versioned by GitHub. Local reviews should pipe to a log file, not just the terminal.

TL;DR

Solo dev: just use Option 1 (local Claude Code skills). Start there.

Small team (2-10): Option 1 + Option 3 (local + pre-commit). Add Option 2 when you hit 10 PRs/week.

Team of 10+: all three layers. This is where Review Stack is the right-size buy.

If you want the whole setup handed to you as three private GitHub repos you can clone and run today, Septim Review Stack is $59 one-time. Otherwise, all three patterns are straightforward enough to build from scratch in a weekend.

Septim Flint — $29 · 5 GitHub Actions for auto Claude code review

Five pre-built Actions: PR review, security triage, migration safety, test-gap scan, and dependency audit. Your Anthropic key, self-hosted, no proxy. Debounce and per-file-path scoping already configured. Pay once, private GitHub repo invite. Free if you own Septim Drills.

Get Septim Flint — $29 →