gh pr-review: LLM-friendly PR review workflows in your CLI

LLM agents struggle with PR review workflows when the context is noisy, fragmented, and spread across multiple tool calls. Raw API responses include optional fields, URLs, hashes, and heterogeneous shapes. Each extra field adds tokens and compounds the risk of errors. The result: brittle chains, ambiguous parsing, higher cost, and slower iterations.

This post introduces gh pr-review, a GitHub CLI extension that collapses multi-step review tasks into compact, deterministic outputs. It’s designed so that agents and developers can read review context, reply to and resolve threads, and submit reviews with minimal JSON and GraphQL-only operations.

Problem

GitHub’s gh api exposes powerful primitives, but everyday review tasks become multi-flag GraphQL invocations with jq post-processing. You assemble queries, handle pagination, normalize shapes, and deduplicate results. For agents, this increases chain length and error accumulation. For developers, it increases time-to-result.

Motivation

We needed a higher-level interface that:

Returns exactly the meaningful context agents need.
Avoids low-signal fields to keep prompts small.
Collapses multi-call chains into a single, well-scoped command.
Produces deterministic JSON for reliable parsing.

gh pr-review implements that interface as a first-class extension to gh.

Constraints

Agents have limited context windows; payloads must be small and targeted.
Deterministic, stable JSON is essential for robust parsing between non-deterministic agents and deterministic tools.
Fewer moving parts reduce failure modes: avoid mixing REST calls into GraphQL workflows, and use GraphQL node IDs rather than numeric IDs.
Outputs should include only what agents need to decide and act; omit low-signal fields.
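
Deterministic output also makes it cheap for a caller to tell whether review context changed between two runs. The following is a minimal Python sketch of that idea, not part of gh pr-review: the helper names canonical and fingerprint are hypothetical, and the sample values are illustrative.

```python
import hashlib
import json


def canonical(payload: dict) -> str:
    """Serialize with sorted keys and fixed separators so equal data yields equal text."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def fingerprint(payload: dict) -> str:
    """Short hash of the canonical form, usable for caching or change detection."""
    return hashlib.sha256(canonical(payload).encode("utf-8")).hexdigest()[:12]


# Same data, different key order (illustrative values only).
a = {"pull_request_number": 42, "threads": [{"id": "PRRT_123456", "resolved": False}]}
b = {"threads": [{"resolved": False, "id": "PRRT_123456"}], "pull_request_number": 42}

# Key order does not affect the fingerprint, so this prints True.
print(fingerprint(a) == fingerprint(b))
```

When the tool itself already emits stable field names and ordering, the canonicalization step becomes a safety net rather than a requirement.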

Design details for LLMs

Agents are only as powerful as the tools we give them: a good tool definition should define a clear, narrow purpose, return exactly the meaningful context the agent needs, and avoid burdening the model with low-signal intermediate results. In many tasks, wrapping the necessary functionality into a single well-designed command (rather than chaining multiple tool calls) reduces token overhead, diminishes error accumulation, and lets the agent focus on solving the user’s intent rather than stitching together fragmented results. — Anthropic: Writing tools for agents

We align our design to Anthropic’s guidance:

Single-command aggregation: “Tools can consolidate functionality, handling potentially multiple discrete operations (or API calls) under the hood.” The review view command returns reviews, threads, and comments in one call, with filters, so agents avoid multi-call chains.
Minimal payloads: “Tool implementations should take care to return only high signal information back to agents.” We omit optional fields, URLs, and hashes; we produce deterministic JSON with stable field names and ordering.
Token efficiency: “We suggest implementing some combination of pagination, range selection, filtering, and/or truncation…” We support --tail N and server-side filters for reviewer, states, and unresolved threads.
Clear contract: Tools bridge deterministic/non-deterministic systems; predictable outputs reduce ambiguity and agent failure modes.
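
To make the guidance concrete, here is a sketch of how an agent framework might describe the review view command as a single tool. The tool name, the descriptions, and the name/description/input_schema layout are assumptions chosen for illustration; only the four filters (reviewer, states, unresolved, tail) come from this post.

```python
# Hypothetical tool definition: one narrow purpose, a small typed input,
# and no knobs beyond the filters the command already exposes.
REVIEW_VIEW_TOOL = {
    "name": "pr_review_view",  # assumed name, not defined by gh pr-review
    "description": (
        "Return reviews, threads, and comments for one pull request "
        "as compact JSON."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "repo": {"type": "string", "description": "Repository as owner/repo"},
            "number": {"type": "integer", "description": "Pull request number"},
            "reviewer": {"type": "string", "description": "Filter by reviewer login"},
            "states": {"type": "string", "description": "Review state filter, e.g. CHANGES_REQUESTED"},
            "unresolved": {"type": "boolean", "description": "Only unresolved threads"},
            "tail": {"type": "integer", "minimum": 1, "description": "Keep only the last N comments"},
        },
        "required": ["repo", "number"],
    },
}
```

Keeping the optional filters out of the required list lets the agent start with a broad query and narrow it only when the payload is too large.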

Design principles

Purpose-built interface: commands map directly to review tasks (view, reply, resolve, start/add/submit).
Filter-first retrieval: server-side filters return only relevant review context (--reviewer, --states, --unresolved, --tail).
Compact outputs: minimal, deterministic JSON; omit fields that do not affect decisions; include IDs only when action is required.
GraphQL-only: one data model avoids endpoint drift and shape mismatches.
Namespacing and ergonomics: subcommands follow natural subdivisions and gh conventions.

Implementation

gh pr-review consolidates PR review workflows:

View filtered review context: reviews, threads, and comments in one compact JSON.
Reply to threads and resolve them as you address feedback.
Start, add comments to, and submit reviews via GraphQL-only operations.
Deterministic output order and minimal field sets tuned for agent prompts.
Prebuilt linux-arm64 assets for agent/dev environments on ARM (for example, Graviton instances or Linux containers on Apple Silicon).

Examples

View context: unresolved threads, last two comments, specific reviewer, include comment node IDs.

gh pr-review review view 42 -R owner/repo \
  --reviewer octocat \
  --states CHANGES_REQUESTED \
  --unresolved \
  --tail 2 \
  --include-comment-node-id
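
For agent code that calls this command programmatically, a thin wrapper can build the argument list from typed filters and parse the result. This is a sketch under assumptions: the function names are hypothetical, it assumes gh and the extension are installed and authenticated, and it assumes the command prints its JSON to stdout.

```python
import json
import subprocess
from typing import List, Optional


def review_view_argv(
    number: int,
    repo: str,
    reviewer: Optional[str] = None,
    states: Optional[str] = None,
    unresolved: bool = False,
    tail: Optional[int] = None,
    include_comment_node_id: bool = False,
) -> List[str]:
    """Build the argument list for `gh pr-review review view` from typed filters."""
    argv = ["gh", "pr-review", "review", "view", str(number), "-R", repo]
    if reviewer:
        argv += ["--reviewer", reviewer]
    if states:
        argv += ["--states", states]  # passed through verbatim, e.g. "CHANGES_REQUESTED"
    if unresolved:
        argv.append("--unresolved")
    if tail is not None:
        argv += ["--tail", str(tail)]
    if include_comment_node_id:
        argv.append("--include-comment-node-id")
    return argv


def review_view(**filters) -> dict:
    """Run the command and parse stdout as JSON (assumption: JSON is written to stdout)."""
    result = subprocess.run(
        review_view_argv(**filters), check=True, capture_output=True, text=True
    )
    return json.loads(result.stdout)


# Building the arguments needs no network access, so it is easy to inspect or log.
print(review_view_argv(42, "owner/repo", reviewer="octocat", unresolved=True, tail=2))
```

Separating argument construction from execution keeps the part an agent controls (the filters) easy to validate before anything touches the network.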

Reply to a thread:

gh pr-review comments reply 42 -R owner/repo \
  --thread-id PRRT_123456 \
  --body "Thanks! We'll update tests."

Resolve a thread:

gh pr-review threads resolve 42 -R owner/repo \
  --thread-id PRRT_123456
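
Replying and resolving usually happen together once feedback is addressed. The sketch below pairs the two commands for one thread; reply_and_resolve_argvs is a hypothetical helper, and the flags are the ones shown in the examples above.

```python
from typing import List


def reply_and_resolve_argvs(number: int, repo: str, thread_id: str, body: str) -> List[List[str]]:
    """Return the reply command followed by the resolve command for one thread."""
    target = [str(number), "-R", repo, "--thread-id", thread_id]
    reply = ["gh", "pr-review", "comments", "reply"] + target + ["--body", body]
    resolve = ["gh", "pr-review", "threads", "resolve"] + target
    return [reply, resolve]


for argv in reply_and_resolve_argvs(42, "owner/repo", "PRRT_123456", "Thanks! We'll update tests."):
    # Each list can be passed to subprocess.run(argv, check=True) as-is.
    print(argv)
```

Passing argument lists rather than a shell string means the reply body needs no quoting or escaping, which matters when the text is model-generated.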

Start a review on a commit, add a comment, then submit:

# Start a pending review
gh pr-review review --start 42 -R owner/repo

# Optional: pin to a specific commit (defaults to current head)
# gh pr-review review --start 42 -R owner/repo --commit <sha>

# Add a review comment (file/line args vary by diff context)
gh pr-review review --add-comment 42 -R owner/repo \
  --path src/foo/bar.ts --line 128 \
  --body "Consider extracting to a helper."

# Submit the review (COMMENT | APPROVE | REQUEST_CHANGES)
gh pr-review review --submit 42 -R owner/repo \
  --event COMMENT --body "Looks good overall."
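
The start, add-comment, submit sequence can be planned up front so an agent validates its inputs before running anything. A minimal sketch, assuming the flags shown above; review_plan and VALID_EVENTS are hypothetical names introduced for illustration.

```python
from typing import Iterable, List, Tuple

# Events listed in the submit example above.
VALID_EVENTS = ("COMMENT", "APPROVE", "REQUEST_CHANGES")


def review_plan(
    number: int,
    repo: str,
    comments: Iterable[Tuple[str, int, str]],
    event: str,
    body: str,
) -> List[List[str]]:
    """Return the ordered commands: start, one add-comment per item, then submit."""
    if event not in VALID_EVENTS:
        raise ValueError(f"unsupported review event: {event}")
    base = ["gh", "pr-review", "review"]
    target = [str(number), "-R", repo]
    plan = [base + ["--start"] + target]
    for path, line, text in comments:
        plan.append(
            base + ["--add-comment"] + target
            + ["--path", path, "--line", str(line), "--body", text]
        )
    plan.append(base + ["--submit"] + target + ["--event", event, "--body", body])
    return plan


plan = review_plan(
    42,
    "owner/repo",
    [("src/foo/bar.ts", 128, "Consider extracting to a helper.")],
    "COMMENT",
    "Looks good overall.",
)
print(len(plan))  # 3 commands: start, add-comment, submit
```

Rejecting an unknown event before the first command runs avoids leaving a pending review behind when the final step would have failed.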

Minimal deterministic JSON (example output)

{
  "pull_request_number": 42,
  "filters": {
    "reviewer": "octocat",
    "states": ["CHANGES_REQUESTED"],
    "unresolved": true,
    "tail": 2,
    "include_comment_node_id": true
  },
  "reviews": [
    { "author": "octocat", "state": "CHANGES_REQUESTED" }
  ],
  "threads": [
    {
      "id": "PRRT_123456",
      "resolved": false,
      "comments": [
        { "author": "octocat", "body": "Nit: naming is inconsistent.", "node_id": "RVWC_abc123" },
        { "author": "octocat", "body": "Please update the tests.", "node_id": "RVWC_def456" }
      ]
    }
  ]
}
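
Because the payload is small and its field names are stable, consuming it takes only a few lines. The sketch below parses a trimmed copy of the example output above; the Thread class and unresolved_threads helper are hypothetical, and treating the final list entry as the most recent comment is an assumption this post does not state.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Thread:
    id: str
    final_comment: str  # assumption: comments are listed oldest first


def unresolved_threads(payload: dict) -> List[Thread]:
    """Collect unresolved threads with the body of their final listed comment."""
    found = []
    for thread in payload.get("threads", []):
        if thread.get("resolved"):
            continue
        comments = thread.get("comments", [])
        final = comments[-1]["body"] if comments else ""
        found.append(Thread(id=thread["id"], final_comment=final))
    return found


# Trimmed copy of the example output shown above.
SAMPLE = json.loads("""
{
  "pull_request_number": 42,
  "threads": [
    {
      "id": "PRRT_123456",
      "resolved": false,
      "comments": [
        { "author": "octocat", "body": "Nit: naming is inconsistent.", "node_id": "RVWC_abc123" },
        { "author": "octocat", "body": "Please update the tests.", "node_id": "RVWC_def456" }
      ]
    }
  ]
}
""")

for thread in unresolved_threads(SAMPLE):
    print(thread.id, "->", thread.final_comment)
```

The thread IDs collected here are the values the reply and resolve commands expect in --thread-id.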

Before vs. After

Before: chained gh api + jq for unresolved threads with reviewer filters and tail(2):

# The reviewer login is passed to jq, not to GraphQL: a variable that is
# declared in the query but never used fails GraphQL validation.
gh api graphql \
  -f query='
    query($owner:String!, $repo:String!, $number:Int!) {
      repository(owner:$owner, name:$repo) {
        pullRequest(number:$number) {
          reviewThreads(first:100) {
            nodes {
              id
              isResolved
              comments(last:10) {
                nodes { id author { login } body }
              }
            }
          }
          reviews(first:100, states:[CHANGES_REQUESTED]) {
            nodes { author { login } state }
          }
        }
      }
    }' \
  -F owner=owner -F repo=repo -F number=42 \
| jq --arg login octocat '
  .data.repository.pullRequest
  | {
      # Keep the thread id so each thread can still be replied to or resolved.
      threads: (
        .reviewThreads.nodes
        | map(select(.isResolved | not)
          | {
              id,
              comments: (.comments.nodes
                | map(select(.author.login == $login))
                | .[-2:])
            })
      ),
      reviews: (.reviews.nodes | map(select(.author.login == $login)))
    }'

After: single aggregation with server-side filters and compact JSON:

gh pr-review review view 42 -R owner/repo \
  --reviewer octocat --states CHANGES_REQUESTED \
  --unresolved --tail 2 --include-comment-node-id

Caveats

GraphQL-only: no REST fallback. This avoids shape drift but may require field updates if the GraphQL schema changes.
Pagination and limits: large PRs may require multiple invocations; --tail N restricts comment payload size deterministically.
Reviewer matching: filters rely on author login; team-based filters are not yet supported.
Commit scoping: --commit <sha> pins review context; ensure the SHA matches the PR’s head or appropriate commit range.
Platform/auth: requires gh auth login; prebuilt linux-arm64 binaries are available; on other platforms you may need to build from source.

Results

Token reduction: compact JSON eliminates optional fields, URLs, and hashes; typical agent prompts shrink substantially compared to raw gh api outputs.
Reliability gains: fewer tool calls and a single deterministic payload reduce failure modes (less jq, fewer partial results).
Latency improvements: server-side filtering cuts client processing and network round-trips; faster end-to-end review automation.
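
To check the token reduction on your own pull requests, you can compare serialized payload sizes. The sketch below uses character count as a rough proxy for tokens, which is an assumption; the helper names are hypothetical, and the two sample objects are made up purely to exercise the function, not measured results.

```python
import json


def payload_chars(payload) -> int:
    """Length of the compact JSON serialization, a rough stand-in for token count."""
    return len(json.dumps(payload, separators=(",", ":")))


def reduction(raw, compact) -> float:
    """Fraction of characters saved by the compact payload relative to the raw one."""
    raw_size = payload_chars(raw)
    return 1 - payload_chars(compact) / raw_size if raw_size else 0.0


# Made-up comment shapes for illustration; substitute captured outputs from
# `gh api graphql` and `gh pr-review review view` to measure a real PR.
raw_comment = {
    "id": "RVWC_abc123",
    "databaseId": 1,
    "url": "https://example.invalid/comment/1",
    "createdAt": "2024-01-01T00:00:00Z",
    "author": {"login": "octocat"},
    "body": "Please update the tests.",
}
compact_comment = {"author": "octocat", "body": "Please update the tests."}

print(f"{reduction(raw_comment, compact_comment):.0%} fewer characters")
```

For a closer estimate, replace payload_chars with the tokenizer your model actually uses.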

Installation

gh extension install agynio/gh-pr-review

References

Extension repo: https://github.com/agynio/gh-pr-review
GitHub CLI extensions docs: https://cli.github.com/manual/gh_extension
Anthropic: Writing effective tools for agents (using AI agents): https://www.anthropic.com/engineering/writing-tools-for-agents

Call to action

Try the examples above on a live PR. Start with review view filters to generate a compact context payload, reply to a thread, resolve it, and submit your review (all from gh). If you build agents, adopt deterministic JSON and single-command aggregation in your tool definitions. Contributions and feedback welcome via the extension repo.