How I Taught GitHub Copilot Code Review to Think Like a Maintainer

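Say what you want about vibe coding, but it has been significant for the open source community. Contributing to an unfamiliar codebase used to be intimidating, which meant maintainers struggled to get help from the community no matter how popular the project was. Now that AI coding tools exist, the barrier to contributing is much lower. In fact, goose, our open source AI agent framework built in Rust, has the opposite problem: we receive so many contributions that it is hard to keep up! That is a good problem to have, and we want contributors to have a good experience, but we cannot review everything on our own. Fortunately, GitHub already has a Copilot code review agent that can review every PR as it comes in.

When I turned it on, I assumed everyone would love it. Honestly, it did not go well. The other maintainers said the comments were too numerous and too noisy, and that most of them added little value. They asked whether we could turn it off.

Here is what I have learned from helping engineers work with AI: you don't give up at the first bad result and disable the tool. You keep tuning it, teaching the model how you expect it to behave instead of hoping for the best.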

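When I evaluated a sample of its comments, the problems were fairly consistent:

* The comments were long, and there were a lot of them.
* Too many comments hedged with "maybe" and "consider", which signaled low confidence.
* Only about one in five comments was a genuinely valuable finding that the contributor might have missed.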

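I don't blame Copilot. How could it know what we care about? We never told it! Fortunately, there is a way to tell it.

Copilot supports custom instructions through a `.github/copilot-instructions.md` file. That file is where I spelled out exactly how we want it to behave.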

Review Philosophy

I started by teaching Copilot the same principles we expect human reviewers to follow.

## Review Philosophy

* Only comment when you have HIGH CONFIDENCE (>80%) that an issue exists
* Be concise: one sentence per comment when possible
* Focus on actionable feedback, not observations
* When reviewing text, only comment on clarity issues if the text is genuinely confusing or could lead to errors.

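This cut the noise immediately. The speculation stopped and was replaced by clear, confident feedback.

Priority Areas

Next, I told it explicitly what to prioritize. These are the areas we actually care about in a review. Again, how would Copilot know if I never gave it this context?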

## Priority Areas (Review These)

### Security & Safety

* Unsafe code blocks without justification
* Command injection risks (shell commands, user input)
* Path traversal vulnerabilities
* Credential exposure or hardcoded secrets
* Missing input validation on external data
* Improper error handling that could leak sensitive info

### Correctness Issues

* Logic errors that could cause panics or incorrect behavior
* Race conditions in async code
* Resource leaks (files, connections, memory)
* Off-by-one errors or boundary conditions
* Incorrect error propagation (using `unwrap()` inappropriately)
* Optional types that don’t need to be optional
* Booleans that should default to false but are set as optional
* Error context that doesn’t add useful information
* Overly defensive code with unnecessary checks
* Unnecessary comments that restate obvious code behavior

### Architecture & Patterns

* Code that violates existing patterns in the codebase
* Missing error handling (should use `anyhow::Result`)
* Async/await misuse or blocking operations in async contexts
* Improper trait implementations

With this list in place, Copilot stopped nitpicking and started catching real problems.
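To make two of the Correctness items concrete ("Optional types that don't need to be optional" and "Booleans that should default to false but are set as optional"), here is a minimal sketch of the shape a reviewer would flag next to the shape they would prefer. The `LooseConfig` and `Config` types and the `verbose` field are hypothetical names invented for illustration; they are not taken from the goose codebase.

```rust
/// Shape a reviewer would flag (hypothetical type): `verbose` is an
/// `Option<bool>`, so every caller has to decide what `None` means,
/// even though it can only sensibly mean "off".
#[derive(Debug, Default)]
pub struct LooseConfig {
    pub verbose: Option<bool>,
}

/// Preferred shape (hypothetical type): a plain `bool`.
/// `#[derive(Default)]` already yields `false`, so no `Option` is needed.
#[derive(Debug, Default, PartialEq)]
pub struct Config {
    pub verbose: bool,
}

impl From<LooseConfig> for Config {
    /// Collapse the optional flag once, at the boundary, instead of
    /// scattering `unwrap_or(false)` across the codebase.
    fn from(loose: LooseConfig) -> Self {
        Config {
            verbose: loose.verbose.unwrap_or(false),
        }
    }
}
```

With the plain `bool`, `Config::default()` already has `verbose` set to `false`, and callers never have to handle a third "unset" state.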

Project-Specific Context

Copilot does not automatically know your setup. You have to tell it what kind of project it is reviewing.

## Project-Specific Context

* This is a Rust project using cargo workspaces
* Core crates: `goose`, `goose-cli`, `goose-server`, `goose-mcp`
* Error handling: Use `anyhow::Result`, not `unwrap()` in production
* Async runtime: tokio
* See HOWTOAI.md for AI-assisted code standards
* MCP protocol implementations require extra scrutiny

This context helps it understand our architecture and the patterns that matter most.
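As an illustration of the error-handling rule in that context block, the sketch below contrasts an `unwrap()` that panics on bad input with the same logic propagating the error through `?`. The instructions call for `anyhow::Result`; this example assumes only the standard library so that it compiles without extra crates, and the `parse_port` functions are hypothetical helpers rather than goose code.

```rust
use std::num::ParseIntError;

/// The pattern the instructions steer away from (hypothetical helper):
/// `unwrap()` turns any malformed input into a panic.
pub fn parse_port_unchecked(raw: &str) -> u16 {
    raw.trim().parse::<u16>().unwrap()
}

/// The preferred pattern (hypothetical helper): return a `Result` and let
/// `?` hand the failure back to the caller. In a project that depends on
/// `anyhow`, the return type would be `anyhow::Result<u16>` instead.
pub fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
    let port = raw.trim().parse::<u16>()?;
    Ok(port)
}
```

A caller of `parse_port` can decide whether to retry, fall back to a default, or report the error, which is the choice `unwrap()` takes away.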

CI Pipeline Context

Copilot reviews a PR before CI finishes, so without context it comments on things CI already checks. I added this section so it knows what is already covered.

## CI Pipeline Context

**Important**: You review PRs immediately, before CI completes. Do not flag issues that CI will catch.

### What Our CI Checks (`.github/workflows/ci.yml`)

**Rust checks:**

* cargo fmt --check
* cargo test --jobs 2
* ./scripts/clippy-lint.sh
* just check-openapi-schema

**Desktop app checks:**

* npm ci
* npm run lint:check
* npm run test:run

**Setup steps CI performs:**

* Installs system dependencies
* Activates hermit environment
* Caches Cargo and npm deps
* Runs npm ci before scripts

**Key insight**: Commands like `npx` check local node_modules first. Don't flag these as broken unless CI wouldn't handle it.

Skip These

The next section is crucial: I told it what not to bother us with.

## Skip These (Low Value)

Do not comment on:

* Style/formatting (rustfmt, prettier)
* Clippy warnings
* Test failures
* Missing dependencies (npm ci covers this)
* Minor naming suggestions
* Suggestions to add comments
* Refactoring unless addressing a real bug
* Multiple issues in one comment
* Logging suggestions unless security-related
* Pedantic text accuracy unless it affects meaning

Response Format

To fix the verbosity, I gave its comments a structure.

## Response Format

1. State the problem (1 sentence)
2. Why it matters (1 sentence, if needed)
3. Suggested fix (snippet or specific action)

Example:
This could panic if the vector is empty. Consider using `.get(0)` or adding a length check.
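The example comment in that format maps onto a small, common Rust fix. The sketch below shows what the flagged code and the suggested change might look like; the function names are hypothetical. It uses `first()`, which behaves the same as `.get(0)` on a slice.

```rust
/// Flagged version (hypothetical helper): indexing with `[0]` panics
/// when the slice is empty.
pub fn first_len_panicking(items: &[String]) -> usize {
    items[0].len()
}

/// Suggested fix (hypothetical helper): `first()` returns `None` for an
/// empty slice, so the caller handles the empty case explicitly.
pub fn first_len(items: &[String]) -> Option<usize> {
    items.first().map(|item| item.len())
}
```

A one-sentence comment plus a fix of this size is the kind of output the format is meant to produce.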

When to Stay Silent

LLMs love to overshare. Sometimes silence is the right call.

## When to Stay Silent

If you’re uncertain whether something is an issue, don’t comment.

Takeaways

After tuning Copilot, the difference was immediate. The noise dropped significantly, and the signal became much more valuable.

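But this was not the final version. As more PRs came in, I watched how Copilot responded and kept refining the instructions. Here is our current version of the code review instructions.

If you decide to set this up in your own repository, be prepared to do the same. It is not a set-it-and-forget-it solution. As your project evolves, you will need to observe, adjust, and keep improving it.

If AI is not working well in your codebase, don't give up on it right away. These practices may help you make it work for you: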

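* Be specific. Vague instructions produce vague results.
* Set a confidence threshold to reduce noise.
* Tell it what CI already covers.
* Give it real examples from your codebase.
* Iterate to keep improving the results.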

Source: https://dev.to/techgirl1908/how-i-taught-github-copilot-code-review-to-think-like-a-maintainer-3l2c
