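🔥 How I Automated Code Reliability With an AI Agent
Hello Devs 👋
As developers, we all know how much time repetitive checks consume:
- Did we add proper error handling?
- Do our function names follow the conventions?
- Did we write enough test cases?
- Are there hidden logic errors?
- Is reliability improving or declining?
These tasks matter... but doing them by hand on every commit or PR is tedious, and it eats into time we could spend building features.
So I asked myself:
🤔 What if I could create an AI agent that does all of this automatically?
That is how the Reliability Guardian Agent came about.
This article walks you through building this AI agent with Qodo Command, step by step and as simply as possible. By the end, you will have a powerful reliability review tool that runs both locally and in GitHub Actions.
Let's get started. 🚀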
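💡 What is the Reliability Guardian?
The Reliability Guardian Agent automatically analyzes your codebase to assess and improve:
- Code reliability
- Fault tolerance
- Input validation
- Test coverage
- Error handling
- Historical reliability trends
It combines static analysis with behavior-style testing (such as simulated mutation or fuzz testing) to find:
- Logic inconsistencies
- Weak or missing tests
- Missing input validation
- Unsafe or fragile code paths
- Reliability regressions in recent commits
The agent works both for manual local runs and for automated CI/CD workflows, and it produces a clear reliability score from 0 to 10 along with actionable suggestions.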
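⚙️ Installing Qodo Command
We'll use Qodo's Agentic Quality Workflows CLI to build and run the agent.
Install it globally:
npm install -g @qodo/command
Then log in:
qodo login
Once you log in, you'll receive an API key in your terminal.
The API key is also saved in a local .qodo folder in your home directory, and it can be reused (for example, in CI).
Creating the Reliability Guardian Agent
1️⃣ Create the agent configuration
In your project root, create reliable-guardian-agent.toml
This file tells Qodo everything about your agent, such as its instructions, arguments, strategy, and output format.
Now, paste the following configuration: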
# Reliability Guardian Agent Configuration
version = "1.0"
[commands.reliability_guardian]
description = "Analyze and score project reliability by detecting logic conflicts, missing validations, weak tests, and historical reliability trends."
instructions = """
You are an expert reliability analyst agent. Your purpose is to evaluate the reliability of a software project by analyzing logic consistency, input validation completeness, and test suite robustness.
### Your mission:
1. **Analyze code for logic reliability**
- Detect logical conflicts, contradictory conditions, or redundant branches
- Identify missing input validation or unsafe operations (e.g., divide by zero, null dereference)
- Recognize missing or ineffective exception handling
2. **Evaluate test robustness**
- Perform mutation or fuzz testing to estimate how strong the existing tests are
- Identify functions that lack test coverage or only test “happy paths”
3. **Compute a comprehensive reliability score**
- Logic Consistency (30%)
- Input Validation Coverage (30%)
- Exception Safety (20%)
- Test Effectiveness (20%)
Provide an overall reliability score between 0–10.
4. **Detect reliability trends over time**
- Use Git history to compare reliability results across recent commits or branches
- Highlight improvement or regression in reliability score
5. **Suggest self-healing fixes**
- Suggest specific code improvements such as adding missing validation, refactoring conflicting branches, or adding stronger test cases
- Each fix suggestion should include a short code patch snippet where applicable
"""
arguments = [
{ name = "target_branch", type = "string", required = false, default = "main", description = "Branch to compare against for diff and reliability trend" },
{ name = "max_commits", type = "number", required = false, default = 5, description = "Number of past commits to analyze for historical reliability trends" },
{ name = "mutation_testing", type = "boolean", required = false, default = true, description = "Enable simulated mutation testing" },
{ name = "fuzz_testing", type = "boolean", required = false, default = true, description = "Enable fuzz-style reliability probing" },
{ name = "exclude_files", type = "string", required = false, description = "Comma-separated list of files to exclude (e.g., test mocks or migrations)" }
]
tools = ["qodo_merge", "git", "filesystem"]
execution_strategy = "act"
output_schema = """
{
"type": "object",
"properties": {
"summary": {
"type": "object",
"description": "High-level summary of reliability issues and test robustness",
"properties": {
"files_analyzed": { "type": "number", "description": "Total number of source files analyzed" },
"functions_checked": { "type": "number", "description": "Number of functions analyzed for logic reliability" },
"total_issues": { "type": "number", "description": "Total reliability issues detected" },
"critical_issues": { "type": "number", "description": "Number of critical logic or reliability flaws" },
"reliability_score": {
"type": "object",
"properties": {
"overall": { "type": "number", "minimum": 0, "maximum": 10 },
"logic_consistency": { "type": "number", "minimum": 0, "maximum": 10 },
"validation_coverage": { "type": "number", "minimum": 0, "maximum": 10 },
"exception_safety": { "type": "number", "minimum": 0, "maximum": 10 },
"test_strength": { "type": "number", "minimum": 0, "maximum": 10 }
},
"required": ["overall", "logic_consistency", "validation_coverage", "exception_safety", "test_strength"]
},
"trend": {
"type": "object",
"description": "Reliability trend compared to past commits",
"properties": {
"previous_scores": { "type": "array", "items": { "type": "number" } },
"improvement": { "type": "number", "description": "Positive if reliability improved, negative if regressed" },
"best_commit": { "type": "string", "description": "Commit hash with highest reliability" },
"worst_commit": { "type": "string", "description": "Commit hash with lowest reliability" }
}
}
},
"required": ["files_analyzed", "functions_checked", "total_issues", "reliability_score", "trend"]
},
"issues": {
"type": "array",
"description": "Detailed list of individual reliability issues",
"items": {
"type": "object",
"properties": {
"file": { "type": "string" },
"line": { "type": "number" },
"severity": { "type": "string", "enum": ["critical", "high", "medium", "low"] },
"category": { "type": "string", "description": "logic_conflict | validation_gap | weak_test | exception_risk" },
"description": { "type": "string" },
"suggestion": { "type": "string" },
"code_patch": { "type": "string", "description": "Example of an automated fix or patch suggestion" }
},
"required": ["file", "severity", "category", "description"]
}
},
"suggestions": {
"type": "array",
"description": "High-level reliability improvement recommendations",
"items": {
"type": "object",
"properties": {
"area": { "type": "string", "description": "validation | error_handling | logic | testing" },
"description": { "type": "string" },
"example_patch": { "type": "string" }
},
"required": ["area", "description"]
}
},
"approved": { "type": "boolean", "description": "Whether project meets reliability standards" },
"requires_changes": { "type": "boolean", "description": "True if reliability score < 7.0 or critical issues found" }
},
"required": ["summary", "issues", "suggestions", "approved"]
}
"""
exit_expression = "approved"
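The agent file fields are:
| Field | Type | Description |
|---|---|---|
| description | String | Describes what your agent does. Required when the agent runs in --mcp mode. |
| instructions | String | Required. The prompt that explains the desired behavior to the AI model. |
| arguments | List of objects. Supported types: 'string', 'number', 'boolean', 'array', 'object' | The arguments that can be passed to the agent. They are translated and forwarded to the MCP servers. |
| mcpServers | String | The MCP servers used by the agent. |
| tools | List | A list of MCP server names. Lets you restrict which MCP servers the agent can use. |
| execution_strategy | "act" or "plan" | "plan" lets the agent think through a multi-step strategy; "act" executes actions immediately. |
| output_schema | String | Valid JSON describing the desired agent output. |
| exit_expression | String (JSONPath) | Only applicable when output_schema is specified. For CI runs, this condition determines whether the agent run succeeds or fails. |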
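To make the role of exit_expression concrete, here is a rough sketch of how a success or failure decision could be derived from the agent's JSON output. This is an illustration under assumptions, not Qodo's implementation: it handles only a tiny subset of JSONPath (an optional `$.` prefix followed by dot-separated keys), the function names are hypothetical, and it needs Python 3.9+ for `str.removeprefix`.

```python
from typing import Any


def resolve_path(document: dict[str, Any], expression: str) -> Any:
    """Resolve a dotted path such as 'approved' or '$.summary.critical_issues'."""
    path = expression.removeprefix("$.").removeprefix("$")
    value: Any = document
    for key in filter(None, path.split(".")):
        if not isinstance(value, dict) or key not in value:
            raise KeyError(f"path {expression!r} not found at {key!r}")
        value = value[key]
    return value


def exit_code_for(report: dict[str, Any], expression: str) -> int:
    """Return 0 when the expression resolves to a truthy value, else 1."""
    try:
        return 0 if resolve_path(report, expression) else 1
    except KeyError:
        # A missing field is treated as a failed run in this sketch.
        return 1


print(exit_code_for({"approved": False}, "approved"))  # 1
print(exit_code_for({"approved": True}, "$.approved"))  # 0
```

With `exit_expression = "approved"`, a report whose `approved` field is false would fail the CI run under this reading.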
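Our agent accepts the following arguments:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| target_branch | string | No | main | Branch to compare against for reliability diff and trend analysis |
| max_commits | number | No | 5 | Number of recent commits to analyze for reliability trends |
| mutation_testing | boolean | No | true | Enable simulated mutation testing |
| fuzz_testing | boolean | No | true | Enable simulated fuzz-style reliability probing |
| exclude_files | string | No | - | Comma-separated list of files to exclude (e.g., mocks, generated code) |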
2️⃣ Run the agent locally
You can run this agent as is, or pass any of the optional arguments.
qodo reliability_guardian
Advanced configuration
# Compare with another branch
qodo reliability_guardian --target_branch=develop
# Analyze last 10 commits for reliability trend
qodo reliability_guardian --max_commits=10
# Run without mutation or fuzz simulation
qodo reliability_guardian --mutation_testing=false --fuzz_testing=false
The agent then analyzes your codebase and returns structured JSON output.
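If you script these runs, the `--name=value` flags shown above can be assembled programmatically. The helper below is a hypothetical sketch: it only builds the argument list from the arguments declared in the agent file and does not invoke `qodo` itself. Treating `number` as a Python `int` is an assumption made for this example.

```python
import shlex

# Argument names and types taken from the agent file's `arguments` list.
ARGUMENT_TYPES = {
    "target_branch": str,
    "max_commits": int,
    "mutation_testing": bool,
    "fuzz_testing": bool,
    "exclude_files": str,
}


def build_command(**overrides) -> list[str]:
    """Build the argv for a local run, validating names and value types."""
    argv = ["qodo", "reliability_guardian"]
    for name, value in overrides.items():
        expected = ARGUMENT_TYPES.get(name)
        if expected is None:
            raise ValueError(f"unknown argument: {name}")
        # bool is a subclass of int in Python, so reject True/False for numbers.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            raise TypeError(
                f"{name} expects {expected.__name__}, got {type(value).__name__}"
            )
        # Booleans are rendered lowercase to match the CLI examples.
        text = str(value).lower() if expected is bool else str(value)
        argv.append(f"--{name}={text}")
    return argv


print(shlex.join(build_command(max_commits=10)))
# qodo reliability_guardian --max_commits=10
```

The resulting list can be handed to `subprocess.run` if you want to launch the agent from a script.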
3️⃣ Output format
The agent returns structured JSON output:
{
"summary": {
"files_analyzed": 4,
"functions_checked": 8,
"total_issues": 18,
"critical_issues": 3,
"reliability_score": {
"overall": 3.5,
"logic_consistency": 4.0,
"validation_coverage": 3.0,
"exception_safety": 4.0,
"test_strength": 3.0
},
"trend": {
"previous_scores": [2.2],
"improvement": 1.3,
"best_commit": "065f7c9",
"worst_commit": "be1abae"
}
},
"issues": [
{
"file": "src/payment.py",
"line": 1,
"severity": "critical",
"category": "logic_conflict",
"description": "Premium users get worse discount (15%) when amount > 100 compared to base premium discount (20%). This is a business logic contradiction.",
"suggestion": "Invert the logic so higher amounts get better discounts (e.g., 25% for amount > 100, 20% otherwise)",
"code_patch": "if user_type == 'premium':\n discount = 0.25 if amount > 100 else 0.20"
},
{
"file": "src/calculator.py",
"line": 10,
"severity": "critical",
"category": "exception_risk",
"description": "average() function will raise ZeroDivisionError when passed an empty list",
"suggestion": "Add validation to check for empty input before division",
"code_patch": "if values is None or len(values) == 0:\n raise ValueError('values must be a non-empty sequence')"
},
{
"file": "src/utils.py",
"line": 1,
"severity": "critical",
"category": "exception_risk",
"description": "safe_get() catches all exceptions with 'except Exception', masking programming errors and making debugging difficult",
"suggestion": "Only catch specific exceptions (KeyError, TypeError) to avoid hiding unrelated bugs",
"code_patch": "except (KeyError, TypeError):\n return default"
},
{
"file": "src/auth.py",
"line": 5,
"severity": "high",
"category": "validation_gap",
"description": "authenticate_user() accepts any types without validation; no None/empty checks",
"suggestion": "Add type and empty string validation before authentication logic",
"code_patch": "if not isinstance(username, str) or not isinstance(password, str):\n return False\nif not username or not password:\n return False"
},
{
"file": "src/payment.py",
"line": 1,
"severity": "high",
"category": "validation_gap",
"description": "calculate_discount() lacks input validation for user_type domain and amount (negative values, type checking)",
"suggestion": "Add validation for user_type and amount before processing",
"code_patch": "if not isinstance(amount, (int, float)) or amount < 0:\n raise ValueError('amount must be a non-negative number')\nif user_type not in ('premium', 'basic'):\n raise ValueError(f'invalid user_type: {user_type}')"
},
{
"file": "src/calculator.py",
"line": 10,
"severity": "high",
"category": "validation_gap",
"description": "average() lacks validation for non-numeric elements in the list",
"suggestion": "Add type checking for all elements before processing",
"code_patch": "if not all(isinstance(x, (int, float)) for x in values):\n raise TypeError('all values must be numeric')"
},
{
"file": "src/calculator.py",
"line": 15,
"severity": "medium",
"category": "validation_gap",
"description": "add_safe() has misleading name suggesting validation, but performs no type enforcement; will concatenate strings or raise TypeError with None",
"suggestion": "Either add type validation or rename function to reflect actual behavior"
},
{
"file": "tests/test_auth.py",
"line": 6,
"severity": "high",
"category": "weak_test",
"description": "test_auth_admin expects authenticate_user('admin','123') to return True, but actual implementation requires password 'secret'. Test is failing.",
"suggestion": "Fix test to match actual implementation or fix implementation to match test contract",
"code_patch": "def test_auth_success():\n assert authenticate_user('admin', 'secret') is True"
},
{
"file": "tests/test_auth.py",
"line": 1,
"severity": "high",
"category": "weak_test",
"description": "test_email_valid only tests happy path; missing tests for invalid emails, None, empty strings",
"suggestion": "Add negative test cases for malformed emails",
"code_patch": "def test_email_invalid_cases():\n assert not validate_email('')\n assert not validate_email('invalid')\n assert not validate_email('a@b.')\n assert not validate_email('@example.com')"
}
],
"suggestions": [
{
"area": "logic",
"description": "Fix payment discount logic contradiction where premium users get worse discount for higher amounts. Invert the condition so amount > 100 gets 25% discount instead of 15%.",
"example_patch": "if user_type == 'premium':\n discount = 0.25 if amount > 100 else 0.20\nelse:\n discount = 0.10"
},
{
"area": "validation",
"description": "Add comprehensive input validation across all functions: type checking, None checks, empty collection checks, domain validation for enums, and range validation for numeric inputs.",
"example_patch": "if not isinstance(amount, (int, float)) or amount < 0:\n raise ValueError('amount must be a non-negative number')\nif user_type not in ('premium', 'basic'):\n raise ValueError(f'invalid user_type: {user_type}')"
},
{
"area": "error_handling",
"description": "Replace broad 'except Exception' clauses with specific exception types to avoid masking programming errors. Only catch expected exceptions like KeyError, TypeError, ValueError.",
"example_patch": "try:\n return d[key]\nexcept (KeyError, TypeError):\n return default"
},
{
"area": "validation",
"description": "Strengthen email validation to reject malformed patterns like 'a@b.', '@example.com', 'a@@b.com'. Implement proper parsing with split and validation of local/domain parts.",
"example_patch": "local, domain = email.split('@', 1)\nif not local or '.' not in domain:\n return False\nlabel, tld = domain.rsplit('.', 1)\nreturn bool(label) and len(tld) >= 2"
},
{
"area": "testing",
"description": "Expand test coverage to include edge cases and negative tests: empty inputs, None values, type errors, boundary conditions, and invalid domain values. Fix failing test in test_auth_admin.",
"example_patch": "def test_auth_failures():\n assert authenticate_user('admin', 'wrong') is False\n assert authenticate_user('', 'secret') is False\n assert authenticate_user(None, 'secret') is False"
},
{
"area": "testing",
"description": "Implement mutation testing to measure test effectiveness. Current tests likely have weak mutation kill rate due to minimal assertions and lack of negative tests.",
"example_patch": "# Run mutation testing with mutmut:\n# mutmut run --paths-to-mutate=src/\n# Expected improvement: mutation score from ~30% to >80% after adding edge case tests"
}
],
"approved": false,
"requires_changes": true
}
This gives you a quick, automated overview of what needs fixing.
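The numbers in the sample output can be cross-checked against the weights in the agent instructions. The sketch below is a small illustration rather than anything the agent runs: it recomputes the weighted overall score, the trend delta, and the `requires_changes` rule (score below 7.0 or any critical issue). Mapping "Test Effectiveness" to the `test_strength` field, and comparing against the most recent previous score, are assumptions.

```python
from collections import Counter
from typing import Optional

# Weights from the agent instructions. Mapping "Test Effectiveness" to
# test_strength is an assumption based on the matching 20% weight.
WEIGHTS = {
    "logic_consistency": 0.30,
    "validation_coverage": 0.30,
    "exception_safety": 0.20,
    "test_strength": 0.20,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine the four sub-scores into one 0-10 score."""
    return round(sum(scores[name] * weight for name, weight in WEIGHTS.items()), 2)


def improvement(current: float, previous_scores: list[float]) -> Optional[float]:
    """Delta against the most recent previous score (assumed to be last)."""
    if not previous_scores:
        return None
    return round(current - previous_scores[-1], 2)


def requires_changes(overall: float, critical_issues: int, threshold: float = 7.0) -> bool:
    """True if the score is below the threshold or any critical issue exists."""
    return overall < threshold or critical_issues > 0


def severity_counts(issues: list[dict]) -> Counter:
    """Count issues per severity level."""
    return Counter(issue["severity"] for issue in issues)


# Values copied from the sample output above.
sample_scores = {
    "logic_consistency": 4.0,
    "validation_coverage": 3.0,
    "exception_safety": 4.0,
    "test_strength": 3.0,
}
overall = weighted_score(sample_scores)
print(overall)  # 3.5
print(improvement(overall, [2.2]))  # 1.3
print(requires_changes(overall, critical_issues=3))  # True
```

Both the 3.5 overall score and the 1.3 improvement in the sample report are consistent with this arithmetic.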
🤖 Adding Reliability Guardian to GitHub Actions
The most common way to use this agent is to have GitHub Actions review every pull request (PR) automatically.
Let's create a GitHub Actions workflow file for this.
GitHub Actions
name: Reliability Guardian Agent
on:
  pull_request:
    branches: [main, develop]
jobs:
  reliability-guardian:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
      checks: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Run Reliability Guardian Agent
        uses: qodo-ai/command@v1
        env:
          QODO_API_KEY: ${{ secrets.QODO_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          prompt: reliability_guardian
          agent-file: path/to/agent.toml
          key-value-pairs: |
            target_branch=${{ github.base_ref }}
            max_commits=5
            mutation_testing=true
            fuzz_testing=true
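Now every pull request gets an automatic reliability review.
- No manual pass needed for routine checks.
- Fewer missed edge cases.
- Fewer surprise runtime failures.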
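🎯 Why this agent saves developers a lot of time
✅ Faster reviews: no more waiting on teammates for basic reliability checks.
✅ More consistent code: the same rules apply to every PR.
✅ Safer, more stable builds: many reliability issues are caught before merge.
✅ Less wasted developer time: developers focus on building features instead of repetitive reviews.
✅ Customizable for any project: you can easily adjust the rules, weights, and checks.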
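🎉 Final thoughts
Qodo's Agentic Quality Workflows is more than a CLI; it is a new way to bring intelligent automation to engineering teams.
The Reliability Guardian Agent is just one of the things you can build.
You can also create:
- Performance auditors
- Security checkers
- Test writers
- Documentation reviewers
- Code refactoring assistants
- And agents fully customized for your team
All of them use one simple, flexible agent file.
Best of all, you can tailor your agents to your project's needs. 😉
You can visit the agents repository, which contains example agent implementations.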
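Thank You!! 🙏
Thank you for reading this far. If you found this article useful, please like and share it. Someone else might find it useful too. 💖
You can connect with me on X, GitHub, and LinkedIn.
Source: https://dev.to/dev_kiran/how-i-automated-code-reliability-with-an-ai-agent-1kkb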
