Published 2026-01-06

🔥 How I Automated Code Reliability with an AI Agent

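Hello Devs 👋

As developers, we all know how much time repetitive checks eat up:

  • Did we add proper error handling?
  • Do our function names follow the conventions?
  • Did we write enough test cases?
  • Are there hidden logic bugs?
  • Is reliability improving or getting worse?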

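These checks matter... but doing them by hand on every commit or PR is tedious, and it takes time we could be spending on actually building features.

So I asked myself:

🤔 What if I could create an AI agent that does all of this automatically?

That's how the Reliability Guardian Agent came about.

In this post, I'll walk you through building this AI agent with Qodo Command, step by step and as simply as possible. By the end, you'll have a reliability review tool that runs both locally and in GitHub Actions.

Let's get started. 🚀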

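💡 What Is the Reliability Guardian?

The Reliability Guardian Agent automatically analyzes your codebase to evaluate and help improve:

  • Code reliability
  • Fault tolerance
  • Input validation
  • Test coverage
  • Error handling
  • Historical reliability trends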

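It combines static analysis with behavioral testing (such as simulated mutation or fuzz testing) to uncover:

  • Logic inconsistencies
  • Weak or missing tests
  • Missing input validation
  • Unsafe or fragile code paths
  • Reliability regressions in recent commits

The agent works both for manual runs locally and in automated CI/CD workflows, and it produces a clear reliability score from 0 to 10 along with actionable recommendations.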
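
To make "simulated mutation testing" a bit more concrete, here is a minimal hand-rolled sketch in Python. It is not how Qodo implements it (this post doesn't describe the internals); the average function, the mutant, and both test suites are hypothetical and only illustrate the idea: introduce a small bug on purpose and see whether the tests notice.

```python
def average(values):
    """Original implementation: arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def average_mutant(values):
    """Mutant: same function with one deliberate bug (off-by-one divisor)."""
    return sum(values) / (len(values) + 1)


def weak_suite(fn):
    """A weak test: only checks that the result is a float."""
    return isinstance(fn([2, 4]), float)


def strong_suite(fn):
    """A stronger test: pins down the actual values."""
    return fn([2, 4]) == 3.0 and fn([5]) == 5.0


def kills_mutant(suite):
    """A suite 'kills' the mutant if it passes on the original but fails on the mutant."""
    return suite(average) and not suite(average_mutant)


print("weak suite kills mutant:  ", kills_mutant(weak_suite))
print("strong suite kills mutant:", kills_mutant(strong_suite))
```

A suite that lets the mutant survive is the kind of "weak test" the agent is asked to flag.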

⚙️ Installing Qodo Command

We'll use Qodo's Agentic Quality Workflows CLI to build and run the agent.

Install it globally:

npm install -g @qodo/command

Then log in:

qodo login

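Once you've logged in, you'll receive an API key in your terminal.

The API key is also saved in a local .qodo folder in your home directory, so it can be reused later (for example, in CI).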

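Creating the Reliability Guardian Agent

1️⃣ Create the Agent Configuration
In the root of your project, create reliable-guardian-agent.toml

This file tells Qodo everything about your agent: its instructions, arguments, execution strategy, and output format.

Now paste in the following configuration: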

# Reliability Guardian Agent Configuration
version = "1.0"

[commands.reliability_guardian]
description = "Analyze and score project reliability by detecting logic conflicts, missing validations, weak tests, and historical reliability trends."

instructions = """
You are an expert reliability analyst agent. Your purpose is to evaluate the reliability of a software project by analyzing logic consistency, input validation completeness, and test suite robustness.
### Your mission:
1. **Analyze code for logic reliability**
   - Detect logical conflicts, contradictory conditions, or redundant branches
   - Identify missing input validation or unsafe operations (e.g., divide by zero, null dereference)
   - Recognize missing or ineffective exception handling
2. **Evaluate test robustness**
   - Perform mutation or fuzz testing to estimate how strong the existing tests are
   - Identify functions that lack test coverage or only test “happy paths”
3. **Compute a comprehensive reliability score**
   - Logic Consistency (30%)
   - Input Validation Coverage (30%)
   - Exception Safety (20%)
   - Test Effectiveness (20%)
   Provide an overall reliability score between 0–10.
4. **Detect reliability trends over time**
   - Use Git history to compare reliability results across recent commits or branches
   - Highlight improvement or regression in reliability score
5. **Suggest self-healing fixes**
   - Suggest specific code improvements such as adding missing validation, refactoring conflicting branches, or adding stronger test cases
   - Each fix suggestion should include a short code patch snippet where applicable
"""

arguments = [
    { name = "target_branch", type = "string", required = false, default = "main", description = "Branch to compare against for diff and reliability trend" },
    { name = "max_commits", type = "number", required = false, default = 5, description = "Number of past commits to analyze for historical reliability trends" },
    { name = "mutation_testing", type = "boolean", required = false, default = true, description = "Enable simulated mutation testing" },
    { name = "fuzz_testing", type = "boolean", required = false, default = true, description = "Enable fuzz-style reliability probing" },
    { name = "exclude_files", type = "string", required = false, description = "Comma-separated list of files to exclude (e.g., test mocks or migrations)" }
]

tools = ["qodo_merge", "git", "filesystem"]

execution_strategy = "act"

output_schema = """
{
  "type": "object",
  "properties": {
    "summary": {
      "type": "object",
      "description": "High-level summary of reliability issues and test robustness",
      "properties": {
        "files_analyzed": { "type": "number", "description": "Total number of source files analyzed" },
        "functions_checked": { "type": "number", "description": "Number of functions analyzed for logic reliability" },
        "total_issues": { "type": "number", "description": "Total reliability issues detected" },
        "critical_issues": { "type": "number", "description": "Number of critical logic or reliability flaws" },
        "reliability_score": {
          "type": "object",
          "properties": {
            "overall": { "type": "number", "minimum": 0, "maximum": 10 },
            "logic_consistency": { "type": "number", "minimum": 0, "maximum": 10 },
            "validation_coverage": { "type": "number", "minimum": 0, "maximum": 10 },
            "exception_safety": { "type": "number", "minimum": 0, "maximum": 10 },
            "test_strength": { "type": "number", "minimum": 0, "maximum": 10 }
          },
          "required": ["overall", "logic_consistency", "validation_coverage", "exception_safety", "test_strength"]
        },
        "trend": {
          "type": "object",
          "description": "Reliability trend compared to past commits",
          "properties": {
            "previous_scores": { "type": "array", "items": { "type": "number" } },
            "improvement": { "type": "number", "description": "Positive if reliability improved, negative if regressed" },
            "best_commit": { "type": "string", "description": "Commit hash with highest reliability" },
            "worst_commit": { "type": "string", "description": "Commit hash with lowest reliability" }
          }
        }
      },
      "required": ["files_analyzed", "functions_checked", "total_issues", "reliability_score", "trend"]
    },
    "issues": {
      "type": "array",
      "description": "Detailed list of individual reliability issues",
      "items": {
        "type": "object",
        "properties": {
          "file": { "type": "string" },
          "line": { "type": "number" },
          "severity": { "type": "string", "enum": ["critical", "high", "medium", "low"] },
          "category": { "type": "string", "description": "logic_conflict | validation_gap | weak_test | exception_risk" },
          "description": { "type": "string" },
          "suggestion": { "type": "string" },
          "code_patch": { "type": "string", "description": "Example of an automated fix or patch suggestion" }
        },
        "required": ["file", "severity", "category", "description"]
      }
    },
    "suggestions": {
      "type": "array",
      "description": "High-level reliability improvement recommendations",
      "items": {
        "type": "object",
        "properties": {
          "area": { "type": "string", "description": "validation | error_handling | logic | testing" },
          "description": { "type": "string" },
          "example_patch": { "type": "string" }
        },
        "required": ["area", "description"]
      }
    },
    "approved": { "type": "boolean", "description": "Whether project meets reliability standards" },
    "requires_changes": { "type": "boolean", "description": "True if reliability score < 7.0 or critical issues found" }
  },
  "required": ["summary", "issues", "suggestions", "approved"]
}
"""

exit_expression = "approved"
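
The 30/30/20/20 weighting in the instructions can be read as a plain weighted average. The sketch below is my own illustration in Python, not something the agent is guaranteed to compute literally (the score is produced by the model following the prompt). The mapping of "Test Effectiveness" to the schema field test_strength is an assumption, and the sub-scores are the ones from the sample report shown later in this post.

```python
# Weights taken from the agent instructions above (30/30/20/20).
# Assumption: "Test Effectiveness" corresponds to the schema field test_strength.
WEIGHTS = {
    "logic_consistency": 0.30,
    "validation_coverage": 0.30,
    "exception_safety": 0.20,
    "test_strength": 0.20,
}


def overall_score(sub_scores):
    """Weighted average of the four 0-10 sub-scores, rounded to one decimal."""
    total = sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)
    return round(total, 1)


# Sub-scores copied from the sample report later in this post.
sample = {
    "logic_consistency": 4.0,
    "validation_coverage": 3.0,
    "exception_safety": 4.0,
    "test_strength": 3.0,
}
print(overall_score(sample))  # 3.5, the same as "overall" in that report
```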

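The fields of the agent file are:

| Field name | Type | Description |
|---|---|---|
| description | string | Describes what your agent does. Required when the agent runs in --mcp mode. |
| instructions | string | Required. The prompt that explains the desired behavior to the AI model. |
| arguments | list of objects (supported types: 'string', 'number', 'boolean', 'array', 'object') | The arguments that can be passed to the agent. They are translated and forwarded to the MCP servers. |
| mcpServers | string | The list of MCP servers the agent uses. |
| tools | list | A list of MCP server names. Lets you restrict the agent to specific MCP servers. |
| execution_strategy | "act" or "plan" | "plan" has the agent think through a multi-step strategy first; "act" executes actions immediately. |
| output_schema | string | Valid JSON describing the desired agent output. |
| exit_expression | string (JSONPath) | Only applies when output_schema is specified. In CI runs, this condition determines whether the agent run counts as a success or a failure. |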
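
To show how output_schema and exit_expression fit together, here is a rough Python sketch of the kind of check they imply. It is not Qodo's actual implementation: the function names and the 0/1 exit codes are my assumptions, and the report is a trimmed, hypothetical example in the shape defined by the schema above. The 7.0 threshold comes from the description of requires_changes in that schema.

```python
import json


def requires_changes(report, threshold=7.0):
    """Rule from the schema description: overall score below 7.0, or any critical issue."""
    summary = report["summary"]
    low_score = summary["reliability_score"]["overall"] < threshold
    has_critical = summary.get("critical_issues", 0) > 0
    return low_score or has_critical


def ci_exit_code(report):
    """Mimic an exit_expression of "approved": 0 when approved is true, 1 otherwise."""
    return 0 if report.get("approved") is True else 1


# Hypothetical, trimmed report with only the fields this sketch needs.
report = json.loads("""
{
  "summary": {"critical_issues": 3, "reliability_score": {"overall": 3.5}},
  "approved": false
}
""")
print(requires_changes(report), ci_exit_code(report))
```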

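Our agent accepts the following arguments:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| target_branch | string | No | main | Branch to compare against for the diff and the reliability trend |
| max_commits | number | No | 5 | Number of recent commits to analyze for the reliability trend |
| mutation_testing | boolean | No | true | Enable simulated mutation testing |
| fuzz_testing | boolean | No | true | Enable fuzz-style reliability probing |
| exclude_files | string | No | - | Comma-separated list of files to exclude (for example, mocks or generated code) |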

2️⃣ Run the Agent Locally
You can run the agent as-is, or pass any of the optional arguments.

qodo reliability_guardian

Advanced configuration

# Compare with another branch
qodo reliability_guardian --target_branch=develop


# Analyze last 10 commits for reliability trend
qodo reliability_guardian --max_commits=10


# Run without mutation or fuzz simulation
qodo reliability_guardian --mutation_testing=false --fuzz_testing=false

The agent then analyzes your codebase and returns structured JSON output.
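
If you want to drive these runs from a script, a small helper can assemble the same command lines. This is only a sketch: build_command is a hypothetical helper of mine, and it relies on nothing beyond the --key=value flag style shown in the commands above. One detail worth handling is that Python renders booleans as True/False, while the CLI examples use lowercase true/false.

```python
import shlex


def build_command(agent="reliability_guardian", **args):
    """Build the argv list for `qodo <agent> --key=value ...`."""
    argv = ["qodo", agent]
    for key, value in args.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # CLI examples use lowercase booleans
        argv.append(f"--{key}={value}")
    return argv


cmd = build_command(target_branch="develop", max_commits=10, mutation_testing=False)
print(shlex.join(cmd))
# To actually run it (requires the qodo CLI to be installed and logged in),
# pass cmd to subprocess.run(cmd, check=True).
```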

3️⃣ Output Format

The agent's JSON output looks like this:

{
  "summary": {
    "files_analyzed": 4,
    "functions_checked": 8,
    "total_issues": 18,
    "critical_issues": 3,
    "reliability_score": {
      "overall": 3.5,
      "logic_consistency": 4.0,
      "validation_coverage": 3.0,
      "exception_safety": 4.0,
      "test_strength": 3.0
    },
    "trend": {
      "previous_scores": [2.2],
      "improvement": 1.3,
      "best_commit": "065f7c9",
      "worst_commit": "be1abae"
    }
  },
  "issues": [
    {
      "file": "src/payment.py",
      "line": 1,
      "severity": "critical",
      "category": "logic_conflict",
      "description": "Premium users get worse discount (15%) when amount > 100 compared to base premium discount (20%). This is a business logic contradiction.",
      "suggestion": "Invert the logic so higher amounts get better discounts (e.g., 25% for amount > 100, 20% otherwise)",
      "code_patch": "if user_type == 'premium':\n    discount = 0.25 if amount > 100 else 0.20"
    },
    {
      "file": "src/calculator.py",
      "line": 10,
      "severity": "critical",
      "category": "exception_risk",
      "description": "average() function will raise ZeroDivisionError when passed an empty list",
      "suggestion": "Add validation to check for empty input before division",
      "code_patch": "if values is None or len(values) == 0:\n    raise ValueError('values must be a non-empty sequence')"
    },
    {
      "file": "src/utils.py",
      "line": 1,
      "severity": "critical",
      "category": "exception_risk",
      "description": "safe_get() catches all exceptions with 'except Exception', masking programming errors and making debugging difficult",
      "suggestion": "Only catch specific exceptions (KeyError, TypeError) to avoid hiding unrelated bugs",
      "code_patch": "except (KeyError, TypeError):\n    return default"
    },
    {
      "file": "src/auth.py",
      "line": 5,
      "severity": "high",
      "category": "validation_gap",
      "description": "authenticate_user() accepts any types without validation; no None/empty checks",
      "suggestion": "Add type and empty string validation before authentication logic",
      "code_patch": "if not isinstance(username, str) or not isinstance(password, str):\n    return False\nif not username or not password:\n    return False"
    },
    {
      "file": "src/payment.py",
      "line": 1,
      "severity": "high",
      "category": "validation_gap",
      "description": "calculate_discount() lacks input validation for user_type domain and amount (negative values, type checking)",
      "suggestion": "Add validation for user_type and amount before processing",
      "code_patch": "if not isinstance(amount, (int, float)) or amount < 0:\n    raise ValueError('amount must be a non-negative number')\nif user_type not in ('premium', 'basic'):\n    raise ValueError(f'invalid user_type: {user_type}')"
    },
    {
      "file": "src/calculator.py",
      "line": 10,
      "severity": "high",
      "category": "validation_gap",
      "description": "average() lacks validation for non-numeric elements in the list",
      "suggestion": "Add type checking for all elements before processing",
      "code_patch": "if not all(isinstance(x, (int, float)) for x in values):\n    raise TypeError('all values must be numeric')"
    },
    {
      "file": "src/calculator.py",
      "line": 15,
      "severity": "medium",
      "category": "validation_gap",
      "description": "add_safe() has misleading name suggesting validation, but performs no type enforcement; will concatenate strings or raise TypeError with None",
      "suggestion": "Either add type validation or rename function to reflect actual behavior"
    },
    {
      "file": "tests/test_auth.py",
      "line": 6,
      "severity": "high",
      "category": "weak_test",
      "description": "test_auth_admin expects authenticate_user('admin','123') to return True, but actual implementation requires password 'secret'. Test is failing.",
      "suggestion": "Fix test to match actual implementation or fix implementation to match test contract",
      "code_patch": "def test_auth_success():\n    assert authenticate_user('admin', 'secret') is True"
    },
    {
      "file": "tests/test_auth.py",
      "line": 1,
      "severity": "high",
      "category": "weak_test",
      "description": "test_email_valid only tests happy path; missing tests for invalid emails, None, empty strings",
      "suggestion": "Add negative test cases for malformed emails",
      "code_patch": "def test_email_invalid_cases():\n    assert not validate_email('')\n    assert not validate_email('invalid')\n    assert not validate_email('a@b.')\n    assert not validate_email('@example.com')"
    }
  ],
  "suggestions": [
    {
      "area": "logic",
      "description": "Fix payment discount logic contradiction where premium users get worse discount for higher amounts. Invert the condition so amount > 100 gets 25% discount instead of 15%.",
      "example_patch": "if user_type == 'premium':\n    discount = 0.25 if amount > 100 else 0.20\nelse:\n    discount = 0.10"
    },
    {
      "area": "validation",
      "description": "Add comprehensive input validation across all functions: type checking, None checks, empty collection checks, domain validation for enums, and range validation for numeric inputs.",
      "example_patch": "if not isinstance(amount, (int, float)) or amount < 0:\n    raise ValueError('amount must be a non-negative number')\nif user_type not in ('premium', 'basic'):\n    raise ValueError(f'invalid user_type: {user_type}')"
    },
    {
      "area": "error_handling",
      "description": "Replace broad 'except Exception' clauses with specific exception types to avoid masking programming errors. Only catch expected exceptions like KeyError, TypeError, ValueError.",
      "example_patch": "try:\n    return d[key]\nexcept (KeyError, TypeError):\n    return default"
    },
    {
      "area": "validation",
      "description": "Strengthen email validation to reject malformed patterns like 'a@b.', '@example.com', 'a@@b.com'. Implement proper parsing with split and validation of local/domain parts.",
      "example_patch": "local, domain = email.split('@', 1)\nif not local or '.' not in domain:\n    return False\nlabel, tld = domain.rsplit('.', 1)\nreturn bool(label) and len(tld) >= 2"
    },
    {
      "area": "testing",
      "description": "Expand test coverage to include edge cases and negative tests: empty inputs, None values, type errors, boundary conditions, and invalid domain values. Fix failing test in test_auth_admin.",
      "example_patch": "def test_auth_failures():\n    assert authenticate_user('admin', 'wrong') is False\n    assert authenticate_user('', 'secret') is False\n    assert authenticate_user(None, 'secret') is False"
    },
    {
      "area": "testing",
      "description": "Implement mutation testing to measure test effectiveness. Current tests likely have weak mutation kill rate due to minimal assertions and lack of negative tests.",
      "example_patch": "# Run mutation testing with mutmut:\n# mutmut run --paths-to-mutate=src/\n# Expected improvement: mutation score from ~30% to >80% after adding edge case tests"
    }
  ],
  "approved": false,
  "requires_changes": true
}

This gives you a quick, automated overview of what needs fixing.
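
Because the output is plain JSON, it's easy to post-process. As a small illustration (my own sketch, not part of Qodo), the Python below sorts issues by the severity enum from output_schema and counts them. The triage function is hypothetical, and the three entries are abridged from the report above.

```python
from collections import Counter

SEVERITY_ORDER = ["critical", "high", "medium", "low"]  # enum from output_schema


def triage(report):
    """Return issues sorted most-severe first, plus a count per severity."""
    issues = sorted(report["issues"], key=lambda i: SEVERITY_ORDER.index(i["severity"]))
    counts = Counter(i["severity"] for i in issues)
    return issues, counts


# A few entries abridged from the report above (descriptions omitted).
report = {
    "issues": [
        {"file": "src/auth.py", "line": 5, "severity": "high", "category": "validation_gap"},
        {"file": "src/calculator.py", "line": 15, "severity": "medium", "category": "validation_gap"},
        {"file": "src/calculator.py", "line": 10, "severity": "critical", "category": "exception_risk"},
    ]
}

issues, counts = triage(report)
for issue in issues:
    print(f'{issue["severity"]:<8} {issue["file"]}:{issue["line"]}  {issue["category"]}')
```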

🤖 Adding Reliability Guardian to GitHub Actions

The most common way to use this agent is to have GitHub Actions review every pull request (PR) automatically.

Let's create a GitHub Actions workflow file for that.

name: Reliability Guardian Agent
on:
  pull_request:
    branches: [main, develop]

jobs:
  reliability-guardian:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
      checks: write

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Reliability Guardian Agent
        uses: qodo-ai/command@v1
        env:
          QODO_API_KEY: ${{ secrets.QODO_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          prompt: reliability_guardian
          agent-file: path/to/agent.toml
          key-value-pairs: |
            target_branch=${{ github.base_ref }}
            max_commits=5
            mutation_testing=true
            fuzz_testing=true

Now every PR gets an automatic reliability review.

  • No manual first-pass checks.
  • Fewer missed edge cases.
  • Fewer surprise runtime failures.

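🎯 Why This Agent Saves Developers So Much Time

Faster reviews: no more waiting on a teammate for basic reliability checks.

More consistent code: the same rules are applied to every PR.

Safer, more stable builds: many reliability issues are caught before merge.

Saves developer time: developers focus on building features instead of repetitive reviews.

Customizable for any project: you can easily adjust the rules, weights, and checks.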

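🎉 Final Thoughts

Qodo's Agentic Quality Workflows is more than just a CLI; it's a new way to bring intelligent automation to engineering teams.

The Reliability Guardian Agent is just one of the things you can build.
You can also create:

  • Performance auditors
  • Security checkers
  • Test writers
  • Documentation reviewers
  • Code refactoring assistants
  • And fully custom agents for your team

All using one simple, flexible agent file.

Best of all, you can tailor your own agents to your project's needs. 😉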

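You can check out the agents repository, which contains example agent implementations.

Thank You!! 🙏

Thanks for reading this far. If you found this article useful, please like and share it. Someone else might find it useful too. 💖

You can connect with me on X, GitHub, and LinkedIn.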

Source: https://dev.to/dev_kiran/how-i-automated-code-reliability-with-an-ai-agent-1kkb