Open
Labels: workflow, GitHub Actions
Description
Task
@WhiteElephantIsNotARobot Resolve Gemini's review
Context
{
"repo": "WhiteElephant-abc/agent",
"event_type": "pullrequest",
"event_id": "22416419401",
"trigger_user": "WhiteElephant-abc",
"issue_number": 182,
"pr_title": "fix: 丢弃 diff_content 当 context 超过 64KB 限制",
"pr_body": "## 问题\n当 context 字符数超过 GitHub Actions inputs 的 64KB 限制时,workflow dispatch 失败(422 Unprocessable Entity)。\n\n## 解决方案\n在 `trigger_workflow` 函数中添加 context 大小检查逻辑,当超过 64KB 限制时按优先级丢弃数据:\n\n1. **优先丢弃 diff_content** - diff 通常很大,且评论历史已有智能节选算法(3新1老比例)处理\n2. **截断评论历史** - 如果仍然超限,只保留最近 5 条评论\n3. **丢弃 review_comments_batch** - 如果仍然超限,丢弃审核评论批次\n4. **丢弃 reviews_history** - 如果仍然超限,丢弃审核历史\n\n## 关键点\n- diff 有时会极其大,检测到超限时优先丢弃 diff\n- comment 已有智能节选算法(`truncate_context_by_chars` 函数,3新1老比例)\n- 仅丢弃 diff 就够了,因为评论历史已有节选机制\n\n## 相关 Issue\nFixes #152\n",
"comments_history": [
{
"id": "IC_kwDORAzJJ87kIpT8",
"user": "gemini-code-assist",
"body": "## Summary of Changes\n\nHello @WhiteElephantIsNotARobot, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!\n\n此拉取请求旨在解决 GitHub Actions `workflow_dispatch` 因输入数据(特别是上下文信息)超出 64KB 限制而失败的问题。通过引入智能的数据处理策略,它首先尝试将超大上下文数据存储到一个新的 GitHub Issue 中,从而避免直接在工作流输入中传递所有数据。如果此方法不可行或数据仍然过大,系统将按照预定义的优先级(如丢弃 diff 内容、截断评论历史等)对上下文进行精简,确保工作流能够成功触发,提高了系统的鲁棒性。\n\n### Highlights\n\n* **解决 GitHub Actions 输入限制**: 引入新机制,处理 `workflow_dispatch` 输入数据超过 64KB 限制导致的工作流失败。\n* **大上下文数据 Issue 存储**: 当上下文或任务文本过大时,尝试创建 GitHub Issue 存储数据,并通过 `repository_dispatch` 传递 Issue 编号,绕过 `workflow_dispatch` 大小限制。\n* **分级上下文截断策略**: 若数据仍超限或无法存储到 Issue,系统将按优先级丢弃或截断数据:首先 `diff_content`,然后 `comments_history`(保留最近 5 条),接着 `review_comments_batch`,最后 `reviews_history`。\n* **新增 `create_issue` 函数**: 添加异步函数 `create_issue`,用于创建 GitHub Issue 存储超大上下文数据。\n\n\n\n\n\n\n\n\n\n<details>\n<summary><b>Using Gemini Code Assist</b></summary>\n<br>\n\nThe full guide for Gemini Code Assist can be found on our [documentation page](https://developers.google.com/gemini-code-assist/docs/review-github-code), here are some quick tips.\n\n<b>Invoking Gemini</b>\n\nYou can request assistance from Gemini at any point by creating a comment using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands on the current page.\n\nFeature | Command | Description\n--- | --- | ---\nCode Review | `/gemini review` | Performs a code review for the current pull request in its current state.\nPull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state.\nComment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.\nHelp | `/gemini help` | Displays a list of available commands.\n\n<b>Customization</b>\n\nTo customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found [here](https://developers.google.com/gemini-code-assist/docs/customize-gemini-behavior-github).\n\n<b>Limitations & Feedback</b>\n\nGemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with :thumbsup: and :thumbsdown: on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up [here](https://google.qualtrics.com/jfe/form/SV_2cyuGuTWsEw84yG).\n\n<b>You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the [Gemini Code Assist IDE Extension](https://cloud.google.com/products/gemini/code-assist).</b>\n</details>\n\n\n\n\n[^1]: Review the [Privacy Notices](https://policies.google.com/privacy), [Generative AI Prohibited Use Policy](https://policies.google.com/terms/generative-ai/use-policy), [Terms of Service](https://policies.google.com/terms), and learn how to configure Gemini Code Assist in GitHub [here](https://developers.google.com/gemini-code-assist/docs/customize-gemini-behavior-github). Gemini can make mistakes, so double check it and [use code with caution](https://support.google.com/legal/answer/13505487).\n",
"created_at": "2026-01-31T04:39:00Z",
"type": "comment"
},
{
"id": "IC_kwDORAzJJ87kIrCP",
"user": "WhiteElephant-abc",
"body": "@WhiteElephantIsNotARobot 解决gemini的review",
"created_at": "2026-01-31T04:43:18Z",
"type": "comment"
},
{
"id": "IC_kwDORAzJJ87kIsMD",
"user": "WhiteElephantIsNotARobot",
"body": "已根据 Gemini Code Assist 的 review 意见完成修复:\n\n## ✅ 关键问题修复\n**修复了 payload 结构问题**:\n- 移除了错误的 `client_payload` 使用\n- 统一使用 `inputs` 字段构建 payload\n- `workflow_dispatch` API 端点期望 `inputs` 对象来传递参数\n\n## ✅ 代码优化\n**重构了上下文截断逻辑**:\n- 将每个缩减步骤抽象成函数列表\n- 通过循环执行,直到上下文大小满足要求\n- 消除了重复代码,提高了可读性和可维护性\n\n## 核心逻辑\n保留了按优先级丢弃数据的策略:\n1. 优先丢弃 `diff_content`(diff 通常很大)\n2. 截断评论历史(保留最近 5 条)\n3. 丢弃 `review_comments_batch`\n4. 丢弃 `reviews_history`\n\n感谢 Gemini Code Assist 的详细审查!",
"created_at": "2026-01-31T04:46:32Z",
"type": "comment"
},
{
"id": "IC_kwDORAzJJ87kIs_8",
"user": "WhiteElephant-abc",
"body": "/gemini review",
"created_at": "2026-01-31T04:48:33Z",
"type": "comment"
},
{
"id": "IC_kwDORAzJJ87kIuv0",
"user": "WhiteElephant-abc",
"body": "@WhiteElephantIsNotARobot 首先解决gemini的review,其次workflow不会从issue读取数据,放弃issue存数据的逻辑,仅保留截断",
"created_at": "2026-01-31T04:54:07Z",
"type": "comment"
},
{
"id": "IC_kwDORAzJJ87kIwT-",
"user": "WhiteElephantIsNotARobot",
"body": "已根据您的要求和 Gemini Code Assist 的 review 意见完成修复:\n\n## ✅ 已完成的修复\n\n### 1. 解决 Gemini 的 review 意见\n- **移除硬编码数字**:使用常量 `GITHUB_INPUTS_MAX_SIZE = 64000` 代替魔术数字\n- **简化 payload 构建**:直接在 `payload` 字典中构建 `inputs`,移除了不必要的临时变量\n\n### 2. 放弃 issue 存数据的逻辑\n- 移除了 `create_issue` 函数\n- 移除了检查数据大小并创建 issue 的逻辑\n- 移除了 `issue_number` 相关的代码\n\n### 3. 仅保留截断逻辑\n- 保留了按优先级丢弃数据的策略:\n 1. 优先丢弃 `diff_content`(diff 通常很大)\n 2. 截断评论历史(保留最近 5 条)\n 3. 丢弃 `review_comments_batch`\n 4. 丢弃 `reviews_history`\n\n## 代码变更\n- 删除了约 64 行代码(issue 存储相关逻辑)\n- 简化了 payload 构建逻辑\n- 保留了核心的上下文截断功能\n\n感谢您的指导!\n",
"created_at": "2026-01-31T04:57:22Z",
"type": "comment"
},
{
"id": "IC_kwDORAzJJ87kIw2B",
"user": "WhiteElephant-abc",
"body": "/gemini review",
"created_at": "2026-01-31T04:58:35Z",
"type": "comment"
},
{
"id": "IC_kwDORAzJJ87kIxze",
"user": "WhiteElephant-abc",
"body": "@WhiteElephantIsNotARobot 解决gemini的review",
"created_at": "2026-01-31T05:01:21Z",
"type": "comment"
}
],
"reviews_history": [
{
"id": "PRR_kwDORAzJJ87eamFK",
"user": "gemini-code-assist",
"body": "## Code Review\n\n本次 PR 旨在解决当上下文超过 GitHub Actions 64KB 限制时导致 workflow dispatch 失败的问题。解决方案是当上下文过大时,优先尝试创建一个 issue 来存储数据,如果失败或数据不是特别大但仍超限,则按优先级丢弃部分上下文(如 diff、评论历史等)。这个思路是合理的,但我在实现中发现了一个关键问题和一处可优化的代码结构。\n\n主要问题在于,当使用 issue 存储方案时,触发工作流的 payload 结构不正确,使用了 `client_payload` 而非 `inputs`,这将导致工作流无法接收到参数。此外,上下文截断部分的逻辑存在大量重复代码,可以进行重构以提高可读性和可维护性。请查看具体的审查评论。",
"state": "COMMENTED",
"submitted_at": "2026-01-31T04:40:32Z"
},
{
"id": "PRR_kwDORAzJJ87eaqE0",
"user": "gemini-code-assist",
"body": "## Code Review\n\n这次的 Pull Request 很好地解决了 context 过大导致 workflow 失败的问题。通过优先将超大数据存入 Issue,以及在无法创建 Issue 时按优先级丢弃数据的策略,大大增强了系统的鲁棒性。这是一个非常巧妙和实用的解决方案。代码整体质量很高,我只有几点关于代码可读性和可维护性的建议,主要是关于将硬编码的数值提取为常量,以及简化部分代码逻辑。",
"state": "COMMENTED",
"submitted_at": "2026-01-31T04:50:03Z"
},
{
"id": "PRR_kwDORAzJJ87eatul",
"user": "gemini-code-assist",
"body": "## Code Review\n\n本次 PR 旨在解决 context 超出 GitHub Actions 64KB 限制的问题,通过引入一个按优先级的上下文缩减策略来解决。整体实现逻辑清晰,符合预期。\n\n我提出了一些关于代码风格和结构的小建议,主要是为了提高代码的可读性和可维护性:\n1. 将 `GITHUB_INPUTS_MAX_SIZE` 常量移至模块级别。\n2. 重构缩减步骤函数的定义方式,使其结构更清晰。\n\n这些改动不会影响功能,但能让代码更整洁。",
"state": "COMMENTED",
"submitted_at": "2026-01-31T05:00:23Z"
}
],
"review_comments_batch": [
{
"id": "PRRC_kwDORAzJJ86j2NT4",
"user": "gemini-code-assist",
"body": "\n\n为了提高代码的可读性和可维护性,建议将 `GITHUB_INPUTS_MAX_SIZE` 定义为模块级别的常量,与其他配置(如 `CONTEXT_MAX_CHARS`)放在一起。在函数内部定义常量通常不是最佳实践。",
"path": "server.py",
"diff_hunk": "@@ -951,28 +951,72 @@ async def trigger_workflow(client: httpx.AsyncClient, ctx: TaskContext, task_tex\n logger.info(f\" ReviewComment[{i}]: @{comment.get('user')} - {comment.get('path')}: {comment.get('body', '')[:50]}...\")\n if ctx.is_truncated is not None:\n logger.info(f\"Context was truncated: {ctx.is_truncated}\")\n- \n+\n # 检查是否有重复/空字段\n logger.info(f\"diff_content present: {bool(ctx.diff_content)}\")\n logger.info(f\"clone_url: {ctx.clone_url}\")\n logger.info(f\"head_ref: {ctx.head_ref}, base_ref: {ctx.base_ref}\")\n logger.info(f\"head_repo: {ctx.head_repo}, base_repo: {ctx.base_repo}\")\n- \n+\n # 记录任务描述\n logger.info(f\"LLM_TASK to send: '{task_text[:200]}{'...' if len(task_text) > 200 else ''}'\")\n- \n- if len(context_str) > 60000: # GitHub限制\n- logger.warning(f\"Context too large ({len(context_str)} chars), truncating...\")\n- # 简化上下文\n- ctx.diff_content = \"[Diff truncated due to size limits]\"\n- if ctx.comments_history and len(ctx.comments_history) > 10:\n- logger.info(f\"Reducing comments history from {len(ctx.comments_history)} to 10 items\")\n- ctx.comments_history = ctx.comments_history[-10:] # 只保留最近10条\n- context_str = ctx.to_json_string()\n+\n+ # GitHub Actions inputs 限制为 64KB (65536 字节)\n+ GITHUB_INPUTS_MAX_SIZE = 64000"
},
{
"id": "PRRC_kwDORAzJJ86j2NT6",
"user": "gemini-code-assist",
"body": "\n\n在 `if` 语句块中定义函数虽然在 Python 中是允许的,但通常不被认为是最佳实践,因为它会降低代码的可读性。一个更清晰的模式是先在外部定义所有可能的缩减函数,然后根据条件将它们添加到一个列表中。这样可以使逻辑更清晰,并将函数定义与使用分开。\n\n```python\n def drop_diff():\n logger.info(f\"Dropping diff_content to reduce size (was {len(ctx.diff_content)} chars)\")\n ctx.diff_content = None\n\n def reduce_comments():\n original_count = len(ctx.comments_history)\n ctx.comments_history = ctx.comments_history[-5:]\n logger.info(f\"Reducing comments history from {original_count} to 5 items\")\n\n def drop_review_comments():\n logger.info(f\"Dropping review_comments_batch (was {len(ctx.review_comments_batch)} items)\")\n ctx.review_comments_batch = None\n\n def drop_reviews_history():\n logger.info(f\"Dropping reviews_history (was {len(ctx.reviews_history)} items)\")\n ctx.reviews_history = None\n\n reduction_steps = []\n \n # 1. 优先丢弃 diff_content(diff 通常很大)\n if ctx.diff_content:\n reduction_steps.append(drop_diff)\n\n # 2. 截断评论历史(只保留最近 5 条)\n if ctx.comments_history:\n reduction_steps.append(reduce_comments)\n\n # 3. 丢弃 review_comments_batch\n if ctx.review_comments_batch:\n reduction_steps.append(drop_review_comments)\n\n # 4. 丢弃 reviews_history\n if ctx.reviews_history:\n reduction_steps.append(drop_reviews_history)\n```",
"path": "server.py",
"diff_hunk": "@@ -951,28 +951,72 @@ async def trigger_workflow(client: httpx.AsyncClient, ctx: TaskContext, task_tex\n logger.info(f\" ReviewComment[{i}]: @{comment.get('user')} - {comment.get('path')}: {comment.get('body', '')[:50]}...\")\n if ctx.is_truncated is not None:\n logger.info(f\"Context was truncated: {ctx.is_truncated}\")\n- \n+\n # 检查是否有重复/空字段\n logger.info(f\"diff_content present: {bool(ctx.diff_content)}\")\n logger.info(f\"clone_url: {ctx.clone_url}\")\n logger.info(f\"head_ref: {ctx.head_ref}, base_ref: {ctx.base_ref}\")\n logger.info(f\"head_repo: {ctx.head_repo}, base_repo: {ctx.base_repo}\")\n- \n+\n # 记录任务描述\n logger.info(f\"LLM_TASK to send: '{task_text[:200]}{'...' if len(task_text) > 200 else ''}'\")\n- \n- if len(context_str) > 60000: # GitHub限制\n- logger.warning(f\"Context too large ({len(context_str)} chars), truncating...\")\n- # 简化上下文\n- ctx.diff_content = \"[Diff truncated due to size limits]\"\n- if ctx.comments_history and len(ctx.comments_history) > 10:\n- logger.info(f\"Reducing comments history from {len(ctx.comments_history)} to 10 items\")\n- ctx.comments_history = ctx.comments_history[-10:] # 只保留最近10条\n- context_str = ctx.to_json_string()\n+\n+ # GitHub Actions inputs 限制为 64KB (65536 字节)\n+ GITHUB_INPUTS_MAX_SIZE = 64000\n+\n+ # 按优先级丢弃数据,直到上下文大小满足要求\n+ if len(context_str) > GITHUB_INPUTS_MAX_SIZE:\n+ logger.warning(f\"Context too large ({len(context_str)} chars), attempting to reduce...\")\n+\n+ reduction_steps = []\n+\n+ # 1. 优先丢弃 diff_content(diff 通常很大)\n+ if ctx.diff_content:\n+ def drop_diff():\n+ logger.info(f\"Dropping diff_content to reduce size (was {len(ctx.diff_content)} chars)\")\n+ ctx.diff_content = None\n+ reduction_steps.append(drop_diff)\n+\n+ # 2. 截断评论历史(只保留最近 5 条)\n+ if ctx.comments_history:\n+ def reduce_comments():\n+ original_count = len(ctx.comments_history)\n+ ctx.comments_history = ctx.comments_history[-5:]\n+ logger.info(f\"Reducing comments history from {original_count} to 5 items\")\n+ reduction_steps.append(reduce_comments)\n+\n+ # 3. 丢弃 review_comments_batch\n+ if ctx.review_comments_batch:\n+ def drop_review_comments():\n+ logger.info(f\"Dropping review_comments_batch (was {len(ctx.review_comments_batch)} items)\")\n+ ctx.review_comments_batch = None\n+ reduction_steps.append(drop_review_comments)\n+\n+ # 4. 丢弃 reviews_history\n+ if ctx.reviews_history:\n+ def drop_reviews_history():\n+ logger.info(f\"Dropping reviews_history (was {len(ctx.reviews_history)} items)\")\n+ ctx.reviews_history = None\n+ reduction_steps.append(drop_reviews_history)"
}
],
"diff_content": "diff --git a/server.py b/server.py\nindex afc3b17..3095d9b 100644\n--- a/server.py\n+++ b/server.py\n@@ -937,7 +937,7 @@ async def trigger_workflow(client: httpx.AsyncClient, ctx: TaskContext, task_tex\n # 检查上下文大小\n context_str = ctx.to_json_string()\n logger.info(f\"Context size: {len(context_str)} chars\")\n- \n+\n # 调试信息\n if ctx.comments_history:\n logger.info(f\"Comments history: {len(ctx.comments_history)} items\")\n@@ -951,28 +951,72 @@ async def trigger_workflow(client: httpx.AsyncClient, ctx: TaskContext, task_tex\n logger.info(f\" ReviewComment[{i}]: @{comment.get('user')} - {comment.get('path')}: {comment.get('body', '')[:50]}...\")\n if ctx.is_truncated is not None:\n logger.info(f\"Context was truncated: {ctx.is_truncated}\")\n- \n+\n # 检查是否有重复/空字段\n logger.info(f\"diff_content present: {bool(ctx.diff_content)}\")\n logger.info(f\"clone_url: {ctx.clone_url}\")\n logger.info(f\"head_ref: {ctx.head_ref}, base_ref: {ctx.base_ref}\")\n logger.info(f\"head_repo: {ctx.head_repo}, base_repo: {ctx.base_repo}\")\n- \n+\n # 记录任务描述\n logger.info(f\"LLM_TASK to send: '{task_text[:200]}{'...' if len(task_text) > 200 else ''}'\")\n- \n- if len(context_str) > 60000: # GitHub限制\n- logger.warning(f\"Context too large ({len(context_str)} chars), truncating...\")\n- # 简化上下文\n- ctx.diff_content = \"[Diff truncated due to size limits]\"\n- if ctx.comments_history and len(ctx.comments_history) > 10:\n- logger.info(f\"Reducing comments history from {len(ctx.comments_history)} to 10 items\")\n- ctx.comments_history = ctx.comments_history[-10:] # 只保留最近10条\n- context_str = ctx.to_json_string()\n+\n+ # GitHub Actions inputs 限制为 64KB (65536 字节)\n+ GITHUB_INPUTS_MAX_SIZE = 64000\n+\n+ # 按优先级丢弃数据,直到上下文大小满足要求\n+ if len(context_str) > GITHUB_INPUTS_MAX_SIZE:\n+ logger.warning(f\"Context too large ({len(context_str)} chars), attempting to reduce...\")\n+\n+ reduction_steps = []\n+\n+ # 1. 优先丢弃 diff_content(diff 通常很大)\n+ if ctx.diff_content:\n+ def drop_diff():\n+ logger.info(f\"Dropping diff_content to reduce size (was {len(ctx.diff_content)} chars)\")\n+ ctx.diff_content = None\n+ reduction_steps.append(drop_diff)\n+\n+ # 2. 截断评论历史(只保留最近 5 条)\n+ if ctx.comments_history:\n+ def reduce_comments():\n+ original_count = len(ctx.comments_history)\n+ ctx.comments_history = ctx.comments_history[-5:]\n+ logger.info(f\"Reducing comments history from {original_count} to 5 items\")\n+ reduction_steps.append(reduce_comments)\n+\n+ # 3. 丢弃 review_comments_batch\n+ if ctx.review_comments_batch:\n+ def drop_review_comments():\n+ logger.info(f\"Dropping review_comments_batch (was {len(ctx.review_comments_batch)} items)\")\n+ ctx.review_comments_batch = None\n+ reduction_steps.append(drop_review_comments)\n+\n+ # 4. 丢弃 reviews_history\n+ if ctx.reviews_history:\n+ def drop_reviews_history():\n+ logger.info(f\"Dropping reviews_history (was {len(ctx.reviews_history)} items)\")\n+ ctx.reviews_history = None\n+ reduction_steps.append(drop_reviews_history)\n+\n+ for step in reduction_steps:\n+ if len(context_str) <= GITHUB_INPUTS_MAX_SIZE:\n+ break\n+ step()\n+ context_str = ctx.to_json_string()\n+ logger.info(f\"Context size after reduction: {len(context_str)} chars\")\n+\n+ # 最终检查\n+ if len(context_str) > GITHUB_INPUTS_MAX_SIZE:\n+ logger.error(f\"Context still too large ({len(context_str)} chars) after all reductions. 
Workflow dispatch may fail.\")\n+ # 继续尝试,让 GitHub 返回错误,而不是在这里失败\n+ else:\n+ logger.info(f\"Context reduced to acceptable size: {len(context_str)} chars\")\n \n url = f\"{REST_API}/repos/{CONTROL_REPO}/actio",
"diff_url": "https://api.github.com/repos/WhiteElephant-abc/agent/pulls/182.diff",
"clone_url": "[email protected]:WhiteElephantIsNotARobot/agent.git",
"head_ref": "fix-context-size-limit",
"base_ref": "main",
"head_repo": "WhiteElephantIsNotARobot/agent:fix-context-size-limit",
"base_repo": "WhiteElephant-abc/agent:main",
"current_comment_id": "IC_kwDORAzJJ87kIxze",
"is_mention_in_body": false,
"is_truncated": false
}
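
For reference, the design the context above converges on is a priority-ordered list of reduction steps applied in a loop until the serialized context fits under the `workflow_dispatch` inputs limit. The sketch below is a minimal, self-contained rendering of that idea; the `TaskContext` dataclass here is a hypothetical stand-in modeling only the four fields the strategy touches, and `reduce_context` is an illustrative name, not the repository's actual API.

```python
from __future__ import annotations

import json
import logging
from dataclasses import asdict, dataclass

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# The PR uses 64000 as a safe margin under the 64KB (65536-byte) inputs limit.
GITHUB_INPUTS_MAX_SIZE = 64000


@dataclass
class TaskContext:
    # Hypothetical stand-in for the repository's TaskContext: only the
    # fields involved in the reduction strategy are modeled here.
    diff_content: str | None = None
    comments_history: list | None = None
    review_comments_batch: list | None = None
    reviews_history: list | None = None

    def to_json_string(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False)


def reduce_context(ctx: TaskContext) -> str:
    """Drop context fields by priority until the serialized form fits."""

    def drop_diff() -> None:
        logger.info("Dropping diff_content (%d chars)", len(ctx.diff_content))
        ctx.diff_content = None

    def reduce_comments() -> None:
        logger.info("Keeping last 5 of %d comments", len(ctx.comments_history))
        ctx.comments_history = ctx.comments_history[-5:]

    def drop_review_comments() -> None:
        ctx.review_comments_batch = None

    def drop_reviews_history() -> None:
        ctx.reviews_history = None

    # List order encodes the priority: diff first, reviews history last.
    # Each step is paired with a predicate so empty fields are skipped.
    steps = [
        (lambda: bool(ctx.diff_content), drop_diff),
        (lambda: bool(ctx.comments_history), reduce_comments),
        (lambda: bool(ctx.review_comments_batch), drop_review_comments),
        (lambda: bool(ctx.reviews_history), drop_reviews_history),
    ]

    context_str = ctx.to_json_string()
    for present, step in steps:
        if len(context_str) <= GITHUB_INPUTS_MAX_SIZE:
            break
        if present():
            step()
            context_str = ctx.to_json_string()

    if len(context_str) > GITHUB_INPUTS_MAX_SIZE:
        logger.error("Context still too large (%d chars); dispatch may fail.",
                     len(context_str))
    return context_str
```

Unlike the merged diff, which appends a step to the list only when its field is non-empty at setup time, this sketch pairs each step with a predicate evaluated inside the loop; the two are equivalent here, since nothing repopulates a field once it is dropped.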