
Feature Request: Intelligent CI/CD Output Filtering for /run and /test Commands #4730

@blueberrycongee


Problem Description

I've analyzed the source code in aider/run_cmd.py and aider/commands.py. Currently, run_cmd_subprocess captures every character into combined_output, and cmd_run adds the entire output to the chat context if the user confirms or if a test fails.

In large projects such as TiDB (a distributed database), a single test run can generate thousands of lines of logs. This causes several problems:

  1. Context Bloat: Thousands of lines of successful test logs are captured, consuming valuable context space even when only a few lines of errors matter.
  2. Signal-to-Noise Ratio: When tests fail, the actual error messages are buried within massive amounts of "Checking..." or "Downloading..." logs, making it difficult for the LLM to identify the root cause.
  3. Token Waste: The current "all-or-nothing" approach forces users to either add thousands of tokens of irrelevant output or skip the output entirely, which hinders the agent's ability to fix bugs.

Proposed Solution

Implement an intelligent output filtering/summarization mechanism:

  • Semantic Filtering: Use regex patterns to extract only error/warning/failure blocks (e.g., for Rust error[E...], Go panic:, or Python tracebacks).
  • Log Tail/Truncation: If the output exceeds a certain threshold (e.g., 2000 tokens), automatically keep only the first N lines and last M lines.
  • Smart Summarization: For successful runs, only capture a summary like "X tests passed" instead of the full log.
  • Configurable: Allow a --full flag for users who actually need the complete output.
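The first two bullets could be sketched roughly as follows. This is a standalone illustration, not aider code: `ERROR_PATTERNS`, `extract_error_blocks`, and `head_tail` are hypothetical names, the regexes are assumptions that would need tuning per toolchain, and the threshold is counted in lines rather than tokens to keep the sketch dependency-free.

```python
import re

# Hypothetical patterns for illustration; real toolchains need more tuning.
ERROR_PATTERNS = [
    re.compile(r"^error(\[E\d+\])?:"),                      # Rust compiler errors
    re.compile(r"^panic:"),                                 # Go panics
    re.compile(r"^Traceback \(most recent call last\):"),   # Python tracebacks
]


def extract_error_blocks(output: str, context: int = 2) -> str:
    """Keep lines matching an error pattern, plus `context` lines on each side."""
    lines = output.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if any(p.search(line) for p in ERROR_PATTERNS):
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            keep.update(range(lo, hi))

    result = []
    prev = None
    for i in sorted(keep):
        # Mark gaps between non-adjacent kept regions.
        if prev is not None and i != prev + 1:
            result.append("[...]")
        result.append(lines[i])
        prev = i
    return "\n".join(result)


def head_tail(output: str, head: int = 20, tail: int = 50) -> str:
    """Keep the first `head` and last `tail` lines when the output is longer."""
    lines = output.splitlines()
    if len(lines) <= head + tail:
        return output
    omitted = len(lines) - head - tail
    marker = f"[... {omitted} lines omitted ...]"
    return "\n".join(lines[:head] + [marker] + lines[len(lines) - tail:])
```

A larger `tail` than `head` is a deliberate default here, on the assumption that most test runners print failures and summaries near the end of the log.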

Implementation Reference

Based on aider/commands.py:

  • In cmd_run, instead of adding combined_output directly, we could introduce a process_output() function to refine the string before it hits self.coder.cur_messages.
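One possible shape for such a `process_output()` function is sketched below. Everything beyond the function name is an assumption: the `exit_status`, `full`, and `max_lines` parameters, the `FAILURE_RE` pattern, and the summary wording are hypothetical, and how `cmd_run` would call it is not shown because that depends on aider's internals.

```python
import re

# Hypothetical failure markers; not taken from aider.
FAILURE_RE = re.compile(
    r"^error(\[E\d+\])?:|^panic:|^Traceback \(most recent call last\):|\bFAILED\b"
)


def process_output(
    combined_output: str,
    exit_status: int,
    full: bool = False,
    max_lines: int = 200,
) -> str:
    """Reduce command output before it is added to the chat context."""
    if full:
        # Assumed equivalent of the proposed --full flag: no filtering at all.
        return combined_output

    lines = combined_output.splitlines()

    if exit_status == 0:
        # Successful run: keep only the last non-empty line, where many
        # test runners print their summary.
        last = next((ln for ln in reversed(lines) if ln.strip()), "")
        return f"Command succeeded; {len(lines)} lines of output summarized. Last line: {last}"

    if len(lines) <= max_lines:
        return combined_output

    # Failed run over the threshold: start at the first failure marker,
    # or fall back to the tail of the log if no marker is found.
    start = next(
        (i for i, ln in enumerate(lines) if FAILURE_RE.search(ln)),
        len(lines) - max_lines,
    )
    kept = lines[start:start + max_lines]
    parts = [f"[{start} lines omitted]"] + kept
    remaining = len(lines) - (start + len(kept))
    if remaining > 0:
        parts.append(f"[{remaining} lines omitted]")
    return "\n".join(parts)
```

In this sketch the caller would pass the refined string, rather than the raw `combined_output`, on to the chat context.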

Version and model info

No response
