Issue
Problem Description
I've analyzed the source code in `aider/run_cmd.py` and `aider/commands.py`. Currently, `run_cmd_subprocess` captures every character into `combined_output`, and `cmd_run` adds the entire output to the chat context if the user confirms or if a test fails.
In large-scale projects such as TiDB (a distributed database), a single test run can generate thousands of lines of logs. This leads to several issues:
- Context Bloat: Thousands of lines of successful test logs are captured, consuming valuable context space even when only a few lines of errors matter.
- Poor Signal-to-Noise Ratio: When tests fail, the actual error messages are buried under massive amounts of "Checking..." or "Downloading..." log lines, making it difficult for the LLM to identify the root cause.
- Token Waste: The current "all-or-nothing" approach forces users to either add thousands of tokens of junk or skip the output entirely, hindering the agent's ability to fix bugs.
Proposed Solution
Implement an intelligent output filtering/summarization mechanism (a sketch of the filter follows this list):
- Semantic Filtering: Use regex patterns to extract only error/warning/failure blocks (e.g., Rust `error[E...]`, Go `panic:`, or Python tracebacks).
- Log Tail/Truncation: If the output exceeds a certain threshold (e.g., 2000 tokens), automatically keep only the first N lines and last M lines.
- Smart Summarization: For successful runs, capture only a summary such as "X tests passed" instead of the full log.
- Configurable: Allow a `--full` flag for users who actually need the complete output.
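
A minimal sketch of what such a filter could look like, combining regex-based error extraction with head/tail truncation. The function name `process_output`, the pattern list, and the line thresholds here are all illustrative assumptions, not existing aider API:

```python
import re

# Illustrative only: the pattern list, thresholds, and the name
# process_output() are assumptions for this sketch, not aider's API.
ERROR_PATTERNS = [
    re.compile(r"error\[E\d+\]"),                        # Rust compiler errors
    re.compile(r"^panic:"),                              # Go panics
    re.compile(r"Traceback \(most recent call last\)"),  # Python tracebacks
]

def process_output(output: str, head: int = 50, tail: int = 100) -> str:
    """Keep matched error lines plus the head and tail of an oversized log."""
    lines = output.splitlines()
    if len(lines) <= head + tail:
        return output  # short logs pass through untouched

    # Extract middle lines that match a known error signature; everything
    # else in the middle is replaced by a truncation marker.
    middle = lines[head:-tail]
    errors = [ln for ln in middle if any(p.search(ln) for p in ERROR_PATTERNS)]
    skipped = len(middle) - len(errors)
    marker = [f"... [{skipped} lines omitted] ..."]
    return "\n".join(lines[:head] + errors + marker + lines[-tail:])
```

The issue proposes a token-based threshold (~2000 tokens); the sketch uses line counts for simplicity, since measuring tokens would require pulling in the active model's tokenizer.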
Implementation Reference
Based on `aider/commands.py`:
- In `cmd_run`, instead of adding `combined_output` directly, we could introduce a `process_output()` function to refine the string before it hits `self.coder.cur_messages` (see the call-site sketch below).
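
A hedged sketch of that call site, reusing `process_output()` from above. The names `cmd_run`, `combined_output`, and `self.coder.cur_messages` come from the issue text; the wrapper function, the message shapes, and the `full` parameter (modeling the proposed `--full` escape hatch) are illustrative, not a verbatim patch:

```python
# Illustrative call site -- not a verbatim diff against aider/commands.py.
def add_output_to_chat(coder, combined_output: str, full: bool = False) -> None:
    """What cmd_run could do instead of appending combined_output verbatim."""
    content = combined_output if full else process_output(combined_output)
    coder.cur_messages += [
        dict(role="user", content=content),
        dict(role="assistant", content="Ok."),
    ]
```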
Version and model info
No response