fix: clean up log analysis comments on GitHub issues #3925
devin-ai-integration[bot] wants to merge 2 commits into `main` from
Conversation
Co-Authored-By: john@hyprnote.com <john@hyprnote.com>
✅ Deploy Preview for hyprnote canceled.
✅ Deploy Preview for hyprnote-storybook canceled.
```rust
.strip_prefix("```")
.and_then(|s| {
    let s = s.strip_prefix('\n').unwrap_or(s);
    s.strip_suffix("```")
})
```
🟡 strip_code_fences fails to strip language identifier from fenced code blocks
The `strip_code_fences` function doesn't handle the common LLM pattern of a language-tagged fence such as ```` ```text\n...\n``` ```` or ```` ```json\n...\n``` ````. When the LLM returns a code block with a language identifier, the identifier leaks into the output.
Root Cause and Impact
The function does ```` strip_prefix("```") ````, which removes the opening triple backticks, leaving e.g. ```` "text\ncontent\n```" ````. Then `strip_prefix('\n')` fails because the string starts with `text`, not a newline. The `unwrap_or(s)` fallback keeps the full remainder. Finally ```` strip_suffix("```") ```` removes the closing fence, yielding `"text\ncontent"`: the language identifier `text` is now part of the content.
This result is then embedded at `crates/api-support/src/github.rs:136` inside ```` format!("### Summary\n```\n{s}\n```") ````, producing broken output like:
````
### Summary
```
text
ERROR: something went wrong
```
````
Since the whole purpose of this function is to clean up LLM output that wraps responses in code fences, and LLMs very commonly use language-tagged fences (```` ```text ````, ```` ```json ````, ```` ```markdown ````), this is a significant gap.
Fix: After ```` strip_prefix("```") ````, skip everything up to and including the first `\n` (not just a bare `\n` prefix).
Suggested change:

```rust
// before
.strip_prefix("```")
.and_then(|s| {
    let s = s.strip_prefix('\n').unwrap_or(s);
    s.strip_suffix("```")
})

// after
.strip_prefix("```")
.and_then(|s| {
    let s = match s.find('\n') {
        Some(pos) => &s[pos + 1..],
        None => s,
    };
    s.strip_suffix("```")
})
```
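To sanity-check the suggested change, here is a self-contained sketch of the fixed helper. The PR's actual signature and surrounding wrapper are not shown in the diff, so the function shape, trimming, and fallback behavior here are assumptions:

```rust
/// Sketch of the fixed `strip_code_fences` (wrapper shape assumed): after the
/// opening fence it skips everything up to and including the first newline,
/// so a language tag like `text` no longer leaks into the output.
fn strip_code_fences(input: &str) -> String {
    let trimmed = input.trim();
    trimmed
        .strip_prefix("```")
        .and_then(|s| {
            // Drop the language identifier (if any) along with the newline.
            let s = match s.find('\n') {
                Some(pos) => &s[pos + 1..],
                None => s,
            };
            s.strip_suffix("```")
        })
        .map(|s| s.trim().to_string())
        // Input without wrapping fences passes through unchanged.
        .unwrap_or_else(|| trimmed.to_string())
}
```

With this version, ```` strip_code_fences("```text\nERROR\n```") ```` yields `ERROR` rather than `text\nERROR`.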
```rust
if c == '\x1b' {
    if chars.peek() == Some(&'[') {
        chars.next();
        while let Some(&next) = chars.peek() {
            chars.next();
            if next.is_ascii_alphabetic() {
                break;
            }
        }
    }
}
```
Incomplete ANSI sequence handling: If a log contains a truncated ANSI escape sequence (e.g., `\x1b[31` with no terminating alphabetic character anywhere after it), the inner `while` loop at lines 21-26 consumes all remaining characters while searching for the terminating letter. This causes the trailing content to be silently dropped.
Fix: Add a safety limit to the inner loop or check for end-of-string:
```rust
if c == '\x1b' {
    if chars.peek() == Some(&'[') {
        chars.next();
        let mut count = 0;
        while let Some(&next) = chars.peek() {
            chars.next();
            if next.is_ascii_alphabetic() || count > 20 {
                break;
            }
            count += 1;
        }
    }
}
```
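Wrapped into a complete function (the name `strip_ansi_escapes` comes from the PR summary; the exact signature is an assumption), the bounded loop looks like this:

```rust
/// Strip CSI sequences (`ESC [` ... letter) from log text. The `count` cap
/// mirrors the suggested fix: a truncated sequence can consume at most ~20
/// characters instead of swallowing the rest of the input.
fn strip_ansi_escapes(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\x1b' {
            if chars.peek() == Some(&'[') {
                chars.next(); // consume '['
                let mut count = 0;
                while let Some(&next) = chars.peek() {
                    chars.next();
                    // Stop at the terminating letter, or give up after ~20
                    // parameter characters.
                    if next.is_ascii_alphabetic() || count > 20 {
                        break;
                    }
                    count += 1;
                }
            }
            // Orphan ESC bytes are dropped, as noted in the checklist.
        } else {
            out.push(c);
        }
    }
    out
}
```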
Spotted by Graphite Agent
Summary
Fixes three formatting issues with the log analysis comments that hyper-charlie bot attaches to user-submitted GitHub issues:
1. Broken code blocks from LLM output: Gemini sometimes wraps its response in ```` ``` ```` fences, which nest inside the existing code block wrapper and produce broken markdown. Added `strip_code_fences` to remove wrapping fences from the LLM response before embedding it.
2. ANSI escape sequences in raw logs: Logs from the desktop app can contain ANSI color/formatting codes (e.g. `\x1b[0m`, `\x1b[3m`) that render as garbled text on GitHub. Added `strip_ansi_escapes` to clean these before both the LLM analysis and the raw log tail. (ref: #3921 "Error Details")
3. HTML entity escaping (`&quot;` for `"`): Askama auto-escapes HTML entities by default, but these templates produce GitHub markdown, not HTML. Added `escape = "none"` to all three Askama markdown templates.

Review & Testing Checklist for Human
- `escape = "none"` on all templates: This disables HTML escaping for the `bug_report`, `feature_request`, and `log_analysis` templates. Since GitHub handles its own markdown sanitization this should be safe, but verify there's no injection concern with user-controlled `description` fields.
- `strip_code_fences` edge cases: Only handles an exact ```` ``` ```` prefix/suffix. If the LLM returns ```` ```text\n...\n``` ````, the language identifier (`text`) will leak into the output. Consider whether this matters in practice.
- `strip_ansi_escapes` coverage: Only strips CSI sequences (`ESC [` ... letter). Other escape types (OSC, single-char escapes) are not handled; orphan `ESC` bytes are silently dropped. Likely sufficient for typical Tauri log output.
- …", no ANSI artifacts.

Notes
…`strip_code_fences` or `strip_ansi_escapes` functions.
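For reference, the `escape = "none"` change discussed in the checklist corresponds to an Askama template attribute like the following. The template source and field names here are illustrative, not the PR's actual templates:

```rust
use askama::Template;

// Hypothetical markdown template. `escape = "none"` turns off Askama's
// default HTML-entity escaping, so a `"` in `description` is emitted
// verbatim instead of as `&quot;`.
#[derive(Template)]
#[template(
    source = "### Summary\n{{ description }}",
    ext = "md",
    escape = "none"
)]
struct LogAnalysisComment<'a> {
    description: &'a str,
}
```

This is a configuration sketch (it requires the `askama` crate); the attribute is the relevant change, and GitHub's own markdown sanitization handles the rendered output.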