Skip to content

Conversation

@nagai9999
Copy link

@nagai9999 nagai9999 commented Feb 12, 2026

Issue: #2211

PR Type

Enhancement, Tests


Description

  • Extract diagram sanitization logic into dedicated function

  • Remove backticks from node labels in mermaid diagrams

  • Add comprehensive unit tests for diagram sanitization

  • Handle missing closing fence and invalid diagram formats


Diagram Walkthrough

flowchart LR
  input["Raw diagram input"] -- "sanitize_diagram()" --> validation["Validate fence markers"]
  validation -- "Fix missing closing" --> cleanup["Remove backticks from labels"]
  cleanup -- "Return sanitized" --> output["Sanitized diagram string"]
Loading

File Walkthrough

Relevant files
Enhancement
pr_description.py
Extract and enhance diagram sanitization logic                     

pr_agent/tools/pr_description.py

  • Refactored diagram sanitization logic into new sanitize_diagram()
    function
  • Function validates fence markers, fixes missing closing backticks, and
    removes backticks from node labels
  • Updated _prepare_data() to use new sanitization function instead of
    inline logic
  • Returns empty string for invalid diagrams that don't start with fence
    markers
+19/-5   
Tests
test_pr_description.py
Add unit tests for diagram sanitization                                   

tests/unittest/test_pr_description.py

  • Added new test file with comprehensive unit tests for diagram
    sanitization
  • Tests cover invalid diagrams without fence markers, missing closing
    fences, and backtick removal
  • Tests verify backticks are only removed from node labels, not from
    edge labels
  • Tests confirm normal diagrams are processed correctly with newline
    prefix
+69/-0   

@nagai9999 nagai9999 marked this pull request as ready for review February 12, 2026 08:05
@qodo-free-for-open-source-projects
Copy link
Contributor

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Markdown injection

Description: sanitize_diagram() only checks that input starts with a fence and ensures it ends with a
fence, but does not prevent additional embedded sequences inside the diagram body, which
can prematurely close the code block and allow markdown/HTML injection (e.g., a diagram
containing
\n\n<script>...</script>\n could escape the fenced block in rendered PR descriptions depending
on downstream rendering).
pr_description.py [772-785]

Referred Code
def sanitize_diagram(diagram_raw: str) -> str:
    """Sanitize a diagram string: fix missing closing fence and remove backticks."""
    diagram = diagram_raw.strip()
    if not diagram.startswith('```'):
        return ''

    # fallback missing closing
    if not diagram.endswith('```'):
        diagram += '\n```'  

    # remove backticks inside node labels: ["`label`"] -> ["label"]
    lines = diagram.split('\n')
    result = [re.sub(r'\["([^"]*?)"\]', lambda m: '["' + m.group(1).replace('`', '') + '"]', line) for line in lines]
    return '\n' + '\n'.join(result)
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Consistent Naming Conventions

Objective: All new variables, functions, and classes must follow the project's established naming
standards

Status: Passed

Single Responsibility for Functions

Objective: Each function should have a single, well-defined responsibility

Status: Passed

When relevant, utilize early return

Objective: In a code snippet containing multiple logic conditions (such as 'if-else'), prefer an
early return on edge cases than deep nesting

Status: Passed

🔴
No Dead or Commented-Out Code

Objective: Keep the codebase clean by ensuring all submitted code is active and necessary

Status:
Unused import: The newly added pytest import is unused in the test module, introducing unnecessary dead
code.

Referred Code
import pytest
import yaml
Robust Error Handling

Objective: Ensure potential errors and edge cases are anticipated and handled gracefully throughout
the code

Status:
Input type safety: sanitize_diagram() calls .strip() on diagram_raw without guarding against non-string/None
inputs, which could raise an exception depending on upstream data types.

Referred Code
def sanitize_diagram(diagram_raw: str) -> str:
    """Sanitize a diagram string: fix missing closing fence and remove backticks."""
    diagram = diagram_raw.strip()
    if not diagram.startswith('```'):
        return ''
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-free-for-open-source-projects
Copy link
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix incorrect regex for sanitizing

Update the regex for sanitizing Mermaid node labels to correctly handle the
id["..."] format, as the current expression r'["([^"]*?)"]' fails to match
them.

pr_agent/tools/pr_description.py [782-785]

-# remove backticks inside node labels: ["`label`"] -> ["label"]
+# remove backticks inside node labels: A["`label`"] -> A["label"]
 lines = diagram.split('\n')
-result = [re.sub(r'\["([^"]*?)"\]', lambda m: '["' + m.group(1).replace('`', '') + '"]', line) for line in lines]
+result = [re.sub(r'(\w+\[")([^"]*)("\])', lambda m: m.group(1) + m.group(2).replace('`', '') + m.group(3), line) for line in lines]
 return '\n' + '\n'.join(result)
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a bug in the new sanitize_diagram function where the regex for node labels is wrong, and it provides a correct implementation that fixes the bug, which is critical for the feature to work as intended.

High
Learned
best practice
Add safe guards for missing values

Guard against diagram_raw being None or non-string before calling .strip() to
avoid AttributeError when the YAML value is missing or not a string.

pr_agent/tools/pr_description.py [772-785]

 def sanitize_diagram(diagram_raw: str) -> str:
     """Sanitize a diagram string: fix missing closing fence and remove backticks."""
+    if not isinstance(diagram_raw, str):
+        return ''
     diagram = diagram_raw.strip()
     if not diagram.startswith('```'):
         return ''
-    
+
     # fallback missing closing
     if not diagram.endswith('```'):
-        diagram += '\n```'  
-    
+        diagram += '\n```'
+
     # remove backticks inside node labels: ["`label`"] -> ["label"]
     lines = diagram.split('\n')
     result = [re.sub(r'\["([^"]*?)"\]', lambda m: '["' + m.group(1).replace('`', '') + '"]', line) for line in lines]
     return '\n' + '\n'.join(result)
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why:
Relevant best practice - When accessing attributes on values that may be missing/None, use safe access patterns (e.g., guard clauses) to prevent AttributeError/KeyError.

Low
  • More
  • Author self-review: I have reviewed the PR code suggestions, and addressed the relevant ones.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant