
fix(api): refactors the SQL LIKE pattern escaping logic to use a centralized utility function, ensuring consistent and secure handling of special characters across all database queries. #137

Open
tomerqodo wants to merge 4 commits into qodo_combined_20260121_qodo_grep_cursor_copilot_1_base_fixapi_refactors_the_sql_like_pattern_escaping_logic_to_use_a_centralized__utility_function_ensuring_consistent_and_secure_handling_of_special_ch from qodo_combined_20260121_qodo_grep_cursor_copilot_1_head_fixapi_refactors_the_sql_like_pattern_escaping_logic_to_use_a_centralized__utility_function_ensuring_consistent_and_secure_handling_of_special_ch

Conversation

@tomerqodo

Benchmark PR from qodo-benchmark#431

NeatGuyCoding and others added 4 commits January 21, 2026 15:54
…ralized

utility function, ensuring consistent and secure handling of special characters
across all database queries.

Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
…logic

Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
@qodo-code-review

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (4) 📎 Requirement gaps (0)



Action required

1. escapedPattern violates snake_case 📘 Rule violation ✓ Correctness
Description
• escape_like_pattern() introduces a local variable named escapedPattern using camelCase,
  violating the required Python naming conventions.
• This reduces consistency/readability and is likely to be flagged by repository lint rules.
• Rename the variable to escaped_pattern to comply with the standard.
Code

api/libs/helper.py[R64-65]

+    escapedPattern = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
+    return escapedPattern
Evidence
The checklist requires snake_case for Python variables/functions. The new helper function uses
escapedPattern, which is camelCase and therefore violates the naming convention requirement.

AGENTS.md
api/libs/helper.py[61-65]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`escape_like_pattern()` uses a camelCase local variable (`escapedPattern`), which violates required Python naming conventions.

## Issue Context
Repository standards require snake_case for variables and functions.

## Fix Focus Areas
- api/libs/helper.py[61-65]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
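
A minimal sketch of the rename, assuming the rest of the helper stays exactly as quoted in the diff above. Only the local variable name changes; the escaping logic and the `str -> str` signature are left as the PR has them.

```python
def escape_like_pattern(pattern: str) -> str:
    """Escape LIKE wildcards; same logic as the PR, only the local name differs."""
    if not pattern:
        return pattern
    # Escape backslash first, then percent and underscore
    escaped_pattern = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escaped_pattern
```

Because the change is purely a rename, the behaviour shown in the helper's docstring examples should be unaffected.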


2. IRIS LIKE escape mismatch 📘 Rule violation ⛨ Security
Description
• iris_vector.search_by_full_text() builds query_pattern using escape_like_pattern()
  (backslash-based escaping) but the SQL uses ESCAPE '|'.
• Because the escape character in the SQL does not match the escape sequences produced by the
  helper, %/_ may still behave as wildcards, defeating the intended sanitization and causing
  incorrect matches.
• Align the escaping strategy by using the same escape character in both the pattern escaping and
  the SQL ESCAPE clause.
Code

api/core/rag/datasource/vdb/iris/iris_vector.py[R290-300]

+                # Escape special characters for LIKE clause to prevent SQL injection
+                from libs.helper import escape_like_pattern
+
+                escaped_query = escape_like_pattern(query)
+                query_pattern = f"%{escaped_query}%"
                sql = f"""
                    SELECT TOP {top_k} id, text, meta
                    FROM {self.schema}.{self.table_name}
-                    WHERE text LIKE ?
+                    WHERE text LIKE ? ESCAPE '|'
                """
                cursor.execute(sql, (query_pattern,))
Evidence
The new code relies on escape_like_pattern() to escape LIKE wildcards, but the SQL explicitly sets
a different escape character (|). This mismatch means the escaping may not be applied as intended,
undermining injection/wildcard protections required by the checklist.

Rule 6: Generic: Security-First Input Validation and Data Handling
api/core/rag/datasource/vdb/iris/iris_vector.py[290-300]
api/libs/helper.py[35-45]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
IRIS fallback LIKE query uses `ESCAPE '|'` while `escape_like_pattern()` escapes using backslashes, so wildcard escaping may not work.

## Issue Context
`escape_like_pattern()` is documented to be used with `ESCAPE '\\'`. The IRIS SQL currently uses a different escape character.

## Fix Focus Areas
- api/core/rag/datasource/vdb/iris/iris_vector.py[290-300]
- api/libs/helper.py[35-65]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
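
The effect of a mismatched escape character can be reproduced outside IRIS. The sketch below uses an in-memory SQLite database purely as a stand-in (an assumption; IRIS itself is not exercised here), and the `docs` table, the sample rows, and the `search` function are hypothetical names introduced for illustration.

```python
import sqlite3


def escape_like_pattern(pattern: str) -> str:
    # Backslash-based escaping, mirroring the helper quoted in this review
    return pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search(conn: sqlite3.Connection, query: str, escape_char: str) -> list[str]:
    # escape_char is interpolated only because it is a fixed literal in this demo;
    # the user-supplied query is always passed as a bound parameter.
    sql = f"SELECT text FROM docs WHERE text LIKE ? ESCAPE '{escape_char}' ORDER BY text"
    pattern = f"%{escape_like_pattern(query)}%"
    return [row[0] for row in conn.execute(sql, (pattern,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (text TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?)",
    [("50% discount",), ("500 items",), ("50\\ backslash",)],
)

mismatched = search(conn, "50%", "|")   # escape char differs from the helper's
matched = search(conn, "50%", "\\")     # escape char agrees with the helper's
```

In this SQLite stand-in, the mismatched query treats the backslash as a literal character and the `%` after it as a wildcard, so it returns the row containing `50\` and misses `50% discount`. With a matching escape character, only `50% discount` is returned.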


3. New tests lack type hints 📘 Rule violation ✓ Correctness
Description
• Newly added pytest test functions/methods omit type annotations for parameters and return values.
• This violates the requirement that all Python function definitions include parameter and return
  type annotations (as enforced by basedpyright).
• Add explicit annotations (use Any for pytest fixtures when concrete types are impractical) and
  -> None return types for tests.
Code

api/tests/test_containers_integration_tests/services/test_annotation_service.py[R447-449]

+    def test_get_annotation_list_by_app_id_with_special_characters_in_keyword(
+        self, db_session_with_containers, mock_external_service_dependencies
+    ):
Evidence
The checklist requires type annotations on all Python function definitions. The newly introduced
test functions are missing parameter and return type annotations, which should fail the stated
success criteria.

AGENTS.md
api/tests/test_containers_integration_tests/services/test_annotation_service.py[447-449]
api/tests/unit_tests/libs/test_helper.py[71-74]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
New pytest test functions/methods are missing type annotations for parameters and return values.

## Issue Context
The project requires type annotations for all Python function definitions to satisfy basedpyright.

## Fix Focus Areas
- api/tests/test_containers_integration_tests/services/test_annotation_service.py[447-449]
- api/tests/unit_tests/libs/test_helper.py[68-113]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Helper type contract broken 🐞 Bug ✓ Correctness
Description
• escape_like_pattern is annotated as str -> str, but it returns the input as-is for falsy
  values, meaning it can return None.
• Unit tests explicitly call escape_like_pattern(None), which will be flagged by basedpyright.
• The repo’s make type-check runs basedpyright, so this can break CI even if runtime tests pass.
Code

api/libs/helper.py[R35-65]

+def escape_like_pattern(pattern: str) -> str:
+    """
+    Escape special characters in a string for safe use in SQL LIKE patterns.
+
+    This function escapes the special characters used in SQL LIKE patterns:
+    - Backslash (\\) -> \\
+    - Percent (%) -> \\%
+    - Underscore (_) -> \\_
+
+    The escaped pattern can then be safely used in SQL LIKE queries with the
+    ESCAPE '\\' clause to prevent SQL injection via LIKE wildcards.
+
+    Args:
+        pattern: The string pattern to escape
+
+    Returns:
+        Escaped string safe for use in SQL LIKE queries
+
+    Examples:
+        >>> escape_like_pattern("50% discount")
+        '50\\% discount'
+        >>> escape_like_pattern("test_data")
+        'test\\_data'
+        >>> escape_like_pattern("path\\to\\file")
+        'path\\\\to\\\\file'
+    """
+    if not pattern:
+        return pattern
+    # Escape backslash first, then percent and underscore
+    escapedPattern = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
+    return escapedPattern
Evidence
The helper’s signature promises str input/output, but it can return None due to
`if not pattern: return pattern`. A new unit test calls it with `None`, and the Makefile shows
basedpyright is used for type-checking, which will report the mismatch.

api/libs/helper.py[35-65]
api/tests/unit_tests/libs/test_helper.py[96-101]
Makefile[68-71]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`escape_like_pattern` is typed as `str -> str` but can return `None`, and tests call it with `None`. This will likely fail `basedpyright` during `make type-check`.

### Issue Context
The function currently returns `pattern` on falsy input. That includes `None`, violating the current type annotations.

### Fix Focus Areas
- api/libs/helper.py[35-65]
- api/tests/unit_tests/libs/test_helper.py[68-113]

### Suggested fix
- Update signature to `def escape_like_pattern(pattern: str | None) -> str | None:`.
- (Optional) Rename `escapedPattern` to `escaped_pattern` for style consistency.
- Ensure tests and all call sites align with the new signature (or, alternatively, remove the None test and raise on None).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

5. notlike() missing escape 📘 Rule violation ⛨ Security
Description
• In process_metadata_filter_func(), the not contains branch uses
  json_field.notlike(f"%{escaped_value}%") without specifying escape="\\".
• Because escaped_value contains backslash escapes, omitting the explicit escape clause can
  cause the DB to treat %/_ as wildcards (depending on backend), weakening the intended input
  sanitization.
• Add escape="\\" to the notlike() call for parity with the other LIKE cases.
Code

api/core/rag/retrieval/dataset_retrieval.py[R1205-1207]

            case "not contains":
-                filters.append(json_field.notlike(f"%{value}%"))
+                escaped_value = escape_like_pattern(str(value))
+                filters.append(json_field.notlike(f"%{escaped_value}%"))
Evidence
The rule requires preventing injection/wildcard abuse via proper sanitization. The code escapes LIKE
metacharacters with backslashes but does not always tell the database to interpret backslashes as
the escape character, which can make the escaping ineffective.

Rule 6: Generic: Security-First Input Validation and Data Handling
api/core/rag/retrieval/dataset_retrieval.py[1201-1208]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `not contains` metadata filter uses backslash-escaped patterns but does not specify the escape character for `NOT LIKE`, which can make escaping ineffective.

## Issue Context
Other branches (`contains`, `start with`, `end with`) pass `escape="\\"`, but `not contains` does not.

## Fix Focus Areas
- api/core/rag/retrieval/dataset_retrieval.py[1201-1208]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
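
The backend dependence mentioned above can be illustrated with SQLite, which has no default LIKE escape character. This is a stand-in sketch rather than the project's SQLAlchemy code: the `meta` table and its rows are hypothetical, and backends that treat backslash as the default escape character may behave differently. In the PR itself, the corresponding change would be passing `escape="\\"` to `notlike()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (value TEXT)")
conn.executemany("INSERT INTO meta VALUES (?)", [("50% off",), ("plain",)])

# Backslash-escaped pattern for the user value "50%", as the helper would produce
pattern = "%50\\%%"

# No ESCAPE clause: the backslash is matched literally, so "50% off" is not excluded
without_escape = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM meta WHERE value NOT LIKE ? ORDER BY value", (pattern,)
    )
]

# Explicit ESCAPE '\': the pattern means a literal "50%", so "50% off" is excluded
with_escape = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM meta WHERE value NOT LIKE ? ESCAPE '\\' ORDER BY value",
        (pattern,),
    )
]
```

In this stand-in, the query without an `ESCAPE` clause returns both rows, so a "not contains 50%" filter would wrongly keep `50% off`; with the explicit escape it returns only `plain`.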



ⓘ The new review experience is currently in Beta. Learn more


Comment on lines +64 to +65
+    escapedPattern = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
+    return escapedPattern


Action required

1. escapedPattern violates snake_case 📘 Rule violation ✓ Correctness

• escape_like_pattern() introduces a local variable named escapedPattern using camelCase,
  violating the required Python naming conventions.
• This reduces consistency/readability and is likely to be flagged by repository lint rules.
• Rename the variable to escaped_pattern to comply with the standard.
Agent prompt
## Issue description
`escape_like_pattern()` uses a camelCase local variable (`escapedPattern`), which violates required Python naming conventions.

## Issue Context
Repository standards require snake_case for variables and functions.

## Fix Focus Areas
- api/libs/helper.py[61-65]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines +290 to 300
+                # Escape special characters for LIKE clause to prevent SQL injection
+                from libs.helper import escape_like_pattern
+
+                escaped_query = escape_like_pattern(query)
+                query_pattern = f"%{escaped_query}%"
                 sql = f"""
                     SELECT TOP {top_k} id, text, meta
                     FROM {self.schema}.{self.table_name}
-                    WHERE text LIKE ?
+                    WHERE text LIKE ? ESCAPE '|'
                 """
                 cursor.execute(sql, (query_pattern,))


Action required

2. IRIS LIKE escape mismatch 📘 Rule violation ⛨ Security

• iris_vector.search_by_full_text() builds query_pattern using escape_like_pattern()
  (backslash-based escaping) but the SQL uses ESCAPE '|'.
• Because the escape character in the SQL does not match the escape sequences produced by the
  helper, %/_ may still behave as wildcards, defeating the intended sanitization and causing
  incorrect matches.
• Align the escaping strategy by using the same escape character in both the pattern escaping and
  the SQL ESCAPE clause.
Agent prompt
## Issue description
IRIS fallback LIKE query uses `ESCAPE '|'` while `escape_like_pattern()` escapes using backslashes, so wildcard escaping may not work.

## Issue Context
`escape_like_pattern()` is documented to be used with `ESCAPE '\\'`. The IRIS SQL currently uses a different escape character.

## Fix Focus Areas
- api/core/rag/datasource/vdb/iris/iris_vector.py[290-300]
- api/libs/helper.py[35-65]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines 1205 to +1207
             case "not contains":
-                filters.append(json_field.notlike(f"%{value}%"))
+                escaped_value = escape_like_pattern(str(value))
+                filters.append(json_field.notlike(f"%{escaped_value}%"))


Remediation recommended

3. notlike() missing escape 📘 Rule violation ⛨ Security

• In process_metadata_filter_func(), the not contains branch uses
  json_field.notlike(f"%{escaped_value}%") without specifying escape="\\".
• Because escaped_value contains backslash escapes, omitting the explicit escape clause can
  cause the DB to treat %/_ as wildcards (depending on backend), weakening the intended input
  sanitization.
• Add escape="\\" to the notlike() call for parity with the other LIKE cases.
Agent prompt
## Issue description
The `not contains` metadata filter uses backslash-escaped patterns but does not specify the escape character for `NOT LIKE`, which can make escaping ineffective.

## Issue Context
Other branches (`contains`, `start with`, `end with`) pass `escape="\\"`, but `not contains` does not.

## Fix Focus Areas
- api/core/rag/retrieval/dataset_retrieval.py[1201-1208]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines +447 to +449
+    def test_get_annotation_list_by_app_id_with_special_characters_in_keyword(
+        self, db_session_with_containers, mock_external_service_dependencies
+    ):


Action required

4. New tests lack type hints 📘 Rule violation ✓ Correctness

• Newly added pytest test functions/methods omit type annotations for parameters and return values.
• This violates the requirement that all Python function definitions include parameter and return
  type annotations (as enforced by basedpyright).
• Add explicit annotations (use Any for pytest fixtures when concrete types are impractical) and
  -> None return types for tests.
Agent prompt
## Issue description
New pytest test functions/methods are missing type annotations for parameters and return values.

## Issue Context
The project requires type annotations for all Python function definitions to satisfy basedpyright.

## Fix Focus Areas
- api/tests/test_containers_integration_tests/services/test_annotation_service.py[447-449]
- api/tests/unit_tests/libs/test_helper.py[68-113]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines +35 to +65
+def escape_like_pattern(pattern: str) -> str:
+    """
+    Escape special characters in a string for safe use in SQL LIKE patterns.
+
+    This function escapes the special characters used in SQL LIKE patterns:
+    - Backslash (\\) -> \\
+    - Percent (%) -> \\%
+    - Underscore (_) -> \\_
+
+    The escaped pattern can then be safely used in SQL LIKE queries with the
+    ESCAPE '\\' clause to prevent SQL injection via LIKE wildcards.
+
+    Args:
+        pattern: The string pattern to escape
+
+    Returns:
+        Escaped string safe for use in SQL LIKE queries
+
+    Examples:
+        >>> escape_like_pattern("50% discount")
+        '50\\% discount'
+        >>> escape_like_pattern("test_data")
+        'test\\_data'
+        >>> escape_like_pattern("path\\to\\file")
+        'path\\\\to\\\\file'
+    """
+    if not pattern:
+        return pattern
+    # Escape backslash first, then percent and underscore
+    escapedPattern = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
+    return escapedPattern


Action required

5. Helper type contract broken 🐞 Bug ✓ Correctness

• escape_like_pattern is annotated as str -> str, but it returns the input as-is for falsy
  values, meaning it can return None.
• Unit tests explicitly call escape_like_pattern(None), which will be flagged by basedpyright.
• The repo’s make type-check runs basedpyright, so this can break CI even if runtime tests pass.
Agent prompt
### Issue description
`escape_like_pattern` is typed as `str -> str` but can return `None`, and tests call it with `None`. This will likely fail `basedpyright` during `make type-check`.

### Issue Context
The function currently returns `pattern` on falsy input. That includes `None`, violating the current type annotations.

### Fix Focus Areas
- api/libs/helper.py[35-65]
- api/tests/unit_tests/libs/test_helper.py[68-113]

### Suggested fix
- Update signature to `def escape_like_pattern(pattern: str | None) -> str | None:`.
- (Optional) Rename `escapedPattern` to `escaped_pattern` for style consistency.
- Ensure tests and all call sites align with the new signature (or, alternatively, remove the None test and raise on None).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2 participants