fix(api): refactors the SQL LIKE pattern escaping logic to use a centralized utility function, ensuring consistent and secure handling of special characters across all database queries. #25
base: devin_pr_code_review_bench_100_devin2_base_fixapi_refactors_the_sql_like_pattern_escaping_logic_to_use_a_centralized__utility_function_ensuring_consistent_and_secure_handling_of_special_characters_acr
Changes from all commits
f0a8695
9318a5d
a38bc04
b00e6e8
@@ -1195,18 +1195,24 @@ def process_metadata_filter_func(

 json_field = DatasetDocument.doc_metadata[metadata_name].as_string()

+from libs.helper import escape_like_pattern

 match condition:
     case "contains":
-        filters.append(json_field.like(f"%{value}%"))
+        escaped_value = escape_like_pattern(str(value))
+        filters.append(json_field.like(f"%{escaped_value}%", escape="\\"))

     case "not contains":
-        filters.append(json_field.notlike(f"%{value}%"))
+        escaped_value = escape_like_pattern(str(value))
+        filters.append(json_field.notlike(f"%{escaped_value}%"))

🔴 Missing escape parameter in notlike() call causes ineffective LIKE pattern escaping

Issue Details

In the "not contains" case, the value is escaped with escape_like_pattern(), but notlike() is called without the escape parameter:

    case "not contains":
        escaped_value = escape_like_pattern(str(value))
        filters.append(json_field.notlike(f"%{escaped_value}%"))  # Missing escape parameter!

Compare with the "contains" case, which correctly includes the escape parameter:

    case "contains":
        escaped_value = escape_like_pattern(str(value))
        filters.append(json_field.like(f"%{escaped_value}%", escape="\\"))  # Correct!

Actual vs Expected Behavior

Actual: When filtering with the "not contains" condition, the escaped value reaches the query without an ESCAPE clause, so the backslashes added by escape_like_pattern() are not declared as escape characters.

Expected: The notlike() call passes escape="\\", as the "contains", "start with", and "end with" cases do.

Impact

Metadata filtering with the "not contains" condition will not work correctly when the filter value contains special SQL LIKE characters (%, _, \).

Recommendation: Add the escape parameter to the notlike() call:

    filters.append(json_field.notlike(f"%{escaped_value}%", escape="\\"))
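To make the effect of the missing ESCAPE clause concrete, here is a minimal sketch using the standard-library sqlite3 module in place of the project's database. The escape_like_pattern() below is a hypothetical stand-in that assumes the real helper escapes backslash, % and _ with a backslash; the table and rows are invented for illustration. SQLite has no default LIKE escape character, whereas PostgreSQL and MySQL default to backslash, so how visible the bug is will likely depend on the backend.

```python
import sqlite3


def escape_like_pattern(value: str) -> str:
    """Hypothetical stand-in for libs.helper.escape_like_pattern (assumed behavior)."""
    # Escape the backslash first so the escapes added afterwards are not doubled.
    return value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (meta TEXT)")
conn.executemany(
    "INSERT INTO docs (meta) VALUES (?)",
    [("50% off",), ("50 percent off",), ("full price",)],
)

# User filter value "50%" wrapped for a "not contains" match.
pattern = f"%{escape_like_pattern('50%')}%"

# No ESCAPE clause: the backslash is an ordinary character and % stays a wildcard.
without_escape = [
    row[0]
    for row in conn.execute(
        "SELECT meta FROM docs WHERE meta NOT LIKE ? ORDER BY meta", (pattern,)
    )
]

# ESCAPE '\' declared: \% now means a literal percent sign.
with_escape = [
    row[0]
    for row in conn.execute(
        "SELECT meta FROM docs WHERE meta NOT LIKE ? ESCAPE '\\' ORDER BY meta",
        (pattern,),
    )
]

print(without_escape)
print(with_escape)
```

In this sketch the query without ESCAPE fails to exclude "50% off", because the pattern only matches text containing a literal backslash; with the ESCAPE clause that row is filtered out as intended.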

     case "start with":
-        filters.append(json_field.like(f"{value}%"))
+        escaped_value = escape_like_pattern(str(value))
+        filters.append(json_field.like(f"{escaped_value}%", escape="\\"))

     case "end with":
-        filters.append(json_field.like(f"%{value}"))
+        escaped_value = escape_like_pattern(str(value))
+        filters.append(json_field.like(f"%{escaped_value}", escape="\\"))

     case "is" | "=":
         if isinstance(value, str):

🔴 IRIS vector LIKE search uses wrong escape character causing ineffective escaping

The IRIS vector implementation uses ESCAPE '|' (pipe) in the SQL query, but escape_like_pattern() escapes using backslash (\), making the escaping completely ineffective.

Issue Details

In api/core/rag/datasource/vdb/iris/iris_vector.py, the search_by_full_text method escapes the query using escape_like_pattern(), which uses backslash as the escape character (see api/libs/helper.py:64). However, the SQL query at line 298 specifies pipe (|) as the escape character.

Actual vs Expected Behavior

Actual: When searching for 50%:
- escape_like_pattern("50%") returns 50\%
- Because the query declares ESCAPE '|', the database reads \% as a literal backslash followed by a wildcard %
- The pattern therefore matches 50\ followed by any characters

Expected: The escape character in the SQL should match the one used by escape_like_pattern(), which is backslash (\).

Impact

Full-text search in the IRIS vector database will not correctly handle queries containing special SQL LIKE characters (%, _, \).

Recommendation: Change the ESCAPE clause to use backslash to match escape_like_pattern().

Note: In raw SQL strings, you may need to double the backslash depending on how IRIS handles escape characters.
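The mismatch can be reproduced in isolation. The sketch below uses sqlite3 as a stand-in, so IRIS itself may behave differently in details; escape_like(), search(), and the chunks table are hypothetical names introduced only for this illustration. It takes the escape character as an argument so that the escaping and the ESCAPE clause can be driven from the same value.

```python
import sqlite3


def escape_like(value: str, escape_char: str) -> str:
    """Hypothetical helper: escape LIKE wildcards with a caller-chosen character."""
    # Escape the escape character itself first, then the two LIKE wildcards.
    value = value.replace(escape_char, escape_char * 2)
    return value.replace("%", escape_char + "%").replace("_", escape_char + "_")


def search(conn: sqlite3.Connection, pattern: str, escape_char: str) -> list[str]:
    """Run a LIKE query whose ESCAPE character is supplied as a bound parameter."""
    sql = "SELECT body FROM chunks WHERE body LIKE ? ESCAPE ? ORDER BY body"
    return [row[0] for row in conn.execute(sql, (pattern, escape_char))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (body TEXT)")
conn.executemany(
    "INSERT INTO chunks (body) VALUES (?)",
    [("50% discount",), ("50\\ path",), ("500 items",)],
)

backslash_pattern = "%" + escape_like("50%", "\\") + "%"
pipe_pattern = "%" + escape_like("50%", "|") + "%"

# Mismatch described in the comment: backslash escaping, but ESCAPE '|'.
mismatched = search(conn, backslash_pattern, "|")

# Either consistent pairing finds the row that really contains "50%".
matched_pipe = search(conn, pipe_pattern, "|")
matched_backslash = search(conn, backslash_pattern, "\\")

print(mismatched, matched_pipe, matched_backslash)
```

Either fix direction works in this sketch: switch the ESCAPE clause to backslash, or escape with the pipe character. What matters is that both sides agree, which is easier to guarantee when they share one constant.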