-
-
Notifications
You must be signed in to change notification settings - Fork 65
fix(queries): eliminate double-escaping in regex patterns #541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed everything up to f38457f in 1 minute and 51 seconds. Click for details.
- Reviewed
208
lines of code in4
files - Skipped
0
files when reviewing. - Skipped posting
3
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. aw-client-rust/src/classes.rs:182
- Draft comment:
Good improvement on error handling when deserializing classes settings. Logging the error detail helps troubleshooting. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
2. aw-client-rust/src/queries.rs:168
- Draft comment:
Manual JSON construction in serialize_classes may be fragile if any field contains quotes. Consider reusing the escape_doublequote helper to safely escape fields like the regex pattern and category strings. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment raises a valid security/correctness concern. If category names or regex patterns contain quotes, the manual JSON construction could break. However, looking at the code more carefully, the format!() with "{}" will automatically escape quotes in strings. The comment author may have missed this. Additionally, the function's comment explicitly states this is intentional to avoid double-escaping. I could be wrong about format!() handling quote escaping properly in all cases. There might be edge cases I'm not considering. The code's explicit comment about avoiding double-escaping suggests this was a deliberate design choice. The test cases show it working correctly with complex strings. The comment should be deleted. The current implementation appears intentional and correct, with format!() providing necessary escaping. The comment misunderstands the purpose of the manual JSON construction.
3. aw-client-rust/src/queries.rs:231
- Draft comment:
The new string literal that spans two lines does not indent the second line (it now starts withnot_afk
without preceding whitespace). This may cause an unintended change in the query string formatting. If a leading whitespace was intended, please add it. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment suggests this might be an unintended change, but looking at the full context shows this is actually an intentional formatting change applied consistently throughout the file. The query strings are being built programmatically and the indentation doesn't affect functionality. The author appears to have deliberately chosen to align all continuation lines at column 0. I could be wrong about the intentionality - maybe some of these formatting changes were accidental. However, the consistency of the changes across multiple locations suggests otherwise. The systematic nature of the changes, affecting multiple similar strings in the same way, strongly indicates this was an intentional reformatting rather than a mistake that needs to be fixed. This comment should be deleted because it's pointing out what appears to be an intentional formatting change that was consistently applied across the file.
Workflow ID: wflow_COtcfsVZJSCHbmxI
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
aw-client-rust/src/lib.rs
Outdated
@@ -123,6 +123,12 @@ impl AwClient { | |||
.map(|(start, stop)| format!("{}/{}", start, stop)) | |||
.collect(); | |||
|
|||
let query_lines: Vec<&str> = query.split('\n').collect(); | |||
let payload = json!({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused variable 'payload' is computed but never used; consider removing it or using it in the request for clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these were for debugging nice catch
aw-client-rust/src/queries.rs
Outdated
} | ||
|
||
// Only include ignore_case when true | ||
if category_spec.ignore_case { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new logic omits the 'ignore_case' field when false per PR description, but the test (test_classes_serialization) expects it to be present. Update the test to align with the intended behavior.
@ellipsis-dev implement both suggestions |
@ellipsis-dev fix the failing test to align with intended behavior |
Hi @0xbrayo, I attempted to address the failing test but couldn't locate the specific test that needs adjustment. Could you please provide more details about the failing test, such as its name, location, and the error message it produces? This information will help me align the test with the intended behavior. Thank you! |
@ellipsis-dev |
Hi @0xbrayo, I attempted to locate the 'test_classes_serialization' test to update it according to the new logic, but I couldn't find it in the codebase. Could you please clarify where this test is located? If it doesn't exist yet, would you like me to create a new test in |
nevermind. I'll just do it. |
d23b8e3
to
ae9209a
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #541 +/- ##
==========================================
- Coverage 70.81% 69.86% -0.95%
==========================================
Files 51 54 +3
Lines 2916 3166 +250
==========================================
+ Hits 2065 2212 +147
- Misses 851 954 +103 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
The serialize_classes function was using serde_json serialization which caused regex patterns to be double-escaped (e.g., 't\.co' became 't\.co'), breaking pattern matching in ActivityWatch queries. Core changes: - Rewrite serialize_classes() to build JSON strings manually instead of using serde_json - Preserve single-escaped regex patterns for proper matching - Only include 'ignore_case' field when true (omit when false) - Only include 'regex' field for non-'none' type categories - Improve error handling in classes deserialization - Add optional fields to ClassSetting struct for better compatibility This fixes the core regex pattern matching issues in ActivityWatch canonical queries that were causing incorrect categorization results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed everything up to b705f04 in 2 minutes and 13 seconds. Click for details.
- Reviewed
218
lines of code in4
files - Skipped
0
files when reviewing. - Skipped posting
6
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. aw-client-rust/src/classes.rs:18
- Draft comment:
Good use of #[serde(default)] on the 'regex' field to handle missing data gracefully. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
2. aw-client-rust/src/lib.rs:122
- Draft comment:
The variable 'query_lines' is computed but never used; consider removing it to avoid unused code warnings. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. aw-client-rust/src/queries.rs:169
- Draft comment:
Manual JSON assembly in serialize_classes avoids double-escaping but be cautious—ensure consistent formatting and proper escaping for all edge cases. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =20%
<= threshold50%
The comment is suggesting caution regarding JSON assembly, which is a valid concern. However, it does not provide a specific suggestion or point out a specific issue in the code. It is more of a general warning rather than a constructive comment. According to the rules, comments should be specific and actionable, and this one is not. Therefore, it should be removed.
4. aw-client-rust/src/queries.rs:216
- Draft comment:
Mixing embedded semicolons in query fragments (e.g. in the 'not_afk' block) with join using ";\n" may lead to redundant semicolons; consider normalizing statement termination. - Reason this comment was not posted:
Comment was on unchanged code.
5. aw-client-rust/src/queries.rs:242
- Draft comment:
Appending a semicolon in the 'categorize' query string may result in extra semicolons when joining statements; verify that the query syntax tolerates this redundancy. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The semicolon is actually correct here. The query.join(";\n") on line 256 joins multiple statements, each needing a semicolon terminator. Having a semicolon in the format string ensures consistent statement termination. The comment's concern about redundant semicolons is unfounded since the join adds newlines between statements. Could there be edge cases where the query language is sensitive to multiple semicolons that I haven't considered? The code's test cases pass with the semicolon present, and the pattern of semicolon-terminated statements is used consistently throughout the codebase. SQL-like query languages typically handle multiple semicolons gracefully. The comment should be deleted. The semicolon is intentional and correct as part of the consistent statement termination pattern used throughout the query building.
6. aw-client-rust/src/queries.rs:291
- Draft comment:
In build_browser_events, the initial string already ends with a semicolon; ensure consistent semicolon usage when these parts are later concatenated. - Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_mW2ufx5lPCQwOJFw
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
The serialize_classes function was using serde_json serialization which caused regex patterns to be double-escaped (e.g., t.co became t\.co, (noticing markdown also does escaping so viewing the raw text is best) breaking pattern matching in ActivityWatch queries.
Core changes:
Important
Fixes double-escaping of regex patterns in ActivityWatch queries by manually constructing JSON strings in
serialize_classes()
and improves error handling inget_classes_from_server()
.serialize_classes()
inqueries.rs
now manually constructs JSON strings to avoid double-escaping regex patterns.get_classes_from_server()
inclasses.rs
.id
anddata
toClassSetting
struct inclasses.rs
for better compatibility.This description was created by
for b705f04. You can customize this summary. It will automatically update as commits are pushed.