[2.1] Improve cleaning of string literals #8996
Signed-off-by: Shawn Bulen <[email protected]>
I want to review this carefully. I will make time to do so in the next few days.
Another note on these escapes... I initially thought - if PHP regex allowed it (uh... nope...) - of putting a 2nd lookbehind for MySQL, so The problem is that it can keep going... e.g., this is a single quote that terminates a string: So it'd need to check for an even vs. odd # of chars in a lookbehind in regex... Which is why I ultimately gave up on using a regex for that. A similar problem exists in pg - the number of single quotes in a row determines whether it's escaped or not...
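The even-vs-odd check described above is easy to express procedurally even though a fixed-width lookbehind cannot do it. The following is only a minimal sketch of that idea, not code from this PR; the function name `quote_is_backslash_escaped` is hypothetical.

```php
<?php
// Hypothetical helper (not SMF code): decides whether the single quote at
// offset $pos is escaped under MySQL's C-style rules by counting the run of
// backslashes immediately before it. An odd count means the last backslash
// escapes the quote; an even count means the backslashes escape each other.
function quote_is_backslash_escaped(string $s, int $pos): bool
{
    $count = 0;
    // Walk left from the quote while we keep seeing backslashes.
    for ($i = $pos - 1; $i >= 0 && $s[$i] === '\\'; $i--) {
        $count++;
    }
    return $count % 2 === 1;
}
```

With one backslash before the quote the helper reports the quote as escaped; with two it reports the quote as a real string terminator.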
Sesquipedalian
left a comment
The proposed changes for PostgreSQL work well, but the proposed changes for MySQL ran into issues with some complex strings. This is because the SQL standard '' syntax for escaping apostrophe characters is unambiguous and can therefore be handled accurately with a simple str_replace() call, whereas the C-style \' syntax that MySQL supports can create more complexity that requires taking the surrounding context into account when doing this job.
That said, I did find a case while testing for MySQL where the existing regex used in the preg_split() call wasn't quite good enough and needs a tweak.
Thus, I have made code suggestions in the comments for this review that revert the proposed changes made to the MySQL code but also adjust the regex used in the preg_split() call.
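To illustrate why the SQL standard '' escape is the easy case, here is a rough sketch, under my own assumptions, of stripping string literals when doubled apostrophes are the only escape mechanism. It is not the code in this PR; the function name `strip_literals_sql_standard` and the `%s` placeholder are hypothetical choices for the example.

```php
<?php
// Hypothetical sketch (not SMF code): remove string literals from a query
// that only uses the SQL standard '' escape for apostrophes.
function strip_literals_sql_standard(string $sql): string
{
    // A doubled apostrophe never opens or closes a literal, so dropping
    // every '' pair leaves only the real delimiters behind.
    $no_pairs = str_replace("''", '', $sql);

    // After that, the pieces between apostrophes alternate:
    // even index = outside a literal, odd index = inside a literal.
    $clean = '';
    foreach (explode("'", $no_pairs) as $i => $part) {
        $clean .= ($i % 2 === 0) ? $part : '%s';
    }
    return $clean;
}
```

No surrounding context is needed here, which is the contrast with the backslash style: a \' can only be interpreted once you know how many backslashes precede it.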
Sesquipedalian
left a comment
One more important change to make.
Can you provide an example of this? Where the PR would fail?
Start with this as the incoming query string passed to the $db_string parameter of smf_db_query(): ... which uses a mix of When the MySQL code proposed in this PR tries to make a cleaned version of this query string, it will produce this: ... which is a problem. Now, in theory such a situation isn't supposed to arise because people are supposed to pass the raw string ( EDIT: Appended
The proposed change to the PR fails simple tests - e.g., this PM: Not unusual; folks share code. If the only way to get the current PR to fail is by manually hacking the code &, further, not calling the appropriate $smcFunc... I don't think that's a problem. Don't hack the code, use SMF standards, you'll be fine.
"Don't do that" isn't an adequate approach here. I will test your example soon. It is entirely possible that the preg_split() approach remains insufficient. If so, we will need to find something else. Also, calling $smcFunc['db_query'] vs. smf_db_query makes no difference regarding my comments above. I referred to the latter for clarity, that's all.
If they're not calling smcFunc, they're not here at all, though... If they are calling smcFunc, it'll get escaped accordingly... I've tested that, too. I think we're covered with this PR.
$request = $smcFunc['db_query']('', 'SELECT \'hel\'\'lo hel\\\\\'\'lo /*\'', array());
Merely using
Yup, I was able to reproduce that. Somewhat surprisingly, though, the PM still sent, and the error appeared only when trying to write the session data to the database after the PM had been sent. At any rate, that error message did allow me to find the issue with the regular expression I was using in the preg_split() call. Instead of this:
$clean = preg_split('/(?<![\'\\\\])\'(?![\'])|(?<=[\\\\]{2})\'/', $db_string);
... we need this:
$clean = preg_split('/(?:\\\\{2})*\K(?<![\'\\\\])\'(?![\'])/', $db_string);
The This approach (using
I have updated my suggestion in my review comments to use this new regex. If you can find any other cases where my latest suggestion fails, please let me know, @sbulen.
See SimpleMachines#8996 Signed-off-by: Jon Stovell <[email protected]>
Fixes #8993
Just placing this out there again for consideration. It's an update of #8991