unescape: Decode \u escaped characters for surrogate pairs correctly#9799
Merged
Thanks for following up on this so quickly, @cosmo0920! On the other hand, staying strict and rejecting that sequence is likely much better for the user, who won't suddenly find magic replacements in their data.
@vit-zikmund's suggestion was very helpful in getting surrogate pair handling to work.
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Co-authored-by: Vit Zikmund <vit.zikmund@themama.ai>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
The OSS-Fuzz task is now green. 💪
@cosmo0920 I ran into this issue again today. Is this planned to be reviewed and merged soon?
@cosmo0920 @edsiper @vit-zikmund thank you all for the fix 👏
Currently, we ignore surrogate pairs when decoding \u escapes in the Unicode representation.
To handle this, a high surrogate and the low surrogate that follows it must be decoded together as a single code point.
Note that when JSON is created, these characters are still encoded in the \uXXXX representation.
The unescaping operation takes effect when msgpack is created.
Closes #9712.
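As an illustration of the decoding described above, here is a minimal sketch in C. It is not the code from this pull request: the names `parse_hex4`, `encode_utf8`, and `decode_u_escape` are hypothetical and are not Fluent Bit APIs. It assumes a NUL-terminated input and an output buffer of at least 4 bytes, and it takes the strict approach of rejecting a lone surrogate instead of substituting a replacement character.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: parse exactly four hex digits; returns -1 on error. */
static int parse_hex4(const char *s, uint32_t *out)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        char c = s[i];
        v <<= 4;
        if (c >= '0' && c <= '9')      v |= (uint32_t)(c - '0');
        else if (c >= 'a' && c <= 'f') v |= (uint32_t)(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') v |= (uint32_t)(c - 'A' + 10);
        else return -1; /* also stops at the terminating NUL */
    }
    *out = v;
    return 0;
}

/* Hypothetical helper: encode a code point as UTF-8; returns bytes written. */
static int encode_utf8(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/*
 * Decode one "\uXXXX" escape, or a "\uXXXX\uXXXX" surrogate pair, at `in`.
 * Writes UTF-8 into `out` and stores the number of input characters read
 * in `*consumed`. Returns the number of bytes written, or -1 for malformed
 * input or a lone surrogate (strict behaviour, an assumption of this sketch).
 */
static int decode_u_escape(const char *in, unsigned char *out, size_t *consumed)
{
    uint32_t hi, lo;

    if (in[0] != '\\' || in[1] != 'u' || parse_hex4(in + 2, &hi) != 0)
        return -1;
    *consumed = 6;

    if (hi >= 0xDC00 && hi <= 0xDFFF)
        return -1; /* low surrogate without a preceding high surrogate */

    if (hi >= 0xD800 && hi <= 0xDBFF) {
        /* A high surrogate must be followed by a low surrogate escape. */
        if (in[6] != '\\' || in[7] != 'u' || parse_hex4(in + 8, &lo) != 0)
            return -1;
        if (lo < 0xDC00 || lo > 0xDFFF)
            return -1;
        /* Combine the two 10-bit halves into one supplementary code point. */
        hi = 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
        *consumed = 12;
    }
    return encode_utf8(hi, out);
}
```

With this sketch, the input `\ud83e\udd17` combines to U+1F917 and is written as the four UTF-8 bytes F0 9F A4 97, while `\ud83e` on its own is rejected with -1.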
Enter [N/A] in the box, if an item is not applicable to your change.
Testing
Before we can approve your change; please submit the following in a comment:
and send {"text": "\ud83e\udd17"} in the same terminal.
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.