Skip to content

Conversation

@vehre-x41
Copy link
Contributor

Commit Message: Add quoting strings to YAML from proto message gen.
Additional Description:
Some strings in a protobuf message may not be valid in YAML,
like strings containing tabs. Quoting these strings on protobuf
message to YAML as string conversion fixes missunderstandings
of YAML parsers and therefore fuzz-tests.
Risk Level: None
Testing: n/a
Docs Changes: n/a
Release Notes: n/a

@repokitteh-read-only
Copy link

Hi @vehre-x41, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #20544 was opened by vehre-x41.

see: more, trace.

@vehre-x41 vehre-x41 force-pushed the quote-strs-on-YAML-gen branch from 5d108e9 to 447975c Compare March 28, 2022 09:14
Comment on lines 35 to 41
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, may be we can keep this test unchanged, and add a new test for your new case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, not changing the test will prevent it from passing for certain fuzz input (e.g. input that has tabs in the values or keys). So how to do this without changing the test? Reject fuzz input, that does not suit the test? I am open for ideas!

Copy link
Member

@wbpcode wbpcode Mar 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, how the test passed before this PR? 🤔 If you think they are some coincidences, and add quotes is the right option always, then may be we needn't quote_yaml_strings parameter. We can add quotes just by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. I can only assume that the fuzzer hasn't used the offending character(s) before.

We can add quotes just by default.
I've thought of that, too. I wanted the code change as small as possible and as optional as possible. I am quite new to envoy and do not overlook the impacts of changes at all yet. So if you vote that the option should be removed, then I am happy to comply. Any thoughts?

Copy link
Member

@wbpcode wbpcode Mar 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, 🤔 we should prove that the current implementation has this problem in the first. And then we can discuss the details of this patch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for your careful debug. I agree we should fix it in the yaml-cpp. And it's also ok to work around this problem in the Envoy.

I'm concerned that there are still some other symbols that may not be handled correctly, so it seems safer to always add quotes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the yaml-cpp repo traffic seems to be low. I have added the workaround here, so that double quotes are always used. Ok?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, with this change, there are now three tests failing. I could fix test/common/protobuf/utility_test, but with the other two I can't explain, why they fail. They read yaml only, there is no reason I see, that emitting YAML with quotes affects them.

The other two are:

//test/server/config_validation:server_test
//test/common/runtime:runtime_impl_test

I will take a deeper look tomorrow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pull request on the yaml-cpp repo has been accepted and merged to their master.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing the tests is not as easy as I liked. The tests are serializing scalar numbers through YAML, like 10000 but also 123UL. On the writing side the protobuf-message is using a string to emit the character representation of the number. Using always quote now quotes these strings. The YAML-reader expecting a number then rejects the input due to the quotes. I have added a guard to not quote scalar YAML output, but I am totally unhappy with it.

I tried to prevent quoting only for scalar numbers, but numbers in YAML can have any character of "0123456789abcdefABCDEFxUL+-." (and probably more). Without correctly parsing the number (I just tried std::string::find_first_not_of()) with respect to YAML syntax this could allow to many YAML scalars to be emitted unquoted.

I am more and more inclined to just wait for the change in yaml-cpp coming downstream and use that instead of working around it in envoy. Thoughts?

Add test input to show that YAML <-> JSON config de-/serialisation
leads to different content.

Signed-off-by: Andre Vehreschild <[email protected]>
@vehre-x41 vehre-x41 force-pushed the quote-strs-on-YAML-gen branch from 447975c to eaf077e Compare March 28, 2022 12:55
@adisuissa adisuissa self-assigned this Mar 28, 2022
@wbpcode
Copy link
Member

wbpcode commented Mar 30, 2022

Hi, @vehre-x41 both @adisuissa and me agree that use double quotes by default is ok. Can you make a minor update for this PR?

@wbpcode
Copy link
Member

wbpcode commented Mar 30, 2022

/wait

Some strings in a protobuf message may not be valid in YAML,
like strings containing tabs. Quoting these strings on protobuf
message to YAML as string conversion fixes missunderstandings
of YAML parsers and therefore fuzz-tests.

Signed-off-by: Andre Vehreschild <[email protected]>
@vehre-x41 vehre-x41 force-pushed the quote-strs-on-YAML-gen branch from eaf077e to 17d5e3b Compare March 31, 2022 13:51
@wbpcode
Copy link
Member

wbpcode commented Apr 1, 2022

Hi, please don't use force push. Thanks. 😺

@vehre-x41
Copy link
Contributor Author

Ok, when not using push --with-lease what is the policy here? merge --squash? You don't want to have all of fails on the repo, will you?

@wbpcode
Copy link
Member

wbpcode commented Apr 1, 2022

All the commit will be squashed after merge.

Only quote string in YAML serialisation when the objects to serialize
is not scalar.

Signed-off-by: Andre Vehreschild <[email protected]>
@vehre-x41
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #20544 (comment) was created by @vehre-x41.

see: more, trace.

@vehre-x41
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #20544 (comment) was created by @vehre-x41.

see: more, trace.

@wbpcode
Copy link
Member

wbpcode commented Apr 11, 2022

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #20544 (comment) was created by @wbpcode.

see: more, trace.

@vehre-x41
Copy link
Contributor Author

Well, the bug is fixed also w/o this patch by importing the most recent version of the yaml-cpp repo. I am therefore inclined to just close this PR without merging anything. Opinions?

@wbpcode
Copy link
Member

wbpcode commented Apr 11, 2022

Well, the bug is fixed also w/o this patch by importing the most recent version of the yaml-cpp repo. I am therefore inclined to just close this PR without merging anything. Opinions?

I think it's ok. Does yam-cpp provide a new release version that contains your patch?

@vehre-x41
Copy link
Contributor Author

As far as I see, does yaml-cpp not provide a new release (the last one is 0.7.0 of Jul. 10th 2021). Envoy seems to pull in main branch now.

@wbpcode
Copy link
Member

wbpcode commented Apr 11, 2022

As far as I see, does yaml-cpp not provide a new release (the last one is 0.7.0 of Jul. 10th 2021). Envoy seems to pull in main branch now.

Then would you willing to create a new PR to update the yaml-cpp commit that Envoy depend to? 🌷

@adisuissa adisuissa mentioned this pull request Apr 11, 2022
@adisuissa
Copy link
Contributor

As far as I see, does yaml-cpp not provide a new release (the last one is 0.7.0 of Jul. 10th 2021). Envoy seems to pull in main branch now.

Then would you willing to create a new PR to update the yaml-cpp commit that Envoy depend to? 🌷

I can do that. Tracking issue #20762.

@vehre-x41
Copy link
Contributor Author

Thanks, @adisuissa . What I like to understand here nevertheless is, why this bug is reported to be fixed: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=19047#c25 without the reference in envoy being updated? The bug has been reported to be closed on Apr. 8th, 2022, but the references in the envoy at least from

com_github_jbeder_yaml_cpp = dict(
project_name = "yaml-cpp",
project_desc = "YAML parser and emitter in C++ matching the YAML 1.2 spec",
project_url = "https://github.com/jbeder/yaml-cpp",
version = "db6deedcd301754723065e0bbb1b75927c5b49c7",
sha256 = "387d7f25467312ca59068081f9a25bbab02bb6af32fd3e0aec1bd59163558171",
strip_prefix = "yaml-cpp-{version}",
urls = ["https://github.com/jbeder/yaml-cpp/archive/{version}.tar.gz"],
# YAML is also used for runtime as well as controlplane. It shouldn't appear on the
# dataplane but we can't verify this automatically due to code structure today.
use_category = ["controlplane", "dataplane_core"],
release_date = "2021-07-23",
cpe = "cpe:2.3:a:yaml-cpp_project:yaml-cpp:*",
),
do not point to the yaml-cpp commit (or a later one), that fixes the issue. So what is going on here? Are those references in the repository_locations.bzl only guides and not sticky?

@adisuissa
Copy link
Contributor

Thanks, @adisuissa . What I like to understand here nevertheless is, why this bug is reported to be fixed: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=19047#c25 without the reference in envoy being updated? The bug has been reported to be closed on Apr. 8th, 2022, but the references in the envoy at least from

com_github_jbeder_yaml_cpp = dict(
project_name = "yaml-cpp",
project_desc = "YAML parser and emitter in C++ matching the YAML 1.2 spec",
project_url = "https://github.com/jbeder/yaml-cpp",
version = "db6deedcd301754723065e0bbb1b75927c5b49c7",
sha256 = "387d7f25467312ca59068081f9a25bbab02bb6af32fd3e0aec1bd59163558171",
strip_prefix = "yaml-cpp-{version}",
urls = ["https://github.com/jbeder/yaml-cpp/archive/{version}.tar.gz"],
# YAML is also used for runtime as well as controlplane. It shouldn't appear on the
# dataplane but we can't verify this automatically due to code structure today.
use_category = ["controlplane", "dataplane_core"],
release_date = "2021-07-23",
cpe = "cpe:2.3:a:yaml-cpp_project:yaml-cpp:*",
),

do not point to the yaml-cpp commit (or a later one), that fixes the issue. So what is going on here? Are those references in the repository_locations.bzl only guides and not sticky?

The dependent sha versions should be the ones used by the fuzzing infrastructure.
What I'm assuming happened is that the same bug is somehow not reproducible by one of the PRs since the last fault. Can you try the latest sha in that range (8cb6862) and see if the bug still happen?

@adisuissa
Copy link
Contributor

Submitted #20765 that should include @vehre-x41's fix for yaml-cpp.

@adisuissa
Copy link
Contributor

We are still trying to upgrade yaml-cpp and going through the woes of the windows compatibility of yaml-cpp.
/wait

@vehre-x41
Copy link
Contributor Author

Closing in favor or #20886 .

@vehre-x41 vehre-x41 closed this Apr 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants