-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[MsgPack] Use JSON schema boolean resolution rules #170561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Since YAMLIO generally knows the types ahead of time (since it primarily functions to (de)serialize concrete C++ types) "tag resolution" isn't really a meaningful isssue. With MsgPackDocument we permit arbitrary documents with arbitrary type nodes, and so the YAML "NO" problem is an issue. Address this in MsgPackDocument itself to avoid a significant and far-reaching backwards-incompatible change to YAMLIO (although we could still consider tightening things up there in the future). Use the "JSON Schema" for tag resolution where the only literals to resolve to bool by default are "true" and "false".
|
@llvm/pr-subscribers-llvm-binary-utilities Author: Scott Linder (slinder1) ChangesSince YAMLIO generally knows the types ahead of time (since it primarily functions to (de)serialize concrete C++ types) "tag resolution" isn't really a meaningful isssue. With MsgPackDocument we permit arbitrary documents with arbitrary type nodes, and so the YAML "NO" problem is an issue. Address this in MsgPackDocument itself to avoid a significant and far-reaching backwards-incompatible change to YAMLIO (although we could still consider tightening things up there in the future). Use the "JSON Schema" for tag resolution where the only literals to resolve to bool by default are "true" and "false". Full diff: https://github.com/llvm/llvm-project/pull/170561.diff 2 Files Affected:
diff --git a/llvm/lib/BinaryFormat/MsgPackDocumentYAML.cpp b/llvm/lib/BinaryFormat/MsgPackDocumentYAML.cpp
index 80b421d5f752e..fbf871f709a64 100644
--- a/llvm/lib/BinaryFormat/MsgPackDocumentYAML.cpp
+++ b/llvm/lib/BinaryFormat/MsgPackDocumentYAML.cpp
@@ -29,6 +29,10 @@ struct ScalarDocNode : DocNode {
StringRef getYAMLTag() const;
};
+bool isJSONSchemaBoolLiteral(StringRef S) {
+ return S == "true" || S == "false";
+}
+
} // namespace
/// Convert this DocNode to a string, assuming it is scalar.
@@ -84,11 +88,18 @@ StringRef DocNode::fromString(StringRef S, StringRef Tag) {
*this = getDocument()->getNode();
return "";
}
- if (Tag == "!bool" || Tag == "") {
+ if (Tag == "!bool") {
*this = getDocument()->getNode(false);
- StringRef Err = yaml::ScalarTraits<bool>::input(S, nullptr, getBool());
- if (Err == "" || Tag != "")
- return Err;
+ return yaml::ScalarTraits<bool>::input(S, nullptr, getBool());
+ }
+ // FIXME: This enforces the "JSON Schema" tag resolution for boolean literals,
+ // defined at https://yaml.org/spec/1.2.2/#json-schema which we adopt because
+ // the more general pre-1.2 resolution includes many common strings (e.g. "n",
+ // "no", "y", "yes", ...). This should be handled at the YAMLTraits level, but
+ // that change would have a much broader impact.
+ if (Tag == "" && isJSONSchemaBoolLiteral(S)) {
+ *this = getDocument()->getNode(S == "true");
+ return "";
}
if (Tag == "!float" || Tag == "") {
*this = getDocument()->getNode(0.0);
diff --git a/llvm/unittests/BinaryFormat/MsgPackDocumentTest.cpp b/llvm/unittests/BinaryFormat/MsgPackDocumentTest.cpp
index 6a6ad7010f629..5aace87cb3b6b 100644
--- a/llvm/unittests/BinaryFormat/MsgPackDocumentTest.cpp
+++ b/llvm/unittests/BinaryFormat/MsgPackDocumentTest.cpp
@@ -420,3 +420,77 @@ TEST(MsgPackDocument, TestInputYAMLMap) {
ASSERT_EQ(SS.getKind(), Type::String);
ASSERT_EQ(SS.getString(), "2");
}
+
+TEST(MsgPackDocument, TestInputYAMLBoolean) {
+ Document Doc;
+ auto GetFirst = [](Document &Doc) { return Doc.getRoot().getArray()[0]; };
+ auto ToYAML = [](Document &Doc) {
+ std::string S;
+ raw_string_ostream OS(S);
+ Doc.toYAML(OS);
+ return S;
+ };
+
+ bool Ok;
+
+ Ok = Doc.fromYAML("- n\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::String);
+ ASSERT_EQ(GetFirst(Doc).getString(), "n");
+ ASSERT_EQ(ToYAML(Doc), "---\n- n\n...\n");
+
+ Ok = Doc.fromYAML("- y\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::String);
+ ASSERT_EQ(GetFirst(Doc).getString(), "y");
+ ASSERT_EQ(ToYAML(Doc), "---\n- y\n...\n");
+
+ Ok = Doc.fromYAML("- no\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::String);
+ ASSERT_EQ(GetFirst(Doc).getString(), "no");
+ ASSERT_EQ(ToYAML(Doc), "---\n- no\n...\n");
+
+ Ok = Doc.fromYAML("- yes\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::String);
+ ASSERT_EQ(GetFirst(Doc).getString(), "yes");
+ ASSERT_EQ(ToYAML(Doc), "---\n- yes\n...\n");
+
+ Ok = Doc.fromYAML("- false\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::Boolean);
+ ASSERT_EQ(GetFirst(Doc).getBool(), false);
+ ASSERT_EQ(ToYAML(Doc), "---\n- false\n...\n");
+
+ Ok = Doc.fromYAML("- true\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::Boolean);
+ ASSERT_EQ(GetFirst(Doc).getBool(), true);
+ ASSERT_EQ(ToYAML(Doc), "---\n- true\n...\n");
+
+ Ok = Doc.fromYAML("- !str false\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::String);
+ ASSERT_EQ(GetFirst(Doc).getString(), "false");
+ ASSERT_EQ(ToYAML(Doc), "---\n- !str 'false'\n...\n");
+
+ Ok = Doc.fromYAML("- !str true\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::String);
+ ASSERT_EQ(GetFirst(Doc).getString(), "true");
+ ASSERT_EQ(ToYAML(Doc), "---\n- !str 'true'\n...\n");
+
+ // FIXME: A fix for these requires changes in YAMLParser/YAMLTraits.
+ Ok = Doc.fromYAML("- \"false\"\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::Boolean);
+ ASSERT_EQ(GetFirst(Doc).getBool(), false);
+ ASSERT_EQ(ToYAML(Doc), "---\n- false\n...\n");
+
+ Ok = Doc.fromYAML("- \"true\"\n");
+ ASSERT_TRUE(Ok);
+ ASSERT_EQ(GetFirst(Doc).getKind(), Type::Boolean);
+ ASSERT_EQ(GetFirst(Doc).getBool(), true);
+ ASSERT_EQ(ToYAML(Doc), "---\n- true\n...\n");
+}
|
| ASSERT_EQ(GetFirst(Doc).getString(), "true"); | ||
| ASSERT_EQ(ToYAML(Doc), "---\n- !str 'true'\n...\n"); | ||
|
|
||
| // FIXME: A fix for these requires changes in YAMLParser/YAMLTraits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain in more detail what fix is required?
Is there a reason it is not part of this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It requires a change in YAMLParser/YAMLTraits directly and I wanted to avoid the fallout. If we would rather not have a half measure like this I can spend some time trying to understand the YAML issue more directly.
Edit: to answer your first question more directly, the YAML spec(s) are complicated and I didn't want to wade into that swamp. I believe the change as-is at least works towards conforming to the JSON schema for bools, and I don't see how it can make the situation worse in this regard, but it is not a complete solution.
Since YAMLIO generally knows the types ahead of time (since it primarily functions to (de)serialize concrete C++ types) "tag resolution" isn't really a meaningful isssue.
With MsgPackDocument we permit arbitrary documents with arbitrary type nodes, and so the YAML "NO" problem is an issue.
Address this in MsgPackDocument itself to avoid a significant and far-reaching backwards-incompatible change to YAMLIO (although we could still consider tightening things up there in the future). Use the "JSON Schema" for tag resolution where the only literals to resolve to bool by default are "true" and "false".