-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[llvm] fix mustache template whitespace #153724
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-compiler-rt-sanitizer @llvm/pr-subscribers-llvm-support Author: None (mdenson) ChangesQuick go at trying to fix the template whitespace issue. It's not pretty, but I suspect it is a long way from where it needs to be. Wanted to get some thoughts on where to go from here. Maybe this work needs to be synced with the JSON generator to make it go smoothly. I'll create some simple tests unless someone has something already. Just a thought... if we have JSON, and we have the templates, there are probably already tools that can do this better... Full diff: https://github.com/llvm/llvm-project/pull/153724.diff 1 Files Affected:
diff --git a/llvm/lib/Support/Mustache.cpp b/llvm/lib/Support/Mustache.cpp
index 6c2ed6c84c6cf..205d9b51103c2 100644
--- a/llvm/lib/Support/Mustache.cpp
+++ b/llvm/lib/Support/Mustache.cpp
@@ -166,6 +166,10 @@ class ASTNode {
void renderSectionLambdas(const llvm::json::Value &Contexts,
llvm::raw_ostream &OS, SectionLambda &L);
+ void indentTextNode(std::string &body, size_t Indentation, bool lastChild);
+
+ void indentNodes(ASTNode *Node, bool isPartial);
+
void renderPartial(const llvm::json::Value &Contexts, llvm::raw_ostream &OS,
ASTNode *Partial);
@@ -681,10 +685,57 @@ void ASTNode::renderChild(const json::Value &Contexts, llvm::raw_ostream &OS) {
Child->render(Contexts, OS);
}
+void ASTNode::indentTextNode(std::string &body, size_t Indentation,
+ bool lastChild) {
+ std::string spaces(Indentation, ' ');
+ size_t pos = 0;
+
+ if (lastChild)
+ body.erase(body.find_last_not_of(" \t\r\f\v") + 1); // .rtrim??
+
+ while ((pos = body.find('\n', pos)) != std::string::npos) {
+ if ((!lastChild) || (pos != body.size() - 1)) {
+ body.insert(pos + 1, spaces);
+ pos += 1 + Indentation;
+ } else {
+ break;
+ }
+ }
+}
+
+void ASTNode::indentNodes(ASTNode *Node, bool isPartial) {
+ size_t size = Node->Children.size();
+
+ for (size_t i = 0; i < size; ++i) {
+ ASTNode *child = Node->Children[i].get();
+ switch (child->Ty) {
+ case ASTNode::Text: {
+ indentTextNode(child->Body, Indentation, ((i == size - 1) && isPartial));
+ break;
+ }
+ case ASTNode::Section: {
+ indentNodes(child, false);
+ break;
+ }
+ case ASTNode::Partial: {
+ indentNodes(child, true);
+ }
+ case ASTNode::Root:
+ case ASTNode::Variable:
+ case ASTNode::UnescapeVariable:
+ case ASTNode::InvertSection:
+ break;
+ default:
+ llvm::outs() << "Invalid ASTNode type\n";
+ break;
+ }
+ }
+}
+
void ASTNode::renderPartial(const json::Value &Contexts, llvm::raw_ostream &OS,
ASTNode *Partial) {
- AddIndentationStringStream IS(OS, Indentation);
- Partial->render(Contexts, IS);
+ indentNodes(Partial, true);
+ Partial->render(Contexts, OS);
}
void ASTNode::renderLambdas(const json::Value &Contexts, llvm::raw_ostream &OS,
|
if you're happy with that please merge (no rights) |
fixes #149844 |
I'm not sure what you mean by other tools or how to solve this in JSON. My understanding is that the JSON just feeds the data to be rendered in the template, then this library handles the rendering, so the white space needs to be fixed here. AFAIK this library outputs the actual HTML file with all the entities written, so any whitespace issues come from here. It'd be nice to have a test to check for exact whitespace offsets and indentation. I don't think whitespace matters in the lit suites but I believe it does in unit tests. Is indentation something that's tested in the Mustache spec tests @ilovepi ? We might want something like that if you can check indentation in the nodes before writing them to a file. I'll try to checkout this patch and test manually to see if I see any of the alignment issues I mentioned, but it might not be until next week. |
I only meant that there are a lot of mustache implementations that will take input (like json) and a template and output html. See here as an example. That may not be something you want to consider. I do know that the unit tests can check whitespace since that is what caused the first commit to fail the checks. If this is an approach you would consider, I can add tests and look at optimizing. Just wanted to get some feedback first. |
Thanks for the patch. First, this needs some kind of test. Without at least that, we can't really tell whats happening to the output. If you want to check whitespace in lit tests, you'll probably need to set up some that use As for your patch, I'm not totally sure that this is the right way to handle things. I'm fairly certain that the current mustache implementation doesn't handle partials correctly, and even fails some of the conformance tests. We have an in-tree tool to test the conformance In that file there are a fair number of tests that are expected to fail (e.g. we don't conform). Notably the interpolation and partials test suites have failures that I think are significant. This is an area I've been meaning to spend some time improving, but I haven't had time recently to dig into the formatting problem. My intuition is that the whitespace issue you're fixing on the clang-doc side is a symptom of this deficiency in our mustache implementation.
That may be true, but clang-doc is supposed to be a standalone tool with no external dependencies. You're right that another library or external tool may be even more well suited to the task, but its our goal to provide a compelling implementation here, and to not require any other third party dependencies. We're just not quite there yet. |
Sorry I'm new here and still getting familiar with the project. I thought the unit tests at https://github.com/llvm/llvm-project/blob/main/llvm/unittests/Support/MustacheTest.cpp were meant to test conformance. Although I see now that there are tests from https://github.com/mustache/spec v1.4.3 that are missing.
How is this tool different, and when should I use one over the other?
Ah, then perhaps I've really misunderstood this one. I thought llvm/lib/Support/Mustache.cpp was 'our mustache implementation'? Where should I be looking to sort that out?
I figured this was the case. Understood. |
No need to apologize. Those tests do largely mimic the spec tests, but are a bit incomplete. We've mostly focused on the base spec to date.
The unittests make sure that our implementation is working(in so far as they've been implemented), but we don't control the upstream spec. I wanted us to have a way to simply link in our implementation and test agains the public spec. Ideally, we'd get the full suite of tests translated into our unit tests, and fix the implementation as we go. But even then, I'd expect the
oh, geez. I misread your patch. It seems you are touching the exact bits I'm talking about. sorry for the confusion. In this case the [clang-doc] tag is a bit misleading, since this is a change to
So, since it seems I've misread a part of your patch and given some rather questionable directions, let me summarize things briefly:
|
Got it. I will plan on the following:
Agreed. I don't think I can fix that without write access |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in an earlier response, we still need some tests for this fix. Even better if the test cases are landed first as XFAIL.
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while ((pos = Body.find('\n', pos)) != std::string::npos) { | |
if ((!FinalNode) || (pos != LastChar)) { | |
Body.insert(pos + 1, spaces); | |
pos += 1 + Indentation; | |
} else { | |
break; | |
} | |
} | |
} | |
while ((pos = Body.find('\n', pos)) != std::string::npos && !(FinalNode && pos == LastChar)) { | |
Body.insert(pos + 1, spaces); | |
pos += 1 + Indentation; | |
} |
In any case you can use an early break to avoid the need for an else if you invert the if
condition, so even if you don't want to combine them into the loop condition, there's a nicer way to write it.
while ((pos = Body.find('\n', pos)) != std::string::npos) { | |
if ((!FinalNode) || (pos != LastChar)) { | |
Body.insert(pos + 1, spaces); | |
pos += 1 + Indentation; | |
} else { | |
break; | |
} | |
} | |
} | |
while ((pos = Body.find('\n', pos)) != std::string::npos) { | |
if (FinalNode && (pos == LastChar)) | |
break; | |
Body.insert(pos + 1, spaces); | |
pos += 1 + Indentation; | |
} | |
} |
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentTextNode(child->Body, Indentation, ((i == size - 1) && isPartial)); | |
// Only track the final node for partials. | |
bool IsFinalNode = ((i == size - 1) && isPartial); | |
indentTextNode(child->Body, Indentation, IsFinalNode); |
This is probably easier for folks like me to reason about down the line.
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default: | |
llvm::outs() << "Invalid ASTNode type\n"; | |
break; | |
} | |
} | |
llvm_unreachable("Invalid ASTNode type"); | |
I believe we should get an error if we fail to handle one of the cases due to -Wswitch-enum
, so default isn't needed. Since all cases are fully covered, I'd prefer we mark it unreachable, so it breaks loudly. If you're not getting the compile time warning or error diagnostic in your build for the uncovered switch, please let me know so we can fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do get a warning about a fully covered switch with this, so we should remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, actually, since you're looping over a bunch of things, the unreachable may not work unless you switch break
to continue
.
I'm pretty sure you can by editing the commit message/title. I'll admit its been a while since I've not had commit access, but I believe it should be doable. If not, I can fix it for you as part of the review. |
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void indentNodes(ASTNode *Node, bool isPartial); | |
void indentNodes(ASTNode *Node, bool IsPartial); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accepted
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string spaces(Indentation, ' '); | |
std::string Spaces(Indentation, ' '); |
The style guide prefers that we capitalize variables, and that they should be nouns. https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accepted
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Pos
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accepted
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: IsPartial
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accepted
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accepted
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Child
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accepted
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do get a warning about a fully covered switch with this, so we should remove it.
llvm/lib/Support/Mustache.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentNodes(child, true); | |
indentNodes(child, true); | |
break; |
Ok, wasn't paying attention... I hope that is back to where it needs to be. I've accepted the suggested changes with the exception of the unreachable bit, which doesn't seem to fit with the loop as you mention. |
Passing llvm-test-mustache-spec partial test if you replace {{{...}}} with {{&...}} in partials.json since that is not supported in our implementation. |
any thoughts? unless you think otherwise, i can add the now (somewhat) passing spec test https://github.com/mustache/spec/blob/master/specs/partials.yml#L89-L107 to the unit tests (with the exchange of {{& }} for {{{ }}}) at https://github.com/llvm/llvm-project/blob/main/llvm/unittests/Support/MustacheTest.cpp and we could close the issue. if there are other mustache features we care about and want to implement (such as {{{ }}}), i can give that a shot in another pr |
I actually have a big stack of PRs that address the Mustache impl. I probably should have posted this a bit sooner. The stack starts here: #159182 and adds lots of missing functionality and some minor perf optimizations at the end. I'm happy to review your patches, but given that there's another patch set that will implement more features I'm not too sure. Feedback on that stack is welcome, since its still definitely what I'd call a first draft, but it does get us to where we want. |
Ok, will leave it with you then |
Quick go at trying to fix the template whitespace issue. It's not pretty, but I suspect it is a long way from where it needs to be. Wanted to get some thoughts on where to go from here. Maybe this work needs to be synced with the JSON generator to make it go smoothly.
I'll create some simple tests unless someone has something already.
Just a thought... if we have JSON, and we have the templates, there are probably already tools that can do this better...
fixes #149844