Skip to content

Conversation

mdenson
Copy link
Contributor

@mdenson mdenson commented Aug 15, 2025

Quick go at trying to fix the template whitespace issue. It's not pretty, but I suspect it is a long way from where it needs to be. Wanted to get some thoughts on where to go from here. Maybe this work needs to be synced with the JSON generator to make it go smoothly.

I'll create some simple tests unless someone has something already.

Just a thought... if we have JSON, and we have the templates, there are probably already tools that can do this better...

fixes #149844

@llvmbot
Copy link
Member

llvmbot commented Aug 15, 2025

@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-llvm-support

Author: None (mdenson)

Changes

Quick go at trying to fix the template whitespace issue. It's not pretty, but I suspect it is a long way from where it needs to be. Wanted to get some thoughts on where to go from here. Maybe this work needs to be synced with the JSON generator to make it go smoothly.

I'll create some simple tests unless someone has something already.

Just a thought... if we have JSON, and we have the templates, there are probably already tools that can do this better...


Full diff: https://github.com/llvm/llvm-project/pull/153724.diff

1 Files Affected:

  • (modified) llvm/lib/Support/Mustache.cpp (+53-2)
diff --git a/llvm/lib/Support/Mustache.cpp b/llvm/lib/Support/Mustache.cpp
index 6c2ed6c84c6cf..205d9b51103c2 100644
--- a/llvm/lib/Support/Mustache.cpp
+++ b/llvm/lib/Support/Mustache.cpp
@@ -166,6 +166,10 @@ class ASTNode {
   void renderSectionLambdas(const llvm::json::Value &Contexts,
                             llvm::raw_ostream &OS, SectionLambda &L);
 
+  void indentTextNode(std::string &body, size_t Indentation, bool lastChild);
+
+  void indentNodes(ASTNode *Node, bool isPartial);
+
   void renderPartial(const llvm::json::Value &Contexts, llvm::raw_ostream &OS,
                      ASTNode *Partial);
 
@@ -681,10 +685,57 @@ void ASTNode::renderChild(const json::Value &Contexts, llvm::raw_ostream &OS) {
     Child->render(Contexts, OS);
 }
 
+void ASTNode::indentTextNode(std::string &body, size_t Indentation,
+                             bool lastChild) {
+  std::string spaces(Indentation, ' ');
+  size_t pos = 0;
+
+  if (lastChild)
+    body.erase(body.find_last_not_of(" \t\r\f\v") + 1); // .rtrim??
+
+  while ((pos = body.find('\n', pos)) != std::string::npos) {
+    if ((!lastChild) || (pos != body.size() - 1)) {
+      body.insert(pos + 1, spaces);
+      pos += 1 + Indentation;
+    } else {
+      break;
+    }
+  }
+}
+
+void ASTNode::indentNodes(ASTNode *Node, bool isPartial) {
+  size_t size = Node->Children.size();
+
+  for (size_t i = 0; i < size; ++i) {
+    ASTNode *child = Node->Children[i].get();
+    switch (child->Ty) {
+    case ASTNode::Text: {
+      indentTextNode(child->Body, Indentation, ((i == size - 1) && isPartial));
+      break;
+    }
+    case ASTNode::Section: {
+      indentNodes(child, false);
+      break;
+    }
+    case ASTNode::Partial: {
+      indentNodes(child, true);
+    }
+    case ASTNode::Root:
+    case ASTNode::Variable:
+    case ASTNode::UnescapeVariable:
+    case ASTNode::InvertSection:
+      break;
+    default:
+      llvm::outs() << "Invalid ASTNode type\n";
+      break;
+    }
+  }
+}
+
 void ASTNode::renderPartial(const json::Value &Contexts, llvm::raw_ostream &OS,
                             ASTNode *Partial) {
-  AddIndentationStringStream IS(OS, Indentation);
-  Partial->render(Contexts, IS);
+  indentNodes(Partial, true);
+  Partial->render(Contexts, OS);
 }
 
 void ASTNode::renderLambdas(const json::Value &Contexts, llvm::raw_ostream &OS,

@mdenson
Copy link
Contributor Author

mdenson commented Aug 16, 2025

if you're happy with that please merge (no rights)

@mdenson
Copy link
Contributor Author

mdenson commented Aug 19, 2025

fixes #149844

@evelez7 evelez7 requested a review from ilovepi August 24, 2025 21:25
@evelez7
Copy link
Member

evelez7 commented Aug 24, 2025

I'm not sure what you mean by other tools or how to solve this in JSON. My understanding is that the JSON just feeds the data to be rendered in the template, then this library handles the rendering, so the white space needs to be fixed here. AFAIK this library outputs the actual HTML file with all the entities written, so any whitespace issues come from here.

It'd be nice to have a test to check for exact whitespace offsets and indentation. I don't think whitespace matters in the lit suites but I believe it does in unit tests. Is indentation something that's tested in the Mustache spec tests @ilovepi ? We might want something like that if you can check indentation in the nodes before writing them to a file.

I'll try to checkout this patch and test manually to see if I see any of the alignment issues I mentioned, but it might not be until next week.

@mdenson
Copy link
Contributor Author

mdenson commented Aug 24, 2025

I only meant that there are a lot of mustache implementations that will take input (like json) and a template and output html. See here as an example. That may not be something you want to consider.

I do know that the unit tests can check whitespace since that is what caused the first commit to fail the checks.

If this is an approach you would consider, I can add tests and look at optimizing. Just wanted to get some feedback first.

@ilovepi
Copy link
Contributor

ilovepi commented Aug 25, 2025

Thanks for the patch.

First, this needs some kind of test. Without at least that, we can't really tell whats happening to the output. If you want to check whitespace in lit tests, you'll probably need to set up some that use --match-full-lines ( https://llvm.org/docs/CommandGuide/FileCheck.html#cmdoption-FileCheck-match-full-lines). As @evelez7 mentioned, it may be possible to check in a unit test, and better test coverage is always welcome.

As for your patch, I'm not totally sure that this is the right way to handle things. I'm fairly certain that the current mustache implementation doesn't handle partials correctly, and even fails some of the conformance tests.

We have an in-tree tool to test the conformance
https://github.com/llvm/llvm-project/blob/main/llvm/utils/llvm-test-mustache-spec/llvm-test-mustache-spec.cpp

In that file there are a fair number of tests that are expected to fail (e.g. we don't conform). Notably the interpolation and partials test suites have failures that I think are significant. This is an area I've been meaning to spend some time improving, but I haven't had time recently to dig into the formatting problem.

My intuition is that the whitespace issue you're fixing on the clang-doc side is a symptom of this deficiency in our mustache implementation.

Just a thought... if we have JSON, and we have the templates, there are probably already tools that can do this better...

That may be true, but clang-doc is supposed to be a standalone tool with no external dependencies. You're right that another library or external tool may be even more well suited to the task, but its our goal to provide a compelling implementation here, and to not require any other third party dependencies. We're just not quite there yet.

@mdenson
Copy link
Contributor Author

mdenson commented Aug 25, 2025

Sorry I'm new here and still getting familiar with the project. I thought the unit tests at https://github.com/llvm/llvm-project/blob/main/llvm/unittests/Support/MustacheTest.cpp were meant to test conformance. Although I see now that there are tests from https://github.com/mustache/spec v1.4.3 that are missing.

We have an in-tree tool to test the conformance

How is this tool different, and when should I use one over the other?

My intuition is that the whitespace issue you're fixing on the clang-doc side is a symptom of this deficiency in our mustache implementation.

Ah, then perhaps I've really misunderstood this one. I thought llvm/lib/Support/Mustache.cpp was 'our mustache implementation'? Where should I be looking to sort that out?

That may be true, but clang-doc is supposed to be a standalone tool with no external dependencies. You're right that another library or external tool may be even more well suited to the task, but its our goal to provide a compelling implementation here, and to not require any other third-party dependencies. We're just not quite there yet.

I figured this was the case. Understood.

@ilovepi
Copy link
Contributor

ilovepi commented Aug 25, 2025

Sorry I'm new here and still getting familiar with the project. I thought the unit tests at https://github.com/llvm/llvm-project/blob/main/llvm/unittests/Support/MustacheTest.cpp were meant to test conformance. Although I see now that there are tests from https://github.com/mustache/spec v1.4.3 that are missing.

No need to apologize. Those tests do largely mimic the spec tests, but are a bit incomplete. We've mostly focused on the base spec to date.

We have an in-tree tool to test the conformance

How is this tool different

The unittests make sure that our implementation is working(in so far as they've been implemented), but we don't control the upstream spec. I wanted us to have a way to simply link in our implementation and test agains the public spec. Ideally, we'd get the full suite of tests translated into our unit tests, and fix the implementation as we go. But even then, I'd expect the llvm-test-mustache-spec tool to be useful for periodically checking our conformance as the spec updates.

, and when should I use one over the other?
For understanding conformance, use the tool. For testing our implementation is working the way we expect (which may not match the spec) you can use the unit tests.

My intuition is that the whitespace issue you're fixing on the clang-doc side is a symptom of this deficiency in our mustache implementation.

Ah, then perhaps I've really misunderstood this one. I thought llvm/lib/Support/Mustache.cpp was 'our mustache implementation'? Where should I be looking to sort that out?

oh, geez. I misread your patch. It seems you are touching the exact bits I'm talking about. sorry for the confusion. In this case the [clang-doc] tag is a bit misleading, since this is a change to [llvm], IMO. I'm guessing I didn't look closely enough after I saw the [clang-doc] tag and assumed your changes were to the clang-doc code under clang-tools-extra. Sorry for the misunderstanding.

That may be true, but clang-doc is supposed to be a standalone tool with no external dependencies. You're right that another library or external tool may be even more well suited to the task, but its our goal to provide a compelling implementation here, and to not require any other third-party dependencies. We're just not quite there yet.

I figured this was the case. Understood.

So, since it seems I've misread a part of your patch and given some rather questionable directions, let me summarize things briefly:

  1. your change is in the right place
  2. it still needs some kind of testing.
  3. you can/should also test w/ the llvm-test-mustache-spec tool, since after your patch, we'd need to update the xfail tests (hopefully). Anything that was xfail that now passes can be added to the unit tests.

@mdenson
Copy link
Contributor Author

mdenson commented Aug 25, 2025

Got it.

I will plan on the following:

  • run the patched llvm-test-mustache-spec against the latest specs and report back here
  • if there are failing tests we care about, try to fix them
  • when we're happy with the above, update the unit tests

In this case the [clang-doc] tag is a bit misleading, since this is a change to [llvm], IMO

Agreed. I don't think I can fix that without write access

@evelez7 evelez7 changed the title [clang-doc] fix mustache template whitespace [llvm] fix mustache template whitespace Aug 25, 2025
Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in an earlier response, we still need some tests for this fix. Even better if the test cases are landed first as XFAIL.

Comment on lines 698 to 704
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while ((pos = Body.find('\n', pos)) != std::string::npos) {
if ((!FinalNode) || (pos != LastChar)) {
Body.insert(pos + 1, spaces);
pos += 1 + Indentation;
} else {
break;
}
}
}
while ((pos = Body.find('\n', pos)) != std::string::npos && !(FinalNode && pos == LastChar)) {
Body.insert(pos + 1, spaces);
pos += 1 + Indentation;
}

In any case you can use an early break to avoid the need for an else if you invert the if condition, so even if you don't want to combine them into the loop condition, there's a nicer way to write it.

Suggested change
while ((pos = Body.find('\n', pos)) != std::string::npos) {
if ((!FinalNode) || (pos != LastChar)) {
Body.insert(pos + 1, spaces);
pos += 1 + Indentation;
} else {
break;
}
}
}
while ((pos = Body.find('\n', pos)) != std::string::npos) {
if (FinalNode && (pos == LastChar))
break;
Body.insert(pos + 1, spaces);
pos += 1 + Indentation;
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
indentTextNode(child->Body, Indentation, ((i == size - 1) && isPartial));
// Only track the final node for partials.
bool IsFinalNode = ((i == size - 1) && isPartial);
indentTextNode(child->Body, Indentation, IsFinalNode);

This is probably easier for folks like me to reason about down the line.

Comment on lines +730 to +734
Copy link
Contributor

@ilovepi ilovepi Aug 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default:
llvm::outs() << "Invalid ASTNode type\n";
break;
}
}
llvm_unreachable("Invalid ASTNode type");

I believe we should get an error if we fail to handle one of the cases due to -Wswitch-enum, so default isn't needed. Since all cases are fully covered, I'd prefer we mark it unreachable, so it breaks loudly. If you're not getting the compile time warning or error diagnostic in your build for the uncovered switch, please let me know so we can fix that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do get a warning about a fully covered switch with this, so we should remove it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, actually, since you're looping over a bunch of things, the unreachable may not work unless you switch break to continue.

@ilovepi
Copy link
Contributor

ilovepi commented Aug 25, 2025

Agreed. I don't think I can fix that without write access

I'm pretty sure you can by editing the commit message/title. I'll admit its been a while since I've not had commit access, but I believe it should be doable. If not, I can fix it for you as part of the review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void indentNodes(ASTNode *Node, bool isPartial);
void indentNodes(ASTNode *Node, bool IsPartial);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepted

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::string spaces(Indentation, ' ');
std::string Spaces(Indentation, ' ');

The style guide prefers that we capitalize variables, and that they should be nouns. https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepted

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Pos

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepted

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: IsPartial

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepted

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Size

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepted

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Child

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepted

Comment on lines +730 to +734
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do get a warning about a fully covered switch with this, so we should remove it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
indentNodes(child, true);
indentNodes(child, true);
break;

@mdenson
Copy link
Contributor Author

mdenson commented Aug 29, 2025

Ok, wasn't paying attention... I hope that is back to where it needs to be.

I've accepted the suggested changes with the exception of the unreachable bit, which doesn't seem to fit with the loop as you mention.

@mdenson
Copy link
Contributor Author

mdenson commented Aug 29, 2025

Passing llvm-test-mustache-spec partial test if you replace {{{...}}} with {{&...}} in partials.json since that is not supported in our implementation.

@evelez7 evelez7 removed compiler-rt compiler-rt:fuzzer flang Flang issues not falling into any other category labels Aug 29, 2025
@mdenson
Copy link
Contributor Author

mdenson commented Sep 16, 2025

any thoughts? unless you think otherwise, i can add the now (somewhat) passing spec test https://github.com/mustache/spec/blob/master/specs/partials.yml#L89-L107 to the unit tests (with the exchange of {{& }} for {{{ }}}) at https://github.com/llvm/llvm-project/blob/main/llvm/unittests/Support/MustacheTest.cpp and we could close the issue.

if there are other mustache features we care about and want to implement (such as {{{ }}}), i can give that a shot in another pr

@ilovepi
Copy link
Contributor

ilovepi commented Sep 17, 2025

I actually have a big stack of PRs that address the Mustache impl. I probably should have posted this a bit sooner. The stack starts here: #159182 and adds lots of missing functionality and some minor perf optimizations at the end.

I'm happy to review your patches, but given that there's another patch set that will implement more features I'm not too sure. Feedback on that stack is welcome, since its still definitely what I'd call a first draft, but it does get us to where we want.

@mdenson
Copy link
Contributor Author

mdenson commented Sep 17, 2025

Ok, will leave it with you then

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Mustache templates produce irregular whitespace

4 participants