Skip to content

Conversation

@ilovepi
Copy link
Contributor

@ilovepi ilovepi commented Sep 22, 2025

Delimiters in mustache are generally 2-4 character sequences. While good
for general search, we can beat find() for these short sequences by just
using memchr() to find the first match, and then checking the next few
characters directly.

Delimiters in mustache are generally 2-4 character sequences. While good
for general search, we can beat find() for these short sequences by just
using memchr() to find the first match, and then checking the next few
characters directly.
Copy link
Contributor Author

ilovepi commented Sep 22, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@llvmbot
Copy link
Member

llvmbot commented Sep 22, 2025

@llvm/pr-subscribers-llvm-support

Author: Paul Kirth (ilovepi)

Changes

Delimiters in mustache are generally 2-4 character sequences. While good
for general search, we can beat find() for these short sequences by just
using memchr() to find the first match, and then checking the next few
characters directly.


Full diff: https://github.com/llvm/llvm-project/pull/160165.diff

1 Files Affected:

  • (modified) llvm/lib/Support/Mustache.cpp (+53-3)
diff --git a/llvm/lib/Support/Mustache.cpp b/llvm/lib/Support/Mustache.cpp
index 6c2ed6c84c6cf..c7cebe6b64fae 100644
--- a/llvm/lib/Support/Mustache.cpp
+++ b/llvm/lib/Support/Mustache.cpp
@@ -17,6 +17,50 @@ namespace {
 
 using Accessor = SmallVector<std::string>;
 
+// A more generic specialized find for needles of length 1-3.
+[[maybe_unused]]
+static size_t findDelimiters(StringRef Haystack, StringRef Needle,
+                             size_t Offset = 0) {
+  const size_t N = Needle.size();
+  if (N == 0)
+    return Offset;
+  if (N > 3) {
+    // Fallback for longer needles where more advanced algorithms are better.
+    return Haystack.find(Needle, Offset);
+  }
+
+  const char *HaystackStart = Haystack.data();
+  const size_t HaystackSize = Haystack.size();
+  if (HaystackSize < N + Offset)
+    return StringRef::npos;
+
+  const char *NeedleStart = Needle.data();
+  const char *Current = HaystackStart + Offset;
+  const char *End = HaystackStart + HaystackSize;
+
+  while (Current + N <= End) {
+    // Stage 1: Find the first character of the needle.
+    Current = (const char *)::memchr(Current, NeedleStart[0], End - Current);
+    if (!Current || Current + N > End)
+      return StringRef::npos;
+
+    // Stage 2: Validate the rest of the sequence.
+    if (N == 1)
+      return Current - HaystackStart;
+    if (N == 2 && Current[1] == NeedleStart[1])
+      return Current - HaystackStart;
+    if (N == 3 && Current[1] == NeedleStart[1] && Current[2] == NeedleStart[2])
+      return Current - HaystackStart;
+
+    // Mismatch, advance and continue the search.
+    ++Current;
+  }
+
+  return StringRef::npos;
+}
+
+
+
 static bool isFalsey(const json::Value &V) {
   return V.getAsNull() || (V.getAsBoolean() && !V.getAsBoolean().value()) ||
          (V.getAsArray() && V.getAsArray()->empty());
@@ -306,7 +350,9 @@ SmallVector<Token> tokenize(StringRef Template) {
   StringLiteral Open("{{");
   StringLiteral Close("}}");
   size_t Start = 0;
-  size_t DelimiterStart = Template.find(Open);
+  // size_t DelimiterStart = Template.find(Open);
+  size_t DelimiterStart = findDelimiters(Template, Open);
+
   if (DelimiterStart == StringRef::npos) {
     Tokens.emplace_back(Template.str());
     return Tokens;
@@ -314,7 +360,8 @@ SmallVector<Token> tokenize(StringRef Template) {
   while (DelimiterStart != StringRef::npos) {
     if (DelimiterStart != Start)
       Tokens.emplace_back(Template.substr(Start, DelimiterStart - Start).str());
-    size_t DelimiterEnd = Template.find(Close, DelimiterStart);
+    // size_t DelimiterEnd = Template.find(Close, DelimiterStart);
+    size_t DelimiterEnd = findDelimiters(Template, Close, DelimiterStart);
     if (DelimiterEnd == StringRef::npos)
       break;
 
@@ -326,7 +373,8 @@ SmallVector<Token> tokenize(StringRef Template) {
     std::string RawBody = Open.str() + Interpolated + Close.str();
     Tokens.emplace_back(RawBody, Interpolated, Interpolated[0]);
     Start = DelimiterEnd + Close.size();
-    DelimiterStart = Template.find(Open, Start);
+    // DelimiterStart = Template.find(Open, Start);
+    DelimiterStart = findDelimiters(Template, Open, Start);
   }
 
   if (Start < Template.size())
@@ -572,6 +620,8 @@ void ASTNode::render(const json::Value &CurrentCtx, raw_ostream &OS) {
   ParentContext = &CurrentCtx;
   const json::Value *ContextPtr = Ty == Root ? ParentContext : findContext();
 
+  if (AccessorValue.empty() && (Ty != Root && Ty != Text))
+    return;
   switch (Ty) {
   case Root:
     renderChild(CurrentCtx, OS);

@github-actions
Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions cpp -- llvm/lib/Support/Mustache.cpp

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/llvm/lib/Support/Mustache.cpp b/llvm/lib/Support/Mustache.cpp
index c7cebe6b6..4acd7b816 100644
--- a/llvm/lib/Support/Mustache.cpp
+++ b/llvm/lib/Support/Mustache.cpp
@@ -59,8 +59,6 @@ static size_t findDelimiters(StringRef Haystack, StringRef Needle,
   return StringRef::npos;
 }
 
-
-
 static bool isFalsey(const json::Value &V) {
   return V.getAsNull() || (V.getAsBoolean() && !V.getAsBoolean().value()) ||
          (V.getAsArray() && V.getAsArray()->empty());

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much does this improve performance?

using Accessor = SmallVector<std::string>;

// A more generic specialized find for needles of length 1-3.
[[maybe_unused]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not maybe_unused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops left from early debugging. I'll remove.

StringLiteral Close("}}");
size_t Start = 0;
size_t DelimiterStart = Template.find(Open);
// size_t DelimiterStart = Template.find(Open);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover commented code

const json::Value *ContextPtr = Ty == Root ? ParentContext : findContext();

if (AccessorValue.empty() && (Ty != Root && Ty != Text))
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change?

Copy link
Contributor Author

ilovepi commented Sep 22, 2025

sigh. I was consistently getting 10% on the template parsing benchmarks, w/ the rest under 1% variance run to run. But now I'm getting more consistent regressions with only 4% improvements. That was across 20 iterations of the benchmark via ./benchmarks/MustacheBench --benchmark_repetitions=20 --benchmark_display_aggregates_only. I'm guessing some of the benchmarks need more tuning/refining to be less sensitive to things happening on the CPU.

Given that I'm not seeing consistent wins on this patch, lets close this for now, and if we find a bottleneck splitting up the template, we can revisit.

@ilovepi ilovepi closed this Sep 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants