Use memchr in parse_file by bscuron · Pull Request #5437 · mawww/kakoune

bscuron · 2026-01-15T03:46:00Z

perf record shows improvement from ~15-18% to 1-3% using kak -n

If this is equivalent to the current implementation, a similar update can be made to parse_lines as well.

mawww · 2026-02-27T00:48:38Z

Experimented with this on large file loading, it did not seem to make a significant difference. I am not sure what was going from 15-18% to 1-3% in perf but it does not seem to impact the actual startup time on a big file, which is dominated by allocating the lines.

I was measuring using this change, which migrates the other std::find to memchr as well:

diff --git a/src/buffer_utils.cc b/src/buffer_utils.cc
index cfac4881..7ed5430b 100644
--- a/src/buffer_utils.cc
+++ b/src/buffer_utils.cc
@@ -97,11 +97,13 @@ static BufferLines parse_lines(const char* pos, const char* end, EolFormat eolfo
         if (lines.size() >= std::numeric_limits<int>::max())
             throw runtime_error("too many lines");
 
-        const char* eol = std::find(pos, end, '\n');
-        if ((eol - pos) >= std::numeric_limits<int>::max())
+        const char* eol = reinterpret_cast<const char*>(memchr(pos, '\n', end - pos));
+        if (((eol ? eol : end) - pos) >= std::numeric_limits<int>::max())
             throw runtime_error("line is too long");
 
-        lines.emplace_back(StringData::create(StringView{pos, eol - (eolformat == EolFormat::Crlf and eol != end ? 1 : 0)}, "\n"));
+        lines.emplace_back(StringData::create(StringView{pos, eol ? eol - (eolformat == EolFormat::Crlf ? 1 : 0) : end}, "\n"));
+        if (not eol)
+            break;
         pos = eol + 1;
     }
 
@@ -137,8 +139,8 @@ decltype(auto) parse_file(StringView filename, Func&& func)
 
     size_t line_count = 1;
     bool has_crlf = false, has_lf = false;
-    for (auto it = std::find(pos, end, '\n'); it != end; it = std::find(it+1, end, '\n'), ++line_count)
-        ((it != pos and *(it-1) == '\r') ? has_crlf : has_lf) = true;
+    for (const char* eol = pos; (eol = reinterpret_cast<const char*>(memchr(eol, '\n', end - eol))); ++eol, ++line_count)
+        ((eol != pos and *(eol-1) == '\r') ? has_crlf : has_lf) = true;
     auto eolformat = (has_crlf and not has_lf) ? EolFormat::Crlf : EolFormat::Lf;
 
     FsStatus fs_status{file.st.st_mtim, file.st.st_size, murmur3(file.data, file.st.st_size)};

Use memchr in parse_file

1340bd9

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use memchr in parse_file#5437

Use memchr in parse_file#5437
bscuron wants to merge 1 commit intomawww:masterfrom
bscuron:memchr_parse_file

bscuron commented Jan 15, 2026 •

edited

Loading

Uh oh!

mawww commented Feb 27, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

bscuron commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mawww commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bscuron commented Jan 15, 2026 •

edited

Loading

mawww commented Feb 27, 2026 •

edited

Loading