Open
Owner
I experimented with this on large file loading, and it did not seem to make a significant difference. I am not sure what went from 15-18% to 1-3% in perf, but it does not seem to affect the actual startup time on a big file, which is dominated by allocating the lines. I was measuring with this change, which also migrates the other std::find to memchr:

diff --git a/src/buffer_utils.cc b/src/buffer_utils.cc
index cfac4881..7ed5430b 100644
--- a/src/buffer_utils.cc
+++ b/src/buffer_utils.cc
@@ -97,11 +97,13 @@ static BufferLines parse_lines(const char* pos, const char* end, EolFormat eolfo
if (lines.size() >= std::numeric_limits<int>::max())
throw runtime_error("too many lines");
- const char* eol = std::find(pos, end, '\n');
- if ((eol - pos) >= std::numeric_limits<int>::max())
+ const char* eol = reinterpret_cast<const char*>(memchr(pos, '\n', end - pos));
+ if (((eol ? eol : end) - pos) >= std::numeric_limits<int>::max())
throw runtime_error("line is too long");
- lines.emplace_back(StringData::create(StringView{pos, eol - (eolformat == EolFormat::Crlf and eol != end ? 1 : 0)}, "\n"));
+ lines.emplace_back(StringData::create(StringView{pos, eol ? eol - (eolformat == EolFormat::Crlf ? 1 : 0) : end}, "\n"));
+ if (not eol)
+ break;
pos = eol + 1;
}
@@ -137,8 +139,8 @@ decltype(auto) parse_file(StringView filename, Func&& func)
size_t line_count = 1;
bool has_crlf = false, has_lf = false;
- for (auto it = std::find(pos, end, '\n'); it != end; it = std::find(it+1, end, '\n'), ++line_count)
- ((it != pos and *(it-1) == '\r') ? has_crlf : has_lf) = true;
+ for (const char* eol = pos; (eol = reinterpret_cast<const char*>(memchr(eol, '\n', end - eol))); ++eol, ++line_count)
+ ((eol != pos and *(eol-1) == '\r') ? has_crlf : has_lf) = true;
auto eolformat = (has_crlf and not has_lf) ? EolFormat::Crlf : EolFormat::Lf;
FsStatus fs_status{file.st.st_mtim, file.st.st_size, murmur3(file.data, file.st.st_size)};
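
For anyone who wants to try the line-ending scan from the second hunk outside of Kakoune, here is a minimal standalone sketch. It is not the project's code: EolScan, scan_with_find and scan_with_memchr are hypothetical names, and std::string_view stands in for the mapped file contents. The early return on empty input is an extra precaution added here so that memchr is never handed a null pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical result type for this sketch; not a Kakoune type.
struct EolScan
{
    std::size_t newlines = 0; // number of '\n' bytes seen
    bool has_crlf = false;    // at least one '\n' preceded by '\r'
    bool has_lf = false;      // at least one '\n' not preceded by '\r'

    bool operator==(const EolScan& other) const
    {
        return newlines == other.newlines && has_crlf == other.has_crlf
            && has_lf == other.has_lf;
    }
};

// Scan with std::find, mirroring the shape of the loop the diff removes.
EolScan scan_with_find(std::string_view data)
{
    const char* pos = data.data();
    const char* end = pos + data.size();
    EolScan res;
    for (auto it = std::find(pos, end, '\n'); it != end;
         it = std::find(it + 1, end, '\n'))
    {
        ((it != pos && *(it - 1) == '\r') ? res.has_crlf : res.has_lf) = true;
        ++res.newlines;
    }
    return res;
}

// Same scan with memchr, which returns nullptr (not end) when nothing is found.
EolScan scan_with_memchr(std::string_view data)
{
    EolScan res;
    if (data.empty())
        return res; // avoid calling memchr on a possibly null pointer
    const char* pos = data.data();
    const char* end = pos + data.size();
    for (const char* eol = pos;
         (eol = static_cast<const char*>(std::memchr(eol, '\n', end - eol)));
         ++eol)
    {
        assert(eol < end); // a match always lies inside the searched range
        ((eol != pos && *(eol - 1) == '\r') ? res.has_crlf : res.has_lf) = true;
        ++res.newlines;
    }
    return res;
}
```

Calling both functions on the same buffer and comparing the results with operator== is one way to check that the two loops agree. It says nothing about the timing difference discussed above.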
perf record shows an improvement from ~15-18% to 1-3% using kak -n.

If this is equivalent to the current implementation, a similar update can be made to parse_lines as well.
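
One way to check the equivalence question for a parse_lines-style loop is to run both variants side by side on small inputs. The sketch below is a simplified stand-in under stated assumptions, not the actual parse_lines: Lines, content_length, split_with_find and split_with_memchr are hypothetical names, lines are returned as std::string_view without their terminator, and the '\r' is dropped only when it is actually present before a '\n'.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>
#include <vector>

// Hypothetical alias for this sketch; the real code stores StringData lines.
using Lines = std::vector<std::string_view>;

// Length of [pos, stop), minus a trailing '\r' when strip_cr is set.
static std::size_t content_length(const char* pos, const char* stop, bool strip_cr)
{
    assert(stop >= pos);
    std::size_t len = static_cast<std::size_t>(stop - pos);
    if (strip_cr && len > 0 && stop[-1] == '\r')
        --len;
    return len;
}

// Split with std::find: "not found" is reported as eol == end.
Lines split_with_find(std::string_view data, bool crlf)
{
    const char* pos = data.data();
    const char* end = pos + data.size();
    Lines lines;
    while (pos < end)
    {
        const char* eol = std::find(pos, end, '\n');
        // Only strip '\r' for lines that really end in '\n'.
        lines.emplace_back(pos, content_length(pos, eol, crlf && eol != end));
        if (eol == end)
            break;
        pos = eol + 1;
    }
    return lines;
}

// Split with memchr: "not found" is reported as nullptr, so the last
// unterminated line has to be handled explicitly before leaving the loop.
Lines split_with_memchr(std::string_view data, bool crlf)
{
    const char* pos = data.data();
    const char* end = pos + data.size();
    Lines lines;
    while (pos < end)
    {
        const char* eol = static_cast<const char*>(std::memchr(pos, '\n', end - pos));
        const char* stop = eol ? eol : end;
        lines.emplace_back(pos, content_length(pos, stop, crlf && eol != nullptr));
        if (!eol)
            break;
        pos = eol + 1;
    }
    return lines;
}
```

The main behavioural difference to watch for is the not-found case: std::find returns end, while memchr returns a null pointer, so any arithmetic on the result needs a guard first.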