@vzakhari
Contributor

The len-1 case is noticeably slower than gfortran's straightforward implementation:
https://github.com/gcc-mirror/gcc/blob/075611b646e5554ae02b2622061ea1614bf16ead/libgfortran/intrinsics/string_intrinsics_inc.c#L253
This change speeds up a simple microkernel by 37% on Ice Lake.

@vzakhari vzakhari requested a review from klausler April 29, 2025 22:59
if (wantLen == 1) {
  // Trivial case for single character lookup.
  // We can use simple forward search.
  CHAR ch = want[0];
Contributor

This could just become a call to std::memchr() for KIND==1.

For other kinds, please add braced initialization to the two declarations here.
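As a rough illustration of this suggestion (not the code that was merged), a single-character forward search might dispatch to std::memchr() when the character type is char and fall back to a plain loop for wider kinds. The helper name FindSingleChar, its parameters, and the 1-based return convention are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not from the PR): returns the 1-based position of the
// first occurrence of ch in x[0..xLen), or 0 when ch is absent.
template <typename CHAR>
std::size_t FindSingleChar(const CHAR *x, std::size_t xLen, CHAR ch) {
  if constexpr (std::is_same_v<CHAR, char>) {
    // KIND==1: delegate the scan to the C library.
    auto pos{static_cast<const char *>(std::memchr(x, ch, xLen))};
    if (pos) {
      return static_cast<std::size_t>(pos - x) + 1;
    }
  } else {
    // Wider character kinds: simple forward search.
    for (std::size_t at{0}; at < xLen; ++at) {
      if (x[at] == ch) {
        return at + 1;
      }
    }
  }
  return 0;
}
```

Both local declarations use braced initialization, in line with the second request in the comment.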

@vzakhari vzakhari requested a review from klausler April 30, 2025 00:00
  // We can use simple forward search.
  CHAR ch{want[0]};
  if constexpr (std::is_same_v<CHAR, char>) {
    auto pos{reinterpret_cast<const CHAR *>(
Contributor

Save a line with if (auto pos{}) { ...
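The pattern being suggested is presumably a declaration inside the if condition, which scopes the variable to the if statement and removes the separate declaration line. A minimal sketch, assuming a hypothetical helper named FindByte that is not part of the PR:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: 1-based position of ch in x[0..xLen), or 0 if absent.
std::size_t FindByte(const char *x, std::size_t xLen, char ch) {
  // pos is declared in the condition; it is tested for non-null and is
  // visible only inside the if statement.
  if (auto pos{static_cast<const char *>(std::memchr(x, ch, xLen))}) {
    return static_cast<std::size_t>(pos - x) + 1;
  }
  return 0;
}
```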

@vzakhari vzakhari merged commit a860706 into llvm:main Apr 30, 2025
9 checks passed
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025