Skip to content

Commit 8701c2a

Browse files
Refactor WIDE_READ to allow finer control over high-performance function selection (#165613)
[This is more of a straw-proposal than a ready-for-merging PR. I got started thinking about what this might look like, and ended up just implementing something as a proof-of-concept. Totally open to other methods an ideas.] As we implement more high-performance string-related functions, we have found a need for better control over their selection than the big-hammer LIBC_CONF_STRING_LENGTH_WIDE_READ. For example, I have a memchr implementation coming, and unless I implement it in every variant, a simple binary value doesn't work. This PR makes gives finer-grained control over high-performance functions than the generic LIBC_CONF_UNSAFE_WIDE_READ option. For any function they like, the user can now select one of four implementations at build time: 1. element, which reads byte-by-byte (or wchar by wchar) 2. wide, which reads by unsigned long 3. generic, which uses standard clang vector implemenations, if available 4. arch, which uses an architecture-specific implemenation (Reading the code carefully, you may note that a user can actually specify any namespace they want, so we aren't technically limited to those 4.) We may also want to switch from command-line #defines as it is currently done, to something more like llvm-project/llvm/include/llvm/Config/llvm-config.h.cmake, and #including the resulting file, which would move quite a bit of complexity out of the command-line. But that's a future problem.
1 parent 203cd83 commit 8701c2a

File tree

13 files changed

+275
-185
lines changed

13 files changed

+275
-185
lines changed

libc/cmake/modules/LLVMLibCCompileOptionRules.cmake

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -81,9 +81,8 @@ function(_get_compile_options_from_config output_var)
8181
list(APPEND config_options "-DLIBC_QSORT_IMPL=${LIBC_CONF_QSORT_IMPL}")
8282
endif()
8383

84-
if(LIBC_CONF_STRING_UNSAFE_WIDE_READ)
85-
list(APPEND config_options "-DLIBC_COPT_STRING_UNSAFE_WIDE_READ")
86-
endif()
84+
list(APPEND config_options "-DLIBC_COPT_STRING_LENGTH_IMPL=${LIBC_CONF_STRING_LENGTH_IMPL}")
85+
list(APPEND config_options "-DLIBC_COPT_FIND_FIRST_CHARACTER_IMPL=${LIBC_CONF_FIND_FIRST_CHARACTER_IMPL}")
8786

8887
if(LIBC_CONF_MEMSET_X86_USE_SOFTWARE_PREFETCHING)
8988
list(APPEND config_options "-DLIBC_COPT_MEMSET_X86_USE_SOFTWARE_PREFETCHING")

libc/config/config.json

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -40,6 +40,7 @@
4040
"value": false,
4141
"doc": "Use an alternative printf float implementation based on 320-bit floats"
4242
},
43+
4344
"LIBC_CONF_PRINTF_DISABLE_FIXED_POINT": {
4445
"value": false,
4546
"doc": "Disable printing fixed point values in printf and friends."
@@ -64,9 +65,13 @@
6465
}
6566
},
6667
"string": {
67-
"LIBC_CONF_STRING_UNSAFE_WIDE_READ": {
68-
"value": false,
69-
"doc": "Read more than a byte at a time to perform byte-string operations like strlen."
68+
"LIBC_CONF_STRING_LENGTH_IMPL": {
69+
"value": "element",
70+
"doc": "Selects the implementation for string-length: 'element', 'word', 'clang_vector', or 'arch_vector'."
71+
},
72+
"LIBC_CONF_FIND_FIRST_CHARACTER_IMPL": {
73+
"value": "element",
74+
"doc": "Selects the implementation for find-first-character-related functions: 'element', 'word', 'clang_vector', or 'arch_vector'."
7075
},
7176
"LIBC_CONF_MEMSET_X86_USE_SOFTWARE_PREFETCHING": {
7277
"value": false,

libc/config/linux/arm/config.json

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
{
22
"string": {
3-
"LIBC_CONF_STRING_UNSAFE_WIDE_READ": {
4-
"value": false
3+
"LIBC_CONF_STRING_LENGTH_IMPL": {
4+
"value": "element"
5+
}
6+
"LIBC_CONF_FIND_FIRST_CHARACTER_IMPL": {
7+
"value": "element"
58
}
69
}
710
}

libc/config/linux/config.json

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
{
22
"string": {
3-
"LIBC_CONF_STRING_UNSAFE_WIDE_READ": {
4-
"value": true
3+
"LIBC_CONF_STRING_LENGTH_IMPL": {
4+
"value": "clang_vector",
5+
},
6+
"LIBC_CONF_FIND_FIRST_CHARACTER_IMPL": {
7+
"value": "word",
58
}
69
}
710
}
Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
{
22
"string": {
3-
"LIBC_CONF_STRING_UNSAFE_WIDE_READ": {
4-
"value": false
3+
"LIBC_CONF_STRING_LENGTH_IMPL": {
4+
"value": "element"
5+
}
6+
"LIBC_CONF_FIND_FIRST_CHARACTER_IMPL": {
7+
"value": "element"
58
}
69
}
710
}

libc/docs/configure.rst

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -58,8 +58,9 @@ to learn about the defaults for your platform and target.
5858
* **"setjmp" options**
5959
- ``LIBC_CONF_SETJMP_AARCH64_RESTORE_PLATFORM_REGISTER``: Make setjmp save the value of x18, and longjmp restore it. The AArch64 ABI delegates this register to platform ABIs, which can choose whether to make it caller-saved.
6060
* **"string" options**
61+
- ``LIBC_CONF_FIND_FIRST_CHARACTER_IMPL``: Selects the implementation for find-first-character-related functions: 'element', 'word', 'clang_vector', or 'arch_vector'.
6162
- ``LIBC_CONF_MEMSET_X86_USE_SOFTWARE_PREFETCHING``: Inserts prefetch for write instructions (PREFETCHW) for memset on x86 to recover performance when hardware prefetcher is disabled.
62-
- ``LIBC_CONF_STRING_UNSAFE_WIDE_READ``: Read more than a byte at a time to perform byte-string operations like strlen.
63+
- ``LIBC_CONF_STRING_LENGTH_IMPL``: Selects the implementation for string-length: 'element', 'word', 'clang_vector', or 'arch_vector'.
6364
* **"threads" options**
6465
- ``LIBC_CONF_THREAD_MODE``: The implementation used for Mutex, acceptable values are LIBC_THREAD_MODE_PLATFORM, LIBC_THREAD_MODE_SINGLE, and LIBC_THREAD_MODE_EXTERNAL.
6566
* **"time" options**

libc/src/string/memory_utils/aarch64/inline_strlen.h

Lines changed: 10 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@
1515
#include <arm_neon.h>
1616
#include <stddef.h> // size_t
1717
namespace LIBC_NAMESPACE_DECL {
18-
namespace neon {
18+
namespace internal::neon {
1919
[[maybe_unused]] LIBC_NO_SANITIZE_OOB_ACCESS LIBC_INLINE static size_t
2020
string_length(const char *src) {
2121
using Vector __attribute__((may_alias)) = uint8x8_t;
@@ -43,15 +43,15 @@ string_length(const char *src) {
4343
(cpp::countr_zero(cmp) >> 3));
4444
}
4545
}
46-
} // namespace neon
46+
} // namespace internal::neon
4747
} // namespace LIBC_NAMESPACE_DECL
4848
#endif // __ARM_NEON
4949

5050
#ifdef LIBC_TARGET_CPU_HAS_SVE
5151
#include "src/__support/macros/optimization.h"
5252
#include <arm_sve.h>
5353
namespace LIBC_NAMESPACE_DECL {
54-
namespace sve {
54+
namespace internal::sve {
5555
[[maybe_unused]] LIBC_INLINE static size_t string_length(const char *src) {
5656
const uint8_t *ptr = reinterpret_cast<const uint8_t *>(src);
5757
// Initialize the first-fault register to all true
@@ -92,15 +92,19 @@ namespace sve {
9292
len += svcntp_b8(all_true, before_zero);
9393
return len;
9494
}
95-
} // namespace sve
95+
} // namespace internal::sve
9696
} // namespace LIBC_NAMESPACE_DECL
9797
#endif // LIBC_TARGET_CPU_HAS_SVE
9898

9999
namespace LIBC_NAMESPACE_DECL {
100+
namespace internal::arch_vector {
101+
[[maybe_unused]] LIBC_INLINE size_t string_length(const char *src) {
100102
#ifdef LIBC_TARGET_CPU_HAS_SVE
101-
namespace string_length_impl = sve;
103+
return sve::string_length(src);
102104
#elif defined(__ARM_NEON)
103-
namespace string_length_impl = neon;
105+
return neon::string_length(src);
104106
#endif
107+
}
108+
} // namespace internal::arch_vector
105109
} // namespace LIBC_NAMESPACE_DECL
106110
#endif // LLVM_LIBC_SRC_STRING_MEMORY_UTILS_AARCH64_INLINE_STRLEN_H

libc/src/string/memory_utils/generic/inline_strlen.h

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@
1414
#include "src/__support/common.h"
1515

1616
namespace LIBC_NAMESPACE_DECL {
17-
namespace internal {
17+
namespace clang_vector {
1818

1919
// Exploit the underlying integer representation to do a variable shift.
2020
LIBC_INLINE constexpr cpp::simd_mask<char> shift_mask(cpp::simd_mask<char> m,
@@ -46,9 +46,8 @@ LIBC_NO_SANITIZE_OOB_ACCESS LIBC_INLINE size_t string_length(const char *src) {
4646
cpp::find_first_set(mask);
4747
}
4848
}
49-
} // namespace internal
49+
} // namespace clang_vector
5050

51-
namespace string_length_impl = internal;
5251
} // namespace LIBC_NAMESPACE_DECL
5352

5453
#endif // LLVM_LIBC_SRC_STRING_MEMORY_UTILS_GENERIC_INLINE_STRLEN_H

libc/src/string/memory_utils/x86_64/inline_strlen.h

Lines changed: 9 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,8 @@
1515

1616
namespace LIBC_NAMESPACE_DECL {
1717

18-
namespace string_length_internal {
18+
namespace internal::arch_vector {
19+
1920
// Return a bit-mask with the nth bit set if the nth-byte in block_ptr is zero.
2021
template <typename Vector, typename Mask>
2122
LIBC_NO_SANITIZE_OOB_ACCESS LIBC_INLINE static Mask
@@ -92,15 +93,18 @@ namespace avx512 {
9293
}
9394
} // namespace avx512
9495
#endif
95-
} // namespace string_length_internal
9696

97+
[[maybe_unused]] LIBC_INLINE size_t string_length(const char *src) {
9798
#if defined(__AVX512F__)
98-
namespace string_length_impl = string_length_internal::avx512;
99+
return avx512::string_length(src);
99100
#elif defined(__AVX2__)
100-
namespace string_length_impl = string_length_internal::avx2;
101+
return avx2::string_length(src);
101102
#else
102-
namespace string_length_impl = string_length_internal::sse2;
103+
return sse2::string_length(src);
103104
#endif
105+
}
106+
107+
} // namespace internal::arch_vector
104108

105109
} // namespace LIBC_NAMESPACE_DECL
106110

0 commit comments

Comments
 (0)