[libc++] Optimize num_get integral functions #121795

philnik777 · 2025-01-06T16:20:38Z

---------------------------------------------------
Benchmark                            old        new
---------------------------------------------------
BM_num_get<bool>                 86.5 ns    32.3 ns
BM_num_get<long>                 82.1 ns    30.3 ns
BM_num_get<long long>            85.2 ns    33.4 ns
BM_num_get<unsigned short>       85.3 ns    31.2 ns
BM_num_get<unsigned int>         84.2 ns    31.1 ns
BM_num_get<unsigned long>        83.6 ns    31.9 ns
BM_num_get<unsigned long long>   87.7 ns    31.5 ns
BM_num_get<float>                 116 ns     114 ns
BM_num_get<double>                114 ns     114 ns
BM_num_get<long double>           113 ns     114 ns
BM_num_get<void*>                 151 ns     144 ns

This patch applies multiple optimizations:

Stages two and three of do_get are merged and a custom integer parser has been implemented
This avoids allocations, removes the need for strto{,u}ll and avoids __stage2_int_loop (avoiding extra writes to memory)
std::find has been replaced with __atoms_offset, which uses vector instructions to look for a character

Fixes #158100
Fixes #158102

github-actions · 2025-01-06T16:24:47Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff origin/main HEAD --extensions h,cpp -- libcxx/include/__algorithm/simd_utils.h libcxx/include/__locale_dir/locale_base_api.h libcxx/include/__locale_dir/locale_base_api/bsd_locale_fallbacks.h libcxx/include/__locale_dir/locale_base_api/ibm.h libcxx/include/__locale_dir/num.h libcxx/include/__locale_dir/support/bsd_like.h libcxx/include/__locale_dir/support/fuchsia.h libcxx/include/__locale_dir/support/linux.h libcxx/include/__locale_dir/support/no_locale/strtonum.h libcxx/include/__locale_dir/support/windows.h libcxx/include/__support/xlocale/__strtonum_fallback.h libcxx/src/locale.cpp libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_int.pass.cpp libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long.pass.cpp libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long_long.pass.cpp libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_short.pass.cpp --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.

diff --git a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp
index a110aae2d..21ce3ac11 100644
--- a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp
@@ -101,16 +101,14 @@ int main(int, char**)
         assert(v == 291);
     }
     {
-        const char str[] = "a123";
-        std::dec(ios);
-        std::ios_base::iostate err = ios.goodbit;
-        cpp17_input_iterator<const char*> iter =
-            f.get(cpp17_input_iterator<const char*>(str),
-                  cpp17_input_iterator<const char*>(str+sizeof(str)),
-                  ios, err, v);
-        assert(base(iter) == str);
-        assert(err == ios.failbit);
-        assert(v == 0);
+      const char str[] = "a123";
+      std::dec(ios);
+      std::ios_base::iostate err             = ios.goodbit;
+      cpp17_input_iterator<const char*> iter = f.get(
+          cpp17_input_iterator<const char*>(str), cpp17_input_iterator<const char*>(str + sizeof(str)), ios, err, v);
+      assert(base(iter) == str);
+      assert(err == ios.failbit);
+      assert(v == 0);
     }
     {
         const char str[] = "0x123";
@@ -619,7 +617,7 @@ int main(int, char**)
     }
     {
       v                          = -1;
-      std::string str = std::to_string(std::numeric_limits<unsigned long>::max()) + "99a";
+      std::string str            = std::to_string(std::numeric_limits<unsigned long>::max()) + "99a";
       std::ios_base::iostate err = ios.goodbit;
 
       cpp17_input_iterator<const char*> iter = f.get(
@@ -633,15 +631,17 @@ int main(int, char**)
       assert(v == std::numeric_limits<long>::max());
     }
     {
-        std::string str = std::to_string(std::numeric_limits<long>::max()) + 'c';
-        std::ios_base::iostate err = ios.goodbit;
-        cpp17_input_iterator<const char*> iter =
-            f.get(cpp17_input_iterator<const char*>(str.data()),
-                  cpp17_input_iterator<const char*>(str.data() + str.size()),
-                  ios, err, v);
-        assert(base(iter) == str.data() + str.size() - 1);
-        assert(err == ios.goodbit);
-        assert(v == std::numeric_limits<long>::max());
+      std::string str                        = std::to_string(std::numeric_limits<long>::max()) + 'c';
+      std::ios_base::iostate err             = ios.goodbit;
+      cpp17_input_iterator<const char*> iter = f.get(
+          cpp17_input_iterator<const char*>(str.data()),
+          cpp17_input_iterator<const char*>(str.data() + str.size()),
+          ios,
+          err,
+          v);
+      assert(base(iter) == str.data() + str.size() - 1);
+      assert(err == ios.goodbit);
+      assert(v == std::numeric_limits<long>::max());
     }
     {
       std::string str = std::to_string(static_cast<unsigned long>(std::numeric_limits<long>::max()) + 1) + 'c';
diff --git a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_int.pass.cpp b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_int.pass.cpp
index f9cef08e2..1361404a4 100644
--- a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_int.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_int.pass.cpp
@@ -69,15 +69,13 @@ int main(int, char**)
         assert(v == 1);
     }
     {
-        const char str[] = "-1";
-        std::ios_base::iostate err = ios.goodbit;
-        cpp17_input_iterator<const char*> iter =
-            f.get(cpp17_input_iterator<const char*>(str),
-                  cpp17_input_iterator<const char*>(str+sizeof(str)),
-                  ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == std::numeric_limits<unsigned int>::max());
+      const char str[]                       = "-1";
+      std::ios_base::iostate err             = ios.goodbit;
+      cpp17_input_iterator<const char*> iter = f.get(
+          cpp17_input_iterator<const char*>(str), cpp17_input_iterator<const char*>(str + sizeof(str)), ios, err, v);
+      assert(base(iter) == str + sizeof(str) - 1);
+      assert(err == ios.goodbit);
+      assert(v == std::numeric_limits<unsigned int>::max());
     }
     std::hex(ios);
     {
diff --git a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long.pass.cpp b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long.pass.cpp
index fed6fc024..a49904a64 100644
--- a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long.pass.cpp
@@ -69,15 +69,13 @@ int main(int, char**)
         assert(v == 1);
     }
     {
-        const char str[] = "-1";
-        std::ios_base::iostate err = ios.goodbit;
-        cpp17_input_iterator<const char*> iter =
-            f.get(cpp17_input_iterator<const char*>(str),
-                  cpp17_input_iterator<const char*>(str+sizeof(str)),
-                  ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == std::numeric_limits<unsigned long>::max());
+      const char str[]                       = "-1";
+      std::ios_base::iostate err             = ios.goodbit;
+      cpp17_input_iterator<const char*> iter = f.get(
+          cpp17_input_iterator<const char*>(str), cpp17_input_iterator<const char*>(str + sizeof(str)), ios, err, v);
+      assert(base(iter) == str + sizeof(str) - 1);
+      assert(err == ios.goodbit);
+      assert(v == std::numeric_limits<unsigned long>::max());
     }
     std::hex(ios);
     {
diff --git a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long_long.pass.cpp b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long_long.pass.cpp
index 0bdb6c1c3..b6dade2f3 100644
--- a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long_long.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_long_long.pass.cpp
@@ -69,15 +69,13 @@ int main(int, char**)
         assert(v == 1);
     }
     {
-        const char str[] = "-1";
-        std::ios_base::iostate err = ios.goodbit;
-        cpp17_input_iterator<const char*> iter =
-            f.get(cpp17_input_iterator<const char*>(str),
-                  cpp17_input_iterator<const char*>(str+sizeof(str)),
-                  ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == std::numeric_limits<unsigned long long>::max());
+      const char str[]                       = "-1";
+      std::ios_base::iostate err             = ios.goodbit;
+      cpp17_input_iterator<const char*> iter = f.get(
+          cpp17_input_iterator<const char*>(str), cpp17_input_iterator<const char*>(str + sizeof(str)), ios, err, v);
+      assert(base(iter) == str + sizeof(str) - 1);
+      assert(err == ios.goodbit);
+      assert(v == std::numeric_limits<unsigned long long>::max());
     }
     std::hex(ios);
     {
diff --git a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_short.pass.cpp b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_short.pass.cpp
index decfbe943..f1ad05ccf 100644
--- a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_short.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_unsigned_short.pass.cpp
@@ -69,15 +69,13 @@ int main(int, char**)
         assert(v == 1);
     }
     {
-        const char str[] = "-1";
-        std::ios_base::iostate err = ios.goodbit;
-        cpp17_input_iterator<const char*> iter =
-            f.get(cpp17_input_iterator<const char*>(str),
-                  cpp17_input_iterator<const char*>(str+sizeof(str)),
-                  ios, err, v);
-        assert(base(iter) == str+sizeof(str)-1);
-        assert(err == ios.goodbit);
-        assert(v == std::numeric_limits<unsigned short>::max());
+      const char str[]                       = "-1";
+      std::ios_base::iostate err             = ios.goodbit;
+      cpp17_input_iterator<const char*> iter = f.get(
+          cpp17_input_iterator<const char*>(str), cpp17_input_iterator<const char*>(str + sizeof(str)), ios, err, v);
+      assert(base(iter) == str + sizeof(str) - 1);
+      assert(err == ios.goodbit);
+      assert(v == std::numeric_limits<unsigned short>::max());
     }
     std::hex(ios);
     {

llvmbot · 2025-02-26T10:45:32Z

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

---------------------------------------------------
Benchmark                            old        new
---------------------------------------------------
BM_num_get&lt;bool&gt;                 86.5 ns    32.3 ns
BM_num_get&lt;long&gt;                 82.1 ns    30.3 ns
BM_num_get&lt;long long&gt;            85.2 ns    33.4 ns
BM_num_get&lt;unsigned short&gt;       85.3 ns    31.2 ns
BM_num_get&lt;unsigned int&gt;         84.2 ns    31.1 ns
BM_num_get&lt;unsigned long&gt;        83.6 ns    31.9 ns
BM_num_get&lt;unsigned long long&gt;   87.7 ns    31.5 ns
BM_num_get&lt;float&gt;                 116 ns     114 ns
BM_num_get&lt;double&gt;                114 ns     114 ns
BM_num_get&lt;long double&gt;           113 ns     114 ns
BM_num_get&lt;void*&gt;                 151 ns     144 ns

Patch is 20.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121795.diff

7 Files Affected:

(modified) libcxx/docs/ReleaseNotes/20.rst (+2)
(modified) libcxx/include/__algorithm/simd_utils.h (+25)
(modified) libcxx/include/__locale_dir/locale_base_api.h (-11)
(modified) libcxx/include/__locale_dir/support/bsd_like.h (-9)
(modified) libcxx/include/__locale_dir/support/windows.h (-8)
(modified) libcxx/include/locale (+152-153)
(modified) libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp (+12)

diff --git a/libcxx/docs/ReleaseNotes/20.rst b/libcxx/docs/ReleaseNotes/20.rst
index 57ab0c167544b..d8e275020f6f4 100644
--- a/libcxx/docs/ReleaseNotes/20.rst
+++ b/libcxx/docs/ReleaseNotes/20.rst
@@ -120,6 +120,8 @@ Improvements and New Features
 
 - Added :ref:`hardening mode <hardening>` support for ``forward_list`` and ``bitset``.
 
+- The ``num_get::do_get`` integral overloads have been optimized. resulting in a performance improvement of up to 2.8x.
+
 Deprecations and Removals
 -------------------------
 
diff --git a/libcxx/include/__algorithm/simd_utils.h b/libcxx/include/__algorithm/simd_utils.h
index 4e03723a32854..987cf09cf48f7 100644
--- a/libcxx/include/__algorithm/simd_utils.h
+++ b/libcxx/include/__algorithm/simd_utils.h
@@ -116,11 +116,36 @@ template <class _VecT, class _Iter>
   }(make_index_sequence<__simd_vector_size_v<_VecT>>{});
 }
 
+// Load the first _Np elements, zero the rest
+_LIBCPP_DIAGNOSTIC_PUSH
+_LIBCPP_CLANG_DIAGNOSTIC_IGNORED("-Wpsabi")
+template <class _VecT, size_t _Np, class _Iter>
+[[__nodiscard__]] _LIBCPP_ALWAYS_INLINE _LIBCPP_HIDE_FROM_ABI _VecT __partial_load(_Iter __iter) noexcept {
+  return [=]<size_t... _LoadIndices, size_t... _ZeroIndices>(
+             index_sequence<_LoadIndices...>, index_sequence<_ZeroIndices...>) _LIBCPP_ALWAYS_INLINE noexcept {
+    return _VecT{__iter[_LoadIndices]...};
+  }(make_index_sequence<_Np>{}, make_index_sequence<__simd_vector_size_v<_VecT> - _Np>{});
+}
+
+template <class _VecT>
+[[__nodiscard__]] _LIBCPP_ALWAYS_INLINE _LIBCPP_HIDE_FROM_ABI _VecT
+__broadcast(__simd_vector_underlying_type_t<_VecT> __val) {
+  return [&]<std::size_t... _Indices>(index_sequence<_Indices...>) {
+    return _VecT{((void)_Indices, __val)...};
+  }(make_index_sequence<__simd_vector_size_v<_VecT>>());
+}
+_LIBCPP_DIAGNOSTIC_POP
+
 template <class _Tp, size_t _Np>
 [[__nodiscard__]] _LIBCPP_HIDE_FROM_ABI bool __all_of(__simd_vector<_Tp, _Np> __vec) noexcept {
   return __builtin_reduce_and(__builtin_convertvector(__vec, __simd_vector<bool, _Np>));
 }
 
+template <class _Tp, size_t _Np>
+[[__nodiscard__]] _LIBCPP_HIDE_FROM_ABI bool __none_of(__simd_vector<_Tp, _Np> __vec) noexcept {
+  return !__builtin_reduce_or(__vec);
+}
+
 template <class _Tp, size_t _Np>
 [[__nodiscard__]] _LIBCPP_HIDE_FROM_ABI size_t __find_first_set(__simd_vector<_Tp, _Np> __vec) noexcept {
   using __mask_vec = __simd_vector<bool, _Np>;
diff --git a/libcxx/include/__locale_dir/locale_base_api.h b/libcxx/include/__locale_dir/locale_base_api.h
index bbee9f49867fd..efe1e9fae44ac 100644
--- a/libcxx/include/__locale_dir/locale_base_api.h
+++ b/libcxx/include/__locale_dir/locale_base_api.h
@@ -51,8 +51,6 @@
 //  float               __strtof(const char*, char**, __locale_t);
 //  double              __strtod(const char*, char**, __locale_t);
 //  long double         __strtold(const char*, char**, __locale_t);
-//  long long           __strtoll(const char*, char**, __locale_t);
-//  unsigned long long  __strtoull(const char*, char**, __locale_t);
 // }
 //
 // Character manipulation functions
@@ -182,15 +180,6 @@ inline _LIBCPP_HIDE_FROM_ABI long double __strtold(const char* __nptr, char** __
   return strtold_l(__nptr, __endptr, __loc);
 }
 
-inline _LIBCPP_HIDE_FROM_ABI long long __strtoll(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
-  return strtoll_l(__nptr, __endptr, __base, __loc);
-}
-
-inline _LIBCPP_HIDE_FROM_ABI unsigned long long
-__strtoull(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
-  return strtoull_l(__nptr, __endptr, __base, __loc);
-}
-
 //
 // Character manipulation functions
 //
diff --git a/libcxx/include/__locale_dir/support/bsd_like.h b/libcxx/include/__locale_dir/support/bsd_like.h
index c0080b13a08cf..6ebf7726f0fab 100644
--- a/libcxx/include/__locale_dir/support/bsd_like.h
+++ b/libcxx/include/__locale_dir/support/bsd_like.h
@@ -75,15 +75,6 @@ inline _LIBCPP_HIDE_FROM_ABI long double __strtold(const char* __nptr, char** __
   return ::strtold_l(__nptr, __endptr, __loc);
 }
 
-inline _LIBCPP_HIDE_FROM_ABI long long __strtoll(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
-  return ::strtoll_l(__nptr, __endptr, __base, __loc);
-}
-
-inline _LIBCPP_HIDE_FROM_ABI unsigned long long
-__strtoull(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
-  return ::strtoull_l(__nptr, __endptr, __base, __loc);
-}
-
 //
 // Character manipulation functions
 //
diff --git a/libcxx/include/__locale_dir/support/windows.h b/libcxx/include/__locale_dir/support/windows.h
index ff89d3e87eb44..c3b24ca013181 100644
--- a/libcxx/include/__locale_dir/support/windows.h
+++ b/libcxx/include/__locale_dir/support/windows.h
@@ -184,14 +184,6 @@ inline _LIBCPP_HIDE_FROM_ABI double __strtod(const char* __nptr, char** __endptr
   return ::_strtod_l(__nptr, __endptr, __loc);
 }
 
-inline _LIBCPP_HIDE_FROM_ABI long long __strtoll(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
-  return ::_strtoi64_l(__nptr, __endptr, __base, __loc);
-}
-inline _LIBCPP_HIDE_FROM_ABI unsigned long long
-__strtoull(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
-  return ::_strtoui64_l(__nptr, __endptr, __base, __loc);
-}
-
 //
 // Character manipulation functions
 //
diff --git a/libcxx/include/locale b/libcxx/include/locale
index be0f31cece671..cd06dba8adbe6 100644
--- a/libcxx/include/locale
+++ b/libcxx/include/locale
@@ -198,7 +198,9 @@ template <class charT> class messages_byname;
 #    include <__algorithm/equal.h>
 #    include <__algorithm/find.h>
 #    include <__algorithm/max.h>
+#    include <__algorithm/min.h>
 #    include <__algorithm/reverse.h>
+#    include <__algorithm/simd_utils.h>
 #    include <__algorithm/unwrap_iter.h>
 #    include <__assert>
 #    include <__iterator/access.h>
@@ -375,7 +377,7 @@ struct _LIBCPP_EXPORTED_FROM_ABI __num_get_base {
   static int __get_base(ios_base&);
   static const char __src[33]; // "0123456789abcdefABCDEFxX+-pPiInN"
   // count of leading characters in __src used for parsing integers ("012..X+-")
-  static const size_t __int_chr_cnt = 26;
+  static inline const size_t __int_chr_cnt = 26;
   // count of leading characters in __src used for parsing floating-point values ("012..-pP")
   static const size_t __fp_chr_cnt = 28;
 };
@@ -403,6 +405,27 @@ struct __num_get : protected __num_get_base {
 
   [[__deprecated__("This exists only for ABI compatibility")]] static string
   __stage2_int_prep(ios_base& __iob, _CharT* __atoms, _CharT& __thousands_sep);
+
+  _LIBCPP_HIDE_FROM_ABI static ptrdiff_t __atoms_offset(const _CharT* __atoms, _CharT __val) {
+#    if _LIBCPP_HAS_ALGORITHM_VECTOR_UTILS
+    _LIBCPP_DIAGNOSTIC_PUSH
+    _LIBCPP_CLANG_DIAGNOSTIC_IGNORED("-Wpsabi")
+    if constexpr (is_same<_CharT, char>::value) {
+      using __vec   = __simd_vector<char, 32>;
+      __vec __chars = std::__broadcast<__vec>(__val);
+      __vec __cmp   = std::__partial_load<__vec, __int_chr_cnt>(__atoms);
+      auto __res    = __chars == __cmp;
+      if (std::__none_of(__res))
+        return __int_chr_cnt;
+      return std::min(__int_chr_cnt, std::__find_first_set(__res));
+    } else
+      _LIBCPP_DIAGNOSTIC_POP
+#    endif
+      {
+        return std::find(__atoms, __atoms + __int_chr_cnt, __val) - __atoms;
+      }
+  }
+
   static int __stage2_int_loop(
       _CharT __ct,
       int __base,
@@ -476,7 +499,7 @@ int __num_get<_CharT>::__stage2_int_loop(
     }
     return 0;
   }
-  ptrdiff_t __f = std::find(__atoms, __atoms + __int_chr_cnt, __ct) - __atoms;
+  ptrdiff_t __f = __atoms_offset(__atoms, __ct);
   if (__f >= 24)
     return -1;
   switch (__base) {
@@ -637,43 +660,39 @@ protected:
   _LIBCPP_METHOD_TEMPLATE_IMPLICIT_INSTANTIATION_VIS iter_type
   __do_get_floating_point(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, _Fp& __v) const;
 
-  template <class _Signed>
-  _LIBCPP_METHOD_TEMPLATE_IMPLICIT_INSTANTIATION_VIS iter_type
-  __do_get_signed(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, _Signed& __v) const;
-
-  template <class _Unsigned>
+  template <class _MaybeSigned>
   _LIBCPP_METHOD_TEMPLATE_IMPLICIT_INSTANTIATION_VIS iter_type
-  __do_get_unsigned(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, _Unsigned& __v) const;
+  __do_get_integral(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, _MaybeSigned& __v) const;
 
   virtual iter_type do_get(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, bool& __v) const;
 
   virtual iter_type do_get(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, long& __v) const {
-    return this->__do_get_signed(__b, __e, __iob, __err, __v);
+    return this->__do_get_integral(__b, __e, __iob, __err, __v);
   }
 
   virtual iter_type
   do_get(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, long long& __v) const {
-    return this->__do_get_signed(__b, __e, __iob, __err, __v);
+    return this->__do_get_integral(__b, __e, __iob, __err, __v);
   }
 
   virtual iter_type
   do_get(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, unsigned short& __v) const {
-    return this->__do_get_unsigned(__b, __e, __iob, __err, __v);
+    return this->__do_get_integral(__b, __e, __iob, __err, __v);
   }
 
   virtual iter_type
   do_get(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, unsigned int& __v) const {
-    return this->__do_get_unsigned(__b, __e, __iob, __err, __v);
+    return this->__do_get_integral(__b, __e, __iob, __err, __v);
   }
 
   virtual iter_type
   do_get(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, unsigned long& __v) const {
-    return this->__do_get_unsigned(__b, __e, __iob, __err, __v);
+    return this->__do_get_integral(__b, __e, __iob, __err, __v);
   }
 
   virtual iter_type
   do_get(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, unsigned long long& __v) const {
-    return this->__do_get_unsigned(__b, __e, __iob, __err, __v);
+    return this->__do_get_integral(__b, __e, __iob, __err, __v);
   }
 
   virtual iter_type do_get(iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, float& __v) const {
@@ -695,65 +714,6 @@ protected:
 template <class _CharT, class _InputIterator>
 locale::id num_get<_CharT, _InputIterator>::id;
 
-template <class _Tp>
-_LIBCPP_HIDE_FROM_ABI _Tp
-__num_get_signed_integral(const char* __a, const char* __a_end, ios_base::iostate& __err, int __base) {
-  if (__a != __a_end) {
-    __libcpp_remove_reference_t<decltype(errno)> __save_errno = errno;
-    errno                                                     = 0;
-    char* __p2;
-    long long __ll = __locale::__strtoll(__a, &__p2, __base, _LIBCPP_GET_C_LOCALE);
-    __libcpp_remove_reference_t<decltype(errno)> __current_errno = errno;
-    if (__current_errno == 0)
-      errno = __save_errno;
-    if (__p2 != __a_end) {
-      __err = ios_base::failbit;
-      return 0;
-    } else if (__current_errno == ERANGE || __ll < numeric_limits<_Tp>::min() || numeric_limits<_Tp>::max() < __ll) {
-      __err = ios_base::failbit;
-      if (__ll > 0)
-        return numeric_limits<_Tp>::max();
-      else
-        return numeric_limits<_Tp>::min();
-    }
-    return static_cast<_Tp>(__ll);
-  }
-  __err = ios_base::failbit;
-  return 0;
-}
-
-template <class _Tp>
-_LIBCPP_HIDE_FROM_ABI _Tp
-__num_get_unsigned_integral(const char* __a, const char* __a_end, ios_base::iostate& __err, int __base) {
-  if (__a != __a_end) {
-    const bool __negate = *__a == '-';
-    if (__negate && ++__a == __a_end) {
-      __err = ios_base::failbit;
-      return 0;
-    }
-    __libcpp_remove_reference_t<decltype(errno)> __save_errno = errno;
-    errno                                                     = 0;
-    char* __p2;
-    unsigned long long __ll = __locale::__strtoull(__a, &__p2, __base, _LIBCPP_GET_C_LOCALE);
-    __libcpp_remove_reference_t<decltype(errno)> __current_errno = errno;
-    if (__current_errno == 0)
-      errno = __save_errno;
-    if (__p2 != __a_end) {
-      __err = ios_base::failbit;
-      return 0;
-    } else if (__current_errno == ERANGE || numeric_limits<_Tp>::max() < __ll) {
-      __err = ios_base::failbit;
-      return numeric_limits<_Tp>::max();
-    }
-    _Tp __res = static_cast<_Tp>(__ll);
-    if (__negate)
-      __res = -__res;
-    return __res;
-  }
-  __err = ios_base::failbit;
-  return 0;
-}
-
 template <class _Tp>
 _LIBCPP_HIDE_FROM_ABI _Tp __do_strtod(const char* __a, char** __p2);
 
@@ -822,12 +782,12 @@ _InputIterator num_get<_CharT, _InputIterator>::do_get(
   return __b;
 }
 
-// signed
-
 template <class _CharT, class _InputIterator>
-template <class _Signed>
-_InputIterator num_get<_CharT, _InputIterator>::__do_get_signed(
-    iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, _Signed& __v) const {
+template <class _MaybeSigned>
+_InputIterator num_get<_CharT, _InputIterator>::__do_get_integral(
+    iter_type __first, iter_type __last, ios_base& __iob, ios_base::iostate& __err, _MaybeSigned& __v) const {
+  using _Unsigned = __make_unsigned_t<_MaybeSigned>;
+
   // Stage 1
   int __base = this->__get_base(__iob);
   // Stage 2
@@ -836,98 +796,137 @@ _InputIterator num_get<_CharT, _InputIterator>::__do_get_signed(
   char_type __atoms1[__atoms_size];
   const char_type* __atoms = this->__do_widen(__iob, __atoms1);
   string __grouping        = this->__stage2_int_prep(__iob, __thousands_sep);
-  string __buf;
-  __buf.resize(__buf.capacity());
-  char* __a     = &__buf[0];
-  char* __a_end = __a;
   unsigned __g[__num_get_base::__num_get_buf_sz];
   unsigned* __g_end = __g;
   unsigned __dc     = 0;
-  for (; __b != __e; ++__b) {
-    if (__a_end == __a + __buf.size()) {
-      size_t __tmp = __buf.size();
-      __buf.resize(2 * __buf.size());
-      __buf.resize(__buf.capacity());
-      __a     = &__buf[0];
-      __a_end = __a + __tmp;
+
+  if (__first == __last) {
+    __err |= ios_base::eofbit;
+    return __first;
+  }
+
+  while (!__grouping.empty() && *__first == __thousands_sep) {
+    ++__first;
+    if (__g_end - __g < this->__num_get_buf_sz) {
+      *__g_end++ = __dc;
+      __dc       = 0;
     }
-    if (this->__stage2_int_loop(
-            *__b,
-            __base,
-            __a,
-            __a_end,
-            __dc,
-            __thousands_sep,
-            __grouping,
-            __g,
-            __g_end,
-            const_cast<char_type*>(__atoms)))
-      break;
   }
-  if (__grouping.size() != 0 && __g_end - __g < __num_get_base::__num_get_buf_sz)
-    *__g_end++ = __dc;
-  // Stage 3
-  __v = std::__num_get_signed_integral<_Signed>(__a, __a_end, __err, __base);
-  // Digit grouping checked
-  __check_grouping(__grouping, __g, __g_end, __err);
-  // EOF checked
-  if (__b == __e)
+
+  bool __negate = false;
+  // __c == '+' || __c == '-'
+  if (auto __c = *__first; __c == __atoms[24] || __c == __atoms[25]) {
+    __negate = __c == __atoms[25];
+    ++__first;
+  }
+
+  if (__first == __last) {
     __err |= ios_base::eofbit;
-  return __b;
-}
+    return __first;
+  }
 
-// unsigned
+  bool __parsed_num = false;
 
-template <class _CharT, class _InputIterator>
-template <class _Unsigned>
-_InputIterator num_get<_CharT, _InputIterator>::__do_get_unsigned(
-    iter_type __b, iter_type __e, ios_base& __iob, ios_base::iostate& __err, _Unsigned& __v) const {
-  // Stage 1
-  int __base = this->__get_base(__iob);
-  // Stage 2
-  char_type __thousands_sep;
-  const int __atoms_size = __num_get_base::__int_chr_cnt;
-  char_type __atoms1[__atoms_size];
-  const char_type* __atoms = this->__do_widen(__iob, __atoms1);
-  string __grouping        = this->__stage2_int_prep(__iob, __thousands_sep);
-  string __buf;
-  __buf.resize(__buf.capacity());
-  char* __a     = &__buf[0];
-  char* __a_end = __a;
-  unsigned __g[__num_get_base::__num_get_buf_sz];
-  unsigned* __g_end = __g;
-  unsigned __dc     = 0;
-  for (; __b != __e; ++__b) {
-    if (__a_end == __a + __buf.size()) {
-      size_t __tmp = __buf.size();
-      __buf.resize(2 * __buf.size());
-      __buf.resize(__buf.capacity());
-      __a     = &__buf[0];
-      __a_end = __a + __tmp;
+  // If we don't have a pre-set base, figure it out
+  if (__base == 0) {
+    auto __c = *__first;
+    // __c == '0'
+    if (__c == __atoms[0]) {
+      ++__first;
+      if (__first == __last) {
+        __err |= ios_base::eofbit;
+        return __first;
+      }
+      // __c2 == 'x' || __c2 == 'X'
+      if (auto __c2 = *__first; __c2 == __atoms[22] || __c2 == __atoms[23]) {
+        __base = 16;
+        ++__first;
+      } else {
+        __base = 8;
+      }
+    } else {
+      __base = 10;
     }
-    if (this->__stage2_int_loop(
-            *__b,
-            __base,
-            __a,
-            __a_end,
-            __dc,
-            __thousands_sep,
-            __grouping,
-            __g,
-            __g_end,
-            const_cast<char_type*>(__atoms)))
+  } else if (__base == 16) {
+    // Try to swallow '0x'
+
+    // *__first == '0'
+    if (*__first == __atoms[0]) {
+      ++__first;
+      if (__first == __last) {
+        __err |= ios_base::eofbit;
+        return __first;
+      }
+      // __c == 'x' || __c == 'X'
+      if (auto __c = *__first; __c == __atoms[22] || __c == __atoms[23])
+        ++__first;
+      else
+        __parsed_num = true; // We only swallowed '0', so we've started to parse a number
+    }
+  }
+
+  // Calculate the actual number
+  _Unsigned __val   = 0;
+  bool __overflowed = false;
+  for (; __first != __last; ++__first) {
+    auto __c = *__first;
+    if (!__grouping.empty() && __c == __thousands_sep) {
+      if (__g_end - __g < this->__num_get_buf_sz) {
+        *__g_end++ = __dc;
+        __dc       = 0;
+      }
+      continue;
+    }
+    auto __offset = this->__atoms_offset(__atoms, __c);
+    if (__offset >= 22)
       break;
+
+    if (__base == 16 && __offset >= 16)
+      __offset -= 6;
+    if (__offset >= __base)
+      break;
+    __overflowed |= __builtin_mul_overflow(__val, __base, &__val) || __builtin_add_overflow(__val, __offset, &__val);
+    __parsed_num = true;
+    ++__dc;
+  }
+
+  if (!__parsed_num) {
+    __err |= ios_base::failbit;
+    __v = 0;
+  } else if (__overflowed) {
+    __err |= ios_base::failbit;
+    __v = is_signed<_MaybeSigned>::value && __negate
+            ? numeric_limits<_MaybeSigned>::min()
+            : numeric_limits<_MaybeSigned>::max();
+  } else if (!__negate) {
+    if (__val > static_cast<_Unsigned>(numeric_limits<_MaybeSigned>::max())) {
+      __err |= ios_base::failbit;
+      __v = numeric_limits<_MaybeSigned>::max();
+    } else {
+      __v = __val;
+    }
+  } else if (is_signed<_MaybeSigned>::value) {
+    if (__val > static_cast<_Unsigned>(numeric_limits<_MaybeSigned>::max()) + 1) {
+      __err |= ios_base::failbit;
+      __v = numeric_limits<_MaybeSigned>::min();
+    } else if (__val == static_cast<_Unsigned>(numeric_limits<_MaybeSigned>::max()) + 1) {
+      __v = numeric_limits<_MaybeSigned>::min();
+    } else {
+      __v = -__val;
+    }
+  } else {
+    __v = -__val;
   }
+
   if (__grouping.size() != 0 && __g_end - __g < __num_get_base::__num_get_buf_sz)
     *__g_end++ = __dc;
-  // Stage 3
-  __v = std::__num_get_unsigned_integral<_Unsigned>(__a, __a_end, __err, __base);
+
   // Digit grouping checked
   __check_grouping(__grouping, __g, __g_end, __err);
   // EOF checked
-  if (__b == __e)
+  if (__first == __last)
     __err |= ios_base::eofbit;
-  return __b;
+  return __first;
 }
 
 // floating point
diff --git a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp
index 03b74ebf53936..6bf38bd8c72a1 100644
--- a/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp
@@ -102,6 +102,18 @@ int main(int, char**)
         assert(err == ios.goodbit);
         assert(v == 291);
     }
+    {
+        const char str[] = "a123";
+        std::dec(ios);
+      ...
[truncated]

libcxx/docs/ReleaseNotes/20.rst

libcxx/include/__locale_dir/support/bsd_like.h

libcxx/include/locale

libcxx/include/__algorithm/simd_utils.h

libcxx/include/locale

...on/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp

libcxx/docs/ReleaseNotes/21.rst

libcxx/include/__locale_dir/num.h

ldionne · 2025-08-12T16:18:18Z

libcxx/include/__locale_dir/num.h

@fhahn I was expecting this code to get vectorized since it's a simple call to std::find where we even know the size of the range at compile-time. However we can't get Clang to emit vector code for that: https://godbolt.org/z/qxrj7n46K

Do you have a clue what's happening?

Yes, we should use __builtin_assume_dereferenceable to tell the compiler that the full input range is dereferenceable. Currently the size for the builtin needs to be a constant, but the underlying intrinsic already supports variable sizes and I'll just need to update the builtin in Clang.

An example of how the builtin is used is here: https://clang.godbolt.org/z/vq853PGjr

I'm not yet sure where the best place would be to add this in libc++. One way would be to add a specialization of std::find for std::vector and others?

Oh and another issue in the specific case is that LLVM currently will fully unroll loops with 'small' trip counts, which also means we won't vectorize, but that's a separate issue.

Ah, is that why https://godbolt.org/z/d6hW1vP9h doesn't vectorize?

Yeah the unrolling is the issue here. Interestingly enough it also doesn't vectorize if you disable unrolling (or change the constant to something larger, like 128; the reason seems to be that we compute the end pointer using an integer add and convert it back to a pointer. I'll check what's going on there

So with #153566, we will be able to vectorize something like

auto test(std::vector<short>::iterator first, short s) { auto ptr = __builtin_assume_aligned(&*first, 2); __builtin_assume_dereferenceable(ptr, 128 * sizeof(short)); return std::find(first, first + 128, s); }

https://godbolt.org/z/Wrovddv37

I think what we would want to write (in an ideal world) in the library is this: https://clang.godbolt.org/z/zcoce9erE

Do you envision that working?

morinmorin · 2025-09-11T23:21:30Z

This might be out of scope for this PR, but the following tests are failing:

When basefield is 0, a standard-conforming implementation parses "0" as (octal) zero, does not set failbit, and consumes all characters.

// "0" is an octal zero
std::istringstream is("0");
is >> std::setbase(0);

int x;
is >> x;
bool failbit = is.fail();
if (failbit) is.clear();
auto next_pos = is.tellg();

std::cout << "value: " << x << '\n'; // expected: 0
std::cout << "fail: " << failbit << '\n'; // expected: 0
std::cout << "next: " << next_pos << '\n'; // expected: -1

When basefield is 0, a standard-conforming implementation parses "08" as (octal) zero, does not set failbit, and consumes only the leading "0".

std::istringstream is("08");
is >> std::setbase(0);

int x;
is >> x;
bool failbit = is.fail();
if (failbit) is.clear();
auto next_pos = is.tellg();

std::cout << "value: " << x << '\n'; // expected: 0
std::cout << "fail: " << failbit << '\n'; // expected: 0
std::cout << "next: " << next_pos << '\n'; // expected: 1

philnik777 · 2025-10-31T12:55:22Z

@morinmorin Could you provide some reasoning for why that's the case? (I'm not saying you're wrong, I just want to understand why) I think these would be valuable test cases for this change.

morinmorin · 2025-11-01T13:06:27Z

@philnik777 Thank you for the comment. Here's the rationale behind the two test cases.

Stage 3 converts the accumulated sequence using the rules of strtoll. With base = 0, strtoll specifies that (emphasis mine)

the expected form of the subject sequence is that of an integer constant as described in 6.4.5.2

This is 7.24.2.8 p3 in WG14 N3299 (C23 draft). In this context, "integer constant" corresponds to a C++ integer literal.

C integer constants
6.4.5.2 p1 in N3299 specifies the literal 0 as an octal integer constant. Additionally, 6.4.5.2 p4 says:

An octal constant consists of the prefix 0 optionally followed by a sequence of the digits 0 through 7 only

This should be understood as "the digit 0 optionally followed by a sequence of the digits 0 through 7 only", which matches the formal grammar. The C++ rules for integer literals are analogous.

The first test case

When basefield is 0, a standard-conforming implementation parses "0" as (octal) zero, does not set failbit, and consumes all characters.

Stage 2 yields "0".
Stage 3 converts "0" to 0.
Because Stage 3 successfully converts the entire output of Stage 2, failbit is not set.

The second test case

When basefield is 0, a standard-conforming implementation parses "08" as (octal) zero, does not set failbit, and consumes only the leading "0".

Under the rules of scanf("%i",…)/strtol(…,…,0), "08" is not a valid sequence. So Stage 2 does not accumulate '8'. It yields "0" and the trailing '8' remains unconsumed for subsequence extraction.
Stage 3 converts "0" to 0.
Again, because Stage 3 successfully converts the entire output of Stage 2, failbit is not set.

ldionne

LGTM, thanks! A few comments to basically add more comments.

libcxx/include/__locale_dir/num.h

ldionne · 2025-11-10T15:32:52Z

libcxx/include/__locale_dir/num.h

-    __thousands_sep              = __np.thousands_sep();
-    return __np.grouping();
+  _LIBCPP_HIDE_FROM_ABI static ptrdiff_t __atoms_offset(const _CharT* __atoms, _CharT __val) {
+#  if _LIBCPP_HAS_ALGORITHM_VECTOR_UTILS


Let's add a comment pointing to one of the bug reports we have on std::find not vectorizing properly. That way we can come back and simplify this code once Clang does the right thing.

The last change to support __builtin_assume_derefenceable with pointer differences landed in main recently, so somehing like https://clang.godbolt.org/z/sEPbx3YaM now works.

Please let me know if anything else is missing

Yeah, neither versions currently seems to generate to what I'd like to see: https://godbolt.org/z/7G9vMYqY4 (i.e. a few vector instructions)

``` --------------------------------------------------- Benchmark old new --------------------------------------------------- BM_num_get<bool> 86.5 ns 32.3 ns BM_num_get<long> 82.1 ns 30.3 ns BM_num_get<long long> 85.2 ns 33.4 ns BM_num_get<unsigned short> 85.3 ns 31.2 ns BM_num_get<unsigned int> 84.2 ns 31.1 ns BM_num_get<unsigned long> 83.6 ns 31.9 ns BM_num_get<unsigned long long> 87.7 ns 31.5 ns BM_num_get<float> 116 ns 114 ns BM_num_get<double> 114 ns 114 ns BM_num_get<long double> 113 ns 114 ns BM_num_get<void*> 151 ns 144 ns ``` This patch applies multiple optimizations: - Stages two and three of do_get are merged and a custom integer parser has been implemented This avoids allocations, removes the need for strto{,u}ll and avoids __stage2_int_loop (avoiding extra writes to memory) - std::find has been replaced with __atoms_offset, which uses vector instructions to look for a character Fixes llvm#158100 Fixes llvm#158102

alexfh · 2025-12-01T13:29:13Z

Hi @philnik777, this seems to affect user-visible behavior of libc++. I don't know what exactly, but we've root-caused numerous test failures to this commit. From a brief look this might require some debugging to find the problem, so I'd appreciate any pointers at what we should be looking for in the affected tests/binaries. Assuming this change is correct, do you know any specific types of UB that the user code could depend on with regard to this function?
Thanks!

philnik777 · 2025-12-01T13:38:29Z

@alexfh This patch fixes some parsing bugs mentioned above, which could be the problem. Other than that I can't think of anything right now. Are the test crashing or just not giving the expected results?

alexfh · 2025-12-01T13:44:24Z

@alexfh This patch fixes some parsing bugs mentioned above, which could be the problem. Other than that I can't think of anything right now. Are the test crashing or just not giving the expected results?

One thing that happened is that material file loader in https://github.com/google/filament stopped working at least on some inputs. Does this commit affect std::stoul? If so, then the issue might be somewhere around https://github.com/google/filament/blob/main/tools/matinfo/src/main.cpp#L657

alexfh · 2025-12-01T13:45:26Z

In general, what's the API surface that this change affects? std::istream, std::stou?(i|l|ll)? Anything else?

philnik777 · 2025-12-01T13:51:05Z

Any streams should be affected, but I don't think the sto* family is. They call strto* functions, which don't use C++ locales.

alexfh · 2025-12-01T13:52:45Z

Apart from filament, we see problems in code using https://github.com/KhronosGroup/SPIRV-Cross.

eaeltsin · 2025-12-02T11:11:18Z

Looks like this change keeps the read value uninitialized in certain cases - see https://godbolt.org/z/jTzc4r4vo.

The following program

#include <cstdint>
#include <cstdio>
#include <iomanip>
#include <sstream>

template <typename T>
bool ParseNumber(const char* text, T* value_pointer) {
  if (!text) return false;
  std::istringstream text_stream(text);
  text_stream >> std::setbase(0);
  text_stream >> *value_pointer;
  bool ok = (text[0] != 0) && !text_stream.bad();
  ok = ok && text_stream.eof();
  ok = ok && !text_stream.fail();
  return ok;
}

int main() {
  const char* text = "0";  // <--------------------------------- parsing this keeps the value unassigned
  std::istringstream text_stream(text);
  uint64_t value = 0xaaaaaaaaaaaaaaaa;
  std::printf("0x%lx\n", value);
  ParseNumber(text, &value);
  std::printf("0x%lx\n", value);
  return 0;
}

at this commit prints

0xaaaaaaaaaaaaaaaa
0xaaaaaaaaaaaaaaaa

while before

0xaaaaaaaaaaaaaaaa
0x0

eaeltsin · 2025-12-02T11:23:05Z

The above example is actually https://github.com/KhronosGroup/SPIRV-Tools/blob/main/source/util/parse_number.h#L191

alexfh · 2025-12-02T12:25:25Z

@philnik777 it looks like the issue is in the libc++ change itself rather than in the user code.

``` --------------------------------------------------- Benchmark old new --------------------------------------------------- BM_num_get<bool> 86.5 ns 32.3 ns BM_num_get<long> 82.1 ns 30.3 ns BM_num_get<long long> 85.2 ns 33.4 ns BM_num_get<unsigned short> 85.3 ns 31.2 ns BM_num_get<unsigned int> 84.2 ns 31.1 ns BM_num_get<unsigned long> 83.6 ns 31.9 ns BM_num_get<unsigned long long> 87.7 ns 31.5 ns BM_num_get<float> 116 ns 114 ns BM_num_get<double> 114 ns 114 ns BM_num_get<long double> 113 ns 114 ns BM_num_get<void*> 151 ns 144 ns ``` This patch applies multiple optimizations: - Stages two and three of do_get are merged and a custom integer parser has been implemented This avoids allocations, removes the need for strto{,u}ll and avoids __stage2_int_loop (avoiding extra writes to memory) - std::find has been replaced with __atoms_offset, which uses vector instructions to look for a character Fixes llvm#158100 Fixes llvm#158102

philnik777 · 2025-12-03T10:49:26Z

@eaeltsin @alexfh #170460 should fix the issues. Could you verify that?

alexfh · 2025-12-03T17:01:21Z

@eaeltsin @alexfh #170460 should fix the issues. Could you verify that?

#170460 fixes the problems we've seen so far. Thanks!

philnik777 force-pushed the optimize_num_get branch 2 times, most recently from 0b82a8c to b466dc6 Compare January 6, 2025 16:56

philnik777 force-pushed the optimize_num_get branch 3 times, most recently from 4a99b3e to ae3fb04 Compare January 30, 2025 20:06

philnik777 marked this pull request as ready for review February 26, 2025 10:44

philnik777 requested a review from a team as a code owner February 26, 2025 10:44

llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Feb 26, 2025

ldionne requested changes Feb 26, 2025

View reviewed changes

philnik777 mentioned this pull request Feb 27, 2025

[libc++] Fix the locale base API on Linux with musl #128936

Merged

philnik777 force-pushed the optimize_num_get branch 2 times, most recently from 0ffa10e to 9f61a94 Compare February 28, 2025 12:44

philnik777 force-pushed the optimize_num_get branch from 9f61a94 to ea90f16 Compare July 4, 2025 11:04

philnik777 force-pushed the optimize_num_get branch 2 times, most recently from 0512ce1 to a6c2c52 Compare July 23, 2025 14:34

philnik777 force-pushed the optimize_num_get branch 2 times, most recently from 2225563 to 75ea131 Compare August 12, 2025 11:41

ldionne added the performance label Aug 12, 2025

ldionne requested changes Aug 12, 2025

View reviewed changes

philnik777 force-pushed the optimize_num_get branch from 75ea131 to d662ab8 Compare August 14, 2025 12:02

philnik777 force-pushed the optimize_num_get branch from d662ab8 to 71fc06b Compare September 4, 2025 11:15

This was referenced Sep 11, 2025

[libcxx] operator>>(integer): incorrect hex prefix recognition for basefield == 16 #158100

Closed

[libcxx] operator>>(integer) consumes too much input characters when basefield == 0 #158102

Closed

philnik777 force-pushed the optimize_num_get branch from 71fc06b to 6ab488f Compare October 31, 2025 12:55

philnik777 force-pushed the optimize_num_get branch from 6ab488f to 446a26b Compare November 7, 2025 13:32

philnik777 force-pushed the optimize_num_get branch from 446a26b to 4f56439 Compare November 10, 2025 10:34

ldionne approved these changes Nov 10, 2025

View reviewed changes

[libc++] Optimize num_get integral functions

5c62e32

philnik777 force-pushed the optimize_num_get branch from 4f56439 to 5c62e32 Compare November 18, 2025 15:37

philnik777 merged commit 2bdd135 into llvm:main Nov 24, 2025
74 of 81 checks passed

philnik777 deleted the optimize_num_get branch November 24, 2025 15:54

[libc++] Optimize num_get integral functions #121795

[libc++] Optimize num_get integral functions #121795

Uh oh!

Conversation

philnik777 commented Jan 6, 2025 • edited by ldionne Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jan 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Feb 26, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

morinmorin commented Sep 11, 2025

Uh oh!

philnik777 commented Oct 31, 2025

Uh oh!

morinmorin commented Nov 1, 2025

Uh oh!

ldionne left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

alexfh commented Dec 1, 2025

Uh oh!

philnik777 commented Dec 1, 2025

Uh oh!

alexfh commented Dec 1, 2025

Uh oh!

alexfh commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

philnik777 commented Dec 1, 2025

Uh oh!

alexfh commented Dec 1, 2025

Uh oh!

eaeltsin commented Dec 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

eaeltsin commented Dec 2, 2025

Uh oh!

philnik777 commented Jan 6, 2025 •

edited by ldionne

Loading

github-actions bot commented Jan 6, 2025 •

edited

Loading

alexfh commented Dec 1, 2025 •

edited

Loading

eaeltsin commented Dec 2, 2025 •

edited

Loading