Skip to content

Conversation

@SchrodingerZhu
Copy link
Contributor

@SchrodingerZhu SchrodingerZhu commented Sep 24, 2024

Based on discussion at https://www.openwall.com/lists/libc-coord/2024/09/23/1, vDSO support can be a good implementation for arc4random. Let's create our wrappers!


This patch is reworked.


// CPRNG provides LLVM-libc with a cryptographically secure random number.
// arc4random is available in BSD systems and later introduced in glibc.
// However, arc4random family faces performance issues on pre 6.11 Linux
// as each entropy generation would demand a round trip to and from the kernel.
// We still use getrandom syscall as fallback for older kernels. For modern
// kernels, we use the vDSO interface to get random bytes.
// Such vDSO interface differs from the getrandom syscall in that it demands
// the userspace to maintain an opaque state. System library (e.g. libc) is in
// charge of allocating a large bunch of such states and lend them to each
// thread demanding random bytes.
// Our approach should be providing similar guarantees as arc4random family on
// BSD and bionic. It is not async-signal-safe, but we have means to prevent
// information leakage due to reentrancy in signal frames. The vDSO getrandom
// implementation itself provides security across forks (by wiping pages).

@llvmbot
Copy link
Member

llvmbot commented Sep 24, 2024

@llvm/pr-subscribers-libc

Author: Schrodinger ZHU Yifan (SchrodingerZhu)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/109870.diff

11 Files Affected:

  • (modified) libc/src/__support/OSUtil/linux/CMakeLists.txt (+22)
  • (modified) libc/src/__support/OSUtil/linux/aarch64/vdso.h (+2)
  • (added) libc/src/__support/OSUtil/linux/random.cpp (+323)
  • (added) libc/src/__support/OSUtil/linux/random.h (+20)
  • (modified) libc/src/__support/OSUtil/linux/vdso_sym.h (+6-3)
  • (modified) libc/src/__support/OSUtil/linux/x86_64/vdso.h (+2)
  • (modified) libc/src/__support/threads/thread.cpp (+1-1)
  • (modified) libc/test/integration/src/__support/CMakeLists.txt (+1)
  • (added) libc/test/integration/src/__support/OSUtil/CMakeLists.txt (+5)
  • (added) libc/test/integration/src/__support/OSUtil/linux/CMakeLists.txt (+10)
  • (added) libc/test/integration/src/__support/OSUtil/linux/random_fill_test.cpp (+22)
diff --git a/libc/src/__support/OSUtil/linux/CMakeLists.txt b/libc/src/__support/OSUtil/linux/CMakeLists.txt
index 6c7014940407d8..4752e326cc153e 100644
--- a/libc/src/__support/OSUtil/linux/CMakeLists.txt
+++ b/libc/src/__support/OSUtil/linux/CMakeLists.txt
@@ -53,3 +53,25 @@ add_object_library(
     libc.src.errno.errno
     libc.src.sys.auxv.getauxval
 )
+
+add_object_library(
+    random
+  HDRS
+    random.h
+  SRCS
+    random.cpp
+  DEPENDS
+    libc.src.sys.random.getrandom
+    libc.src.sys.mman.mmap
+    libc.src.sys.mman.munmap
+    libc.src.unistd.sysconf
+    libc.src.errno.errno
+    libc.src.__support.common
+    libc.src.__support.OSUtil.linux.vdso
+    libc.src.__support.threads.callonce
+    libc.src.__support.threads.linux.raw_mutex
+    libc.src.__support.threads.thread
+    libc.src.sched.sched_getaffinity
+    libc.src.sched.__sched_getcpucount
+)
+
diff --git a/libc/src/__support/OSUtil/linux/aarch64/vdso.h b/libc/src/__support/OSUtil/linux/aarch64/vdso.h
index 3c4c6205071da2..ee5777ad67f6dd 100644
--- a/libc/src/__support/OSUtil/linux/aarch64/vdso.h
+++ b/libc/src/__support/OSUtil/linux/aarch64/vdso.h
@@ -23,6 +23,8 @@ LIBC_INLINE constexpr cpp::string_view symbol_name(VDSOSym sym) {
     return "__kernel_clock_gettime";
   case VDSOSym::ClockGetRes:
     return "__kernel_clock_getres";
+  case VDSOSym::GetRandom:
+    return "__kernel_getrandom";
   default:
     return "";
   }
diff --git a/libc/src/__support/OSUtil/linux/random.cpp b/libc/src/__support/OSUtil/linux/random.cpp
new file mode 100644
index 00000000000000..612a184bda8455
--- /dev/null
+++ b/libc/src/__support/OSUtil/linux/random.cpp
@@ -0,0 +1,323 @@
+#include "src/__support/OSUtil/linux/random.h"
+#include "src/__support/CPP/mutex.h"
+#include "src/__support/CPP/new.h"
+#include "src/__support/OSUtil/linux/syscall.h"
+#include "src/__support/OSUtil/linux/vdso.h"
+#include "src/__support/OSUtil/linux/x86_64/vdso.h"
+#include "src/__support/libc_assert.h"
+#include "src/__support/memory_size.h"
+#include "src/__support/threads/callonce.h"
+#include "src/__support/threads/linux/callonce.h"
+#include "src/__support/threads/linux/raw_mutex.h"
+#include "src/errno/libc_errno.h"
+#include "src/sched/sched_getaffinity.h"
+#include "src/sched/sched_getcpucount.h"
+#include "src/stdlib/atexit.h"
+#include "src/sys/mman/mmap.h"
+#include "src/sys/mman/munmap.h"
+#include "src/sys/random/getrandom.h"
+#include "src/unistd/sysconf.h"
+#include <asm/param.h>
+
+namespace LIBC_NAMESPACE_DECL {
+namespace {
+// errno protection
+struct ErrnoProtect {
+  int backup;
+  ErrnoProtect() : backup(libc_errno) { libc_errno = 0; }
+  ~ErrnoProtect() { libc_errno = backup; }
+};
+
+// parameters for allocating per-thread random state
+struct Params {
+  unsigned size_of_opaque_state;
+  unsigned mmap_prot;
+  unsigned mmap_flags;
+  unsigned reserved[13];
+};
+
+// for registering thread-specific atexit callbacks
+using Destructor = void(void *);
+extern "C" int __cxa_thread_atexit_impl(Destructor *, void *, void *);
+extern "C" [[gnu::weak, gnu::visibility("hidden")]] void *__dso_handle =
+    nullptr;
+
+class MMapContainer {
+  void **ptr = nullptr;
+  void **usage = nullptr;
+  void **boundary = nullptr;
+
+  internal::SafeMemSize capacity() const {
+    return internal::SafeMemSize{
+        static_cast<size_t>(reinterpret_cast<ptrdiff_t>(boundary) -
+                            reinterpret_cast<ptrdiff_t>(ptr))};
+  }
+
+  internal::SafeMemSize bytes() const {
+    return capacity() * internal::SafeMemSize{sizeof(void *)};
+  }
+
+  bool initialize() {
+    internal::SafeMemSize page_size{static_cast<size_t>(sysconf(_SC_PAGESIZE))};
+    if (!page_size.valid())
+      return false;
+    ptr = reinterpret_cast<void **>(mmap(nullptr, page_size,
+                                         PROT_READ | PROT_WRITE,
+                                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
+    if (ptr == MAP_FAILED)
+      return false;
+    usage = ptr;
+    boundary = ptr + page_size / sizeof(void *);
+    return true;
+  }
+
+  bool grow(size_t additional) {
+    if (ptr == nullptr)
+      return initialize();
+
+    size_t old_capacity = capacity();
+
+    internal::SafeMemSize target_bytes{additional};
+    internal::SafeMemSize new_bytes = bytes();
+    target_bytes = target_bytes + size();
+    target_bytes = target_bytes * internal::SafeMemSize{sizeof(void *)};
+
+    if (!target_bytes.valid())
+      return false;
+    while (new_bytes < target_bytes) {
+      new_bytes = new_bytes * internal::SafeMemSize{static_cast<size_t>(2)};
+      if (!new_bytes.valid())
+        return false;
+    }
+
+    // TODO: migrate to syscall wrapper once it's available
+    auto result = syscall_impl<intptr_t>(
+        SYS_mremap, bytes(), static_cast<size_t>(new_bytes), MREMAP_MAYMOVE);
+
+    if (result < 0 && result > -EXEC_PAGESIZE)
+      return false;
+    ptr = reinterpret_cast<void **>(result);
+    usage = ptr + old_capacity;
+    boundary = ptr + new_bytes / sizeof(void *);
+    return true;
+  }
+
+public:
+  MMapContainer() = default;
+  ~MMapContainer() {
+    if (!ptr)
+      return;
+    munmap(ptr, bytes());
+  }
+
+  bool ensure_space(size_t additional) {
+    if (usage + additional >= boundary && !grow(additional))
+      return false;
+    return true;
+  }
+
+  void push_unchecked(void *value) {
+    LIBC_ASSERT(usage != boundary && "pushing into full container");
+    *usage++ = value;
+  }
+
+  using iterator = void **;
+  using value_type = void *;
+  iterator begin() const { return ptr; }
+  iterator end() const { return usage; }
+
+  bool empty() const { return begin() == end(); }
+  void *pop() {
+    LIBC_ASSERT(!empty() && "popping from empty container");
+    return *--usage;
+  }
+  internal::SafeMemSize size() const {
+    return internal::SafeMemSize{static_cast<size_t>(
+        reinterpret_cast<ptrdiff_t>(usage) - reinterpret_cast<ptrdiff_t>(ptr))};
+  }
+};
+
+class StateFactory {
+  RawMutex mutex{};
+  MMapContainer allocations{};
+  MMapContainer freelist{};
+  Params params{};
+  size_t states_per_page = 0;
+  size_t pages_per_allocation = 0;
+  size_t page_size = 0;
+
+  bool prepare() {
+    vdso::TypedSymbol<vdso::VDSOSym::GetRandom> vgetrandom;
+
+    if (!vgetrandom)
+      return false;
+
+    // get the allocation configuration suggested by the kernel
+    if (vgetrandom(nullptr, 0, 0, &params, ~0UL))
+      return false;
+
+    cpu_set_t cs{};
+
+    if (LIBC_NAMESPACE::sched_getaffinity(0, sizeof(cs), &cs))
+      return false;
+
+    internal::SafeMemSize count{static_cast<size_t>(
+        LIBC_NAMESPACE::__sched_getcpucount(sizeof(cs), &cs))};
+
+    internal::SafeMemSize allocation_size =
+        internal::SafeMemSize{
+            static_cast<size_t>(params.size_of_opaque_state)} *
+        count;
+
+    page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
+    allocation_size = allocation_size.align_up(page_size);
+    if (!allocation_size.valid())
+      return false;
+
+    states_per_page = page_size / params.size_of_opaque_state;
+    pages_per_allocation = allocation_size / page_size;
+
+    return true;
+  }
+
+  bool allocate_new_states() {
+    if (!allocations.ensure_space(1))
+      return false;
+
+    // we always ensure the freelist can contain all the allocated states
+    internal::SafeMemSize total_size =
+        internal::SafeMemSize{page_size} *
+        internal::SafeMemSize{pages_per_allocation} *
+        (internal::SafeMemSize{static_cast<size_t>(1)} + allocations.size());
+
+    if (!total_size.valid() ||
+        !freelist.ensure_space(total_size - freelist.size()))
+      return false;
+
+    auto *new_allocation =
+        static_cast<char *>(mmap(nullptr, page_size * pages_per_allocation,
+                                 params.mmap_prot, params.mmap_flags, -1, 0));
+    if (new_allocation == MAP_FAILED)
+      return false;
+
+    for (size_t i = 0; i < pages_per_allocation; ++i) {
+      auto *page = new_allocation + i * page_size;
+      for (size_t j = 0; j < states_per_page; ++j)
+        freelist.push_unchecked(page + j * params.size_of_opaque_state);
+    }
+    return true;
+  }
+
+  static StateFactory *instance() {
+    alignas(StateFactory) static char storage[sizeof(StateFactory)]{};
+    static CallOnceFlag flag = callonce_impl::NOT_CALLED;
+    static bool valid = false;
+    callonce(&flag, []() {
+      auto *factory = new (storage) StateFactory();
+      valid = factory->prepare();
+      if (valid)
+        atexit([]() {
+          auto factory = reinterpret_cast<StateFactory *>(storage);
+          factory->~StateFactory();
+          valid = false;
+        });
+    });
+    return valid ? reinterpret_cast<StateFactory *>(storage) : nullptr;
+  }
+
+  void *acquire() {
+    cpp::lock_guard guard{mutex};
+    if (freelist.empty() && !allocate_new_states())
+      return nullptr;
+    return freelist.pop();
+  }
+  void release(void *state) {
+    cpp::lock_guard guard{mutex};
+    // there should be no need to check this pushing
+    freelist.push_unchecked(state);
+  }
+  ~StateFactory() {
+    for (auto *allocation : allocations)
+      munmap(allocation, page_size * pages_per_allocation);
+  }
+
+public:
+  static void *acquire_global() {
+    auto *factory = instance();
+    if (!factory)
+      return nullptr;
+    return factory->acquire();
+  }
+  static void release_global(void *state) {
+    auto *factory = instance();
+    if (!factory)
+      return;
+    factory->release(state);
+  }
+  static size_t size_of_opaque_state() {
+    return instance()->params.size_of_opaque_state;
+  }
+};
+
+void *acquire_tls() {
+  static thread_local void *state = nullptr;
+  // previous acquire failed, do not try again
+  if (state == MAP_FAILED)
+    return nullptr;
+  // first acquirement
+  if (state == nullptr) {
+    state = StateFactory::acquire_global();
+    // if still fails, remember the failure
+    if (state == nullptr) {
+      state = MAP_FAILED;
+      return nullptr;
+    } else {
+      // register the release callback.
+      if (__cxa_thread_atexit_impl(
+              [](void *s) { StateFactory::release_global(s); }, state,
+              __dso_handle)) {
+        StateFactory::release_global(state);
+        state = MAP_FAILED;
+        return nullptr;
+      }
+    }
+  }
+  return state;
+}
+
+template <class F> void random_fill_impl(F gen, void *buf, size_t size) {
+  auto *buffer = reinterpret_cast<uint8_t *>(buf);
+  while (size > 0) {
+    ssize_t len = gen(buffer, size);
+    if (len == -1) {
+      if (libc_errno == EINTR)
+        continue;
+      break;
+    }
+    size -= len;
+    buffer += len;
+  }
+}
+} // namespace
+
+void random_fill(void *buf, size_t size) {
+  ErrnoProtect protect;
+  void *state = acquire_tls();
+  if (state) {
+    random_fill_impl(
+        [state](void *buf, size_t size) {
+          vdso::TypedSymbol<vdso::VDSOSym::GetRandom> vgetrandom;
+          return vgetrandom(buf, size, 0, state,
+                            StateFactory::size_of_opaque_state());
+        },
+        buf, size);
+  } else {
+    random_fill_impl(
+        [](void *buf, size_t size) {
+          return LIBC_NAMESPACE::getrandom(buf, size, 0);
+        },
+        buf, size);
+  }
+}
+
+} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/__support/OSUtil/linux/random.h b/libc/src/__support/OSUtil/linux/random.h
new file mode 100644
index 00000000000000..0e2d51391ec31c
--- /dev/null
+++ b/libc/src/__support/OSUtil/linux/random.h
@@ -0,0 +1,20 @@
+//===-- Utilities for getting secure randomness -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_RANDOMNESS_H
+#define LLVM_LIBC_SRC___SUPPORT_RANDOMNESS_H
+
+#include "src/__support/common.h"
+
+#define __need_size_t
+#include <stddef.h>
+
+namespace LIBC_NAMESPACE_DECL {
+void random_fill(void *buf, unsigned long size);
+} // namespace LIBC_NAMESPACE_DECL
+#endif // LLVM_LIBC_SRC___SUPPORT_RANDOMNESS_H
diff --git a/libc/src/__support/OSUtil/linux/vdso_sym.h b/libc/src/__support/OSUtil/linux/vdso_sym.h
index 968e1536c4d270..2b1cf398369de2 100644
--- a/libc/src/__support/OSUtil/linux/vdso_sym.h
+++ b/libc/src/__support/OSUtil/linux/vdso_sym.h
@@ -19,7 +19,6 @@ struct __kernel_timespec;
 struct timezone;
 struct riscv_hwprobe;
 struct getcpu_cache;
-struct cpu_set_t;
 // NOLINTEND(llvmlibc-implementation-in-namespace)
 
 namespace LIBC_NAMESPACE_DECL {
@@ -35,7 +34,8 @@ enum class VDSOSym {
   RTSigReturn,
   FlushICache,
   RiscvHwProbe,
-  VDSOSymCount
+  GetRandom,
+  VDSOSymCount,
 };
 
 template <VDSOSym sym> LIBC_INLINE constexpr auto dispatcher() {
@@ -58,8 +58,11 @@ template <VDSOSym sym> LIBC_INLINE constexpr auto dispatcher() {
   else if constexpr (sym == VDSOSym::FlushICache)
     return static_cast<void (*)(void *, void *, unsigned int)>(nullptr);
   else if constexpr (sym == VDSOSym::RiscvHwProbe)
-    return static_cast<int (*)(riscv_hwprobe *, size_t, size_t, cpu_set_t *,
+    return static_cast<int (*)(riscv_hwprobe *, size_t, size_t, void *,
                                unsigned)>(nullptr);
+  else if constexpr (sym == VDSOSym::GetRandom)
+    return static_cast<int (*)(void *, size_t, unsigned int, void *, size_t)>(
+        nullptr);
   else
     return static_cast<void *>(nullptr);
 }
diff --git a/libc/src/__support/OSUtil/linux/x86_64/vdso.h b/libc/src/__support/OSUtil/linux/x86_64/vdso.h
index abe7c33e07cfab..f46fcb038f2e60 100644
--- a/libc/src/__support/OSUtil/linux/x86_64/vdso.h
+++ b/libc/src/__support/OSUtil/linux/x86_64/vdso.h
@@ -29,6 +29,8 @@ LIBC_INLINE constexpr cpp::string_view symbol_name(VDSOSym sym) {
     return "__vdso_time";
   case VDSOSym::ClockGetRes:
     return "__vdso_clock_getres";
+  case VDSOSym::GetRandom:
+    return "__vdso_getrandom";
   default:
     return "";
   }
diff --git a/libc/src/__support/threads/thread.cpp b/libc/src/__support/threads/thread.cpp
index dad4f75f092ede..04668dbfcbb63a 100644
--- a/libc/src/__support/threads/thread.cpp
+++ b/libc/src/__support/threads/thread.cpp
@@ -117,7 +117,7 @@ class ThreadAtExitCallbackMgr {
 
   int add_callback(AtExitCallback *callback, void *obj) {
     cpp::lock_guard lock(mtx);
-    return callback_list.push_back({callback, obj});
+    return callback_list.push_back({callback, obj}) ? 0 : -1;
   }
 
   void call() {
diff --git a/libc/test/integration/src/__support/CMakeLists.txt b/libc/test/integration/src/__support/CMakeLists.txt
index b5b6557e8d6899..d2dae7e02a9c57 100644
--- a/libc/test/integration/src/__support/CMakeLists.txt
+++ b/libc/test/integration/src/__support/CMakeLists.txt
@@ -1,4 +1,5 @@
 add_subdirectory(threads)
+add_subdirectory(OSUtil)
 if(LIBC_TARGET_OS_IS_GPU)
   add_subdirectory(GPU)
 endif()
diff --git a/libc/test/integration/src/__support/OSUtil/CMakeLists.txt b/libc/test/integration/src/__support/OSUtil/CMakeLists.txt
new file mode 100644
index 00000000000000..5ff1a11aff5c9d
--- /dev/null
+++ b/libc/test/integration/src/__support/OSUtil/CMakeLists.txt
@@ -0,0 +1,5 @@
+add_custom_target(libc-osutil-integration-tests)
+
+if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/${LIBC_TARGET_OS})
+  add_subdirectory(${LIBC_TARGET_OS})
+endif()
diff --git a/libc/test/integration/src/__support/OSUtil/linux/CMakeLists.txt b/libc/test/integration/src/__support/OSUtil/linux/CMakeLists.txt
new file mode 100644
index 00000000000000..a4f15ad8370352
--- /dev/null
+++ b/libc/test/integration/src/__support/OSUtil/linux/CMakeLists.txt
@@ -0,0 +1,10 @@
+add_integration_test(
+  random_fill_test
+  SUITE
+    libc-osutil-integration-tests
+  SRCS
+    random_fill_test.cpp
+  DEPENDS
+    libc.include.pthread
+    libc.src.__support.OSUtil.linux.random
+)
diff --git a/libc/test/integration/src/__support/OSUtil/linux/random_fill_test.cpp b/libc/test/integration/src/__support/OSUtil/linux/random_fill_test.cpp
new file mode 100644
index 00000000000000..bde24a1e0d5521
--- /dev/null
+++ b/libc/test/integration/src/__support/OSUtil/linux/random_fill_test.cpp
@@ -0,0 +1,22 @@
+//===-- Tests for pthread_equal -------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/__support/OSUtil/linux/random.h"
+
+#include "test/IntegrationTest/test.h"
+
+void smoke_test() {
+  using namespace LIBC_NAMESPACE;
+  uint32_t buffer;
+  random_fill(&buffer, sizeof(buffer));
+}
+
+TEST_MAIN() {
+  smoke_test();
+  return 0;
+}

@SchrodingerZhu
Copy link
Contributor Author

SchrodingerZhu commented Sep 24, 2024

Some pending questions:

What would the be the most proper behavior post fork?

Should I clear the state space and repopulate the states across fork? Or if we are not anticipating further usage, is it appropriate to simply let some tls blocks leak from the freelist?

Copy link
Contributor

@michaelrj-google michaelrj-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple quick nits from a preliminary review. Overall this seems like a good start but will need refining

Copy link
Member

@jyknight jyknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments. There's a lot of work needed on signal safety/re-entrancy in this code -- I'm sure there's more such issues which I didn't leave a comment about.

Copy link
Contributor

@michaelrj-google michaelrj-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this a bit deeper, I think there are significant changes needed.

@QuarticCat QuarticCat removed their request for review October 5, 2024 20:14
@SchrodingerZhu
Copy link
Contributor Author

@michaelrj-google PTAL when you have time. I guess the team are quite busy before the devconf.

@michaelrj-google
Copy link
Contributor

Yes, sorry the review is delayed. I will hopefully get to it after the devconf. Feel free to ping me if I haven't gotten to it by November

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants