[libc][stdlib] Implement setenv() with environment management infrastructure #163018
Open: kaladron wants to merge 3 commits into llvm:main from kaladron:setenv
+656
−0
b288ac0 [libc][stdlib] Implement setenv() with environment management infrast… (kaladron)
4985ce1 Remove unsetenv cleanup calls from setenv tests, to be uncommented af… (kaladron)
71166d9 Wrap the files in if(LLVM_LIBC_FULL_BUILD) so that they won't show up… (kaladron)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,165 @@ | ||
//===-- Implementation of internal environment utilities ------------------===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#include "environ_internal.h" | ||
#include "config/app.h" | ||
#include "src/__support/CPP/string_view.h" | ||
#include "src/__support/macros/config.h" | ||
#include "src/string/memcpy.h" | ||
|
||
// We use extern "C" declarations for malloc/free/realloc instead of including | ||
// src/stdlib/malloc.h, src/stdlib/free.h, and src/stdlib/realloc.h. This allows | ||
// the implementation to work with different allocator implementations, | ||
// particularly in integration tests which provide a simple bump allocator. The | ||
// extern "C" linkage ensures we use whatever allocator is linked with the test | ||
// or application. | ||
extern "C" void *malloc(size_t); | ||
extern "C" void free(void *); | ||
extern "C" void *realloc(void *, size_t); | ||
|
||
namespace LIBC_NAMESPACE_DECL { | ||
namespace internal { | ||
|
||
// Minimum initial capacity for the environment array when first allocated. | ||
// This avoids frequent reallocations for small environments. | ||
constexpr size_t MIN_ENVIRON_CAPACITY = 32; | ||
|
||
// Growth factor for environment array capacity when expanding. | ||
// When capacity is exceeded, new_capacity = needed * | ||
// ENVIRON_GROWTH_FACTOR. | ||
constexpr size_t ENVIRON_GROWTH_FACTOR = 2; | ||
|
||
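// Global state for environment management. environ_mutex is constructed with | ||
// all four flags false: not timed, not recursive, not robust, and not shared | ||
// across processes. | ||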
Mutex environ_mutex(false, false, false, false); | ||
char **environ_storage = nullptr; | ||
EnvStringOwnership *environ_ownership = nullptr; | ||
size_t environ_capacity = 0; | ||
size_t environ_size = 0; | ||
bool environ_is_ours = false; | ||
|
||
char **get_environ_array() { | ||
if (environ_is_ours) | ||
return environ_storage; | ||
return reinterpret_cast<char **>(LIBC_NAMESPACE::app.env_ptr); | ||
} | ||
|
||
void init_environ() { | ||
// Count entries in the startup environ | ||
char **env_ptr = reinterpret_cast<char **>(LIBC_NAMESPACE::app.env_ptr); | ||
if (!env_ptr) | ||
return; | ||
|
||
size_t count = 0; | ||
for (char **env = env_ptr; *env != nullptr; env++) | ||
count++; | ||
|
||
environ_size = count; | ||
} | ||
|
||
int find_env_var(cpp::string_view name) { | ||
char **env_array = get_environ_array(); | ||
if (!env_array) | ||
return -1; | ||
|
||
for (size_t i = 0; i < environ_size; i++) { | ||
cpp::string_view current(env_array[i]); | ||
if (!current.starts_with(name)) | ||
continue; | ||
|
||
// Check that name is followed by '=' | ||
if (current.size() > name.size() && current[name.size()] == '=') | ||
return static_cast<int>(i); | ||
} | ||
|
||
return -1; | ||
} | ||
|
||
bool ensure_capacity(size_t needed) { | ||
// IMPORTANT: This function assumes environ_mutex is already held by the | ||
// caller. Do not add locking here as it would cause deadlock. | ||
|
||
// If we're still using the startup environ, we need to copy it | ||
if (!environ_is_ours) { | ||
char **old_env = reinterpret_cast<char **>(LIBC_NAMESPACE::app.env_ptr); | ||
|
||
// Allocate new array with room to grow | ||
size_t new_capacity = needed < MIN_ENVIRON_CAPACITY | ||
? MIN_ENVIRON_CAPACITY | ||
: needed * ENVIRON_GROWTH_FACTOR; | ||
char **new_storage = | ||
reinterpret_cast<char **>(malloc(sizeof(char *) * (new_capacity + 1))); | ||
if (!new_storage) | ||
return false; | ||
|
||
// Allocate ownership tracking array | ||
EnvStringOwnership *new_ownership = reinterpret_cast<EnvStringOwnership *>( | ||
malloc(sizeof(EnvStringOwnership) * (new_capacity + 1))); | ||
if (!new_ownership) { | ||
free(new_storage); | ||
return false; | ||
} | ||
|
||
// Copy existing pointers (we don't own the strings yet, so just copy | ||
// pointers) | ||
if (old_env) { | ||
for (size_t i = 0; i < environ_size; i++) { | ||
new_storage[i] = old_env[i]; | ||
// Initialize ownership: startup strings are not owned by us | ||
new_ownership[i] = EnvStringOwnership(); | ||
} | ||
} | ||
new_storage[environ_size] = nullptr; | ||
|
||
environ_storage = new_storage; | ||
environ_ownership = new_ownership; | ||
environ_capacity = new_capacity; | ||
environ_is_ours = true; | ||
|
||
// Update app.env_ptr to point to our storage | ||
LIBC_NAMESPACE::app.env_ptr = | ||
reinterpret_cast<uintptr_t *>(environ_storage); | ||
|
||
return true; | ||
} | ||
|
||
// We already own environ, check if we need to grow it | ||
if (needed <= environ_capacity) | ||
return true; | ||
|
||
// Grow capacity by the growth factor | ||
size_t new_capacity = needed * ENVIRON_GROWTH_FACTOR; | ||
|
||
// Use realloc to grow the arrays | ||
char **new_storage = reinterpret_cast<char **>( | ||
realloc(environ_storage, sizeof(char *) * (new_capacity + 1))); | ||
if (!new_storage) | ||
return false; | ||
|
||
EnvStringOwnership *new_ownership = | ||
reinterpret_cast<EnvStringOwnership *>(realloc( | ||
environ_ownership, sizeof(EnvStringOwnership) * (new_capacity + 1))); | ||
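if (!new_ownership) { | ||
// The storage realloc above already succeeded and may have moved the | ||
// array, so the old environ_storage pointer is no longer valid. Keep the | ||
// reallocated storage and republish it through app.env_ptr so that it does | ||
// not dangle. environ_capacity stays unchanged because the ownership array | ||
// did not grow. | ||
environ_storage = new_storage; | ||
LIBC_NAMESPACE::app.env_ptr = | ||
reinterpret_cast<uintptr_t *>(environ_storage); | ||
return false; | ||
} | ||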
|
||
environ_storage = new_storage; | ||
environ_ownership = new_ownership; | ||
environ_capacity = new_capacity; | ||
|
||
// Update app.env_ptr to point to our new storage | ||
LIBC_NAMESPACE::app.env_ptr = reinterpret_cast<uintptr_t *>(environ_storage); | ||
|
||
return true; | ||
} | ||
|
||
} // namespace internal | ||
} // namespace LIBC_NAMESPACE_DECL |
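The lookup rule and the capacity policy used above can be tried in isolation. The following is a minimal sketch using only the standard library, not the patch's code: `find_var`, `initial_capacity`, `grown_capacity`, `kMinCapacity`, and `kGrowthFactor` are hypothetical names, and the lookup walks to the `nullptr` terminator instead of relying on a tracked `environ_size`.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-ins for MIN_ENVIRON_CAPACITY and ENVIRON_GROWTH_FACTOR.
constexpr std::size_t kMinCapacity = 32;
constexpr std::size_t kGrowthFactor = 2;

// Returns the index of `name` in a nullptr-terminated environ-style array,
// or -1 if it is absent. An entry matches only if it has the exact form
// "name=...", so looking up "PATH" does not match "PATHEXT=...".
int find_var(const char *const *env, std::string_view name) {
  if (env == nullptr)
    return -1;
  for (int i = 0; env[i] != nullptr; ++i) {
    std::string_view entry(env[i]);
    if (entry.size() > name.size() &&
        entry.substr(0, name.size()) == name && entry[name.size()] == '=')
      return i;
  }
  return -1;
}

// Capacity chosen when the startup environment is first copied: small
// requests are rounded up to the minimum, larger ones get headroom.
std::size_t initial_capacity(std::size_t needed) {
  return needed < kMinCapacity ? kMinCapacity : needed * kGrowthFactor;
}

// Capacity chosen when an already-owned array has to grow.
std::size_t grown_capacity(std::size_t needed) {
  return needed * kGrowthFactor;
}
```

Both capacity helpers scale the requested size rather than the previous capacity, which mirrors the expressions in `ensure_capacity`.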
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
//===-- Internal utilities for environment management ----------*- C++ -*-===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#ifndef LLVM_LIBC_SRC_STDLIB_ENVIRON_INTERNAL_H | ||
#define LLVM_LIBC_SRC_STDLIB_ENVIRON_INTERNAL_H | ||
|
||
#include "hdr/types/size_t.h" | ||
#include "src/__support/CPP/string_view.h" | ||
#include "src/__support/macros/attributes.h" | ||
#include "src/__support/macros/config.h" | ||
#include "src/__support/threads/mutex.h" | ||
|
||
namespace LIBC_NAMESPACE_DECL { | ||
namespace internal { | ||
|
||
// Ownership information for environment strings. | ||
// We need to track ownership because environment strings come from three | ||
// sources: | ||
// 1. Startup environment (from program loader) - we don't own these | ||
// 2. putenv() calls where caller provides the string - we don't own these | ||
// 3. setenv() calls where we allocate the string - we DO own these | ||
// Only strings we allocated can be freed when replaced or removed. | ||
struct EnvStringOwnership { | ||
bool allocated_by_us; // True if we malloc'd this string (must free). | ||
// False for startup environ or putenv strings (don't | ||
// free). | ||
|
||
// Default: not owned by us (startup or putenv - don't free). | ||
LIBC_INLINE EnvStringOwnership() : allocated_by_us(false) {} | ||
|
||
// Returns true if this string can be safely freed. | ||
LIBC_INLINE bool can_free() const { return allocated_by_us; } | ||
}; | ||
|
||
// Global mutex protecting all environ modifications | ||
extern Mutex environ_mutex; | ||
|
||
// Our allocated environ array (nullptr if using startup environ) | ||
extern char **environ_storage; | ||
|
||
// Parallel array tracking ownership of each environ string | ||
// Same size/capacity as environ_storage | ||
extern EnvStringOwnership *environ_ownership; | ||
|
||
// Allocated capacity of environ_storage | ||
extern size_t environ_capacity; | ||
|
||
// Current number of variables in environ | ||
extern size_t environ_size; | ||
|
||
// True if we allocated environ_storage (and are responsible for freeing it) | ||
extern bool environ_is_ours; | ||
|
||
// Search for a variable by name in the current environ array. | ||
// Returns the index if found, or -1 if not found. | ||
// This function assumes the mutex is already held. | ||
int find_env_var(cpp::string_view name); | ||
|
||
// Ensure environ has capacity for at least `needed` entries (plus null | ||
// terminator). May allocate or reallocate environ_storage. Returns true on | ||
// success, false on allocation failure. This function assumes the mutex is | ||
// already held. | ||
bool ensure_capacity(size_t needed); | ||
|
||
// Get a pointer to the current environ array. | ||
// This may be app.env_ptr (startup environ) or environ_storage (our copy). | ||
char **get_environ_array(); | ||
|
||
// Initialize environ management from the startup environment. | ||
// This must be called before any setenv/unsetenv operations. | ||
// It only counts the startup entries, so repeated calls are harmless, but it | ||
// takes no lock itself: call it with environ_mutex held. | ||
|
||
} // namespace internal | ||
} // namespace LIBC_NAMESPACE_DECL | ||
|
||
#endif // LLVM_LIBC_SRC_STDLIB_ENVIRON_INTERNAL_H |
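The ownership rule described in the header (free only what we allocated) can be illustrated with a small standalone sketch. It assumes a hosted C++ environment and uses `std::vector` for brevity, whereas the patch uses raw `malloc`'d parallel arrays. `EnvTable`, `Ownership`, `make_entry`, `add_borrowed`, and `replace_owned` are hypothetical names, not part of the patch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Mirrors the idea of EnvStringOwnership: one flag per environment string.
struct Ownership {
  bool allocated_by_us = false;
};

// Builds a malloc'd "name=value" string; returns nullptr if malloc fails.
char *make_entry(const char *name, const char *value) {
  std::size_t n = std::strlen(name);
  std::size_t v = std::strlen(value);
  char *s = static_cast<char *>(std::malloc(n + v + 2));
  if (s == nullptr)
    return nullptr;
  std::memcpy(s, name, n);
  s[n] = '=';
  std::memcpy(s + n + 1, value, v + 1); // also copies the terminating NUL
  return s;
}

struct EnvTable {
  std::vector<char *> strings;   // the environment entries
  std::vector<Ownership> owners; // parallel array, same indices as strings

  EnvTable() = default;
  EnvTable(const EnvTable &) = delete;
  EnvTable &operator=(const EnvTable &) = delete;

  ~EnvTable() {
    for (std::size_t i = 0; i < strings.size(); ++i)
      if (owners[i].allocated_by_us)
        std::free(strings[i]);
  }

  // Records a string provided by someone else (startup or putenv style).
  // It is never freed by this table.
  void add_borrowed(char *s) {
    strings.push_back(s);
    owners.push_back(Ownership{});
  }

  // Replaces entry i with a freshly allocated "name=value" string. The old
  // string is freed only if this table allocated it.
  bool replace_owned(std::size_t i, const char *name, const char *value) {
    char *s = make_entry(name, value);
    if (s == nullptr)
      return false;
    if (owners[i].allocated_by_us)
      std::free(strings[i]);
    strings[i] = s;
    owners[i].allocated_by_us = true;
    return true;
  }
};
```

Keeping the flags in a parallel array leaves the string array itself in the plain `char **` layout that callers of `environ` expect.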
Can we use Clang Thread Safety Analysis to statically check this?
I'm happy to give it a try! I'm heading to bed now, but will take a look in the morning (likewise with the Windows and Mac bot failures).
I tried running with ASan / UBSan but couldn't figure those out with the bump allocator.
I took a look: today, no. They're only really used for testing so far, not actually deployed inside LLVM. It looks like we'd have to build some infrastructure to use them, and I think that's best saved for a follow-up PR. I have a few things queued up that I'd like to get landed and then would happily loop back to this.
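For reference, here is a rough sketch of what the "mutex is already held" contract could look like with Clang Thread Safety Analysis annotations. It is a standalone illustration under stated assumptions, not a proposal for this PR: `AnnotatedMutex`, `env_mutex`, `env_size`, `bump_size_locked`, `add_entry`, and the `TSA` macro are all hypothetical, and it wraps `std::mutex` rather than the libc `Mutex`. The attributes compile away on other compilers; with Clang, building with `-Wthread-safety` enables the checks.

```cpp
#include <cstddef>
#include <mutex>

// Expand to Clang's thread safety attributes, or to nothing elsewhere.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

// A thin wrapper that tells the analysis this type is a lockable capability.
class TSA(capability("mutex")) AnnotatedMutex {
public:
  void lock() TSA(acquire_capability());
  void unlock() TSA(release_capability());

private:
  std::mutex m_;
};

// The wrapper bodies forward to an unannotated std::mutex, so the analysis
// is switched off for these two definitions only.
TSA(no_thread_safety_analysis) inline void AnnotatedMutex::lock() {
  m_.lock();
}
TSA(no_thread_safety_analysis) inline void AnnotatedMutex::unlock() {
  m_.unlock();
}

AnnotatedMutex env_mutex;

// Reads and writes of env_size are flagged unless env_mutex is held.
std::size_t env_size TSA(guarded_by(env_mutex)) = 0;

// The "caller must already hold the mutex" comment, expressed as an
// attribute the compiler can check at every call site.
void bump_size_locked() TSA(requires_capability(env_mutex));
void bump_size_locked() { ++env_size; }

// Public entry point: takes the lock, then calls the helper.
std::size_t add_entry() {
  env_mutex.lock();
  bump_size_locked();
  std::size_t result = env_size;
  env_mutex.unlock();
  return result;
}
```

With these annotations, calling `bump_size_locked()` without first locking `env_mutex` would produce a `-Wthread-safety` warning under Clang, which is the static check asked about above.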