-
Notifications
You must be signed in to change notification settings - Fork 157
reftable: do make sure to use custom allocators, if any #1874
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The reftable library goes out of its way to use its own set of allocator functions that can be configured using `reftable_set_alloc()`. However, Git does not configure this. That is not typically a problem, except when Git uses a custom allocator via some definitions in `git-compat-util.h`, as is the case in Git for Windows (which switched away from the long-unmaintained nedmalloc to mimalloc). Then, it is quite possible that Git assigns a `strbuf` (allocated via the custom allocator) to, say, the `refname` field of a `reftable_log_record` in `write_transaction_table()`, and later on asks the reftable library function `reftable_log_record_release()` to release it, but that function was compiled without using `git-compat-util.h` and hence calls regular `free()` (i.e. _not_ the custom allocator's own function). This has been a problem for a long time and it was a matter of some sort of "luck" that 1) reftables are not commonly used on Windows, and 2) mimalloc can often ignore gracefully when it is asked to release memory that it has not allocated. However, a recent update to `seen` brought this problem to the forefront, letting t1460 fail in Git for Windows, with symptoms much in the same way as the problem I had to address in d02c37c (t-reftable-basics: allow for `malloc` to be `#define`d, 2025-01-08) where exit code 127 was also produced in lieu of `STATUS_HEAP_CORRUPTION` (C0000374) because exit codes are only 7 bits wide. It was not possible to figure out what change in particular caused these new failures within a reasonable time frame, as there are too many changes in `seen` that conflict with Git for Windows' patches, I had to stop the investigation after spending four hours on it fruitlessly. To verify that this patch fixes the issue, I avoided using mimalloc and temporarily patched in a "custom allocator" that would more reliably point out problems, like this: diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 68f3829..9421d630b9f5 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -353,6 +353,69 @@ static int reftable_be_fsync(int fd) return fsync_component(FSYNC_COMPONENT_REFERENCE, fd); } +#define DEBUG_REFTABLE_ALLOC +#ifdef DEBUG_REFTABLE_ALLOC +#include "khash.h" + +static inline khint_t __ac_X31_hash_ptr(void *ptr) +{ + union { + void *ptr; + char s[sizeof(void *)]; + } u; + size_t i; + khint_t h; + + u.ptr = ptr; + h = (khint_t)*u.s; + for (i = 0; i < sizeof(void *); i++) + h = (h << 5) - h + (khint_t)u.s[i]; + return h; +} + +#define kh_ptr_hash_func(key) __ac_X31_hash_ptr(key) +#define kh_ptr_hash_equal(a, b) ((a) == (b)) + +KHASH_INIT(ptr, void *, int, 0, kh_ptr_hash_func, kh_ptr_hash_equal) + +static kh_ptr_t *my_malloced; + +static void *my_malloc(size_t sz) +{ + int dummy; + void *ptr = malloc(sz); + if (ptr) + kh_put_ptr(my_malloced, ptr, &dummy); + return ptr; +} + +static void *my_realloc(void *ptr, size_t sz) +{ + int dummy; + if (ptr) { + khiter_t pos = kh_get_ptr(my_malloced, ptr); + if (pos >= kh_end(my_malloced)) + die("Was not my_malloc()ed: %p", ptr); + kh_del_ptr(my_malloced, pos); + } + ptr = realloc(ptr, sz); + if (ptr) + kh_put_ptr(my_malloced, ptr, &dummy); + return ptr; +} + +static void my_free(void *ptr) +{ + if (ptr) { + khiter_t pos = kh_get_ptr(my_malloced, ptr); + if (pos >= kh_end(my_malloced)) + die("Was not my_malloc()ed: %p", ptr); + kh_del_ptr(my_malloced, pos); + } + free(ptr); +} +#endif + static struct ref_store *reftable_be_init(struct repository *repo, const char *gitdir, unsigned int store_flags) @@ -362,6 +425,11 @@ static struct ref_store *reftable_be_init(struct repository *repo, int is_worktree; mode_t mask; +#ifdef DEBUG_REFTABLE_ALLOC + my_malloced = kh_init_ptr(); + reftable_set_alloc(my_malloc, my_realloc, my_free); +#endif + mask = umask(0); umask(mask); I briefly considered contributing this "custom allocator" patch, too, but it is unwieldy (for example, it would not work at all when compiling with mimalloc support) and it would only waste space (or even time, if a compile flag was introduced and exercised as part of the CI builds). Given that it is highly unlikely that Git will lose the new `reftable_set_alloc()` call by mistake, I rejected that idea as simply too wasteful. Signed-off-by: Johannes Schindelin <[email protected]>
@pks for the record, this is the cumbersome approach that I talked about, the one that I think needs to be abandoned because it piles patch upon patch just to avoid moving the mimalloc stuff from |
Hey @dscho, I think you're pinging the wrong guy. |
Oops, @pks-t my comment was for you... |
Yup, unfortunately @dscho Thanks for investigating this one. In any case, I'm fine with not pursuing this for now, but may eventually revisit it. What does this mean for the ps/reftable-sans-compat-util topic that has triggered this investigation? Can that topic be moved forward with my proposed fix to move the mimalloc definitions around? Is there anything else I can help you with in this context? I'd really like to get the topic unblocked so that I can continue to iterate on the reftable library. |
Just move forward with this topic. I'll deal with the fall-out, as usual. |
This fixes the problems described in https://lore.kernel.org/git/[email protected]/.
Cc: Patrick Steinhardt [email protected]