Description
When compiling a function that takes a static-length array parameter (e.g. int x[static 1048576]), Clang lowers this into LLVM's dereferenceable attribute on the argument. However, as far as I can tell from the C standards, static on an array length doesn't guarantee that the array will be valid for the entire function, only at the point of the function call. The lowering could therefore lead to a miscompile if the array is freed within the function.
Here's an example of a (contrived) end-to-end miscompile, using two files foo.c and bar.c, which are separately compiled at -O3 -g and then linked with -g but no optimisation options (I tested this on x86-64 but suspect it affects other platforms too):
foo.c
extern int bar(int x[static 1048576]);
int foo(int x[static 1048576]) {
  *x = 1;
  int y = bar(x);
  int z = 0;
  while (y--)
    z += *x;
  return z;
}
bar.c
#include <stdlib.h>
extern int foo(int x[static 1048576]);
int bar(int x[static 1048576]) {
  free(x);
  return 0;
}
int main(void) {
  int *x = calloc(1048576, sizeof (int));
  return foo(x);
}
foo incorrectly gets a dereferenceable on the argument. I think it should instead be an llvm.assume … "dereferenceable" at the start of the function body itself, i.e. a guarantee that all 1048576 elements of x are dereferenceable at the start of the function, without any guarantee that they haven't been freed since. With some memory allocators, e.g. glibc calloc, this causes a segmentation fault at runtime: an allocation that large is returned to the OS by free, but LLVM, relying on the dereferenceable, reads from it unconditionally anyway, without first checking whether y is 0.
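To make the effect concrete, here is a rough source-level sketch of the kind of transformation that dereferenceable permits. It is not the actual optimiser output (I haven't reproduced the exact IR here). The names load, loads_after_call, bar_stub, foo_as_written and foo_hoisted are hypothetical and exist only for this illustration. bar_stub deliberately does not free x, so both variants are safe to run; the counter just records whether x is read after the call returns.

```c
/* Hypothetical instrumentation: counts reads of *x after the call returns. */
static int loads_after_call = 0;

static int load(const int *p) {
    loads_after_call++;
    return *p;
}

/* Stand-in for bar(): returns 0 like the original, but does NOT free x,
   so that both variants below have defined behaviour. */
static int bar_stub(int *x) {
    (void)x;
    return 0;
}

/* foo as written: *x is only read if the loop body executes. */
static int foo_as_written(int *x) {
    *x = 1;
    int y = bar_stub(x);
    int z = 0;
    while (y--)
        z += load(x);
    return z;
}

/* Assumed shape after speculation: because x is believed dereferenceable
   for the whole function, the load is done once, unconditionally, before
   the loop condition is checked. */
static int foo_hoisted(int *x) {
    *x = 1;
    int y = bar_stub(x);
    int v = load(x); /* executes even when y == 0 */
    int z = 0;
    while (y--)
        z += v;
    return z;
}
```

Both variants return 0 when bar returns 0, but only the hoisted one touches x after the call. With the real bar, which frees x, that single extra read is the access that faults.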
It is possible that I'm mistaken about what the C standard requires here (although I checked with someone more knowledgeable in the C standards than me, and they agreed with my interpretation).