clang mislowers C function parameters that are arrays with static length into LLVM's dereferenceable #170026

@ais523

When compiling a function that takes a parameter declared as a static-length array (e.g. int x[static 1048576]), Clang lowers this into LLVM's dereferenceable parameter attribute. However, as far as I can tell from the C standards, static on an array length doesn't guarantee that the array will be valid for the entire function, only at the point of the function call. The lowering could therefore lead to a miscompile if the array is freed within the function.

Here's an example of a (contrived) end-to-end miscompile, using two files foo.c and bar.c, which are separately compiled at -O3 -g and then linked with -g but no optimisation options (I tested this on x86-64 but suspect it affects other platforms too):

foo.c

extern int bar(int x[static 1048576]);
int foo(int x[static 1048576])
{
    *x = 1;
    int y = bar(x);
    int z = 0;
    while (y--) z += *x;
    return z;
}

bar.c

#include <stdlib.h>

extern int foo(int x[static 1048576]);
int bar(int x[static 1048576])
{
    free(x);
    return 0;
}

int main(void)
{
    int *x = calloc(1048576, sizeof (int));
    return foo(x);
}

foo incorrectly gets a dereferenceable on the argument. I think it should instead be an llvm.assume … "dereferenceable" at the start of the function body itself, i.e. a guarantee that all 1048576 elements of x are dereferenceable at the start of the function, without any guarantee that they haven't been freed since.

With some memory allocators, e.g. glibc's calloc, this causes a segmentation fault at runtime: an allocation that large is returned to the OS by free, but LLVM (relying on the dereferenceable) unconditionally reads from it anyway, without first checking whether y is 0.
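To make the unconditional read concrete, here is a rough C-level sketch of what I believe the optimiser is effectively doing to foo. This is my assumption about the shape of the transformation, not actual compiler output. Everything in it is hypothetical scaffolding: counted_load stands in for a read of *x, bar_stub stands in for bar but frees nothing (so the sketch itself has no undefined behaviour), and loads_performed just reports how many reads happened.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instrumentation: counts reads through the pointer, so the
   difference between the two versions is observable without ever touching
   freed memory. */
int load_count;

int counted_load(const int *p)
{
    load_count++;
    return *p;
}

/* Stand-in for bar(): hands back the loop count and frees nothing. */
int bar_stub(int *x, int y)
{
    (void)x;
    return y;
}

/* foo as written in the report: *x is read only inside the loop, so there
   is no read at all when bar returns 0. */
int foo_as_written(int *x, int y_from_bar)
{
    assert(x != NULL);
    *x = 1;
    int y = bar_stub(x, y_from_bar);
    int z = 0;
    while (y--) z += counted_load(x);
    return z;
}

/* Assumed shape after optimisation: because x is treated as dereferenceable
   for the whole function, the loop-invariant read is moved above the loop
   and happens even when y == 0. */
int foo_hoisted(int *x, int y_from_bar)
{
    assert(x != NULL);
    *x = 1;
    int y = bar_stub(x, y_from_bar);
    int v = counted_load(x);   /* unconditional read */
    return y * v;              /* same result as summing v, y times */
}

/* Runs one of the two versions on a valid int and reports the read count. */
int loads_performed(int (*fn)(int *, int), int y)
{
    int cell = 0;
    load_count = 0;
    fn(&cell, y);
    return load_count;
}
```

Both versions return the same value for a non-negative count, which is why the transformation is fine while x stays valid. The difference is that loads_performed(foo_hoisted, 0) reports one read where loads_performed(foo_as_written, 0) reports none, and in the real example that extra read lands on memory bar has already freed.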

It is possible that I'm mistaken about what the C standard requires here (although I checked with someone more knowledgeable in the C standards than me, and they agreed with my interpretation).
