[WebAssembly] Autovectorisation to v128.bitselect not working on loops #168275

@Photosounder

When compiling to WebAssembly with clang using -Os -msimd128, I'm seeing inconsistent autovectorisation of functions that I write to mirror individual SIMD ops (I do this so that I can write truly portable SIMD). When written in this unrolled form, the function autovectorises nicely to v128.bitselect:

static inline v128_t opFD52_v128_bitselect(v128_t a, v128_t b, v128_t c)
{
    a.u64[0] = a.u64[0] & c.u64[0] | b.u64[0] & ~c.u64[0];
    a.u64[1] = a.u64[1] & c.u64[1] | b.u64[1] & ~c.u64[1];
    return a;
}

But when it's written as a loop (which is obviously the form I prefer) it falls back to using scalar operations on i64 lanes:

static inline v128_t opFD52_v128_bitselect(v128_t a, v128_t b, v128_t c)
{
    for (int i=0; i<2; i++)
        a.u64[i] = a.u64[i] & c.u64[i] | b.u64[i] & ~c.u64[i];
    return a;
}

Other functions I tried are correctly autovectorised in loop form. See on Godbolt: https://godbolt.org/z/ovh7944xj
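
For anyone who wants to reproduce this without opening the Godbolt link, here is a minimal self-contained sketch of the two forms. Note the assumptions: the v128_t union below is a hypothetical stand-in, since the actual type definition is not shown in this report, and the names bitselect_unrolled, bitselect_loop and bitselect_forms_agree are illustrative only. The parentheses are added for readability and do not change the expression, because & already binds tighter than |.

```c
#include <stdint.h>

/* Hypothetical stand-in for the v128_t used in the report;
   the real definition is not shown in the issue text. */
typedef union {
    uint64_t u64[2];
    uint32_t u32[4];
    uint8_t  u8[16];
} v128_t;

/* Unrolled form: per the report, this one is turned into v128.bitselect. */
static inline v128_t bitselect_unrolled(v128_t a, v128_t b, v128_t c)
{
    a.u64[0] = (a.u64[0] & c.u64[0]) | (b.u64[0] & ~c.u64[0]);
    a.u64[1] = (a.u64[1] & c.u64[1]) | (b.u64[1] & ~c.u64[1]);
    return a;
}

/* Loop form: same computation, reported to stay scalar on i64 lanes. */
static inline v128_t bitselect_loop(v128_t a, v128_t b, v128_t c)
{
    for (int i = 0; i < 2; i++)
        a.u64[i] = (a.u64[i] & c.u64[i]) | (b.u64[i] & ~c.u64[i]);
    return a;
}

/* Returns 1 when both forms produce identical lanes for the given inputs. */
static int bitselect_forms_agree(v128_t a, v128_t b, v128_t c)
{
    v128_t x = bitselect_unrolled(a, b, c);
    v128_t y = bitselect_loop(a, b, c);
    return x.u64[0] == y.u64[0] && x.u64[1] == y.u64[1];
}
```

Both functions compute the same result, so any difference in the generated WebAssembly comes from how the optimiser handles the loop form rather than from the semantics of the source.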
