The following example is a summation loop in which the term added to the accumulator involves a subtraction. This "-" appears to force us to pick a fixed vectorization factor instead of a scalable one. Simply changing the "-" to a "+" results in a scalable vectorization factor being chosen.
typedef struct {
  int x;
} s_t;

int example(s_t *a, s_t *b, int len) {
  int accum = 0;
  for (int i = 0; i < len; ++i) {
    accum += a[i].x - b[i].x;
  }
  return accum;
}
See https://godbolt.org/z/bPebGEaj9 for the codegen.
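For a side-by-side comparison, here is a sketch with both variants in one translation unit. The names `example_sub` and `example_add` are hypothetical and were added only for illustration. The two functions differ solely in the operator inside the reduction. Building with the same target flags as the Godbolt link, plus clang's `-Rpass=loop-vectorize` remark, should show which vectorization factor is chosen for each loop, although I have not verified the exact remark output for this case.

```c
typedef struct {
  int x;
} s_t;

/* Original form: the reduced term is a difference.
   This is the variant reported to get a fixed vectorization factor. */
int example_sub(s_t *a, s_t *b, int len) {
  int accum = 0;
  for (int i = 0; i < len; ++i) {
    accum += a[i].x - b[i].x;
  }
  return accum;
}

/* Identical loop with "-" replaced by "+".
   This is the variant reported to get a scalable vectorization factor. */
int example_add(s_t *a, s_t *b, int len) {
  int accum = 0;
  for (int i = 0; i < len; ++i) {
    accum += a[i].x + b[i].x;
  }
  return accum;
}
```

Since everything except the operator is the same, any difference in the chosen vectorization factor between the two functions can be attributed to how the "-" is handled.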
My guess is that this is some quirk of our cost model. I haven't tried to chase it down just yet.