Clang's insistence on __attribute__((target(...))) makes it impossible to build SIMD abstractions

Compiling the code
```
#include <immintrin.h>

template<class> struct Zero;

template<> struct Zero<__m256> {
	[[gnu::target("avx")]]
	[[gnu::always_inline]]
	inline __m256 operator()() const {
		return _mm256_setzero_ps();
	}
};

template<> struct Zero<__m256d> {
	[[gnu::target("avx2")]]
	[[gnu::always_inline]]
	inline __m256d operator()() const {
		return _mm256_setzero_pd();
	}
};

template<class T>
[[gnu::always_inline]]
static inline auto bar(T gen) {
	return gen();
}

[[gnu::target("avx")]]
void foo() {
	bar(Zero<__m256>());
}
```
fails with
```
<source>:29:2: error: AVX vector return of type '__m256' (vector of 8 'float' values) without 'avx' enabled changes the ABI
   29 |         bar(Zero<__m256>());
      |         ^
<source>:24:9: error: always_inline function 'operator()' requires target feature 'avx', but would be inlined into function 'bar' that is compiled without support for 'avx'
   24 |         return gen();
      |                ^
<source>:24:9: error: AVX vector return of type '__m256' (vector of 8 'float' values) without 'avx' enabled changes the ABI
```
but I don't see any reasonable annotation that could be placed on `bar`, nor why one should be necessary at all.
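
To illustrate, the only annotation I can find that makes this particular case compile is hard-coding `avx` on the generic helper, which defeats the point of it being a template. A minimal sketch follows; `ZeroPs`, `bar_avx` and `first_lane` are hypothetical names used only for this illustration:

```cpp
#include <immintrin.h>

// Hypothetical stand-in for Zero<__m256> from the example above.
struct ZeroPs {
	[[gnu::target("avx")]]
	[[gnu::always_inline]]
	inline __m256 operator()() const {
		return _mm256_setzero_ps();
	}
};

// Same helper as bar, but with the ISA hard-coded on it.
// It no longer works for callables that need a different target.
template<class T>
[[gnu::target("avx")]]
[[gnu::always_inline]]
static inline auto bar_avx(T gen) {
	return gen();
}

// First non-inline caller: the only function for which codegen occurs.
[[gnu::target("avx")]]
float first_lane() {
	float out[8];
	_mm256_storeu_ps(out, bar_avx(ZeroPs()));
	return out[0];
}
```

Every further ISA (for example `avx2` for `Zero<__m256d>`) would need its own copy of the helper, which is exactly the duplication I would like to avoid.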

Fundamentally, the `target` attribute is only required during codegen, for functions whose target cannot be inferred. But `bar` is `always_inline` with internal linkage, so unless its address is taken, it should not care what the target architecture is -- determining that is only the concern of the first non-inline caller, for which codegen actually occurs.

Moreover, I think the same could apply to _any_ function whose initial declaration is also a definition. When a function's body is always available, its `target` attribute could always be safely inferred as the union of the `target`s of all its callees, so making its specification mandatory only introduces friction.

I feel this is a severe design limitation -- every function, even a template, is required to know the targets of its callees, but it has no reasonable way to tell what those might be. This makes it impossible to build abstractions over SIMD code without severe code duplication.

Could the insistence on `target` therefore be relaxed?

Clang's insistence on __attribute__((target(...))) makes it impossible to build SIMD abstractions #129398