Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions compiler-rt/lib/builtins/cpu_model/aarch64.c
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,8 @@ struct {
#elif defined(__linux__) && __has_include(<sys/auxv.h>)
#include "aarch64/fmv/mrs.inc"
#include "aarch64/fmv/getauxval.inc"
#elif defined(_WIN32)
#include "aarch64/fmv/windows.inc"
#else
#include "aarch64/fmv/unimplemented.inc"
#endif
Expand Down
42 changes: 42 additions & 0 deletions compiler-rt/lib/builtins/cpu_model/aarch64/fmv/windows.inc
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
#ifndef _ARM64_
#define _ARM64_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this define here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

winnt.h complains about it if not defined and clang doesn't define it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's interesting - I would generally expect it to work to include those headers as is, with the compiler default defines.

Can you boil that down to a standalone repro, with a file that just includes <windows.h> or similar (I guess the relevant one here is <processthreadsapi.h>)? Or is the case that <processthreadsapi.h> is one of the Windows headers that you can't include directly unless you've included <windows.h> first? In that case, I'd prefer including that first, rather than adding our own extra defines. (If you want to, you can define WIN32_LEAN_AND_MEAN to reduce the overhead of including the whole <windows.h>.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

windows.h defines it so let's do that.

#endif
#include <processthreadsapi.h>
#include <stdint.h>

void __init_cpu_features_resolver(unsigned long hwcap,
const __ifunc_arg_t *arg) {}

void CONSTRUCTOR_ATTRIBUTE __init_cpu_features(void) {
if (__atomic_load_n(&__aarch64_cpu_features.features, __ATOMIC_RELAXED))
return;

#define setCPUFeature(F) features |= 1ULL << F

uint64_t features = 0;

setCPUFeature(FEAT_INIT);
setCPUFeature(FEAT_FP);

// https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-isprocessorfeaturepresent
if (IsProcessorFeaturePresent(PF_ARM_V8_CRC32_INSTRUCTIONS_AVAILABLE))
setCPUFeature(FEAT_CRC);
if (IsProcessorFeaturePresent(PF_ARM_V81_ATOMIC_INSTRUCTIONS_AVAILABLE))
setCPUFeature(FEAT_LSE);
if (IsProcessorFeaturePresent(PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE))
setCPUFeature(FEAT_DOTPROD);

if (IsProcessorFeaturePresent(PF_ARM_V8_CRYPTO_INSTRUCTIONS_AVAILABLE)) {
setCPUFeature(FEAT_AES);
setCPUFeature(FEAT_SHA2);
setCPUFeature(FEAT_PMULL);
}
if (IsProcessorFeaturePresent(PF_ARM_V83_JSCVT_INSTRUCTIONS_AVAILABLE))
setCPUFeature(FEAT_JSCVT);

if (IsProcessorFeaturePresent(PF_ARM_V83_LRCPC_INSTRUCTIONS_AVAILABLE))
setCPUFeature(FEAT_RCPC);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do this unconditionally like this, we require a relatively new SDK for building. PF_ARM_V83_LRCPC_INSTRUCTIONS_AVAILABLE is only visible in WinSDK since 10.0.22621.0, and in mingw-w64 headers since mingw-w64/mingw-w64@4dd72e2 (since a year). If we enclose it, at least the last one, in an #ifdef PF_ARM_V83_LRCPC_INSTRUCTIONS_AVAILABLE, we wouldn't break people building with slightly older headers.

Looking forward, there's a bunch of more aarch64 feature constants that you can look for since WinSDK 10.0.26100, and quite recently in mingw-w64. They aren't documented in the public MS documentation yet, but they're mostly quite self-documenting - see mingw-w64/mingw-w64@f97814b for a list of them. If we look for those features, we definitely should wrap each of them in an #ifdef.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I didn't notice the headers change recently :D
Now we have a bunch of features, I refactored the code a bit.


__atomic_store(&__aarch64_cpu_features.features, &features,
__ATOMIC_RELAXED);
}
10 changes: 9 additions & 1 deletion compiler-rt/lib/builtins/cpu_model/cpu_model.h
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,15 @@
// We're choosing init priority 90 to force our constructors to run before any
// constructors in the end user application (starting at priority 101). This
// value matches the libgcc choice for the same functions.
#define CONSTRUCTOR_ATTRIBUTE __attribute__((constructor(90)))
#ifdef _WIN64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more correct and generic to make this _WIN32 (which applies to both 32 and 64 bit windows)? I get that the current change is for aarch64 only, but this does seem to be a quite architecture independent file, and unless the distinction between 32 and 64 is needed (which it seldom is) I prefer using _WIN32 as define to look for.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment still seems to be unresolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah sorry, fixed in eda3898.

// Contructor that replaces the ifunc runs currently with prio 10, see
// the LowerIFuncPass. The resolver of FMV depends on the cpu features so set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered what this was about, if it was some feature in the compiler I was unaware of - but then I saw the other PR filed at the same time :-)

// the priority to 9.
#define CONSTRUCTOR_PRIOTITY 9
#else
#define CONSTRUCTOR_PRIOTITY 90
#endif
#define CONSTRUCTOR_ATTRIBUTE __attribute__((constructor(CONSTRUCTOR_PRIOTITY)))
#else
// FIXME: For MSVC, we should make a function pointer global in .CRT$X?? so that
// this runs during initialization.
Expand Down
Loading