Replace hand-maintained CPU tables with cpufeatures library #61292

Open
gbaraldi wants to merge 6 commits into master from gb/cpufeatures

Conversation

@gbaraldi
Member

Summary

  • Replaces processor_x86.cpp, processor_arm.cpp, processor_fallback.cpp (~5000 lines of hand-maintained CPU/feature tables) with a unified processor_cpufeatures.cpp that uses the cpufeatures library
  • CPU/feature data is extracted from LLVM's TableGen at build time and shipped as standalone C headers — no LLVM runtime dependency for the tables themselves
  • cpuid.jl now queries feature sets from the C library instead of hardcoding them
  • Debug output available via JULIA_DEBUG=cpufeatures

What is cpufeatures?

A standalone library that extracts CPU names, feature sets, and feature dependencies from LLVM's MCSubtargetInfo at build time into generated C headers. The generated headers are committed to the repo, so a normal build only needs a C++17 compiler — no LLVM required. Supports x86_64, aarch64, and riscv64.

Changes

New files:

  • deps/cpufeatures.mk, deps/cpufeatures.version — build system integration
  • src/processor_cpufeatures.cpp — unified processor implementation

Modified files:

  • src/processor.cpp — includes new file instead of arch-specific ones
  • src/processor.h — feature enum → uint32_t typedef (indices from cpufeatures)
  • src/Makefile — updated dependencies, link -ltarget_parsing
  • src/crc32c.c — hardcode HWCAP bit instead of using removed enum
  • base/cpuid.jl — ISA sets from C queries, not hardcoded
  • base/Makefile — features_h.jl from cpufeatures headers
  • deps/Makefile — add cpufeatures to DEP_LIBS

Files to delete (follow-up):

  • src/processor_x86.cpp, src/processor_arm.cpp, src/processor_fallback.cpp
  • src/features_x86.h, src/features_aarch32.h, src/features_aarch64.h

Test plan

  • make clean && make -j succeeds (downloads, builds, links cpufeatures)
  • CPU detection: correct name and features on znver4
  • Multiversioned sysimage (generic;haswell;skylake-avx512): correct target selection
  • -C haswell selects haswell target, -C generic selects generic
  • CPU name aliases (skx, corei7, atom, etc.) resolve correctly
  • BinaryPlatforms ISA matching works
  • FMA, sin, sqrt, basic math all work
  • JULIA_DEBUG=cpufeatures shows target selection details

Related

🤖 Generated with Claude Code


#if defined(_CPU_X86_64_) || defined(_CPU_X86_)
// KNL/KNM special case
if (!(t.dis.flags & JL_TARGET_CLONE_ALL)) {
Member Author

I don't think we care about knl

Member

We do not.

@Keno
Member

Keno commented Mar 13, 2026

Haven't reviewed in detail, but directionally, this is exactly what I wanted.

@giordano
Member

giordano commented Mar 13, 2026

This makes the aarch64-linux-gnu tests error (more than usual, that is), because it is now literally spamming

'-contextidrel2' is not a recognized feature for this target (ignoring feature)

everywhere. Edit: same for aarch64-darwin, which at least errors more loudly.

@giordano
Member

giordano commented Mar 13, 2026

Also, on an aarch64-linux system with big.LITTLE architecture Cortex-X925 + A725, this PR detects the CPU as the little variant:

$ julia +nightly -E 'Sys.CPU_NAME'
"cortex-x925"
$ julia +pr61292 -E 'Sys.CPU_NAME'
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
"cortex-a725"

@christiangnrd
Contributor

Is there a way to handle aliases? M1, M2, and M3 are aliases of their mobile a1x counterparts.

julia> Base.BinaryPlatforms.CPUID._lookup_cpu("apple-a15")
Base.BinaryPlatforms.CPUID.ISA(Set(UInt32[0x00000002, 0x00000067, 0x00000073, 0x00000077, 0x0000005b, 0x0000002a, 0x000000a1, 0x0000003f, 0x00000057, 0x000000d1  …  0x0000009e, 0x0000000f, 0x000000e7, 0x000000e1, 0x00000047, 0x00000081, 0x000000e8, 0x000000da, 0x00000033, 0x00000009]))

julia> Base.BinaryPlatforms.CPUID._lookup_cpu("apple-m2")
Base.BinaryPlatforms.CPUID.ISA(Set{UInt32}())
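
One way to make an alias such as apple-m2 return the same feature set as its counterpart is to resolve aliases before consulting the CPU table. The following is only a sketch under stated assumptions: the names `kCpuAliases`, `kCpuFeatures`, `resolve_cpu_alias`, and `lookup_cpu_features` are hypothetical, the alias pairings are illustrative, and the feature indices are made up. It is not the cpufeatures API.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <unordered_map>

// Hypothetical alias table: alias name -> canonical CPU name.
// The pairings are illustrative only.
static const std::unordered_map<std::string, std::string> kCpuAliases = {
    {"apple-m1", "apple-a14"},
    {"apple-m2", "apple-a15"},
    {"apple-m3", "apple-a16"},
};

// Hypothetical feature table keyed by canonical name (made-up feature indices).
static const std::unordered_map<std::string, std::set<uint32_t>> kCpuFeatures = {
    {"apple-a14", {0x02, 0x09}},
    {"apple-a15", {0x02, 0x09, 0x33}},
    {"apple-a16", {0x02, 0x09, 0x33, 0x47}},
};

// Map an alias to its canonical name; names that are not aliases pass through.
std::string resolve_cpu_alias(const std::string &name) {
    auto it = kCpuAliases.find(name);
    if (it == kCpuAliases.end())
        return name;
    return it->second;
}

// Look up features after alias resolution; unknown names yield an empty set.
std::set<uint32_t> lookup_cpu_features(const std::string &name) {
    auto it = kCpuFeatures.find(resolve_cpu_alias(name));
    if (it == kCpuFeatures.end())
        return {};
    return it->second;
}

// Usage: the alias and its canonical name now agree.
void example_usage() {
    assert(lookup_cpu_features("apple-m2") == lookup_cpu_features("apple-a15"));
    assert(!lookup_cpu_features("apple-m2").empty());
}
```

With this shape, an unresolved alias is indistinguishable from an unknown CPU (both give an empty set), which matches the symptom shown above.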

@inkydragon inkydragon added the building Build system, or building Julia or its dependencies label Mar 13, 2026
@christiangnrd
Contributor

My m2 is now properly detected and displayed on macOS and Linux!

@christiangnrd
Contributor

Should the cpufeatures library include a fallback feature like we currently do for those building against an older version of LLVM?

Also, when I opened gbaraldi/cpufeatures#1, I had forgotten that the apple-a18 alias was added in LLVM 21, so if we don't re-add fallback, we'll have to update the library to not report apple-a18 until then.

Also also, should +CONTEXTIDREL2 be included in the JIT target features? Would it be better to display the target features as reported by the JIT after it's initialized?

ChrichriMBP:~$ JULIA_DEBUG=cpufeatures j +pr61292 -C apple-a18
[cpufeatures] sysimg_init_cb: cpu_target='apple-a18'
[cpufeatures]   host CPU: 'apple-m2'
[cpufeatures]   cmdline has 1 target(s)
[cpufeatures] arg_target_data: name='apple-a18' require_host=1
[cpufeatures]   found CPU 'apple-a18' in database
[cpufeatures]   JIT target: name='apple-a18' features=+CONTEXTIDREL2,+aes,+alternate-sextload-cvt-f32-pattern,+altnzcv,+am,+amvs,+arith-bcc-fusion,+arith-cbz-fusion,+bf16,+bti,+ccdp,+ccpp,+complxnum,+crc,+disable-latency-sched-heuristic,+dit,+dotprod,+ecv,+el2vmsa,+el3,+fgt,+flagm,+fp-armv8,+fp16fml,+fpac,+fptoint,+fullfp16,+fuse-address,+fuse-adrp-add,+fuse-aes,+fuse-arith-logic,+fuse-crypto-eor,+fuse-csel,+fuse-literals,+i8mm,+jsconv,+lor,+lse,+lse2,+mpam,+neon,+nv,+pan,+pan-rwv,+pauth,+perfmon,+predres,+ras,+rcpc,+rcpc-immo,+rdm,+sb,+sel2,+sha2,+sha3,+specrestrict,+tlb-rmi,+tracev8.4,+uaops,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a,+vh,+zcm,+zcz,+zcz-gp
[cpufeatures]   sysimg has 2 target(s):
[cpufeatures]     [0] name='generic' flags=0x1 features=+fp-armv8,+neon
[cpufeatures]     [1] name='apple-m1' flags=0x1 features=+aes,+altnzcv,+am,+ccdp,+ccpp,+complxnum,+crc,+dit,+dotprod,+el2vmsa,+el3,+flagm,+fp-armv8,+fp16fml,+fptoint,+fullfp16,+jsconv,+lor,+lse,+lse2,+mpam,+neon,+nv,+pan,+pan-rwv,+pauth,+perfmon,+predres,+ras,+rcpc,+rcpc-immo,+rdm,+sb,+sel2,+sha2,+sha3,+specrestrict,+ssbs,+tlb-rmi,+tracev8.4,+uaops,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a,+vh
[cpufeatures]   selected target 0 'generic' (vreg_size=16)
'apple-a18' is not a recognized processor for this target (ignoring processor)
'apple-a18' is not a recognized processor for this target (ignoring processor)
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.14.0-DEV.1899 (2026-03-13)
 _/ |\__'_|_|_|\__'_|  |  gb/cpufeatures/d2834a4730c (fork: 13 commits, 2 days)
|__/

@gbaraldi
Member Author

That specific feature we manually filter because LLVM doesn’t support it. I guess we can have some fallback logic. I’ll ask the robot if he has any implementation better than what we used to have; if not, we add that again.

@topolarity
Member

With #61399 landed, this PR should be able to remove our last usage of LLVMSupport in the runtime 👍

gbaraldi and others added 3 commits April 1, 2026 18:35
Replace the hand-maintained processor_x86.cpp, processor_arm.cpp, and
processor_fallback.cpp with a single processor.cpp that uses the
cpufeatures library for CPU detection, feature resolution, sysimage
serialization, and target matching.

cpufeatures extracts CPU tables from LLVM's TableGen at build time,
so they stay in sync with the LLVM version Julia ships. This removes
~3600 lines of manually maintained feature lists and CPU definitions.

Key changes:
- src/processor.cpp: single file replacing three arch-specific backends
- deps/cpufeatures.mk: new dependency (static library, no runtime dep)
- base/cpuid.jl: cross-arch ISA queries via cpufeatures
- base/loading.jl: updated ImageTarget for new serialization format
- src/init.c: CPU target validation (unknown names, multi-target checks)
- Removed LLVMTargetParser from libjulia-internal link

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove JL_TARGET_CLONE_* from processor.h — these flags were an
intermediate bitfield mapping between cpufeatures' FeatureDiff and the
per-function clone categories in multiversioning. Now:

- jl_target_spec_t carries explicit bools (clone_all, opt_size, min_size)
  and a tp::FeatureDiff directly
- JL_CLONE_* enum is file-local to llvm-multiversioning.cpp (only consumer)
- TargetSpec::clone_flags() derives the per-function mask from FeatureDiff
- Metadata serialization packs/unpacks the bools into a uint32_t

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
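
The packing of the explicit bools into a uint32_t can be pictured roughly as below. This is a minimal sketch, not the PR's code: `TargetBools`, `pack_bools`, and `unpack_bools` are hypothetical names. The positions of clone_all (bit 0) and opt_size (bit 1) follow the values quoted in the next commit message; placing min_size at bit 2 is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the three bools carried by the target spec.
struct TargetBools {
    bool clone_all;
    bool opt_size;
    bool min_size;
};

constexpr uint32_t kCloneAll = 1u << 0;
constexpr uint32_t kOptSize  = 1u << 1;
constexpr uint32_t kMinSize  = 1u << 2; // assumed position

// Pack the bools into a single 32-bit word for serialization.
uint32_t pack_bools(const TargetBools &b) {
    uint32_t flags = 0;
    if (b.clone_all) flags |= kCloneAll;
    if (b.opt_size)  flags |= kOptSize;
    if (b.min_size)  flags |= kMinSize;
    return flags;
}

// Recover the bools from a packed word; unrelated bits are ignored.
TargetBools unpack_bools(uint32_t flags) {
    return TargetBools{
        (flags & kCloneAll) != 0,
        (flags & kOptSize) != 0,
        (flags & kMinSize) != 0,
    };
}

// Usage: packing followed by unpacking is lossless for these three fields.
void example_round_trip() {
    TargetBools in{true, false, true};
    TargetBools out = unpack_bools(pack_bools(in));
    assert(out.clone_all && !out.opt_size && out.min_size);
}
```
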
Update metadata flag values in the three multiversioning .ll tests
to match the new packed_flags() format (clone_all=1<<0, opt_size=1<<1,
has_new_math=1<<3, has_new_simd=1<<4, etc).

Since clone_flags() now always includes LOOP and CPU categories,
loop-containing functions get cloned for more targets. Consolidate
CHECK lines in annotate-only test where multiple functions now share
the same clone mask.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gbaraldi
Member Author

gbaraldi commented Apr 1, 2026

@giordano @christiangnrd @vchuravy @topolarity Could you folks give the PR a once-over as well? I think it's in quite a good state. If possible, please also look at https://github.com/gbaraldi/cpufeatures. I would like to finally get this monkey off my back, especially since #41924 was my first PR ever merged to Julia. So I guess it's fate.

@christiangnrd
Contributor

christiangnrd commented Apr 1, 2026

What happens if a CPU that's in the cpufeatures generated table isn't in the LLVM version that Julia was built with? The current cpufeatures fallback only checks whether the CPU is in the tables, not whether the runtime LLVM knows it. If the answer is "bad things", I need to remove the A18 detection that I prematurely added, at least until LLVM 21. Also, would this prevent newer processors from running Julia built with an older LLVM?

Edit: I opened gbaraldi/cpufeatures#4.

@gbaraldi
Member Author

gbaraldi commented Apr 1, 2026

So we currently assume and check that you have the same LLVM version. So we do have to make sure that in cpufeatures we gate things on the LLVM version.
We probably will have to keep different branches for each LLVM version in cpufeatures?

#if defined(TARGET_TABLES_LLVM_VERSION_MAJOR) && defined(LLVM_VERSION_MAJOR)
static_assert(TARGET_TABLES_LLVM_VERSION_MAJOR == LLVM_VERSION_MAJOR,
    "cpufeatures tables were generated with a different LLVM major version than Julia uses");
#endif
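
Gating individual table entries on the LLVM version, as discussed above, could look roughly like the following. This is a sketch under assumptions: `CpuEntry`, `kCpus`, `cpu_available`, and `EXAMPLE_TABLES_LLVM_MAJOR` are hypothetical names, and the only version fact used is the statement earlier in this thread that the apple-a18 alias was added in LLVM 21. The check is done at run time here so it is easy to exercise; a real table might be gated with the preprocessor instead.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical: major version of the LLVM that the tables were generated from.
#ifndef EXAMPLE_TABLES_LLVM_MAJOR
#define EXAMPLE_TABLES_LLVM_MAJOR 20
#endif

// Hypothetical table entry: a CPU name plus the first LLVM major version
// that recognizes it (0 means "no lower bound recorded").
struct CpuEntry {
    const char *name;
    int min_llvm_major;
};

static const CpuEntry kCpus[] = {
    {"apple-m1", 0},
    {"apple-m2", 0},
    {"apple-a18", 21}, // per the thread: alias added in LLVM 21
};

// Report a CPU only if it is in the table AND the given LLVM knows about it.
bool cpu_available(const char *name, int llvm_major = EXAMPLE_TABLES_LLVM_MAJOR) {
    for (const CpuEntry &e : kCpus) {
        if (std::strcmp(e.name, name) == 0)
            return llvm_major >= e.min_llvm_major;
    }
    return false;
}

// Usage: the same name is hidden on LLVM 20 and reported on LLVM 21.
void example_version_gate() {
    assert(!cpu_available("apple-a18", 20));
    assert(cpu_available("apple-a18", 21));
}
```
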

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@christiangnrd
Contributor

So we currently assume and check that you have the same LLVM version. So we do have to make sure that in cpufeatures we gate things on the LLVM version. We probably will have to keep different branches for each LLVM version in cpufeatures?

#if defined(TARGET_TABLES_LLVM_VERSION_MAJOR) && defined(LLVM_VERSION_MAJOR)
static_assert(TARGET_TABLES_LLVM_VERSION_MAJOR == LLVM_VERSION_MAJOR,
    "cpufeatures tables were generated with a different LLVM major version than Julia uses");
#endif

To what level is building current julia on an older LLVM version supposed to be supported? Would the build process automatically select the cpufeatures branch based on the llvm version?

@gbaraldi
Member Author

gbaraldi commented Apr 1, 2026

It's kinda supported (but not really) and you would have to choose a specific commit and I think that's fine

@christiangnrd
Contributor

It's kinda supported (but not really) and you would have to choose a specific commit and I think that's fine

That sounds reasonable to me. Would it be worth adding a quick comment to the devdocs?

I also just tried generating the tables over in cpufeatures down to LLVM 17, since that's the lowest currently "supported" version. LLVM 19 worked (although the tests fail), but generating the tables failed on LLVM 17 and 18.

@christiangnrd
Contributor

christiangnrd commented Apr 1, 2026

Is this expected? Feels a bit verbose.

ChrichriMBP:~$ JULIA_DEBUG=cpufeatures j +pr61292
[cpufeatures] match_sysimg_target: cpu_target='native'
[cpufeatures]   JIT target: name='apple-m2'
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
[cpufeatures]   image has 2 target(s)
[cpufeatures]   selected target 1 'apple-m1' (vreg_size=16)
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.14.0-DEV.1968 (2026-04-01)
 _/ |\__'_|_|_|\__'_|  |  gb/cpufeatures/9b093e42949 (fork: 5 commits, 0 days)
|__/                   |

julia>

@christiangnrd
Contributor

The apple-m1 target didn't use to include FEAT_SSBS, but it's now being detected. apple-m4 and newer don't have it. Is that an issue, and if so, would a fix be to build julia while targeting apple-m1 without it? Alternatively, apple-m4 adds SME and SME2 so maybe that should be added as a cputarget in the default build?

@gbaraldi
Member Author

gbaraldi commented Apr 2, 2026

I thought SSBS became default on armv8.5, though I’ll check.

@gbaraldi
Member Author

gbaraldi commented Apr 2, 2026

It is very verbose because it prints once per package image, though I'm not sure what other behavior would be better. It's similar to JULIA_DEBUG=loading in that respect.

Picks up gbaraldi/cpufeatures#5 which blacklists FEAT_SSBS from
hw_feature_mask. SSBS is a speculative execution mitigation that
doesn't affect codegen but was inconsistently present across LLVM
CPU definitions (apple-a16 has it, apple-m4 doesn't), causing
false target mismatches on Apple Silicon.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
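
The effect of excluding a feature from the matching mask can be illustrated with a small bitmask sketch. Everything here is an assumption for illustration: the bit assignments, the names `kHwFeatureMask`, `target_runs_on_host`, and `select_target`, and the policy of preferring the last compatible target are hypothetical and not taken from cpufeatures.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical feature bits.
constexpr uint64_t kNeon = 1ull << 0;
constexpr uint64_t kAes  = 1ull << 1;
constexpr uint64_t kSsbs = 1ull << 2;

// Features that take part in target matching; SSBS is masked out.
constexpr uint64_t kHwFeatureMask = ~kSsbs;

// A target is usable if every masked-in feature it needs is present on the host.
bool target_runs_on_host(uint64_t target, uint64_t host) {
    return (target & kHwFeatureMask & ~host) == 0;
}

// Return the index of the last compatible target, or -1 if none matches.
// Assumes (hypothetically) that later targets are more specialized.
int select_target(const std::vector<uint64_t> &targets, uint64_t host) {
    int best = -1;
    for (int i = 0; i < static_cast<int>(targets.size()); ++i) {
        if (target_runs_on_host(targets[i], host))
            best = i;
    }
    return best;
}

// Usage: a host without SSBS still matches a target that lists SSBS.
void example_mask() {
    std::vector<uint64_t> targets = {kNeon, kNeon | kAes | kSsbs};
    uint64_t host = kNeon | kAes; // no SSBS on this host
    assert(select_target(targets, host) == 1);
}
```

Without the mask, the second target in the usage example would be rejected and the generic one selected, which is the kind of false mismatch the commit message describes.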
Labels

building Build system, or building Julia or its dependencies

7 participants