Description
What is the problem the feature request solves?
I noticed that the Linux profile (no git) in the Makefile uses CPU-native Rust flags (i.e. target-cpu=native). If build servers run different hardware than customer or internal clusters, Comet can crash. This recently happened to me: I built on a machine with AVX-512 instructions, while one environment had only partial AVX-512 support and another had only AVX/AVX2.
https://www.faceofit.com/intels-avx10-2-roadmap/ shows that there are quite a few gaps in AVX-512 support across processors, and soon (?) AVX10.2 will be in the mix as well.
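To compare the build server with the target environments, it helps to list which AVX-family features each host advertises. The following is a minimal sketch, assuming Linux hosts that expose `/proc/cpuinfo`; the helper name `cpu_has_flag` and the list of flags checked are illustrative, not part of Comet.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given CPU flag appears in a flags line.
# The second argument defaults to the first "flags" line of /proc/cpuinfo (Linux only).
cpu_has_flag() {
  local flag="$1"
  local flags="${2:-$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null)}"
  # Drop the "flags :" prefix and pad with spaces so that "avx"
  # does not match "avx2" or "avx512f".
  case " ${flags#*:} " in
    *" $flag "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report a few AVX-related flags for the current host.
for f in avx avx2 avx512f avx512bw avx512vl; do
  if cpu_has_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
done
```

Running this on the build server and on each target cluster shows whether a binary built with native flags could use instructions that a target lacks.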
Describe the potential solution
Create profiles for different instruction sets, each setting different flags. For example, I removed AVX-512 by running:
RUSTFLAGS="-C target-cpu=x86-64 -C target-feature=-avx512f,-avx512cd,-avx512er,-avx512pf,-avx512bw,-avx512dq,-avx512vl,-avx512vnni" cargo build --release
Perhaps also publish different JARs compiled with different flags, or document on the website which instruction sets the current JARs require.
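One way the profiles could look is a small mapping from a profile name to a `target-cpu` value. This is only a sketch: the profile names (`baseline`, `avx2`, `avx512`, `native`) and the function `rustflags_for_profile` are hypothetical, and it assumes a rustc/LLVM version that accepts the x86-64 microarchitecture levels (`x86-64-v3` includes AVX2, `x86-64-v4` adds the common AVX-512 subset).

```shell
#!/usr/bin/env bash
# Hypothetical profile names mapped to rustc target-cpu values.
rustflags_for_profile() {
  case "$1" in
    baseline) echo "-C target-cpu=x86-64" ;;     # no AVX at all
    avx2)     echo "-C target-cpu=x86-64-v3" ;;  # AVX/AVX2, no AVX-512
    avx512)   echo "-C target-cpu=x86-64-v4" ;;  # AVX-512 F/BW/CD/DQ/VL
    native)   echo "-C target-cpu=native" ;;     # whatever the build host has
    *)        echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Print (rather than run) the build command for each portable profile.
for p in baseline avx2 avx512; do
  echo "RUSTFLAGS=\"$(rustflags_for_profile "$p")\" cargo build --release"
done
```

Picking a named level instead of subtracting individual AVX-512 features keeps the flag list short and makes it easier to state what each published JAR requires.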
Additional context
You can recognize a CPU instruction set crash by a JVM crash log that reports SIGILL (illegal instruction) in a libcomet frame, like:
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGILL (0x4) at pc=0x00007efbe69b67e0, pid=12, tid=100
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.15+6 (17.0.15+6) (build 17.0.15+6)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.15+6 (17.0.15+6, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C [libcomet-2676614620071579916.so+0x3b1d7e0][thread 99 also had an error]
[thread 101 also had an error]
[thread 102 also had an error]
arrow_buffer::buffer::boolean::BooleanBuffer::count_set_bits::h657f1e93a4df02b2+0x120
or something similar.