madhav-madhusoodanan
Contributor

Context

This is a redo of PR #1814, since a lot of details have changed with PRs #1863, #1862, #1861, #1852.

r? @folkertdev
cc: @Amanieu

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch 2 times, most recently from 41db5a8 to 5fc0f3b Compare August 5, 2025 10:16
Comment on lines +199 to +235
match str::parse::<u32>(etype_processed.as_str()) {
Ok(value) => data.bit_len = Some(value),
Err(_) => {
data.bit_len = match data.kind() {
TypeKind::Char(_) => Some(8),
TypeKind::BFloat => Some(16),
TypeKind::Int(_) => Some(32),
TypeKind::Float => Some(32),
_ => None,
};
}
}
Contributor

why are only some type kinds covered here? Maybe this could be a method on TypeKind?
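One way the suggestion above could look is sketched below. The `TypeKind` here is a simplified stand-in: the variant payloads, the `Mask` variant, and the names `default_bit_len` and `resolve_bit_len` are assumptions made for illustration, not the PR's actual definitions. The default widths are the ones from the snippet under review.

```rust
/// Simplified stand-in for the PR's `TypeKind`; the real enum has more
/// variants and its payload types may differ (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TypeKind {
    Char(bool), // hypothetical payload: signedness
    Int(bool),  // hypothetical payload: signedness
    BFloat,
    Float,
    Mask, // hypothetical example of a kind with no default width
}

impl TypeKind {
    /// Default element width in bits when the type name carries no
    /// explicit width. Values mirror the snippet under review.
    pub fn default_bit_len(self) -> Option<u32> {
        match self {
            TypeKind::Char(_) => Some(8),
            TypeKind::BFloat => Some(16),
            TypeKind::Int(_) | TypeKind::Float => Some(32),
            TypeKind::Mask => None,
        }
    }
}

/// Parse an explicit width if present, otherwise use the kind's default.
pub fn resolve_bit_len(etype: &str, kind: TypeKind) -> Option<u32> {
    etype.parse::<u32>().ok().or_else(|| kind.default_bit_len())
}
```

Listing every variant instead of using a `_` arm means the compiler flags the match whenever a new kind is added, which makes the "only some kinds are covered" question visible at compile time.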

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 9e28106 to 111cd5d Compare August 5, 2025 16:22
Contributor

@folkertdev folkertdev left a comment

You should rebase on top of the upstream master branch instead of merging it in. That keeps the git history clean.

Comment on lines 188 to 195
x86_64-unknown-linux-gnu*)
CPPFLAGS="${TEST_CPPFLAGS}" RUSTFLAGS="${HOST_RUSTFLAGS}" RUST_LOG=warn \
cargo run "${INTRINSIC_TEST}" "${PROFILE}" \
--bin intrinsic-test -- intrinsics_data/x86-intel.xml \
--runner "${TEST_RUNNER}" \
--cppcompiler "${TEST_CXX_COMPILER}" \
--target "${TARGET}"
;;
Contributor

We'll have to see exactly how to do this, but we want to split these out of the main CI job to speed it up.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from f3f87f2 to 2ec747c Compare August 9, 2025 12:20
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 2ec747c to a8313d0 Compare September 5, 2025 08:16
@madhav-madhusoodanan
Contributor Author

Seems like the CI run at this point failed due to this error. I'll retry shortly:

#6 39.45   Could not connect to archive.ubuntu.com:80 (185.125.190.82), connection timed out Could not connect to archive.ubuntu.com:80 (185.125.190.83), connection timed out Could not connect to archive.ubuntu.com:80 (91.189.91.83), connection timed out Could not connect to archive.ubuntu.com:80 (185.125.190.81), connection timed out Could not connect to archive.ubuntu.com:80 (91.189.91.81), connection timed out Could not connect to archive.ubuntu.com:80 (185.125.190.36), connection timed out Could not connect to archive.ubuntu.com:80 (185.125.190.39), connection timed out Could not connect to archive.ubuntu.com:80 (91.189.91.82), connection timed out
#6 39.45   Unable to connect to archive.ubuntu.com:80:
#6 39.45 Fetched 126 kB in 39s (3208 B/s)
#6 39.45 Reading package lists...
#6 39.46 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/questing/InRelease  Unable to connect to archive.ubuntu.com:80:
#6 39.46 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/questing-updates/InRelease  Unable to connect to archive.ubuntu.com:80:
#6 39.46 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/questing-backports/InRelease  Unable to connect to archive.ubuntu.com:80:
#6 39.46 W: Some index files failed to download. They have been ignored, or old ones used instead.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch 9 times, most recently from a119be1 to 0f3900b Compare September 8, 2025 19:25
being killed. Also separated declarations and definitions for C++
testfiles.
@madhav-madhusoodanan
Contributor Author

madhav-madhusoodanan commented Sep 29, 2025

@folkertdev It seems that clang++-19 was being passed as the value of the --cppcompiler flag instead of clang++. That was the reason behind the No such file or directory error. Fixed it.

I ran into an issue with the process being killed (likely due to resource constraints, since the clang++ processes run in parallel). Would it be okay if I increase the CPU and memory limits in the GitHub workflow file?

For now I've made the compilation sequential to get past that error, but the time it takes has me reconsidering that approach.

@folkertdev
Contributor

> I ran into an issue with the process being killed (likely due to resource constraints since the clang++ processes are running in parallel). Would it be okay if I increase the CPU and memory limits within the Github workflow file?

I don't think that is how it works? You just get a CI machine (typically with 4 cores) for some duration. Do you know how large these files get (how many intrinsics do you generate a test for)?

We definitely want to be running these processes in parallel.

@madhav-madhusoodanan
Contributor Author

Ohh I see, I thought I could edit the resource allocation.
However, while running them in parallel I consistently run into the process being killed.

Let me experiment with generating a larger number of mod_i.cpp files and using a worker-pool mechanism, seeing as we can't change the resource allocation.
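A worker pool of the kind mentioned above could be sketched as follows, using only the standard library. The function name `run_bounded` and its signature are hypothetical. In real use the `job` closure would presumably launch one clang++ process per generated file (for example via `std::process::Command`), which is not shown here.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use std::thread;

/// Run `job` on every item using at most `workers` threads at a time.
/// Results are returned in the original input order.
pub fn run_bounded<T, R, F>(items: Vec<T>, workers: usize, job: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let total = items.len();
    // Shared work queue; each entry remembers its original position.
    let queue: Mutex<VecDeque<(usize, T)>> =
        Mutex::new(items.into_iter().enumerate().collect());
    let results: Mutex<Vec<Option<R>>> =
        Mutex::new((0..total).map(|_| None).collect());

    thread::scope(|scope| {
        for _ in 0..workers.max(1) {
            scope.spawn(|| loop {
                // Hold the lock only long enough to pop one item.
                let next = queue.lock().unwrap().pop_front();
                let Some((index, item)) = next else { break };
                let output = job(item);
                results.lock().unwrap()[index] = Some(output);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|slot| slot.expect("every queued item produces a result"))
        .collect()
}
```

Capping `workers` below the number of cores bounds peak memory, since no more than `workers` jobs are in flight at once, while still keeping the work parallel.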

@folkertdev
Contributor

You could also try cutting the total number of generated tests (e.g. to 10%) to see if that helps.

By the looks of it, the whole process is just way too slow, and the runners eventually time out. So you'll have to figure out how to make it faster. I think the crux here is what I plan to do: link all the C++ and Rust code into one binary that generates, in effect, a large number of Rust #[test] tests, and just execute that using the given runner.

But that is probably not feasible for you in the remaining time, so by cutting the total number of tests (maybe using some form of sampling, e.g. keeping every 10th test) we'll be able to merge your code without completely tanking our CI times.
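The "keep every 10th test" idea amounts to a deterministic stride over the list of intrinsics. A minimal sketch, where `sample_every_nth` is a hypothetical helper and not part of the PR:

```rust
/// Keep every `stride`-th item, starting with the first.
/// A `stride` of 0 is treated as 1 (keep everything).
pub fn sample_every_nth<T>(items: Vec<T>, stride: usize) -> Vec<T> {
    items.into_iter().step_by(stride.max(1)).collect()
}
```

A fixed stride keeps the selected subset identical from one CI run to the next, which makes failures reproducible. A seeded random sample would be an alternative if broader coverage over time is preferred.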

@madhav-madhusoodanan
Contributor Author

Good idea @folkertdev. Let me first check whether processing smaller-sized chunks would be of any help.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch 2 times, most recently from c81c215 to 6733138 Compare September 30, 2025 08:06
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch 2 times, most recently from 2876624 to 6be6472 Compare October 1, 2025 13:13
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 6be6472 to 87e39a2 Compare October 2, 2025 19:47
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch 2 times, most recently from 20ec01d to b476cd1 Compare October 3, 2025 13:04
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from b476cd1 to 3b77dc5 Compare October 3, 2025 13:10
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from ceb2a37 to 9dbc078 Compare October 5, 2025 18:48
@madhav-madhusoodanan
Contributor Author

@sayantn Is this related to the change we made for some x86 intrinsics? I found this in the CI tests:

Illegal instruction at address = 56435cedbf49: f2 0f 2b 00 0f ae f8 f2 0f 10 04 24 66 0f 2e 
Image name: /checkout/target/x86_64-unknown-linux-gnu/release/deps/core_arch-92d6a56b530b6fbf
Offset in image: 0x6cef49
If you believe your application should attempt to execute
this illegal instruction (and others that may be present),
Then use this knob: -emit-illegal-insts 0
and this error message will be avoided.
Use -print-exception-details to get more details.

SDE ERROR: Illegal instruction at address = 56435cedbf49: f2 0f 2b 00 0f ae f8 f2 0f 10 04 24 66 0f 2e 
Image name: /checkout/target/x86_64-unknown-linux-gnu/release/deps/core_arch-92d6a56b530b6fbf
Offset in image: 0x6cef49
If you believe your application should attempt to execute
this illegal instruction (and others that may be present),
Then use this knob: -emit-illegal-insts 0
and this error message will be avoided.
Use -print-exception-details to get more details.

@sayantn
Contributor

sayantn commented Oct 7, 2025

I don't think I have seen this before; it seems to be in _mm_stream_sd. I did recently update the implementation in #1929, but the CI did pass there (and locally). It shouldn't happen if we didn't change anything, so it might be spurious. Worth a retry in CI.

instead of letting C++ do it (and potentially change the bits)
@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-x86-addition branch from 81bd07f to 8c19290 Compare October 7, 2025 05:16
@Amanieu
Member

Amanieu commented Oct 7, 2025

_mm_stream_sd and _mm_stream_ss seem to be in sse4a, not sse2.

@sayantn
Contributor

sayantn commented Oct 7, 2025

Yeah, and the CI failure looked like it got to sse4a and then failed (although I don't know how much we can trust the output order, considering that the tests are parallelized).

@Amanieu
Member

Amanieu commented Oct 7, 2025

SSE4a is an AMD extension, so intel-sde may not support it. Does the intrinsic tester check for feature availability before using those features?
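A runtime check of the kind asked about could look like the sketch below. Whether the intrinsic tester already does something like this is not stated in this thread. The helper names `sse4a_available` and `run_if_sse4a` are hypothetical; `is_x86_feature_detected!` is the standard library macro for runtime CPU feature detection.

```rust
/// Returns true when the CPU running this code reports SSE4a support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn sse4a_available() -> bool {
    std::is_x86_feature_detected!("sse4a")
}

/// On non-x86 targets the feature can never be present.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn sse4a_available() -> bool {
    false
}

/// Run `test` only when SSE4a is available; return whether it ran.
pub fn run_if_sse4a(test: impl FnOnce()) -> bool {
    if sse4a_available() {
        test();
        true
    } else {
        false
    }
}
```

Skipping in this way avoids an illegal-instruction fault on CPUs (or emulated CPU presets) that lack the extension, at the cost of silently running fewer tests, so logging the skipped cases would be worth considering.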

@sayantn
Contributor

sayantn commented Oct 7, 2025

IIRC, SDE supports (most) AMD extensions. It even has a CPU preset named amd-future. Currently we just enable sse4a, vp2intersect, xop and tbm in the future CPU preset of SDE.
