Vectordave - Add vectorization on x86/ARM #1024

Merged
rbergen merged 3 commits into drag-race from vectordave
Oct 26, 2025

davepl commented Oct 25, 2025

This adds vectorization approaches for SSE2, AVX, AVX512, and NEON. It's worth about 30% perf on the Mac and ubdellamd.
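
For readers unfamiliar with per-ISA vectorization, here is a minimal sketch of the general idea, not the code from this PR. It assumes a compile-time choice between an SSE2 path, a NEON path, and a scalar fallback. The helper name `orMaskIntoWords` and the choice of an OR operation are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h> // SSE2 intrinsics
#elif defined(__ARM_NEON)
#include <arm_neon.h>  // NEON intrinsics
#endif

// Hypothetical helper (not from the PR): OR the same 64-bit mask into
// every word of `words`, two words at a time where a 128-bit ISA exists.
inline void orMaskIntoWords(std::uint64_t *words, std::size_t count, std::uint64_t mask)
{
    std::size_t i = 0;
#if defined(__SSE2__) || defined(_M_X64)
    const __m128i m = _mm_set1_epi64x(static_cast<long long>(mask));
    for (; i + 2 <= count; i += 2)
    {
        // Unaligned load/store, so `words` needs no special alignment.
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(words + i));
        _mm_storeu_si128(reinterpret_cast<__m128i *>(words + i), _mm_or_si128(v, m));
    }
#elif defined(__ARM_NEON)
    const uint64x2_t m = vdupq_n_u64(mask);
    for (; i + 2 <= count; i += 2)
        vst1q_u64(words + i, vorrq_u64(vld1q_u64(words + i), m));
#endif
    // Scalar tail; also the complete fallback when neither ISA is available.
    for (; i < count; ++i)
        words[i] |= mask;
}
```

AVX and AVX-512 variants would follow the same shape with 256-bit and 512-bit registers (four and eight 64-bit words per step), guarded by their own feature macros.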

It also enables restrict, in the hope that it assures the compiler that the sieve memory is not being aliased.
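
As a hedged illustration of what a restrict qualifier promises, consider the sketch below. `restrict` is a C99 keyword and not standard C++, but GCC, Clang, and MSVC accept `__restrict` as an extension. The macro name `SIEVE_RESTRICT` and the helper `applyPattern` are hypothetical and may not match how the PR enables it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical macro name: map to the compiler extension where available,
// otherwise expand to nothing so the code still compiles.
#if defined(__GNUC__) || defined(__clang__) || defined(_MSC_VER)
#define SIEVE_RESTRICT __restrict
#else
#define SIEVE_RESTRICT
#endif

// Hypothetical helper: OR a repeating pattern into the sieve words.
// The qualifiers promise that `sieve` and `pattern` never overlap, so the
// compiler may reorder or vectorize the loads and stores freely.
// `patternLen` must be non-zero.
inline void applyPattern(std::uint64_t *SIEVE_RESTRICT sieve,
                         const std::uint64_t *SIEVE_RESTRICT pattern,
                         std::size_t wordCount, std::size_t patternLen)
{
    for (std::size_t i = 0; i < wordCount; ++i)
        sieve[i] |= pattern[i % patternLen];
}
```

The promise is the caller's responsibility: passing overlapping buffers to a function like this is undefined behavior, and the compiler does not check it.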

  • I read the contribution guidelines in CONTRIBUTING.md.
  • I placed my solution in the correct solution folder.
  • I added a README.md with the right badge(s).
  • I added a Dockerfile that builds and runs my solution.
  • I selected drag-race as the target branch.
  • All code herein is licensed compatible with BSD-3.

Comment on lines +359 to +363
while (wordIndex + cycleLen <= fullWordCount)
{
    size_t idx = 0;
    while (idx + 8 <= cycleLen)
    {

This question applies to all main and nested while loops in this section of code: is there a particular reason for using while instead of for?

I know this is dangerously close to a coding style comment and I'm the first to say we shouldn't spend too much time on those, certainly in this project. However, I think combining the loop condition and the increment expression (which is neatly at the end of every loop) in one line clarifies the "cycle logic" in this particular case.

davepl commented Oct 26, 2025 via email

davepl commented Oct 26, 2025 via email

rbergen commented Oct 26, 2025

The screen snippet doesn't come through when you respond to GitHub comments by email, but I imagine it's an example of a for loop with an empty first clause (for (; wordIndex + cycleLen <= fullWordCount; wordIndex += cycleLen) or for (; idx + 8 <= cycleLen; idx += 8)).

If that's right, then I was aware of this, but I still think the looping logic itself is more clearly expressed in a for context. However, the fact that I'm one of the maintainers of this project doesn't change that it's your solution, so if you prefer the while, then that's what we'll merge.
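
To make the two styles concrete, here is a small sketch that walks the same cycle structure both ways. It assumes that `wordIndex` starts at 0 and that the increments are `wordIndex += cycleLen` and `idx += 8`, as guessed above. The function names and the block counter are hypothetical stand-ins for the real loop body.

```cpp
#include <cstddef>

// While form, following the shape of the diff. `cycleLen` must be non-zero.
inline std::size_t blocksVisitedWhile(std::size_t fullWordCount, std::size_t cycleLen)
{
    std::size_t blocks = 0;
    std::size_t wordIndex = 0;
    while (wordIndex + cycleLen <= fullWordCount)
    {
        std::size_t idx = 0;
        while (idx + 8 <= cycleLen)
        {
            ++blocks; // stand-in for the real 8-word body
            idx += 8;
        }
        wordIndex += cycleLen;
    }
    return blocks;
}

// For form with an empty first clause: condition and increment share a line.
inline std::size_t blocksVisitedFor(std::size_t fullWordCount, std::size_t cycleLen)
{
    std::size_t blocks = 0;
    std::size_t wordIndex = 0;
    for (; wordIndex + cycleLen <= fullWordCount; wordIndex += cycleLen)
    {
        std::size_t idx = 0;
        for (; idx + 8 <= cycleLen; idx += 8)
            ++blocks; // stand-in for the real 8-word body
    }
    return blocks;
}
```

Both functions visit the same blocks. Keeping the declarations outside the for statements leaves `wordIndex` and `idx` available afterwards, for example for tail handling.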

davepl commented Oct 26, 2025 via email

rbergen merged commit 38c9bee into drag-race on Oct 26, 2025
344 checks passed