Introduce Scalar Simd impl #111
Draft
This `Simd` implementation has a native lane width of 1, which is convenient for reusing SIMD-capable logic to compute a single value, either because that's all you need, or to help handle an unaligned total number of inputs.

Blockers:

- Using `u8` as both an element and a mask type results in some type collisions. Probably easy to fix by adding a newtype. I previously tried `bool` as a mask type but that didn't quite work out.
- The `Simd` methods are copied from `Fallback`. That's a lot of duplication. Should this replace `Fallback`? That might result in worse code for applications which actually do want data-parallelism but aren't targeting an environment otherwise supported by this crate. Is that a realistic concern?
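
To make the one-lane idea and the newtype mask concrete, here is a minimal sketch. It does not use this crate's actual traits: the `Simd` trait below is a simplified, hypothetical stand-in, and `Scalar`, `ScalarMask`, `Wide`, `clamp_chunks`, and `clamp_max` are names invented for illustration. `Wide` is a plain four-element array standing in for a real vector type. The point is that the same generic routine can process the bulk of the input in wide chunks and then finish the unaligned tail with the one-lane type.

```rust
/// Hypothetical, simplified stand-in for a `Simd`-style trait over `u8` lanes.
/// The real trait in this crate will differ.
trait Simd: Copy {
    const LANES: usize;
    type Mask: Copy;
    fn load(src: &[u8]) -> Self;
    fn store(self, dst: &mut [u8]);
    fn splat(x: u8) -> Self;
    fn gt(self, rhs: Self) -> Self::Mask;
    fn select(mask: Self::Mask, if_true: Self, if_false: Self) -> Self;
}

/// One-lane "vector": a single `u8`.
#[derive(Clone, Copy)]
struct Scalar(u8);

/// Newtype mask, so the mask type is distinct from the `u8` element type.
#[derive(Clone, Copy)]
struct ScalarMask(bool);

impl Simd for Scalar {
    const LANES: usize = 1;
    type Mask = ScalarMask;
    fn load(src: &[u8]) -> Self { Scalar(src[0]) }
    fn store(self, dst: &mut [u8]) { dst[0] = self.0; }
    fn splat(x: u8) -> Self { Scalar(x) }
    fn gt(self, rhs: Self) -> ScalarMask { ScalarMask(self.0 > rhs.0) }
    fn select(mask: ScalarMask, if_true: Self, if_false: Self) -> Self {
        if mask.0 { if_true } else { if_false }
    }
}

/// Four-lane array standing in for a real hardware vector (illustration only).
#[derive(Clone, Copy)]
struct Wide([u8; 4]);

impl Simd for Wide {
    const LANES: usize = 4;
    type Mask = [bool; 4];
    fn load(src: &[u8]) -> Self { Wide([src[0], src[1], src[2], src[3]]) }
    fn store(self, dst: &mut [u8]) { dst[..4].copy_from_slice(&self.0); }
    fn splat(x: u8) -> Self { Wide([x; 4]) }
    fn gt(self, rhs: Self) -> [bool; 4] {
        std::array::from_fn(|i| self.0[i] > rhs.0[i])
    }
    fn select(mask: [bool; 4], if_true: Self, if_false: Self) -> Self {
        Wide(std::array::from_fn(|i| if mask[i] { if_true.0[i] } else { if_false.0[i] }))
    }
}

/// Clamp every full chunk of `S::LANES` bytes to at most `limit`.
/// Returns how many bytes were processed; any remainder is left untouched.
fn clamp_chunks<S: Simd>(data: &mut [u8], limit: u8) -> usize {
    let lim = S::splat(limit);
    let mut done = 0;
    for chunk in data.chunks_exact_mut(S::LANES) {
        let v = S::load(chunk);
        S::select(v.gt(lim), lim, v).store(chunk);
        done += S::LANES;
    }
    done
}

/// Wide chunks first, then the same generic logic at lane width 1 for the tail.
fn clamp_max(data: &mut [u8], limit: u8) {
    let done = clamp_chunks::<Wide>(data, limit);
    clamp_chunks::<Scalar>(&mut data[done..], limit);
}
```

Calling `clamp_max` on a seven-byte slice handles the first four bytes through `Wide` and the remaining three through `Scalar`, with no separately written scalar loop.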