Description
https://wiki.edg.com/bin/view/Wg21sofia2025/P344020
Name
n_elements might not be the best name. Alternative suggestions:
- first_n_elements
- set_first_n_elements
- set_mask_n
- set_first_n
- mask_of_n
- mask_with_n_set
- first_n_bits_of_mask
- first_n_of_mask
- (please add...)
Free function
Should it be a free function so that it can be simd-generic? It would need to take a template argument:
auto m0 = simd::n_elements<float>(x);
auto m1 = simd::n_elements<simd_mask<float>>(x);
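To make the free-function question concrete, here is a minimal sketch of how one template argument could accept an element type, a vector type, or a mask type. Everything in it is an assumption for illustration: `toy_mask`, `toy_simd` and `mask_for` are hypothetical stand-ins, not the std::simd types, and the default width of 4 is arbitrary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Toy stand-ins for illustration only; these are NOT the std::simd types.
template <class T, std::size_t N = 4>
struct toy_mask {
    static constexpr std::size_t size() { return N; }
    std::array<bool, N> bits{};
};

template <class T, std::size_t N = 4>
struct toy_simd {
    using mask_type = toy_mask<T, N>;
};

// Hypothetical trait: map the template argument to the mask type to build.
template <class T>
struct mask_for { using type = toy_mask<T>; };                    // element type, e.g. float
template <class T, std::size_t N>
struct mask_for<toy_simd<T, N>> { using type = toy_mask<T, N>; }; // a vector type
template <class T, std::size_t N>
struct mask_for<toy_mask<T, N>> { using type = toy_mask<T, N>; }; // already a mask type

// Hypothetical free function: the first n lanes are set, the rest are clear.
template <class T>
typename mask_for<T>::type n_elements(std::ptrdiff_t n) {
    typename mask_for<T>::type m{};
    for (std::size_t i = 0; i < m.size(); ++i)
        m.bits[i] = static_cast<std::ptrdiff_t>(i) < n;
    return m;
}

// Both spellings from the issue resolve to a mask type.
static_assert(std::is_same<decltype(n_elements<float>(1)), toy_mask<float, 4>>::value, "");
static_assert(std::is_same<decltype(n_elements<toy_mask<float, 8>>(1)), toy_mask<float, 8>>::value, "");

inline void example() {
    auto m0 = n_elements<float>(2);              // lanes 0 and 1 set, out of 4
    auto m1 = n_elements<toy_mask<float, 8>>(3); // lanes 0..2 set, out of 8
    assert(m0.bits[1] && !m0.bits[2]);
    assert(m1.bits[2] && !m1.bits[3]);
}
```

A member function would not need the trait, but it could not be called without first naming the mask type, which is the generic-code concern raised above.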
Preconditions
Currently:
N <= 0 gives an empty mask.
N >= size() gives a full mask.
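The current behaviour amounts to saturating N into the range [0, size()] before building the mask. A minimal sketch, assuming the mask is modelled as an integer bitmask; `first_n_bits` and the `Lanes` parameter are hypothetical names, not part of the proposal:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model: bit i of the result corresponds to lane i of the mask.
template <int Lanes>
std::uint32_t first_n_bits(int n) {
    static_assert(Lanes > 0 && Lanes < 32, "sketch assumes fewer than 32 lanes");
    const int k = std::min(std::max(n, 0), Lanes); // saturate n into [0, Lanes]
    return (std::uint32_t{1} << k) - 1u;           // lowest k bits set
}

inline void example() {
    assert(first_n_bits<8>(-3) == 0x00u);  // N <= 0: empty mask
    assert(first_n_bits<8>(3) == 0x07u);   // 0 < N < size(): partial mask
    assert(first_n_bits<8>(100) == 0xFFu); // N >= size(): full mask
}
```

Any precondition discussed below would replace one side of the saturation with a contract on the caller.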
Should preconditions be added to some of the possibilities?
Negative - It never makes sense for N to be negative (unless you do something tricksy like making it mean the last N, but that is horrible), so a precondition here seems reasonable.
Zero - This might indicate an error, since the program should never be processing nothing and should probably have detected that case and handled it differently.
N == size() - A bit like Zero, this might indicate an error, since the program should have handled a full block differently (and using a mask can be expensive). But a common pattern on targets with cheap masks is a single loop that processes all blocks in the same way, even if they are only partial:
for (auto b : blocks)
    process(data, n_elements(b.size()));
// No special remainder handling: the loop above processes full and partial blocks in the same way.
N > size() - This might indicate an error, but when processing blocks from a larger data set it would let you pass the number of remaining items directly:
for (int i = 0; i < data.size(); i += SIMD::size())
{
    auto block = spanX.subspan(i); // get the rest of the span, starting at i
    auto s = partial_load<SIMD>(block); // Don't care if the block's subspan is too big.
    auto nm = simd::n_elements<typename SIMD::mask_type>(data.size() - i); // Maybe we shouldn't care if N is too big here either?
}
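The remaining-count pattern can be tried without a SIMD library. The sketch below is only a scalar model under stated assumptions: `W` is a hypothetical lane count, `first_n_mask` stands in for n_elements with the current saturating behaviour, and a masked sum stands in for the real per-block work.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t W = 4; // hypothetical lane count

// Stand-in for n_elements: lanes [0, n) are set; n may exceed W.
inline std::array<bool, W> first_n_mask(std::ptrdiff_t n) {
    std::array<bool, W> m{};
    for (std::size_t lane = 0; lane < W; ++lane)
        m[lane] = static_cast<std::ptrdiff_t>(lane) < n;
    return m;
}

// Sum all elements block by block, with no separate remainder loop.
inline float masked_sum(const std::vector<float>& data) {
    float total = 0.0f;
    for (std::size_t i = 0; i < data.size(); i += W) {
        // Pass the number of items still left; it is larger than W for
        // every block except possibly the last one.
        const auto remaining = static_cast<std::ptrdiff_t>(data.size() - i);
        const auto mask = first_n_mask(remaining);
        for (std::size_t lane = 0; lane < W; ++lane)
            if (mask[lane])               // a set lane implies i + lane < data.size()
                total += data[i + lane];
    }
    return total;
}

inline void example() {
    assert(masked_sum({1, 2, 3, 4, 5, 6}) == 21.0f); // one full block, one partial block
}
```

With a precondition of N <= size(), the call site would instead have to clamp the remaining count itself on every iteration.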
My preference is for a precondition of N > 0 only, which still allows N >= size() for the coding styles above. I might be persuaded to also add N <= size(), but not N < size().