Added floating point types support for work_group_reduce tests #2525
Conversation
```cpp
for (size_t j = 0; j < max_wg_size; j++)
    ref_vals[j] = get_random_float(min_range, max_range, d);

for (size_t i = 0; i < (size_t)n_elems; i += max_wg_size)
{
    size_t work_group_size = std::min(max_wg_size, n_elems - i);
    memcpy(&inptr[i], ref_vals.data(),
           sizeof(Type) * work_group_size);
}
```
I believe this means we'll be using the same input data for all work-groups. While that's not inherently bad, it does mean we're doing redundant testing by running multiple work-groups.
Some options to consider:
- Do nothing; these tests don't run for very long, so a little redundant testing is fine.
- Test with different random data for each work-group instead, but verify against the worst-case max_err.
- Only test a single work-group.
> - Test with different random data for each work-group instead, but verify against the worst-case max_err.
Please keep in mind that the max_err may differ for other random sets. That's why I propagated the same data across the other work-groups: to avoid such computations during the verification phase. Since it is not a performance load, and the code is consistent with the integral types, perhaps it may stay as is.
I'm leaning towards "do nothing", but I don't think this is consistent with the integral types. As far as I can tell:
- This code, for the floating-point types, generates a set of random data for the "max work-group size", then stamps out the same data for each work-group, until each of the "n_elems" has a value.
- The code below, for the integral types, generates a random value for each of the "n_elems", which means that each work-group will have a different set of random data.
Is this correct?
Yes, your description is correct. In other words, the kernels and CL calls stay the same for integral and floating-point types, but the generation and verification parts are indeed not consistent.
Related to #1364