-
Notifications
You must be signed in to change notification settings - Fork 75
[Interpreter][histogram] Fix silent data corruption #5391
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Interpreter][histogram] Fix silent data corruption #5391
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes silent data corruption in the interpreter's tl.histogram implementation. The issue occurred because np.histogram returns int64 values by default, but when storing int32 tensors, tl.store would write 8 bytes instead of 4, corrupting adjacent memory.
Key changes:
- Added dummy weights parameter to
np.histogramto preserve the input data type - Added detailed comment explaining the fix
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Docs specify that returned dtype is taken based on optional weights.dtype | ||
| # This is fix for interpreter cases where for example int32 tensor is being passed | ||
| # But unexpectedly int64 values are being returned causing | ||
| # tl.store to write 8 bytes instead of 4 bytes which lead to silent data corruption |
Copilot
AI
Oct 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Add a comment explaining that the dummy weights preserve the input dtype in the histogram output, as this is a subtle behavior of np.histogram that may not be immediately obvious to future maintainers.
| # tl.store to write 8 bytes instead of 4 bytes which lead to silent data corruption | |
| # tl.store to write 8 bytes instead of 4 bytes which lead to silent data corruption | |
| # Note: Providing dummy_weights with the same dtype as the input ensures that | |
| # np.histogram returns a histogram with the same dtype as the weights. | |
| # This is a subtle behavior of np.histogram; without weights, the output dtype defaults to int64. |
anmyachev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job!
There's silent data corruption when calling
tl.histogramwith interpreter.Based on
np.histogramdocs: https://numpy.org/doc/2.3/reference/generated/numpy.histogram.htmlReturned dtype is taken account when optional weights param is passed, int64 othwerwise.
That leads to
tl.storethinking it's saving int32 values while there's int64 in my example tensor passed, so it's writing 8 bytes at once instead of 4 bytes, leading to writing 4 bytes exceeding it's data range causing silent data corruption.Applying "dummy_weights" fixes returned data type as expected fixing data corruption.