Skip to content

Conversation

@mieshkiwrk
Copy link
Contributor

@mieshkiwrk mieshkiwrk commented Oct 27, 2025

There's silent data corruption when calling tl.histogram with interpreter.

# test.py
import torch
import ctypes
import triton
import triton.language as tl


@triton.jit
def histogram_kernel(x_ptr, z_ptr):
    offset = tl.arange(0, 1)
    x = tl.load(x_ptr + offset)
    z = tl.histogram(x, 1)
    buf = (ctypes.c_int32 * 2).from_address(int(z_ptr))

    print(f'before store: {list(buf)}')
    tl.store(z_ptr + offset, z) # tl.store treats z values as int64 while they're int32
    print(f'after store: {list(buf)}')


device = 'cpu'
torch.manual_seed(17)
x = torch.ones(1, device=device, dtype=torch.int32)
z = torch.ones(2, dtype=torch.int32, device=device)
histogram_kernel[(1, )](x, z)

# Output:
# TRITON_INTERPRET=1 TRITON_TEST_SUITE=interpreter python test.py 
# before store: [1, 1]
# after store: [1, 0] <- second element shouldn't be cleared

Based on np.histogram docs: https://numpy.org/doc/2.3/reference/generated/numpy.histogram.html
Returned dtype is taken account when optional weights param is passed, int64 othwerwise.
That leads to tl.store thinking it's saving int32 values while there's int64 in my example tensor passed, so it's writing 8 bytes at once instead of 4 bytes, leading to writing 4 bytes exceeding it's data range causing silent data corruption.

import numpy as np

data = np.array([1], dtype=np.int32)
bins = 1

print(f'Data dtype before: {data.dtype}')
histogram = np.histogram(data, bins=bins, range=(0, bins))[0]
print(f'Data dtype after: {histogram.dtype}')

# Data dtype before: int32                                                                                                                                           
# Data dtype after: int64

Applying "dummy_weights" fixes returned data type as expected fixing data corruption.

@mieshkiwrk mieshkiwrk changed the title [Interpreter][histogram] Silent data corruption [Interpreter][histogram] Fix silent data corruption Oct 27, 2025
@mieshkiwrk mieshkiwrk requested a review from Copilot October 27, 2025 13:41
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes silent data corruption in the interpreter's tl.histogram implementation. The issue occurred because np.histogram returns int64 values by default, but when storing int32 tensors, tl.store would write 8 bytes instead of 4, corrupting adjacent memory.

Key changes:

  • Added dummy weights parameter to np.histogram to preserve the input data type
  • Added detailed comment explaining the fix

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

# Docs specify that returned dtype is taken based on optional weights.dtype
# This is fix for interpreter cases where for example int32 tensor is being passed
# But unexpectedly int64 values are being returned causing
# tl.store to write 8 bytes instead of 4 bytes which lead to silent data corruption
Copy link

Copilot AI Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Add a comment explaining that the dummy weights preserve the input dtype in the histogram output, as this is a subtle behavior of np.histogram that may not be immediately obvious to future maintainers.

Suggested change
# tl.store to write 8 bytes instead of 4 bytes which lead to silent data corruption
# tl.store to write 8 bytes instead of 4 bytes which lead to silent data corruption
# Note: Providing dummy_weights with the same dtype as the input ensures that
# np.histogram returns a histogram with the same dtype as the weights.
# This is a subtle behavior of np.histogram; without weights, the output dtype defaults to int64.

Copilot uses AI. Check for mistakes.
@mieshkiwrk mieshkiwrk linked an issue Oct 27, 2025 that may be closed by this pull request
Copy link
Contributor

@anmyachev anmyachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job!

@mieshkiwrk mieshkiwrk merged commit 3a1a8c6 into main Oct 29, 2025
23 checks passed
@mieshkiwrk mieshkiwrk deleted the mdziado/interpreter_fix_numpy_histogram_dtype branch October 29, 2025 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FAILED language\test_core.py::test_histogram[8-512] on BMG, Windows

3 participants