Closed
Description
Which component has the problem?
CuTe DSL
Bug Report
Describe the bug
cute.print_tensor() doesn't print anything if the tensor is the result of a copy.
Example output of the script:
(.leovenv) ubuntu@machine $ python cant_print_tensor_after_a_copy.py
(.leovenv) ubuntu@machine $
Steps/Code to reproduce bug
import cutlass.cute as cute
import cutlass

@cute.kernel
def print_kernel():
    smem = cutlass.utils.SmemAllocator()
    s = smem.allocate_tensor(cute.Float16, cute.make_layout((8, 8), stride=(8, 1)), byte_alignment=16)
    r = cute.make_fragment(s.layout, s.element_type)
    cute.basic_copy(s, r)  # if you comment this out, the tensor will print correctly
    cute.print_tensor(r)

@cute.jit
def print_launcher():
    print_kernel().launch(grid=(1, 1, 1), block=(1, 1, 1))

cutlass.cuda.initialize_cuda_context()
print_launcher()
Expected behavior
cute.print_tensor() should always print the tensor at runtime if the IR compiled correctly.
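A minimal sketch of how the expected behavior could be checked automatically, using only the Python standard library. It assumes the repro is saved as cant_print_tensor_after_a_copy.py (the name shown in the example output) and that the device-side print reaches the process's stdout; the helper name script_prints_output is hypothetical and not part of CuTe DSL.

```python
import subprocess
import sys


def script_prints_output(script_path: str, timeout_s: float = 120.0) -> bool:
    """Run a Python script in a subprocess and report whether it wrote anything to stdout.

    Hypothetical helper for this report; it does not call any CuTe DSL API.
    """
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,  # collect stdout/stderr instead of inheriting the terminal
        text=True,
        timeout=timeout_s,
    )
    # Whitespace-only output is treated as "nothing printed".
    return result.stdout.strip() != ""
```

With the behavior described above, script_prints_output("cant_print_tensor_after_a_copy.py") would be expected to return False while the bug is present and True once the tensor prints.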
Environment details (please complete the following information):
Using nvidia-cutlass-dsl==4.1.0 on Ubuntu 22.04 (arm64) with a B200 GPU.