[BUG][CuTeDSL] cute.print_tensor() won't print anything if the tensor is the result of a copy #2631

@sfc-gh-lpaille

Which component has the problem?

CuTe DSL

Bug Report

Describe the bug
cute.print_tensor() prints nothing when the tensor passed to it is the destination of a copy (here, a fragment filled by cute.basic_copy()).

Example output of the script (nothing is printed):

(.leovenv) ubuntu@machine $ python cant_print_tensor_after_a_copy.py 
(.leovenv) ubuntu@machine $ 

Steps/Code to reproduce bug

import cutlass.cute as cute
import cutlass

@cute.kernel
def print_kernel():
    smem = cutlass.utils.SmemAllocator()
    s = smem.allocate_tensor(cute.Float16, cute.make_layout((8, 8), stride=(8, 1)), byte_alignment=16)
    r = cute.make_fragment(s.layout, s.element_type)
    cute.basic_copy(s, r) # if you comment this out, the tensor will print correctly
    cute.print_tensor(r)

@cute.jit
def print_launcher():
    print_kernel().launch(grid=(1, 1, 1), block=(1, 1, 1))

cutlass.cuda.initialize_cuda_context()
print_launcher()
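
For readers less familiar with the layout in the repro, here is a minimal plain-Python sketch of what shape (8, 8) with stride (8, 1) means: coordinate (i, j) maps to offset i*8 + j*1, i.e. a row-major 8x8 tile of 64 elements. This is not CuTe DSL code and does not need a GPU; the helper name `layout_offset` is hypothetical and only illustrates the coordinate-to-offset rule, so a working cute.print_tensor(r) would be expected to cover these 64 elements.

```python
# Plain-Python illustration of the layout used in the repro.
# NOTE: layout_offset is a hypothetical helper, not part of cutlass.cute.

SHAPE = (8, 8)   # same shape as cute.make_layout((8, 8), stride=(8, 1))
STRIDE = (8, 1)  # row-major: rows are 8 elements apart, columns 1 apart


def layout_offset(coord, stride):
    """Map a logical coordinate to a linear offset: sum(coord[k] * stride[k])."""
    return sum(c * s for c, s in zip(coord, stride))


# Build the full table of offsets for the 8x8 tile.
offsets = [
    [layout_offset((i, j), STRIDE) for j in range(SHAPE[1])]
    for i in range(SHAPE[0])
]

# Print one row of offsets per line.
for row in offsets:
    print(" ".join(f"{o:2d}" for o in row))
```

Each offset from 0 to 63 appears exactly once, so the fragment `r` in the repro holds 64 Float16 values.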

Expected behavior
cute.print_tensor() should always print the tensor at runtime if the IR compiled correctly.

Environment details (please complete the following information):
Using nvidia-cutlass-dsl==4.1.0 on Ubuntu 22.04 (arm64) with a B200 GPU.
