[QST][CUTEDSL] Address Misalignment in FP8 Gemm  #2902

@Willie-Qu

I am seeking help from the CUTLASS community. I hit a "misaligned address" error (16 bytes requested) when loading a matrix from shared memory (smem) into registers. I traced the problem to the call "thr_copy_ldmatrix_A.partition_S(sA)"; the debug output before and after that call is shown below.
Does anyone know how to debug and solve this issue?

--------------------------------
DEBUG: sA: 
tensor<ptr<i8, smem, align<1024>> o ((64,1),(8,8),(1,4)):((1,0),(64,512),(0,4096))>

DEBUG: thr_copy_ldmatrix_A: 
Tiled Copy
  Tiler MN:        (32:1,32:1)
  TV Layout tiled: ((4,8,2,2),((4,2,2),(1,1))):((128,1,16,0),((32,8,512),(0,0)))
Copy Atom
  ThrID:           32:1
  TV Layout Src:   ((2,2,4,2),16):((16,128,32,0),1)
  TV Layout Dst:   ((4,8),(1,2,2,2)):((32,1),(1,16,8,128))
  Value type:      i8
--------------------------------
tCsA_copy_view = thr_copy_ldmatrix_A.partition_S(sA)

DEBUG: tCsA_copy_view: 
tensor<ptr<i8, smem, align<8>> o (((8,2),2),2,2,(1,4)):(((1,128),1024),32,2048,(0,4096))>
--------------------------------
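
One thing that stands out in the output above: `sA` is `align<1024>`, but the partitioned view is only `align<8>`, and its value mode `((8,2),2):((1,128),1024)` has just 8 unit-stride elements. The following is a minimal sketch in plain Python (not the CuTe DSL API) that reads those numbers off the printed layouts. The helpers `flatten` and `contiguous_run` are hypothetical names introduced for illustration, and the 16-byte requirement is an assumption taken from the error message.

```python
def flatten(x):
    """Flatten a nested tuple of ints (a CuTe-style shape or stride) into a list."""
    if isinstance(x, int):
        return [x]
    return [v for e in x for v in flatten(e)]


def contiguous_run(shape, stride):
    """Count the elements that form one unit-stride run inside shape:stride."""
    # Ignore size-1 modes and stride-0 (broadcast) modes, then walk by stride.
    modes = sorted(
        (s, n) for n, s in zip(flatten(shape), flatten(stride)) if n > 1 and s > 0
    )
    run = 1
    for s, n in modes:
        if s != run:      # gap in memory: the contiguous run ends here
            break
        run *= n
    return run


ELEM_BYTES = 1       # "Value type: i8" in the debug output
REQUIRED_BYTES = 16  # assumption: taken from "16 bytes requested" in the error

# Value mode (mode 0) of tCsA_copy_view, copied from the debug output.
val_shape, val_stride = ((8, 2), 2), ((1, 128), 1024)
ptr_align = 8        # from ptr<i8, smem, align<8>>

run_bytes = contiguous_run(val_shape, val_stride) * ELEM_BYTES
ok = run_bytes >= REQUIRED_BYTES and ptr_align % REQUIRED_BYTES == 0

print("contiguous bytes per thread:", run_bytes)
print("pointer alignment (bytes):  ", ptr_align)
print("meets 16-byte requirement:  ", ok)
```

Under these assumptions the check reports 8 contiguous bytes and 8-byte alignment, so a 16-byte access per thread cannot be guaranteed. That would be consistent with `sA` being contiguous along M (stride 1 on the 64-extent mode) rather than along K, but whether that is the actual root cause is not confirmed here.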
