Open
Description
Possibly related to this comment in intercept.cc:
Omniprobe appears to abort with the following assertion when a kernel takes arguments smaller than 8 bytes.
>>>>>>>> HSA intercept registered.
ADDRESS_MESSAGE,timestamp,kernel,src_line,dispatch,exec_mask,xcc_id,se_id,cu_id,kind,address
host_device_combined_omniprobe: /app/omniprobe/src/interceptor.cc:649: void hsaInterceptor::fixupKernArgs(void *, void *, void *, arg_descriptor_t): Assertion `desc.clone_hidden_args_length <= desc.kernarg_length - desc.explicit_args_length' failed.
This was encountered with the following kernel signature, which uses 32-bit integer arguments:
__global__ void hgemm_kernel(const __half* __restrict__ A,
const __half* __restrict__ B,
__half* __restrict__ C,
int M, int N, int K)
It was not encountered (the run proceeds to instrumentation) with the following signature, where the integers are replaced with 64-bit ones:
__global__ void hgemm_kernel(const __half* __restrict__ A,
const __half* __restrict__ B,
__half* __restrict__ C,
int64_t M, int64_t N, int64_t K)
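To make the size difference between the two signatures concrete, here is a minimal sketch of how the explicit argument bytes add up. It assumes each explicit argument is placed at its natural alignment and that hidden arguments would start at the next 8-byte boundary. Neither assumption has been checked against the Omniprobe source, and `ArgInfo`, `align_up`, and `explicit_args_length` are hypothetical helpers written for this illustration, not Omniprobe functions.

```cpp
#include <cstddef>
#include <cstdint>

// Size and alignment of one explicit kernel argument (hypothetical helper type).
struct ArgInfo {
    std::size_t size;
    std::size_t align;
};

// Round offset up to the next multiple of align (align must be a power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) & ~(align - 1);
}

// Bytes spanned by the explicit arguments when each one is placed at its
// natural alignment (an assumption about the kernarg layout).
template <std::size_t N>
constexpr std::size_t explicit_args_length(const ArgInfo (&args)[N]) {
    std::size_t offset = 0;
    for (std::size_t i = 0; i < N; ++i) {
        offset = align_up(offset, args[i].align) + args[i].size;
    }
    return offset;
}

// hgemm_kernel(const __half*, const __half*, __half*, int, int, int)
// Device pointers are taken to be 8 bytes wide.
constexpr ArgInfo kInt32Variant[] = {
    {8, 8}, {8, 8}, {8, 8},
    {sizeof(std::int32_t), alignof(std::int32_t)},
    {sizeof(std::int32_t), alignof(std::int32_t)},
    {sizeof(std::int32_t), alignof(std::int32_t)},
};

// hgemm_kernel(const __half*, const __half*, __half*, int64_t, int64_t, int64_t)
constexpr ArgInfo kInt64Variant[] = {
    {8, 8}, {8, 8}, {8, 8}, {8, 8}, {8, 8}, {8, 8},
};

// 32-bit variant: 3 * 8 + 3 * 4 = 36 bytes, which is not a multiple of 8.
static_assert(explicit_args_length(kInt32Variant) == 36, "int variant");
// An 8-byte-aligned region following it would begin at offset 40.
static_assert(align_up(36, 8) == 40, "4 bytes of padding");
// 64-bit variant: 6 * 8 = 48 bytes, already 8-byte aligned, so no padding.
static_assert(explicit_args_length(kInt64Variant) == 48, "int64_t variant");
```

Under these assumptions the 32-bit signature leaves 4 bytes of padding after the explicit arguments, while the 64-bit signature leaves none. If the length check in `fixupKernArgs` does not account for that padding, it could explain why only the 32-bit variant trips the assertion, but that remains a guess.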
coleramos425