Possible incorrect name mangling in CUDA kernel instantiation #121468

@OgnianM

I've been diagnosing a problem where cudaLaunchKernel returns "named symbol not found" (cudaErrorSymbolNotFound) for certain kernel launches.
The code that triggers it is quite complicated and I haven't yet been able to reduce it to something simpler, but it looks like name mangling is producing invalid names.

Running this code:

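// Launch the kernel, then inspect the error if the symbol lookup failed.
KernelCUDA<ExecutionShapes_t, Threads, Blocks, Functions_t, decltype(Parameters)...><<<Blocks, Threads, MaxShmem, stream>>>(Functions, Parameters...);

if (cudaGetLastError() == cudaErrorSymbolNotFound) {
    // Requires <cxxabi.h>, <cstdlib>, <iostream>, and <typeinfo>.
    auto name = typeid(KernelCUDA<ExecutionShapes_t, Threads, Blocks, Functions_t, decltype(Parameters)...>).name();
    std::cout << "Mangled: " << name << std::endl;
    int status = 0;
    char* demangled = abi::__cxa_demangle(name, nullptr, nullptr, &status);
    // __cxa_demangle returns nullptr on failure, and streaming a null char* is undefined behavior.
    if (status == 0 && demangled) std::cout << "Demangled: " << demangled << std::endl;
    else std::cout << "Demangle failed, status: " << status << std::endl;
    std::free(demangled);  // The result is malloc-allocated; free(nullptr) is a no-op.
}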

The mangled name ends up being FvN5tadma5tupleIJZZZZNS_9OptimizerIJNS_4NodeINS_8SequenceIJEEEZ9InputNodeIJRNS_6TaggedILi0ENS_6TensorIlNS_9AllocatorILNS_6MemoryE1EEENS3_IJLi1ELi10EEEES4_EEEERNS6_ILi2ESC_EERNS6_ILi1ESC_EEEEDaDpOT_EUlDpRKSJ_E_JSD_SF_SH_EEENS2_INS3_IJLi1ELi10ELi768EEEEZNS_6gatherILi0ETkNS_9AnyTensorENS7_IfSA_NS3_IJLi30522ELi768EEEES4_EETkNS_9AnyTensorESD_EEDaRKT0_RKT1_EUlT_RSV_S10_RKT2_E_JNS7_IfSA_SR_S4_EEKSU_KSD_EEENS2_ISR_ZNSS_ILi0ETkNS_9AnyTensorENS7_IfSA_NS3_IJLi2ELi768EEEES4_EETkNS_9AnyTensorESH_EEDaSX_S10_EUlS11_S12_S10_S15_E_JS17_KS1C_KSH_EEENS2_ISR_ZNS_13CombineToNodeITkNS_9AnyTensorENS6_ILm3765833763ES17_EETkNS_9AnyTensorENS6_ILm2639559269ES17_EEZNS_plITkNS_9AnyTensorES1I_TkNS_9AnyTensorES1J_EEDaRKS11_SX_EUlS1M_SX_E_EEDaS1M_SX_S10_EUlS1M_SX_S10_RS13_E_JKS1I_KS1J_S17_EEENS2_ISR_ZNS1H_ITkNS_9AnyTensorENS6_ILm2072007968ES17_EETkNS_9AnyTensorES17_ZNS1K_ITkNS_9AnyTensorES1T_TkNS_9AnyTensorES17_EEDaS1M_SX_EUlS1M_SX_E_EEDaS1M_SX_S10_EUlS1M_SX_S10_S1O_E_JKS1T_KS17_S17_EEENS2_INS3_IJXtlNS_15BlockSizeMarkerILl512EEEEELi10ELi1EEEENS6_ILj2048EZNS_10ReduceNodeILi2ELb1ETkNS_9AnyTensorENS6_ILm3627180312ES17_EEZNHS17_4meanILin1ELb1ERS23_EEDcOSY_EUlS1M_SX_E_UlS1M_E_ZNHS24_ILin1ELb1ES25_EEDcS26_EUlS1M_E_QeqsrSY_6deviceLS9_1EEEDaS10_S15_RKT3_RKT4_EUlS1M_SX_RSY_E_EEJKS23_NS7_IfSA_NS3_IJLi1ELi10ELi1EEEES4_EEEEENS2_IS4_Z10OutputNodeIJRNS6_ILm129000195ENS7_IfSA_NS3_IJLi1ELi10ELi30522EEEES4_EEEEEEDaSL_EUlSO_E_JS2Q_EEEEE12CompileGraphILS9_1EEENS_13CompiledGraphERKSQ_RKS1A_RKS1G_RKS1S_RKS1Y_RKS2M_RKS2T_ENKUlTnivE_clILi3EEEDavENKUlTpTnivE_clIJLi2ELi3EEEEDavENKUlTnivE_clILi0EEEDavEUlRS11_DpOT0_E_ZZZZNS2V_ILS9_1EEES2W_S2Y_S30_S32_S34_S36_S38_S3A_ENKS3C_ILi3EEEDavENKS3E_IJLi2ELi3EEEEDavENKS3G_ILi1EEEDavEUlS3H_S3K_E_EEERNS7_IfNS8_ILS9_3EEESR_S4_EERS1C_RSC_RS17_S3T_E

abi::__cxa_demangle then returns status code -2, which is documented as "mangled_name is not a valid name under the C++ ABI mangling rules".
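
For anyone reproducing this, here is a minimal host-only sketch of a demangling helper that reports the status code instead of printing a null pointer. The name `demangle_or_status` and the message strings are hypothetical, not part of the code above. It assumes a toolchain that ships `<cxxabi.h>` (GCC or Clang with the Itanium C++ ABI).

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the demangled name, or a short description
// of the failure based on the status code set by abi::__cxa_demangle.
inline std::string demangle_or_status(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status == 0 && out != nullptr) {
        std::string result(out);
        std::free(out);
        return result;
    }
    std::free(out);  // free(nullptr) is a no-op
    switch (status) {
        case -1: return "<demangle failed: memory allocation failure>";
        case -2: return "<demangle failed: not a valid mangled name>";
        case -3: return "<demangle failed: invalid argument>";
        default: return "<demangle failed: unknown status>";
    }
}
```

Passing the string from `typeid(...).name()` to this helper should give either a readable name or an explicit failure message, which makes the -2 case easy to spot in logs.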

This happens when instantiating any kernel at this point in the code, but only if it's instantiated with a lambda.

__global__ void RandomKernel(auto) {}

...
RandomKernel<<<1,1>>>([]{}); // Boom

So a kernel instantiated with a lambda (or any type?) dependent on a complicated mess of templates breaks name mangling at some point.
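
One way to narrow down whether the closure type is the trigger is to compare how a lambda and a named functor are mangled. This is a rough host-only sketch, assuming the Itanium C++ ABI (GCC or Clang). `Noop`, `mangled_type_name`, and `mangled_lambda_name` are hypothetical names used only for illustration, and it does not reproduce the bug itself.

```cpp
#include <string>
#include <typeinfo>

// Hypothetical named functor standing in for the lambda `[]{}`.
struct Noop {
    void operator()() const {}
};

// Returns the ABI-mangled name of T as reported by typeid (host side only).
template <class T>
std::string mangled_type_name() {
    return typeid(T).name();
}

// Returns the mangled name of a closure type created inside this function.
// Under the Itanium ABI, a closure's name embeds its enclosing function
// (a "Z...E" local-name prefix) followed by a "Ul...E_" lambda marker,
// so it grows with the complexity of the surrounding context.
inline std::string mangled_lambda_name() {
    auto lambda = [] {};
    return mangled_type_name<decltype(lambda)>();
}
```

If launching `RandomKernel` with `Noop{}` at the same point in the code succeeds while the lambda version fails, that would point at the mangling of the closure's enclosing context rather than at the kernel template.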

I'm still working to isolate the problem and will post any additional findings here. Any thoughts are appreciated.

Labels: clang:frontend (Language frontend issues, e.g. anything involving "Sema"), cuda, duplicate (Resolved as duplicate)