CUDA: Host-side static class member leaks into PTX as extern global #98151

@mkuron

Description

Consider the following bit of code:

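// Class template with a static data member (host-side code).
template <class _Elem>
class codecvt {
    static int id;
};

// Out-of-class definition of the static member.
template <class _Elem>
int codecvt<_Elem>::id;

// Explicit instantiation; this is what triggers the stray device-side symbol.
template class codecvt<char>;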

Including this in a CUDA file (the class does not even have to be used in a device-side function or variable) leads to warnings like ptxas warning : Unresolved extern variable '_ZN7codecvtIcE2idE' in whole program compilation, ignoring extern qualifier. When compiling with -fgpu-rdc, it instead leads to errors like this at link time: nvlink error : Undefined reference to '_ZN7codecvtIcE2idE'. The resulting PTX contains this line: .extern .global .align 4 .u32 _ZN7codecvtIcE2idE;
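To make it easier to see which entity the ptxas and nvlink messages refer to, the mangled name can be decoded. The following is only a minimal sketch, assuming a toolchain that follows the Itanium C++ ABI and ships <cxxabi.h> (GCC or Clang on Linux, for example; this does not apply to MSVC). The helper name demangle is hypothetical and not part of the original report.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (Itanium C++ ABI toolchains only)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-mangled
// symbol, or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so it has to be released with std::free.
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        return mangled;
    }
    std::string result(raw);
    std::free(raw);
    return result;
}
```

Calling demangle("_ZN7codecvtIcE2idE") should yield codecvt<char>::id, i.e. the static member of the explicitly instantiated class from the snippet above.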

This pattern, a static member of a class template plus an explicit instantiation, appears several times in Microsoft's STL implementation. As a result, it is impossible to, for example, #include <locale> (and various other headers) when using CUDA on Windows. The code above was reduced from https://github.com/microsoft/STL/blob/vs-2019-16.9/stl/inc/xlocale.

Godbolt: https://godbolt.org/z/n1jzMrMqd

When compiling with -O1 or higher, the GlobalOptPass eliminates the unused variable from the device code and the problem goes away. NVCC does not show this issue: it never generates .extern .global symbols for host-side static class members. Clang appears to generate these extraneous .extern .global symbols only for code that exactly follows the above pattern; eliminating the templates, or even replacing the explicit instantiation with an implicit one, makes the problem go away.
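For comparison, the two variants described above as not triggering the problem might look roughly like this. This is an illustrative sketch only: the names codecvt_implicit, codecvt_plain and implicit_id are hypothetical, the members are made public purely so the example can be exercised, and the claim that these forms avoid the stray symbol comes from the observation above rather than from this code.

```cpp
// Variant A: same class template shape, but no explicit instantiation.
// The specialization is instantiated implicitly when it is first used.
template <class Elem>
class codecvt_implicit {
public:
    static int id;
};

template <class Elem>
int codecvt_implicit<Elem>::id;

// Using the member implicitly instantiates codecvt_implicit<char>::id.
inline int& implicit_id() {
    return codecvt_implicit<char>::id;
}

// Variant B: the same static member without any templates.
class codecvt_plain {
public:
    static int id;
};

int codecvt_plain::id;
```

Both members have static storage duration and are therefore zero-initialized on the host. Whether either variant is a practical workaround depends on the header in question; headers such as the Microsoft STL ones cannot simply be edited by users.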

The same problem also affects AMD HIP; see https://godbolt.org/z/hd9YKT36s. There, the extraneous symbol shows up as .hidden in the assembly code, and it is eliminated by the GlobalDCEPass at -O1 and higher.

Metadata

Labels

clang:codegen (IR generation bugs: mangling, exceptions, etc.), cuda
