Use -fvisibility=hidden during CUDA device step linking when CMAKE_CUDA_VISIBILITY_PRESET is hidden
description of the issue
When linking an executable that depends on a static library with CUDA code and on a shared library that also depends on the same static library, the executable crashes at startup or errors when calling into CUDA code. The problem occurs specifically when one of the libraries is shared and the others are static; forcing them all to be shared avoids the issue. In our actual project, which is fairly complex, this manifests as a SEGV before main in __cudaRegisterLinkedBinary. In our much simplified reproducer code (details below) the problem manifests as an "invalid device function" error. In both cases adding -fvisibility=hidden to the CUDA device linking step resolves the issue. We have set CMAKE_CUDA_VISIBILITY_PRESET hidden in the project's top level CMakeLists.txt, and the flag -fvisibility=hidden appears everywhere else but the CUDA device linking step. CMake should use the -fvisibility=hidden flag in the CUDA device linking step when CMAKE_CUDA_VISIBILITY_PRESET is set to hidden.
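The dependency layout described above can be sketched roughly as follows. This is a minimal illustration, not our project's build files: the target names (core, wrapper, exec) and source file names are hypothetical, and it assumes separable compilation is enabled so that a device link step actually runs. The final command is the workaround line from the reproducer, applied here to the targets assumed to go through a device link step.

```cmake
# $<DEVICE_LINK:...> requires CMake 3.18 or newer.
cmake_minimum_required(VERSION 3.18)
project(device_link_sketch LANGUAGES CXX CUDA)

# Initializes CUDA_VISIBILITY_PRESET on targets created below, so CUDA
# sources are compiled with hidden default visibility.
set(CMAKE_CUDA_VISIBILITY_PRESET hidden)

# Static library containing CUDA code (hypothetical name and source).
add_library(core STATIC core.cu)
set_target_properties(core PROPERTIES
  CUDA_SEPARABLE_COMPILATION ON   # assumption: relocatable device code is in use
  POSITION_INDEPENDENT_CODE ON)   # needed to link the static library into a shared one

# Shared library that depends on the same static library.
add_library(wrapper SHARED wrapper.cu)
target_link_libraries(wrapper PUBLIC core)

# Executable that depends on both the static and the shared library.
add_executable(exec main.cu)
target_link_libraries(exec PRIVATE core wrapper)

# Workaround: pass the flag explicitly to the device link step only.
foreach(tgt wrapper exec)
  target_link_options(${tgt} PRIVATE $<DEVICE_LINK:-fvisibility=hidden>)
endforeach()
```

Running the build with make VERBOSE=1 shows whether the flag reaches the device link command line.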
steps to reproduce
There is a small reproducer that can be used to illustrate the issue. Here are the steps to clone, compile, and run.
git clone git@github.com:burlen/device_link_issue.git
cd device_link_issue
mkdir bin
cd bin
rm -rfI ./*; cmake -DBUILD_SHARED_LIBS=OFF ..
make VERBOSE=1
./test/exec
The code should print a string of "0.25", the results of a simple calculation. Instead there is an "invalid device function" error.
To illustrate that the problem is the lack of -fvisibility=hidden during the device linking step, uncomment line 83 of the CMakeLists.txt (target_link_options(${TGT_NAME} PRIVATE $<DEVICE_LINK:-fvisibility=hidden>)) and repeat the configure, make, and run steps:
rm -rfI ./*; cmake -DBUILD_SHARED_LIBS=OFF ..
make VERBOSE=1
./test/exec
The code now produces the correct result.
system details and versions
fedora 35 linux
cmake version 3.22.2
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
g++ (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)
03:00.0 VGA compatible controller: NVIDIA Corporation TU104GL [Quadro RTX 4000] (rev a1)
NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6