CUDA: Cannot target CUDA Linker option "Perform Device Link" for static library compilation
We have a CUDA project using CMake which builds a static library, and then several targets which depend on the static library.
All targets use the CMake setting CUDA_SEPARABLE_COMPILATION ON to enable relocatable device code.
Our project builds fine. However, if we modify any CUDA code within the static library and perform an incremental build on Windows/Visual Studio, we receive a link error from nvcc ('fatbinlinkerror') because the device code within the static library has not been re-linked. As a result we must always perform full rebuilds, which take significantly longer than an incremental build would.
We reported this issue as a bug to NVIDIA's CUDA compiler team and received a response explaining that it was caused by the static library not being told to perform a device link (the highlighted setting in the screenshot below). They also clarified that the GCC build always does a full device link, which is why it remains unaffected.
Looking through our CUDA targets, the two static libraries (created with add_library(${NAME} STATIC ${SRC})) have this setting disabled, and the executable targets (created with add_executable(${NAME} ${SRC})) all have it enabled. I can't see any obvious way to enable this setting manually via CMake (e.g. in the FindCUDA docs). My assumption is that this setting should always be enabled for any target with CUDA_SEPARABLE_COMPILATION enabled, but because a C/C++ static library has no link stage, this has been overlooked in the case of CUDA.
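For reference, the setup described above can be sketched roughly as follows. This is a minimal illustration, not our actual build files: the project name, target names (mylib, myapp) and source paths are placeholders, and it assumes CMake's native CUDA language support rather than the FindCUDA module.

```cmake
cmake_minimum_required(VERSION 3.9)

# Hypothetical project name; CUDA is enabled as a first-class language.
project(rdc_example LANGUAGES CXX CUDA)

# Static library containing CUDA device code (placeholder source file).
add_library(mylib STATIC lib/kernels.cu)
# Compile with relocatable device code (-rdc=true).
set_target_properties(mylib PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

# Executable that depends on the static library (placeholder source file).
add_executable(myapp app/main.cu)
set_target_properties(myapp PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
target_link_libraries(myapp PRIVATE mylib)
```

With a configuration like this, the generated Visual Studio project for the executable has "Perform Device Link" enabled, while the one for the static library does not.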
Here's the full quote of the reply I received after reporting the bug to NVIDIA. I don't believe I can link to the actual bug report I sent them, as the nvcc bug reporting system is mostly private.
Thanks for reporting this ticket with reproducer . I do see the same behavior following the build steps you described . On windows , we had -rdc to generate relocatable device code , you do enabled it when build the device code lib (GPU codes), but you didn't enable the dlink option when generate this lib . The dlink is done when generating the final executable . After the first build , when you modify the devices codes in lib , VS would only detect there is incrementally changes libs and only do the host linking incrementally again . So as you see some device symbols may not found in the linking .
On linux , we usually do all links again especially when device codes are changed . As you can see your whole project may cost huge time to do a rebuild , so if there is only device codes in static libs and your parent project contains only host codes , enable the dlink in static libs is a good option for you . I tried on your project that should work , note that when you changed to enable the dlink , first rebuild is still needed .