Clang CUDA on Windows doesn't build without manual flags and standard paths
I have just tried to use CMake 3.17.20200520-g81e8f627 and clang to compile a CUDA project we normally build with nvcc. I bumped into a few issues:
CMake seems to ignore the CMAKE_CUDA_COMPILER_WORKS option, so the compiler check is always performed. (This is intended.)
At least during the compiler test, CMake does not seem to forward the CMAKE_CUDA_ARCHITECTURES values to clang, so I have to set the architecture manually using CMAKE_CUDA_FLAGS=--cuda-gpu-arch=XXX. Otherwise it tries to compile for sm_20, which my CUDA installation (CUDA 10.1) apparently doesn't support. (This was because cuda-path wasn't set, see below.)
Related to the above, I don't know how to make CMake compile for multiple GPU architectures at once. CMAKE_CUDA_ARCHITECTURES can contain an array of architectures (since nvcc can compile for multiple architectures at once), but since clang can only compile for one architecture at a time with --cuda-gpu-arch, I cannot build multiple architectures with the trick above. (Trick no longer required.)
CMake does not seem to forward the path to the CUDA runtime to clang, I have to set it manually using
When linking the test program, the clang linker cannot find the CUDA lib files because the linker search path is set to just <CUDA_PATH>/lib (which is empty) instead of <CUDA_PATH>/lib/x64 (which contains the 64-bit lib files). I copied the lib files manually into the
After that last step, CMake decided the compiler was working.
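On the multiple-architecture question in the list above, here is a minimal sketch of how a list of architectures might be requested through CMake rather than through CMAKE_CUDA_FLAGS. It assumes CMake 3.18 or newer, where CMAKE_CUDA_ARCHITECTURES and the CUDA_ARCHITECTURES target property exist. The project name, target name, source file and the second architecture (50) are hypothetical. clang's own documentation says --cuda-gpu-arch may be passed more than once. Whether CMake forwards every entry to clang on Windows is not confirmed by this report.

```cmake
# Sketch only: assumes CMake >= 3.18 (CMAKE_CUDA_ARCHITECTURES support).
cmake_minimum_required(VERSION 3.18)

# Set before the CUDA language is enabled so the value is available during
# the compiler check and becomes the default for targets created later.
# 30 comes from the report; 50 is a hypothetical second architecture.
set(CMAKE_CUDA_ARCHITECTURES 30 50)

# Hypothetical project name.
project(cuda_multiarch_example LANGUAGES CXX CUDA)

# Hypothetical target and source file.
add_library(example_kernels SHARED kernels.cu)

# Optional per-target override of the project-wide default.
set_target_properties(example_kernels PROPERTIES CUDA_ARCHITECTURES "30;50")
```

The same list can also be given at configure time as -DCMAKE_CUDA_ARCHITECTURES="30;50", which keeps the architecture choice out of the project files.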
Then I went on to compile our project, a shared library, with CMAKE_BUILD_TYPE=RelWithDebInfo. The generator is MinGW Makefiles. Compilation went fine; however, at link time there were inconsistencies in the runtime libraries being used. CXX files correctly used MD_DynamicRelease, but CU files used MT_StaticRelease. This can be seen below.
mingw32-make reports the following CXX compiler command being used:
C:\PROGRA~1\LLVM\bin\CLANG_~1.EXE <CUSTOM_PREPROCESSOR> <CUSTOM_INCLUDE_PATHS> -O2 -g -DNDEBUG -Xclang -gcodeview -D_DLL -D_MT -Xclang --dependent-lib=msvcrt <CUSTOM_COMPILER_OPTS> -o <OUTPUT_FILE> -c <CXX_FILE>
and the following CU compiler command:
C:\PROGRA~1\LLVM\bin\CLANG_~1.EXE <CUSTOM_PREPROCESSOR> <CUSTOM_INCLUDE_PATHS> --cuda-gpu-arch=sm_30 --cuda-path=C:/mingw/cuda --cuda-gpu-arch=sm_30 <CUSTOM_COMPILER_OPTS> -std=gnu++14 -x cuda -c <CUDA_FILE> -o <OUTPUT_FILE>
and for the linker:
C:\PROGRA~1\LLVM\bin\CLANG_~1.EXE -fuse-ld=lld-link -nostartfiles -nostdlib -O2 -g -DNDEBUG -Xclang -gcodeview -D_DLL -D_MT -Xclang --dependent-lib=msvcrt -shared -o <OUTPUT_DLL> -Xlinker /implib:<LIBRARY>.lib -Xlinker /pdb:<LIBRARY>.pdb -Xlinker /version:0.0 <OBJECTS> <LINK_LIBS>
-D_DLL -D_MT are missing from the CU compilation command, as are other important compilation flags such as -O2 -g. Notice also that there are now two --cuda-gpu-arch=sm_30 flags; I guess one of them is the one I added manually in CMAKE_CUDA_FLAGS (to make the compiler test pass) and the other was added by CMake somehow.
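Until the CU command line picks up these flags by itself, one possible workaround is to mirror them onto CUDA sources by hand. The sketch below is not a fix for the underlying problem. It assumes a target named example_kernels (hypothetical) and copies the flag values from the CXX command line reported above. CMAKE_CUDA_FLAGS_RELWITHDEBINFO is the standard per-configuration flags variable for the CUDA language, and $<COMPILE_LANGUAGE:CUDA> restricts the extra options to CU files.

```cmake
# Workaround sketch: give CUDA sources the same RelWithDebInfo flags that
# the CXX command line shows. Values are copied from the report.
set(CMAKE_CUDA_FLAGS_RELWITHDEBINFO "-O2 -g -DNDEBUG")

# example_kernels is a hypothetical target name.
# Select the dynamic release runtime (msvcrt) for CU files only, matching
# what the CXX files already use.
target_compile_options(example_kernels PRIVATE
  "$<$<COMPILE_LANGUAGE:CUDA>:-D_DLL;-D_MT;-Xclang;--dependent-lib=msvcrt>")
```

The generator expression is quoted so that the semicolon-separated options stay inside one argument. Because the options are guarded by the compile language, the CXX command line is left unchanged.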