CUDA: Separable compilation fails with Clang and native/all/all-major archs

CMake fails when CMAKE_CUDA_ARCHITECTURES is set to native, all, or all-major and the CUDA compiler is Clang, for targets that have CUDA_SEPARABLE_COMPILATION set to ON. In this case CMake does not detect the correct architecture value.

When the CUDA compiler is Clang and CMAKE_CUDA_ARCHITECTURES is set to native, all, or all-major, targets that have CUDA_SEPARABLE_COMPILATION set to ON fail with an error. The architecture is passed as sm_native when native is set, and as sm_all when all or all-major is set.

For targets that have CUDA_SEPARABLE_COMPILATION set to ON, CMake internally calls the "cmMakefileExecutableTargetGenerator::WriteDeviceExecutableRule" function in cmMakefileExecutableTargetGenerator.cxx (!5221 (diffs)):

  if (this->Makefile->GetSafeDefinition("CMAKE_CUDA_COMPILER_ID") == "Clang") {
    this->WriteDeviceLinkRule(commands, targetOutput);
  } else {
    this->WriteNvidiaDeviceExecutableRule(relink, commands, targetOutput);
  }

So when the compiler is not Clang (for example, in a GNU environment), CMake calls WriteNvidiaDeviceExecutableRule, which does not use the CMAKE_CUDA_ARCHITECTURES variable. When the compiler is Clang, it calls WriteDeviceLinkRule, which does use the value of CMAKE_CUDA_ARCHITECTURES. That path does not resolve the actual architecture value when CMAKE_CUDA_ARCHITECTURES is native, all, or all-major.
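
To illustrate why the special values cause trouble, here is a minimal C++17 sketch. It is not CMake's code: the helpers isSpecialCudaArch and smName are hypothetical names introduced only for this example. It assumes that a device-link step builds an "sm_NN" name from each architecture entry, and shows that native, all, and all-major would have to be resolved to concrete values before such a name can be formed.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not part of CMake: returns true for the special
// CMAKE_CUDA_ARCHITECTURES values named in this issue.
bool isSpecialCudaArch(const std::string& arch)
{
  return arch == "native" || arch == "all" || arch == "all-major";
}

// Hypothetical helper, not part of CMake: builds an "sm_NN" name only for
// concrete values such as "75". Special values yield no name, because they
// must first be resolved to real architectures.
std::optional<std::string> smName(const std::string& arch)
{
  if (arch.empty() || isSpecialCudaArch(arch)) {
    return std::nullopt;
  }
  return "sm_" + arch;
}
```

With these helpers, smName("75") yields "sm_75", while smName("native") yields no value instead of the invalid "sm_native" reported above.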

Edited Mar 12, 2024 by Brad King
