CUDA: Separable compilation fails with Clang and native/all/all-major archs
CMake fails for targets with CUDA_SEPARABLE_COMPILATION set to ON when the CUDA compiler is Clang and CMAKE_CUDA_ARCHITECTURES is set to native, all, or all-major. In that case CMake does not resolve the special value to the actual architecture(s).
With Clang, any target that has CUDA_SEPARABLE_COMPILATION set to ON produces an error, because CMake uses the special value literally as the architecture name: sm_native when CMAKE_CUDA_ARCHITECTURES is native, and sm_all when it is all or all-major.
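A minimal project along the following lines should be enough to trigger the problem. This is only a sketch: the project name, the target name, and the source file names main.cu and kernel.cu are hypothetical placeholders, and the minimum version is chosen on the assumption that native requires CMake 3.24 (all and all-major require 3.23).

```cmake
# Hypothetical reproducer; file and target names are placeholders.
cmake_minimum_required(VERSION 3.24)
project(SepCompRepro LANGUAGES CXX CUDA)

# Any executable with at least two CUDA translation units will do.
add_executable(repro main.cu kernel.cu)

# Separable compilation is what routes the target through the device link rule.
set_target_properties(repro PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

# Configure with Clang as the CUDA compiler and one of the special values:
#   cmake -S . -B build -DCMAKE_CUDA_COMPILER=clang++ -DCMAKE_CUDA_ARCHITECTURES=native
#   cmake --build build
```

Replacing native with all or all-major in the configure command should exercise the sm_all variant of the same failure.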
For targets with CUDA_SEPARABLE_COMPILATION set to ON, CMake internally calls cmMakefileExecutableTargetGenerator::WriteDeviceExecutableRule in cmMakefileExecutableTargetGenerator.cxx (!5221), which contains the following branch:
    if (this->Makefile->GetSafeDefinition("CMAKE_CUDA_COMPILER_ID") == "Clang") {
      this->WriteDeviceLinkRule(commands, targetOutput);
    } else {
      this->WriteNvidiaDeviceExecutableRule(relink, commands, targetOutput);
    }
So when the environment is GNU, CMake calls WriteNvidiaDeviceExecutableRule, which does not use the CMAKE_CUDA_ARCHITECTURES variable. When the environment is Clang, it calls WriteDeviceLinkRule, which does use the value of CMAKE_CUDA_ARCHITECTURES. That code path does not resolve the actual architecture when CMAKE_CUDA_ARCHITECTURES is native, all, or all-major.
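Since the failure comes from the special values being used literally, giving the Clang code path explicit numeric architectures may sidestep it until this is fixed. The snippet below is an untested sketch of that workaround: the value 75 is only an example and would need to match the GPU actually in use, and the target name repro is a hypothetical placeholder.

```cmake
# Possible workaround (assumption, not verified against this bug):
# use explicit numeric architectures instead of native/all/all-major
# when the CUDA compiler is Clang. This must run after the CUDA language
# is enabled, because CMAKE_CUDA_COMPILER_ID is only defined from then on.
if(CMAKE_CUDA_COMPILER_ID STREQUAL "Clang")
  # "75" is an example value; replace it with the architecture(s) you target.
  set_target_properties(repro PROPERTIES CUDA_ARCHITECTURES "75")
endif()
```

Setting the CUDA_ARCHITECTURES target property overrides the default that the target would otherwise take from CMAKE_CUDA_ARCHITECTURES, so other targets and other compilers are left unchanged.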