CUDA: Support PTX, fatbin and cubin targets when CUDA is enabled
The CUDA compilation trajectory includes:
- PTX intermediate representation code, the immediate result of compiling a (possibly preprocessed) device-side CUDA source file
- cubin files, containing fully-compiled SASS assembly code for a certain NVIDIA GPU microarchitecture
- fatbin files, container files bundling fully-compiled code for some number of microarchitectures, and possibly some PTX as well.
As CUDA is - or seems to be - a first-class citizen in CMake now, we should be able to define targets which are PTX, cubin or fatbin files, rather than jump through hoops such as custom target definitions to achieve that.
PTX is actually practically done already - if we now compile a device-code-only file like so:
add_library(mylib OBJECT a.cu)
set_property(TARGET mylib PROPERTY CUDA_PTX_COMPILATION ON)
then we'll only get a PTX file, except it'll be within CMakeFiles rather than a proper output location.
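As a minimal sketch of the workaround this currently requires, the generated PTX can be copied out of CMakeFiles with a custom target. The target name mylib_ptx and the ptx/ output directory are assumptions made for illustration, not something CMake provides:

```cmake
cmake_minimum_required(VERSION 3.18)
project(ptx_sketch LANGUAGES CUDA)

# Object library whose "objects" are PTX files instead of .o files.
add_library(mylib OBJECT a.cu)
set_property(TARGET mylib PROPERTY CUDA_PTX_COMPILATION ON)

# Hypothetical helper target: copy the PTX files to a stable location.
# The destination directory is an arbitrary choice for this sketch.
set(ptx_dir "${CMAKE_BINARY_DIR}/ptx")
add_custom_target(mylib_ptx ALL
  COMMAND "${CMAKE_COMMAND}" -E make_directory "${ptx_dir}"
  COMMAND "${CMAKE_COMMAND}" -E copy $<TARGET_OBJECTS:mylib> "${ptx_dir}"
  COMMAND_EXPAND_LISTS
  VERBATIM)

# Make sure the PTX files exist before they are copied.
add_dependencies(mylib_ptx mylib)
```

This works, but it is exactly the kind of extra boilerplate that a proper PTX target with a configurable output location would make unnecessary.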
cubin and fatbin could be achievable either using properties of add_executable()/add_library() targets, or using a different command (add_cuda_binary_target maybe?). Implementation-wise, it's basically using -fatbin or -cubin and ensuring an appropriate --generate-code is specified (e.g. only one for cubin, I guess). Actually, maybe you don't even have to check the --generate-code validity, and can just let the compiler/linker complain if they want to.
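For reference, here is a rough sketch of the compiler invocations such targets would boil down to, written as the custom commands one has to use today. It assumes nvcc is the CUDA compiler; the file names, the target name cuda_binaries, and the chosen architectures (sm_80, sm_90) are placeholders for illustration only:

```cmake
cmake_minimum_required(VERSION 3.18)
project(cuda_binary_sketch LANGUAGES CUDA)

set(src "${CMAKE_CURRENT_SOURCE_DIR}/a.cu")
set(cubin "${CMAKE_CURRENT_BINARY_DIR}/a.cubin")
set(fatbin "${CMAKE_CURRENT_BINARY_DIR}/a.fatbin")

# cubin: a single --generate-code clause, targeting one real architecture.
add_custom_command(
  OUTPUT "${cubin}"
  COMMAND "${CMAKE_CUDA_COMPILER}" -cubin
          --generate-code arch=compute_80,code=sm_80
          -o "${cubin}" "${src}"
  DEPENDS "${src}"
  VERBATIM)

# fatbin: several --generate-code clauses; the last one embeds PTX
# (code=compute_90) alongside the compiled images.
add_custom_command(
  OUTPUT "${fatbin}"
  COMMAND "${CMAKE_CUDA_COMPILER}" -fatbin
          --generate-code arch=compute_80,code=sm_80
          --generate-code arch=compute_90,code=sm_90
          --generate-code arch=compute_90,code=compute_90
          -o "${fatbin}" "${src}"
  DEPENDS "${src}"
  VERBATIM)

# Hypothetical aggregate target so that both files are built by default.
add_custom_target(cuda_binaries ALL DEPENDS "${cubin}" "${fatbin}")
```

Note that nothing here validates the --generate-code combination; an invalid one would simply surface as an nvcc error at build time, which matches the "let the compiler complain" approach suggested above.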