CUDA: Support PTX, fatbin and cubin targets when CUDA is enabled
The CUDA compilation trajectory includes:
- PTX intermediate representation code, immediate result of compiling a (possibly preprocessed) device-side CUDA source file
- cubin files, ELF files containing fully-compiled SASS assembly code for a certain NVIDIA GPU microarchitecture
- fatbin files, containers holding fully-compiled code for some number of microarchitectures, and possibly some PTX as well.
As CUDA is - or seems to be - a first-class citizen in CMake now, we should be able to define targets which are PTX, cubin or fatbin files, rather than jump through hoops such as custom target definitions to achieve that.
PTX is actually practically done already - if we now compile a device-code-only file like so:
    add_library(mylib OBJECT a.cu)
    set_property(TARGET mylib PROPERTY CUDA_PTX_COMPILATION ON)
then we'll only get a PTX file, except it'll be within CMakeFiles rather than a proper output location.
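To make the "proper output location" point concrete, here is a rough sketch of what one currently has to do to get the PTX out of CMakeFiles. It relies on `$<TARGET_OBJECTS:mylib>` expanding to the generated `.ptx` files for a PTX-compiled object library; the `mylib_ptx` target name and the `ptx` output directory are hypothetical, chosen only for illustration.

```cmake
cmake_minimum_required(VERSION 3.18)
project(ptx_demo LANGUAGES CUDA)

# Same target as above: every source is compiled to .ptx instead of an object file.
add_library(mylib OBJECT a.cu)
set_property(TARGET mylib PROPERTY CUDA_PTX_COMPILATION ON)

# Hypothetical output directory; any location outside CMakeFiles/ would do.
set(ptx_dir "${CMAKE_BINARY_DIR}/ptx")

# Copy the generated .ptx files to the chosen directory after mylib is built.
# COMMAND_EXPAND_LISTS splits the object list into separate arguments.
add_custom_target(mylib_ptx ALL
  COMMAND "${CMAKE_COMMAND}" -E make_directory "${ptx_dir}"
  COMMAND "${CMAKE_COMMAND}" -E copy "$<TARGET_OBJECTS:mylib>" "${ptx_dir}"
  COMMAND_EXPAND_LISTS
  VERBATIM)
add_dependencies(mylib_ptx mylib)
```

This works, but it is exactly the kind of extra plumbing a real PTX target type would make unnecessary.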
cubin and fatbin could be achieved either using properties of add_library() targets, or using a different command (add_cuda_binary_target maybe?). Implementation-wise, it's basically using -cubin and ensuring an appropriate --generate-code is specified (e.g. only one for cubin, I guess). Actually, maybe you don't even have to check the validity of --generate-code, and can just let the compiler/linker complain if they want to.
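For comparison, the custom-target workaround mentioned earlier might look roughly like the sketch below. It assumes `CMAKE_CUDA_COMPILER` is nvcc (not clang), and the `a_cubin` target name and the `sm_80` architecture are placeholders of mine, not something CMake prescribes.

```cmake
cmake_minimum_required(VERSION 3.18)
project(cubin_demo LANGUAGES CUDA)

# Hypothetical output path for the cubin.
set(cubin "${CMAKE_CURRENT_BINARY_DIR}/a.cubin")

# Invoke nvcc by hand: --cubin stops after producing a cubin, and a single
# --generate-code selects the one real architecture it is built for.
# (sm_80 is an assumed example architecture.)
add_custom_command(
  OUTPUT "${cubin}"
  COMMAND "${CMAKE_CUDA_COMPILER}" --cubin
          --generate-code arch=compute_80,code=sm_80
          -o "${cubin}" "${CMAKE_CURRENT_SOURCE_DIR}/a.cu"
  DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/a.cu"
  COMMENT "Compiling a.cu to a cubin"
  VERBATIM)

# Hypothetical target that drives the custom command as part of the default build.
add_custom_target(a_cubin ALL DEPENDS "${cubin}")
```

Note that none of the usual target machinery (include directories, compile definitions, per-target flags) applies here, since the nvcc command line is spelled out manually. That is the main argument for having CMake drive this itself.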