Combining XLF with FindCUDAToolkit + CUDA-Fortran
This is perhaps a bit vague without a minimum working reproducer, but I'm wondering if there are examples/documentation somewhere for successful CUDA-Fortran CMake builds with this kind of complexity:
- IBM XL toolchain (XLF, XLC) with mixed C/C++/Fortran code on Power9 architecture
- Use FindCUDAToolkit to pull in the appropriate CMake targets (like CUDA::cudart for linking)
- Use -qcuda to enable CUDA on the XLF side
- Generator is Ninja
I'm finding it really hard to get the build and especially linking (device linking in particular) to run smoothly. Is there a way to "force" device linking in a canonical CMake way on top of the host linking? It seems like CMake is perfectly happy to build without the device link at the moment.
Any hints on internal variables I might need to mess with here? Do I need enable_language(CUDA) even though I'm using CUDA-Fortran and never invoking nvcc directly? Enabling it hasn't had much effect so far.
I can get somewhat better behavior by using the xlcuf frontend for linking instead of xlf2008_r or xlc++, but I can never really get around issues like "cudaGetSymbolAddress failed with error code 13: invalid device symbol" produced by __xlcuf_init(), for example when mixing a C main() with CUDA-Fortran calls.
Perhaps this should work just fine and some docs are just needed, or perhaps there are some manual shims you might suggest?
To make matters more complex, the online docs for XLF seem to suggest that host and device Fortran constructs are not genuinely separated until the IR is produced. That forces me to apply flags such as -qcuda only at the set_source_files_properties level, because otherwise the builds fail with Fortran construct issues.
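
For reference, here is a minimal sketch of the kind of setup described above. The target name (app) and the file names (main.c, host.F90, kernels.F90) are hypothetical placeholders. The choice of Fortran as the linker language and passing -qcuda at link time are assumptions I have not confirmed to be sufficient on XL.

```cmake
cmake_minimum_required(VERSION 3.17)  # FindCUDAToolkit is available from CMake 3.17
project(xlf_cuda_sketch LANGUAGES C Fortran)

# Locates the CUDA toolkit and defines imported targets such as CUDA::cudart.
# This module does not require enable_language(CUDA).
find_package(CUDAToolkit REQUIRED)

# Hypothetical sources: only kernels.F90 contains CUDA-Fortran constructs.
add_executable(app main.c host.F90 kernels.F90)

# Apply -qcuda per source file, so host-only Fortran is compiled without it.
set_source_files_properties(kernels.F90 PROPERTIES COMPILE_OPTIONS "-qcuda")

# Assumption: link through the Fortran driver and pass -qcuda there as well.
set_target_properties(app PROPERTIES LINKER_LANGUAGE Fortran)
target_link_options(app PRIVATE -qcuda)

# Host-side CUDA runtime via the imported target from FindCUDAToolkit.
target_link_libraries(app PRIVATE CUDA::cudart)
```

As far as I understand, CMake's own device-link step is tied to targets built with the CUDA language, so nothing in this sketch triggers one. Any device linking would have to come from the XL driver itself.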