Wiser handling of CUDA toolkit vs installed driver version mismatch
Sometimes, the CUDA toolkit in use is newer than what the CUDA driver installed on the system fully supports. Example: driver 440.82, which works with CUDA 10.2, alongside a CUDA 11.2 toolkit manually installed to /usr/local/cuda.
In this case, compiling against the toolkit and then linking against the driver .so library (libcuda.so) sometimes fails (and even when linking succeeds, there may be runtime errors). See this SO question for an example.
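To make the mismatch concrete: CUDA encodes versions as integers of the form 1000*major + 10*minor (so 11.2 is 11020), and the driver API function cuDriverGetVersion reports the latest CUDA version the installed driver supports. Below is a minimal Python sketch of the comparison. The helper names are hypothetical, and the driver query assumes the driver library is loadable as libcuda.so.1 and that the call works without prior initialization.

```python
import ctypes
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Split CUDA's integer encoding (1000*major + 10*minor) into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def toolkit_is_too_new(toolkit_encoded: int, driver_encoded: int) -> bool:
    """True if the toolkit targets a newer CUDA version than the driver supports."""
    return decode_cuda_version(toolkit_encoded) > decode_cuda_version(driver_encoded)


def query_driver_cuda_version(libname: str = "libcuda.so.1") -> int:
    """Ask the installed driver which CUDA version it supports.

    Assumption: the driver library can be loaded under `libname`.
    Raises OSError if it cannot be loaded, RuntimeError if the call fails.
    """
    lib = ctypes.CDLL(libname)
    version = ctypes.c_int(0)
    status = lib.cuDriverGetVersion(ctypes.byref(version))
    if status != 0:  # 0 is CUDA_SUCCESS
        raise RuntimeError(f"cuDriverGetVersion failed with status {status}")
    return version.value
```

For the example above, assuming the 440.82 driver reports 10.2 (encoded as 10020), `toolkit_is_too_new(11020, 10020)` flags the CUDA 11.2 toolkit as too new for the driver.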
Now, CMake's behavior, at least as of version 3.13.4, is to not complain about any mismatch and hope for the best: it compiles against the toolkit and links against the system-installed driver library. This results in the linking error I just mentioned.
Now, the three possible approaches to this matter I can think of are:
- Declare an error on configuration.
- Use the stub library which comes with the CUDA toolkit, lib64/stubs/libcuda.so, and link against that. The linking will work, but the executable will (likely? certainly?) fail to run on the system, or run and report a version mismatch.
- Do what is done now.
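The three options amount to a policy choice over which libcuda.so ends up on the link line. The following Python sketch illustrates that decision; it is not CMake code. The policy names, the function name and the system driver path in the usage note are all hypothetical. Only the lib64/stubs/libcuda.so location comes from the list above.

```python
from enum import Enum
from pathlib import PurePosixPath
from typing import Tuple


class MismatchPolicy(Enum):
    ERROR = "error"    # option 1: fail at configuration time
    STUB = "stub"      # option 2: link against the toolkit's stub library
    IGNORE = "ignore"  # option 3: current behavior, link the system driver


def pick_driver_library(
    toolkit_root: str,
    toolkit_version: Tuple[int, int],
    driver_version: Tuple[int, int],
    system_libcuda: str,
    policy: MismatchPolicy,
) -> str:
    """Return the libcuda.so path to link against under the given policy."""
    toolkit_too_new = toolkit_version > driver_version
    if not toolkit_too_new or policy is MismatchPolicy.IGNORE:
        # No mismatch, or the mismatch is deliberately ignored.
        return system_libcuda
    if policy is MismatchPolicy.STUB:
        return str(PurePosixPath(toolkit_root) / "lib64" / "stubs" / "libcuda.so")
    # MismatchPolicy.ERROR: report the real cause instead of a linker failure.
    raise RuntimeError(
        f"CUDA toolkit {toolkit_version[0]}.{toolkit_version[1]} is newer than "
        f"the driver supports ({driver_version[0]}.{driver_version[1]})"
    )
```

With a toolkit root of /usr/local/cuda, toolkit version (11, 2) and driver version (10, 2), the STUB policy yields /usr/local/cuda/lib64/stubs/libcuda.so, while ERROR raises with a message that names the version mismatch directly.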
I don't think the current behavior is appropriate - if nothing else, then at least because users may be confused regarding the cause of the linking failure.