CUDA::cuda_driver soname
Hi! I'm experiencing soname mismatch issues when linking a target that transitively depends on CUDA::cuda_driver while using the stub driver library at build time.
Context
At least in Nixpkgs, when building a package that asks to link libcuda.so directly, we build it against the stub libraries that NVIDIA distributes together with cudart:
❯ tar -tJf - < <(wget -O- -q https://developer.download.nvidia.com/compute/cuda/redist/cuda_cudart/linux-x86_64/cuda_cudart-linux-x86_64-11.8.89-archive.tar.xz) | grep libcuda.so
cuda_cudart-linux-x86_64-11.8.89-archive/lib/stubs/libcuda.so
We later strip and substitute the runpaths so that the driver is looked up in a dedicated impure location.
CMake recording libcuda.so.1 in DT_NEEDED rather than libcuda.so presents a problem for us: there's no file with a versioned soname in the upstream tarball, and we can't (won't) allow the build to see the real driver. We could maybe take on the burden of generating the symlinks, but first I'd like to verify that there isn't an error in the previous steps.
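For reference, the symlink workaround I have in mind would look roughly like the sketch below. The helper name `add_soname_symlink` is made up for illustration, and the demo runs against a throwaway directory holding an empty placeholder file rather than the real stub, so it only shows the shape of the fix-up, not a tested packaging step.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: create a versioned alias next to an unversioned stub.
# $1: directory holding the stub, $2: unversioned file name, $3: versioned soname.
add_soname_symlink() {
  local dir="$1" name="$2" soname="$3"
  [ -e "$dir/$name" ] || { echo "missing $dir/$name" >&2; return 1; }
  # Relative link, so it stays valid if the directory is moved.
  ln -sfn "$name" "$dir/$soname"
}

# Demo: a temporary directory standing in for lib/stubs from the cuda_cudart
# archive, with an empty file in place of the real stub library.
stubs="$(mktemp -d)/lib/stubs"
mkdir -p "$stubs"
: > "$stubs/libcuda.so"

add_soname_symlink "$stubs" libcuda.so libcuda.so.1
readlink "$stubs/libcuda.so.1"   # prints: libcuda.so
```

With such a link in place, the linker should be able to find libcuda.so.1 in the stubs directory, provided that directory is on its search path (for example via -rpath-link, as the warning in the log suggests).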
An example of where we encounter this issue: https://github.com/ggerganov/llama.cpp/pull/4606/files#diff-1e7de1ae2d059d21e1dd75d5812d5a34b0222cef273b7c3a2af62eb747f9d20aR305-R306
llama-cpp-cuda-f08e03e-dirty> : && /nix/store/...-gcc-wrapper-11.4.0/bin/g++ -O3 -DNDEBUG tests/CMakeFiles/test-tokenizer-0-falcon.dir/test-tokenizer-0-falcon.cpp.o -o bin/test-tokenizer-0-falcon common/libcommon.a libllama.so && :
llama-cpp-cuda-f08e03e-dirty> /nix/store/...-binutils-2.40/bin/ld: warning: libcuda.so.1, needed by libllama.so, not found (try using -rpath or -rpath-link)
llama-cpp-cuda-f08e03e-dirty> /nix/store/...-binutils-2.40/bin/ld: libllama.so: undefined reference to `cuMemCreate'
llama-cpp-cuda-f08e03e-dirty> /nix/store/...-binutils-2.40/bin/ld: libllama.so: undefined reference to `cuMemAddressReserve'
llama-cpp-cuda-f08e03e-dirty> /nix/store/...-binutils-2.40/bin/ld: libllama.so: undefined reference to `cuMemSetAccess'
llama-cpp-cuda-f08e03e-dirty> /nix/store/...-binutils-2.40/bin/ld: libllama.so: undefined reference to `cuDeviceGet'
llama-cpp-cuda-f08e03e-dirty> /nix/store/...-binutils-2.40/bin/ld: libllama.so: undefined reference to `cuGetErrorString'
llama-cpp-cuda-f08e03e-dirty> /nix/store/...-binutils-2.40/bin/ld: libllama.so: undefined reference to `cuDeviceGetAttribute'
llama-cpp-cuda-f08e03e-dirty> /nix/store/...-binutils-2.40/bin/ld: libllama.so: undefined reference to `cuMemMap'
llama-cpp-cuda-f08e03e-dirty> /nix/store/...-binutils-2.40/bin/ld: libllama.so: undefined reference to `cuMemGetAllocationGranularity'
llama-cpp-cuda-f08e03e-dirty> collect2: error: ld returned 1 exit status
https://gist.github.com/SomeoneSerge/c93b43b19607a9217ddb79d26b0f7d77
Thanks