CUDA: Always extract CUDA Toolkit root from nvcc verbose output
Fixes #21750, #21763

NVCC can be provided by several different sources (NVIDIA HPC SDK, CUDA Toolkit, distro packages), each with a different directory layout. We therefore need to extract the CUDA Toolkit root from the compiler itself, which lets us support numerous scattered toolkit layouts. The NVIDIA HPC SDK in particular ships two copies of nvcc, one in `compilers/bin/` and one in `cuda/bin/`, so when `compilers/bin/nvcc` is used the existing Toolkit root logic fails.
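To illustrate the idea of asking the compiler rather than guessing from its install path, here is a minimal Python sketch (not the CMake code in this change). It assumes that nvcc's verbose output contains a line of the form `#$ TOP=<path>`; the function name `toolkit_root_from_nvcc_verbose` and the sample output with its `/opt/example` paths are hypothetical and used only for illustration.

```python
import posixpath
import re

def toolkit_root_from_nvcc_verbose(verbose_output):
    """Return the normalized toolkit root reported by nvcc, or None.

    Assumption: the verbose output has a line like '#$ TOP=<path>'.
    """
    match = re.search(r"^#\$ TOP=(.*)$", verbose_output, flags=re.MULTILINE)
    if match is None:
        return None
    # Drop surrounding whitespace (including a stray '\r') and quotes.
    top = match.group(1).strip().strip('"')
    # Collapse components such as 'bin/..' into a clean root path.
    return posixpath.normpath(top)

# Hypothetical sample of verbose output; real output has many more lines.
SAMPLE = """\
#$ _HERE_=/opt/example/cuda/bin
#$ TOP=/opt/example/cuda/bin/..
#$ PATH=/opt/example/cuda/bin:/usr/bin
"""

print(toolkit_root_from_nvcc_verbose(SAMPLE))
```

Because the root comes from what the compiler reports, the result does not depend on which directory the invoked `nvcc` binary happens to live in.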
Showing with 32 additions and 38 deletions