FindBLAS/find_library super slow on Windows dev machines
On a typical Windows development system, we run cmake in a Visual Studio developer shell, which adds many entries to the PATH variable. For example, on my current machine, the boilerplate entries that get baked into PATH (both by Visual Studio and by a standard Windows 10 setup) look like this:
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\bin\HostX64\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft Visual Studio\Shared\Common\VSPerfCollectionTools\vs2019\\x64;C:\Program Files (x86)\Microsoft Visual Studio\Shared\Common\VSPerfCollectionTools\vs2019\;C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\\x64;C:\Program Files (x86)\Windows Kits\10\bin\\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\Windows\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;...custom paths...;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\WINDOWS\System32\OpenSSH\;...some custom paths....;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\Ninja;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\Linux\bin\ConnectionManagerExe
By default, FindBLAS invokes find_library for all 9 of its supported backends. Each invocation then searches through 20-30 different directories in PATH. On my system, this takes 4.3 seconds.
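One way to shrink the per-call cost is to stop FindBLAS from trying every backend. The following is a minimal sketch, assuming the project only needs one known backend; OpenBLAS is picked purely for illustration and is not something the setup described here is confirmed to use. BLA_VENDOR is the documented FindBLAS variable for restricting the vendor list.

```cmake
# Sketch only: restrict FindBLAS to a single backend so that it does not
# run find_library() for every vendor it knows about.
# "OpenBLAS" is an assumed choice; substitute the backend actually installed.
if(NOT DEFINED BLA_VENDOR)
  set(BLA_VENDOR OpenBLAS)
endif()

find_package(BLAS REQUIRED)
message(STATUS "BLAS libraries: ${BLAS_LIBRARIES}")
```

This only helps if BLA_VENDOR is set before the first find_package(BLAS) call, including calls made indirectly by other packages.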
Moreover, FindBLAS does not cache its search result, and most packages out there do not wrap their find_package() or find_dependency() calls in if(NOT XXX_FOUND) guards. Take the Ceres solver packaged in vcpkg as an example. Ceres has many dependency paths that lead to BLAS, and in some configurations (e.g., using SuiteSparse and BLAS) it invokes FindBLAS 4 times. This multiplies further if project A depends on project B and both of them use Ceres.
In my current project, this means FindBLAS gets called 8 times during CMake configuration, because find_package(Ceres) runs twice. This takes 30-40 seconds on my SSD-equipped Windows machine.
In the WSL environment (basically a Linux VM inside Windows), the same setup takes 3 seconds, because find_library does much less searching and the filesystem implementation is faster.
Is there a way to get this mess cleaned up? There are two major issues here:
- find_library(...) is always going to be slow on typical Windows dev machines, since PATH is really long there.
- Modules such as FindBLAS, which can get called multiple times through different dependency paths, do not cache their results or include-guard themselves.
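For the first issue, a possible workaround is to tell CMake not to consult the PATH environment variable while searching. The sketch below assumes CMake 3.16 or newer, where CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH is available, and assumes BLAS can still be found through other means (for example CMAKE_PREFIX_PATH or the toolchain's own search paths). It is an illustration, not a verified fix for the setup described above.

```cmake
# Sketch only: skip the directories that come from the PATH (and LIB)
# environment variables while FindBLAS runs its find_library() calls.
# Requires CMake >= 3.16.
set(CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH FALSE)

find_package(BLAS REQUIRED)

# Restore the default behaviour for the rest of the project.
unset(CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH)
```

The variable is deliberately unset afterwards so that unrelated find_program() or find_library() calls that do rely on PATH keep working.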
My current workaround is to override find_package(BLAS) via vcpkg's internal plumbing so that the search only runs when BLAS_FOUND is not already set.
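The override itself is not shown here, so the following is only a guess at what such a guard could look like. It assumes vcpkg's wrapper mechanism, in which the toolchain file includes a per-package vcpkg-cmake-wrapper.cmake (here assumed to live under share/blas/ in the installed tree), passes the original arguments in ARGS, and exposes the underlying command as _find_package. The file location and the exact guard are assumptions, not the actual override used.

```cmake
# Hypothetical share/blas/vcpkg-cmake-wrapper.cmake
# Sketch only: forward to the real find_package() the first time, and
# skip the expensive search on later calls once BLAS_FOUND is set.
if(NOT BLAS_FOUND)
  # ARGS holds the arguments of the original find_package(BLAS ...) call.
  _find_package(${ARGS})
endif()
```

BLAS_FOUND is an ordinary variable, so this guard only takes effect in scopes where an earlier call's result is visible. Calls made from unrelated directory scopes would still run the full search.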