Clang/Windows: Race when using `cmake -E vs_link_exe` with llvm-mt
I am debugging some weird llvm test failures on our builder that popped up a few months ago. They are caused by an executable, generated during llvm's cmake configuration step, that cannot be executed, so the tests do not get the correct error strings on Windows. An example test failure can be seen at https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8832262407257821617/overview; it appeared after cmake switched to llvm-mt by default in commit https://github.com/Kitware/CMake/commit/b12aec6c8daa3e087e6d0fa0441f59622251eb46. At first I thought it was an llvm-mt.exe bug, but after I managed to create a minimal reproducer, it seems to be an issue with cmake's vs_link_exe feature.
How to reproduce the issue:
- Install Visual Studio, make sure Windows SDK 10 is also installed (I tested both VS2019 and VS2022, the free community version is OK).
- Build Clang/LLVM for Windows (or use the one built by our builders; you might have to manually symlink `lld.exe`, as some ZIP programs do not process symlinks on Windows correctly)
- Build CMake (or use the prebuilt from https://cmake.org/download/, I used
- Download the reproducer `llvm-mt-reproduce.zip` from this ticket and extract it.
- Assuming clang-cl.exe is placed under `C:\src\clang\windows-amd64\bin` and cmake.exe is placed under
`x64 Native Tools Command Prompt for VS` from the start menu.
- Navigate to the directory that contains the extracted files from the reproducer package, run:
    C:\src\llvm-mt-reproduce>run-with-cmake.bat
    ...
    C:\src\llvm-mt-reproduce>out\getErrc.exe
    Access is denied.
    C:\src\llvm-mt-reproduce>
getErrc.exe is a program generated during llvm's cmake configuration step to extract the correct error message strings for llvm's unit tests. When this step fails, all unit tests that depend on error messages fail, which causes the test failures we saw on our builders. The content of the batch file used in the reproducer package is listed below:
    SETLOCAL
    set CLANG_CL_PATH=C:\src\clang\windows-amd64\bin\clang-cl.exe
    set CMAKE_PATH=C:\src\cmake-3.23.0-rc2-windows-x86_64\bin\cmake.exe
    set MT_PATH=C:\src\clang\windows-amd64\bin\llvm-mt.exe
    ::set MT_PATH="C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64\mt.exe"
    set RC_PATH="C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64\rc.exe"
    ::set RC_PATH=C:\src\clang\windows-amd64\bin\llvm-rc.exe
    set LINKER_PATH=C:\src\clang\windows-amd64\bin\lld-link.exe
    %CLANG_CL_PATH% /nologo -TP /DWIN32 /D_WINDOWS /Zc:inline /Zc:__cplusplus /Zc:strictStrings /Oi /Zc:rvalueCast /Brepro /bigobj /W4 -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation /Gw -no-canonical-prefixes /MDd /Zi /Ob0 /Od /RTC1 -std:c++14 /showIncludes /Foout\getErrc.cpp.obj /Fdout\ -c -- .\getErrc.cpp
    %CMAKE_PATH% -E vs_link_exe --intdir=out --rc=%RC_PATH% --mt=%MT_PATH% --manifests -- %LINKER_PATH% /nologo out\getErrc.cpp.obj /out:out\getErrc.exe /implib:out\getErrc.lib /pdb:out\getErrc.pdb /version:0.0 /machine:x64 /STACK:10000000 /debug /INCREMENTAL /subsystem:console kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib
You can change the paths in the `run-with-cmake.bat` file if your paths do not match mine.
On my machine, the `getErrc.exe` generated by cmake's vs_link step cannot be executed: it always results in a "this app cannot run on your PC" error, and "Access is denied" is printed to the cmd. However, if you run `run-with-cmake.bat` again (to be precise, if you rerun the `cmake -E vs_link_exe ...` command from the batch file), the `out\getErrc.exe` generated by the second run works without any issues:
    C:\src\llvm-mt-reproduce> C:\src\cmake-3.23.0-rc2-windows-x86_64\bin\cmake.exe -E vs_link_exe --intdir=out --rc="C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64\rc.exe" --mt=C:\src\clang\windows-amd64\bin\llvm-mt.exe --manifests -- C:\src\clang\windows-amd64\bin\lld-link.exe /nologo out\getErrc.cpp.obj /out:out\getErrc.exe /implib:out\getErrc.lib /pdb:out\getErrc.pdb /version:0.0 /machine:x64 /STACK:10000000 /debug /INCREMENTAL /subsystem:console kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib
    C:\src\llvm-mt-reproduce>out\getErrc.exe
    no such file or directory;is a directory;invalid argument;permission denied
    C:\src\llvm-mt-reproduce>
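The first-run versus second-run difference shown above could be checked repeatedly with a small driver script. The following is only a sketch: the helper names `runs_ok` and `link_then_run` are hypothetical, and it assumes that a working `getErrc.exe` exits with status 0, which has not been verified here.

```python
import subprocess


def runs_ok(cmd):
    """Run `cmd` and report whether it started and exited with status 0."""
    try:
        return subprocess.run(cmd, capture_output=True).returncode == 0
    except OSError:
        # Raised when the OS refuses to start the program at all,
        # e.g. a missing file or a loader/permission error.
        return False


def link_then_run(link_cmd, exe_cmd, attempts=2):
    """Repeat the link step, then try the freshly linked program each time.

    Returns one bool per attempt: True if the linked program ran successfully.
    Assumption: exit status 0 means the program works.
    """
    results = []
    for _ in range(attempts):
        if not runs_ok(link_cmd):
            raise RuntimeError("link step failed")
        results.append(runs_ok(exe_cmd))
    return results
```

Here `link_cmd` would be the full `cmake -E vs_link_exe ...` command from the batch file as a list of arguments, and `exe_cmd` would be `[r"out\getErrc.exe"]`. Starting from a fresh `out` directory that contains only the object file, a result of `[False, True]` would match the behavior described in this report.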
So it makes me suspect that there might be a data race here. When `cmake -E vs_link_exe ...` was invoked the first time, an incomplete manifest file that was still being written might have been presented to the linker and embedded into the final getErrc.exe, causing the error messages we saw. When the command was run a second time, the manifest file had already been generated correctly, so this time the executable has the correct manifest embedded. If this is the case, I think this is a bug on the cmake side and should be fixed.
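One way to probe this hypothesis would be to test whether the manifest files left in the intermediate directory are well-formed XML right after the first link, since a manifest cut off mid-write would normally fail to parse. A minimal sketch follows; the helper name `manifest_is_complete` is hypothetical, and which files under `out` are the relevant manifests is an assumption that would need to be checked against what cmake actually writes there.

```python
import sys
import xml.etree.ElementTree as ET


def manifest_is_complete(path):
    """Return True if the file at `path` parses as well-formed XML.

    A manifest that was truncated while being written normally fails
    to parse; an unreadable or missing file is also reported as False.
    """
    try:
        ET.parse(path)
    except (ET.ParseError, OSError):
        return False
    return True


if __name__ == "__main__":
    # Candidate manifest paths are passed on the command line; which
    # files to pass is an assumption to verify against the intdir contents.
    for p in sys.argv[1:]:
        status = "ok" if manifest_is_complete(p) else "INCOMPLETE OR UNREADABLE"
        print(p, status)
```

A well-formed file does not prove the absence of a race (the file could be complete by the time it is inspected), but a malformed one found right after a failing first link would support the explanation above.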
A few other observations:
- When I add the `--trace` flag to `cmake -E vs_link_exe ...` (so final command line looks like ) to get the command traces, the otherwise successful cmake invocation fails with these messages:
    Running with trace output on.
    CMake Warning: Ignoring extra path from command line: "advapi32.lib"
    CMake Error: Unknown argument --
    CMake Error: Run 'cmake --help' for all supported options.
Is this a cmake flag parser bug? I had to use Windows audit mode to record the command line flags in order to debug this failure.
- The feature set currently implemented by llvm-mt is very limited (only `/out` flags are implemented, others are not) compared to the original `mt.exe` provided by Windows MSVC. Is it too early for cmake to set llvm-mt by default on Windows? Can we switch it back to the