Ninja: Generate step performance very slow on Windows
My CMake project takes over 60 seconds to complete the generate step on Windows 10 with the Ninja generator, and the same with the Ninja Multi-Config generator.
The same project takes roughly 5 seconds on a Linux machine.
I have corroborating reports from more than 5 other developers in my organization, who see roughly the same numbers on their Windows and Linux machines.
Using the Visual Studio 2022 performance profiling features, I was able to get a report claiming that >50% of the whole cmake.exe execution is spent inside of cmNinjaTargetGenerator::WriteObjectBuildStatement(...) when generating with the Ninja Multi-Config generator.
I profiled this with cmake-3.25.0-rc3 (https://github.com/Kitware/CMake/releases/download/v3.25.0-rc3/cmake-3.25.0-rc3.zip) with the "x64-RelWithDebugInfo" settings in the Visual Studio 2022 CMake project mode.
I added
add_link_options(LINKER:/PROFILE)
on line 16 of CMakeLists.txt, between
unset(CMAKE_USER_MAKE_RULES_OVERRIDE_C)
and
# FIXME: This block should go away after a transition period.
if(MSVC AND NOT CMAKE_VERSION VERSION_LESS 3.15)
Here's an attempt at copying the data out of the profiler. Unfortunately, Microsoft removed the feature that could be used to export the profiler data in a complete form, so this is the best I can do.
|Function Name|Total CPU \[unit, %\]|Self CPU \[unit, %\]|Module|Category|
|-|-|-|-|-|
| + cmake \(PID: 19448\)|39755 \(100.00%\)|213 \(0.54%\)|Multiple modules||
|\| + \[External Call\] ntdll.dll!0x00007ffeac3c26a1|39492 \(99.34%\)|0 \(0.00%\)|ntdll|UI \| IO \| Kernel \| Security \| Runtime|
|\|\| + static int \_\_scrt\_common\_main\_seh\(\)|39491 \(99.34%\)|0 \(0.00%\)|cmake|IO \| Kernel \| Security \| Runtime|
|\|\|\| + main|39491 \(99.34%\)|0 \(0.00%\)|cmake|IO \| Kernel \| Security \| Runtime|
|\|\|\|\| + static int do\_cmake\(int, const char \* \*\)|39488 \(99.33%\)|0 \(0.00%\)|cmake|IO \| Kernel \| Security \| Runtime|
|\|\|\|\|\| + public: int \_\_cdecl cmake::Run\(class std::vector\<...\> const &,bool\)|39256 \(98.74%\)|0 \(0.00%\)|cmake|IO \| Kernel \| Security \| Runtime|
|\|\|\|\|\|\| + public: int \_\_cdecl cmake::Generate\(void\)|37121 \(93.37%\)|0 \(0.00%\)|cmake|IO \| Kernel \| Security \| Runtime|
|\|\|\|\|\|\|\| + protected: virtual void \_\_cdecl cmGlobalNinjaGenerator::Generate\(void\)|32640 \(82.10%\)|0 \(0.00%\)|cmake|IO \| Kernel \| Security \| Runtime|
|\|\|\|\|\|\|\|\| + public: virtual void \_\_cdecl cmGlobalGenerator::Generate\(void\)|32426 \(81.56%\)|1 \(0.00%\)|cmake|IO \| Kernel \| Security \| Runtime|
|\|\|\|\|\|\|\|\|\| + public: virtual void \_\_cdecl cmLocalNinjaGenerator::Generate\(void\)|30313 \(76.25%\)|2 \(0.01%\)|cmake|IO \| Kernel \| Runtime|
|\|\|\|\|\|\|\|\|\|\| + public: virtual void \_\_cdecl cmNinjaNormalTargetGenerator::Generate\(class std::basic\_string\<...\> const &\)|29515 \(74.24%\)|0 \(0.00%\)|cmake|IO \| Kernel \| Runtime|
|\|\|\|\|\|\|\|\|\|\|\| + protected: void \_\_cdecl cmNinjaTargetGenerator::WriteObjectBuildStatements\(class std::basic\_string\<...\> const &,class std::basic\_string\<...\> const &,bool\)|22011 \(55.37%\)|1 \(0.00%\)|cmake|IO \| Kernel \| Runtime|
|\|\|\|\|\|\|\|\|\|\|\|\| + protected: void \_\_cdecl cmNinjaTargetGenerator::WriteObjectBuildStatement\(class cmSourceFile const \*,class std::basic\_string\<...\> const &,class std::basic\_string\<...\> const &,bool\)|21403 \(53.84%\)|17 \(0.04%\)|cmake|IO \| Kernel \| Runtime|
|\|\|\|\|\|\|\|\|\|\|\|\|\| - public: void \_\_cdecl cmGlobalNinjaGenerator::WriteBuild\(class std::basic\_ostream\<...\> &,class cmNinjaBuild const &,int,bool \*\)|10526 \(26.48%\)|16 \(0.04%\)|cmake|IO \| Kernel \| Runtime|
|\|\|\|\|\|\|\|\|\|\|\|\|\| - protected: class std::basic\_string\<...\> \_\_cdecl cmNinjaTargetGenerator::ComputeFlagsForObject\(class cmSourceFile const \*,class std::basic\_string\<...\> const &,class std::basic\_string\<...\> const &\)|5503 \(13.84%\)|6 \(0.02%\)|cmake|Runtime|
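Since roughly half of the time in WriteObjectBuildStatement is spent under cmGlobalNinjaGenerator::WriteBuild, which writes to a std::basic_ostream, this looks like a particularly pathological edge case in std::fstream, assuming I'm right about where the time is going.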
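One way to test the std::fstream hypothesis outside of CMake is a small standalone microbenchmark. The sketch below is not CMake's code: `WriteDirect`, `WriteBuffered`, `ReadAll`, the file names, and the shape of the generated lines are all hypothetical and chosen only for illustration. It writes many small chunks straight to a `std::ofstream`, then writes the same text after accumulating it in memory, and returns the elapsed seconds for each so the two can be compared on Windows and Linux.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: stream `count` small build-like lines directly
// into the file, one formatted insertion at a time.
double WriteDirect(const std::string& path, std::size_t count)
{
  const auto start = std::chrono::steady_clock::now();
  std::ofstream out(path, std::ios::binary);
  for (std::size_t i = 0; i < count; ++i) {
    out << "build obj" << i << ".o: CXX_COMPILER src" << i << ".cpp\n";
  }
  out.close();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical helper: build the same text in memory first, then hand it
// to the file stream in a single write() call.
double WriteBuffered(const std::string& path, std::size_t count)
{
  const auto start = std::chrono::steady_clock::now();
  std::ostringstream buffer;
  for (std::size_t i = 0; i < count; ++i) {
    buffer << "build obj" << i << ".o: CXX_COMPILER src" << i << ".cpp\n";
  }
  const std::string text = buffer.str();
  std::ofstream out(path, std::ios::binary);
  out.write(text.data(), static_cast<std::streamsize>(text.size()));
  out.close();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical helper: read a whole file back, to confirm both variants
// produce identical output.
std::string ReadAll(const std::string& path)
{
  std::ifstream in(path, std::ios::binary);
  std::ostringstream content;
  content << in.rdbuf();
  return content.str();
}
```

Calling both functions with a large `count` and comparing the returned times on each platform would show whether the write pattern alone accounts for the gap. If the two timings are close, the cost is more likely elsewhere.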
Note that my cmake project is producing >50MB CMakeFiles/impl-Release.ninja and CMakeFiles/impl-Debug.ninja, so even if there's something really terrible about std::fstream, CMake is trying to generate a huge amount of data, which could be greatly reduced by making more aggressive use of Ninja's variable substitution features.
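To illustrate the size argument, here is a rough sketch that emits the same build statements twice: once with the full flag string repeated in every statement, and once with the flags declared a single time as a top-level Ninja variable and referenced via `$common_flags`. This is not how CMake generates its files. The rule name `CXX_COMPILER`, the variable names `FLAGS` and `common_flags`, and the helper functions are assumptions made up for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: one Ninja build line for object number `i`.
std::string BuildLine(std::size_t i)
{
  const std::string n = std::to_string(i);
  return "build obj" + n + ".o: CXX_COMPILER src" + n + ".cpp\n";
}

// Repeat the complete flag string inside every build statement.
std::string EmitInlineFlags(std::size_t count, const std::string& flags)
{
  std::string out;
  for (std::size_t i = 0; i < count; ++i) {
    out += BuildLine(i);
    out += "  FLAGS = " + flags + "\n";
  }
  return out;
}

// Declare the flags once at the top level, then reference the variable
// from each build statement using Ninja's $name substitution.
std::string EmitSharedFlags(std::size_t count, const std::string& flags)
{
  std::string out = "common_flags = " + flags + "\n";
  for (std::size_t i = 0; i < count; ++i) {
    out += BuildLine(i);
    out += "  FLAGS = $common_flags\n";
  }
  return out;
}
```

With a long flag string and many object files, the second form grows by a short fixed reference per statement instead of the full flag text, so the saving scales with both the flag length and the number of sources.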