Running tests with `make test -jN` hangs with make 4.4+ & CMake 3.29.0+ if tests spawn a background process
Description:
When running the KeePassXC test suite (which uses `ctest`) on recent versions of make and CMake using `make test -jN`, I found that the test suite would hang at the end, after all tests had passed.
In particular, there was a `make` process that was stuck, and an `xclip` process running in the background. Killing that `xclip` process made the `make` process get unstuck and finish the test run.
This did not happen in the past. After further debugging, I found that the recent CMake 3.29.0+ integration with the make 4.4+ "FIFO style" jobserver was leaking a file descriptor to test processes (see the associated MR !9580 (merged)), which was causing `make` to hang waiting for the leaked file descriptors to be closed.
In particular, the conditions under which the hang happens are:
- make 4.4+ (for the "FIFO style" jobserver, which is available and the default since that version; see `--jobserver-style=fifo` in `man 1 make`).
- CMake 3.29.0+ (for the make jobserver integration).
- The tests are run using `make test -jN` (e.g. `make test -j2`).
  - It will not reproduce with either `ctest -j2` (no `make`) or `make test -j1` (no jobserver).
- A test should leave an "orphan" background process running.
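To check which jobserver style a process under `make` actually sees, one option is to inspect `MAKEFLAGS`. The sketch below assumes GNU make advertises the jobserver as `--jobserver-auth=fifo:PATH` (FIFO style) or `--jobserver-auth=R,W` (pipe style), as its manual describes. The function name `jobserver_style` is hypothetical, and the parsing is deliberately simplistic (it splits on whitespace).

```python
def jobserver_style(makeflags: str):
    """Classify the jobserver advertised in a MAKEFLAGS string.

    Returns ("fifo", path), ("pipe", "R,W") or ("none", None).
    Assumption: the jobserver is announced via --jobserver-auth=...
    """
    auth = None
    for word in makeflags.split():
        if word.startswith("--jobserver-auth="):
            # If the option appears several times, keep the last one.
            auth = word.split("=", 1)[1]
    if auth is None:
        return ("none", None)
    if auth.startswith("fifo:"):
        return ("fifo", auth[len("fifo:"):])
    return ("pipe", auth)
```

Calling `jobserver_style(os.environ.get("MAKEFLAGS", ""))` from inside a test would show whether that test is running under a FIFO-style jobserver, which is one of the conditions listed above.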
Steps to reproduce / Test case:
Here is a small test case to reproduce the issue as long as you have the right make+CMake versions (or, there's a Dockerfile based on Alpine Linux 3.20 which is currently affected):
cmake_make_jobserver_fifo_hang_reproducer.tar.gz
Run `./run.sh` and the test should pass, but the `make` process should hang instead of returning the prompt to the user.
Additional comments:
What happens is that the test spawns a `sleep` process in the background. If CMake leaks the jobserver FIFO file descriptor to that process, `make` will keep waiting for that file descriptor to be closed, instead of finishing.
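The general effect of a leaked descriptor can be illustrated outside of make and CMake. The following is a minimal sketch, not the code of either tool: it uses an anonymous pipe as a stand-in for the jobserver FIFO, and a short-lived Python child as a stand-in for the backgrounded `sleep`. The function name and its parameters are hypothetical. It is POSIX-only.

```python
import os
import select
import subprocess
import sys


def eof_visible_while_child_runs(leak: bool, child_seconds: float = 1.0) -> bool:
    """Return True if the reader sees EOF while the background child is alive."""
    read_fd, write_fd = os.pipe()
    # With leak=True the child inherits the write end, like a leaked descriptor.
    # With leak=False the child gets no copy (pipe fds are non-inheritable by default).
    child = subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({child_seconds})"],
        pass_fds=(write_fd,) if leak else (),
    )
    os.close(write_fd)  # the parent drops its own copy of the write end
    # EOF on the read end only shows up once every copy of the write end is closed.
    ready, _, _ = select.select([read_fd], [], [], 0.2)
    saw_eof = bool(ready) and os.read(read_fd, 1) == b""
    child.wait()
    os.close(read_fd)
    return saw_eof
```

Without the leak, the reader sees the descriptor closed immediately. With the leak, it keeps waiting until the background child exits, which mirrors the hang described above.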
The scenario I found in KeePassXC's test suite is a bit more convoluted but the essentials are the same. The suite includes some clipboard tests which use `xclip` when running on X11. The normal behaviour of `xclip` is to fork to the background and keep running as long as it owns the clipboard selection. The test suite does not kill the backgrounded `xclip`, which perhaps is not great, but also not a problem since it will exit once it loses the clipboard ownership. That `xclip` is the one that has an open leaked jobserver FIFO file descriptor and causes `make` to hang.
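For anyone trying to confirm which process is holding the FIFO open, a rough diagnostic on Linux is to scan `/proc/<pid>/fd`. This is only a sketch under the assumption that `/proc` is mounted and readable for the target process. The helper name `fifo_fds` is hypothetical.

```python
import os
import stat


def fifo_fds(pid="self"):
    """Return {fd: link target} for descriptors of `pid` that refer to FIFOs or pipes."""
    fd_dir = f"/proc/{pid}/fd"
    found = {}
    for name in os.listdir(fd_dir):
        path = os.path.join(fd_dir, name)
        try:
            # os.stat follows the /proc link to the underlying open file.
            if stat.S_ISFIFO(os.stat(path).st_mode):
                found[int(name)] = os.readlink(path)
        except OSError:
            continue  # the descriptor went away while scanning
    return found
```

Running `fifo_fds(pid)` against the PID of the leftover background process lists its FIFO and pipe descriptors. Anonymous pipes (for example redirected stdout) show up too, so the jobserver FIFO has to be picked out by its path.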
PS: Thanks to Robert Candau (@antiz), Arch's KeePassXC package maintainer, for figuring out that the issue went away when running make in sequential mode.