FindDoxygen is incapable of disambiguating multiple installs by version alone
Problem
When trying to find a specific version of Doxygen via find_package(Doxygen x.y.z) on a system with multiple versions installed, the install that gets selected has no relation to the requested version, and it is merely a coincidence if the correct one is found. The version argument is effectively ignored during the search; it is only used afterwards, to throw an error if the selected install's version does not match it.
Example
Installs:
- (1.9.1) C:\ProgramData\chocolatey\bin\doxygen.exe [Locatable via the Windows system 'PATH' environment variable]
- (1.9.4) C:\Program Files\doxygen\bin\doxygen.exe [Locatable via Doxygen's uninstaller key in the registry]
Script:
find_package(Doxygen 1.9.4)
In this case, which is the actual situation that led me to file this issue, the 1.9.1 install will be selected and CMake will then throw an error noting that the found version of the package was unsuitable, even though the 1.9.4 install is available and theoretically locatable by CMake. Uninstalling the 1.9.1 version allows CMake to correctly find the 1.9.4 install.
Cause
This occurs because of FindDoxygen.cmake's use of find_program() and its limitations.
FindDoxygen.cmake(439:449) [739446a9a1975367b2c9a6b105b3fde1656c725b]:
find_program(
    DOXYGEN_EXECUTABLE
    NAMES doxygen
    PATHS
        "[HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\doxygen_is1;Inno Setup: App Path]/bin"
        /Applications/Doxygen.app/Contents/Resources
        /Applications/Doxygen.app/Contents/MacOS
        /Applications/Utilities/Doxygen.app/Contents/Resources
        /Applications/Utilities/Doxygen.app/Contents/MacOS
    DOC "Doxygen documentation generation tool (http://www.doxygen.org)"
)
find_program only returns a single match, and as noted in its documentation regarding its search procedure,
- Search the standard system environment variables. This can be skipped if NO_SYSTEM_ENVIRONMENT_PATH is passed or by setting the CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH to FALSE.
- The directories in PATH itself.
- On Windows hosts no extra search paths are included
has a higher precedence than
- Search the paths specified by the PATHS option or in the short-hand version of the command. These are typically hard-coded guesses.
So in the above example, even though an install of the desired Doxygen version (1.9.4) is available and locatable by CMake, it will never be found as long as the 1.9.1 install is present. This is because this use of find_program will always locate the 1.9.1 install first via the 'PATH' environment variable and set DOXYGEN_EXECUTABLE to its path without reaching the PATHS portion of the search procedure. Essentially the one install shadows the other.
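The shadowing can be observed directly with a small script run via cmake -P. This is only a sketch: the variable names are made up for illustration, and the hard-coded directory from the example stands in for the registry lookup that FindDoxygen.cmake actually uses.

```cmake
cmake_minimum_required(VERSION 3.21)  # NO_CACHE was added in CMake 3.21

# Default search procedure: the directories in PATH are consulted before
# the PATHS guesses, so the first doxygen on PATH is returned.
find_program(DOXYGEN_DEFAULT_SEARCH
    NAMES doxygen
    PATHS "C:/Program Files/doxygen/bin"  # assumed install location (from the example)
    NO_CACHE
)

# Restricted search: NO_DEFAULT_PATH skips every step except HINTS and PATHS,
# so only the hard-coded guess is considered.
find_program(DOXYGEN_PATHS_ONLY
    NAMES doxygen
    PATHS "C:/Program Files/doxygen/bin"
    NO_DEFAULT_PATH
    NO_CACHE
)

message(STATUS "Default search: ${DOXYGEN_DEFAULT_SEARCH}")
message(STATUS "PATHS only:     ${DOXYGEN_PATHS_ONLY}")
```

On a system laid out like the example, the first call would be expected to report the 1.9.1 install from PATH and the second the 1.9.4 install, even though both calls were given the same PATHS.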
Workarounds
I consider these workarounds because of find_package's advertised capabilities, which I will touch on in a moment.
- Adding the main Doxygen install path to Doxygen_ROOT (though I can’t seem to get this to work)
- Adding the main Doxygen install path to CMAKE_PREFIX_PATH
- Adding the main Doxygen install path to CMAKE_PROGRAM_PATH
- Writing your own FindDoxygen.cmake, which likely would be nearly the same as the official one, but call find_program with different arguments to change the search order such that the desired Doxygen install is found first
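As a rough sketch of the CMAKE_PREFIX_PATH and CMAKE_PROGRAM_PATH options, assuming the 1.9.4 install location from the example above (adjust the paths for your system):

```cmake
cmake_minimum_required(VERSION 3.15)  # list(PREPEND) requires CMake 3.15
project(DocsExample NONE)             # hypothetical project name

# Option A: add the install prefix. find_program() looks in <prefix>/bin
# and <prefix>/sbin for each entry of CMAKE_PREFIX_PATH.
list(PREPEND CMAKE_PREFIX_PATH "C:/Program Files/doxygen")

# Option B: add the directory that contains the executable itself.
# Either option alone should be enough; both are shown for comparison.
list(PREPEND CMAKE_PROGRAM_PATH "C:/Program Files/doxygen/bin")

# Both variables are searched before the PATH environment variable,
# so the 1.9.4 install is reached before the 1.9.1 one.
find_package(Doxygen 1.9.4 REQUIRED)
```

The same variables can be passed on the command line instead, for example with -DCMAKE_PREFIX_PATH=... at configure time. Note that DOXYGEN_EXECUTABLE is cached, so a build directory that already found the wrong install needs that cache entry cleared before the new search order takes effect.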
Preamble
I initially went down the rabbit hole of discovering this limitation because the documentation of a library of mine suddenly was only being built erratically on Windows via GitHub Actions, despite no changes to my workflow. To make a long story short, the cause was a small update to their Windows Server 2022 image: MinGW was updated from 8.1.0 to 11.2.0, and the latter contains a build of Doxygen while the former did not. The two are registered within the system image exactly as in my example, so while the new image was being rolled out, runners on the old image would correctly find the Doxygen 1.9.4 that I installed separately and build my documentation, while runners on the newer image would only find 1.9.1 and fail to build it because that older version lacks features I use. I was then further confused when adding "1.9.4" to my find_package(Doxygen ...) call did not fix the issue.
I mention this to demonstrate how the current implementation of the Find module can cause quite fickle behavior due to a large number of factors that may be out of a user's control, and because I can only imagine that a good number of other people are affected by this issue, as it is now relevant to basically everyone using GitHub Actions on Windows together with Doxygen and CMake. While maybe not a massive group, I'd still figure it's quite sizeable.
Additionally, this likely affects other executable-based "packages" and Find modules.
Discussion
I understand why this is technically challenging. find_program is only tangentially related to the find_package process and has no concept of package versions; it simply finds the first match according to its search procedure and sets a variable, leaving it to any wrapping Find module to handle versioning. However, its current behavior imposes significant challenges upon such a module because of that ignorance, as the module would need to call find_program many, many times with a variety of arguments (and deal with its variable caching) to get around this limitation.
While one may argue that the above options listed under "Workarounds" are intended to handle this situation and that they should simply be used, I certainly disagree, as to me those only make sense to use when I need to direct CMake how to choose between two different installs of the same version of a package. Regardless of the technical implications, from the user's perspective the current way this is handled is undesirable and breaks expectations. find_package's documentation and behavior make it clear that it should be capable of disambiguating multiple installs of the same package by version alone. For example, on a system with numerous versions of Qt installed, I can throw them all into CMAKE_PREFIX_PATH as an environment variable in any order, configure a CMakeLists.txt that contains find_package(Qt6 6.3.1) and know the right version will be located.
While I do understand that to some extent this relies on the packages in question to have well-formed package configs/well-written Find modules, and that Kitware could simply state "we can't control every third-party project, you need to use the workaround options", here FindDoxygen.cmake is a first-party module, and so I think effort should be made to fix this limitation in at least this case. The larger issue of lack of consistency with find_package's version argument when the "package" in question is actually a program could be forked to a different discussion.
Simply put, as a CMake user, I think it is reasonable to assume that if I have multiple installs of a package on my system that differ in version, each of which I know CMake is capable of finding individually, then simply calling find_package(<PackageName> <Version>) should provide me with the correct install, with the methodology of the search abstracted from me, since theoretically the provided version is all CMake should need to make the right selection.
This is currently not the case for Doxygen.
Solutions
I don't have much when it comes to this section, as I largely just wanted to bring (or likely reintroduce) this to everyone's attention for discussion in hopes that this situation can eventually be improved.
The "basic" solution is to rewrite the FindDoxygen.cmake module to work around this limitation, but due to the current restrictions of find_program this would be quite cumbersome and inefficient. The current arguments to the find_program call would instead have to be stored in a list, with the call then being made multiple times, in part iterating said list and in part utilizing the many NO_* options of the function to essentially run through its search procedure one step at a time. Every time a match is found, the Find module would then check whether the version matches, as it does now via find_package_handle_standard_args, and if it doesn't, move on to the next awkward use of find_program. This would also necessitate use of the costly NO_CACHE option for every call.
Certainly less than ideal.
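To give an idea of what that would look like, here is a simplified sketch. It is not a proposed patch: the function find_doxygen_at_least and all variable names are hypothetical, it walks a flat list of directories with NO_DEFAULT_PATH rather than toggling each NO_* option individually, and it assumes that a version request without EXACT means "at least this version".

```cmake
cmake_minimum_required(VERSION 3.21)  # NO_CACHE was added in CMake 3.21

# Hypothetical helper, not part of CMake or FindDoxygen.cmake.
# Sets <out_var> to the first doxygen whose version is at least <min_version>,
# or to <out_var>-NOTFOUND if no candidate qualifies.
function(find_doxygen_at_least min_version out_var)
    # Candidate directories in priority order: PATH entries first, then any
    # extra directories from the caller (standing in for the PATHS guesses).
    file(TO_CMAKE_PATH "$ENV{PATH}" _dirs)
    list(APPEND _dirs ${ARGN})

    set(_result "${out_var}-NOTFOUND")
    foreach(_dir IN LISTS _dirs)
        if("${_dir}" STREQUAL "")
            continue()
        endif()

        # With NO_CACHE the search is skipped when the variable is already
        # set, so it has to be cleared before every call.
        unset(_candidate)
        find_program(_candidate NAMES doxygen PATHS "${_dir}" NO_DEFAULT_PATH NO_CACHE)
        if(NOT _candidate)
            continue()
        endif()

        # Ask the candidate for its version and compare it with the request.
        execute_process(
            COMMAND "${_candidate}" --version
            OUTPUT_VARIABLE _version_output
            RESULT_VARIABLE _exit_code
            OUTPUT_STRIP_TRAILING_WHITESPACE
            ERROR_QUIET
        )
        if(_exit_code EQUAL 0 AND _version_output MATCHES "^([0-9]+\\.[0-9]+\\.[0-9]+)")
            if("${CMAKE_MATCH_1}" VERSION_GREATER_EQUAL "${min_version}")
                set(_result "${_candidate}")
                break()
            endif()
        endif()
    endforeach()

    set(${out_var} "${_result}" PARENT_SCOPE)
endfunction()

# Usage with the install location from the example above.
find_doxygen_at_least(1.9.4 MY_DOXYGEN "C:/Program Files/doxygen/bin")
message(STATUS "Selected Doxygen: ${MY_DOXYGEN}")
```

Even in this reduced form, every candidate costs a separate find_program call plus a process launch, which is the overhead described above.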
Since the crux of the issue is the current behavior of find_program, I think a more reasonable approach would be to introduce a mode of find_program, or a similar function, that provides slightly finer control over what it returns. Perhaps one in which it follows find_program's current search procedure, but instead of stopping upon one match it exhausts the entire procedure and stores each match in the provided variable as a list. This way, scripts like FindDoxygen.cmake can easily iterate the list and perform version checking on each match until the requested version is found (or not), which is much cleaner. The contents of the list would be ordered according to find_program's current search priority, such that simply checking the first element of the list will provide the same result as having used the function with the same arguments as it exists currently.