ExternalProject/FetchContent: Allow specifying both a git tag and commit sha
tl;dr: allow simultaneously specifying both a git tag and a commit SHA to get the best of both worlds of the existing GIT_TAG behaviors.
Problem:
Today, users of FetchContent (or ExternalProject) with git sources have two options, each with upsides and downsides.
- GIT_TAG to reference a branch or tag
- GIT_TAG to reference a commit SHA
Using a branch or tag makes the intent clear in the configuration (referencing v1.2.3 is much clearer than referencing de12fc34). It also allows the GIT_SHALLOW option to be used to reduce fetch time for large repositories.
Using a SHA allows quicker update checking and protects against unexpected changes in tags or branches (intentional or otherwise) breaking code.
The CMake recommendation is to use a SHA only. The result is FetchContent_Declare sections like:
FetchContent_Declare(imgui
GIT_REPOSITORY git://github.com/ocornut/imgui.git
GIT_TAG 58075c4414b985b352d10718b02a8c43f25efd7c # v1.80
GIT_SHALLOW OFF
)
Following good "hygiene", the maintainer is likely to list the tag name anyway (v1.80), but now it is just a comment and might drift (e.g. the SHA might be changed someday while the comment is missed and left out of date). GIT_SHALLOW is off (by default, but explicitly in that example), which can be painful on repositories with lots of history, or whose history has some large files in it (e.g., documentation images and the like).
Proposal:
I propose adding a new option to specify a git commit SHA. Let's call it GIT_COMMIT. This option can be specified in combination with the existing GIT_TAG:
FetchContent_Declare(imgui
GIT_REPOSITORY git://github.com/ocornut/imgui.git
GIT_TAG v1.80
GIT_COMMIT 58075c4414b985b352d10718b02a8c43f25efd7c
GIT_SHALLOW ON
)
The fetch behavior would be the existing tag behavior. The repository is fetched via the tag, and GIT_SHALLOW works as expected. In addition, after the fetch, CMake checks that the checked-out commit matches the GIT_COMMIT value, and raises a fatal error if it does not. This provides the same protections as fetching the commit directly, while also allowing the more semantically meaningful tag name to be specified and checked, and allowing GIT_SHALLOW to work.
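Until something like GIT_COMMIT exists, the post-fetch check described above can be approximated in project code. The following is only a minimal sketch, not the proposed implementation: IMGUI_EXPECTED_COMMIT and the other variable names are hypothetical, and the https URL is used because GitHub no longer serves the unauthenticated git:// protocol.

```cmake
include(FetchContent)
find_package(Git REQUIRED)

# Hypothetical variable holding the commit the tag is expected to point at.
set(IMGUI_EXPECTED_COMMIT 58075c4414b985b352d10718b02a8c43f25efd7c)

# Fetch by tag, so the intent is readable and a shallow clone is possible.
FetchContent_Declare(imgui
  GIT_REPOSITORY https://github.com/ocornut/imgui.git
  GIT_TAG v1.80
  GIT_SHALLOW ON
)
FetchContent_MakeAvailable(imgui)

# Ask git which commit was actually checked out.
execute_process(
  COMMAND "${GIT_EXECUTABLE}" rev-parse HEAD
  WORKING_DIRECTORY "${imgui_SOURCE_DIR}"
  OUTPUT_VARIABLE imgui_actual_commit
  OUTPUT_STRIP_TRAILING_WHITESPACE
  RESULT_VARIABLE imgui_rev_parse_result
)
if(NOT imgui_rev_parse_result EQUAL 0)
  message(FATAL_ERROR "Could not determine the checked-out commit of imgui")
endif()

# Fail the configure step if the tag has moved away from the pinned commit.
if(NOT "${imgui_actual_commit}" STREQUAL "${IMGUI_EXPECTED_COMMIT}")
  message(FATAL_ERROR
    "imgui tag v1.80 resolved to ${imgui_actual_commit}, "
    "expected ${IMGUI_EXPECTED_COMMIT}")
endif()
```

This runs the check on every configure and has to be repeated per dependency, which is part of why building it into ExternalProject/FetchContent seems preferable.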
For update, the commit-sha behavior is used: if the repository is already synced to the specific commit, no update fetch is required. This allows the faster update of the existing commit-sha behavior with no additional cost or complications.
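The update check could look roughly like the sketch below. This is an illustration under assumptions, not how ExternalProject implements its update step: the function name sketch_git_update_needed and its parameters are hypothetical.

```cmake
find_package(Git REQUIRED)

# Hypothetical helper: sets <out_var> to FALSE when the checkout in
# <source_dir> is already at <expected_commit>, and to TRUE otherwise.
function(sketch_git_update_needed source_dir expected_commit out_var)
  execute_process(
    COMMAND "${GIT_EXECUTABLE}" rev-parse HEAD
    WORKING_DIRECTORY "${source_dir}"
    OUTPUT_VARIABLE head_commit
    OUTPUT_STRIP_TRAILING_WHITESPACE
    ERROR_QUIET
    RESULT_VARIABLE result
  )
  if(result EQUAL 0 AND "${head_commit}" STREQUAL "${expected_commit}")
    # Already at the pinned commit: no network access is needed.
    set(${out_var} FALSE PARENT_SCOPE)
  else()
    # Missing checkout, git failure, or a different commit: fetch again.
    set(${out_var} TRUE PARENT_SCOPE)
  endif()
endfunction()
```

The comparison is purely local, so it costs the same as today's commit-SHA update check regardless of whether a tag was also given.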
For completeness: if only GIT_TAG is specified then the existing behavior is used (for compatibility). This could be changed to only support tags via a policy; I have no opinion on whether that's beneficial or not. If only GIT_COMMIT is specified, the behavior is as if the commit SHA were specified today via GIT_TAG. In other words, specifying both GIT_TAG and GIT_COMMIT would be entirely optional, and each option can still be used independently.