Force SHARED libraries to be linked after STATIC libraries / permit link features for complex targets
The basic setup
The whole issue is typical for the GNU linker.
Let a and b be defined by an external package management system, e.g. Conan. That means the intermediate layer of an INTERFACE library wrapped around an IMPORTED library is a given!
Let a have an openly visible symbol a_export, e.g. an allocator function. (Specifically: zcalloc from zlib.)
Let b export that same symbol as a weak symbol. (Specifically: some builds of the glib library set do this, as they each link zlib statically but still need to share a single symbol definition for zcalloc.)
Let c be my library. I need to use functionality from a and b, using public interfaces only. I expect b to be arbitrarily interchangeable with other versions, specifically with ones that do not export a_export.
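# Provided by the package manager; a_impl is taken to be a static archive.
add_library(a_impl STATIC IMPORTED)
add_library(a INTERFACE)
target_link_libraries(a INTERFACE a_impl)
# b_impl is taken to be a shared library that depends on a.
add_library(b_impl SHARED IMPORTED)
target_link_libraries(b_impl INTERFACE a)
add_library(b INTERFACE)
target_link_libraries(b INTERFACE b_impl)
# My own library.
add_library(c SHARED)
target_link_libraries(c PUBLIC b a)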
Now, what issues am I facing?
Depending on the order in which CMake arranges the link targets, it is more or less random whether a_export is first found in b_impl (and therefore skipped by the linker when processing a_impl, resulting in an accidental import with no definition included in c), or first found in a_impl, so that I end up with the expected local definition.
In the first case a_export ends up undefined in c by accident, which introduces a dependency on implementation details of b_impl that is stronger than the intended interface.
CMake will happily sort static libraries after shared libraries in the linker invocation, so this scenario is quite likely to happen. And there is no real control over the sort order either.
While I can just swap the arguments of target_link_libraries(c PRIVATE b a) around in this trivial example, it's not possible to ensure correct ordering in real-life examples, as there are several more layers to this.
Options
WHOLE_ARCHIVE from CMAKE_LINK_LIBRARY_USING_<FEATURE> simply doesn't work with interface libraries, or the library groups those resemble. Meaning $<LINK_LIBRARY:WHOLE_ARCHIVE,a> is not possible to use.
But even if it were, WHOLE_ARCHIVE would declare more than I want. I only expect the set of objects actually used from a_impl to be self-contained within c, but I do not expect to pull in the entire bloat of unused objects also found in a_impl.
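For comparison, a minimal sketch of what CMake 3.24 or newer does accept: applying the feature to the IMPORTED target itself rather than to its INTERFACE wrapper. This assumes direct access to a_impl, which a package manager may not guarantee; the archive path and the source file c.c are purely hypothetical placeholders.

```cmake
cmake_minimum_required(VERSION 3.24)
project(whole_archive_sketch C)

# Stand-in for the package-manager-provided target (path is hypothetical).
add_library(a_impl STATIC IMPORTED)
set_target_properties(a_impl PROPERTIES
  IMPORTED_LOCATION "/path/to/liba.a")

# c.c is a placeholder source file.
add_library(c SHARED c.c)

# The feature is applied to the concrete archive, not to the INTERFACE wrapper `a`.
target_link_libraries(c PRIVATE "$<LINK_LIBRARY:WHOLE_ARCHIVE,a_impl>")
```

If a_impl also reaches c through another dependency path without a feature, CMake may report conflicting features for the same library. The LINK_LIBRARY_OVERRIDE_<LIBRARY> target property exists to resolve such conflicts, though whether it covers every layering a package manager produces is not something this sketch verifies.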
The combination of -Wl,--whole-archive and -Wl,--gc-sections removes that bloat, but is only legal for C and not for C++, as C++ permits initializers with side effects, which -Wl,--gc-sections incorrectly strips. So not really an option.
Otherwise it would already have helped if CMake allowed me to shift all static libraries before all shared libraries in the link order, thus rendering the resulting shared library / executable always as self-contained as possible. (Not to be confused with -Bsymbolic. Self-contained here only means that every symbol that existed in any static library exists at least as a weak symbol. Quite the opposite, actually: -Bsymbolic only applies to symbols which have already been chosen for static linkage, and does not guarantee that internal linkage is chosen over external symbols either.)
While there certainly is a use case where you'd want to link static libraries to satisfy exactly and only the symbols missing from previously linked shared libraries (and thus shared libraries need to be linked before static libraries), that's a rather contrived example. In fact, the best practice of using -Wl,--no-undefined does not even permit this constellation to occur at all.
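As a small sketch, that flag can be attached to c through CMake's LINKER: prefix, which is translated to the -Wl, form for GCC-style compiler drivers. This assumes CMake 3.13 or newer and a GNU-compatible linker.

```cmake
# Fail the link of c if any symbol is left undefined.
# Assumes `c` is the SHARED library target from the basic setup.
target_link_options(c PRIVATE "LINKER:--no-undefined")
```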
To be more precise: a --start-group / --end-group pair around the whole set of static libraries and objects, placed before any shared libraries, would ensure that self-containment goal.
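CMake 3.24 and newer ship a built-in RESCAN link group that maps to --start-group / --end-group on GNU-style linkers. The following is a hedged sketch only, assuming direct access to the IMPORTED static targets; a2_impl is a hypothetical second static library. As far as the documentation describes, this only groups the listed libraries and does not by itself move them ahead of shared libraries on the command line, so it covers just half of the request.

```cmake
# Wraps the listed static libraries in --start-group / --end-group.
# a2_impl is hypothetical; `c` is the SHARED library from the basic setup.
target_link_libraries(c PRIVATE "$<LINK_GROUP:RESCAN,a_impl,a2_impl>")
```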
Impact
Judging from similar issues in deployed libraries out in the wild, this "accidental non-inclusion of symbols" due to shared libraries masking objects and static libraries is actually happening quite often, and often doesn't show unless you try to treat libraries as portable between systems.
It's not safe to do this by default though, as fixing the self-containment issues also means that in some cases c will suddenly provide a_export, which it had never done before, and which then substitutes the definition when library versions are mixed in deployment.