Custom processing of CMake output
Audience
Seeking feedback especially from IDE implementors and those interested in writing post-processing scripts for CMake output.
Background
CMake's output is currently an inconsistent mix of formatting, indenting and prepending text on some lines. This makes it difficult for scripts and other tools to reliably interpret the output to improve the user experience. Features such as colorisation in a terminal or selectively expanding/collapsing sections of the log output are made difficult by these inconsistencies and by the lack of any contextual information for most output. See #19418 (comment 590741) for some recent comments related to this, or #18943 (closed) for earlier discussions.
Internally, CMake either already has pretty good metadata about each line it outputs or could be made to do so fairly readily. Each message has a specific log level, a feature that has been extended recently by !3268 (merged). Earlier work in !3056 (closed) already demonstrated how project-defined log contexts could be supported. What is missing is a way for external tools to have access to this information so they can post-process the output in their own interesting ways.
Proposed New Functionality
One idea is to support an environment variable that holds the location of a script or executable through which all output is piped. I'll use the placeholder name CMAKE_OUTPUT_POSTPROCESSOR for the sake of discussion. If this environment variable is unset or empty, CMake would behave as it does now. If it is non-empty, CMake would insert metadata at the beginning of every line in a predictable way so that the post-processor could decide how to treat each line. The metadata would need to include at least the log level and the log context. As one example, each line could be prefixed with [loglevel][logcontext], or some other easy-to-process format. To support future expansion, the first line of output could be a header row listing the [xxx] entities that will precede each line, so that scripts could simply ignore any they don't recognise.
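To make the header-row idea concrete, here is a minimal Python sketch of how a post-processor might read such a stream. It assumes the placeholder bracket syntax described above, which is only a suggestion and not an existing CMake feature. The names split_fields and parse_stream are hypothetical.

```python
import re

# Matches one leading "[...]" field. The bracket syntax is only the
# placeholder format suggested in the text, not an existing CMake feature.
_FIELD = re.compile(r"\[([^\]]*)\]")


def split_fields(line, count):
    """Split `count` leading [..] fields off a line; return (values, rest)."""
    values, pos = [], 0
    for _ in range(count):
        m = _FIELD.match(line, pos)
        if m is None:
            raise ValueError(f"expected {count} metadata fields: {line!r}")
        values.append(m.group(1))
        pos = m.end()
    return values, line[pos:]


def parse_stream(lines):
    """Yield (metadata dict, message text) for each line after the header row."""
    it = iter(lines)
    header = next(it, None)
    if header is None:
        return
    # The header row names the fields, e.g. "[level][context]".
    names = _FIELD.findall(header.rstrip("\n"))
    for raw in it:
        values, text = split_fields(raw.rstrip("\n"), len(names))
        yield dict(zip(names, values)), text
```

Because the number of fields comes from the header row, a script written against [level][context] keeps working if further fields are added later: it just never looks up the keys it doesn't know. Only the announced number of fields is stripped, so message text that itself starts with a bracket is left alone.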
Consider the following extract from a CMakeLists.txt file that makes use of both log contexts and log indenting (the latter is already being discussed separately in #19418). I've chosen somewhat arbitrary variable names just to highlight the key points:
message(STATUS "Message with no context")
list(APPEND CMAKE_LOG_CONTEXT first)
message(DEBUG "Inside first level of log context")
list(APPEND CMAKE_LOG_CONTEXT second)
list(APPEND CMAKE_MESSAGE_INDENT " ")
message("Indented multi-line message\nwith a second line")
list(POP_BACK CMAKE_LOG_CONTEXT)
list(POP_BACK CMAKE_MESSAGE_INDENT)
message(WARNING "Danger Will Robinson!")
Here's an example of what such output might look like with CMake 3.15.0 plus the indenting features currently being worked on in !3464 (merged):
-- Message with no context
-- Inside first level of log context
Indented multi-line message
with a second line
CMake Warning at CMakeLists.txt:12 (message):
Danger Will Robinson!
With the proposed enhancements, this would appear to a post-processing script something like this instead:
[level][context]
[STATUS][]Message with no context
[DEBUG][first]Inside first level of log context
[NOTICE][first.second] Indented multi-line message
[NOTICE][first.second] with a second line
[WARNING][]CMake Warning at CMakeLists.txt:12 (message):
[WARNING][] Danger Will Robinson!
[WARNING][]
[WARNING][]
For terminal colorisation, a post-processing script could insert control codes as appropriate for each line. It could also do some message collapsing like that mentioned in #19418 (comment 590741). For an IDE, the post-processing script could be a simple pass-through, with the IDE itself doing the work (the environment variable then just triggers the insertion of the metadata).
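As a rough sketch of the terminal case, a colorising post-processor could look something like the Python below. It assumes the placeholder [level][context] prefix and a single header row. The level names and colour choices are arbitrary assumptions, and colourise and main are hypothetical names.

```python
import re
import sys

# ANSI SGR codes per level. Both the level spellings and the colours are
# assumptions for illustration; levels not listed are printed unstyled.
COLOURS = {"ERROR": "31", "WARNING": "33", "DEBUG": "2", "TRACE": "2"}

# Placeholder "[level][context]" prefix followed by the message text.
PREFIX = re.compile(r"\[([^\]]*)\]\[([^\]]*)\](.*)")


def colourise(line):
    """Strip the metadata prefix and wrap the text in a colour for its level."""
    m = PREFIX.match(line)
    if m is None:
        return line  # no prefix: pass the line through unchanged
    level, _context, text = m.groups()
    code = COLOURS.get(level)
    return f"\033[{code}m{text}\033[0m" if code else text


def main(stream=sys.stdin, out=sys.stdout):
    """Filter a prefixed stream: drop the header row, colourise the rest."""
    it = iter(stream)
    next(it, None)  # discard the header row
    for raw in it:
        out.write(colourise(raw.rstrip("\n")) + "\n")
```

Calling main() at the bottom of the script would turn it into a stdin-to-stdout filter. A pass-through script for an IDE would be the same loop without the colourise step.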
Why an environment variable?
Developers sometimes jump between running CMake through an IDE and from a terminal for the same build directory. Therefore, it would be inappropriate to use a cache variable because the right behavior will depend on the environment in which CMake is run, not on cached information. A terminal run would want colorisation and therefore pipe through a script dedicated to inserting the relevant control codes. An IDE would just want to collect and strip off the information on each line so it could expand/collapse sections by context, or apply a log filter that could be changed interactively.
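For the IDE side, the expand/collapse and interactive filtering could be sketched along these lines in Python. This assumes the IDE has already split each line into a (level, context, text) tuple. The level spellings, their ordering and the function names are assumptions for illustration only.

```python
from itertools import groupby

# Level names from most to least severe. They mirror message() modes, but how
# CMake would spell them in the metadata is an assumption.
LEVELS = ["ERROR", "WARNING", "NOTICE", "STATUS", "VERBOSE", "DEBUG", "TRACE"]
RANK = {name: i for i, name in enumerate(LEVELS)}


def filter_by_level(records, threshold):
    """Keep records at least as severe as `threshold`; unknown levels are kept."""
    limit = RANK[threshold]
    return [r for r in records if RANK.get(r[0], 0) <= limit]


def group_by_context(records):
    """Merge consecutive records sharing a context into collapsible sections."""
    return [
        (context, [text for _, _, text in group])
        for context, group in groupby(records, key=lambda r: r[1])
    ]
```

Since the records are kept in full, the IDE can re-run filter_by_level with a different threshold whenever the user changes the filter, without re-running CMake.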
The JSON file API doesn't seem the right fit for this functionality either: it could potentially be made to work for IDEs, but wouldn't suit use at a terminal.
An environment variable is easy for users to set in a terminal and IDEs can also easily control it for the child process they create to run CMake. Neither would record any information anywhere that would interfere with the other. No separate configuration file would be needed, only the name of a script or program to pipe the output through.
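To illustrate how little an IDE would need to do, here is a Python sketch of launching CMake with the variable set only for the child process. CMAKE_OUTPUT_POSTPROCESSOR is the placeholder name from this proposal and not something CMake is known to read today. The function names are hypothetical.

```python
import os
import subprocess


def cmake_environment(postprocessor, base=None):
    """Return a copy of the environment with the post-processor set."""
    env = dict(os.environ if base is None else base)
    # Placeholder variable name from this proposal, not an existing feature.
    env["CMAKE_OUTPUT_POSTPROCESSOR"] = postprocessor
    return env


def run_cmake(source_dir, build_dir, postprocessor):
    """Run a configure step and return CMake's combined output as text."""
    result = subprocess.run(
        ["cmake", "-S", source_dir, "-B", build_dir],
        env=cmake_environment(postprocessor),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return result.stdout
```

The parent environment is copied rather than modified, so nothing leaks into the build directory or into a later terminal run against the same tree.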