Proposal for debugging mode for CMake scripts
This is a follow-up from a [post on the forums](https://discourse.cmake.org/t/status-of-cmake-debugger/567) about the status of debugging CMake scripts. Microsoft is interested in contributing this support for a future CMake release. We're interested in getting community feedback and building consensus to ensure this proposal covers most common scenarios.

While earlier discussion didn't cover using the Debug Adapter Protocol (DAP), this proposal is based on DAP for reasons that are explained below. This is based on feedback from the debugger team at Microsoft, but if there are compelling reasons why DAP is not a good fit for CMake we can change direction.

# CMake debugging protocol

## Goals and non-goals

This document is a proposal for adding functionality to CMake to allow external debugging tools like Visual Studio or Visual Studio Code to drive CMake script execution and inspect execution state. Currently available tools for debugging CMake scripts like `--trace` and `variable_watch()` are not interactive and can be difficult to use to introspect CMake targets and properties.

Based on [initial discussion](https://discourse.cmake.org/t/status-of-cmake-debugger/567) this proposal defines a protocol to support these features.

- Set breakpoints based on filenames and line numbers.
- Set breakpoints when CMake errors or warnings are triggered.
- Set breakpoints when variable values are read or written.
- Step into, step over, or step out of the currently executing scope.
- View the call stack.
- View the state of currently defined variables.
- View the state of currently defined targets, directories, and tests with their properties.
- View the state of global properties.
- Execute arbitrary CMake while normal script execution is suspended.

These features, while potentially useful, are being considered out of scope for the current proposal.

- Editing the values of variables and properties.
  Based on feedback CMake has too many assumptions about these states not changing behind the scenes to implement this functionality correctly in version 1.
- Debugging generator expressions. Implementing this effectively requires more consideration of which scenarios are important and how to mark generator expressions for debugging.
- Providing a human-friendly command line interface. A new application could wrap the protocol described here to create a command line tool.
- Debugging child processes when CMake launches another instance of CMake.
- Designing how debugging functionality should be integrated into IDEs.

## Why use the Debug Adapter Protocol?

This proposal uses the [Debug Adapter Protocol](https://microsoft.github.io/debug-adapter-protocol/overview) (DAP), a standard debugging protocol already widely supported by tools like Visual Studio, Visual Studio Code, Eclipse, Emacs, and Vim. By using DAP, CMake will simplify the work required by vendors to integrate debugging in their current or future tools and create a more consistent debugging experience between tools.

One downside to using DAP is that the protocol as specified may not have first-class support for features unique to certain debuggers. For example DAP has no notion of a CMake target or target properties. This proposal solves these issues by leveraging how DAP clients ignore unknown fields to allow vendors to declare their support for extensions during protocol initialization and falling back to represent CMake-specific concepts using more general DAP messages if extension support is not declared. Vendors have the option of using these extensions for deeper CMake integration, but extension use is not required.

Because of its generality DAP is also a more complex protocol than a bespoke protocol, especially for a language like CMake with no threading and simple string types in almost all cases.
However, DAP is designed to allow implementers to safely ignore large portions of the protocol if they are irrelevant, and the remaining complexity seems to be an appropriate trade-off for broader tool compatibility.

## Starting a debugging session

Debugging is enabled by adding the `--debugger` flag to a CMake configuration or script command line. By default, communication will happen over stdin and stdout, but the `--debugger-pipe=<pipe name>` flag can be added to use a named pipe instead.

```
cmake ... --debugger --debugger-pipe=cmake_debug_pipe
```

stdin and stdout are supported to simplify integration with external tools. Note that, as with CMake server, a "named pipe" refers to a local domain socket on Unix or a named pipe on Windows and is assumed to be bidirectional.

If the `--debugger` flag is present but `--debugger-pipe` is not, all normal script output (status, messages, errors, etc.) will not reach stdout or stderr and will instead be captured in the `output` DAP event. If both flags are specified, then normal script output will reach stdout or stderr and will also be captured in the `output` DAP event.

## Communication and message format

As described in the DAP specification, [communication happens over a text channel with ASCII encoding for message headers and UTF-8 encoding for message content](https://microsoft.github.io/debug-adapter-protocol/overview#base-protocol). Message content is always a JSON object. For reference, a sample message could look like this (`\r` and `\n` are only shown for clarity and represent literal carriage return and newline ASCII characters).

```json
Content-Length: 119\r\n
\r\n
{
    "seq": 153,
    "type": "request",
    "command": "next",
    "arguments": {
        "threadId": 3
    }
}
```

As specified, `Content-Length` counts the length of the message content in bytes (not including the newlines separating the header from the content). CMake should be able to parse this format with no extra message bracketing required.
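As a rough illustration of the framing described above, the following TypeScript sketch encodes a JSON message with a `Content-Length` header and decodes one message from the front of a byte buffer. The helper names `encodeMessage`, `findHeaderEnd`, and `decodeMessage` are hypothetical and are not part of CMake or DAP; an implementation inside CMake would be written in C++ and would likely need more thorough error handling.

```typescript
// Hypothetical helpers for illustration only; not part of CMake or DAP.
const encoder = new TextEncoder();
const decoder = new TextDecoder(); // UTF-8, which also covers the ASCII header

/** Frames a message: header, blank line, then the UTF-8 encoded JSON body. */
function encodeMessage(message: object): Uint8Array {
  const body = encoder.encode(JSON.stringify(message));
  // Content-Length counts the bytes of the body, not its characters.
  const header = encoder.encode(`Content-Length: ${body.length}\r\n\r\n`);
  const framed = new Uint8Array(header.length + body.length);
  framed.set(header, 0);
  framed.set(body, header.length);
  return framed;
}

/** Returns the index of the first "\r\n\r\n" in the buffer, or -1. */
function findHeaderEnd(buffer: Uint8Array): number {
  for (let i = 0; i + 3 < buffer.length; i++) {
    if (buffer[i] === 13 && buffer[i + 1] === 10 &&
        buffer[i + 2] === 13 && buffer[i + 3] === 10) {
      return i;
    }
  }
  return -1;
}

/**
 * Tries to decode one message from the front of the buffer. Returns null if
 * the buffer does not yet hold a complete message.
 */
function decodeMessage(
  buffer: Uint8Array
): { message: unknown; rest: Uint8Array } | null {
  const headerEnd = findHeaderEnd(buffer);
  if (headerEnd < 0) return null;
  const headerLines = decoder.decode(buffer.subarray(0, headerEnd)).split("\r\n");
  const lengthLine = headerLines.find((line) =>
    line.toLowerCase().startsWith("content-length:"));
  if (lengthLine === undefined) throw new Error("missing Content-Length header");
  const length = Number(lengthLine.slice("content-length:".length).trim());
  if (!Number.isInteger(length) || length < 0) {
    throw new Error("invalid Content-Length header");
  }
  const bodyStart = headerEnd + 4; // skip the "\r\n\r\n" separator
  if (buffer.length < bodyStart + length) return null;
  const body = decoder.decode(buffer.subarray(bodyStart, bodyStart + length));
  return { message: JSON.parse(body), rest: buffer.subarray(bodyStart + length) };
}
```

Because the length is known up front, a reader can accumulate bytes, call `decodeMessage` until it returns `null`, and keep `rest` for the next read without scanning the JSON for delimiters.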

DAP supports three types of messages. Additional details can be found in the DAP specification.

- **Events** are sent from CMake to the client unprompted when important actions happen in script execution like hitting a breakpoint.
- **Requests** are sent from the client to CMake to configure the debugger and request additional information about execution state.
- **Responses** are sent from CMake to the client in response to a request.

CMake will listen for messages on a new background thread and add them to a queue for later processing. At the beginning of every CMake command invocation, the thread executing the script will first check the queue for any unprocessed messages, respond as needed, and then resume script execution. For messages like `pause` or events like hitting a breakpoint that suspend script execution, the thread executing the script will continue to process messages from the queue, potentially blocking on an empty queue, until receiving a message that resumes execution. Requests that modify execution state (currently only `evaluate` from a REPL) will fail if the script is not paused.

## Supported messages

This section describes all the DAP messages supported by CMake along with clarification about how CMake will interpret certain messages. It may be helpful to read this section side-by-side with the [DAP specification](https://microsoft.github.io/debug-adapter-protocol/specification).

### initialize request

This is the first message sent. Notably it includes the client's locale, whether line numbers should be zero or one based, and the path format (`path` or `uri`). Locale is ignored because CMake is not localized. If the `path` path format is requested, CMake accepts paths following the same rules it would use to normalize paths specified on the command line (like for `CMAKE_C_COMPILER`), and emits paths that follow native platform conventions (forward slashes on Linux and backslashes on Windows).

The request is extended to support these optional fields.

- `supportsCMakeTargetsRequest`: If true in both the request and response, data about CMake targets will not be included in the `variables` response and will instead be provided through the `cmakeTargets` request.
- `supportsCMakeDirectoriesRequest`: If true in both the request and response, data about CMake build system directories will not be included in the `variables` response and will instead be provided through the `cmakeDirectories` request.
- `supportsCMakeTestsRequest`: If true in both the request and response, data about CMake tests will not be included in the `variables` response and will instead be provided through the `cmakeTests` request.

The capabilities in the response are also extended to support these optional fields.

- `supportsCMakeTargetsRequest`
- `supportsCMakeDirectoriesRequest`
- `supportsCMakeTestsRequest`

The response body is extended to add these fields.

- `cmakeVersion`: a required object representing the version of CMake hosting the debugger containing these fields.
  - `major`: a required integer representing the major version number.
  - `minor`: a required integer representing the minor version number.
  - `patch`: a required integer representing the patch number.

The response will include these exception filter names.

- `AUTHOR_WARNING`
- `AUTHOR_ERROR`
- `FATAL_ERROR`
- `INTERNAL_ERROR`
- `MESSAGE`
- `WARNING`
- `LOG`
- `DEPRECATION_ERROR`
- `DEPRECATION_WARNING`

#### Implementation notes

The following capabilities are set to `true` in the response. No other Boolean capability values are sent.

- `supportsFunctionBreakpoints`
- `supportsEvaluateForHovers`
- `supportsExceptionInfoRequest`
- `supportsDataBreakpoints`
- `supportsCMakeTargetsRequest`
- `supportsCMakeDirectoriesRequest`
- `supportsCMakeTestsRequest`

`supportedChecksumAlgorithms` is set to an empty array.

`exceptionBreakpointFilters` is set to the following, which correspond to the values of CMake's `MessageType` enumeration.

```json
[
    { "filter": "AUTHOR_WARNING", "label": "Warning (dev)" },
    { "filter": "AUTHOR_ERROR", "label": "Error (dev)" },
    { "filter": "FATAL_ERROR", "label": "Fatal error" },
    { "filter": "INTERNAL_ERROR", "label": "Internal error" },
    { "filter": "MESSAGE", "label": "Other messages" },
    { "filter": "WARNING", "label": "Warning" },
    { "filter": "LOG", "label": "Debug log" },
    { "filter": "DEPRECATION_ERROR", "label": "Deprecation error" },
    { "filter": "DEPRECATION_WARNING", "label": "Deprecation warning" }
]
```

### initialized event

This event is sent after the `initialize` response.

### stopped event

This event indicates that script execution has been suspended for some reason.

#### Implementation notes

`reason` is the only field included in the event body.

### terminated event

This event is sent just before the CMake process terminates. Because the CMake process is both the debuggee and the debug adapter from the perspective of DAP, this message indicates that *both* the debug adapter and debuggee have terminated.

#### Implementation notes

No `restart` information is included.

### output event

This event indicates that the script attempted to write output. If `--debugger-pipe` was not specified, normal script output is suppressed from stderr/stdout so that it does not pollute the debugger protocol. If the output would have triggered an exception breakpoint, the output event is fired before triggering the exception breakpoint.

The `category` field will always be either `stdout` or `stderr` depending on the script's output. The `output` field will contain the output produced by the script, including any preamble (like `CMake Warning (dev)`). No other fields are included in the event body.

### launch request

This request starts script execution.
No additional information is included to configure the launch because this information was already passed to CMake on the command line.

The `noDebug` field is ignored. This contradicts the DAP specification, but simplifies the implementation in CMake. Clients that do not want debugging should not specify the `--debugger` flag on the command line.

### attach request

This request is required by DAP, but does not make sense for CMake because the debug adapter and debuggee are the same process. Therefore, the response will always indicate failure.

#### Implementation notes

`message` is set to `not supported`.

### disconnect request

Normally this request terminates the debug adapter and disconnects from the debuggee, but because both are the same process for CMake this requests termination of both the debug adapter and debuggee.

The `terminateDebuggee` field is ignored. This contradicts the DAP specification, but is required given the CMake process model.

After sending the response, CMake will send a `terminated` event and then the process will exit.

### setBreakpoints request

This request sets all breakpoints for a given source file, overwriting any existing breakpoints for the file. `column` is ignored on the requested breakpoints.

#### Implementation notes

The breakpoints in the response always have `verified` set to `true`, and include no other fields (notably no `id` field is returned).

### setFunctionBreakpoints request

This request replaces all existing function breakpoints with new function breakpoints. Function is broadly construed to refer to any CMake command, whether function, macro, or built-in. The `name` field on each requested `FunctionBreakpoint` is assumed to be case-insensitive, matching CMake's behavior for command invocations.

The `FunctionBreakpoint` type is extended with these fields.

- `cmakeType`: an optional array of strings from the set `function`, `macro`, or `built-in`.
  The function breakpoint will only fire if the type of the CMake command is one of the specified types. If unspecified, the function breakpoint will fire for any CMake command type.

#### Implementation notes

The breakpoints in the response always have `verified` set to `true`, and include no other fields.

### setExceptionBreakpoints request

This request sets which types of CMake output (errors, warnings, deprecation warnings, etc.) trigger exception breakpoints.

### dataBreakpointInfo request

This request obtains information on possible data breakpoints that could be set on a CMake variable.

As specified by DAP, in the request `variablesReference` must either refer to a variable container from the `scopes` or `variables` response, or be unspecified. If unspecified, the response from CMake will have `dataId` set to `null`.

If the `variablesReference` and `name` in the request identify a defined CMake variable, the response will have `dataId` and `description` set to the name of the variable. If not, `dataId` is set to `null`.

For defined variables, `accessTypes` is set to the following value, which includes CMake extensions. See `setDataBreakpoints` for the meaning of these extensions.

```json
[ "read", "write", "readWrite", "cmakeModified", "cmakeRemoved" ]
```

#### Implementation notes

`description` when `variablesReference` is unspecified is set to `cannot set a data breakpoint on an expression`.

`description` when `variablesReference` and `name` do not refer to a CMake variable is set to `not a CMake variable`.

### setDataBreakpoints request

This request replaces all existing data breakpoints with new data breakpoints.

The supported values for `accessType` in the requested breakpoints are extended to support these values.

- `cmakeRead`
- `cmakeUnknownRead`
- `cmakeModified`
- `cmakeUnknownModified`
- `cmakeRemoved`

The meanings of these values are the same as the `variable_watch()` command. The standard DAP access types correspond as follows.

- `read`: `cmakeRead` or `cmakeUnknownRead`
- `write`: `cmakeModified`, `cmakeUnknownModified`, or `cmakeRemoved`
- `readWrite`: all extension values

If `accessType` is unspecified, it is assumed to be `write`.

#### Implementation notes

The breakpoints in the response always have `verified` set to `true`, and include no other fields.

### continue request

This request resumes script execution. `threadId` is ignored in the request, and `allThreadsContinued` is always `true` in the response.

### next request

This request instructs the debugger to "step over", which is defined as running until the next command invocation in the current CMake scope or a parent scope. `threadId` is ignored in the request.

### stepIn request

This request instructs the debugger to "step in", which is defined as running until the next command invocation. `threadId` is ignored in the request.

### stepOut request

This request instructs the debugger to "step out", which is defined as running until the next command invocation outside the current function, macro, include, inline file, or directory scope (depending on what type of scope execution is currently in). `threadId` is ignored in the request.

### pause request

This request suspends script execution at the next available opportunity. Due to CMake's internal processing (see Communication and message format), pausing will happen at the beginning of the next command invocation. `threadId` is ignored in the request.

### stackTrace request

This request returns a stack trace from the current execution state. `threadId` is ignored in the request.

Stack frames are derived by walking the existing tree of state snapshots created internally by CMake from the current leaf node to the root. In the response, `id` for each frame is set to the number of steps from the leaf node to the state snapshot in question. `name` is set to the function, macro, directory, or filename depending on the type of state snapshot.
`source` and `line` are set to correspond to the source position, and `column` is always set to zero.

Each frame is extended with the following fields.

- `cmakeSnapshotType`: a required string from the set `BaseType`, `BuildsystemDirectoryType`, `DeferCallType`, `FunctionCallType`, `MacroCallType`, `IncludeFileType`, `InlineListFileType`, `PolicyScopeType`, or `VariableScopeType`. These correspond to the values defined in `cmStateEnums::SnapshotType`.

### scopes request

This request returns the available variable scopes for a given stack frame.

Every frame includes a scope with `name` set to `Locals` and `presentationHint` set to `locals`. This scope stores the CMake variables defined in this state snapshot, not inherited from other snapshots. `namedVariables` is set to the number of CMake variables.

If the client and CMake did not specify `supportsCMakeDirectoriesRequest` as `true`, then the top frame also includes a scope with `name` set to `Directory Properties` and `namedVariables` set to the number of properties set on the current CMake directory scope.

The base frame includes a scope with `name` set to `Global Properties` and `namedVariables` set to the number of global CMake properties (see below for implementation concerns).

If the client and CMake did not specify `supportsCMakeTargetsRequest` as `true`, then the base frame also includes a scope with `name` set to `Targets` and `namedVariables` set to the number of defined CMake targets.

If the client and CMake did not specify `supportsCMakeTestsRequest` as `true`, then the base frame also includes a scope with `name` set to `Tests` and `namedVariables` set to the number of defined CMake tests.

#### Implementation notes

`variablesReference` is set to the frame ID + 5 (to account for other potential scopes) for the locals scope.

`variablesReference` is set to one for the global properties scope, two for the targets scope, three for the tests scope, and four for the directory properties scope.
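The reference numbering above can be sketched as follows. This is only an illustration in TypeScript (the real implementation would live in CMake's C++ sources); the constant and function names are hypothetical, and it assumes frame IDs run from zero at the leaf to `frameCount - 1` at the root, as described for the `stackTrace` request.

```typescript
// Hypothetical names; the numeric values follow the implementation notes.
const GLOBAL_PROPERTIES_REF = 1;
const TARGETS_REF = 2;
const TESTS_REF = 3;
const DIRECTORY_PROPERTIES_REF = 4;
const LOCALS_OFFSET = 5; // locals scope reference = frame ID + 5

type ScopeRef =
  | { kind: "globalProperties" | "targets" | "tests" | "directoryProperties" }
  | { kind: "locals"; frameId: number }
  | { kind: "unknown" };

/** Computes the variablesReference of the locals scope for a frame. */
function localsReference(frameId: number): number {
  return frameId + LOCALS_OFFSET;
}

/** Maps a scope-level variablesReference back to the scope it names. */
function resolveScopeReference(ref: number, frameCount: number): ScopeRef {
  switch (ref) {
    case GLOBAL_PROPERTIES_REF:
      return { kind: "globalProperties" };
    case TARGETS_REF:
      return { kind: "targets" };
    case TESTS_REF:
      return { kind: "tests" };
    case DIRECTORY_PROPERTIES_REF:
      return { kind: "directoryProperties" };
  }
  const frameId = ref - LOCALS_OFFSET;
  if (Number.isInteger(frameId) && frameId >= 0 && frameId < frameCount) {
    return { kind: "locals", frameId };
  }
  // Zero and anything past the last frame is not a scope reference here.
  return { kind: "unknown" };
}
```

Keeping the fixed scopes in the range one to four means a locals reference can be turned back into a frame ID with a single subtraction.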

`indexedVariables` is always set to zero and `expensive` is always set to `false` for all scopes.

### variables request

This request returns all the variable values for a given variable reference, which could be CMake variables, properties, targets, directories, or tests.

For CMake variables, global properties, and directory properties, the response includes `name` and `value` set to appropriate values. `variablesReference` is always set to zero to indicate that this is not a structured value.

If the client and CMake did not specify `supportsCMakeTargetsRequest`, then a request for variables from the targets scope will have `name` set to the target name and `value` set to an empty string. A `variables` request with `variablesReference` set to an identifier returned by an individual target will return that target's properties. `name` will be set to the property name, `value` will be set to the property value as returned by `get_property(... TARGET ...)`. `variablesReference` is always set to zero to indicate that the property value is not structured.

If the client and CMake did not specify `supportsCMakeTestsRequest`, then a request for variables from the tests scope will have `name` set to the test name and `value` set to an empty string. A `variables` request with `variablesReference` set to an identifier returned by an individual test will return that test's properties. `name` will be set to the property name, `value` will be set to the property value as returned by `get_property(... TEST ...)`. `variablesReference` is always set to zero to indicate that the property value is not structured.

#### Implementation notes

`variablesReference` for a target or test will be set to the total number of stack frames + 5 (to reserve zero, a potential base targets scope, a potential base global properties scope, a potential base test properties scope, and a potential top frame directories scope) + an implementation defined index for the target or test that is always positive.
Target and test indices must not conflict.

All variables that represent CMake variables will have a `presentationHint` with `kind` set to either `data` or `dataBreakpoint`. `data` is used if a data breakpoint has not been set for this variable name, and `dataBreakpoint` is used if a data breakpoint has been set for this variable name. This is required to enable data breakpoint support in some debuggers.

### threads request

CMake script execution is single-threaded, so this request always returns a single thread with `id` equal to one.

#### Implementation notes

The response has this body.

```json
{
    "threads": [{
        "id": 1,
        "name": "CMake script"
    }]
}
```

### evaluate request

This request evaluates a CMake script string or CMake variable name.

If `context` in the request is set to the standard `watch` or `hover` values, then `expression` is expected to be the name of a CMake variable and `result` in the response will store the value of the variable, if it exists.

If `context` in the request is anything else (like the standard `repl` value), then `expression` is expected to be a CMake script string and `result` in the response will be an empty string because CMake has no return values. Executing a CMake script string is only supported when script execution is paused and will fail if the script is currently running. Script execution is equivalent to `cmake_language(EVAL CODE ...)` except that errors do not halt overall script execution, only evaluation of the snippet. Evaluation failure is reported by setting `result` to `evaluation error`.

### exceptionInfo request

This request retrieves the details of a CMake error, warning, or message. `threadId` is ignored in the request.

In the response, `exceptionId` will be set to the corresponding label from `exceptionBreakpointFilters` (from the `initialize` response), `description` will be set to the message that was logged, excluding any preamble, and `breakMode` is always set to `always`. Descriptions are not localized.
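To illustrate the `exceptionInfo` response described above, the sketch below builds a response body from a filter name and a message whose preamble has already been removed. The names `EXCEPTION_LABELS` and `buildExceptionInfoBody` are hypothetical; the labels are copied from the `exceptionBreakpointFilters` value in the `initialize` response, and how CMake would strip the preamble is not shown.

```typescript
// Labels copied from exceptionBreakpointFilters in the initialize response.
const EXCEPTION_LABELS = {
  AUTHOR_WARNING: "Warning (dev)",
  AUTHOR_ERROR: "Error (dev)",
  FATAL_ERROR: "Fatal error",
  INTERNAL_ERROR: "Internal error",
  MESSAGE: "Other messages",
  WARNING: "Warning",
  LOG: "Debug log",
  DEPRECATION_ERROR: "Deprecation error",
  DEPRECATION_WARNING: "Deprecation warning",
} as const;

type ExceptionFilter = keyof typeof EXCEPTION_LABELS;

interface ExceptionInfoBody {
  exceptionId: string;
  description: string;
  breakMode: "always";
}

/**
 * Hypothetical helper: builds the exceptionInfo response body.
 * `message` is assumed to exclude any preamble such as "CMake Warning (dev)".
 */
function buildExceptionInfoBody(
  filter: ExceptionFilter,
  message: string
): ExceptionInfoBody {
  return {
    exceptionId: EXCEPTION_LABELS[filter],
    description: message,
    breakMode: "always", // fixed value per the description above
  };
}
```

A client can display `exceptionId` directly because it reuses the same human-readable label shown when the exception filters are configured.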

## Extension messages

CMake defines three extension messages. These messages provide a way to get more detailed information about CMake targets, directories, and tests.

The CMakeProperty interface may appear in some responses. While the implementation will not be in TypeScript, the message format is defined here using TypeScript syntax to match the rest of the DAP specification.

```typescript
interface CMakeProperty {
  /**
   * The name of the property.
   */
  name: string;

  /**
   * The value of the property.
   */
  value: string;

  /**
   * True if the property was inherited from another CMake scope.
   */
  inherited: boolean;
}
```

### cmakeTargets request

This request retrieves currently defined CMake targets and their properties. Clients should only send this request if they specify `supportsCMakeTargetsRequest` as `true` during the `initialize` request and CMake specifies `supportsCMakeTargetsRequest` as `true` during the `initialize` response.

```typescript
interface CMakeTargetsRequest extends Request {
  command: 'cmakeTargets';

  arguments?: CMakeTargetsArguments;
}

interface CMakeTargetsArguments {
  /**
   * If true, target properties are included in the response.
   */
  includeDetails?: boolean;

  /**
   * The set of target names to include in the response. If unspecified, all
   * defined targets are included.
   */
  targetNames?: string[];
}

interface CMakeTargetsResponse extends Response {
  body: {
    /**
     * The currently defined CMake targets.
     */
    targets: CMakeTarget[];
  }
}

interface CMakeTarget {
  /**
   * The name of this target.
   */
  name: string;

  /**
   * The properties defined on this target, including inherited properties.
   */
  properties?: CMakeProperty[];
}
```

### cmakeDirectories request

This request retrieves currently defined CMake build system directories and their properties. Clients should only send this request if they specify `supportsCMakeDirectoriesRequest` as `true` during the `initialize` request and CMake specifies `supportsCMakeDirectoriesRequest` as `true` during the `initialize` response.

```typescript
interface CMakeDirectoriesRequest extends Request {
  command: 'cmakeDirectories';

  arguments?: CMakeDirectoriesArguments;
}

interface CMakeDirectoriesArguments {
  /**
   * If true, directory properties are included in the response.
   */
  includeDetails?: boolean;

  /**
   * The set of directories to include in the response. If unspecified, all
   * defined directories are included. Directory names should be in a form
   * accepted by get_property(... DIRECTORY ...).
   */
  directories?: string[];
}

interface CMakeDirectoriesResponse extends Response {
  body: {
    /**
     * The currently defined CMake build system directories.
     */
    directories: CMakeDirectory[];
  }
}

interface CMakeDirectory {
  /**
   * The full path to this directory in the system native format.
   */
  path: string;

  /**
   * The properties defined on this directory, including inherited properties.
   */
  properties?: CMakeProperty[];
}
```

### cmakeTests request

This request retrieves currently defined CMake tests and their properties. Clients should only send this request if they specify `supportsCMakeTestsRequest` as `true` during the `initialize` request and CMake specifies `supportsCMakeTestsRequest` as `true` during the `initialize` response.

```typescript
interface CMakeTestsRequest extends Request {
  command: 'cmakeTests';

  arguments?: CMakeTestsArguments;
}

interface CMakeTestsArguments {
  /**
   * If true, test properties are included in the response.
   */
  includeDetails?: boolean;

  /**
   * The set of test names to include in the response. If unspecified, all
   * defined tests are included.
   */
  testNames?: string[];
}

interface CMakeTestsResponse extends Response {
  body: {
    /**
     * The currently defined CMake tests.
     */
    tests: CMakeTest[];
  }
}

interface CMakeTest {
  /**
   * The name of this test.
   */
  name: string;

  /**
   * The properties defined on this test, including inherited properties.
   */
  properties?: CMakeProperty[];
}
```

## Unsupported messages

The following messages are specified by DAP as optional and are not supported by CMake.

- cancel request
- continued event
- exited event
- thread event
- breakpoint event: This event could be useful to add later so that breakpoints can be verified once sources are loaded but is considered out of scope for an MVP.
- module event
- loadedSource event
- process event
- capabilities event
- progressStart event
- progressUpdate event
- progressEnd event
- invalidated event
- runInTerminal reverse request
- configurationDone request
- restart request
- terminate request
- breakpointLocations request
- setInstructionBreakpoints request
- stepBack request
- reverseContinue request
- restartFrame request
- goto request
- setVariable request
- source request: This request could be added to support dynamic non-file-based sources, but these are rare in CMake.
- terminateThreads request
- modules request
- loadedSources request
- setExpression request
- stepInTargets request
- gotoTargets request
- completions request
- readMemory request
- disassemble request

## Implementation concerns

Location-based breakpoints are internally stored based on filename and line number, and at the beginning of each command execution the current source position is checked against the list of breakpoints. This means that the debugger provides no guarantees that breakpoints are set in valid locations that could be hit during script execution (for example, on a blank line or comment). Adding this support is possible using DAP's `breakpoint` event, but is considered out of scope for MVP.

Some properties are not stored internally by CMake but are instead computed on-the-fly when requested, like `ALIASED_TARGET` or `ALIASED_GLOBAL`. Therefore, the set of property names returned by the `variables` or `cmakeTargets` requests will be the union of all documented property names for the given scope with property names explicitly defined by the script.

Stack frames will be derived from CMake's internal state snapshots, which do not currently include source positions.
Debugger support will require either capturing the source position whenever snapshots are created under debugging, or somehow deriving the source position from already available information. Existing backtraces do not provide enough fidelity for common debugging scenarios, like viewing a stack trace with multiple function calls in the same file.

Variable access by the debugger itself, such as in response to a `variables` request, should not trigger `variable_watch()` commands or breakpoints. Variable access by user code during an `evaluate` request should trigger `variable_watch()` commands and breakpoints like `cmake_language()` would.

## Open questions

- ~~In earlier discussion there was interest in exposing the [currently defined export sets](https://discourse.cmake.org/t/status-of-cmake-debugger/567/10). I am not familiar enough with CMake usage to know what exactly would be useful to return here but I am open to adding it.~~ Resolved as out of scope for MVP.