Proposal: Lua as alternative imperative language
Hi all,
I'd like to propose enabling Lua as an alternative language for implementing CMake modules.
For reference, I have a prototype (see !3934 (closed)) where a few existing modules (FindLua51.cmake, FindPackageMessage.cmake and FindPackageHandleStandardArgs.cmake) are re-implemented in Lua.
General plan
First we need to allow executing Lua scripts at all. This is achieved by tweaking a few functions to recognize Lua scripts (I suggest doing it by simply checking the file extension) and execute them on the Lua VM:
- cmMakefile::ReadListFile() -- will enable running Lua scripts from CMake's script mode: cmake -P <script>.lua. Note that directory configuration is not affected, as CMakeLists.txt has a ".txt" extension.
- cmMakefile::ReadDependentFile() -- will enable cmake scripts to execute Lua scripts by using include("/full/path/to/script.lua"). The command's semantics are preserved here: the included script's code is just executed in the current context.
- cmFindPackageCommand::FindModule() -- here we additionally need to probe "Find<package>.lua" to allow implementing "find" modules in Lua.
Note that the include() command also works for module names by assuming the ".cmake" extension and looking in CMAKE_MODULE_PATH; I suggest keeping that behavior intact (i.e. not trying the ".lua" extension): Lua modules (in the collection of functions/classes sense) are expected to execute once, so let's limit their availability to Lua code only.
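To make the extension rule concrete, here is a minimal sketch written in Lua for readability (the real check would live in the C++ functions listed above). The function names are hypothetical, and the order in which find_package() would probe the two "find" module names is an assumption, not something the prototype is known to do.

```lua
-- Hypothetical helper: decide which interpreter handles a file, based only on
-- the extension of its last path component.
local function script_language(path)
  -- Capture the text after the final dot, refusing dots, slashes and backslashes.
  local ext = string.match(path, "%.([^./\\]+)$")
  if ext == "lua" then
    return "lua"
  end
  return "cmake"
end

-- Hypothetical helper: file names find_package(<package>) would probe.
-- The ".cmake"-first order is an assumption made for this sketch.
local function find_module_candidates(package)
  return { "Find" .. package .. ".cmake", "Find" .. package .. ".lua" }
end

-- include(<module>) keeps its current behavior: only ".cmake" is assumed.
local function include_module_filename(module)
  return module .. ".cmake"
end
```

With this rule CMakeLists.txt and every existing ".cmake" file keep going through the CMake language parser, and only files that explicitly end in ".lua" reach the Lua VM.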
The next step is to make it possible to write useful Lua scripts/modules. My approach here is to provide carefully designed APIs that use CMakeLib classes/functions directly (as opposed to just providing a wrapper to invoke CMake's builtin/scripted command with a list of string arguments), and eventually reach feature parity with built-in commands. I believe one quite common way to provide bindings for built-in commands will be as follows:
- refactor the command implementation to decouple argument parsing from the "real" command that does the actual work
- expose the command class(es) to Lua

The ability to call cmake commands should be seen as a last resort (e.g. to invoke a command defined in a cmake script, or during the transition period when some built-in command's functionality is not yet available in Lua through a dedicated API). Obviously this is a significant amount of work. In the prototype I mostly implemented the API required for porting the modules I chose, and indeed the exact content of the Lua "standard lib" is subject to discussion.
Backward compatibility
Let's first assume that the "standard library" is completed and stable, and we make Lua support public. Existing users can be affected in 3 ways:
1. a different Find<package> file may be found by the find_package() command because of an accidental ".lua" file in CMAKE_MODULE_PATH
2. an included file may fail to be processed if it was stored with a ".lua" extension
3. a cmake script may fail to execute if it has a ".lua" extension

I suppose the natural way to handle these issues is to have the corresponding cmake release introduce a new policy. Besides, issues 3) and even 2) probably aren't worth worrying about.
Then there is the transition period while the standard library is being developed. For that time I can think of 2 strategies:
- keep development in a separate branch, but perform the necessary refactorings on master. There are not many modifications to existing files, so this option seems viable
- restrict execution of Lua code to files in CMake's Modules/ directory only (e.g. for porting some modules or moving command argument parsing to Lua) and have a flag that enables the rest.

The prototype implements none of the above; it adds an "all-or-nothing" --lua command line option to cmake.
Lua version
My prototype uses Lua 5.1. I believe we should fix the language version to 5.1; the benefits are:
- we can use LuaJIT (though I'm not sure whether all platforms CMake is supposed to build on are supported)
- way less headache overall (compared to supporting multiple versions)

See also the FAQ of another project with a long history and a legacy programming language.
Sandboxing
We have full control over the environment Lua scripts execute in, so a big question is what it should look like.
By default Lua scripts run in a "strict" environment: all accesses to non-existing global variables cause an error. This gives a few benefits:
- catches typos in source code
- enforces good practice of module writing: a module imports the other modules it uses, implements its functionality, and returns the table of exports -- everything happens without data exchange through the global namespace.

The downside is that 3rd-party Lua libraries may fail in such an environment. In fact my prototype provides another command-line option (--no-sandbox) for disabling this protection in order to use the "luaunit" library as-is.
Still, I believe this is a good thing; any included 3rd-party libraries should be patched.
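As an illustration of what "strict" means here, the following is a minimal sketch of one common way to build such an environment in plain Lua. It is not taken from the prototype: make_strict_env and run_strict are hypothetical names, and treating assignment to an undeclared global as an error is an extra assumption beyond the read check described above.

```lua
-- Build an environment table that only knows the names in `allowed`.
local function make_strict_env(allowed)
  local env = {}
  for name, value in pairs(allowed) do
    env[name] = value
  end
  return setmetatable(env, {
    -- Fires only for names that are missing from env.
    __index = function(_, name)
      error("access to undefined global '" .. tostring(name) .. "'", 2)
    end,
    -- Assumption: creating new globals is also rejected.
    __newindex = function(_, name)
      error("assignment to undeclared global '" .. tostring(name) .. "'", 2)
    end,
  })
end

-- Compile `source` and run it inside a strict environment.
-- Returns true plus the chunk's results, or false plus an error message.
local function run_strict(source, allowed)
  local env = make_strict_env(allowed)
  local chunk, err
  if setfenv then
    -- Lua 5.1 / LuaJIT: attach the environment after compiling.
    chunk, err = loadstring(source)
    if chunk then setfenv(chunk, env) end
  else
    -- Lua 5.2 and later: pass the environment to load().
    chunk, err = load(source, "chunk", "t", env)
  end
  if not chunk then
    return false, err
  end
  return pcall(chunk)
end
```

A typo such as `tostrng(1)` then fails at the point of use instead of silently evaluating to nil, which is the first benefit listed above.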
Another concern is Lua's own standard libraries.
I'm strongly convinced that any interaction with the underlying platform must be done using the same routines as CMakeLib uses. This means the standard "os" and "io" libraries shouldn't be used.
In the prototype I partially implemented our own "os" library where, for example, the putenv() and getenv() implementations use the corresponding methods in SystemTools; "io" is currently included though, again for the sake of "luaunit".
There are a few issues with the "io" library:
- the I/O logic in the file() command implementation is pretty sophisticated, and I'd rather not provide facilities to do it differently
- it gives out handles to raw FILE* and requires the user to call close() on them explicitly, otherwise it is called only during garbage collection. I suppose it's nice to have an API that does not allow resource leaks.
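One possible shape for a leak-free API is a scoped helper that owns the handle for the duration of a callback. This is only a sketch of the idea: with_resource is a hypothetical name, and the opener is passed in as a function so that the example does not depend on any particular I/O backend.

```lua
-- Open a resource, hand it to `use`, and always close it afterwards,
-- even if `use` raises an error. Returns the single result of `use`.
local function with_resource(open, use)
  local handle, err = open()
  if not handle then
    return nil, err
  end
  local ok, result = pcall(use, handle)
  handle:close()           -- runs on both the success and the error path
  if not ok then
    error(result, 0)       -- re-raise the original error unchanged
  end
  return result
end
```

The caller never sees the handle outside the callback, so there is nothing left to forget to close; a CMake-provided opener could be plugged in as `open` in place of the standard "io" functions.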
Next, there is module loading. By default Lua has a few loaders to handle require() calls.
For example, one looks for ".lua" files using patterns in the package.path variable, another looks for dynamic libraries using patterns in package.cpath, etc.
I'd like to keep it simple and only look for ".lua" files in CMAKE_MODULE_PATH.
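The lookup itself could be as small as the sketch below. find_lua_module is a hypothetical name, the directory list stands in for CMAKE_MODULE_PATH, and the file-existence check is passed in as a function because the real one would presumably go through CMakeLib rather than Lua's "io" library.

```lua
-- Return the path of the first "<dir>/<name>.lua" that exists, or nil plus a
-- message listing every location that was tried.
local function find_lua_module(name, module_dirs, file_exists)
  local tried = {}
  for _, dir in ipairs(module_dirs) do
    local candidate = dir .. "/" .. name .. ".lua"
    if file_exists(candidate) then
      return candidate
    end
    tried[#tried + 1] = "\n\tno file '" .. candidate .. "'"
  end
  return nil, table.concat(tried)
end
```

In Lua 5.1 a function built on top of this could be installed as the only entry of package.loaders, which would drop the package.path and package.cpath searchers mentioned above.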
Finally, there's another group of functionality that probably shouldn't be available to scripts -- sort of "meta" capabilities: the debug library and loading code from arbitrary files/strings.
Modules
As I mentioned earlier, "cmake standard library" is far from being complete, but here are a few sketches.
state
This module provides the means to interact with cmState.
Working with definitions:
local state = require "state"
local vars = state.vars
vars.MYSTRING = "abc" -- set(MYSTRING "abc")
vars["MYNUMBER"] = 10 -- set(MYNUMBER "10")
vars.MYBOOL:set(true) -- set(MYBOOL "ON")
vars["MYBOOL"] = nil -- unset(MYBOOL)
local v = vars.MYLIST
v:set {"a", 1, true} -- set(MYLIST "a;1;ON")
local s = vars.MYSTRING:value() -- s = "abc"
local n = vars.MYNUMBER:tonumber() -- n = 10
local t = v:tolist() -- t = {"a", "1", "ON"}
if vars.MYBOOL:is_defined() then
  -- not executed
end
-- if (MYLIST)
if v:is_truthy() then
  -- ...
end
Defining a scripted function:
local commands = state.commands
local function dosomething(args)
  if #args == 0 then
    return false, "called with incorrect number of arguments"
  end
  -- ...
  return true
end
commands.myfunction = dosomething
Calling scripted command defined elsewhere:
commands.message("STATUS", "Hello!")
The exported vars, commands and cache are proxy collections -- when indexed by name they return internally cached proxy objects that are nothing more than an object-oriented wrapper around the name.
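To show how such a proxy collection can behave, here is a minimal pure-Lua sketch backed by an ordinary table instead of cmState. It is an illustration only: new_vars and to_cmake_string are hypothetical names, converting false to "OFF" is an assumption, and list splitting ignores CMake's escaping rules.

```lua
-- Convert a Lua value to the string CMake would store.
local function to_cmake_string(v)
  if type(v) == "boolean" then
    return v and "ON" or "OFF"   -- "OFF" for false is an assumption
  end
  if type(v) == "table" then
    local parts = {}
    for i, item in ipairs(v) do
      parts[i] = to_cmake_string(item)
    end
    return table.concat(parts, ";")
  end
  return tostring(v)
end

-- Create a `vars`-like collection on top of `store` (a plain table here).
local function new_vars(store)
  local Var = {}
  Var.__index = Var
  function Var:set(v) store[self.name] = to_cmake_string(v) end
  function Var:value() return store[self.name] end
  function Var:is_defined() return store[self.name] ~= nil end
  function Var:tonumber() return tonumber(store[self.name]) end
  function Var:tolist()
    local out = {}
    for item in string.gmatch(store[self.name] or "", "[^;]+") do
      out[#out + 1] = item
    end
    return out
  end

  local proxies = {}  -- one cached proxy object per variable name
  return setmetatable({}, {
    __index = function(_, name)
      local proxy = proxies[name]
      if not proxy then
        proxy = setmetatable({ name = name }, Var)
        proxies[name] = proxy
      end
      return proxy
    end,
    __newindex = function(_, name, v)
      if v == nil then
        store[name] = nil             -- unset(<name>)
      else
        store[name] = to_cmake_string(v)
      end
    end,
  })
end
```

Because the collection table itself stays empty, every read and write goes through the metamethods, and the proxy holds nothing but the variable name.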
runtime
Currently implements basic support for OOP: functions to create classes and test hierarchies (isinstance, issubclass).
Additionally, it provides a function to generate POD-like classes (with initializers performing runtime type checking) dynamically from a declarative specification. Take a look at the command implementation guide in the !3934 (closed) description.
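For readers unfamiliar with class emulation in Lua, the helpers could look roughly like the sketch below. Only the names isinstance and issubclass come from the text above; the class() constructor, the init method convention and the field names are assumptions made for this example, not the prototype's API.

```lua
-- Create a class table, optionally inheriting from `base`.
-- Calling the class creates an instance and runs its `init` method, if any.
local function class(classname, base)
  local cls = { classname = classname, base = base }
  cls.__index = cls                    -- instances look methods up in cls
  setmetatable(cls, {
    __index = base,                    -- cls falls back to its base class
    __call = function(c, ...)
      local obj = setmetatable({}, c)
      if c.init then
        c.init(obj, ...)
      end
      return obj
    end,
  })
  return cls
end

-- True if `cls` is `ancestor` or derives from it.
local function issubclass(cls, ancestor)
  while cls do
    if cls == ancestor then
      return true
    end
    cls = rawget(cls, "base")
  end
  return false
end

-- True if `obj` was created by `cls` or by one of its subclasses.
local function isinstance(obj, cls)
  return type(obj) == "table" and issubclass(getmetatable(obj), cls)
end
```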
argparse
This module exports the ArgumentParser class to help with parsing command arguments.
The parser is configured declaratively (again, see the description of !3934 (closed)).
Unlike the namesake Lua/Python libraries, the parser produces a prepared command instance.
Currently it is pretty basic (in the prototype only 2 commands are implemented in Lua), but, for example, sub-commands are rather straightforward to add.
Ideally, a parser for any builtin command's interface should be reasonably easy to create.
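The actual ArgumentParser interface is described in !3934 and is not reproduced here. As a rough illustration of declarative parsing of a CMake-style argument list, the sketch below uses a hypothetical spec format (options, one_value, multi_value) modeled on the keyword groups of cmake_parse_arguments(), and returns a plain table rather than a command instance.

```lua
-- Parse a flat list of string arguments according to a declarative spec.
-- spec.options:     keywords that act as boolean flags
-- spec.one_value:   keywords followed by exactly one value
-- spec.multi_value: keywords followed by any number of values
local function parse_arguments(spec, args)
  local result = { positional = {} }
  local kind = {}
  for _, name in ipairs(spec.options or {}) do
    kind[name] = "option"
    result[name] = false
  end
  for _, name in ipairs(spec.one_value or {}) do kind[name] = "one" end
  for _, name in ipairs(spec.multi_value or {}) do kind[name] = "multi" end

  local current  -- keyword that is currently collecting values, if any
  for _, arg in ipairs(args) do
    local k = kind[arg]
    if k == "option" then
      result[arg] = true
      current = nil
    elseif k == "one" then
      current = arg
    elseif k == "multi" then
      current = arg
      result[arg] = result[arg] or {}
    elseif current and kind[current] == "one" then
      result[current] = arg
      current = nil
    elseif current then
      table.insert(result[current], arg)
    else
      table.insert(result.positional, arg)
    end
  end
  return result
end
```

For example, a spec with options {"QUIET"}, one_value {"VERSION_VAR"} and multi_value {"REQUIRED_VARS"} is enough to split a call such as (Lua REQUIRED_VARS A B VERSION_VAR V QUIET) into its parts.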
list
Implements a generic List class.
For now it is used to specify the type of certain command parameters.
Most list() sub-commands will be available as methods of this class.