DIP50
Title: | AST Macros |
---|---|
DIP: | 50 |
Version: | 1 |
Status: | Draft |
Created: | 2013-11-10 |
Last Modified: | 2013-11-10 |
Author: | Jacob Carlborg |
Abstract
The basic concept of AST (Abstract Syntax Tree) macros, or just syntax macros, is fairly simple. A macro is just like any other function or method except that it only runs at compile time. When a macro is called, instead of evaluating its arguments and then calling the function, an AST is created for each argument passed to the function. The macro then returns a new AST which is injected and type checked at the call site. This means that the call to the macro is replaced with the AST returned by the macro.
Example
macro myAssert (Context context, Ast!(bool) val, Ast!(string) str = null)
{
    auto message = "Assertion failure: " ~ (str ? str.eval : val.toString());
    auto msgExpr = literal(constant(message));
    return <[
        if (!$val)
            throw new AssertError($msgExpr);
    ]>;
}
void main ()
{
    myAssert(1 + 2 == 4);
}
Compiling and running the above program would result in the following assert error:
core.exception.AssertError@main(13): Assertion failure: 1 + 2 == 4
The interesting part here is that the assert message contains the actual expression that failed.
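For comparison, a rough approximation of myAssert can be written in D today with a string mixin, at the cost of passing the condition as a string. The sketch below is not part of the proposal; the helper names assertionMessage and myAssertCode are made up for illustration.

```d
import core.exception : AssertError;

// Hypothetical helper: builds the failure message from the source text of the condition.
string assertionMessage(string expr)
{
    return "Assertion failure: " ~ expr;
}

// Hypothetical helper: returns D source code that checks `expr` and throws on failure.
// The token string q{...} carries the expression text into the message, so the
// expression must consist of valid D tokens with balanced braces.
string myAssertCode(string expr)
{
    return "if (!(" ~ expr ~ ")) throw new AssertError(assertionMessage(q{" ~ expr ~ "}));";
}

void main()
{
    try
    {
        // Unlike the macro version, the condition has to be written as a string.
        mixin(myAssertCode("1 + 2 == 4"));
    }
    catch (AssertError e)
    {
        assert(e.msg == "Assertion failure: 1 + 2 == 4");
    }
}
```

With an AST macro the caller would write the plain expression, and the macro would obtain its source text from the AST.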
Rationale
AST macros can be used to extend the language with new semantics without changing the language itself. Instead of adding new features to the language, AST macros offer a general way to implement language changes in library code. Many existing language features, such as scope, foreach and similar constructs, could have been implemented with AST macros.
Formal Definition
Declaring a Macro
A macro is always declared with the macro keyword followed by its name and a parameter list. The first parameter of a macro is always of the type Context, so the parameter list cannot be empty. The rest of the parameters are always of the type Ast. A macro must always return either void or a value of the type Ast.
macro foo (Context context, Ast!(string) str)
{
    return str;
}
Calling a Macro
A macro is called just like any other function. The first parameter, which is of the type Context, is passed implicitly by the compiler. The rest of the arguments are passed as in a regular function call. The caller does not pass arguments of the type Ast; regular values are passed instead, and the compiler creates an AST for each argument and passes it to the macro as an Ast argument.
The Context Parameter
The first parameter of a macro declaration is always of the type Context. This parameter is usually passed implicitly by the compiler, but it can also be passed manually. This is useful when a macro has helper functions that need to retain the context given to the original macro.
The context parameter contains information about the surrounding context where the macro was called. This can be information like the surrounding method and class from which the macro was called.
Bonus
This context parameter also contains information about the complete compilation environment, like:
- The arguments used when the compiler was invoked
- Functions for emitting messages of various verbosity levels, like error, warning and info
- Functions for querying various types of settings/options, like which versions are defined, whether "debug" or "release" is defined, and so on
- In general, as much as possible of what the compiler knows about the compile run
- An associative array with references to all variables in scope at the point where the macro is invoked
This has the benefit of enabling a macro to check the variables that are "passed" to it as well as modify them. If the macro is called from within a class or struct, the value of this should also be available to the macro.
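A small part of this information can already be obtained in D today, which may help to picture what the context would carry. The sketch below uses default arguments that the compiler fills in at the call site; the CallSite struct and the here and isDebugBuild functions are hypothetical names for illustration, not the proposed Context API.

```d
import std.algorithm.searching : endsWith;

// Hypothetical stand-in for a small part of the proposed Context type.
struct CallSite
{
    string file;
    size_t line;
    string func;
}

// Default arguments such as __FILE__, __LINE__ and __FUNCTION__ are evaluated
// at the call site, so the caller never passes them explicitly.
CallSite here(string file = __FILE__, size_t line = __LINE__, string func = __FUNCTION__)
{
    return CallSite(file, line, func);
}

// Querying a compilation setting: whether the code is compiled with -debug.
bool isDebugBuild()
{
    debug
        return true;
    else
        return false;
}

void main()
{
    auto site = here();
    assert(site.line > 0);
    assert(site.func.endsWith("main")); // fully qualified name of the caller
}
```

The proposed context goes much further than this, since it would also expose the variables in scope and the compiler's own state.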
Semantics
Since macros can only be called at compile time, the compiler will strip out all macros before the code generation phase.
Quasi-Quoting
Quasi-quoting is basically a form of syntax sugar for creating syntax trees. It can be thought of as AST literals. In all examples in this text the following syntax is used for quasi-quoting:
<[ writeln("asd"); ]>
Splicing
Splicing is a syntax for dynamically inserting a piece of an AST into quasi-quotes. In all examples in this text a dollar sign, $, is used for splicing.
<[ writeln($expr); ]>
The syntax for quasi-quoting and splicing is just an abstraction of what the syntax would actually look like. Regardless of what syntax is used there's always the option to manually create syntax trees using an API.
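The closest existing analogue in D is a token string combined with a string mixin, where splicing is done by string concatenation. The sketch below only illustrates the idea with source text instead of syntax trees; writelnOf is a hypothetical helper, not part of the proposal.

```d
import std.stdio : writeln;

// A token string q{...} holds a piece of D source code and plays the role
// of the quasi-quote <[ ... ]> here.
enum quoted = q{ writeln("asd"); };

// Hypothetical helper: "splices" the source text of an expression into a
// larger piece of code by concatenation, like <[ writeln($expr); ]>.
string writelnOf(string expr)
{
    return "writeln(" ~ expr ~ ");";
}

void main()
{
    mixin(quoted);               // expands to: writeln("asd");
    mixin(writelnOf("1 + 2"));   // expands to: writeln(1 + 2);
}
```

Because the pieces are plain strings, nothing is checked until the final mixin is compiled. Quasi-quotes would operate on syntax trees instead.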
The AST Macro
The ast macro is an option for implementing quasi-quoting as a library macro. The ast macro takes an arbitrary expression and transforms it into an AST. It also supports splicing. The ast macro has a couple of overloads, and their declarations look as follows:
macro ast (T) (Context context, Ast!(T) expr)
{
    // ...
}
macro ast (Context context, Ast!(void delegate ()) block)
{
    // ...
}
The first overload takes an arbitrary expression and converts it into an AST. The second overload takes a delegate, which makes it possible to convert a whole block of code to an AST.
Bonus
Calling a Macro
Macros are extended to be callable from anywhere a mixin can be used.
Statement Macros
A statement macro is a macro that takes a Statement as its last parameter. The difference compared to regular macros is the calling syntax. Statement macros are called with the same syntax used for statements, as in the example below:
macro foo (Context context, Statement block)
{
    return block;
}
macro bar (Context context, Ast!(int) arg, Statement block)
{
    return block;
}
void main ()
{
    foo
    {
        writeln("foo");
        writeln("foo again");
    }
    foo
        writeln("foo2");
    bar(3) {
        writeln("bar");
        writeln("bar again");
    }
    bar(3)
        writeln("bar2");
}
Just like with many of the built-in statements, the braces are optional when the block consists of a single statement. Since the statement is always the last parameter in the macro declaration and is always passed outside the regular argument list, it is legal to have parameters with default arguments or a variadic parameter list before the statement parameter.
macro foo (Context context, Ast!(string)[] arg ..., Statement block)
{
    return block;
}
macro bar (Context context, Ast!(string) fmt = null, Statement block)
{
    return block;
}
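To see what the statement calling syntax buys, it can be compared with what is possible in D today, where the block has to be passed as a delegate inside the argument list. This is only a sketch for comparison: these functions run the block instead of receiving its syntax tree, and countRuns is a hypothetical helper added to check the behaviour.

```d
// Stand-ins for the statement macros foo and bar: ordinary functions that
// take the block as a delegate and run it, instead of receiving its AST.
void foo(scope void delegate() block)
{
    block();
}

void bar(int arg, scope void delegate() block)
{
    block();
}

// Hypothetical helper used to check that both blocks are executed.
int countRuns()
{
    int runs;
    foo({ runs++; });      // instead of: foo { ... }
    bar(3, { runs++; });   // instead of: bar(3) { ... }
    return runs;
}

void main()
{
    assert(countRuns() == 2);
}
```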
Attribute macros
An attribute macro is a macro that acts like a user defined attribute. It can be used anywhere a user defined attribute can be used. When an attribute macro is used on a declaration, instead of attaching a value to the declaration the macro is called and the AST of the declaration is passed as the last parameter to the macro. The declaration is replaced with whatever syntax tree the macro returns.
An attribute macro always takes a Declaration as its last parameter. The same rules about default arguments and variadic parameters that apply to statement macros apply to attribute macros as well.
macro attr (Context context, Declaration decl)
{
    auto attrName = decl.name;
    auto type = decl.type;
    return <[
        private $decl.type _$decl.name;
        $decl.type $decl.name ()
        {
            return _$decl.name;
        }
        $decl.type $decl.name ($decl.type value)
        {
            return _$decl.name = value;
        }
    ]>;
}
class Foo
{
    @attr int bar;
}
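Assuming the splices are filled in with the type (int) and name (bar) of the declaration, the class above would end up looking roughly like the following plain D code. The main function is only added here to exercise the generated accessors.

```d
// What `@attr int bar;` would become after the macro has run, assuming the
// splices are filled in with the declaration's type (int) and name (bar).
class Foo
{
    private int _bar;

    int bar ()
    {
        return _bar;
    }

    int bar (int value)
    {
        return _bar = value;
    }
}

void main ()
{
    auto foo = new Foo;
    foo.bar = 3;           // calls the generated setter
    assert(foo.bar == 3);  // calls the generated getter
}
```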
Use cases
The following examples show how AST macros could be used to extend the language.
Linq
Linq is a .NET library that incorporates searching and manipulation of data. A C# example is:
using System;
using System.Linq;
class Program
{
    static void Main()
    {
        int[] array = { 1, 2, 3, 6, 7, 8 };
        var elements = from element in array
                       where element > 5
                       select element;
        foreach (var element in elements)
        {
        }
    }
}
This could be implemented by an end user as:
import linq;
import std.stdio;
void main() {
    int[] array = [1, 2, 3, 6, 7, 8];
    int[] data;
    query {
        from element in array
        where element > 5
        add element to data
    }
}
That code would be converted to:
import linq;
void main() {
    int[] array = [1, 2, 3, 6, 7, 8];
    int[] data;
    foreach (element; array) {
        if (element > 5) data ~= element;
    }
}
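For comparison, the same result can be obtained in D today without macros by using range functions from the standard library. This is not the proposed query syntax, only a sketch of equivalent library code; largerThanFive is a hypothetical helper name.

```d
import std.algorithm.iteration : filter;
import std.array : array;

// Hypothetical helper: the same selection as the query block, written with
// std.algorithm instead of a macro.
int[] largerThanFive(int[] input)
{
    return input.filter!(element => element > 5).array;
}

void main()
{
    int[] data = largerThanFive([1, 2, 3, 6, 7, 8]);
    assert(data == [6, 7, 8]);
}
```

A macro-based query block would let a library offer its own surface syntax on top of code like this.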
C#'s ability to specify the variable that receives the result is not required, at least for this example. However, it should be possible to specify it, e.g.
query {
    int data
    from element in array
    where element > 5
    select element
}
This would be closer to the C# syntax.
That code would be converted to:
import linq;
void main() {
    int[] array = [1, 2, 3, 6, 7, 8];
    int[] data;
    foreach (element; array) {
        if (element > 5) data ~= element;
    }
}
To improve on this, a macro should be able to get the variables currently declared in scope. This would make it possible to check whether a variable, e.g. the array, is defined and, if it is not, to give a good compiler error. It would also make it possible to determine the element type from the given array instead of specifying it explicitly.
Calculation
Consider a simple macro that adds two numbers together and returns the result. For this to work, the requested values must be available to the macro. This is possible if the variables in scope are passed by reference through the context.
func(1, 2); // example args
void func(int i, int i2) {
    foo {
        output
        i, i2
    }
}
macro foo (Context context, Ast!(string) str)
{
    string outputVariable = // get return through str
    string name1 = // get i through str
    string name2 = // get i2 through str
    return "auto " ~ outputVariable ~ " = " ~ text(context.scopeVariables!int(name1) + context.scopeVariables!int(name2)) ~ ";";
}
When unrolled it will become:
void func(int i, int i2) {
auto output = 3;
}
This essentially emulates pure functions. However, as stated in the Linq example, access to the variables in scope would also enable checking of variables and types as required.
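For reference, D can already evaluate a pure function at compile time (CTFE) when its arguments are compile-time constants. The sketch below shows only this existing mechanism, not the proposed macro; add is a hypothetical function, and unlike the macro above it cannot read the run-time parameters i and i2.

```d
// A pure function that the compiler can evaluate during compilation (CTFE).
int add(int a, int b) pure
{
    return a + b;
}

void func()
{
    // Forcing compile-time evaluation with enum; this only works for values
    // known at compile time.
    enum output = add(1, 2);
    static assert(output == 3);
}

void main()
{
    func();
}
```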