Compile-time vs. compile-time
- By H. S. Teoh, March 2017
One of D's oft-touted features is its awesome compile-time capabilities, which open up wonderful meta-programming opportunities, code-generation techniques, DSLs that are transformed into code at compile-time and therefore incur zero runtime overhead, and so on. Acronyms like CTFE have become common parlance amongst D circles.
However, said "compile-time" capabilities are also often the source of much confusion and misunderstanding, especially on the part of newcomers to D, often taking the form of questions posted to the discussion forum by frustrated users such as: "Why can't the compiler read this value at compile-time, since it's clearly known at compile-time?!", or "Why do I have to write it in this convoluted, roundabout way instead of the straightforward way?"
This article hopes to clear up most of these misunderstandings by explaining just what exactly D's "compile-time" capabilities are, give a brief overview of how it works, and thereby hopefully give newcomers to D a better handle on what exactly is possible, and what to do when you run into a snag.
Contents
There's compile-time, and then there's compile-time
Part of the confusion is no thanks to the overloaded term "compile-time". It sounds straightforward enough -- "compile-time" is just the time when the compiler performs its black magic of transforming human-written D code into machine-readable executables. Therefore, if feature X is a "compile-time" feature, and feature Y is another "compile-time" feature, then surely X and Y ought to be usable in any combination, right? Since, after all, it all happens at "compile-time", so the compiler ought to be able to just sort it all out magically.
The reality, of course, is a bit more involved than this. There are, roughly speaking, at least two distinct categories of D features that are commonly labelled "compile-time":
- Template expansion, or abstract syntax tree (AST) manipulation; and
- Compile-time function evaluation (CTFE).
While these two take place at "compile-time", they represent distinct phases in the process of compilation, and understanding this distinction is the key to understanding how D's "compile-time" features work.
(There are, of course, finer distinctions in the process of compilation than these two categories, but for our purposes, we don't have to worry about such details.)
Template expansion / AST manipulation
One of the first things the compiler does when it compiles your code, is to transform the text of the code into what is commonly known as the Abstract Syntax Tree (AST).
For example, this program:
import std.stdio;
void main(string[] args)
{
writeln("Hello, world!");
}
is parsed into something resembling this:
(Note: this is not the actual AST created by the compiler; it is only a crude example to illustrate the point. The AST created by the compiler would be more detailed and have more information stored in each node.)
The AST represents the structure of the program as seen by the compiler, and contains everything the compiler needs to eventually transform the program into executable machine code.
One key point to note here is that in this AST, there are no such things as variables, memory, or input and output. At this stage of compilation, the compiler has only gone as far as building a model of the program structure. In this structure, we have identifiers like args
and writeln
, but the compiler has not yet attached semantic meanings to them yet. That will be done in a later stage of compilation.
Part of D's powerful "compile-time" capabilities stem from the ability to manipulate this AST (to some extent) as the program is being compiled. Among the features that D offers are templates and static if
.
Templates
- If you are already familiar with the basics of templates, you may want to skip to the following section.
One of D's powerful features is templates, which are similar to C++ templates. Templates can be thought of as code stencils, or stencils of a subtree of the AST, that can be used to generate AST subtrees. For example, consider the following template struct:
struct Box(T)
{
T data;
}
In D, this is shorthand for:
template Box(T)
{
struct Box
{
T data;
}
}
Its corresponding AST tree looks something like this:
When you instantiate the template with a declaration like:
Box!int intBox;
for example, the type Box!int
may not yet be defined. However, the compiler automatically makes a copy of the AST subtree under the TemplateBody
node and substitute int
for every occurrence of T
in it. This generated AST subtree is then included among the program's declarations:
Which corresponds to this code fragment:
struct Box!int
{
int data;
}
(Note that you cannot actually write this in your source code; the name Box!int
is reserved for the template expansion process and cannot be directly defined by user code.)
Similarly, if you instantiate the same template with a different declaration, such as:
Box!float floatBox;
it is as if you had declared something like this:
struct Box!float
{
int data;
}
Box!float floatBox;
In effect, you are creating "virtual AST subtrees" every time you instantiate a template, which get grafted into your program's AST when the template is instantiated. This feature is great for avoiding boilerplate code: you can factor out commonly-repeated bits of code into a template, and thereby adhere to the DRY (Don't Repeat Yourself) principle.
static if
D templates are only the beginning of what D is capable of doing. Another powerful tool in the AST manipulation phase of D compilation is static if
. For example:
struct S(bool b)
{
static if (b)
int x;
else
float y;
}
The static if
here means that the boolean parameter b
is evaluated when the compiler is expanding the template S
. The value of must be known at the time the template is being expanded. In D circles, we often say that the value must be known "at compile-time", but it is important to more precise. We will elaborate on this more later.
If the value of b
is true
, then the else
branch of the static if
is pruned away from the expanded template. That is, when you write:
S!true s;
it is as if you declared:
struct S!true
{
int x;
}
Note that the else
branch is completely absent from the expanded template. This is a very important point.
Similarly, when you write:
S!false t;
it is as if you had declared:
struct S!false
{
float y;
}
Note that the if
branch is completely absent from the expanded template. This is also a very important point.
In other words, static if
is a choice that affects the effective AST seen by later compilation phases.
To drive this point home further: at this stage, when static if
is evaluated, notions like variables, memory, and I/O don't exist yet. We are manipulating the structure of the program; not its execution. In the above code, x
and y
are merely identifiers; they have not yet been assigned any semantic meanings like variables or data fields that reside in a concrete offset in the struct
. This is done in subsequent stages of compilation.
Why is this point so important? Because it relates to a common misunderstanding of D's "compile-time" capabilities as it relates to CTFE, or Compile-Time Function Evaluation. Let's talk about that next.
CTFE
CTFE stands for Compile-Time Function Evaluation. This is an extremely powerful feature that D offers, and is similar to C++'s constexpr
feature (though in the author's opinion far more powerful).
The first and most important thing to understand about CTFE is that it happens after the AST manipulation phase has been completed. More precisely, it happens when the compiler has "finalized" the AST of that part of the program, and is now ready to assign semantic meaning to the various constructs represented by the AST. Identifiers are assigned meanings as modules, functions, function arguments, variables, and so on, and other semantic analyses such as VRP (Value Range Propagation) are performed.
Constant-folding, glorified
Part of this semantic analysis is constant-folding. For example, if we wrote something like this:
int i = 3*(5 + 7);
it would be a waste of computational resources to perform the indicated calculation (add 5 to 7, multiply the result by 3) at runtime, because all the arguments of the calculation are constants known to the compiler, and the result will never change at runtime. Of course, this particular example is trivial, but imagine if this line of code were inside a busy inner loop in a performance-critical part of the program. If we folded this constant expression into its resulting value, the program would run faster, since it wouldn't have to repeat this same calculation over and over, and indeed, it wouldn't even need to perform the calculation at all, since the answer is already known by the compiler.
The D compiler detects these constant expressions and replaces them with their value: it is as if we wrote:
int i = 36;
This process is called constant-folding, and is implemented by basically all modern compilers of any language today. So it is nothing to get excited about. The D compiler, however, takes this game to a whole new level. Consider, for example, this code:
int f(int x)
{
return x + 1;
}
int i = 3*(5 + f(6));
Again, if we perform the calculation manually, we see that the value of the expression is 36. However, this time, the constants involved in the expression are masked by the function call to f
. But since the definition of f
is visible to the compiler, and all it does is to add a constant value 1 to its argument, the compiler should be able to deduce that, again, the expression is constant and does not need to be performed at runtime.
But what if f
was more complicated than merely adding a constant to its argument, for example:
int f(int x)
{
int a = 0;
for (int i=1; i <= x/2; i++)
{
a += i;
}
return a + 1;
}
int i = 3*(5 + f(6));
Again, the value of f(6)
is constant, but in order to know its value, the compiler has to effectively run this function during compilation. And in order to do that, the compiler essentially compiles the body of f
into a state where it can be run inside what amounts to a D virtual machine inside the compiler.
This, in a nutshell, is how CTFE came about, in D's history. There is a limit as to how much this D virtual machine can do, but there's an ongoing trend of expanding its capabilities as far as possible. As of this writing, the D compiler is able to execute significant chunks of the Phobos standard library during compilation, thus making many library features accessible during compilation without needing to implement them the second time inside the compiler. Furthermore, a focused effort is being spearheaded by Stephen Koch to replace the current CTFE engine with an even more powerful one based on a bytecode interpreter, that promises superior CTFE performance and better memory management, and eventually, more features.
A different "compile-time"
Coming back to the topic at hand, though, notice that when we speak of CTFE, we speak of "virtual machines" and "bytecode interpreters". This implies that by this point, the code has gone far enough through the compilation process that it is essentially ready to be turned into runtime executable code.
In particular, this means that it has long passed the AST manipulation stage. Which in turn implies that code that can be evaluated by CTFE can no longer make use of AST manipulation constructs like static if
. This is because in order for CTFE to work, things such as variables and memory must exist, otherwise there is nothing to execute or interpret. But in the AST manipulation phase, notions such as variables and memory don't exist yet -- we're still manipulating the structure of the program.
Thus, even though CTFE happens at "compile-time" just as AST manipulation happens at "compile-time", this is actually a different "compile-time". It is much closer to "runtime" than AST manipulation, because by this point, the code is practically ready to be transformed into machine code. AST manipulation represents a much earlier stage of the compilation process. This is why the terminology "compile-time" is confusing, because it gives the false impression that all of these features, AST manipulation and CTFE alike, are lumped together into a single, amorphous blob of time labelled "compile-time", and that the compiler can somehow make it all magically just work, by fiat.
A fixed principle
So, to summarize: D's "compile-time" capabilities can be grouped into two distinct phases, AST manipulation and CTFE, and AST manipulation happens before CTFE. We may denote this by a unidirection arrow between the two phases:
- AST manipulation → CTFE
meaning that code can only progress one way, from AST manipulation to CTFE, not the other way round. This is a fixed principle.
Of course, in practice, this simple picture is only part of the story. To understand how it all works, though, it's best to look at actual code examples. So let's now take a look at a few common pitfalls that D learners often run into, and see how this principle applies in practice.
Case Study: Reading CTFE variables at AST manipulation time
A rather common complaint that's brought up in the D forums from time to time pertains to code along these lines:
int ctfeFunc(bool b)
{
static if (b) // <--- compile error at this line
return 1;
else
return 0;
}
// N.B.: the enum forces the compiler to evaluate the function in CTFE
enum myInt = ctfeFunc(true);
If you try to compile the above code, the compiler will complain that the static if
cannot read the value of b
at "compile-time". Which almost certainly elicits the reaction, "What??! What do you mean you can't read b
at compile-time?! Aren't you running this code in CTFE, which is by definition compile-time, with a value of b
that's obviously known at compile-time?"
On the surface, this would appear to be a glaring bug in the compiler, or a glaring shortcoming in D's "compile-time" capabilities, and/or an astonishing lack of competence on the part of the D compiler authors in failing to notice a problem in such an obvious and simple use case for CTFE.
If we understand what's really going on, however, we would see why the compiler rejected this code. Remember that during the process of compilation, the D compiler first creates an AST of the code, evaluating any static if
s that may change the shape of the resulting AST.
So when the compiler first encounters the declaration of ctfeFunc
, it scans its body and sees the static if (b)
while building the AST for this function. If the value of b
is true
, then it would emit the AST tree that essentially corresponds with:
int ctfeFunc(bool b)
{
return 1;
}
(Recall that in the AST manipulation stage, the false branch of a static if
is discarded from the resulting AST, and it is as if it wasn't even there. So the return 0
doesn't even make it past this stage.)
If the value of b
is false
, then it would emit the AST tree that corresponds with:
int ctfeFunc(bool b)
{
return 0;
}
There is a problem here, however. The value of b
is unknown at this point. All the compiler knows about b
at this point is that it's an identifier representing a parameter of the function. Semantics such as what values it might hold haven't been attached to it yet. In fact, the compiler hasn't even gotten to the enum
line that calls ctfeFunc
with an argument of true
yet!
And even if the compiler did get to the enum
line, it wouldn't have been able to assign a value to b
, because the function's AST is still not fully processed yet. You can't assign values to identifiers in an AST that hasn't been fully constructed yet, because the meaning of the identifiers may change once the AST is altered. It is simply too early at this point to meaningfully assign any value to b
. The notion of assigning a value to a parameter is a semantic concept that can only be applied after the AST manipulation phase. But in order to fully process the AST, the compiler needs to know the value of b
. Yet the value of b
cannot be known until after the AST has been processed. This is an insoluble impasse, so the compiler gives up and emits a compile error.
Solution 1: Make it available during AST manipulation
In this case, a simple solution is to make the value of b
available during the AST manipulation phase. The simplest way to do this is to turn ctfeFunc
into a template function with b
as a template parameter, with the corresponding change in the enum
line to pass true
as a template argument rather than a runtime argument:
int ctfeFunc(bool b)() // N.B.: the first set of parentheses enclose template parameters
{
static if (b) // Now this compiles without error
return 1;
else
return 0;
}
enum myInt = ctfeFunc!true;
Since b
is now a template argument, its value is known during AST manipulation, and so the static if
can be compiled without any problems.
Solution 2: Do everything during AST manipulation instead
The foregoing solution appears to be nice and simple, and it works, but if we consider it more carefully, we see that after the AST manipulation phase, the function has essentially become:
int ctfeFunc() // N.B.: template parameters no longer exist after AST manipulation phase
{
return 1; // N.B.: else branch of static if has been discarded
}
This means that CTFE wasn't actually necessary to evaluate this function in the first place! We could have just as easily declared ctfeFunc as a template, completely evaluated in the AST manipulation phase (and we might as well name it something other than ctfeFunc
, since it would be no longer evaluated in CTFE, and no longer even a function):
template Value(bool b)
{
static if (b)
enum Value = 1;
else
enum Value = 0;
}
enum myVal = Value!true;
Now myVal
can be completely evaluated at the AST manipulation phase, and CTFE doesn't even need to be involved.
Solution 3: Move everything to CTFE
There is another solution, however. Although the example we have given is rather trivial, in practice CTFE functions tend to be a lot more involved than a mere if-condition over a boolean parameter. Some functions may not be amenable to be rewritten in template form. So what do we do in that case?
The answer may surprise you: get rid of the static if
, and replace it with a plain old "runtime" if
! Like this:
int ctfeFunc(bool b)
{
if (b) // <--- N.B.: regular if, not static if
return 1;
else
return 0;
}
// N.B.: the enum forces the compiler to evaluate the function in CTFE
enum myInt = ctfeFunc(true);
And, miracle of miracles, this code compiles without any error, and with the correct value for myInt
! But wait a minute. What's going on here? How can changing static if
to if
, which supposedly runs at "runtime", possibly work for a value that is needed at "compile-time"? Is the compiler cheating and secretly turning myInt
into a runtime computation behind our backs?
Actually, inspecting the executable code with a disassembler shows that no such cheating is happening. So how does this work?
Let's take a close look. This time, during the AST manipulation phase, nothing much happens besides the usual construction of the AST corresponding with ctfeFunc
. There are no static if
s or other template-related constructs anymore, so the resulting AST is as straightforward as it can get. Then the compiler sees the enum
declaration and realizes that it needs to evaluate ctfeFunc
at "compile-time".
Here is where something interesting happens. Based on the above discussion, you may think, the compiler is still in the "AST manipulation" stage (because it hasn't fully constructed the AST for myInt
yet), so wouldn't this fail, since ctfeFunc
hasn't gotten to the semantic analysis phase yet? Note, however, that while it is certainly true that the AST for myInt
hasn't been fully resolved yet (and therefore we can't read the value of myInt
from CTFE at this point in time), the AST for ctfeFunc
, by this point, is already ready to proceed to the next stage of compilation. Or, more precisely, the subtree of the program's AST corresponding with ctfeFunc
is complete, and can be handed over to semantic analysis now.
So the D compiler, being pretty smart about these things, realizes that it can go ahead with semantic analysis of ctfeFunc
independently of the fact that other parts of the program's AST, namely the declaration of myInt
, aren't quite done yet. This works, because ctfeFunc
's subtree of the AST can be semantically analyzed as a unit on its own, without depending on myInt
's part of the subtree at all!
Thus, the compiler processes the AST of ctfeFunc
and brings it to the point where it can be interpreted by the CTFE engine. The CTFE engine is then invoked to run ctfeFunc
with the value true
as its argument -- essentially simulating what it would have done at runtime. The return value of 1 is then handed back to the AST manipulation code that's still waiting to finish processing the AST for myInt
, upon which the latter AST also becomes fully constructed.
Perhaps this "smart" invocation of CTFE interleaved with AST manipulation is part of what imparts some validity to the term "compile-time" as referring to the entire process as a whole. However, hopefully by now you have also acquired a better understanding of why there are really (at least) two different "compile-times": an early phase where AST manipulation happens, and a later phase at CTFE which is very much like runtime except that it just so happens to be done inside the compiler.
Case Study: pragma(msg) and CTFE
Another common complaint by D learners arises from trying to debug CTFE functions, and pertains to the "strange" way in which pragma(msg)
behaves in CTFE.
First of all, if you're not familiar with it, pragma(msg)
is a handy compiler feature that lets you debug certain compile-time processes. Every time the compiler sees a pragma(msg)
directive during compilation, it outputs whatever message is specified as arguments to the pragma
. For example:
template MyTemplate(T)
{
pragma(msg, "instantiating MyTemplate with T=" ~ T.stringof);
// ... insert actual code here
}
This causes the compiler to print "instantiating MyTemplate with T=..." each time MyTemplate
is instantiated, which can be a useful debugging tool to trace exactly what instantiations are being used in the code.
So far so good.
Complaints, however, tend to arise when people attempt to do things like this:
int ctfeFunc(int x)
{
if (x < 100)
return x;
else
{
pragma(msg, "bad value passed in");
return -1;
}
}
enum x = ctfeFunc(50); // N.B.: enum forces CTFE on ctfeFunc
Even though the argument 50 is well within the bounds of what ctfeFunc
can handle, the compiler persistently prints "bad value passed in". And it does the same if the argument is changed to something the function ostensibly rejects, like 101. What gives?
"Compiler bug!" some would scream.
By now, however, hopefully you can guess at the answer.