Compile-time vs. compile-time

From D Wiki
Revision as of 16:56, 22 March 2017 by Quickfur (talk | contribs) (AST manipulation and CTFE)
Jump to: navigation, search
By H. S. Teoh, March 2017

One of D's oft-touted features is its awesome compile-time capabilities, which open up wonderful meta-programming opportunities, code-generation techniques, DSLs that are transformed into code at compile-time and therefore incur zero runtime overhead, and so on. Acronyms like CTFE have become common parlance amongst D circles.

However, said "compile-time" capabilities are also often the source of much confusion and misunderstanding, especially on the part of newcomers to D, often taking the form of questions posted to the discussion forum by frustrated users such as: "Why can't the compiler read this value at compile-time, since it's clearly known at compile-time?!", or "Why do I have to write it in this convoluted, roundabout way instead of the straightforward way?"

This article hopes to clear up most of these misunderstandings by explaining just what exactly D's "compile-time" capabilities are, give a brief overview of how it works, and thereby hopefully give newcomers to D a better handle on what exactly is possible, and what to do when you run into a snag.

There's compile-time, and then there's compile-time

Part of the confusion is no thanks to the overloaded term "compile-time". It sounds straightforward enough -- "compile-time" is just the time when the compiler performs its black magic of transforming human-written D code into machine-readable executables. Therefore, if feature X is a "compile-time" feature, and feature Y is another "compile-time" feature, then surely X and Y ought to be usable in any combination, right? Since, after all, it all happens at "compile-time", so the compiler ought to be able to just sort it all out magically.

The reality, of course, is a bit more involved than this. There are, roughly speaking, at least two distinct categories of D features that are commonly labelled "compile-time":

  • Template expansion, or abstract syntax tree (AST) manipulation; and
  • Compile-time function evaluation (CTFE).

While these two take place at "compile-time", they represent distinct phases in the process of compilation, and understanding this distinction is the key to understanding how D's "compile-time" features work.

(There are, of course, finer distinctions in the process of compilation than these two categories, but for our purposes, we don't have to worry about such details.)

Template expansion / AST manipulation

One of the first things the compiler does when it compiles your code, is to transform the text of the code into what is commonly known as the Abstract Syntax Tree (AST).

For example, this program:

import std.stdio;
void main(string[] args)
{
    writeln("Hello, world!");
}

is parsed into something resembling this:

AST.svg

(Note: this is not the actual AST created by the compiler; it is only a crude example to illustrate the point. The AST created by the compiler would be more detailed and have more information stored in each node.)

The AST represents the structure of the program as seen by the compiler, and contains everything the compiler needs to eventually transform the program into executable machine code.

One key point to note here is that in this AST, there are no such things as variables, memory, or input and output. At this stage of compilation, the compiler has only gone as far as building a model of the program structure. In this structure, we have identifiers like args and writeln, but the compiler has not yet attached semantic meanings to them yet. That will be done in a later stage of compilation.

Part of D's powerful "compile-time" capabilities stem from the ability to manipulate this AST (to some extent) as the program is being compiled. Among the features that D offers are templates and static if.

Templates

If you are already familiar with the basics of templates, you may want to skip to the following section.

One of D's powerful features is templates, which are similar to C++ templates. Templates can be thought of as code stencils, or stencils of a subtree of the AST, that can be used to generate AST subtrees. For example, consider the following template struct:

struct Box(T)
{
    T data;
}

In D, this is shorthand for:

template Box(T)
{
    struct Box
    {
        T data;
    }
}

Its corresponding AST tree looks something like this:

Template1.svg

When you instantiate the template with a declaration like:

Box!int intBox;

for example, the type Box!int may not yet be defined. However, the compiler automatically makes a copy of the AST subtree under the TemplateBody node and substitute int for every occurrence of T in it. This generated AST subtree is then included among the program's declarations:

Template-example1.svg

Which corresponds to this code fragment:

struct Box!int
{
    int data;
}

(Note that you cannot actually write this in your source code; the name Box!int is reserved for the template expansion process and cannot be directly defined by user code.)

Similarly, if you instantiate the same template with a different declaration, such as:

Box!float floatBox;

it is as if you had declared something like this:

struct Box!float
{
    int data;
}

Box!float floatBox;

In effect, you are creating "virtual AST subtrees" every time you instantiate a template, which get grafted into your program's AST when the template is instantiated. This feature is great for avoiding boilerplate code: you can factor out commonly-repeated bits of code into a template, and thereby adhere to the DRY (Don't Repeat Yourself) principle.

static if

D templates are only the beginning of what D is capable of doing. Another powerful tool in the AST manipulation phase of D compilation is static if. For example:

struct S(bool b)
{
    static if (b)
        int x;
    else
        float y;
}

The static if here means that the boolean parameter b is evaluated when the compiler is expanding the template S. The value of must be known at the time the template is being expanded. In D circles, we often say that the value must be known "at compile-time", but it is important to more precise. We will elaborate on this more later.

If the value of b is true, then the else branch of the static if is pruned away from the expanded template. That is, when you write:

S!true s;

it is as if you declared:

struct S!true
{
    int x;
}

Note that the else branch is completely absent from the expanded template. This is a very important point.

Similarly, when you write:

S!false t;

it is as if you had declared:

struct S!false
{
    float y;
}

Note that the if branch is completely absent from the expanded template. This is also a very important point.

In other words, static if is a choice that affects the effective AST seen by later compilation phases.

To drive this point home further: at this stage, when static if is evaluated, notions like variables, memory, and I/O don't exist yet. We are manipulating the structure of the program; not its execution. In the above code, x and y are merely identifiers; they have not yet been assigned any semantic meanings like variables or data fields that reside in a concrete offset in the struct. This is done in subsequent stages of compilation.

Why is this point so important? Because it relates to a common misunderstanding of D's "compile-time" capabilities as it relates to CTFE, or Compile-Time Function Evaluation. Let's talk about that next.

CTFE

CTFE stands for Compile-Time Function Evaluation. This is an extremely powerful feature that D offers, and is similar to C++'s constexpr feature (though in the author's opinion far more powerful).

The first and most important thing to understand about CTFE is that it happens after the AST manipulation phase has been completed. More precisely, it happens when the compiler has "finalized" the AST of that part of the program, and is now ready to assign semantic meaning to the various constructs represented by the AST. Identifiers are assigned meanings as modules, functions, function arguments, variables, and so on, and other semantic analyses such as VRP (Value Range Propagation) are performed.

Constant-folding, glorified

Part of this semantic analysis is constant-folding. For example, if we wrote something like this:

int i = 3*(5 + 7);

it would be a waste of computational resources to perform the indicated calculation (add 5 to 7, multiply the result by 3) at runtime, because all the arguments of the calculation are constants known to the compiler, and the result will never change at runtime. Of course, this particular example is trivial, but imagine if this line of code were inside a busy inner loop in a performance-critical part of the program. If we folded this constant expression into its resulting value, the program would run faster, since it wouldn't have to repeat this same calculation over and over, and indeed, it wouldn't even need to perform the calculation at all, since the answer is already known by the compiler.

The D compiler detects these constant expressions and replaces them with their value: it is as if we wrote:

int i = 36;

This process is called constant-folding, and is implemented by basically all modern compilers of any language today. So it is nothing to get excited about. The D compiler, however, takes this game to a whole new level. Consider, for example, this code:

int f(int x)
{
    return x + 1;
}

int i = 3*(5 + f(6));

Again, if we perform the calculation manually, we see that the value of the expression is 36. However, this time, the constants involved in the expression are masked by the function call to f. But since the definition of f is visible to the compiler, and all it does is to add a constant value 1 to its argument, the compiler should be able to deduce that, again, the expression is constant and does not need to be performed at runtime.

But what if f was more complicated than merely adding a constant to its argument, for example:

int f(int x)
{
    int a = 0;
    for (int i=1; i <= x/2; i++)
    {
        a += i;
    }
    return a + 1;
}

int i = 3*(5 + f(6));

Again, the value of f(6) is constant, but in order to know its value, the compiler has to effectively run this function during compilation. And in order to do that, the compiler essentially compiles the body of f into a state where it can be run inside what amounts to a D virtual machine inside the compiler.

This, in a nutshell, is how CTFE came about, in D's history. There is a limit as to how much this D virtual machine can do, but there's an ongoing trend of expanding its capabilities as far as possible. As of this writing, the D compiler is able to execute significant chunks of the Phobos standard library during compilation, thus making many library features accessible during compilation without needing to implement them the second time inside the compiler. Furthermore, a focused effort is being spearheaded by Stephen Koch to replace the current CTFE engine with an even more powerful one based on a bytecode interpreter, that promises superior CTFE performance and better memory management, and eventually, more features.

A different "compile-time"

Coming back to the topic at hand, though, notice that when we speak of CTFE, we speak of "virtual machines" and "bytecode interpreters". This implies that by this point, the code has gone far enough through the compilation process that it is essentially ready to be turned into runtime executable code.

In particular, this means that it has long passed the AST manipulation stage. Which in turn implies that code that can be evaluated by CTFE can no longer make use of AST manipulation constructs like static if. This is because in order for CTFE to work, things such as variables and memory must exist, otherwise there is nothing to execute or interpret. But in the AST manipulation phase, notions such as variables and memory don't exist yet -- we're still manipulating the structure of the program.

Thus, even though CTFE happens at "compile-time" just as AST manipulation happens at "compile-time", this is actually a different "compile-time". This is why the terminology "compile-time" is confusing, because it gives the false impression that all of these features, AST manipulation and CTFE alike, are lumped together into a single, amorphous blob of time labelled "compile-time", and that the compiler can somehow make it all magically just work, by fiat.

A fixed principle

So, to summarize: D's "compile-time" capabilities can be grouped into two distinct phases, AST manipulation and CTFE, and AST manipulation happens before CTFE. We may denote this by a unidirection arrow between the two phases:

AST manipulation → CTFE

meaning that code can only progress one way, from AST manipulation to CTFE, not the other way round. This is a fixed principle.

Let's take a look at a few common pitfalls that D learners often run into, and see how this principle applies in practice.

Case Study: Reading CTFE variables at AST manipulation time

A rather common complaint that's brought up in the D forums from time to time pertains to code along these lines:

int ctfeFunc(bool b)
{
    static if (b)    // <--- compile error at this line
        return 1;
    else
        return 0;
}

// N.B.: the enum forces the compiler to evaluate the function in CTFE
enum myInt = ctfeFunc(true);

If you try to compile the above code, the compiler will complain that the static if cannot read the value of b at "compile-time". Which almost certainly elicits the reaction, "What??! What do you mean you can't read b at compile-time?! Aren't you running this code in CTFE, which is by definition compile-time, with a value of b that's obviously known at compile-time?"

On the surface, this would appear to be a glaring bug in the compiler, and/or an astonishing lack of competence on the part of the D compiler authors in failing to notice a problem in such an obvious and simple use case for CTFE.

If we understand what's really going on, however, we would see what's the problem with this code. Remember that during the process of compilation, the D compiler first creates an AST of the code, evaluating any static ifs that may change the shape of the resulting AST.

So when the compiler first encounters the declaration of ctfeFunc, it scans its body and sees the static if (b) while building the AST for this function. If the value of b is true, then it would emit the AST tree that essentially corresponds with:

int ctfeFunc(bool b)
{
    return 1;
}

(Recall that in the AST manipulation stage, the false branch of a static if is discarded from the resulting AST, and it is as if it wasn't even there. So the return 0 doesn't even make it past this stage.)

If the value of b is false, then it would emit the AST tree that corresponds with:

int ctfeFunc(bool b)
{
    return 0;
}

There is a problem here, however. The value of b is unknown at this point. In fact, the compiler hasn't even gotten to the enum line yet! And even if it did, at the AST manipulation stage the identifier b wouldn't have been assigned a meaning yet -- it is, after all, a runtime parameter that only acquires a meaningful value during semantic analysis, which can only take place after AST manipulation has been completed. But since its value is needed at the AST manipulation stage, the compiler never gets far enough to be able to assign a value to b.

Hence the compile error.

Solution 1: Make it available during AST manipulation

In this case, a simple solution is to make the value of b available during the AST manipulation phase. The simplest way to do this is to turn ctfeFunc into a template function with b as a template parameter, with the corresponding change in the enum line to pass true as a template argument rather than a runtime argument:

int ctfeFunc(bool b)()    // N.B.: the first set of parentheses enclose template parameters
{
    static if (b)    // Now this compiles without error
        return 1;
    else
        return 0;
}

enum myInt = ctfeFunc!true;

Since b is now a template argument, its value is known during AST manipulation, and so the static if can be compiled without any problems.

Solution 2: Do everything during AST manipulation instead

The foregoing solution appears to be nice and simple, and it works, but if we consider it more carefully, we see that after the AST manipulation phase, the function has essentially become:

int ctfeFunc()    // N.B.: template parameters no longer exist after AST manipulation phase
{
    return 1;     // N.B.: else branch of static if has been discarded
}

This means that CTFE wasn't actually necessary to evaluate this function in the first place! We could have just as easily declared ctfeFunc as a template, completely evaluated in the AST manipulation phase (and we might as well name it something other than ctfeFunc, since it would be no longer evaluated in CTFE, and no longer even a function):

template Value(bool b)
{
    static if (b)
        enum Value = 1;
    else
        enum Value = 0;
}

enum myVal = Value!true;

Now myVal can be completely evaluated at the AST manipulation phase, and CTFE doesn't even need to be involved.

Solution 3: Move everything to CTFE

There is another solution, however. Although the example we have given is rather trivial, in practice CTFE functions tend to be a lot more involved than a mere if-condition over a boolean parameter. Some functions may not be amenable to be rewritten in template form. So what do we do in that case?

The answer may surprise you: get rid of the static if, and replace it with a plain old "runtime" if! Like this:

int ctfeFunc(bool b)
{
    if (b)    // <--- N.B.: regular if, not static if
        return 1;
    else
        return 0;
}

// N.B.: the enum forces the compiler to evaluate the function in CTFE
enum myInt = ctfeFunc(true);

And, miracle of miracles, this code compiles without any error, and with the correct value for myInt! But wait a minute. What's going on here? How can changing static if to if, which supposedly runs at "runtime", possibly work for a value that is needed at "compile-time"? Is the compiler cheating and secretly turning myInt into a runtime computation behind our backs?

Actually, inspecting the executable code with a disassembler shows that no such cheating is happening. So how does this work?

Let's take a close look. This time, during the AST manipulation phase, nothing much happens besides the usual construction of the AST corresponding with ctfeFunc. There are no static ifs or other template-related constructs anymore, so the resulting AST is as straightforward as it can get. Then the compiler sees the enum declaration and realizes that it needs to evaluate ctfeFunc at "compile-time".

Here is where something interesting happens. Based on the above discussion, you may think, the compiler is still in the "AST manipulation" stage (because it hasn't fully constructed the AST for myInt yet), so wouldn't this fail, since ctfeFunc hasn't gotten to the semantic analysis phase yet? Note, however, that while it is certainly true that the AST for myInt hasn't been fully resolved yet (and therefore we can't read the value of myInt from CTFE at this point in time), the AST for ctfeFunc, by this point, is already ready to proceed to the next stage of compilation. Or, more precisely, the subtree of the program's AST corresponding with ctfeFunc is complete, and can be handed over to semantic analysis now.

So the D compiler, being pretty smart about these things, realizes that it can go ahead with semantic analysis of ctfeFunc independently of the fact that other parts of the program's AST, namely the declaration of myInt, aren't quite done yet. This works, because ctfeFunc's subtree of the AST can be semantically analyzed as a unit on its own, without depending on myInt's part of the subtree at all!

Thus, the compiler processes the AST of ctfeFunc and brings it to the point where it can be interpreted by the CTFE engine. The CTFE engine is then invoked to run ctfeFunc with the value true as its argument -- essentially simulating what it would have done at runtime. The return value of 1 is then handed back to the AST manipulation code that's still waiting to finish processing the AST for myInt, upon which the latter AST also becomes fully constructed.

Perhaps this "smart" invocation of CTFE interleaved with AST manipulation is part of what imparts some validity to the term "compile-time" as referring to the entire process as a whole. However, hopefully by now you have also acquired a better understanding of why there are really (at least) two different "compile-times": an early phase where AST manipulation happens, and a later phase at CTFE which is very much like runtime except that it just so happens to be done inside the compiler.

Case Study: pragma(msg) and CTFE

Another common complaint by D learners arises from trying to debug CTFE functions, and pertains to the "strange" way in which pragma(msg) behaves in CTFE.

First of all, if you're not familiar with it, pragma(msg) is a handy compiler feature that lets you debug certain compile-time processes. Every time the compiler sees a pragma(msg) directive during compilation, it outputs whatever message is specified as arguments to the pragma. For example:

template MyTemplate(T)
{
    pragma(msg, "instantiating MyTemplate with T=" ~ T.stringof);
    // ... insert actual code here
}

This causes the compiler to print "instantiating MyTemplate with T=..." each time MyTemplate is instantiated, which can be a useful debugging tool to trace exactly what instantiations are being used in the code.

So far so good.

Complaints, however, tend to arise when people attempt to do things like this:

int ctfeFunc(int x)
{
    if (x < 100)
        return x;
    else
    {
        pragma(msg, "bad value passed in");
        return -1;
    }
}
enum x = ctfeFunc(50); // N.B.: enum forces CTFE on ctfeFunc

Even though the argument 50 is well within the bounds of what ctfeFunc can handle, the compiler persistently prints "bad value passed in". And it does the same if the argument is changed to something the function ostensibly rejects, like 101. What gives?

"Compiler bug!" some would scream.

By now, however, hopefully you can guess at the answer.