Compile-time vs. compile-time

From D Wiki
Revision as of 00:08, 22 March 2017 by Quickfur (talk | contribs) (copyedit, expand)
By H. S. Teoh, March 2017

One of D's oft-touted features is its awesome compile-time capabilities, which open up wonderful meta-programming opportunities, code-generation techniques, compile-time introspection, DSLs that are transformed into code at compile-time and therefore incur zero runtime overhead, and plenty more. Acronyms like CTFE have become common parlance amongst D circles.

However, said "compile-time" capabilities are also often the source of much confusion and misunderstanding, especially on the part of newcomers to D, often taking the form of questions posted to the discussion forum by frustrated users such as: "Why doesn't the compiler let me do this?!", "Why doesn't this do what I think it should do?", "Why can't the compiler figure this simple thing out?! The compiler is so stupid!", and so on.

This article hopes to clear up most of these misunderstandings by explaining just what exactly D's "compile-time" capabilities are and giving a brief overview of how they work, thereby giving newcomers to D a better handle on what exactly is possible, and what to do when they run into a snag.

There's compile-time, and then there's compile-time

Part of the confusion stems from the overloaded term "compile-time". It sounds straightforward enough -- "compile-time" is simply the time when the compiler does whatever it does when it performs its black magic of transforming human-written D code into machine-readable executables. Therefore, if feature X is a "compile-time" feature, and feature Y is another "compile-time" feature, then X and Y ought to be usable in any combination, right? After all, it all happens at "compile-time", so surely the compiler, with its access to black magic, should be able to just sort it all out, no problem.

The reality, of course, is a bit more involved than this. Roughly speaking, there are at least two distinct categories of D features that are commonly labelled "compile-time":

  • Template expansion, or abstract syntax tree (AST) manipulation; and
  • Compile-time function evaluation (CTFE).

While these two take place at "compile-time", they represent distinct phases in the process of compilation, and understanding this distinction is the key to understanding how D's "compile-time" features work.

(The D compiler, of course, has more distinct phases of compilation than these two, but for our purposes, we don't have to worry about the other phases.)
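As a rough illustration of the two categories, the sketch below puts one of each side by side. The names `Pair`, `square` and `squared` are hypothetical and chosen only for this example. `Pair` is a template, so using `Pair!(int, string)` causes a new struct declaration to be generated; `square` is an ordinary function that the compiler evaluates through CTFE because its result is needed to initialize an `enum`.

```d
import std.stdio;

// Template expansion: Pair!(int, string) is stamped out as a new
// struct declaration, with A and B replaced by the given types.
struct Pair(A, B)
{
    A first;
    B second;
}

// An ordinary function; nothing about it is specific to compile-time.
int square(int x)
{
    return x * x;
}

// An enum initializer must be a compile-time constant, so the
// compiler evaluates square(7) via CTFE.
enum squared = square(7);

void main()
{
    auto p = Pair!(int, string)(1, "one");

    // Checked during compilation, not when the program runs.
    static assert(squared == 49);

    writeln(p.first, " ", p.second, " ", squared);
}
```

Both lines of interest are resolved before the program ever runs, but they rely on different machinery inside the compiler, which is the distinction the rest of this article explores.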

Template expansion / AST manipulation

One of the first things the compiler does when it compiles your code is to transform the text of the code into what is commonly known as the Abstract Syntax Tree (AST).

For example, this program:

import std.stdio;
void main(string[] args)
{
    writeln("Hello, world!");
}

is parsed into something resembling this:

[Figure: AST.svg]

(Note: this is not the actual AST created by the compiler; it is only a simplified example. The actual AST created by the compiler would be more detailed and have more information stored in each node.)

The AST represents the structure of the program as seen by the compiler, and contains everything the compiler needs to eventually transform the program into executable machine code.

One key point to note here is that in this AST, there are no such things as variables, memory, or input and output. At this stage of compilation, the compiler has only gone as far as building a model of the program structure. In this structure, we have identifiers like args and writeln, but the compiler has not yet attached semantic meanings to them. That will be done in a later stage of compilation.

Part of D's powerful "compile-time" capabilities stems from the ability to manipulate this AST (to some extent) as the program is being compiled. Among the features that D offers for this purpose are templates and static if.

Templates

If you are already familiar with the basics of templates, you may want to skip to the following section.

One of D's powerful features is templates, which are similar to C++ templates. Templates can be thought of as code stencils, or stencils of a subtree of the AST, that can be used to generate AST subtrees. For example, consider the following template struct:

struct Box(T)
{
    T data;
}

In D, this is shorthand for:

template Box(T)
{
    struct Box
    {
        T data;
    }
}

Its corresponding AST looks something like this:

[Figure: Template1.svg]

When you instantiate the template with a declaration like:

Box!int intBox;

for example, what the compiler effectively does is to make a copy of the AST subtree under the TemplateBody node and substitute int for every occurrence of T in it. So it is as if the compiler inserted this generated AST subtree into the program's AST at this point:

[Figure: Template-example1.svg]

Which corresponds to this code fragment:

struct Box!int
{
    int data;
}

(Note that you cannot actually write this in your source code; the name Box!int is reserved for the template expansion process and cannot be directly defined by user code.)

Similarly, if you instantiate the same template with a different declaration, such as:

Box!float floatBox;

it is as if you had declared something like this:

struct Box!float
{
    float data;
}

Effectively, you are creating "virtual AST subtrees" every time you instantiate a template, which get grafted into your program's AST when the template is instantiated. This feature is great for avoiding boilerplate code: you can factor out the common bits of code into a template, and thereby adhere to the DRY (Don't Repeat Yourself) principle.
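The effect of these instantiations can be checked directly in code. The following is a minimal sketch that reuses the Box template from above; the variable names and the static assert checks are added here purely for illustration. It confirms that each instantiation gets its own copy of the struct with T replaced, and that the resulting types are unrelated to each other.

```d
import std.stdio;

struct Box(T)
{
    T data;
}

void main()
{
    Box!int intBox;
    Box!float floatBox;

    // Each instantiation yields its own struct with T substituted.
    static assert(is(typeof(intBox.data) == int));
    static assert(is(typeof(floatBox.data) == float));

    // The two instantiations are distinct types.
    static assert(!is(Box!int == Box!float));

    intBox.data = 42;
    floatBox.data = 1.5f;
    writeln(intBox.data, " ", floatBox.data);
}
```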

static if

D templates are only the beginning of what D is capable of doing. Another very powerful tool in the AST manipulation phase of D compilation is static if. For example:

struct S(bool b)
{
    static if (b)
        int x;
    else
        float y;
}

The static if here means that the boolean parameter b is evaluated when the compiler is expanding the template S. The value of b must be known at the time the template is being expanded. In D circles, we often say that the value must be known "at compile-time", but it is very important to be more precise. We will elaborate on this more later.
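To make the requirement concrete, here is a small sketch that instantiates the template S above once with a manifest constant and once with a value that only exists when the program runs. The names `known` and `runtimeFlag` are hypothetical, and `__traits(compiles, ...)` is used only so that the failing case can be shown without breaking the build.

```d
struct S(bool b)
{
    static if (b)
        int x;
    else
        float y;
}

void main(string[] args)
{
    // Known while the template is being expanded: a manifest constant.
    enum bool known = true;
    S!known s1;
    s1.x = 1;

    // Only known once the program runs: depends on the command line.
    bool runtimeFlag = args.length > 1;

    // S!runtimeFlag cannot be expanded, because b has no value yet
    // at the time the compiler needs it.
    static assert(!__traits(compiles, { S!runtimeFlag s2; }));
}
```

Writing `S!runtimeFlag s2;` directly would be rejected by the compiler, since the value of the variable cannot be read while the template is being expanded.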

If the value is true, then the else branch of the static if is pruned away from the expanded template. That is, when you write:

S!true s;

it is as if you declared:

struct S!true
{
    int x;
}

Note that the else branch is completely absent from the expanded template. This is a very important point.

Similarly, when you write:

S!false t;

it is as if you had declared:

struct S!false
{
    float y;
}

Note that the if branch is completely absent from the expanded template. This is also a very important point.

In other words, static if is a choice that affects the effective AST seen by later compilation phases.
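One way to observe the pruning is to ask the compiler which members each instantiation actually has. The sketch below assumes the same template S as above and uses `__traits(hasMember, ...)` for the check; the checks themselves are an illustration added here, not part of the original example.

```d
struct S(bool b)
{
    static if (b)
        int x;
    else
        float y;
}

// S!true contains only x; the else branch never reached the expanded struct.
static assert( __traits(hasMember, S!true, "x"));
static assert(!__traits(hasMember, S!true, "y"));

// S!false contains only y; the if branch was pruned instead.
static assert(!__traits(hasMember, S!false, "x"));
static assert( __traits(hasMember, S!false, "y"));

void main()
{
    S!true s;
    s.x = 1;
    // s.y = 2.0f; // would not compile: S!true has no member y

    S!false t;
    t.y = 2.0f;
}
```

Later compilation phases see only the surviving declaration, so code that refers to the pruned member fails exactly as if that member had never been written.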