User:Schuetzm/scope2

From D Wiki
Revision as of 19:07, 11 March 2015 by Schuetzm (talk | contribs) (Scope inference)

Overview

scope is a storage class. It applies to function parameters (including this), local variables, the return value (treated as if it were an out parameter), and member variables of aggregates. It participates in overloading.

Every variable has an associated lifetime, called its scope. A scope usually refers to a local variable or parameter; in addition, an infinite scope is defined that corresponds to global/static variables and GC-managed memory. Scopes and lifetimes are defined purely by lexical scope and the order of declaration of their corresponding variables. Therefore, any two lifetimes are either nested (one is completely contained in the other) or disjoint. Annotating a variable with scope sets its scope to the variable's own lifetime instead of the default (infinity).
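As an illustration of how lexical scope and declaration order determine lifetimes, here is a small sketch. It is ordinary D that should compile today; the comments describe how the proposal would view each lifetime, using the [var] notation from the inference section. The function and variable names are made up for the example.

```d
int lifetimeDemo()
{
    int total = 0;           // [total]: from here to the end of the function
    {
        int b = 2;           // [b] is completely contained in [total]
        scope int* p = &b;   // explicit scope: p's scope is [p], which lies inside [b]
        total += *p;
    }
    {
        int c = 3;           // [c] is contained in [total] and disjoint from [b]
        int* q = new int(c); // GC-managed memory: the target has the infinite scope
        total += *q;
    }
    return total;
}

unittest
{
    assert(lifetimeDemo() == 5);
}
```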

For any expression involving at least one scope value, two lifetimes (LHS lifetime and RHS lifetime) are computed in a way that ensures that the resulting RHS lifetime will not be greater than that of any of the expression's parts, and the LHS lifetime will not be smaller. The exact rules will be defined in one of the following sections. An assignment (i.e., the = operator, passing to a function, returning from a function, capturing in a closure, throwing, etc.) involving scope values is permissible only if the destination's LHS lifetime is fully contained in the source's RHS lifetime. Throwing is considered assignment to a variable with static lifetime.

The following invariant is enforced to always be true on every assignment: A location with scope a will never contain references pointing to values with a lifetime shorter than a.
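The containment rule can be modelled in a few lines. This is only a sketch under a simplifying assumption: a lifetime is represented as a half-open range of program points, and the names Lifetime, contains and assignmentAllowed are hypothetical helpers, not part of the proposal.

```d
/// Hypothetical model: a lifetime is a half-open range of program points.
struct Lifetime
{
    size_t begin;
    size_t end;

    /// True if `inner` lies completely within this lifetime.
    bool contains(Lifetime inner) const
    {
        return begin <= inner.begin && inner.end <= end;
    }
}

/// The assignment rule: the destination's LHS lifetime must be
/// fully contained in the source's RHS lifetime.
bool assignmentAllowed(Lifetime destLhs, Lifetime sourceRhs)
{
    return sourceRhs.contains(destLhs);
}

unittest
{
    immutable infinite = Lifetime(0, size_t.max); // globals, GC memory
    immutable outer = Lifetime(10, 50);
    immutable inner = Lifetime(20, 30);

    assert(assignmentAllowed(inner, outer));      // storing into a narrower location
    assert(!assignmentAllowed(outer, inner));     // destination would outlive the target
    assert(!assignmentAllowed(infinite, outer));  // e.g. throwing a scoped value
}
```

Because lifetimes are either nested or disjoint, a single containment test is enough here; no partial-overlap case needs to be handled.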

To allow a function to return a value it received as a parameter, the parameter can be annotated with the return keyword, as in DIP25. It's also possible to express that a parameter escapes through another parameter (including this) by using the return!identifier syntax. Multiple such annotations can appear for each parameter. When a function is called, the compiler checks (for each argument and the return value) that only expressions with a lifetime longer than those of all the corresponding return annotations are passed in, and that the return value is used in a conforming way.
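For reference, the DIP25 form of the return annotation already exists in D for ref parameters. The sketch below uses a hypothetical function pick to show it. The return!identifier form is specific to this proposal and does not compile today, so it appears only in a comment; its exact placement in the signature is an assumption.

```d
/// DIP25-style `return ref`: the result may refer to either argument.
ref int pick(return ref int a, return ref int b, bool first)
{
    if (first)
        return a;
    return b;
}

unittest
{
    int x = 1, y = 2;
    pick(x, y, false) = 20; // the result is used while x and y are still alive
    assert(x == 1 && y == 20);
}

// Proposed syntax only, not valid D today (hypothetical function and spelling):
// void store(ref int* slot, return!slot int* p) { slot = p; }
```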

Because all relevant information about lifetimes is contained in the function signature, no explicit scope annotations for local variables are necessary; the compiler can figure them out by itself. Additionally, inference of annotations is done in the usual situations, i.e. nested functions and templates. In a subsequent section, an algorithm is presented that can be used for inference of scopes of local variables as well as annotations of parameters.

In parameters of @safe functions, all reference types (class references, pointers, slices, ref parameters) and aggregates containing references are implicitly treated as scope unless the parameter is annotated with the keyword static. This does not, however, apply to the return value.

ref and out parameters can also be treated as implicitly scoped, but this has the potential to break lots of code and needs to be considered carefully.

A scope annotation on a member variable shall be equivalent to a @property function returning a reference to that member, scoped to this.
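The property function that such a member would be equivalent to can be approximated with the existing DIP25 return attribute, which ties a returned reference to this. This is a rough sketch only; Holder, payload and view are hypothetical names, and the proposal's scope semantics go beyond what return checks today.

```d
struct Holder
{
    private int[] payload;

    /// `return` on the method ties the returned reference to `this`.
    @property ref int[] view() return
    {
        return payload;
    }
}

unittest
{
    auto h = Holder([1, 2, 3]);
    h.view[0] = 9;              // access goes through the property
    assert(h.payload == [9, 2, 3]);
}
```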

Borrowing is the term used for assignment of a value to a variable with narrower scope. This typically happens on function calls, when a value is passed as an argument to a parameter annotated with (or inferred as) scope.
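A minimal example of borrowing on a function call, written with the scope parameter annotation that D already accepts. The function readThrough is a made-up name for illustration.

```d
/// The `scope` parameter promises not to let `p` escape the call.
int readThrough(scope const(int)* p)
{
    return *p;
}

unittest
{
    int local = 42;
    // `local` is borrowed for the duration of the call only.
    assert(readThrough(&local) == 42);
}
```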

Implementation

Scope inference

This algorithm works at the function level. Scope inference for variables and parameters in one function is independent from all other functions.

It takes as input a list of variables whose scope is already fixed (by explicit annotations) and another list whose scopes are to be inferred. It will choose the narrowest possible scopes for them for which the function will still compile. This is based on the observation that variables that are read from need to have a scope at least as large as their destination. Therefore, we can start with the smallest possible scope, the lifetime of the variable itself, and extend it to the scope of the destination, if it isn't already larger.


1. Let Q be a list of all variables whose scopes are to be inferred. This includes template function parameters not otherwise annotated and all local variables.

2. Assign all elements of Q an initial scope equivalent to their own lifetime:

    foreach(var; Q) {
        var.scope := [var];
    }

3. For each ASSIGNMENT whose RHS_SCOPE depends on a variable in Q, expand that variable's scope to at least the LHS_SCOPE. For all variables the LHS_SCOPE depends on and that are in Q, record a dependency:

    foreach(ass; ASSIGNMENTS) {
        if(ass.rhs_scope.depends_on(Q)) {
            foreach(rhs_var; ass.rhs_scope.vars) {
                if(not rhs_var in Q)
                    continue;
                rhs_var.scope |= ass.lhs_scope;
                foreach(lhs_var; ass.lhs_scope.vars) {
                    if(lhs_var in Q)
                        rhs_var.deps ~= lhs_var;
                }
            }
        }
    }

4. Remember the length of Q, then remove all variables from Q that have no dependencies:

    old_Q_len := Q.length;
    foreach(var; Q) {
        if(var.deps.empty)
            Q.remove(var);
    }

5. If Q is empty, terminate:

    if(Q.empty)
        return;

6. Expand all variables' scopes to at least that of their dependencies; if a dependency has no dependencies itself, remove it from the variable's dependencies:

    foreach(var; Q) {
        foreach(dep; var.deps) {
            var.scope |= dep.scope;
            if(dep.deps.empty)
                var.deps.remove(dep);
        }
    }

7. If the length changed, we made progress and can repeat from step 4. Otherwise there is a dependency loop. Find a cycle in DEPENDENCIES, the graph formed by all deps lists (for example, using Tarjan's algorithm). Collect all elements of the cycle, remove the dependencies between them, and assign all of them the union of their scopes:

    if(Q.length != old_Q_len)
        goto step4;
    cycle := tarjan(DEPENDENCIES);
    new_scope := [];
    foreach(var; cycle) {
        new_scope |= var.scope;
        var.deps.remove_each(cycle);
    }
    foreach(var; cycle) {
        var.scope := new_scope;
    }

8. Go to step 4.


At this point, each variable will have a scope assigned. Now, all assignments can be checked to verify that they never place a reference in a location that outlives the reference's target.
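The widening idea behind the algorithm can be tried out with a much simpler sketch. Note that this is not the queue-based procedure above: it is a naive fixed-point iteration that repeats until nothing changes, which also resolves dependency loops without explicit cycle detection. It assumes a simplified model in which each scope is an integer rank (a larger rank outlives a smaller one, so the union of two scopes is their maximum), and that each assignment has a single source and destination variable. All names are hypothetical.

```d
import std.algorithm.comparison : max;

/// Hypothetical model: a scope is a rank; a larger rank outlives a smaller one.
enum int infinite = int.max;

/// `dest = source`: a reference held by `source` is stored in `dest`.
struct Assignment
{
    string dest;
    string source;
}

/// Widens the scopes of the variables in `ownLifetimes` until every
/// assignment satisfies: scope of source >= scope of destination.
int[string] inferScopes(int[string] fixedScopes,
                        int[string] ownLifetimes,
                        Assignment[] assignments)
{
    int[string] scopes;
    foreach (name, rank; fixedScopes)
        scopes[name] = rank;
    foreach (name, rank; ownLifetimes)
        scopes[name] = rank; // start with the variable's own lifetime

    bool changed = true;
    while (changed)
    {
        changed = false;
        foreach (a; assignments)
        {
            if (a.source !in ownLifetimes)
                continue; // explicitly annotated, never widened
            immutable widened = max(scopes[a.source], scopes[a.dest]);
            if (widened != scopes[a.source])
            {
                scopes[a.source] = widened;
                changed = true;
            }
        }
    }
    return scopes;
}

unittest
{
    // `sink` is annotated (static); a, b and c are inferred.
    auto result = inferScopes(
        ["sink": infinite],
        ["a": 3, "b": 2, "c": 1],
        [Assignment("b", "a"),      // b = a
         Assignment("a", "b"),      // a = b, a loop with the line above
         Assignment("sink", "c")]); // sink = c
    assert(result["a"] == 3 && result["b"] == 3);
    assert(result["c"] == infinite);
}
```

Ranks only ever grow and take finitely many values, so the loop terminates. The queue-based formulation avoids rescanning assignments that are already settled, which this sketch does not attempt.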

Examples

RCArray

Walter's RCArray, adjusted to this proposal:

@safe:

struct RCArray(E) {
    @disable this();

    this(E[] a)
    {
        array = a.dup;
        count = new int;
        *count = 1;
    }

    ~this() @trusted
    {
        if (--*count == 0)
        {
            delete count;
            // either `delete` in `@system` code will accept `scope`:
            delete array;
            // or a helper needs to be used to remove `scope`:
            // delete assumeStatic(array);
        }
    }

    this(this)
    {
        ++*count;
    }

    @property size_t length()
    {
        return array.length;
    }

    ref E opIndex(size_t i)
    {
        return array[i];
    }

    E[] opSlice(size_t lwr, size_t upr)
    {
        return array[lwr .. upr];
    }

    E[] opSlice()
    {
        return array[];
    }

private:
    scope E[] array;    // this is the only explicit annotation
    int* count;
}