User:Schuetzm/scope

From D Wiki
Revision as of 13:43, 13 July 2014 by Schuetzm (talk | contribs)
Jump to: navigation, search

What's this about?

Taking addresses of local variables is currently not allowed @safe code. [1] This is, however, a very broad restriction, that disallows many useful idioms. Examples include slicing of local arrays, passing local structures by reference for efficiency reasons, out parameters, and allocators with limited lifetimes. Additionally, GC avoidance techniques like reference counting and unique/owned objects need some kind of borrowing to work efficiently, to avoid the costs of reference incrementing/decrementing or move semantics, while at the same time still being provably memory safe.

The language already designates the scope concept for that purpose, but as of today it is unimplemented, and the original concept is generally seen as insufficient. This proposal intends to extend the design of scope and define it's semantics to be usable for the above-mentioned purposes.

Lifetimes

An important concept is that of the lifetime. Any object exists for a specific period of time, with a well defined beginning and end point: from the point it is created (constructed), to the point it is released. A reference that points to an object whose lifetime has ended is a dangling reference. Use of such references can cause all kinds of errors, and must therefore be prevented.

Because the lifetimes of actual manually managed objects are complex and unpredictable, a different concept of lifetime is hereby introduced, that only applies to named variables and is based purely on their lexical scope and order of declaration. By the following rules, a hierarchy of lifetimes is defined:

  • A variable's lifetime starts at the point of its declaration, and ends with the lexical scope it is defined in.
  • An (rvalue) expression's lifetime is temporary; it lives till the end of the statement that it appears in. (FIXME: provide a reference)
  • The lifetime of A is higher than that of B, if A appears in a higher scope than B, or if both appear in the same scope, but A comes lexically before B. This matches the order of destruction of local variables.
  • The lifetime of a function parameter is higher than that of that function's local variables, but lower than any variables in higher scopes. (FIXME: relative lifetimes among function parameters == order of destruction)

Ownership and borrowing

Taking a reference to a variable is called borrowing. The original variable is called the owner of the reference.

In @safe code, taking a reference to a local variable will be allowed, but the type of the resulting reference will contain information about the owner which is used by the compiler to decide whether assigning the reference to a particular variable is permissible or not. Assignment in this context means copying the reference, be it by assignment to another variable, passing it to a function, returning it from a function, or by throwing it as an exception. In general, scoped references may only be assigned to variables with lower (= shorter) lifetimes than their designated lifetime.

These restrictions apply only to reference types, i.e. pointers, slices, classes, and ref or out parameters. Non-reference types may be freely copied, because no memory-safety issues can arise.

For this purpose, scope needs to be changed into a type modifier (currently it is a storage class[2]), and the appropriate changes to the syntax need to be made, as detailed in the following sections.

Bare scope

This is the "normal", usual syntax, which will be used in most cases. Examples:

@safe:

int global_var;

void bar(scope(int*) input);

void foo() {
    scope(int*) a;
    a = &global_var;       // OK, `global_var` has higher lifetime than `a`
    scope b = &global_var; // OK, type deduction
    int c;

    if(...) {
        scope x = a;       // OK, copy of reference,`x` has shorter lifetime than `a`
        scope y = &c;      // OK, borrowing
        int z;
        b = &z;            // ERROR: `b` will outlive `z`
        int* d = a;        // ERROR: `d` is unscoped, but `a` is scoped
    }

    bar(a);                // OK, scoped reference is passed to scoped parameter
    bar(&c);               // OK, borrowing
    int* e;
    a = e;                 // OK, implicit conversion to '''scope'''
}

Non-scoped references are implicitly convertible to scope.

scope with owner(s)

Bare scope is already quite powerful, but for certain things, it still cannot be used. Typical examples are haystack-needle type functions that are passed in slices as input (the haystack and the needle), and return a slice of the haystack:

string findSubstring(scope(string) haystack, scope(string) needle) {
    // ... do the actual searching ...
    return haystack[found .. $];  // ERROR: needs to return `string`, not `scope(string)`
}

Changing the return type to `scope(string)` doesn't help, because it is a temporary that could not be stored anywhere, as temporaries' lifetimes are restricted to the current statement. Therefore, returning a bare scope(T) from a function doesn't make sense, and is disallowed, in order to avoid having to specify its behaviour.

Transitivity

Optional enhancements

scope!(const ...)

Automatic borrowing for pure functions

References