Difference between revisions of "DIP74"

From D Wiki
(Rules)
(Rules)
Line 47: Line 47:
 
RCOs are handled as follows:
 
RCOs are handled as follows:
  
* Whenever a new reference to an object is created (e.g. <tt>auto a = b;</tt>), compiler inserts a call to <tt>opAddRef</tt> in the generated code. Call is inserted only if the reference is not <tt>null</tt>. The lowering of <tt>auto a = b;</tt> to pre-DIP74 code is conceptually as follows: <tt>auto a = { if (b) b.opAddRef(); return b; }();</tt>
+
* Whenever a new reference to an object is created (e.g. <tt>auto a = b;</tt>), compiler inserts a call to <tt>opAddRef</tt> in the generated code. Call is inserted only if the reference is not <tt>null</tt>. The lowering of <tt>auto a = lvalExpr;</tt> to pre-DIP74 code is conceptually as follows:
  
* There is no call inserted for the first reference created via a constructor (i.e. it is assumed the constructor already puts the object in the appropriate state). For example the lowering of <tt>auto a = new Widget;</tt> does not insert a call to <tt>opAddRef</tt>.
+
<syntaxhighlight lang=D>
 +
auto a = function(x) { if (x) x.opAddRef(); return x; }(lvalExpr);
 +
</syntaxhighlight>
 +
 
 +
* If a new reference is created from an rvalue (including a call to <tt>new</tt> or the result of a function), no call to <tt>opAddRef</tt> is inserted. As a consequence, there is no call inserted for the first reference created via a constructor (i.e. it is assumed the constructor already puts the object in the appropriate state). For example the lowering of <tt>auto a = new Widget;</tt> does not insert a call to <tt>opAddRef</tt>.
 +
 
 +
* Whenever a reference to an object is assigned (e.g. <tt>a = b</tt>), first <tt>b.opAddRef()</tt> is called and then <tt>a.opRelease()</tt> is called, followed by the reference assignment itself. Calls are only made if the respective objects are not <tt>null</tt>. So the lowering of e.g. <tt>lvalExprA = lvalExprB;</tt> to pre-DIP74 code is:
 +
 
 +
<syntaxhighlight lang=D>
 +
function(ref x, y) {
 +
    if (y) y.opAddRef();
 +
    scope(failure) y.opRelease();
 +
    if (x) x.opRelease();
 +
    x = y;
 +
}(lvalExprA, lvalExprB);
 +
</syntaxhighlight>
  
* Whenever a reference to an object is assigned (e.g. <tt>a = b</tt>), first <tt>b.opAddRef()</tt> is called and then <tt>a.opRelease()</tt> is called, followed by the reference assignment itself. Calls are only made if the respective objects are not <tt>null</tt>. So the lowering of e.g. <tt>a = b;</tt> is <tt>a = { if (b) b.opAddRef(); if (a) a.opRelease(); return b; }();</tt>
+
The complexity of this code underscores the importance of making <tt>opAddRef</tt> and especially <tt>opRelease</tt> <tt>nothrow</tt>. In that case the <tt>scope(failure)</tt> statement may be elided.
  
 
* Whenever a reference to an object goes out of scope, the compiler inserts an implicit call to <tt>opRelease</tt>. Call is inserted only if the reference is not <tt>null</tt>.
 
* Whenever a reference to an object goes out of scope, the compiler inserts an implicit call to <tt>opRelease</tt>. Call is inserted only if the reference is not <tt>null</tt>.
 +
 +
* <tt>struct</tt> types that have RCO members accommodate calls to <tt>opRelease</tt> during their destruction.
  
 
* The pass-by-value protocol for RCOs is as follows: the caller calls <tt>opAddRef</tt> and the callee calls <tt>opRelease</tt>. These calls are sequenced and handled the same as copy constructor calls and destructor calls, respectively, for <tt>struct</tt> objects. Example:
 
* The pass-by-value protocol for RCOs is as follows: the caller calls <tt>opAddRef</tt> and the callee calls <tt>opRelease</tt>. These calls are sequenced and handled the same as copy constructor calls and destructor calls, respectively, for <tt>struct</tt> objects. Example:
Line 78: Line 95:
 
# Function returns
 
# Function returns
  
* Functions that return an RCO call <tt>opAddRef</tt> against the returned reference, EXCEPT if the returned reference is an rvalue or a local variable. Functions that return a local variable do not call <tt>opRelease</tt> against it.
+
The lowering of a call <tt>fun(exprA, exprB, exprC)</tt> to pre-DIP74 code is:
 +
 
 +
<syntaxhighlight lang=D>
 +
fun(exprA, function(x) { if (x) x.opAddRef(); return x; }(exprB), exprC);
 +
</syntaxhighlight>
 +
 
 +
However, this translation is approximate. If <tt>exprC</tt> throws an exception (causing <tt>fun</tt> to not be entered), the compiler inserts code for calling <tt>opRelease()</tt> against the second argument. A more accurate lowering is:
 +
 
 +
<syntaxhighlight lang=D>
 +
{
 +
    typeof(exprB) t;
 +
    return fun(
 +
        exprA,
 +
        function(x) { t = x; if (t) t.opAddRef(); return t; }(exprB),
 +
        (){ scope(failure) if (t) t.opRelease(); return exprC; }
 +
    );
 +
}()
 +
</syntaxhighlight>
 +
 
 +
This lowering assumes left-to-right evaluation of function parameters. If <tt>fun</tt> itself throws, it is responsible for calling <tt>opRelease</tt> against its second argument.
 +
 
 +
* A function that returns a local RCO calls neither <tt>opAddRef</tt> nor <tt>opRelease</tt> against that value. Example:
 +
 
 +
<syntaxhighlight lang=D>
 +
Widget fun() {
 +
    auto a = new Widget;
 +
    return a; // no calls inserted
 +
}
 +
</syntaxhighlight>
 +
 
 +
Note: this is not an optimization. The compiler does not have the discretion to insert additional <tt>opAddRef</tt>/<tt>opRelease</tt> calls.
 +
 
 +
* A function that returns an RCO rvalue calls neither <tt>opAddRef</tt> nor <tt>opRelease</tt> against that value. Example:
 +
 +
<syntaxhighlight lang=D>
 +
Widget fun() {
 +
    return new Widget; // no calls inserted
 +
}
 +
</syntaxhighlight>
 +
 
 +
Note: this is not an optimization. The compiler does not have the discretion to insert additional <tt>opAddRef</tt>/<tt>opRelease</tt> calls.
 +
 
 +
* Functions that return an RCO (other than the two cases above) call <tt>opAddRef</tt> against the returned reference. Example:
 +
 
 +
<syntaxhighlight lang=D>
 +
Widget fun(ref Widget a) {
 +
    return a; // opAddRef inserted
 +
}
 +
</syntaxhighlight>
  
 
* The compiler considers that <tt>opRelease</tt> is the inverse of <tt>opAddRef</tt>, and therefore is at liberty to elide pairs of calls to <tt>opAddRef</tt>/<tt>opRelease</tt>. Example:
 
* The compiler considers that <tt>opRelease</tt> is the inverse of <tt>opAddRef</tt>, and therefore is at liberty to elide pairs of calls to <tt>opAddRef</tt>/<tt>opRelease</tt>. Example:

Revision as of 19:53, 26 February 2015

Title: Safe Reference Counted Class Objects
DIP: 74
Version: 1
Status: Draft
Created: 2015-02-23
Last Modified: 2015-02-26
Author: Walter Bright and Andrei Alexandrescu

Abstract

This DIP proposes @safe reference counted class objects (including exceptions) and interfaces for D.

Description

DIP25 allows defining struct types that own data and expose references to it, @safely, whilst controlling lifetime of that data. This proposal allows defining class objects that are safe yet use deterministic destruction for themselves and resources they own.

The compiler detects automatically and treats specially all classes and interfaces that define the following two methods:

class Widget {
    T1 opAddRef();
    T2 opRelease();
    ...
}

T1 and T2 may be any types (usually void or an integral type). The methods may or may not be final or inherited. Any attributes are allowed on these methods. They must be public. UFCS-expanded calls are not acceptable. If these two methods exist, the compiler categorizes this class or interface type as a reference counted object (RCO).
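As a rough illustration of the detection rule, the two-method pattern can be checked at compile time in present-day D. The `isRCO` template below is a hypothetical helper, not something this DIP defines, and it only approximates the rule: it does not verify that the methods are public or rule out UFCS.

```d
// Hypothetical helper (not part of DIP74): true for classes and interfaces
// that have members named opAddRef and opRelease, declared or inherited.
enum isRCO(T) = (is(T == class) || is(T == interface))
    && __traits(hasMember, T, "opAddRef")
    && __traits(hasMember, T, "opRelease");

class Plain {}

class Counted {
    private uint refs = 1;
    void opAddRef() { ++refs; }
    void opRelease() { --refs; }
}

interface ICounted {
    void opAddRef();
    void opRelease();
}

class Derived : Counted {}   // inherits both methods

static assert(!isRCO!Plain);
static assert(isRCO!Counted);
static assert(isRCO!ICounted);
static assert(isRCO!Derived);

void main() {}
```

Under the proposal the compiler would perform this categorization itself; the helper only shows that the pattern is mechanically detectable.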

Rules

RCOs are handled as follows:

  • Whenever a new reference to an object is created (e.g. auto a = b;), the compiler inserts a call to opAddRef in the generated code. The call is inserted only if the reference is not null. The lowering of auto a = lvalExpr; to pre-DIP74 code is conceptually as follows:
auto a = function(x) { if (x) x.opAddRef(); return x; }(lvalExpr);
  • If a new reference is created from an rvalue (including a call to new or the result of a function), no call to opAddRef is inserted. As a consequence, there is no call inserted for the first reference created via a constructor (i.e. it is assumed the constructor already puts the object in the appropriate state). For example the lowering of auto a = new Widget; does not insert a call to opAddRef.
  • Whenever a reference to an object is assigned (e.g. a = b), first b.opAddRef() is called and then a.opRelease() is called, followed by the reference assignment itself. Calls are only made if the respective objects are not null. So the lowering of e.g. lvalExprA = lvalExprB; to pre-DIP74 code is:
function(ref x, y) { 
    if (y) y.opAddRef();
    scope(failure) y.opRelease();
    if (x) x.opRelease();
    x = y;
}(lvalExprA, lvalExprB);

The complexity of this code underscores the importance of making opAddRef and especially opRelease nothrow. In that case the scope(failure) statement may be elided.
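The ordering in the assignment lowering (add first, release second) matters for self-assignment. The following sketch simulates the lowering by hand in present-day D, where no calls are inserted automatically. The `Widget` class, its `alive` flag, and the `assign` and `assignWrongOrder` functions are illustrative assumptions, with `alive = false` standing in for deallocation.

```d
class Widget {
    uint refs = 1;        // a freshly constructed object holds one reference
    bool alive = true;    // stand-in for "not yet deallocated"
    void opAddRef() nothrow { ++refs; }
    void opRelease() nothrow {
        if (--refs == 0) alive = false;
    }
}

// Hand-written form of the lowering of `x = y;`. Both methods are nothrow,
// so the scope(failure) guard is left out.
void assign(ref Widget x, Widget y) nothrow {
    if (y) y.opAddRef();
    if (x) x.opRelease();
    x = y;
}

// Same steps in the opposite order, shown only for contrast.
void assignWrongOrder(ref Widget x, Widget y) nothrow {
    if (x) x.opRelease();
    if (y) y.opAddRef();
    x = y;
}

void main() {
    auto a = new Widget;
    assign(a, a);                    // self-assignment
    assert(a.refs == 1 && a.alive);  // count never reached zero

    auto b = new Widget;
    assignWrongOrder(b, b);
    assert(!b.alive);                // count hit zero before being restored

    auto c = new Widget;
    auto d = new Widget;
    auto oldC = c;                   // raw alias kept only for inspection
    assign(c, d);
    assert(c is d && d.refs == 2);
    assert(!oldC.alive);             // the previous target lost its last reference
}
```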

  • Whenever a reference to an object goes out of scope, the compiler inserts an implicit call to opRelease. The call is inserted only if the reference is not null.
  • struct types that have RCO members accommodate calls to opRelease during their destruction.
  • The pass-by-value protocol for RCOs is as follows: the caller calls opAddRef and the callee calls opRelease. These calls are sequenced and handled the same as copy constructor calls and destructor calls, respectively, for struct objects. Example:
struct A {
    this(this);
    ~this();
}
void fun(A x, Widget y, A z) {
}

In the code above, calling fun entails the sequence:

  1. All parameters are memcpy'd
  2. Postblit call for x
  3. y.opAddRef()
  4. Postblit call for z
  5. Function is entered
  6. Destructor call for z
  7. y.opRelease()
  8. Destructor call for x
  9. Function returns

The lowering of a call fun(exprA, exprB, exprC) to pre-DIP74 code is:

fun(exprA, function(x) { if (x) x.opAddRef(); return x; }(exprB), exprC);

However, this translation is approximate. If exprC throws an exception (causing fun to not be entered), the compiler inserts code for calling opRelease() against the second argument. A more accurate lowering is:

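{
    typeof(exprB) t;
    return fun(
        exprA,
        // Closure (not a plain function literal) because it captures t.
        (x) { t = x; if (t) t.opAddRef(); return t; }(exprB),
        // Invoked in place so that exprC is evaluated as the third argument.
        () { scope(failure) if (t) t.opRelease(); return exprC; }()
    );
}()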

This lowering assumes left-to-right evaluation of function parameters. If fun itself throws, it is responsible for calling opRelease against its second argument.
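The caller/callee split and the cleanup on a throwing argument can be simulated by hand in present-day D. This is a minimal sketch under stated assumptions: `Widget`, `exprC`, `fun`, and `callFun` are illustrative names, and every opAddRef/opRelease call that the compiler would insert is written out explicitly.

```d
class Widget {
    uint refs = 1;
    void opAddRef() nothrow { ++refs; }
    void opRelease() nothrow { --refs; }
}

// Stand-in for the third argument expression; throws on demand.
int exprC(bool fail) {
    if (fail) throw new Exception("exprC failed");
    return 3;
}

// Callee's half of the protocol: release the by-value parameter on exit.
void fun(int a, Widget y, int c) {
    scope(exit) if (y) y.opRelease();
}

// Hand-written equivalent of the call fun(1, w, exprC(fail)).
void callFun(Widget w, bool fail) {
    Widget t = w;
    if (t) t.opAddRef();                      // caller's half of the protocol
    int c;
    {
        scope(failure) if (t) t.opRelease();  // undo if a later argument throws
        c = exprC(fail);
    }
    fun(1, t, c);
}

void main() {
    auto w = new Widget;

    callFun(w, false);
    assert(w.refs == 1);      // add and release balanced across the call

    bool caught = false;
    try {
        callFun(w, true);
    } catch (Exception e) {
        caught = true;
    }
    assert(caught);
    assert(w.refs == 1);      // fun was never entered, yet the count is balanced
}
```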

  • A function that returns a local RCO calls neither opAddRef nor opRelease against that value. Example:
Widget fun() {
    auto a = new Widget;
    return a; // no calls inserted
}

Note: this is not an optimization. The compiler does not have the discretion to insert additional opAddRef/opRelease calls.

  • A function that returns an RCO rvalue calls neither opAddRef nor opRelease against that value. Example:
Widget fun() {
    return new Widget; // no calls inserted
}

Note: this is not an optimization. The compiler does not have the discretion to insert additional opAddRef/opRelease calls.

  • Functions that return an RCO (other than the two cases above) call opAddRef against the returned reference. Example:
Widget fun(ref Widget a) {
    return a; // opAddRef inserted
}
  • The compiler considers that opRelease is the inverse of opAddRef, and therefore is at liberty to elide pairs of calls to opAddRef/opRelease. Example:
Widget fun() {
    auto a = new Widget;
    auto b = a;
    return b;
}

Applying the rules defined above would have fun's lowering insert one call to opAddRef (for creating b) and one call to opRelease (when a goes out of scope). However, these calls may be elided.
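To see why the elision is harmless, the two variants can be written out by hand in present-day D. In this sketch `Widget`, `funUnelided`, and `funElided` are illustrative names, and the inserted calls are explicit.

```d
class Widget {
    uint refs = 1;
    void opAddRef() nothrow { ++refs; }
    void opRelease() nothrow { --refs; }
}

// What the rules prescribe before elision.
Widget funUnelided() {
    auto a = new Widget;      // refs == 1, no call for the constructor
    auto b = a;
    if (b) b.opAddRef();      // new reference b: refs == 2
    if (a) a.opRelease();     // a goes out of scope: refs == 1
    return b;                 // returning a local: no calls
}

// The same function with the opAddRef/opRelease pair elided.
Widget funElided() {
    auto a = new Widget;      // refs == 1
    auto b = a;
    return b;
}

void main() {
    assert(funUnelided().refs == 1);
    assert(funElided().refs == 1);   // same observable count either way
}
```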

  • Implicit conversion to supertypes (class or interface) is allowed ONLY if the supertype is also a reference counted type. It follows that reference counted types cannot be converted to Object (unless Object itself defines the two methods).
  • Explicit casting to void* does not entail a call to opAddRef.
  • Typechecking methods of reference counted types is done the same as for structs. This is important because it limits which references to internal state a reference counted type may expose. Consider:
@safe class Widget1 {
    private int data;
    ref int getData() { return data; } // fine
    ...
}

@safe class Widget2 {
    private int data;
    ref int getData1() { return data; } // ERROR
    ref int getData2() return { return data; } // fine
    ulong opAddRef();
    ulong opRelease();
    ...
}

This is because it is safe for a garbage collected object to escape references to its internal state. The same is not allowed for reference counted objects because they are expected to be deallocated in a deterministic manner (same as e.g. struct objects on the stack).

Defining a non-copyable reference type

Using @disable this(this); is a known idiom for creating struct objects that can be created and moved but not copied. The same is achievable with RCOs by means of @disable opAddRef();
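For comparison, the struct idiom and a hand-checked analogue for a class look roughly like this in present-day D. `Unique`, `make`, and `Handle` are illustrative names. Without DIP74 the compiler inserts no opAddRef calls, so the class part only shows that any attempt to call a disabled opAddRef is rejected at compile time.

```d
import std.algorithm.mutation : move;

struct Unique {
    int payload;
    @disable this(this);      // can be created and moved, but not copied
}

Unique make(int v) { return Unique(v); }

// Copying is rejected at compile time.
static assert(!__traits(compiles, { Unique a; Unique b = a; }));

class Handle {
    @disable void opAddRef() {}   // any call to this method is a compile error
    void opRelease() {}
}

static assert(!__traits(compiles, (new Handle).opAddRef()));

void main() {
    auto u = make(42);        // initialized from an rvalue: fine
    auto v = move(u);         // explicit move: fine
    assert(v.payload == 42);
}
```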

Defining a reference counted object with deallocation

Classic reference counting techniques can be used with opAddRef and opRelease.

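import core.memory : GC;

class Widget {
    private uint _refs = 1;   // the reference created by the constructor
    void opAddRef() {
        ++_refs;
    }
    void opRelease() {
        if (_refs > 1) {
            --_refs;
        } else {
            // Last reference: run the destructor, then free the memory.
            this.destroy();
            GC.free(cast(void*) this);
        }
    }
    ...
}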

Usually such approaches also use private constructors and an object factory to ensure the same allocation method is used during creation and destruction of the object.

If the object only needs to free this (and no other owned resources), the typechecking ensured by the compiler is enough to verify safety (however, @trusted needs to be applied to the call that frees this).
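The factory approach mentioned above might look as follows. This is a sketch under stated assumptions rather than a prescribed design: the `create` factory, the `live` counter, and `liveCount` are illustrative names, the object is allocated with `new` and therefore freed with `GC.free`, and all reference counting calls are made by hand because present-day D does not insert them.

```d
import core.memory : GC;

class Widget {
    private uint _refs = 1;
    private static int live;          // number of constructed, not yet destroyed objects

    private this() { ++live; }        // construction only through create()
    ~this() { --live; }

    // Factory: the single place that decides how Widgets are allocated.
    static Widget create() { return new Widget; }

    static int liveCount() { return live; }

    void opAddRef() { ++_refs; }

    void opRelease() {
        if (_refs > 1) {
            --_refs;
        } else {
            destroy(this);                 // run the destructor
            GC.free(cast(void*) this);     // matches the allocation in create()
        }
    }
}

void main() {
    auto w = Widget.create();
    assert(Widget.liveCount() == 1);

    w.opAddRef();                     // second reference
    w.opRelease();                    // back to one reference
    assert(Widget.liveCount() == 1);

    w.opRelease();                    // last reference: destroyed and freed
    assert(Widget.liveCount() == 0);  // w must not be used past this point
}
```

Keeping allocation in `create` and deallocation in `opRelease` within one class makes it easier to keep the two in agreement if the allocation strategy changes.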

Defining a type that owns resources

TODO

Copyright

This document has been placed in the Public Domain.