Difference between revisions of "DIP25"

From D Wiki
m (s/2nd try/3rd try/, s/&example,method/&example.method/)
 
(43 intermediate revisions by 6 users not shown)
Line 1: Line 1:
 
== DIP25: Sealed references ==
 
== DIP25: Sealed references ==
 +
 
{| class="wikitable"
 
{| class="wikitable"
 
!Title:
 
!Title:
Line 11: Line 12:
 
|-
 
|-
 
|Status:
 
|Status:
|Draft
+
|Approved for 2.067
 
|-
 
|-
 
|Created:
 
|Created:
Line 17: Line 18:
 
|-
 
|-
 
|Last Modified:
 
|Last Modified:
|2013-02-05
+
|{{REVISIONYEAR}}-{{REVISIONMONTH}}-{{REVISIONDAY}}
 
|-
 
|-
 
|Author:
 
|Author:
|Andrei Alexandrescu and Walter Bright
+
|Walter Bright and Andrei Alexandrescu
 
|-
 
|-
 
|Links:
 
|Links:
|
+
|See also: [http://wiki.dlang.org/DIP71 DIP71: 'noscope' and 'out!param' attributes]
 
|}
 
|}
  
 
== Abstract ==
 
== Abstract ==
  
D offers a number of features aimed at systems-level coding, such as unrestricted pointers, casting between integers and pointers, and the [http://dlang.org/function.html#system-functions <code>@system</code>] attribute. These means, combined with the other features of D, make it a complete and expressive language for systems-level tasks. On the other hand, economy of means should be exercised in defining such powerful but dangerous features. Most other features should offer good safety guarantees with little or no loss in efficiency or expressiveness. This proposal makes <code>ref</code> provide such a guarantee: with the proposed rules, it is impossible in safe code to have <code>ref</code> refer to a destroyed object. The restrictions introduced are not backward compatible, but disallow code that is stylistically questionable and that can be easily replaced either with equivalent and clearer code.
+
D offers a number of features aimed at systems-level coding, such as unrestricted pointers, casting between integers and pointers, and the [http://dlang.org/function.html#system-functions <code>@system</code>] attribute. These means, combined with the other features of D, make it a complete and expressive language for systems-level tasks. On the other hand, economy of means should be exercised in defining such powerful but dangerous features. Most other features should offer good safety guarantees with little or no loss in efficiency or expressiveness. This proposal makes <code>ref</code> provide such a guarantee: with the proposed rules, it is impossible in safe code to have <code>ref</code> refer to a destroyed object. The restrictions introduced are not entirely backward compatible, but disallow code that is stylistically questionable and that can be easily replaced either with equivalent and clearer code.
  
 
== In a nutshell ==
 
== In a nutshell ==
 +
 +
This DIP proposes that any <code>ref</code> parameter that a function received and also wants to return must be also annotated with <code>return</code>. Annotation are deduced for templates and lambdas, but must be explicit for all other declarations. Example:
 +
 +
<syntaxhighlight lang=D>
 +
@safe:
 +
ref int fun(ref int a) { return a; } // ERROR
 +
ref int gun(return ref int a) { return a; } // FINE
 +
ref T hun(T)(ref T a) { return a; } // FINE, templates use deduction
 +
</syntaxhighlight>
  
 
== Description ==
 
== Description ==
Line 44: Line 54:
 
   int x;
 
   int x;
 
   return x; // Error: escaping reference to local variable x  
 
   return x; // Error: escaping reference to local variable x  
 +
}
 +
 +
struct S {
 +
    int x;
 +
}
 +
 +
ref int hun() {
 +
  S s;
 +
  return s.x; // see https://issues.dlang.org/show_bug.cgi?id=13902
 +
}
 +
 +
ref int iun() {
 +
  int a[42];
 +
  return a[5]; // see https://issues.dlang.org/show_bug.cgi?id=13902
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
However, this enforcement is shallow. The following code compiles and allows reads and writes through defunct stack locations, bypassing scoping and lifetime rules:
+
However, this enforcement is shallow (even after fixing [https://issues.dlang.org/show_bug.cgi?id=13902 issue 13902]). The following code compiles and allows reads and writes through defunct stack locations, bypassing scoping and lifetime rules:
  
 
<syntaxhighlight lang=D>
 
<syntaxhighlight lang=D>
ref int id(ref int x) {
+
ref int identity(ref int x) {
   return x;  
+
   return x; // pass-through function that does nothing
 
}
 
}
  
 
ref int fun(int x) {
 
ref int fun(int x) {
   return id(x);  
+
   return identity(x); // escape the address of a parameter
 
}
 
}
  
 
ref int gun() {
 
ref int gun() {
 
   int x;
 
   int x;
   return id(x);  
+
   return identity(x); // escape the address of a local
 
}
 
}
</syntaxhighlight>
 
  
The escape pattern is obvious in this simple example with all code in sight, and may be found automatically. The problem is that generally the compiler cannot see the body of <code>id</code>. We need to devise a method for compiling such functions separately.
+
struct S {
 +
    int x;
 +
    ref int get() { return x; }
 +
}
  
We want to devise rules that allow us to pass objects by reference ''down'' into functions, and return references ''up'' from functions, while disallowing cases such as the above when a reference passed up ends up referring to a deallocated temporary.
+
ref int hun(S x) {
 +
  return x.get; // escape the address of a part of a parameter
 +
}
  
=== Typechecking rules ===
+
ref int iun() {
 
+
  S s;
The rules below discuss under what circumstances functions receiving and/or returning <code>ref T</code> may be called, where <code>T</code> is some arbitrary type. Let us also denote with <code>S<sub>T</sub></code> any <code>struct</code> that has a non-static member variable of type <code>T</code>.
+
  return s.get; // escape the address of part of a local
 
+
}
# An invocation of a function that takes a parameter of type <code>ref T</code> may pass one of the following:
 
## An lvalue of type <code>T</code>, including function arguments, array and <code>struct</code> members;
 
## An incoming <code>ref T</code> parameter or a member of type <code>T</code> of <code>S<sub>T</sub></code> received as <code>ref S<sub>T</sub></code>
 
## The result of a function returning <code>ref T</code>, or a member of <code>S<sub>T</sub></code> returned as <code>ref S<sub>T</sub></code>.
 
# A function that returns a <code>ref T</code> may return one of the following:
 
##  A static lvalue of type <code>T</code>, including members of static <code>struct</code> values;
 
## A member variable of type <code>T</code> belonging to a <code>class</code> object;
 
## A <code>ref T</code> parameter;
 
## A member of type <code>T</code> of <code>S<sub>T</sub></code> that has been passed as <code>ref S<sub>T</sub></code> into the function;
 
## The invocation of a function <code>fun</code> returning <code>ref T</code> IF <code>fun</code> does NOT take any parameters of type <code>T</code> or <code>S<sub>T</sub></code>.
 
## The invocation of a function <code>fun</code> returning <code>ref T</code> IF none of <code>fun</code>'s parameters of type <code>ref T</code> and <code>ref S</code> are bound to local variables.
 
 
 
=== Discussion and Examples ===
 
  
The rules allow unrestricted pass-down and conservatively restrict pass-up to avoid escaping values. Itemized discussion follows.
+
ref int jun() {
 
+
  return S().get; // worst contender: escape the address of a part of an rvalue
1.1 Regular lvalues can be passed down:
 
 
 
<syntaxhighlight lang=D>
 
void fun(ref T);
 
 
 
struct S { int a; T b; }
 
 
 
void caller(T v1, S v2) {
 
    static T v3;
 
    T v4;
 
    static S v5;
 
    S v6;
 
 
 
    // Fine: pass argument
 
    fun(v1);
 
    // Fine: pass member of argument
 
    fun(v2.b);
 
    // Fine: pass static lvalue
 
    fun(v3);
 
    // Fine: pass of stack variable
 
    fun(v4);
 
    // Fine: pass member of static struct
 
    fun(v5.b);
 
    // Fine: pass member of local struct
 
    fun(v6.b);
 
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
1.2. This rule allows forwarding references transitively.
+
The escape patterns are obvious in these simple examples that make all code available and use no recursion, and may be found automatically. The problem is that generally the compiler cannot see the body of <code>identity</code> or <code>S.get()</code>. We need to devise a method that derives enough information for safety analysis only given the function signatures, not their bodies.
 +
 
 +
This DIP devises rules that allow passing objects by reference ''down'' into functions, and return references ''up'' from functions, whilst disallowing cases such as the above when a reference passed up ends up referring to a deallocated temporary.
  
<syntaxhighlight lang=D>
+
=== Adding <tt>return</tt> as a parameter attribute ===
void fun(ref T);
 
  
struct S { int a; T b; }
+
The main issue is typechecking functions that return a <tt>ref T</tt> and accept some of their parameters by <tt>ref</tt>. Those that attempt to return locals or parts thereof are already addressed directly, contingent to [https://issues.dlang.org/show_bug.cgi?id=13902 Issue 13902]. The one case remaining is allowing a function returning <code>ref T</code> to return a (part of a) parameter passed by <code>ref</code>.
  
void caller(ref T v1, ref S v2) {
+
The key is to distinguish legal from illegal cases. One simple but overly conservative option would be to simply disallow returning a <code>ref</code> parameter or part thereof. That makes <code>identity</code> impossible to implement, and as a consequence accessing elements of a container by reference becomes difficult or impossible to typecheck properly. Also, heap-allocated structures with deterministic destruction (e.g. reference counted) must insert member copies for all accesses.  
    // Fine: pass ref argument
 
    fun(v1);
 
    // Fine: pass member of ref argument
 
    fun(v2.b);
 
}
 
</syntaxhighlight>
 
  
1.3. This rule enables passing down references obtained from other function calls.
+
This proposal promotes adding <code>return</code> as an attribute that propagates the lifetime of a parameter to the return value of a function. With the proposed semantics, a function is disallowed to return a <code>ref</code> parameter or a part thereof UNLESS the parameter is also annotated with <code>return</code>. Under the proposed semantics <code>identity</code> will be spelled as follows:
  
 
<syntaxhighlight lang=D>
 
<syntaxhighlight lang=D>
void fun(ref T);
+
@safe ref int wrongIdentity(ref int x) {  
ref T gun();
+
     return x; // ERROR! Cannot return a ref, please use "return ref"
struct S { int a; T b; }
 
ref S hun();
 
 
 
void caller() {
 
     // Fine: pass ref result
 
    fun(gun());
 
    // Fine: pass member of ref result
 
    fun(hun().b);
 
 
}
 
}
</syntaxhighlight>
+
@safe ref int identity(return ref int x) {  
 
+
     return x; // fine
2.1. Static lvalues can be returned:
 
 
 
<syntaxhighlight lang=D>
 
struct S { int a; T b; }
 
static T v1;
 
static S v2;
 
 
 
ref T caller(bool condition) {
 
     static T v3;
 
    static S v4;
 
    // Fine
 
    if (condition) return fun(v1);
 
    if (condition) return fun(v2);
 
    if (condition) return fun(v3);
 
    if (condition) return fun(v4);
 
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
2.2.Member variables of classes can be returned because they live on the garbage-collected heap:
+
Just by seeing the signature <code>ref int identity(return ref int x)</code> the compiler assumes that the result of identity must have a shorter or equal lifetime than <code>x</code> and typechecks callers accordingly. Example (given the previous definition of <code>identity</code>):
  
 
<syntaxhighlight lang=D>
 
<syntaxhighlight lang=D>
class C { int a; T b; }
+
@safe ref int fun(return ref int x) {  
 
+
    int a;
ref T caller() {
+
    return a; // ERROR per current language rules
     auto c = new C;
+
    static int b;
     return c.b;
+
    return b; // fine per current language rules
 +
    return identity(a); // ERROR, this may escape the address of a local
 +
    return x; // fine, propagate x's lifetime to output
 +
     return identity(x); // fine, propagate x's lifetime through identity to the output
 +
     return identity(identity(x)); // fine, propagate x's lifetime twice through identity to the output
 
}
 
}
</syntaxhighlight>
 
  
2.3. This rule allows returning back an incoming parameter, which in turn allows implementing the <code>identity</code> function and idioms derived from it.
+
@safe ref int gun(ref int input) {
 
+
    static int[42] data;
<syntaxhighlight lang=D>
+
     return data[input]; // works, can always return static-lived data
ref T fun(ref T v1) {
 
     return v1;
 
 
}
 
}
</syntaxhighlight>
 
  
2.4. As above, but for members of <code>structs</code>.
+
@safe struct S {
 
+
    private int x;
<syntaxhighlight lang=D>
+
    ref int get() return { return x; } // should work, see next section
struct S { int a; T b; }
 
ref T fun(ref S v1) {
 
    return v1.b;
 
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
2.5. This allows to pass up the result of a function that has no chance at all to return a reference to a local.
+
===Interaction with <tt>auto ref</tt>===
  
<syntaxhighlight lang=D>
+
Syntactically it is illegal to use <tt>auto ref</tt> and <tt>return ref</tt> on the same parameter. Deduction of the <tt>return</tt> attribute still applies as discussed below.
// Assume T is not double or string
 
ref T fun(double, ref string);
 
struct S { int a; T b; }
 
ref S gun(double, ref string);
 
  
ref T caller(bool condition, ref T v1) {
+
===Deduction===
    string s = "asd";
 
    if (condition) return fun(1, s);
 
    return gun(1, s).b;
 
}
 
</syntaxhighlight>
 
  
2.6. This is the most sophisticated rule. It allows passing up the result of a function while disallowing those cases in which the function may actually return a reference to a local.
+
Deduction of the <tt>return</tt> attribute will be effected under the same conditions as for <tt>pure</tt> (currently for generic and lambda functions). That means the generic <tt>identity</tt> function does not require the <tt>return</tt> attribute:
  
 
<syntaxhighlight lang=D>
 
<syntaxhighlight lang=D>
ref T fun(T);
+
auto ref T identity(auto ref T x) {
ref T gun(ref T);
+
     return x; // correct, no need for return
struct S { int a; T b; }
 
ref S hun(S);
 
ref S iun(ref S);
 
 
 
ref T caller(bool condition, ref T v1, ref S v2, T v3, S v4) {
 
    T v5;
 
    S v6;
 
 
 
    // Fine, no ref parameters
 
    if (condition) return fun(v1);
 
    if (condition) return fun(v2.b);
 
    if (condition) return fun(v3);
 
     if (condition) return fun(v4.b);
 
    if (condition) return fun(v5);
 
    if (condition) return fun(v6.b);
 
 
 
    // Fine, bound to ref parameters
 
    if (condition) return gun(v1);
 
    if (condition) return gun(v2.b);
 
 
 
    // Not fine, bound to locals
 
    // if (condition) return gun(v3);
 
    // if (condition) return gun(v4.b);
 
    // if (condition) return gun(v5);
 
    // if (condition) return gun(v6.b);
 
 
 
    // Fine, no ref at all
 
    if (condition) return hun(v2).b;
 
    if (condition) return hun(v4).b;
 
    if (condition) return hun(v6).b;
 
 
 
    // Fine, ref bound to ref argument
 
    if (condition) return iun(v2).b;
 
 
 
    // Not fine, bound to locals
 
    // if (condition) return iun(v4);
 
    // if (condition) return iun(v6);
 
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
=== Member functions ===
+
===Types of Result vs. Parameters===
 
 
The rules above apply to member functions as well, considering that the <code>this</code> special parameter in a method belonging to type <code>A</code> is passed as a <code>ref A</code> parameter. This may cause problems with rvalue structs. (Currently, D allows calling a method against a struct rvalue.) Special rules concerning struct rvalues may be necessary.
 
  
== Taking address ==
+
Consider:
 
 
This proposal introduces a related restriction: taking the address of the following entities shall be disallowed, even in <code>@system</code>.
 
 
 
* Parameters (either value or <code>ref</code>)
 
* Stack-allocated locals.
 
* Member variables of a <code>struct</code> if the <code>struct</code> is a parameter (either value or <code>ref</code>) or stack-allocated.
 
** Note that using a pointer to a <code>struct</code> does allow taking the address of a member.
 
** Also note that a <code>struct</code> that is part of a <code>class</code> object also allows address taking.
 
* The result of functions that return <code>ref</code>.
 
 
 
This is because escaping pointers away from expressions is too dangerous and should be more explicit. The capability must still be present, otherwise very simple uses are not possible anymore. Consider:
 
  
 
<syntaxhighlight lang=D>
 
<syntaxhighlight lang=D>
bool parse1(ref double v) {
+
@safe ref int fun(return ref float x);
    // Use C's scanf
+
</syntaxhighlight>
    return scanf("%f", &v) == 1; // Error: cannot take the address of v
 
}
 
  
double parse2() {
+
This function arguably cannot return a value scoped within the lifetime of its argument for the simple reason it's impossible to find an <code>int</code> somewhere in a <code>float</code> (apart from unsafe address manipulation). However, this DIP ignores types; if a parameter is <code>return ref</code>, it is always considered potentially escaped as a result. It is in fact possible that the author of <code>fun</code> wants to constrain its output's lifetime for unrelated reasons.
    // Use C's scanf, 2nd try
 
    double v;
 
    enforce(scanf("%f", &v) == 1); // Error: cannot take the address of v
 
    return v;
 
}
 
  
double parse3() {
+
Future versions of this DIP may relax this rule.
    // Use C's scanf, 3rd try
 
    auto pv = new double;
 
    enforce(scanf("%f", pv) == 1); // Fine
 
    return *pv;
 
}
 
</syntaxhighlight>
 
  
That would force many variables to exist on the heap even though it's easy to figure that the code is safe since the semantics of <code>scanf</code> is understood by the programmer. To address this issue, this proposal fosters introducing a standard function with the signature:
+
===Multiple Parameters===
  
<syntaxhighlight lang=D>
+
If multiple <code>return ref</code> parameters are present, the result's lifetime is conservatively assumed to be enclosed in the lifetime of the shortest-lived of those arguments.
@system T* addressOf(ref T value);
 
</syntaxhighlight>
 
  
The function returns the address of <code>value</code> and can only be used in <code>@system</code> or <code>@trusted</code> code. <code>addressOf</code> itself cannot use the <code>&</code> address-of operator because it's forbidden even in <code>@system</code> code. But there are many possible implementations, including escaping into C or assembler. One possible portable implementation is:
+
===Member Functions===
 +
Member functions of <code>struct</code>s must qualify <code>this</code> with <code>return</code> if they want to return a result by <code>ref</code> that won't outlive <code>this</code>. Example:
  
 
<syntaxhighlight lang=D>
 
<syntaxhighlight lang=D>
@system T* addressOf(ref T value) {
+
@safe struct S {
     static T* id(T* p) { return p; }
+
     static int a;
     auto pfun = cast(T* function(ref T)) id;
+
    int b;
     return *pfun(value);
+
    ref int fun() { return a; } // fine, callers assume infinite lifetime
 +
     ref int gun() { return b; } // ERROR! Cannot return a direct member
 +
     ref int hun() return { return b; } // fine, result is scoped within this
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
This relies on the fact that at binary level a <code>ref</code> parameter is passed as a pointer.
+
===@safe===
 
+
For the initial release, the requirement of returns for <code>ref</code> parameter data to be marked with <code>return</code> will only apply to <code>@safe</code> functions. The reasons for this are to avoid breaking existing code, and because it's not yet clear whether this feature will interfere with valid constructs in a system language.
With this function available as part of the standard library, efficient code can be written that forwards to <code>scanf</code> without the compiler knowing its semantics:
 
  
 
<syntaxhighlight lang=D>
 
<syntaxhighlight lang=D>
@trusted bool parse1(ref double v) {
+
@safe  ref int fun(ref int x)       { return x;} // Error
    // Use C's scanf
+
@safe  ref int gun(return ref int x) { return x;} // OK
    return scanf("%f", addressOf(v)) == 1; // Fine
+
@system ref int hun(ref int x)       { return x;} // OK for now, @system code.
}
+
@system ref int jun(return ref int x) { return x;} // preferred, gives more hints to compiler for lifetime of return value
 
 
@trusted double parse2() {
 
    // Use C's scanf, 2nd try
 
    double v;
 
    enforce(scanf("%f", addressOf(v)) == 1); // Fine
 
    return v;
 
}
 
 
</syntaxhighlight>
 
</syntaxhighlight>
 
===Note: Isn't replacing <code>&</code> with <code>addressOf</code> just shuffling? How does it mark an improvement?===
 
 
Forbidding use of <code>&</code> against specific objects has two positive effects. First, it eliminates by design some thorny syntactic ambiguities discussed in [[DIP23]]. In the expression <code>&fun</code> or <code>&expression.method</code>, the <code>&</code> may apply to either the function/method itself or to the value returned by the function/method (which doesn't compile if the result is an rvalue, but does and is unsafe if the result is a <code>ref</code>). Forbidding the unsafe case leaves only one meaning for <code>&</code> in this context: take the address of the function or delegate. To get the address of the result, one would write <code>addressof(fun)</code> or <code>addressof(expr.method)</code>, which has unsurprising syntax and semantics.
 
 
The second beneficial effect is that <code>addressOf</code> is annotated appropriately with <code>@system</code> and as such integrates naturally with the rest of the type system without a need to ascribe special rules and exceptions to <code>&</code>.
 
  
 
== Copyright ==
 
== Copyright ==
 
This document has been placed in the Public Domain.
 
This document has been placed in the Public Domain.
 +
 +
[[Category: DIP]]

Latest revision as of 19:51, 1 September 2015

DIP25: Sealed references

Title: Sealed references
DIP: 25
Version: 1
Status: Approved for 2.067
Created: 2013-02-05
Last Modified: 2015-09-01
Author: Walter Bright and Andrei Alexandrescu
Links: See also: DIP71: 'noscope' and 'out!param' attributes

Abstract

D offers a number of features aimed at systems-level coding, such as unrestricted pointers, casting between integers and pointers, and the @system attribute. These means, combined with the other features of D, make it a complete and expressive language for systems-level tasks. On the other hand, economy of means should be exercised in defining such powerful but dangerous features. Most other features should offer good safety guarantees with little or no loss in efficiency or expressiveness. This proposal makes ref provide such a guarantee: with the proposed rules, it is impossible in safe code to have ref refer to a destroyed object. The restrictions introduced are not entirely backward compatible, but they disallow only code that is stylistically questionable and that can easily be replaced with equivalent and clearer code.

In a nutshell

This DIP proposes that any ref parameter that a function receives and also wants to return must also be annotated with return. Annotations are deduced for templates and lambdas, but must be explicit for all other declarations. Example:

@safe:
ref int fun(ref int a) { return a; } // ERROR
ref int gun(return ref int a) { return a; } // FINE
ref T hun(T)(ref T a) { return a; } // FINE, templates use deduction

Description

Currently, D has some provisions for avoiding dangling references:

ref int fun(int x) {
  return x; // Error: escaping reference to local variable x 
}

ref int gun() {
  int x;
  return x; // Error: escaping reference to local variable x 
}

struct S {
    int x;
}

ref int hun() {
  S s;
  return s.x; // see https://issues.dlang.org/show_bug.cgi?id=13902
}

ref int iun() {
  int[42] a;
  return a[5]; // see https://issues.dlang.org/show_bug.cgi?id=13902
}

However, this enforcement is shallow (even after fixing issue 13902). The following code compiles and allows reads and writes through defunct stack locations, bypassing scoping and lifetime rules:

ref int identity(ref int x) {
  return x; // pass-through function that does nothing 
}

ref int fun(int x) {
  return identity(x); // escape the address of a parameter 
}

ref int gun() {
  int x;
  return identity(x); // escape the address of a local
}

struct S {
    int x;
    ref int get() { return x; }
}

ref int hun(S x) {
  return x.get; // escape the address of a part of a parameter 
}

ref int iun() {
  S s;
  return s.get; // escape the address of part of a local
}

ref int jun() {
  return S().get; // worst contender: escape the address of a part of an rvalue
}

The escape patterns are obvious in these simple examples, which make all code available and use no recursion, and they may be found automatically. The problem is that generally the compiler cannot see the body of identity or S.get(). We need to devise a method that derives enough information for safety analysis given only the function signatures, not their bodies.

This DIP devises rules that allow passing objects by reference down into functions and returning references up from functions, whilst disallowing cases such as the above, in which a reference passed up ends up referring to a deallocated temporary.

Adding return as a parameter attribute

The main issue is typechecking functions that return a ref T and accept some of their parameters by ref. Those that attempt to return locals or parts thereof are already addressed directly, contingent on Issue 13902. The one case remaining is allowing a function returning ref T to return a (part of a) parameter passed by ref.

The key is to distinguish legal from illegal cases. One simple but overly conservative option would be to disallow returning a ref parameter or any part of one. That would make identity impossible to implement and, as a consequence, would make accessing elements of a container by reference difficult or impossible to typecheck properly. Heap-allocated structures with deterministic destruction (e.g. reference counted) would also have to insert member copies for all accesses.

This proposal adds return as a parameter attribute that propagates the lifetime of a parameter to the return value of a function. With the proposed semantics, a function may not return a ref parameter or a part thereof unless the parameter is also annotated with return. Under the proposed semantics identity is spelled as follows:

@safe ref int wrongIdentity(ref int x) { 
    return x; // ERROR! Cannot return a ref, please use "return ref"
}
@safe ref int identity(return ref int x) { 
    return x; // fine
}

From the signature ref int identity(return ref int x) alone, the compiler assumes that the result of identity has a lifetime no longer than that of x, and typechecks callers accordingly. Example (given the previous definition of identity):

@safe ref int fun(return ref int x) { 
    int a;
    return a; // ERROR per current language rules
    static int b;
    return b; // fine per current language rules
    return identity(a); // ERROR, this may escape the address of a local
    return x; // fine, propagate x's lifetime to output
    return identity(x); // fine, propagate x's lifetime through identity to the output
    return identity(identity(x)); // fine, propagate x's lifetime twice through identity to the output
}

@safe ref int gun(ref int input) {
    static int[42] data;
    return data[input]; // works, can always return static-lived data
}

@safe struct S {
    private int x;
    ref int get() return { return x; } // should work, see Member Functions below
}

Interaction with auto ref

Syntactically it is illegal to use auto ref and return ref on the same parameter. Deduction of the return attribute still applies as discussed below.

Deduction

Deduction of the return attribute will be effected under the same conditions as for pure (currently for generic and lambda functions). That means the generic identity function does not require the return attribute:

auto ref T identity(T)(auto ref T x) {
    return x; // correct, no need for return
}
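
As a minimal sketch of the difference between deduced and explicit annotations, the following contrasts a template with an ordinary function. The names pick and pickInt are hypothetical and do not appear in the DIP; the sketch assumes a compiler that implements the rules described here.

```d
@safe:

// Template: the compiler sees the body and infers `return` for `x`.
ref T pick(T)(ref T x) { return x; }

// Ordinary function: the annotation must be written out.
ref int pickInt(return ref int x) { return x; }

unittest
{
    int v = 1;
    pick(v) = 2;        // writes through the returned reference
    assert(v == 2);
    pickInt(v) += 1;    // same, through the explicitly annotated function
    assert(v == 3);
}
```

Both calls are accepted because v outlives the references returned from it.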

Types of Result vs. Parameters

Consider:

@safe ref int fun(return ref float x);

This function arguably cannot return a value scoped within the lifetime of its argument, for the simple reason that it is impossible to find an int somewhere in a float (apart from unsafe address manipulation). However, this DIP ignores types; if a parameter is return ref, it is always considered potentially escaped as a result. It is in fact possible that the author of fun wants to constrain its output's lifetime for unrelated reasons.

Future versions of this DIP may relax this rule.
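
The type-blind rule can be illustrated with a hedged sketch. The functions counter and caller and the variables hits and fallback are hypothetical names introduced only for this illustration; the rejected line is left as a comment because it is expected not to compile under the rule described above.

```d
@safe:

// The result never aliases `x`, yet the signature says it might.
ref int counter(return ref float x)
{
    static int hits;    // static storage, outlives every call
    ++hits;
    return hits;
}

ref int caller()
{
    float local = 1.5f;
    // return counter(local); // rejected: the result is treated as
    //                        // scoped within `local`, whatever its type
    static int fallback;
    return fallback;
}

unittest
{
    float f = 0;
    immutable before = counter(f);
    assert(counter(f) == before + 1);
}
```

The commented-out return is the case the section describes: the compiler reasons from the signature alone, so the float argument constrains the int result.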

Multiple Parameters

If multiple return ref parameters are present, the result's lifetime is conservatively assumed to be enclosed in the lifetime of the shortest-lived of those arguments.
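
A small sketch of this rule, using the hypothetical functions larger and caller (not part of the DIP): the result of larger is treated as no longer-lived than the shorter-lived of its two arguments.

```d
@safe:

// Either argument may be returned, so both carry `return`.
ref int larger(return ref int a, return ref int b)
{
    if (a > b) return a;
    return b;
}

ref int caller(return ref int outer)
{
    static int fallback;    // static storage, never shorter-lived than `outer`
    int local;
    // return larger(outer, local); // rejected: result could refer to `local`
    return larger(outer, fallback); // accepted: bounded by `outer`
}

unittest
{
    int v = 5;
    caller(v) = 9;    // 5 > 0, so the reference to `v` comes back
    assert(v == 9);
}
```

The rejected call shows why the assumption is conservative: even if larger happened to return outer at run time, the signature permits it to return local.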

Member Functions

Member functions of structs must qualify this with return if they want to return a result by ref that won't outlive this. Example:

@safe struct S {
    static int a;
    int b;
    ref int fun() { return a; } // fine, callers assume infinite lifetime
    ref int gun() { return b; } // ERROR! Cannot return a direct member
    ref int hun() return { return b; } // fine, result is scoped within this
}
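
The container use case mentioned earlier can be sketched the same way. Buffer is a hypothetical type introduced for illustration; the sketch assumes that indexing a fixed-size array member counts as returning a part of this.

```d
@safe struct Buffer
{
    private int[4] data;

    // `return` ties the lifetime of the result to `this`.
    ref int opIndex(size_t i) return { return data[i]; }
}

@safe unittest
{
    Buffer b;
    b[0] = 10;    // assigns through the reference returned by opIndex
    b[1] = 32;
    assert(b[0] + b[1] == 42);
}
```

Element access stays by reference, with no copies, while callers are prevented from returning such a reference beyond the lifetime of the Buffer it came from.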

@safe

For the initial release, the requirement that a returned ref parameter (or part of one) be marked with return applies only to @safe functions. The reasons are to avoid breaking existing code and that it is not yet clear whether this feature will interfere with valid constructs in a systems language.

@safe   ref int fun(ref int x)        { return x;} // Error
@safe   ref int gun(return ref int x) { return x;} // OK
@system ref int hun(ref int x)        { return x;} // OK for now, @system code.
@system ref int jun(return ref int x) { return x;} // preferred, gives more hints to compiler for lifetime of return value

Copyright

This document has been placed in the Public Domain.