Difference between revisions of "DIP35"

From D Wiki
Jump to: navigation, search
(Created page with "== DIP35: Addition to DIP25: Sealed References == {| class="wikitable" !Title: !'''Sealed References Amendment''' |- |DIP: |35 |- |Version: |1 |- |Status: |Draft |- |Created...")
 
 
(11 intermediate revisions by one other user not shown)
Line 18: Line 18:
 
|-
 
|-
 
|Last Modified:
 
|Last Modified:
|2013-04-07
+
|2013-04-20
 
|-
 
|-
 
|Author:
 
|Author:
|Zach the Mystic
+
|Zach Tollen
 
|-
 
|-
 
|Links:
 
|Links:
Line 29: Line 29:
 
== Abstract ==
 
== Abstract ==
  
[http://wiki.dlang.org/DIP25 DIP25: Sealed References] makes important inroads toward safe references in D. It consists of two parts, the first one about making 'ref' safe, and the second about '&' and 'addressOf'. The present DIP focuses mainly on the first part, proposing using 'out'> as a function attribute and 'scope' parameters to add flexibility to the restrictions imposed by DIP25, but also makes a small proposal regarding the '&' operator.
+
[http://wiki.dlang.org/DIP25 DIP25: Sealed References] makes important inroads toward safe references in D. It consists of two parts, the first one about making 'ref' safe, and the second about '&' and 'addressOf'. The present DIP focuses mainly on the first part, proposing using 'out' as a function attribute and 'scope' parameters to add flexibility to the restrictions imposed by DIP25, but also makes a small proposal regarding the '&' operator.
  
 
In this proposal, I will assume that the compiler tracks the locality of each variable with three categories: 'local', which is known to be stack-allocated; 'global', which means static or heap-allocated; and 'reference parameter', which is unknown, but could be either. (The eventual intent should be to group every action involving references into one of the three categories: '@safe', 'error', and '@system'.)
 
In this proposal, I will assume that the compiler tracks the locality of each variable with three categories: 'local', which is known to be stack-allocated; 'global', which means static or heap-allocated; and 'reference parameter', which is unknown, but could be either. (The eventual intent should be to group every action involving references into one of the three categories: '@safe', 'error', and '@system'.)
Line 68: Line 68:
  
 
<syntaxhighlight lang="d">
 
<syntaxhighlight lang="d">
ref T copy(scope T a) {
+
ref T copy(scope ref T a) {
 
   T* t = new T;
 
   T* t = new T;
   *t = p;
+
   *t = a;
 
   return *t;
 
   return *t;
 
   //return a; // Error: may not return a 'scope' parameter
 
   //return a; // Error: may not return a 'scope' parameter
Line 80: Line 80:
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
 +
 +
While in many cases it would be attractive to be able to have 'scope' imply 'ref', since 'scope' must always refer to at least one reference, the several existing reference types in D, e.g. objects, arrays, delegates, would make that a bad idea.
  
 
If a function has more than one parameter, 'scope' could be used to fine-tune what may or may not be returned from it:
 
If a function has more than one parameter, 'scope' could be used to fine-tune what may or may not be returned from it:
  
 
<syntaxhighlight lang="d">
 
<syntaxhighlight lang="d">
ref T add(ref T a, scope T b) {
+
ref T add(ref T a, scope ref T b) {
 
   a = a + b; // assume T implements opAdd
 
   a = a + b; // assume T implements opAdd
 
   return a;
 
   return a;
Line 105: Line 107:
 
struct S {
 
struct S {
 
   int t;
 
   int t;
   ref S copyAdd(scope S p) scope {
+
   ref S copyAdd(scope ref S p) scope {
 
     S* s = new S;
 
     S* s = new S;
 
     s.t = t + p.t;
 
     s.t = t + p.t;
Line 119: Line 121:
 
</syntaxhighlight>
 
</syntaxhighlight>
  
One could even imagine foregoing 'scope' altogether in favor of 'out', although function parameters could not then be individually marked. Still, 'out' is a cosmetic addition and not strictly necessary. The minor problem of placing it after a declaration and confusing it with an 'out' contract is solved by either ensuring that 'out' is not the last attribute listed, or by inserting 'body' between 'out' and '{'.
+
It may be desirable to divide the possible meanings of 'scope' into more than one attribute. Escaping a reference parameter by returning it is generally safer than assigning it to a global variable.
 +
 
 +
<syntaxhighlight lang="d">
 +
ref T func(ref T a) {
 +
  return a; // Escape by return, @safe if tracked at call site
 +
  static T* s = &a; // Escape by global pointer, @system
 +
}
 +
</syntaxhighlight>
 +
 
 +
'out' as a function attribute could indicate that no reference parameter will be returned by reference, while 'scope' would only ensure that the parameter may not escape by any other means, while still being returnable by reference. This use of 'scope' would invalidate its use in fine-tuning parameters, which may require introducing another parameter attribute, '@noreturn', for example, with 'out' now implicitly marking parameters this instead.
 +
 
 +
<syntaxhighlight lang="d">
 +
ref T func(scope ref T a, ref T b, @noreturn ref T c) {
 +
  return a; // pass
 +
  return b; // pass
 +
  return c; // error
 +
  T* p = &a; // Escape to local pointer, @safe? if *p is tracked
 +
  static T* s = &a; // error
 +
}
 +
out T func(scope ref T a, ref T b, @noreturn ref T c) {
 +
  return a; // error
 +
  return b; // error
 +
  return c; // error
 +
  return new T; // pass
 +
  static T t;
 +
  return t; // pass
 +
}
 +
</syntaxhighlight>
 +
 
 +
There may be significant advantage in separating the two aspects of ref parameters into two different function attributes. Attribute inference may be a way of alleviating the burden on programmers of adding all the necessary attributes to their functions.
 +
 
 +
The minor problem of placing it after a declaration and confusing it with an 'out' contract is solved by either ensuring that 'out' is not the last attribute listed, or by inserting 'body' between 'out' and '{'.
  
 
<syntaxhighlight lang="d">
 
<syntaxhighlight lang="d">
Line 125: Line 158:
 
T func(ref T a) out body { return *new T; }
 
T func(ref T a) out body { return *new T; }
 
</syntaxhighlight>
 
</syntaxhighlight>
 
A final issue is that sometimes it is a class object which is being returned, in which case 'scope' and 'out' should not imply 'ref' but should still provide the relevant safety guarantees.
 
  
 
== The '&' operator and C Functions ==
 
== The '&' operator and C Functions ==
Line 145: Line 176:
 
== Copyright ==
 
== Copyright ==
  
This document has been placed in the Public Domain.'
+
This document has been placed in the Public Domain.
 +
 
 +
[[Category: DIP]]

Latest revision as of 19:51, 1 September 2015

DIP35: Addition to DIP25: Sealed References

Title: Sealed References Amendment
DIP: 35
Version: 1
Status: Draft
Created: 2013-04-07
Last Modified: 2013-04-20
Author: Zach Tollen
Links: http://wiki.dlang.org/DIP25

Abstract

DIP25: Sealed References makes important inroads toward safe references in D. It consists of two parts, the first one about making 'ref' safe, and the second about '&' and 'addressOf'. The present DIP focuses mainly on the first part, proposing using 'out' as a function attribute and 'scope' parameters to add flexibility to the restrictions imposed by DIP25, but also makes a small proposal regarding the '&' operator.

In this proposal, I will assume that the compiler tracks the locality of each variable with three categories: 'local', which is known to be stack-allocated; 'global', which means static or heap-allocated; and 'reference parameter', which is unknown, but could be either. (The eventual intent should be to group every action involving references into one of the three categories: '@safe', 'error', and '@system'.)

'scope' Parameters

ref T func(ref T p) {
  T a;          // locality: 'local'
  static T b;   // locality: 'global'
  p;            // locality: 'ref parameter'

  return a; // Error: a local may not be returned by ref
  return b; // @safe: returning a global by ref
  return p; // @safe, so long as caller keeps track of result!
  return *new T; // @safe, heap == global
}

Above, the reference parameter 'p' may be returned. The safety measures suggested by DIP25 will ensure that the return value will not itself be escaped if a local was passed in. This allows the "identity" function to work perfectly. But what if you wanted to create a "copy" function, which was guaranteed to return a non-local, regardless of what was passed in?

ref T identity(ref T a) { return a; }
ref T copy(ref T p) {
  T* t = new T;
  *t = p;
  return *t;
}

ref T testFuncs() {
  T t;
  return identity(t); // Error, as it should
  return copy(t);     // @safe, but error according to DIP25
}

'ref' alone is not powerful enough to deal with this issue efficiently. Without any more information available to the caller, the result of 'copy(t)' must be assumed local, even if it's meant to be global. If you could mark each parameter with additional information, using the 'scope' attribute, the return value could be accurately identified as global, and because the information is found in the signature of the function, it is available both when being called and when compiling the function.

ref T copy(scope ref T a) {
  T* t = new T;
  *t = a;
  return *t;
  //return a; // Error: may not return a 'scope' parameter
}

ref T testCopy() {
  T t;
  return copy(t); // Pass: t is 'scope', return *must* be global
}

While in many cases it would be attractive to be able to have 'scope' imply 'ref', since 'scope' must always refer to at least one reference, the several existing reference types in D, e.g. objects, arrays, delegates, would make that a bad idea.

If a function has more than one parameter, 'scope' could be used to fine-tune what may or may not be returned from it:

ref T add(ref T a, scope ref T b) {
  a = a + b; // assume T implements opAdd
  return a;
  //return b; // Error: may not return 'scope' parameter
}

ref T test() {
  T c;
  static T d;
  //return add(c, d); // Error, 'c' is local
  return add(d, c); // Pass, only 'd' may be returned
}

The 'out' Function Attribute

While 'scope' parameters are attractive, marking a lot of parameters 'scope' might be tedious. One option is to imagine the 'out' keyword as a function attribute, guaranteeing that no reference may be escaped, in effect marking all parameters 'scope'. It is slightly more intuitive than 'scope', especially as 'scope' would be necessary to add outside the parameter list of member functions, to describe the hidden 'this' reference.

struct S {
  int t;
  ref S copyAdd(scope ref S p) scope {
    S* s = new S;
    s.t = t + p.t;
    return s;
  }
  // Same effect, but more intuitive
  out S copyAdd(ref S p) {
    S* s = new S;
    s.t = t + p.t;
    return s;
  }
}

It may be desirable to divide the possible meanings of 'scope' into more than one attribute. Escaping a reference parameter by returning it is generally safer than assigning it to a global variable.

ref T func(ref T a) {
  return a; // Escape by return, @safe if tracked at call site
  static T* s = &a; // Escape by global pointer, @system
}

'out' as a function attribute could indicate that no reference parameter will be returned by reference, while 'scope' would only ensure that the parameter may not escape by any other means, while still being returnable by reference. This use of 'scope' would invalidate its use in fine-tuning parameters, which may require introducing another parameter attribute, '@noreturn', for example, with 'out' now implicitly marking parameters this instead.

ref T func(scope ref T a, ref T b, @noreturn ref T c) {
  return a; // pass
  return b; // pass
  return c; // error
  T* p = &a; // Escape to local pointer, @safe? if *p is tracked
  static T* s = &a; // error
}
out T func(scope ref T a, ref T b, @noreturn ref T c) {
  return a; // error
  return b; // error
  return c; // error
  return new T; // pass
  static T t;
  return t; // pass
}

There may be significant advantage in separating the two aspects of ref parameters into two different function attributes. Attribute inference may be a way of alleviating the burden on programmers of adding all the necessary attributes to their functions.

The minor problem of placing it after a declaration and confusing it with an 'out' contract is solved by either ensuring that 'out' is not the last attribute listed, or by inserting 'body' between 'out' and '{'.

T func(ref T a) out @trusted { return *new T; }
T func(ref T a) out body { return *new T; }

The '&' operator and C Functions

This is a very minor additional suggestion regarding DIP25, and I don't expect it to resolve the whole issue. While '&' is generally dangerous, backward compatibility with C functions would make it inconvenient to disallow. Therefore, some value may be gained by permitting its use when calling C functions.

void func1(T* t) {}
extern (C) void func2(T* t) {}

void test() {
  T a;
  func1(&a); // Error, according to DIP25
  func2(&a); // Pass, for backward compatibility reasons
}

Copyright

This document has been placed in the Public Domain.