DIP39

From D Wiki
Revision as of 02:08, 11 May 2013 by Timotheecour (talk | contribs) (Safety)

DIP 39: Safe rvalue references: backwards compatible, safe against ref/nonref code evolution, compatible with UFCS and DIP38

Title: Safe rvalue references: backwards compatible, safe against ref/nonref code evolution, compatible with UFCS and DIP38.
DIP: 39
Version: 1
Status: Draft
Created: 2013-05-10
Last Modified: 2013-05-10
Author: Timothee Cour
Links:

Abstract

We propose to introduce rvalue references that are:

  • safe: guarantees memory safety so that references will always point to valid memory.
  • compatible with DIP38: can use the same inref/outref internal compiler annotation for input references that a function can return by ref.
  • backwards compatible: currently valid D code will continue to work without change. In addition, new code becomes valid with the call site rvalue ref annotation.
  • safe against ref/nonref code evolution: the compulsory call site rvalue ref annotation turns ref/nonref changes into compile errors instead of silently changing code behavior.
  • both const ref and ref can be used with rvalue refs (more flexible than C++)
  • no call site ref annotation when the input ref argument is already an lvalue (different from C#), for backwards compatibility (and to make it less verbose)
  • compatible with UFCS

Details

Suppose we have a function that takes an input by ref:

T2 funRef(ref T a);

We can use it as before with an lvalue LV (backwards compatible):

funRef(LV);

We can also use it with an rvalue expression 'RV', given a compulsory call site annotation that tells the compiler to convert the rvalue to an lvalue via a temporary.

I propose a postfix '^' to denote this ('^' is currently only a binary operator in D, so the postfix form is unused), although there are alternatives; see the section 'Alternative symbols for call site rvalue annotation'.

funRef(RV^);

The rule is simple: 'fun(x^)' is used if and only if fun takes x by ref and x is an rvalue (not an lvalue). Otherwise use fun(x).

with 'funRef(ref T a)':

  • funRef(LV); //ok
  • funRef(RV^); //ok
  • funRef(LV^); //error
  • funRef(RV); //error

with 'funNonRef(T a)':

  • funNonRef(LV); //ok
  • funNonRef(RV); //ok
  • funNonRef(LV^); //error
  • funNonRef(RV^); //error
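
For reference, the sketch below shows how current D (without this proposal, and assuming default compiler settings) already treats the cases that need no annotation. The names T, makeT, funRef and funNonRef are placeholders invented for this illustration. Since 'RV^' is not valid D today, the rvalue case is spelled with an explicit temporary.

```d
struct T { int value; }

// Hypothetical functions mirroring funRef/funNonRef above.
void funRef(ref T a) { a.value += 1; }
void funNonRef(T a) { a.value += 1; } // modifies only its own copy

// Hypothetical rvalue source: a function returning by value.
T makeT() { return T(41); }

void main()
{
    T lv = T(1);

    funRef(lv);              // ok: an lvalue binds to ref
    assert(lv.value == 2);

    funNonRef(lv);           // ok: passed by value, lv is unchanged
    funNonRef(makeT());      // ok: rvalues can be passed by value
    assert(lv.value == 2);

    // An rvalue does not bind to ref in current D (default settings assumed).
    static assert(!__traits(compiles, funRef(makeT())));

    // What funRef(makeT()^) would stand for under this proposal:
    auto tmp = makeT();
    funRef(tmp);
    assert(tmp.value == 42);
}
```

The two calls that the table marks as errors for funRef and funNonRef with '^' cannot be shown here, because the annotation does not exist in the language yet.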

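The LRL (Lvalue-Rvalue-Lvalue) problem is avoided by never creating a temporary when the argument is an lvalue (as pointed out by Andrei). In the example below, 'a' is an lvalue of type float, so it is not silently converted to a double temporary and the call is rejected:

import std.math : isNaN;
void fix(ref double x) { if (isNaN(x)) x = 0; }
float a;
fix(a); //error: 'a' is an lvalue, so no temporary is created for it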
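
To make the hazard concrete, here is a minimal sketch of what would go wrong if an lvalue were silently converted to a temporary. It assumes current D with default compiler settings and uses std.math.isNaN; the variable name tmp is invented for the illustration.

```d
import std.math : isNaN;

void fix(ref double x) { if (isNaN(x)) x = 0; }

void main()
{
    float a;                 // floats default-initialize to NaN in D
    assert(isNaN(a));

    // Current D rejects this call: a float lvalue does not bind to ref double.
    static assert(!__traits(compiles, fix(a)));

    // Hand-written version of the hazard: convert to a temporary, then fix it.
    double tmp = a;          // the float-to-double conversion, made explicit
    fix(tmp);
    assert(tmp == 0);        // the temporary was repaired...
    assert(isNaN(a));        // ...but the original lvalue is still NaN
}
```

The caller believes 'a' was fixed, yet only the temporary changed. Rejecting the call outright avoids that silent failure.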

Implementation details

The compiler will create a temporary whose lifetime spans the entire expression in which RV^ occurs:

expr ( funRef(RV^)  )
//rewritten by compiler as:
auto _tmp=RV;
expr ( funRef(_tmp)  );
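
The rewrite can be tried by hand in current D. The following is only a sketch: bump, twice and makeValue are hypothetical helpers invented here, and 'return ref' is used so that bump may return its argument by reference.

```d
// Hypothetical helpers, invented for this sketch.
ref int bump(return ref int x) { x += 1; return x; } // returns its argument by ref
int twice(int v) { return 2 * v; }
int makeValue() { return 20; }                       // rvalue source

void main()
{
    // Proposed spelling: twice(bump(makeValue()^))
    // Hand-written lowering, following the rewrite above:
    auto _tmp = makeValue();
    int result = twice(bump(_tmp));

    assert(result == 42);   // (20 + 1) * 2
    assert(_tmp == 21);     // the temporary outlived the call to bump
}
```

Because the temporary lives as long as the enclosing expression, the reference returned by bump is still valid when twice reads it.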

UFCS

The rule for UFCS is the same:

with 'funRef(ref T a)':

  • LV.funRef(); //ok
  • RV^.funRef(); //ok
  • LV^.funRef(); //error
  • RV.funRef(); //error

with 'funNonRef(T a)':

  • LV.funNonRef(); //ok
  • RV.funNonRef(); //ok
  • LV^.funNonRef(); //error
  • RV^.funNonRef(); //error

So currently valid code will stay valid, and new code becomes possible in a safe way.
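
As a rough check of the backwards compatibility claim for UFCS, the sketch below exercises the cases that need no annotation in current D (default compiler settings assumed). T, makeT and the bodies of funRef and funNonRef are invented for this illustration; in particular, funNonRef returning a T is an assumption made only so that its result can be inspected.

```d
struct T { int value; }

void funRef(ref T a) { a.value *= 2; }        // hypothetical, takes by ref
T funNonRef(T a) { a.value += 1; return a; }  // hypothetical, takes by value

T makeT() { return T(10); }                   // rvalue source

void main()
{
    T lv = T(3);

    lv.funRef();                              // LV.funRef(): ok
    assert(lv.value == 6);

    assert(lv.funNonRef().value == 7);        // LV.funNonRef(): ok
    assert(makeT().funNonRef().value == 11);  // RV.funNonRef(): ok

    // RV.funRef(): rejected by current D (default settings assumed).
    static assert(!__traits(compiles, makeT().funRef()));

    // Stand-in for RV^.funRef() until a call site annotation exists:
    auto tmp = makeT();
    tmp.funRef();
    assert(tmp.value == 20);
}
```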

Safety against return by ref/nonref code evolution

This problem was pointed out by Andrei.

struct A{ref T opIndex(int i){...}}
void fix(ref T x){if (isNaN(x)) x = 0; }
A a;
fix(a[0]);//ok since a[0] is an lvalue.
//later on the code changes: opIndex returns by value
fix(a[0]);//now an error, since a[0] has become an rvalue
//so code that would silently do the wrong thing is rejected.
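
Current D already behaves this way, and the compulsory annotation is what lets the proposal keep that property. The sketch below places the two versions of the container side by side; ByRef and ByValue are names invented here, T is taken to be double, and default compiler settings are assumed.

```d
import std.math : isNaN;

// Two hypothetical versions of the container from the example above.
struct ByRef
{
    double[1] data;          // elements start as NaN
    ref double opIndex(int i) return { return data[i]; }
}

struct ByValue
{
    double[1] data;
    double opIndex(int i) { return data[i]; }
}

void fix(ref double x) { if (isNaN(x)) x = 0; }

void main()
{
    ByRef a;
    fix(a[0]);               // ok: a[0] is an lvalue
    assert(a[0] == 0);

    ByValue b;
    // After the change to return by value, the same call no longer compiles,
    // instead of silently fixing a copy.
    static assert(!__traits(compiles, fix(b[0])));
    assert(isNaN(b[0]));
}
```

Under the proposal, the author of the second call would have to write fix(b[0]^) explicitly, which makes the change in meaning visible at the call site.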

Also: Safety against another type of code evolution: proposal to error on ignored return value

This proposal is independent of DIP39 and of rvalue refs, but I mention it here because it addresses another code evolution problem:

//suppose now the 'fix' code changes to:
T fix(T x){if (isNaN(x)) x = 0; return x;}
fix(a[0]); //compiles but does nothing, regardless of opIndex returning by value or by ref.

I propose to make it an error to ignore a return value, and to add a function 'ignore' for convenience that consumes and does nothing:

void ignore(T)(T a){}
// can be used as:
[1,2,3].array; // error: return value of array is ignored
[1,2,3].array.ignore; // ok

That'll address this issue.
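
A small sketch of the problem and of the proposed helper, written against current D, where ignoring the return value is still accepted. Taking T to be double is an assumption made for the illustration.

```d
import std.math : isNaN;

// By-value variant of fix from the example above.
double fix(double x) { if (isNaN(x)) x = 0; return x; }

// The proposed helper: consumes a value and does nothing.
void ignore(T)(T a) {}

void main()
{
    double v = double.nan;

    fix(v);                 // compiles in current D, but the result is dropped
    assert(isNaN(v));       // v was never repaired

    v = fix(v);             // using the return value has the intended effect
    assert(v == 0);

    fix(v).ignore;          // explicit discard, as the proposal would require
}
```

With the proposed rule, the first call would be a compile error, so the author would have to choose between the second and third forms.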

Safety

On its own, this proposal leaves memory safety as it is in D today: the existing pitfalls remain and no new ones are introduced. In conjunction with DIP 38, memory safety would be guaranteed at compile time. In conjunction with the proposal introduced at DConf 2013, it would be guaranteed by a runtime check.

Alternative symbols for call site rvalue annotation

There are two things to decide on: whether the annotation is prefix or postfix, and which symbol to use.

prefix vs postfix:

  • postfix fun(RV^): (proposed): compatible with left-to-right pipelines in D: [1,2].sort.map!fun.uniq
  • prefix fun(^RV): compatible with '&' location wrt RV argument

This can affect ease of disambiguation wrt existing symbols.

which annotation to use (regardless of prefix/postfix):

  • fun(RV^); //'^' is unused as a postfix operator in D, reminds of a C++ special reference extension
  • fun(ref RV); //reminds of C# call site annotation, and reminds of function signature
  • fun(RV@); //@ has UDA meaning in D, but that could be made unambiguous
  • fun(RV#); //# is used by the '#line' special token sequence in D, but that could be made unambiguous
  • fun(RV?); //? has a special (a?b:c) meaning in D, but that could be made unambiguous
  • fun(RV&); //probably a bad idea, since for a templated function fun(T)(ref T a) this could call fun!(typeof(RV*))(RV&)

Copyright

This document has been placed in the Public Domain.