Difference between revisions of "DIP38"
Timotheecour (talk | contribs) (→Safe ref validation at compile time) |
Timotheecour (talk | contribs) (→Abstract) |
||
Line 28: | Line 28: | ||
== Abstract == | == Abstract == | ||
− | + | We introduce 2 types of references for ref input arguments of ref return functions: inref and outref (the exact keywords can be discussed later) to distinguish whether a given input argument can be escaped or not (possibly via field accesses): | |
+ | ref fun(int inref a, int outref b, int outref c, int d); | ||
+ | indicates that b and c can be escaped by ref return (there could indeed be multiple return statements), but not a. | ||
− | + | We argue that these annotations (inref and outref) are sufficient for guaranteeing ref safety, simply by typechecking a program under a set of allowable conversions. | |
− | + | We propose two schemes: | |
+ | |||
+ | * scheme A: the user annotates the ref return functions on his own (just a choice between inref or outref is needed for each ref input arg) | ||
+ | * scheme B: the compiler takes care of the annotations via a proposed procedural analysis | ||
+ | |||
+ | If the function is a method or internal function, the function itself is marked as inref or outref as reference to the implicit 'this' parameter. | ||
+ | |||
+ | The annotations are part of the type system and written in the automatically generated di interface files. | ||
== Internal Compiler Annotation == | == Internal Compiler Annotation == |
Revision as of 05:29, 7 May 2013
Contents
DIP38: Safe references and rvalue references without runtime checks.
Title: | Safe references and rvalue references without runtime checks |
---|---|
DIP: | 38 |
Version: | 1 |
Status: | Draft |
Created: | 2013-05-06 |
Last Modified: | 2013-05-06 |
Author: | Timothee Cour |
Links: |
Abstract
We introduce 2 types of references for ref input arguments of ref return functions: inref and outref (the exact keywords can be discussed later) to distinguish whether a given input argument can be escaped or not (possibly via field accesses): ref fun(int inref a, int outref b, int outref c, int d); indicates that b and c can be escaped by ref return (there could indeed be multiple return statements), but not a.
We argue that these annotations (inref and outref) are sufficient for guaranteeing ref safety, simply by typechecking a program under a set of allowable conversions.
We propose two schemes:
- scheme A: the user annotates the ref return functions on his own (just a choice between inref or outref is needed for each ref input arg)
- scheme B: the compiler takes care of the annotations via a proposed procedural analysis
If the function is a method or internal function, the function itself is marked as inref or outref as reference to the implicit 'this' parameter.
The annotations are part of the type system and written in the automatically generated di interface files.
Internal Compiler Annotation
Dconf13 introduced safe references enabled by a runtime check (see email thread from Walter: 'Rvalue references - The resolution'). I propose a formulation that is safe, yet doesn't require any runtime check. The compiler automatically annotates any ref return function with inref/outref on ref input arguments:
eg:
struct U{T x;}
ref T foo(ref T a, ref T b, ref U c){
static T d;
if(condition)
return a;
else if(condition(b))
return c.x;
else
return d;
}
will be rewritten internally by the compiler as having the signature:
ref T foo(outref T a, inref T b, outref U c);
indicating that ref depends on ref arguments a and c (dependency on c is via field access). The other input ref arguments are marked 'inref' because they can't be returned by ref.
Second example: when the function is a member (say of a struct), the 'this' parameter is implicit, and the same rules apply:
struct S { T t; ref T fooc(ref T a) { if(condition) return t; else return a;} }
will be rewritten internally by the compiler as having the signature:
struct S { T t; ref T fooc(outref T a) outref; }
because there's a ref dependency on a and the implicit 'this' argument (the annotation for 'this' is at the method level, as const would be).
The di files will have to write those annotations written down, which shall be done automatically.
Safe ref validation at compile time
Given those inref/outref annotations, it is easy to validate/invalidate ref safety; we simply check whether the program typechecks under the following conversion rules:
Allowed type conversions:
- global => outref //global: gc-allocated, static, etc.
- output of ref-return function call => outref
- outref . field => outref // field access
- local => inref
- global => inref
- outref => inref
- temporary => inref
Examples: taken from Walter's above mentioned email:
//Case A:
ref T fooa(ref T t) { return t; }
//=> ref T fooa(outref T t);
ref T bar() { T t; return fooa(t); } // error: wrong conversion local => outref
//Case B:
ref T foob(ref U u) { return u.t; }
//=>ref T foob(outref U u) { return u.t; }
ref U bar() { T t; return foob(t); } // error: wrong conversion local => outref
//Case C:
struct S { T t; ref T fooc() { return t; } }
//=>struct S { T t; ref T fooc() outref; } //outref refers to hidden this parameter
ref T bar() { S s; return s.fooc(); } // error: wrong conversion local => outref
//Case D:
ref T food() {
T t;
ref T bar() { return t; }
//=>ref T bar() outref; //outref refers to hidden this parameter, this could be rewritten as: ref T bar(outref void*this) ;
return bar(); // error: wrong conversion local => outref (since 'this' refers to local stack)
}
//case E:
Transitively calling other functions:
ref T fooe(T t) { return ref fooa(t); } //error because of conversion local=>outref when attempting to call fooa(t).
Algorithmic details for ref dependency analysis
Let's take the following example for illustration:
ref T foo1(ref T a, T b, ref T c) { if(...) return foo2(a); else return foo2(c); }
ref T foo2(ref T a) { return a; }
The propagation algorithm goes as follows
- initialize each ref argument of ref-return functions with 'ref' (ie we don't know yet whether it's inref or outref)
- construct an oriented graph:
* nodes are ref-return functions * edges are ref-return dependencies (one edge per return statement in a ref return function): with example above, there is a graph with 2 nodes (foo1 and foo2) and a single edge (foo1 -> foo2).
- while some ref annotations have changed do:
* for each node with a 'ref' annotation * recompute annotations and remove '?' if all outgoing edges have no '?' annotations
- case A1) if there are no nodes with '?' annotations, then we have succeeded in compile time inference of ref dependency
- case A2) otherwise, for each node with '?' annotations, then there are loops in the graph, and for these nodes we fall back in runtime check on return addresses as proposed in Dconf13. This case should be rare in practice. However there might be a slightly more complex algorithm in that case too that doesn't require runtime check (I will think about it).
For the above example we have: iteration 0(initialization): foo1:ref()? foo2:ref()?
iteration 1: foo1:ref()? foo2:ref(0)
iteration 2: foo1:ref(0,2) foo2:ref(0)
Loops in the graph (case A2) correspond to the case of mutually recursive ref return functions. For example:
ref T foo1(ref T a, T b, ref T c) { if(...) return foo2(a,0,c); else return a; }
ref T foo2(ref T a, T b, ref T c) { if(...) return foo1(a,1,c); else return c; }
Rvalue references
As for rvalue references, the compiler shall introduce a temporary variable before calling the ref function as has been discussed elsewhere. The same rules apply.
Copyright
This document has been placed in the Public Domain.