Difference between revisions of "DIP49"

Revision as of 11:24, 10 November 2013

Title: Define qualified postblit
DIP: 49
Version: 1
Status: Draft
Created: 2013-11-10
Last Modified: 2013-11-10
Author: Hara Kenji

Abstract

If an object has some indirections, postblit should handle them correctly. This DIP will classify postblits into four types, and will resolve the postblit design issues.

Description

Mutable Postblit

Signature: this(this);

Modifying indirections via mutable fields is allowed.

struct SM {
    int var;
    int[] arr;
    this(this) {
        static assert(is(typeof(this.var) == int));
        static assert(is(typeof(this.arr) == int[]));
        var = 1;     // OK
        arr[] += 1;  // OK
    }
}

The indirections may be shared with the original object, so a mutable postblit may modify data that the source object still refers to.

SM sm1 = SM(1, [1,2,3]);
SM sm2 = sm1;  // mutable postblit is called
assert(sm2.arr == [2,3,4]);
assert(sm1.arr == [2,3,4]); // modified
assert(sm1.arr.ptr == sm2.arr.ptr);
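
As a rough illustration in present-day D (not the qualified syntax proposed here), a plain postblit can avoid touching the source by duplicating the array before modifying it. The type name Deep is hypothetical and only serves this sketch.

```d
// Present-day D, not the qualified postblit syntax proposed in this DIP.
// One way to try it: dmd -unittest -main -run deep.d  (file name is arbitrary)
struct Deep
{
    int[] arr;

    this(this)
    {
        arr = arr.dup;  // give the copy its own array first
        arr[] += 1;     // now only the copy is modified
    }
}

unittest
{
    auto d1 = Deep([1, 2, 3]);
    auto d2 = d1;                       // postblit runs on d2
    assert(d1.arr == [1, 2, 3]);        // the source is untouched
    assert(d2.arr == [2, 3, 4]);
    assert(d1.arr.ptr !is d2.arr.ptr);  // no shared indirection
}
```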

Mutable postblit will be invoked when the source object is mutable.

  • mutable to mutable [m1]
  • mutable to const [m2]

You can regard the [m2] case as generating a mutable copy that is then referred to through a const reference.

Constant Postblit

Signature: this(this) const;

After the blit copy is done, you cannot modify indirections, because they are qualified at least with const. On the other hand, you can re-initialize references.

struct SC {
    int var;
    int[] arr;
    this(this) const {
        static assert(is(typeof(this.var) == const int));
        static assert(is(typeof(this.arr) == const int[]));
        var = 1;     // OK
      //arr[] += 1;  // NG
        arr = this.arr.dup; // OK
    }
}

Inside a constant postblit, indirections may be shared with the original object, so modifying them is disallowed. Re-initializing the references themselves remains possible.

Constant postblit will be invoked for copies whose qualifier transition keeps the qualifier the same or weakens it.

  • mutable to mutable [c1] ... may break type system
  • mutable to const [c2]
  • const to const [c3]
  • immutable to const [c4]
  • immutable to immutable [c5] ... may break type system

Immutable Postblit

Signature: this(this) immutable;

Inside immutable postblit, you cannot modify indirections, because they are qualified with immutable.

struct SI {
    int var;
    int[] arr;
    this(this) immutable {
        static assert(is(typeof(this.var) == immutable int));
        static assert(is(typeof(this.arr) == immutable int[]));
        var = 1;     // OK
      //arr[] += 1;  // NG
        arr = this.arr.idup; // OK
    }
}

Immutable postblit will be invoked when the source object is immutable.

  • immutable to const [i1]
  • immutable to immutable [i2]

You can regard the [i1] case as generating an immutable copy that is then referred to through a const reference.

Overload resolution (1)

These three postblits can overload each other. In that case, the following rules are applied for overload resolution:

a. If both [m2] and [c2] are possible, [c2] is preferred.
[m2] invokes a 2-step operation: copy a mutable object to mutable, then qualify the copy with const.
On the other hand, [c2] invokes a 1-step operation: copy a mutable object to a const object.
So the shorter-distance operation [c2] is chosen.
b. If both [c4] and [i1] are possible, [c4] is preferred.
[i1] invokes a 2-step operation: copy an immutable object to immutable, then qualify the copy with const.
On the other hand, [c4] invokes a 1-step operation: copy an immutable object to a const object.
So the shorter-distance operation [c4] is chosen.

From a practical point of view, the rule has the advantage of enabling a copy-on-write strategy. Even if your struct type needs a deep copy for mutable objects, defining this(this) const {} lets you elide that cost when you only need a const copy.

Unique Postblit

Signature: this(this) inout;

The three postblits above extend the plain old postblit definition naturally, by adding qualifier conversion to the picture. However, they are still insufficient for copies between incompatible qualifiers (e.g. mutable to immutable, const to mutable, etc).

To fix the last issue, I propose "unique postblit" concept.

The main issue with the plain old postblit concept is that the postblit does nothing about reinterpreting the type qualifiers of the object's indirections. For example:

struct X {
    int[] arr;
    this(this) {}
}

If you want to copy a mutable X to immutable, the compiler automatically copies the object image bitwise (as with memcpy) before the postblit call. However, the user-defined postblit does nothing for the 'arr' field, so the copy reinterprets int[] arr as immutable(int[]).

int[] arr = [1,2,3];
X m = X(arr);
immutable X i = m;  // IF this copy invoke X.this(this)
static assert(is(typeof(i.arr) == immutable));
assert(i.arr == [1,2,3]);
arr[] = 100;
assert(i.arr == [1,2,3]);  // fails!
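
For comparison, a copy that really is independent behaves the way the last assertion expects. The following is only a minimal sketch in present-day D; it uses the built-in idup array property and involves no postblit at all. The variable name snapshot is made up for the example.

```d
// One way to try it: dmd -unittest -main -run snapshot.d  (file name is arbitrary)
unittest
{
    int[] arr = [1, 2, 3];
    immutable(int)[] snapshot = arr.idup;  // freshly allocated immutable copy
    arr[] = 100;                           // mutating the source...
    assert(snapshot == [1, 2, 3]);         // ...cannot affect the copy
    assert(arr == [100, 100, 100]);
}
```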

If the postblit call can guarantee that the copied result won't share any indirections with others, the problem disappears. In other words, the compiler should provide a way to guarantee that a postblit generates a "unique" copy from the source. For that postblit, I assign the signature this(this) inout;. To satisfy the requirement, the compiler will enforce the following rule:

  • Inside unique postblit, all of non-immutable indirections should be re-initialized by Unique Expressions.

For example:

struct SU {
    int var;
    int[] arr;
    this(this) inout {
        static assert(is(typeof(this.var) == inout int));
        static assert(is(typeof(this.arr) == inout int[]));
        // In each control flow path:
        if (true) {
            arr = arr;  // rhs is not a unique expression, so the compiler will reject this path.
        } else if (true) {
            ;           // also rejected, because nothing is done for the arr field
        } else {
            arr = arr.dup;  // arr.dup is a unique expression, so the compiler accepts this path.
        }
    }
}

The definition of Unique Expressions is:

  1. Basic literal values (integers, complexes, characters)
  2. Complex literal values (struct literals, array literals, AA literals)
    If the literal has subsequent elements, the sub expressions should also be unique.
  3. Expressions that have no indirections
    For example, multiplying integers returns an integer rvalue, and an integer has no indirections, so a multiply expression is unique.
    int a, b, c = a * b; // the multiply will become unique expression
  4. A unique object constructed by a unique constructor
  5. A unique object constructed by a unique postblit
  6. A field variable of a unique object
    unique_obj.var is also unique.
  7. An address of a unique object
    &unique_obj is also unique.
  8. A copy of an array
    iff the element type supports generating unique copy.
    • unique_array[n]
    • unique_array[n .. m]
  9. An element(s) of unique array
    • unique_array[n]
    • unique_array[n .. m]
  10. Concatenation of arrays
    By definition, concat expression will always create a newly allocated array. So iff the element type has no reference, the result will be unique.
  11. Pure function call which returns unique object
  12. New expression with unique arguments
    If a struct type is new-ed with literal syntax, same as "literal values" case.
    If a class type is new-ed, the called constructor should be unique constructor.

(maybe this list is not complete)
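
Item 11 has a counterpart in present-day D: the result of a pure function whose parameters carry no mutable indirections converts implicitly to immutable. The sketch below shows only that existing behaviour, not the proposed unique postblit; the helper name iota3 is hypothetical.

```d
// Hypothetical helper: the parameter has no indirections, so the returned
// array cannot alias anything the caller can reach.
int[] iota3(int start) pure
{
    auto a = new int[](3);
    foreach (i, ref x; a)
        x = start + cast(int) i;
    return a;
}

// One way to try it: dmd -unittest -main -run unique.d  (file name is arbitrary)
unittest
{
    immutable(int)[] fixed = iota3(10);  // implicit conversion to immutable
    int[] scratch = iota3(10);           // same call, mutable result
    scratch[0] = 0;
    assert(fixed == [10, 11, 12]);
    assert(scratch == [0, 11, 12]);
}
```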

Enforcing the rule will make the result of calling this(this) inout unique. And it will guarantee that the copy has no shared indirections. Therefore, implicit casting the copy to any qualifier is valid and won't break type system.

SU sm = SU(1, [1,2,3]);
immutable SU si = sm;   // unique postblit is called
    // Here, si.arr is always re-allocated by the unique postblit.
assert(cast(void*)si.arr.ptr !is cast(void*)sm.arr.ptr);    // OK

Note that, if an object indirection will be finally marked as immutable or const, initializing it by non-unique immutable data is allowed.

immutable(int)[] global;
struct S {
    immutable(int)[] iarr;
    const(int)[] carr;
    this(this) inout {
        iarr = global;   // OK
        carr = global;   // OK
    }
}

Overload resolution (2)

You may overload the unique postblit with the other three postblits. If so, it has lower priority than the others.

struct SX {
    this(this) const { ... }
    this(this) inout { ... }
}
SX sm;
SX sm2 = sm;            // constant postblit is called
const SX sc = sm;       // constant postblit is called
immutable SX si = sm;   // unique postblit is called
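
The two overload-resolution sections can be modelled as a small lookup, sketched here in present-day D. This is not compiler code and every name in it (Q, P, covers, pick) is hypothetical. The coverage table follows the [m1]..[i2] lists above, and the fixed trial order encodes rules (a) and (b) plus the low priority of the unique postblit. Cases the text does not rank, such as [m1] versus [c1], simply fall out of that order and are an assumption of this sketch.

```d
import std.algorithm.searching : canFind;

enum Q { mutable, const_, immutable_ }            // object qualifiers
enum P { mutable, const_, immutable_, unique }    // postblit kinds

// Which source -> destination transitions each postblit kind lists.
bool covers(P p, Q src, Q dst)
{
    final switch (p)
    {
        case P.mutable:    return src == Q.mutable && dst != Q.immutable_;  // [m1], [m2]
        case P.const_:     return dst == Q.const_ || src == dst;            // [c1]..[c5]
        case P.immutable_: return src == Q.immutable_ && dst != Q.mutable;  // [i1], [i2]
        case P.unique:     return true;                                     // any transition
    }
}

// Picks a postblit among the defined ones; returns false if the copy is impossible.
bool pick(const P[] defined, Q src, Q dst, out P chosen)
{
    // const first (rules a and b), unique last; the rest is an assumption.
    static immutable P[] order = [P.const_, P.mutable, P.immutable_, P.unique];
    foreach (p; order)
    {
        if (defined.canFind(p) && covers(p, src, dst))
        {
            chosen = p;
            return true;
        }
    }
    return false;
}

unittest
{
    P c;
    // The SX example: const and inout postblits are defined.
    assert(pick([P.const_, P.unique], Q.mutable, Q.mutable, c) && c == P.const_);
    assert(pick([P.const_, P.unique], Q.mutable, Q.const_, c) && c == P.const_);
    assert(pick([P.const_, P.unique], Q.mutable, Q.immutable_, c) && c == P.unique);
}
```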

Concatenation of field postblits

If a struct has a field which has postblit, compiler will generate postblit implicitly for the enclosing struct.

struct A {
    this(this);
}
struct S1 {
    A a;
    // Compiler will generate this(this); implicitly
}
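
The implicit generation already exists for the plain postblit and can be observed in present-day D. The sketch below assumes nothing about qualified postblits; the names Counted and Wrapper and the static counter are made up for the example.

```d
import std.traits : hasElaborateCopyConstructor;

struct Counted
{
    static int copies;          // hypothetical counter, for observation only
    this(this) { ++copies; }
}

struct Wrapper
{
    Counted c;                  // no postblit is written here
}

// The enclosing struct is reported as having a non-trivial copy.
static assert(hasElaborateCopyConstructor!Wrapper);

// One way to try it: dmd -unittest -main -run fields.d  (file name is arbitrary)
unittest
{
    Counted.copies = 0;
    Wrapper w1;
    auto w2 = w1;               // the field postblit runs through the generated one
    assert(Counted.copies == 1);
}
```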

If struct fields have incompatible postblits, the compiler implicitly marks the enclosing struct as uncopyable.

struct B {
    this(this) immutable;
}
struct S2 {
    A a;
    immutable B b;
    // a.this(this); is callable only for the copy S2 to S2.
    // b.this(this) immutable is callable only for the copy immutable(S2) to immutable(S2)
    // Therefore compiler cannot generate appropriate postblit implicitly for S2.
    // Then S2 will be marked as uncopyable.
}

To make S2 copyable, you need to define postblit by hand.

struct S3 {
    A a;
    immutable B b;
    this(this) { // or const or immutable or inout, as needed
        // When this postblit is invoked, both a and b are in the state immediately after the bitwise copy.
        // So the compiler will enforce re-initializing both fields.
        a = A();  // Re-initialization is required
        b = B();  // Re-initialization is required
    }
}

Fix for TypeInfo

TypeInfo.postblit(in void* p); is invoked by druntime on array copy/concatenation, so it must support qualified postblits. For that, the following change is necessary.

If a struct S exists:

  • typeid(S).postblit(&obj) will call "mutable postblit"
  • typeid(const S).postblit(&obj) will call "constant postblit"
  • typeid(immutable S).postblit(&obj) will call "immutable postblit"
  • typeid(inout S).postblit(&obj) will call "unique postblit"

If S does not support the corresponding postblit, TypeInfo.postblit will throw an Error at run time.

struct S {
    this(this) immutable;
}
S obj;
typeid(S).postblit(&obj);   // will throw Error

Impact to the existing code

Currently, if a struct has no fields with indirections, the user-defined postblit is invoked even on copies between incompatible qualifiers, regardless of the postblit's own qualifier.

import core.stdc.stdio : printf;
struct S
{
    int value;  // has no indirection
    this(this) { printf("postblit\n"); }
}
void main()
{
    S sm;
    immutable S si;
    S sm2 = si;            // invoke S.this(this)
    immutable S si2 = sm;  // invoke S.this(this)
}

But once qualified postblits are introduced, this will no longer work. To fix the issue, you need to change the postblit signature to this(this) inout.

Other changes affect only behavior that was previously undefined, because until now the D language has not clearly defined qualified postblits (including this(this) const;).

Why use 'inout' keyword for 'unique' postblit?

Because the unique postblit concept is strongly related to the inout type qualifier.

Consider the case of copying an inout struct inside an inout function.

struct S {
    int[] arr;
    this(this) ??? { }
}
int[] foo(inout S src)
{
    S dst = src; // copy inout S to S
    return dst.arr;
}

If the struct S has a postblit, what should be done for "copying S from inout to mutable"?

  1. You cannot modify elements of the arr field, because originally they may be immutable.
  2. You must re-initialize the arr field with a unique expression, otherwise the copy may break the type system.

The requirements are exactly the same as those for the unique postblit. Even if you want to copy an inout struct to immutable, there's no difference.

Therefore, "creating a unique copy from an arbitrarily qualified source" is essentially the same as "treating the copy source as inout".
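
The way inout already stands in for any qualifier can be seen in an ordinary present-day D function. This is only a sketch of existing inout semantics, not of the proposed postblit, and the function name tail is hypothetical. One body serves mutable and immutable callers alike, it cannot write through the parameter, and its result still aliases the source, which is why a unique postblit would have to re-initialize such a field.

```d
// Hypothetical helper: 'a' may really be mutable, const or immutable.
inout(int)[] tail(inout(int)[] a)
{
    // a[0] = 0;   // would not compile: the elements might be immutable
    return a[1 .. $];
}

// One way to try it: dmd -unittest -main -run inout.d  (file name is arbitrary)
unittest
{
    int[] m = [1, 2, 3];
    immutable(int)[] im = [4, 5, 6];

    // The qualifier of the argument flows through to the result.
    static assert(is(typeof(tail(m)) == int[]));
    static assert(is(typeof(tail(im)) == immutable(int)[]));

    assert(tail(m) == [2, 3]);
    assert(tail(im) == [5, 6]);
    assert(tail(m).ptr is m.ptr + 1);   // the result shares storage with the source
}
```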

Rationale

The definition of Unique Constructor could be improved by using Unique Expression definition.

Copyright

This document has been placed in the Public Domain.