DIP89

From D Wiki
Revision as of 18:49, 21 February 2016 by Schuetzm (talk | contribs) (fixes)
Title: @mutable members in immutable data structures
DIP: 89
Version: 1
Status: Draft
Created: 2016-02-21
Last Modified: 2016-02-21
Author: Marc Schütz
Links: Forum thread

Abstract

This DIP proposes an officially sanctioned way to mutate members in immutable data structures.

Rationale

D's immutable signifies physical immutability, in contrast to C++'s const: it is a guarantee that memory occupied by immutable variables will never change during their lifetime. Logical immutability, on the other hand, means that the underlying memory can change as long as the object represented by it remains semantically unchanged. Applications of logical immutability include lazy initialization, mutexes, reference counters or other embedded house-keeping data, as well as members changed for debugging purposes.

Because D's immutable is as strict as it is (mutating data after casting immutable away results in undefined behaviour), and const data may actually be immutable, the above-mentioned techniques leave only two options. Either the variables are kept mutable, which, because const and immutable are transitive, implies that many other variables and parameters cannot be marked as const either. Or the mutable parts are stored outside of the structures, which adds considerable complexity as well as runtime and memory overhead, and can even be unsafe in combination with implicit sharing of immutable data (see below).

With the proposed change, logical immutability (i.e. no changes are observable from the outside) can be achieved without provoking undefined behaviour while still having some basic statically enforced safety.
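
The restriction described above can be seen in a minimal sketch of lazy initialization in current D, without the proposed annotation. The Lazy type and its compute helper are hypothetical names chosen for illustration; the point is only that the caching getter cannot be const, so it is unusable through const or immutable objects:

```d
// Lazy initialization in today's D (no @mutable available).
struct Lazy
{
    private int cache_;   // 0 means "not computed yet"

    // Hypothetical stand-in for an expensive computation.
    private int compute() const
    {
        return 42;
    }

    // Has to be a mutable method because it writes to cache_.
    int value()
    {
        if (cache_ == 0)
            cache_ = compute();
        return cache_;
    }
}

// The getter can be called on a mutable object ...
static assert(__traits(compiles, { Lazy l; auto v = l.value(); }));
// ... but the compiler rejects it for const and immutable objects.
static assert(!__traits(compiles, { const Lazy l = Lazy.init; auto v = l.value(); }));
static assert(!__traits(compiles, { immutable Lazy l = Lazy.init; auto v = l.value(); }));
```

Making value() const would instead turn the assignment to cache_ into a compile error, which is the gap the proposed annotation is meant to close.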

Description

A new annotation @mutable for member variables and aggregate types is proposed. It is neither a type constructor, nor a storage class; it can be implemented as a compiler-recognized UDA. A member annotated as @mutable triggers the following behaviours:

  1. It is required to be private
  2. Access to it is @system
  3. No static immutable objects with a @mutable member may be created
  4. Dynamically created immutable objects with @mutable members are allowed if all @mutable members are marked as shared (analogously for implicit conversion of unique objects to immutable)

These rules are enforced statically. Rationale for the rules:

  • The first rule (private) enforces encapsulation. This is the basic property of logical const-ness: an observer must never observe a change to an immutable object.
  • The second rule (@system) prevents accidental accesses that violate the above guarantee. This includes not just actual mutation of @mutable members, but even reads from them, because these can leak changed data to the outside. (If desired, this rule can be relaxed: reads in non-pure methods can be @safe.)
  • The third rule (no static immutables) is necessary because static immutable objects could be placed in physically read-only memory by the linker and therefore cannot be modified. Even though existing memory can be made read-only after initialization (using system calls like mmap(2)), doing this is not supposed to be prevented by the type system, because the mmaped region can just as well contain normal mutable data.
  • The fourth rule (shared) prevents race conditions for implicitly shared immutable objects. Access to shared @mutable members must be atomic or otherwise synchronized.
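
As an illustration of the fourth rule, the following is a minimal sketch of atomic access to a shared member using core.atomic. It runs on an ordinary shared object in current D; the Counter type and its method names are hypothetical, and the @mutable annotation is left out because existing compilers do not recognize it:

```d
import core.atomic : atomicLoad, atomicOp;

// Hypothetical house-keeping counter. Under this proposal, count_ would
// additionally be marked @mutable so that immutable instances could update it.
struct Counter
{
    private shared int count_;

    // Atomically adds one and returns the new value.
    int increment() shared
    {
        return atomicOp!"+="(count_, 1);
    }

    // Atomically reads the current value.
    int current() shared const
    {
        return atomicLoad(count_);
    }
}
```

A plain count_++ on the shared member would be a non-atomic read-modify-write, which is exactly the kind of race the rule is meant to rule out.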

The compiler needs to make sure not to apply optimizations based on the assumption that a @mutable member never changes. Because D supports opaque data structures (struct S;), the @mutable annotation can also be attached to struct declarations: @mutable struct S;.

To enable introspection, two traits are added: isMutable and hasMutableMembers. The latter determines whether a type contains any @mutable members, either directly or embedded through another member.
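
What hasMutableMembers is meant to compute can be approximated in library code today. The sketch below is an assumption-laden emulation, not the proposed compiler trait: it declares an ordinary user-defined attribute named mutable as a stand-in for the compiler-recognized one, only looks at struct fields, and uses hypothetical helper names (isTaggedMutable, anyFieldMutable):

```d
// Stand-in UDA; the real @mutable would be recognized by the compiler.
enum mutable;

// True if the attribute list contains the `mutable` UDA.
private template isTaggedMutable(attrs...)
{
    static if (attrs.length == 0)
        enum isTaggedMutable = false;
    else static if (is(attrs[0] == mutable))
        enum isTaggedMutable = true;
    else
        enum isTaggedMutable = isTaggedMutable!(attrs[1 .. $]);
}

// Walks the fields of T starting at index i.
private template anyFieldMutable(T, size_t i)
{
    static if (i == T.tupleof.length)
        enum anyFieldMutable = false;
    else static if (isTaggedMutable!(__traits(getAttributes, T.tupleof[i])))
        enum anyFieldMutable = true;   // field itself is tagged
    else static if (hasMutableMembers!(typeof(T.tupleof[i])))
        enum anyFieldMutable = true;   // tagged field embedded in a nested struct
    else
        enum anyFieldMutable = anyFieldMutable!(T, i + 1);
}

// Emulation of the proposed trait, restricted to structs.
template hasMutableMembers(T)
{
    static if (is(T == struct))
        enum hasMutableMembers = anyFieldMutable!(T, 0);
    else
        enum hasMutableMembers = false;
}

// Example types (hypothetical).
struct Inner { private @mutable int cache_; }
struct Outer { Inner inner; int x; }
struct Plain { int x; string s; }

static assert(hasMutableMembers!Inner);    // direct member
static assert(hasMutableMembers!Outer);    // embedded through another member
static assert(!hasMutableMembers!Plain);
```

The emulation does not follow pointers or class references, and it cannot see into opaque declarations such as struct S;, which is one reason the proposal attaches the annotation to the declaration itself in that case.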

Usage

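struct S {
    // must be const so that the const getter below can call it
    @safe int expensiveComputation() const;
    private @mutable int bar_;   // cache; 0 means "not computed yet"
    @trusted @property bar() const {
        if(!bar_)
            bar_ = expensiveComputation();   // allowed only because bar_ is @mutable
        return bar_;
    }
}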

About the AA solution

It has been proposed to place the mutable members into an external associative array (AA), with the object as the key. This approach is surprisingly complex: not only does it have a considerable computational and memory cost (including caching effects), it also requires lifetime management of the AA's values.

Additionally, it can have unexpected effects with shared objects (including immutable ones, which are implicitly shareable). Strictly speaking, it does not violate safety by itself. However, because there is no formal relationship between the associative array and the objects, the two can have non-matching shared-ness, with surprising consequences that the compiler is unable to guard against. Take a reference-counted immutable object as an example:

int[const(RCObject)] refcounts;
struct RCObject {
    @disable this();
    static make() {
        immutable(RCObject) result;
        refcounts[result] = 1;
        return result;
    }
    this(this) immutable {
        refcounts[this]++;
    }
    ~this() immutable {
        if(--refcounts[this] == 0)
            releaseResources();
    }
}
void foo() {
    immutable(RCObject) o = RCObject.make();
    send(otherTid, o);
}

Because refcounts in the example above is not marked as shared, every thread gets its own thread-local instance. An immutable object sent to another thread will not have an entry in that thread's AA. The correct solution in this case would be to make the AA shared and to use atomic operations on its values. On the other hand, if it is guaranteed that the objects never cross a thread boundary, the code is sufficient as-is. Unfortunately, the compiler cannot enforce the correct solution here.
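
The thread-local behaviour can be observed directly in current D. The following is a simplified sketch: it uses string keys instead of const(RCObject) and a plain core.thread.Thread instead of message passing, and the helper name entriesSeenByNewThread is hypothetical:

```d
import core.thread : Thread;

// Module-level variable without `shared`: every thread gets its own copy.
int[string] refcounts;

// Starts a new thread and reports how many entries that thread's own
// copy of refcounts contains.
size_t entriesSeenByNewThread()
{
    size_t seen;
    auto t = new Thread({ seen = refcounts.length; });
    t.start();
    t.join();
    return seen;
}
```

If the calling thread first executes refcounts["obj"] = 1, its own table has one entry, while entriesSeenByNewThread() still reports zero: the new thread starts with a fresh, empty table, so a reference count registered in one thread is invisible to the other.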

Now, using the changes proposed in this DIP, the code can be made safe by providing a shareable and a thread-local implementation of RCObject. Should the user choose the wrong one, the compiler will reject it because of rule 4.

Copyright

This document has been placed in the Public Domain.