DIP66

From D Wiki
Revision as of 14:27, 2 November 2014 by IgorStepanov (talk | contribs) (Description)
Title: (Multiple) alias this
DIP: 66
Version: 1
Status: Draft
Created: 2014-10-09
Last Modified: 2014-10-19
Author: Igor Stepanov
Links:

Abstract

An AliasThis declaration names a member to subtype. Multiple AliasThis declarations are allowed. Order of AliasThis declarations does not matter.

Description

In the code below...

    struct Foo
    {
        //...
        alias symbol this;
    }

... the construction alias symbol this; means that wherever typeof(Foo.symbol) is needed, obj (an object of type Foo) can be substituted with obj.symbol. This rule applies to implicit and explicit conversions, .member access expressions, operator overloading, foreach expressions (foreach(args; obj)), and so on. symbol can be any symbol for which obj.symbol is a valid expression. If more than one alias this can be used to resolve the same lookup, the compiler should raise an error.

    struct A
    {
        int i;
        alias i this;
    }

    struct B
    {
        int i;
        alias i this;
    }

    struct C
    {
        A a;
        B b;

        alias a this;
        alias b this;
    }
    
    void test()
    {
        C c;
        int i = c; //Error: c.a.i vs c.b.i
    }
    
    static assert(is(C : int)); //Ok, because C is subtype of int anyway.
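
For the single alias this case, which current compilers already accept, the substitution rule described at the start of this section can be sketched as follows. The names Inner, Outer and readX are hypothetical and chosen only for illustration; the example assumes nothing beyond one alias this per type.

```d
// Hypothetical names throughout; only one alias this per type is used here.
struct Inner
{
    int x;

    int doubled() const { return x * 2; }

    // Lets `inner + n` compile so that operator forwarding can be observed.
    int opBinary(string op : "+")(int rhs) const { return x + rhs; }
}

struct Outer
{
    Inner inner;
    alias inner this; // Outer is treated as a subtype of Inner
}

int readX(Inner i) { return i.x; }

void main()
{
    auto o = Outer(Inner(21));

    assert(o.x == 21);         // .member access: o.inner.x
    assert(o.doubled() == 42); // method call: o.inner.doubled()
    assert(o + 1 == 22);       // operator overloading: o.inner + 1
    assert(readX(o) == 21);    // implicit conversion at a call site

    Inner copy = o;            // implicit conversion in an initializer
    assert(copy.x == 21);
}
```

In each case the compiler first fails to type the expression against Outer itself and only then retries it with o.inner in place of o.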

Alias this and l-values

As mentioned above, the alias this symbol may be a field (which is an l-value) or a method (which may return an r-value). A subtyped struct may be passed to a function by ref and used as an l-value if its alias this symbol is an l-value. When the called function is overloaded and can take either an r-value or an l-value argument, the compiler prefers the l-value overload if the alias this symbol is an l-value.

    struct A
    {
        int a;
        alias a this;
    }

    struct B
    {
        int foo() { return 1; }
        alias foo this;
    }

    int testX(ref int x)
    {
        return 1;
    }

    int testX(int x)
    {
        return 2;
    }

    void test()
    {
        A a;
        B b;
        assert(testX(a) == 1); //a.a is l-value
        assert(testX(b) == 2); //b.foo is r-value
    }
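
A practical consequence of the l-value rule is that a ref parameter writes through to the aliased field. The sketch below uses the hypothetical names Counter, Computed and bump, and assumes default compiler settings (no preview switch that lets r-values bind to ref parameters).

```d
struct Counter
{
    int count;
    alias count this; // field: an l-value
}

struct Computed
{
    int value() { return 1; }
    alias value this; // method returning by value: an r-value
}

void bump(ref int x) { ++x; }

void main()
{
    Counter c;
    bump(c);              // rewritten to bump(c.count)
    assert(c.count == 1); // the field itself was modified

    Computed v;
    int copy = v;         // reading the r-value is fine
    assert(copy == 1);

    // Assumption: default settings, where an r-value cannot bind to ref.
    static assert(!__traits(compiles, bump(v)));
}
```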

However, suppose a type can be converted to the target type in several ways, where one way yields an l-value and another yields an r-value. If an object of that type is passed to a function that takes an l-value, the compiler will not prefer the l-value path and will raise an error:

    struct A
    {
        int a;
        alias a this;
    }

    struct B
    {
        int foo() { return 1; }
        alias foo this;
    }

    struct C
    {
        A a;
        B b;
        alias a this;
        alias b this;
    }

    int testX(ref int x)
    {
        return 1;
    }

    void test()
    {
        C c;
        testX(c); //Error: multiple ways to convert C to int: C.a.a and C.b.foo
    }

This is done because alias this provides subtyping, and A and B have the same subtype: int. The l-value modifier is not part of the type, so the statement "A is a subtype of l-value int" does not make sense, whereas "A is a subtype of int" is a correct assertion.
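
The point that l-valueness is not part of the subtype relation can be checked with is expressions. This is a small sketch reusing A and B from the example above; the helper twice is hypothetical.

```d
struct A
{
    int a;
    alias a this; // l-value path to int
}

struct B
{
    int foo() { return 1; }
    alias foo this; // r-value path to int
}

// Both types are subtypes of int; the is expression does not
// distinguish the l-value path from the r-value path.
static assert(is(A : int));
static assert(is(B : int));

int twice(int x) { return 2 * x; }

void main()
{
    auto a = A(5);
    B b;
    assert(twice(a) == 10); // uses a.a
    assert(twice(b) == 2);  // uses b.foo()
}
```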

Method overloading

There are two important cases of overloading: a derived type's foo(X) overloads a base type's foo(Y), and basetype1.foo(X) overloads basetype2.foo(Y).

In the first case, the semantic rule says: "Derived type methods hide base type methods."

    struct A
    {
        int foo(int) { return 1; }
        int foo(string) { return 1; }
    }

    struct B
    {
        int foo(double) { return 3; }
        A a;
        alias a this;
    }

    void test()
    {
        B b;
        b.foo(2.0);      //Ok, call B.foo(double);
        b.foo(2);        //Ok, call B.foo(double); A.foo(int) is hidden
        b.foo("string"); //Error, unable to convert string to double. A.foo(string) is hidden
    }

In the second case, the semantic rule says: "When a parameter set can be applied to only one base type's overloaded method, the compiler will choose it. However, if the parameter set can be applied to overloaded methods of several base types (even if one match is better than the others), the compiler should raise an error."

    struct A
    {
        char foo(int)
        {
            return 'I';
        }
    }

    struct B
    {
        char foo(string)
        {
            return 'S';
        }

        double foo(double)
        {
            return 'D';
        }
    }

    struct C
    {
        A a;
        B b;
        alias a this;
        alias b this;
    }

    void test()
    {
        C c;
        assert(c.foo("string") == 'S'); //Ok. Only c.b.foo(string) is matching.
        assert(c.foo(1.2) == 'D');      //Ok. Only c.b.foo(double) is matching.
        c.foo(1);                       //Error: two base methods may be used: c.b.foo(double) and c.a.foo(int).
                                        //It does not matter that c.a.foo(int) matches better.
    }

Semantics

Multiple alias this can cause conflicts. This section explains how the compiler should resolve them. At the AliasThis declaration semantic stage, the compiler can perform the initial checks and reject the obviously incorrect AliasThis declarations.

    struct Test1
    {
        int a;
        int b;
        alias a this;
        alias b this; // Error: alias b this conflicts with alias a this;
    }

    class Test2a
    {
    }

    class Test2b : Test2a
    {
    }

    class Test2 : Test2b
    {
        Test2a a;
        alias a this; //Error: alias a this tries to hide inherited type Test2a; 
    }

The other checks will be done when alias this is needed for typing expressions. When the compiler types an expression such as fun(a), it can resolve it as fun(a.aliasThisSymbol). (Hereinafter fun(a) stands for any case in which alias this can be used: type conversion, .member expression, operator expression, etc.) However, the compiler will try fun(a.aliasThisSymbol) only if the expression cannot be typed otherwise.

More precisely, this is the order in which obj.xyz is looked up:

  1. If xyz is a symbol (member, method, enum etc) defined inside typeof(obj) then lookup is done.
  2. Otherwise, if xyz is a symbol introduced in the base class (where applicable), then lookup is done.
  3. Otherwise, if opDispatch!"xyz" exists, then lookup is done.
  4. Otherwise, alias this is attempted transitively, and if xyz is found, then lookup is done.
  5. Otherwise, a UFCS rewrite is effected.
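
The lookup order above can be observed with single alias this in current compilers. The sketch below is only an illustration: all names (Inner, Outer, Dispatching, extra, helper) are hypothetical, and step 2 is omitted because structs have no base classes.

```d
struct Inner
{
    string name()  { return "Inner.name"; }
    string extra() { return "Inner.extra"; }
}

struct Outer
{
    Inner inner;
    alias inner this;

    // Step 1: a member of typeof(obj) wins over alias this.
    string name() { return "Outer.name"; }
}

// Free functions that are candidates for a UFCS rewrite (step 5).
string extra(Outer o)  { return "UFCS extra"; }
string helper(Outer o) { return "UFCS helper"; }

struct Dispatching
{
    Inner inner;
    alias inner this;

    // Step 3: a catch-all opDispatch is consulted before alias this.
    string opDispatch(string member)() { return "opDispatch:" ~ member; }
}

void main()
{
    Outer o;
    assert(o.name() == "Outer.name");    // step 1 hides Inner.name
    assert(o.extra() == "Inner.extra");  // step 4 is tried before step 5
    assert(o.helper() == "UFCS helper"); // only step 5 finds a match

    Dispatching d;
    assert(d.extra() == "opDispatch:extra"); // step 3 is tried before step 4
}
```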

When the compiler is trying to resolve alias this, it iterates over all alias this declarations and tries to apply each one. For each successful application, the compiler adds the resulting expression to a result set. If an application fails, the compiler tries to recursively resolve the alias this expression. The following pseudo-code illustrates this:

    resolveAliasThis(obj, ex):
        Set resultSet;
        foreach currentAliasThis in obj.aliasThisSymbols do
            if try(`ex(obj.currentAliasThis)`) == Success then
                resultSet.add(`ex(obj.currentAliasThis)`)
            else
                resultSet.add(resolveAliasThis(`obj.currentAliasThis`, ex))
        if obj is class then
            foreach currentBaseClass in obj.baseClasses do
                resultSet.add(resolveAliasThis(`cast(currentBaseClass)obj`, ex))
        return resultSet

Finally, if resultSet contains exactly one candidate, the compiler will accept it. If resultSet is empty, the compiler tries other ways to resolve ex(obj): UFCS, etc. If resultSet contains more than one candidate, the compiler raises an error.

Recursive alias this may occur:

    class A
    {
        C c;
        alias c this;
    }

    class B
    {
        A a;
        alias a this;
    }

    class C
    {
        B b;
        alias b this;
    }


To handle this situation, the resolveAliasThis function keeps a set of types (visitedTypes) that have already been visited higher in the call stack. If visitedTypes contains typeof(obj), the compiler will not check obj's subtypes.
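
The resolution procedure, including the visitedTypes guard, can be modelled at run time on a toy table of types. This is a sketch under stated assumptions, not compiler code: AliasThisDecl and TypeTable are hypothetical, a successful application is approximated by "the aliased member has exactly the wanted type", and the base class step is left out.

```d
import std.algorithm.searching : canFind;

// Hypothetical model of one `alias member this;` declaration.
struct AliasThisDecl
{
    string member; // name of the aliased member
    string type;   // name of that member's type
}

// Maps a type name to the alias this declarations it contains.
alias TypeTable = AliasThisDecl[][string];

// Collects every access path from `path` (of type `type`) to a value of
// type `wanted`. Success is modelled as an exact type match (assumption).
string[] resolveAliasThis(TypeTable table, string type, string path,
                          string wanted, string[] visitedTypes = null)
{
    string[] resultSet;
    if (visitedTypes.canFind(type))
        return resultSet; // already visited higher in the call stack
    visitedTypes ~= type;

    auto decls = type in table;
    if (decls is null)
        return resultSet; // type without alias this, e.g. a builtin

    foreach (decl; *decls)
    {
        string candidate = path ~ "." ~ decl.member;
        if (decl.type == wanted)
            resultSet ~= candidate;
        else
            resultSet ~= resolveAliasThis(table, decl.type, candidate,
                                          wanted, visitedTypes);
    }
    return resultSet;
}

void main()
{
    // Mirrors structs A, B and C from the Description section.
    TypeTable diamond = [
        "A": [AliasThisDecl("i", "int")],
        "B": [AliasThisDecl("i", "int")],
        "C": [AliasThisDecl("a", "A"), AliasThisDecl("b", "B")]
    ];
    // Two candidates: the compiler would report an ambiguity.
    assert(resolveAliasThis(diamond, "C", "c", "int") == ["c.a.i", "c.b.i"]);
    // One candidate: accepted.
    assert(resolveAliasThis(diamond, "A", "a", "int") == ["a.i"]);

    // Mirrors the recursive classes A -> C -> B -> A.
    TypeTable cycle = [
        "A": [AliasThisDecl("c", "C")],
        "B": [AliasThisDecl("a", "A")],
        "C": [AliasThisDecl("b", "B")]
    ];
    // The guard stops the recursion; an empty set means "try UFCS etc.".
    assert(resolveAliasThis(cycle, "A", "a", "int").length == 0);
}
```

Because visitedTypes is passed down by slice and only appended to inside each call, sibling branches do not see each other's entries; a type is skipped only when it already appears on the current path.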

When the compiler resolves a binary expression in which both operands have alias this declarations, it proceeds as follows. In the first stage, the compiler tries to resolve alias this for only one operand:

    binex(a, b) -> binex(a.aliasthis, b)
    binex(a, b) -> binex(a, b.aliasthis)

If there is exactly one candidate, the compiler chooses it; if there are several candidates, the compiler raises an error. If there are no candidates, the compiler tries to resolve both operands:

    binex(a, b) -> binex(a.aliasthis, b.aliasthis)

If there is exactly one candidate, the compiler chooses it. If there are several candidates, the compiler raises an error.
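
With one alias this per operand, both situations can already be observed. The sketch below uses the hypothetical types Celsius and Offset; it shows the end result of the rewriting, not the order in which a particular compiler tries the candidates.

```d
struct Celsius
{
    double degrees;
    alias degrees this;
}

struct Offset
{
    double delta;
    alias delta this;
}

// Both operands need alias this: neither struct defines opBinary.
double add(Celsius t, Offset o) { return t + o; }

void main()
{
    auto t = Celsius(20.0);
    auto o = Offset(1.5);

    // Only the left operand is rewritten: t.degrees + 2.0
    double oneTerm = t + 2.0;
    assert(oneTerm == 22.0);

    // Both operands are rewritten: t.degrees + o.delta
    assert(add(t, o) == 21.5);
}
```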