Template Instantiation Strategy

From D Wiki

The strategy serves two distinct purposes:

  1. Minimize the amount of generated instance code (makes obj/lib files smaller)
  2. Minimize the amount of semantic analysis for instance code (speeds up compilation with heavily templated libraries)

--

For purpose 1, Walter Bright introduced a new concept in Pull Request #2550.

I have worked on improving the compiler implementation so that the concept becomes a reliable algorithm.


1a. Suppose a root module `x.d` and a non-root module `y.d` exist:

 module x; // x.d
 import y, z;
 alias a = A!();
 module y; // y.d
 import z;
 alias a = A!();
 module z;
 template A() { alias b = B!(); ... }
 template B() { alias c = C!(); ... }
 template C() { ... }
 template D() { alias b = B!(); ... }

The template instance chain `A!() -> B!() -> C!()` appears in both the root module and the non-root module. With the command line `dmd -c x.d`, the three instances are not emitted into `x.obj`, because the object file built from `y.d` is expected to contain them.

This is handled by the TemplateInstance::minst field. In module x, the TemplateInstance('A!()') object gets Module('x') from the scope and saves it in its minst field. Similarly, the TemplateInstance('A!()') in module y sets its minst to Module('y').

The two identical TemplateInstance objects are chained through the TemplateInstance::tnext field. After the whole semantic analysis has completed, TemplateInstance::needsCodegen() iterates over the linked list; if any of the identical instances has been instantiated in a non-root module, the compiler skips its code generation.
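
The list walk described above can be sketched as follows. This is a minimal model, not the compiler's actual code: the `Module` and `TemplateInstance` structs are hypothetical stand-ins that keep only the fields discussed here, `isRoot` is an assumed flag, and every copy is assumed to have a non-null `minst` (speculative instantiation is left out).

```cpp
// Hypothetical, simplified stand-ins for the compiler's Module and
// TemplateInstance classes; only the fields discussed above are modelled.
struct Module {
    const char* name;
    bool isRoot;  // assumed flag: true when the module was given on the command line
};

struct TemplateInstance {
    const Module* minst;            // module whose scope requested this copy
    const TemplateInstance* tnext;  // next identical instance, or nullptr
};

// Walks the tnext list. Assumes every copy has a non-null minst
// (speculative instantiation is not modelled in this sketch).
bool needsCodegen(const TemplateInstance* first) {
    for (const TemplateInstance* ti = first; ti != nullptr; ti = ti->tnext) {
        if (!ti->minst->isRoot)
            return false;  // a non-root module also instantiated it
    }
    return true;  // only root modules instantiated it
}
```

With the modules from 1a, the copy of A!() created in x is chained to the copy created in y, so the walk finds a non-root `minst` and reports that no code is needed in `x.obj`.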


1b. If a template is instantiated in a speculative context, its codegen is also unneeded.

 module x;
 static if (is(typeof(A!()))) {...}
 // A!(), and indirectly instantiated B!() and C!() need not be stored in x.obj

To represent this, a TemplateInstance created in a speculative context sets its minst field to NULL. TemplateInstance::needsCodegen() then treats it the same way as a non-root instance.
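
As a rough illustration of that rule for a single instance with no identical siblings, the check might look like the sketch below. The struct layout and the `isRoot` flag are assumptions made for the example, not the compiler's real definitions.

```cpp
// Hypothetical, simplified types; nullptr in minst marks a speculative instance.
struct Module {
    bool isRoot;  // assumed flag: module was given on the command line
};

struct TemplateInstance {
    const Module* minst;  // nullptr when instantiated in a speculative context
};

// Decision for one instance that has no identical copies elsewhere.
bool needsCodegen(const TemplateInstance& ti) {
    if (ti.minst == nullptr)
        return false;          // speculative: handled like a non-root instance
    return ti.minst->isRoot;   // otherwise only root instantiations are emitted
}
```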


1c. Why is TemplateInstance::tnext necessary? Because a speculative instantiation chain might later need to be changed into a root instantiation chain.

 module x;
 static if (is(typeof(A!())))
   // A!(), B!() and C!() are marked as speculative instances.
 {
   alias a = D!();
   // All instances in the chain D!() -> B!() -> C!() should be marked as root instantiations.
   // But semantic analysis of B!() has already finished, so D!() cannot know that C!() needs to be re-marked.
 }

In the code above, the instantiation chain forms this graph:

 A!()[minst: NULL] ------+--> B!()[minst: NULL] --> C!()[minst: NULL]
                         |
 D!()[minst: module x] --+

We already have the `TemplateInstance::tinst` field, which records which template instance has instantiated a given instance. In TemplateInstance('C!()'), `tinst` points to TemplateInstance('B!()'). But how can B!() point to both A!() and D!() with a single tinst field?

To store the information "B!() is instantiated in both speculative and root-module contexts", the two identical TemplateInstance('B!()') objects have different `tinst`s and are chained through the `tnext` field.

Finally, TemplateInstance('C!()')::needsCodegen() follows these links and determines that C!() is instantiated from a root module.
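
One way to picture how the `tinst` and `tnext` links combine is the recursive sketch below. It is a simplified model under stated assumptions: the structs are hypothetical, the second B!() copy is assumed to carry `minst` = module x, and the real compiler's bookkeeping (such as caching the result) is omitted.

```cpp
// Hypothetical, simplified types for illustration only.
struct Module {
    bool isRoot;  // assumed flag: module was given on the command line
};

struct TemplateInstance {
    const Module* minst;            // nullptr marks a speculative copy
    const TemplateInstance* tinst;  // instance that instantiated this copy
    const TemplateInstance* tnext;  // next identical copy, or nullptr
};

// A speculative copy needs codegen if the instance that created it does,
// or if one of its identical siblings does.
bool needsCodegen(const TemplateInstance* ti) {
    if (ti == nullptr)
        return false;
    if (ti->minst != nullptr)
        return ti->minst->isRoot;  // non-speculative: decided by its module
    return needsCodegen(ti->tinst) || needsCodegen(ti->tnext);
}
```

For the graph above, C!() asks its `tinst` (the speculative B!() copy), which finds nothing through A!() but reaches the root instantiation D!() through its `tnext` sibling, so C!() ends up needing codegen.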


For purpose 2, we use existing behavior: the compiler does not invoke the semantic3 pass for members of non-root modules.


2a.

  • If a template instance is inserted into the members of a root module, its semantic3 pass gets called.
  • If a template instance is inserted into the members of a non-root module, its semantic3 pass does not get called.

 module x;
 import y;
 module y;
 alias a = A!();
 // A!(), B!(), and C!() will be inserted into the members of module y,
 // so their semantic3 pass won't get called.

Note: if the aliased name 'a' is not actually used from any root module, the A!() instantiation itself might be skippable. This is sometimes called 'lazy instantiation'.
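
The rule in 2a can be sketched as a pass that only visits the members of root modules. The `Module` struct, its `members` list of instance names, and the `semantic3Pass` function are hypothetical names chosen for this example; the real pass works on declaration objects rather than strings.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified module: members holds the names of the
// template instances that were inserted into this module.
struct Module {
    std::string name;
    bool isRoot;  // assumed flag: module was given on the command line
    std::vector<std::string> members;
};

// Returns the instances whose bodies would be analyzed: only those
// inserted into a root module are visited.
std::vector<std::string> semantic3Pass(const std::vector<Module>& modules) {
    std::vector<std::string> analyzed;
    for (const Module& m : modules) {
        if (!m.isRoot)
            continue;  // members of non-root modules are skipped entirely
        for (const std::string& inst : m.members)
            analyzed.push_back(inst);
    }
    return analyzed;
}
```

With x as the only root module and A!(), B!() and C!() inserted into y, the pass analyzes nothing, which is where the compile-time saving comes from.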

2b. But if a root module and a non-root module import each other, this can lead to a link failure.

 module x;
 import y;
 import z;
 alias c = C!();
 module y;
 import x;
 import z;
 alias c = C!();
 dmd -c x.d -ofx.obj
 dmd -c y.d -ofy.obj
 dmd x.obj y.obj -ofxy.exe
 // -> Undefined symbol C!()

This is what occurred in issue 2644. When you compile x.d, the imported module y is analyzed before x, so the instantiation C!() in y is inserted into the members of the non-root module y. Its code is therefore not stored in x.obj.

When you compile y.d, the imported module x is analyzed before y, so the instantiation C!() in x is inserted into the members of the non-root module x. Its code is therefore not stored in y.obj.

Neither x.obj nor y.obj contains C!(), so the link fails.
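
The failure can be reproduced in a small model of the two separate compilations. Everything here is a hypothetical simplification: `compileNaive` stands for a compiler run in which the first module to instantiate C!() owns it, and an object file is reduced to a set of symbol names.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using Symbols = std::set<std::string>;

// Models one compiler run. 'order' is the order in which modules are
// analyzed; the first module to instantiate C!() becomes its owner, and
// the object file only receives instances owned by the root module.
Symbols compileNaive(const std::string& root,
                     const std::vector<std::string>& order) {
    std::map<std::string, std::string> owner;  // instance -> owning module
    for (const std::string& mod : order)
        owner.emplace("C!()", mod);  // emplace keeps the first owner
    Symbols obj;
    for (const auto& entry : owner)
        if (entry.second == root)
            obj.insert(entry.first);
    return obj;
}

// Models the link step: a symbol is defined if any object file holds it.
bool linkDefines(const std::vector<Symbols>& objs, const std::string& sym) {
    for (const Symbols& obj : objs)
        if (obj.count(sym) != 0)
            return true;
    return false;
}
```

Compiling x with analysis order y, x and compiling y with analysis order x, y both yield empty symbol sets in this model, so `linkDefines` reports C!() as undefined.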

Even if a previous instantiation already exists in a non-root module, the current root instantiation needs to get a chance at codegen. To allow that, the cached TemplateInstance is inserted into the members of the root module again.

 https://github.com/D-Programming-Language/dmd/pull/4784/files#diff-0477a1d81a6a920c99362954179c59c8R5974

The additional codegen is checked in TemplateInstance::needsCodegen(), and happens only for instances that are actually instantiated by mutually importing modules.
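
A possible shape of that fix, reduced to its essentials, is sketched below. It is an illustration under assumptions rather than the code from the pull request: `Instance`, `owners`, `instantiate` and the `importsRoot` flag are hypothetical names, and `importsRoot` stands for "this non-root module imports a root module".

```cpp
#include <vector>

// Hypothetical, simplified module description.
struct Module {
    const char* name;
    bool isRoot;       // assumed flag: module was given on the command line
    bool importsRoot;  // assumed flag: this module imports a root module
};

// One cached template instance and the modules whose member lists hold it.
struct Instance {
    std::vector<const Module*> owners;
};

// The first instantiation always records its module. A later request is
// only recorded again when it comes from a root module.
void instantiate(Instance& inst, const Module& from) {
    if (inst.owners.empty() || from.isRoot)
        inst.owners.push_back(&from);
}

// Codegen is skipped when an independent non-root module owns the instance;
// a non-root owner that imports a root module does not count as a provider.
bool needsCodegen(const Instance& inst) {
    bool hasRootOwner = false;
    for (const Module* m : inst.owners) {
        if (m->isRoot)
            hasRootOwner = true;
        else if (!m->importsRoot)
            return false;  // that module's own object file will hold the code
    }
    return hasRootOwner;
}
```

In the mutual-import case from 2b, y is analyzed first and x re-inserts the cached C!(), so compiling x.d emits it. If y did not import x, the instance would still be left to y's object file.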