GDC/Hacking
The GDC Hackers Guide
This page is meant as a resource for all of us who want to help Walter develop the D language by developing a modified DMD frontend that can make use of GCC's middle and back ends. To do this, we must learn to understand and edit the GDC/GCC sources.
The paperwork is complete and efforts are currently underway to merge GDC into the official GCC codebase, which represents a great step forward for D. However, more than ever, GDC is in need of contributors to keep it up to date with both the D frontend and the trunk development of the GCC backend.
The primary development repository for GDC can be found at https://github.com/D-Programming-GDC/GDC. Development of GDC is generally discussed in the D.gnu newsgroup at http://forum.dlang.org/group/D.gnu, and on the Freenode IRC network in #d.gdc (irc://chat.freenode.net/%23d.gdc).
Quicklinks
Possibly out of date:
- Patching GDB to understand D code/mangling
- DStress, test cases for D compilers
GDC call for contributors
GDC is currently developed by a very small group (as the commit history on GitHub shows). While the addition of a D frontend to GCC represents a great step forward for D, it also represents an informal promise by the D community to keep the D frontend up to date with the latest GCC development.
Thanks to GitHub and Git, contributing to GDC is as easy as forking the repository and submitting a pull request (although this workflow will likely change when the merge is officially complete), and filing a bug report is a simple web form.
If you use GDC, we encourage you to try to contribute, whether by submitting pull requests or bug reports. In the past, GDC has nearly died due to poor communication and lack of development. Avoiding those issues is easier than ever before, but GDC will always need a community that's willing to give back.
GCC Structure
Here we gather texts that can help in understanding GCC/GDC. GCC is very complex, and without good documentation many will surely give up very soon (if anyone knows of any good books, add them too).
I will give a short overview of the structure of GCC (for the newbies). GCC is a compiler for many languages and many targets, so it is divided into pieces.
- frontend - Turn the source code into an internal representation (GENERIC).
- middleend - Convert the GENERIC to GIMPLE and perform optimizations.
- backend - Turn GIMPLE into target-specific ASM instructions.
What we know as "GDC" is only an implementation of the frontend part of GCC. The middleend uses callbacks to interface with the frontend. GDC is located within its own subfolder in the "core" GCC source tree - (srcdir)/gcc/d/. It is within this subfolder that we must perform all changes to the language. GCC has other frontends such as c (C), cp (C++), java (Java), objc (Objective-C), Fortran, and Ada. One can look at these for advice, but one probably shouldn't... (one exception: the "c++" package is currently also required to build GDC, since the bundled recls library uses it)
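The callback arrangement mentioned above can be pictured with a small sketch. Everything in it is hypothetical and heavily simplified: `FrontendHooks`, `run_compiler`, `make_toy_d_frontend`, and the individual hook names are invented for illustration and are not GCC's actual interface (GCC's real mechanism is its language hooks table, which is far larger). The sketch only shows the shape of the idea: a language-independent driver that calls into whichever frontend supplied the callbacks.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the table of callbacks a frontend provides.
struct FrontendHooks {
    std::string name;                                           // language name
    std::function<bool()> init;                                 // one-time setup
    std::function<std::string(const std::string&)> parse_file;  // source -> IR
};

// Hypothetical language-independent driver. It knows nothing about D;
// it only calls whatever callbacks the frontend registered.
std::vector<std::string> run_compiler(const FrontendHooks& hooks,
                                      const std::vector<std::string>& sources)
{
    std::vector<std::string> ir;
    if (!hooks.init())
        return ir;  // frontend failed to initialise, nothing is compiled
    for (const auto& src : sources)
        ir.push_back(hooks.parse_file(src));
    return ir;
}

// A toy "D frontend": its parse step just wraps the file name in a label
// standing in for the GENERIC it would really produce.
FrontendHooks make_toy_d_frontend()
{
    return FrontendHooks{
        "D",
        [] { return true; },
        [](const std::string& src) { return "GENERIC(" + src + ")"; }
    };
}
```

Calling `run_compiler(make_toy_d_frontend(), {"a.d", "b.d"})` yields one placeholder IR string per source file. The point of the design is that adding a language means supplying a new set of callbacks, not editing the driver.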
Note that GDC is currently not an official language for GCC, but a "third party" addition. As such, it is similar to GPC (GNU Pascal Compiler), see http://www.gnu-pascal.de/
Work is underway to merge GDC into the official GCC codebase, at which point this will no longer be the case.
The frontend contains the lexer and parser - these together turn the source file into GENERIC. The GDC frontend relies heavily on the DMD sources to perform this work, and you will find the entire DMD sources in a subfolder.
Sadly, GCC is in a very poor state as far as code readability is concerned. Complex macros and source code generators litter the middle and backends. The source is well commented, but that really doesn't help... Well, I'll let you find that out by yourselves :)
The documentation (that I have read) is very hard to understand, so if anyone has any good resources or tips, write them here. Happy hacking!
GDC Structure
DMD Front End
File | Function |
---|---|
aav.c | Associative array |
access.c | Access check (private, public, package ...) |
aliasthis.c | Implements the alias this D symbol. |
argtypes.c | Convert types for argument passing (e.g. char is passed as ubyte). |
array.c | Dynamic array |
arrayop.c | Array operations (DigitalMars:d/2.0/arrays.html#array-operations), e.g. a[] = b[] + c[]. |
async.c | Asynchronous input |
attrib.c | Attributes (DigitalMars:d/2.0/attribute.html), i.e. storage class (const, @safe ...), linkage (extern(C) ...), protection (private ...), alignment (align(1) ...), anonymous aggregate, pragma, static if and mixin. |
builtin.c | Identify and evaluate built-in functions (e.g. std.math.sin) |
cast.c | Implicit cast, implicit conversion, and explicit cast (cast(T)), combining type in binary expression, integer promotion, and value range propagation. |
class.c | Class declaration |
clone.c | Define the implicit opEquals, opAssign, postblit and destructor for a struct if needed, and also define the copy constructor for a struct. |
cond.c | Evaluate compile-time conditionals, i.e. debug, version, and static if. |
constfold.c | Constant folding |
cppmangle.c | Mangle D types according to Intel's Itanium C++ ABI. |
dchar.c | Convert UTF-32 character to UTF-8 sequence |
declaration.c | Miscellaneous declarations, including typedef, alias, variable declarations including the implicit this declaration, type tuples, ClassInfo, ModuleInfo and various TypeInfos. |
delegatize.c | Convert an expression expr to a delegate { return expr; } (e.g. in lazy parameter). |
doc.c | Ddoc (DigitalMars:d/ddoc.html) documentation generator (NG:digitalmars.D.announce/1558) |
dsymbol.c | D symbols (i.e. variables, functions, modules, ... anything that has a name). |
dump.c | Defines the Expression::dump method to print the content of the expression to console. Mainly for debugging. |
entity.c | Defines the named entities to support the "\&Entity;" escape sequence. |
enum.c | Enum declaration |
expression.c | Defines the bulk of the classes which represent the AST at the expression level. |
func.c | Function declaration, also includes function/delegate literals, function alias, (static/shared) constructor/destructor/postblit, invariant, unittest and allocator/deallocator (DigitalMars:d/2.0/class.html#allocators). |
gnuc.c | Implements functions missing from GCC, specifically stricmp and memicmp. |
hdrgen.c | Generate headers (*.di files) |
identifier.c | Identifier (just the name). |
idgen.c | Make id.h and id.c for defining built-in Identifier instances. Compile and run this before compiling the rest of the source. (NG:digitalmars.D/17157) |
impcnvgen.c | Make impcnvtab.c for the implicit conversion table. Compile and run this before compiling the rest of the source. |
imphint.c | Import hint, e.g. prompting to import std.stdio when using writeln. |
import.c | Import. |
init.c | Initializers (DigitalMars:d/2.0/declaration.html#Initializer), e.g. the 3 in int x = 3. |
inline.c | Compute the cost and perform inlining. |
interpret.c | All the code that performs compile-time function evaluation (CTFE) |
json.c | Generate JSON output |
lexer.c | Lexically analyzes the source (such as separating keywords from identifiers) |
lstring.c | Length-prefixed UTF-32 string. |
macro.c | Expand DDoc macros |
mangle.c | Mangle D types and declarations |
mars.c | Analyzes the command-line arguments (also displays command-line help) |
module.c | Read modules. |
mtype.c | All D types. |
opover.c | Apply operator overloading |
optimize.c | Optimize the AST |
parse.c | Parse tokens into AST |
rmem.c | Implementation of the storage allocator, using the standard C allocation package. |
root.c | Basic functions (deal mostly with strings, files, and bits) |
scope.c | Scope |
speller.c | Spellchecker |
statement.c | Handles while, do, for, foreach, if, pragma, staticassert, switch, case, default, break, return, continue, synchronized, try/catch/finally, throw, volatile, goto, and label |
staticassert.c | static assert. |
stringtable.c | String table |
struct.c | Aggregate (struct and union) declaration. |
template.c | Everything related to template. |
todt.c | Generate data structures to initialize static variables added to the object file. |
toobj.c | Generate the object file for Dsymbol and declarations except functions. |
traits.c | __traits. |
typinf.c | Get TypeInfo from a type. |
unialpha.c | Check if a character is a Unicode alphabetic character. |
unittests.c | Run functions related to unit test. |
utf.c | UTF-8. |
version.c | Handles version |
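As one small, self-contained illustration of the kind of work listed in this table, the UTF-32 to UTF-8 conversion that dchar.c is described as providing can be sketched as follows. This is not the code from dchar.c: the function name `encode_utf8` and its behaviour on invalid input (returning an empty string) are assumptions made for the example, and the body simply follows the standard UTF-8 encoding rules.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encode one UTF-32 code point as a UTF-8 byte sequence.
// Returns an empty string for values that are not valid Unicode scalar values
// (surrogates and anything above U+10FFFF).
std::string encode_utf8(std::uint32_t c)
{
    std::string out;
    if (c < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(c);
    } else if (c < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        if (c >= 0xD800 && c <= 0xDFFF)
            return out;  // surrogate halves are not encodable
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c <= 0x10FFFF) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
    return out;
}
```

For example, `encode_utf8(0x20AC)` (the euro sign) produces the three bytes E2 82 AC.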
GDC bindings between DMD and GCC
File | Function |
---|---|
asmstmt.cc | Builds inline assembler and extended inline assembler statements. |
d-apple-gcc.c | Deprecated - stub functions for any dependencies that can't be linked in from Apple-GCC objects. |
d-asm-i386.h | Implements D Inline assembler for x86 and x86_64. |
d-bi-attrs.h | Supported GCC function and type attributes. |
d-builtins2.cc | Handles importing of special modules (i.e. gcc.builtins, core.vararg) in the runtime library, and anything related to the builtin intrinsics of GDC. |
d-builtins.c | Handles GCC backend init routines for building all common and builtin trees of GCC. |
d-codegen.c | Code generation utilities, emit instructions, static chain/closure creation and passing, expand frontend builtins. |
d-convert.cc | Convert between basic D types, and conversions to boolean value for conditions. |
d-c-stubs.cc | Deprecated - stub functions for any dependencies that can't be linked in from GCC objects. |
d-decls.cc | Based on tocsym.c - builds and returns back end reference to a declaration or object. |
d-dmd-gcc.h | Contains declarations used by the modified DMD front-end to interact with GCC-specific code. |
d-gcc-complex_t.h | Same as DMD's complex_t, but uses GCC's REAL_VALUE_TYPE-based real_t instead of long double. |
d-gcc-includes.h | Headers included from GCC. |
d-gcc-real.cc | Object-oriented layer for interacting with GCC's REAL_VALUE_TYPE-based real_t. |
d-gcc-tree.h | Declaration of tree and tree_node for files that cannot include d-gcc-includes.h |
d-glue.cc | Builds GCC trees for all functions, statements, and expressions. Also converts D types into GCC types. |
d-gt.c | For linking with the GCC garbage collector |
d-incpath.c | Adds import paths for frontend to scan. |
d-irstate.cc | Contains the core functionality of IRState class in d-codegen.cc |
d-lang.cc | Implementation of GCC back-end callbacks and data structures. Main entry point for the D compiler (cc1d) to compile sources. |
d-objfile.cc | Setup and emit global variables and functions to send to GCC backend for processing. |
d-spec.c | The GDC frontend driver for processing command-line options passed to the main application. |
dt.cc | Implements backend functions called from todt.c in the DMD frontend. |
d-todt.c | Implements methods removed from todt.c because they require special treatment in GDC. |
d-tree.def | All GDC specific tree codes are defined here. |
lang.opt | All GDC specific command-line flags are defined here |
symbol.cc | Implements Symbol class for d-decls.cc. |
Intermediate Representation
%% To be written here: briefly describe how GDC builds tree representations of D types, expressions, etc.
Extensions to DMD Frontend
%% To be written here: describe in more detail areas where GDC splits away from DMD frontend.