GDC/Hacking

The GDC Hackers Guide

This page is meant as a resource for all of us who want to help Walter develop the D language by working on a modified DMD frontend that can make use of GCC's middle and back ends. To do this, we must learn to understand and edit the GDC/GCC sources.

Quicklinks

  • GCC (the GNU Compiler Collection)
  • GDB (the GNU Debugger)

Possibly out of date:

GCC Structure

Here we gather some texts that can help in understanding GCC/GDC. GCC is very complex, and without good documentation many will surely give up very soon (if anyone knows of some good books, add them too).

I will give a short overview of the structure of GCC (for the newbies). GCC is a compiler for many languages and many targets, so it is divided into pieces.

  • frontend - Turn the source code into an internal representation (GENERIC).
  • middleend - Convert the GENERIC to GIMPLE and perform optimizations.
  • backend - Turn GIMPLE into target-specific ASM instructions (by way of RTL, GCC's lower-level representation).

A brief overview of GENERIC and GIMPLE was presented at DConf 2013.
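
To make the GENERIC/GIMPLE distinction a little more concrete, here is a toy sketch in C++. It assumes only the general idea that GENERIC is a tree of nested expressions, while GIMPLE is a flattened form in which each statement has at most three operands. The `Expr` type and the `leaf`, `binary` and `lower` functions are hypothetical illustrations, not GCC data structures or APIs.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node, standing in for a GENERIC-like expression tree.
// A leaf has an empty `op` and a variable name; an inner node has an
// operator and two children.
struct Expr {
    std::string op;                  // "" for a leaf, otherwise "+", "*", ...
    std::string name;                // variable name (leaves only)
    std::unique_ptr<Expr> lhs, rhs;  // children (inner nodes only)
};

std::unique_ptr<Expr> leaf(const std::string &name) {
    std::unique_ptr<Expr> e(new Expr());
    e->name = name;
    return e;
}

std::unique_ptr<Expr> binary(const std::string &op,
                             std::unique_ptr<Expr> lhs,
                             std::unique_ptr<Expr> rhs) {
    std::unique_ptr<Expr> e(new Expr());
    e->op = op;
    e->lhs = std::move(lhs);
    e->rhs = std::move(rhs);
    return e;
}

// Flatten `e` into three-address statements appended to `out`, and return
// the name of the variable or temporary that holds its value.
std::string lower(const Expr &e, std::vector<std::string> &out,
                  int &next_temp) {
    if (e.op.empty())
        return e.name;               // leaves need no statement
    assert(e.lhs && e.rhs);          // inner nodes are always binary here
    std::string a = lower(*e.lhs, out, next_temp);
    std::string b = lower(*e.rhs, out, next_temp);
    std::string t = "t" + std::to_string(++next_temp);
    out.push_back(t + " = " + a + " " + e.op + " " + b);
    return t;
}
```

Calling `lower` on the tree for `b + c * d` (built as `binary("+", leaf("b"), binary("*", leaf("c"), leaf("d")))`) with an empty vector and a counter starting at 0 appends `t1 = c * d` and then `t2 = b + t1`. The nested tree has become a flat list of simple statements, which is the shape the optimizers work on.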

What we know as "GDC" is only an implementation of the frontend part of GCC. The middleend uses callbacks to interface with the frontend. GDC is located in its own subfolder of the "core" GCC source tree: (srcdir)/gcc/d/. It is within this subfolder that we must make all changes to the language. GCC has other frontends, such as c (C), cp (C++), java (Java), objc (Objective-C), Fortran, and Ada. One can look at these for advice, but one probably shouldn't... (One exception: the "c++" package is currently also required to build GDC, since the bundled recls library uses it.)
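
The callback arrangement can be pictured as a table of function pointers that the frontend fills in and the language-independent code calls. The sketch below is a simplified, hypothetical illustration: `FrontendHooks`, `run_compiler`, `d_hooks` and the hook names are made up for this example and are not GCC's actual interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical table of callbacks supplied by a language frontend.
struct FrontendHooks {
    const char *name;                                   // language name
    bool (*init)();                                     // one-time setup
    std::vector<std::string> (*parse_file)(const std::string &source);
};

// Stand-in for the language-independent part of the compiler: it knows
// nothing about D, it only calls through the hook table.
std::vector<std::string> run_compiler(const FrontendHooks &hooks,
                                      const std::string &source) {
    assert(hooks.init && hooks.parse_file);
    if (!hooks.init())
        return {};
    return hooks.parse_file(source);
}

// A toy "D frontend": records one declaration per non-empty input line.
static bool d_init() { return true; }

static std::vector<std::string> d_parse_file(const std::string &source) {
    std::vector<std::string> decls;
    std::string line;
    for (char c : source) {
        if (c == '\n') {
            if (!line.empty()) decls.push_back("decl:" + line);
            line.clear();
        } else {
            line += c;
        }
    }
    if (!line.empty()) decls.push_back("decl:" + line);
    return decls;
}

// The frontend registers itself by filling in the table.
const FrontendHooks d_hooks = { "D", d_init, d_parse_file };
```

With this in place, `run_compiler(d_hooks, source)` drives the toy frontend without `run_compiler` ever naming it. Adding another language would mean supplying another table, which is roughly why each GCC frontend can live in its own subfolder.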

Note that GDC is currently not an official language for GCC, but a "third party" addition. As such, it is similar to GPC (GNU Pascal Compiler), see http://www.gnu-pascal.de/. Work is underway to merge GDC into the official GCC codebase, at which point this will no longer be the case.

The frontend contains the lexer and parser - these together turn the source file into GENERIC. The GDC frontend relies heavily on the DMD sources to perform this work, and you will find the entire DMD sources in a subfolder.

Sadly, GCC is in a very poor state as far as code readability is concerned. Complex macros and source code generators litter the middle and backends. The source is well commented, but that really doesn't help... Well, I'll let you find that out for yourselves :)

The documentation (that I have read) is very hard to understand, so if anyone has any good resources or tips, write them here. Happy hacking!

GDC Structure

DMD Front End

File Function
aav.c Associative array
access.c Access check (private, public, package ...)
aliasthis.c Implements the alias this D symbol.
argtypes.c Convert types for argument passing (e.g. char is passed as ubyte).
array.c Dynamic array
arrayop.c Array operations (e.g. a[] = b[] + c[]).
async.c Asynchronous input
attrib.c Attributes i.e. storage class (const, @safe ...), linkage (extern(C) ...), protection (private ...), alignment (align(1) ...), anonymous aggregate, pragma, static if and mixin.
builtin.c Identify and evaluate built-in functions (e.g. std.math.sin)
cast.c Implicit cast, implicit conversion, and explicit cast (cast(T)), combining type in binary expression, integer promotion, and value range propagation.
class.c Class declaration
clone.c Define the implicit opEquals, opAssign, post blit and destructor for struct if needed, and also define the copy constructor for struct.
cond.c Evaluate compile-time conditionals, i.e. debug, version, and static if.
constfold.c Constant folding
cppmangle.c Mangle D types according to Intel's Itanium C++ ABI.
dchar.c Convert UTF-32 character to UTF-8 sequence
declaration.c Miscellaneous declarations, including typedef, alias, variable declarations including the implicit this declaration, type tuples, ClassInfo, ModuleInfo and various TypeInfos.
delegatize.c Convert an expression expr to a delegate { return expr; } (e.g. in lazy parameter).
doc.c Ddoc documentation generator (http://forum.dlang.org/post/dgnng9$2tb8$1@digitaldaemon.com)
dsymbol.c D symbols (i.e. variables, functions, modules, ... anything that has a name).
dump.c Defines the Expression::dump method to print the content of the expression to console. Mainly for debugging.
entity.c Defines the named entities to support the "\&Entity;" escape sequence.
enum.c Enum declaration
expression.c Defines the bulk of the classes which represent the AST at the expression level.
func.c Function declaration, also includes function/delegate literals, function alias, (static/shared) constructor/destructor/post-blit, invariant, unittest and allocator/deallocator.
gnuc.c Implements functions missing from GCC, specifically stricmp and memicmp.
hdrgen.c Generate headers (*.di files)
identifier.c Identifier (just the name).
idgen.c Make id.h and id.c for defining built-in Identifier instances. Compile and run this before compiling the rest of the source. (http://forum.dlang.org/post/cvergn$2asd$1@digitaldaemon.com)
impcnvgen.c Make impcnvtab.c for the implicit conversion table. Compile and run this before compiling the rest of the source.
imphint.c Import hint, e.g. prompting to import std.stdio when using writeln.
import.c Import.
init.c Initializers (e.g. the 3 in int x = 3).
inline.c Compute the cost and perform inlining.
interpret.c All the code that performs CTFE (compile-time function evaluation).
json.c Generate JSON output
lexer.c Lexically analyzes the source (e.g. separates keywords from identifiers)
lstring.c Length-prefixed UTF-32 string.
macro.c Expand DDoc macros
mangle.c Mangle D types and declarations
mars.c Analyzes the command-line arguments (also displays command-line help)
module.c Read modules.
mtype.c All D types.
opover.c Apply operator overloading
optimize.c Optimize the AST
parse.c Parse tokens into AST
rmem.c Storage allocator, implemented on top of the standard C allocation functions.
root.c Basic functions (deal mostly with strings, files, and bits)
scope.c Scope
speller.c Spellchecker
statement.c Handles while, do, for, foreach, if, pragma, staticassert, switch, case, default, break, return, continue, synchronized, try/catch/finally, throw, volatile, goto, and label
staticassert.c static assert.
stringtable.c String table
struct.c Aggregate (struct and union) declaration.
template.c Everything related to templates.
todt.c Generate data structures to initialize static variables added to the object file.
toobj.c Generate the object file for Dsymbol and declarations except functions.
traits.c __traits.
typinf.c Get TypeInfo from a type.
unialpha.c Check if a character is a Unicode alphabetic character.
unittests.c Run functions related to unit tests.
utf.c UTF-8.
version.c Handles version
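
Two of the files above, idgen.c and the implicit conversion table generator, are small programs that write other source files and must run before the rest of the build. As a rough illustration of that pattern only, here is a sketch of a generator that turns a list of names into header text. The output layout and the `generate_id_header` function are assumptions made for this example; the real id.h produced by idgen.c may look quite different.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical generator: given identifier names, produce the text of a
// header that declares one pointer per name. A build would write this
// text to a file before compiling the sources that include it.
std::string generate_id_header(const std::vector<std::string> &names) {
    std::string out = "// Generated file, do not edit.\n";
    out += "struct Id\n{\n";
    for (const std::string &n : names) {
        assert(!n.empty());          // every entry needs a name
        out += "    static Identifier *" + n + ";\n";
    }
    out += "};\n";
    return out;
}
```

The point of doing it this way is that the list of built-in names lives in one place, and the declarations (and, in a fuller version, the matching definitions) are derived from it rather than kept in sync by hand.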

GDC bindings between DMD and GCC

File Function
asmstmt.cc Builds inline assembler and extended inline assembler statements.
d-apple-gcc.c Deprecated - stub functions for any dependencies that can't be linked in from Apple-GCC objects.
d-asm-i386.h Implements D Inline assembler for x86 and x86_64.
d-bi-attrs.h Supported GCC function and type attributes.
d-builtins2.cc Handles importing of special modules (ie: gcc.builtins, core.vararg) in the runtime library, anything related to builtin intrinsics of GDC.
d-builtins.c Handles GCC backend init routines for building all common and builtin trees of GCC.
d-codegen.c Code generation utilities, emit instructions, static chain/closure creation and passing, expand frontend builtins.
d-convert.cc Convert between basic D types, and conversions to boolean value for conditions.
d-c-stubs.cc Deprecated - stub functions for any dependencies that can't be linked in from GCC objects.
d-decls.cc Based on tocsym.c - builds and returns back end reference to a declaration or object.
d-dmd-gcc.h Contains declarations used by the modified DMD front-end to interact with GCC-specific code.
d-gcc-complex_t.h Same as DMD's complex_t, but uses GCC's REAL_VALUE_TYPE-based real_t instead of long double.
d-gcc-includes.h Headers included from GCC.
d-gcc-real.cc Object-oriented layer for interacting with GCC's REAL_VALUE_TYPE-based real_t.
d-gcc-tree.h Declaration of tree and tree_node for files that cannot include d-gcc-includes.h
d-glue.cc Builds GCC trees for all functions, statements, and expressions. Also convert D types into GCC types.
d-gt.c For linking with the GCC garbage collector
d-incpath.c Adds import paths for frontend to scan.
d-irstate.cc Contains the core functionality of IRState class in d-codegen.cc
d-lang.cc Implementation of GCC back-end callbacks and data structures. Main entry point for the D compiler (cc1d) to compile sources.
d-objfile.cc Setup and emit global variables and functions to send to GCC backend for processing.
d-spec.c The GDC frontend driver for processing command-line options passed to the main application.
dt.cc Implements backend functions called from todt.c in the DMD frontend.
d-todt.c Implements methods removed from todt.c because they require special treatment in GDC.
d-tree.def All GDC specific tree codes are defined here.
lang.opt All GDC specific command-line flags are defined here
symbol.cc Implements Symbol class for d-decls.cc.
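
A note on .def files such as d-tree.def: files of this kind are usually not compiled on their own but expanded several times through a macro that is redefined before each use, so one list of entries yields an enum plus matching tables. The sketch below shows the idiom in a single file. Every name in it (`TOY_TREE_CODES`, `toy_tree_code`, and so on) is invented for illustration and does not come from the GDC or GCC sources.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the contents of a .def file. In a real tree
// the entries would live in their own file and be pulled in with
// #include; a list macro keeps this sketch in one piece.
#define TOY_TREE_CODES(X)                \
    X(TOY_PLUS_EXPR,   "plus_expr",   2) \
    X(TOY_NEGATE_EXPR, "negate_expr", 1) \
    X(TOY_COND_EXPR,   "cond_expr",   3)

// First expansion: an enum with one value per entry.
#define X(sym, name, nargs) sym,
enum toy_tree_code { TOY_TREE_CODES(X) TOY_NUM_CODES };
#undef X

// Second expansion: a parallel table of printable names.
#define X(sym, name, nargs) name,
static const char *const toy_code_name[] = { TOY_TREE_CODES(X) };
#undef X

// Third expansion: a parallel table of operand counts.
#define X(sym, name, nargs) nargs,
static const int toy_code_length[] = { TOY_TREE_CODES(X) };
#undef X

// Look up the printable name of a code.
const char *toy_name(toy_tree_code code) {
    assert(code < TOY_NUM_CODES);
    return toy_code_name[code];
}

// Look up how many operands a code takes.
int toy_length(toy_tree_code code) {
    assert(code < TOY_NUM_CODES);
    return toy_code_length[code];
}
```

Adding a new code then means adding a single line to the list, and the enum and both tables stay in step automatically. The price is that tools like grep will not find where `TOY_COND_EXPR` is "defined" in the usual sense, which is part of what makes this style hard to read at first.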

Intermediate Representation

%% To be written here: briefly describe how GDC builds tree representations of D types, expressions, etc.

Extensions to DMD Frontend

%% To be written here: describe in more detail areas where GDC splits away from DMD frontend.