Difference between revisions of "GDC/Hacking"
Latest revision as of 20:04, 1 September 2015
The GDC Hackers Guide
This page is meant as a resource for all of us who want to help Walter develop the D language by developing a modified DMD frontend that can make use of GCC's middle and back ends. To do this, we must learn how to understand and edit the GDC/GCC sources.
Quicklinks
Possibly out of date:
- Patching GDB (http://www.dsource.org/projects/gdb-patches) to understand D code/mangling
- DStress (http://dstress.kuehne.cn/www/dstress.html), test cases for D compilers
GCC Structure
Here we gather some texts that can help in understanding GCC/GDC. GCC is very complex, and without good documentation many will surely give up very soon (if anyone knows of some good books, add them too).
I will give a short overview of the structure of GCC (for the newbies). GCC is a compiler for many languages and many targets, so it is divided into pieces.
- frontend - Turn the source code into an internal representation (GENERIC).
- middleend - Convert the GENERIC to GIMPLE and perform optimizations.
- backend - Turn GIMPLE into target-specific ASM instructions.
A brief overview of GENERIC and GIMPLE (http://www.youtube.com/watch?v=vfgOF5-ztDs&feature=player_detailpage#t=1536) was presented at DConf 2013.
What we know as "GDC" is only an implementation of the frontend part of GCC. The middleend uses callbacks to interface with the frontend. GDC is located within its own subfolder in the "core" GCC source tree - (srcdir)/gcc/d/. It is within this subfolder that we must perform all changes to the language. GCC has other frontends such as c (C), cp (C++), java (Java), objc (Objective-C), Fortran and Ada. One can look at these for advice, but one probably shouldn't... (one exception: the "c++" package is currently also required to build GDC, since the bundled recls library, http://recls.org, uses it)
Note that GDC is currently not an official language for GCC, but a "third party" addition. As such, it is similar to GPC (GNU Pascal Compiler), see http://www.gnu-pascal.de/. Work is underway to merge GDC into the official GCC codebase, at which point this will no longer be the case.
The frontend contains the lexer and parser - these together turn the source file into GENERIC. The GDC frontend relies heavily on the DMD sources to perform this work, and you will find the entire DMD sources in a subfolder.
Sadly, GCC is in a very poor state as far as code readability is concerned. Complex macros and source code generators litter the middle and back ends. The source is well commented, but that really doesn't help... Well, I'll let you find that out by yourselves :)
The documentation (that I have read) is very hard to understand, so if anyone has any good resources or tips, write them here. Happy hacking!
GDC Structure
DMD Front End
File | Function |
---|---|
aav.c | Associative array |
access.c | Access check (private, public, package ...) |
aliasthis.c | Implements the alias this D symbol. |
argtypes.c | Convert types for argument passing (e.g. char is passed as ubyte). |
array.c | Dynamic array |
arrayop.c | Array operations (e.g. a[] = b[] + c[]). |
async.c | Asynchronous input |
attrib.c | Attributes i.e. storage class (const, @safe ...), linkage (extern(C) ...), protection (private ...), alignment (align(1) ...), anonymous aggregate, pragma, static if and mixin. |
builtin.c | Identify and evaluate built-in functions (e.g. std.math.sin) |
cast.c | Implicit cast, implicit conversion, and explicit cast (cast(T)), combining type in binary expression, integer promotion, and value range propagation. |
class.c | Class declaration |
clone.c | Define the implicit opEquals, opAssign, post blit and destructor for struct if needed, and also define the copy constructor for struct. |
cond.c | Evaluate compile-time conditionals, i.e. debug, version, and static if. |
constfold.c | Constant folding |
cppmangle.c | Mangle D types according to Intel's Itanium C++ ABI. |
dchar.c | Convert UTF-32 character to UTF-8 sequence |
declaration.c | Miscellaneous declarations, including typedef, alias, variable declarations including the implicit this declaration, type tuples, ClassInfo, ModuleInfo and various TypeInfos. |
delegatize.c | Convert an expression expr to a delegate { return expr; } (e.g. in lazy parameter). |
doc.c | Ddoc documentation generator (http://forum.dlang.org/post/dgnng9$2tb8$1@digitaldaemon.com) |
dsymbol.c | D symbols (i.e. variables, functions, modules, ... anything that has a name). |
dump.c | Defines the Expression::dump method to print the content of the expression to console. Mainly for debugging. |
entity.c | Defines the named entities to support the "\&Entity;" escape sequence. |
enum.c | Enum declaration |
expression.c | Defines the bulk of the classes which represent the AST at the expression level. |
func.c | Function declaration, also includes function/delegate literals, function alias, (static/shared) constructor/destructor/post-blit, invariant, unittest and allocator/deallocator. |
gnuc.c | Implements functions missing from GCC, specifically stricmp and memicmp. |
hdrgen.c | Generate headers (*.di files) |
identifier.c | Identifier (just the name). |
idgen.c | Make id.h and id.c for defining built-in Identifier instances. Compile and run this before compiling the rest of the source. (http://forum.dlang.org/post/cvergn$2asd$1@digitaldaemon.com) |
impcvngen.c | Make impcnvtab.c for the implicit conversion table. Compile and run this before compiling the rest of the source. |
imphint.c | Import hint, e.g. prompting to import std.stdio when using writeln. |
import.c | Import. |
init.c | Initializers (e.g. the 3 in int x = 3). |
inline.c | Compute the cost and perform inlining. |
interpret.c | All the code which evaluates CTFE |
json.c | Generate JSON output |
lexer.c | Lexically analyzes the source (e.g. separating keywords from identifiers) |
lstring.c | Length-prefixed UTF-32 string. |
macro.c | Expand DDoc macros |
mangle.c | Mangle D types and declarations |
mars.c | Analyzes the command-line arguments (also displays command-line help) |
module.c | Read modules. |
mtype.c | All D types. |
opover.c | Apply operator overloading |
optimize.c | Optimize the AST |
parse.c | Parse tokens into AST |
rmem.c | Implementation of the storage allocator, using the standard C allocation package. |
root.c | Basic functions (deal mostly with strings, files, and bits) |
scope.c | Scope |
speller.c | Spellchecker |
statement.c | Handles while, do, for, foreach, if, pragma, staticassert, switch, case, default, break, return, continue, synchronized, try/catch/finally, throw, volatile, goto, and label |
staticassert.c | static assert. |
stringtable.c | String table |
struct.c | Aggregate (struct and union) declaration. |
template.c | Everything related to template. |
todt.c | Generate data structures to initialize static variables added to the object file. |
toobj.c | Generate the object file for Dsymbol and declarations except functions. |
traits.c | __traits. |
typinf.c | Get TypeInfo from a type. |
unialpha.c | Check if a character is a Unicode alphabetic character. |
unittests.c | Run functions related to unit test. |
utf.c | UTF-8. |
version.c | Handles version |
GDC bindings between DMD and GCC
File | Function |
---|---|
asmstmt.cc | Builds inline assembler and extended inline assembler statements. |
d-apple-gcc.c | Deprecated - stub functions for any dependencies that can't be linked in from Apple-GCC objects. |
d-asm-i386.h | Implements D Inline assembler for x86 and x86_64. |
d-bi-attrs.h | Supported GCC function and type attributes. |
d-builtins2.cc | Handles importing of special modules (i.e. gcc.builtins, core.vararg) in the runtime library, and anything related to GDC's builtin intrinsics. |
d-builtins.c | Handles GCC backend init routines for building all common and builtin trees of GCC. |
d-codegen.c | Code generation utilities, emit instructions, static chain/closure creation and passing, expand frontend builtins. |
d-convert.cc | Convert between basic D types, and conversions to boolean value for conditions. |
d-c-stubs.cc | Deprecated - stub functions for any dependencies that can't be linked in from GCC objects. |
d-decls.cc | Based on tocsym.c - builds and returns back end reference to a declaration or object. |
d-dmd-gcc.h | Contains declarations used by the modified DMD front-end to interact with GCC-specific code. |
d-gcc-complex_t.h | Same as DMD's complex_t, but uses GCC's REAL_VALUE_TYPE-based real_t instead of long double. |
d-gcc-includes.h | Headers included from GCC. |
d-gcc-real.cc | Object-oriented layer for interacting with GCC's REAL_VALUE_TYPE-based real_t. |
d-gcc-tree.h | Declaration of tree and tree_node for files that cannot include d-gcc-includes.h |
d-glue.cc | Builds GCC trees for all functions, statements, and expressions. Also converts D types into GCC types. |
d-gt.c | For linking with the GCC garbage collector |
d-incpath.c | Adds import paths for frontend to scan. |
d-irstate.cc | Contains the core functionality of IRState class in d-codegen.cc |
d-lang.cc | Implementation of GCC back-end callbacks and data structures. Main entry point for the D compiler (cc1d) to compile sources. |
d-objfile.cc | Setup and emit global variables and functions to send to GCC backend for processing. |
d-spec.c | The GDC frontend driver for processing command-line options passed to the main application. |
dt.cc | Implements backend functions called from todt.c in the DMD frontend. |
d-todt.c | Implements methods removed from todt.c, as they require special treatment for GDC. |
d-tree.def | All GDC specific tree codes are defined here. |
lang.opt | All GDC specific command-line flags are defined here |
symbol.cc | Implements Symbol class for d-decls.cc. |
Intermediate Representation
%% To be written here: briefly describe how GDC builds tree representations of D types, expressions, etc.
Extensions to DMD Frontend
%% To be written here: describe in more detail areas where GDC splits away from DMD frontend.