D binding for C

From D Wiki
Revision as of 11:58, 13 March 2014 by Rowland0210 (talk | contribs) (Fixed wrong type information for Windows.)
Jump to: navigation, search

While D cannot directly compile C source code, it can easily interface to C code, be linked with C object files, and call C functions in DLLs. The interface to C code is normally found in C .h files. So, the trick to connecting with C code is in converting C .h files to D modules. This turns out to be difficult to do mechanically since inevitably some human judgement must be applied. This is a guide to doing such conversions.

Preprocessor

.h files can sometimes be a bewildering morass of layers of macros, #include files, #ifdef's, etc. D doesn't include a text preprocessor like the C preprocessor, so the first step is to remove the need for it by taking the preprocessed output. For DMC (the Digital Mars C/C++ compiler), the command:

dmc -c program.h -e -l

will create a file program.lst which is the source file after all text preprocessing.

Remove all the #if, #ifdef, #include, etc. statements.

Linkage

Generally, surround the entire module with:

extern (C)
{
     /* ...file contents... */
}

to give it C linkage.

Types

A little global search and replace will take care of renaming the C types to D types. The following tables show typical mappings for 32 bit and 64 bit C code. Note that there is a difference between them according to the type long. For convencience D offers the type alias core.stdc.config.c_ulong and core.stdc.config.c_long.

For 32 bit systems:

Mapping C type to D type
C type D type
long double real
unsigned long long ulong
long long long
unsigned long uint
long int
unsigned uint
unsigned short ushort
signed char byte
unsigned char ubyte
wchar_t wchar or dchar
bool bool, byte, int
size_t size_t
ptrdiff_t ptrdiff_t

For 64 bit systems:

Mapping C type to D type
C type D type
long double real
unsigned long long ulong
long long long
unsigned long uint (Windows) / ulong (Unix)
long int (Windows) / long (Unix)
unsigned uint
unsigned short ushort
signed char byte
unsigned char ubyte
wchar_t wchar or dchar
bool bool, byte, int
size_t size_t
ptrdiff_t ptrdiff_t

NULL

NULL and ((void*)0) should be replaced with null. Numeric Literals Any ‘L’ or ‘l’ numeric literal suffixes should be removed, as a C long is (usually) the same size as a D int. Similarly, ‘LL’ suffixes should be replaced with a single ‘L’. Any ‘u’ suffix will work the same in D.

String Literals

In most cases, any ‘L’ prefix to a string can just be dropped, as D will implicitly convert strings to wide characters if necessary. However, one can also replace:

L"string"

with:

"string"w	// for 16 bit wide characters
"string"d	// for 32 bit wide characters

Macros

Lists of macros like:

#define FOO	1
#define BAR	2
#define ABC	3
#define DEF	40

can be replaced with:

enum
{   FOO = 1,
    BAR = 2,
    ABC = 3,
    DEF = 40
}

or with:

const int FOO = 1;
const int BAR = 2;
const int ABC = 3;
const int DEF = 40;

Function style macros, such as:

#define MAX(a,b) ((a) < (b) ? (b) : (a))

can be replaced with functions:

int MAX(int a, int b) { return (a < b) ? b : a; }

The functions, however, won't work if they appear inside static initializers that must be evaluated at compile time rather than runtime. To do it at compile time, a template can be used:

#define GT_DEPTH_SHIFT  (0)
#define GT_SIZE_SHIFT   (8)
#define GT_SCHEME_SHIFT (24)
#define GT_DEPTH_MASK   (0xffU << GT_DEPTH_SHIFT)
#define GT_TEXT         ((0x01) << GT_SCHEME_SHIFT)

/* Macro that constructs a graphtype */
#define GT_CONSTRUCT(depth,scheme,size) \
	((depth) | (scheme) | ((size) << GT_SIZE_SHIFT))

/* Common graphtypes */
#define GT_TEXT16  GT_CONSTRUCT(4, GT_TEXT, 16)

The corresponding D version would be:

const uint GT_DEPTH_SHIFT  = 0;
const uint GT_SIZE_SHIFT   = 8;
const uint GT_SCHEME_SHIFT = 24;
const uint GT_DEPTH_MASK   = 0xffU << GT_DEPTH_SHIFT;
const uint GT_TEXT         = 0x01 << GT_SCHEME_SHIFT;

// Template that constructs a graphtype
template GT_CONSTRUCT(uint depth, uint scheme, uint size)
{
 // notice the name of the const is the same as that of the template
 const uint GT_CONSTRUCT = (depth | scheme | (size << GT_SIZE_SHIFT));
}

// Common graphtypes
const uint GT_TEXT16 = GT_CONSTRUCT!(4, GT_TEXT, 16);

Declaration Lists

D doesn't allow declaration lists to change the type. Hence:

int *p, q, t[3], *s;

should be written as:

int* p, s;
int q;
int[3] t;

Void Parameter Lists

Functions that take no parameters:

int foo(void);

are in D:

int foo();

Extern Global C Variables

Whenever a global variable is declared in D, it is also defined. But if it's also defined by the C object file being linked in, there will be a multiple definition error. To fix this problem, use the extern storage class. For example, given a C header file named foo.h:

struct Foo { };
struct Foo bar;

It can be replaced with the D modules, foo.d:

struct Foo { }
extern (C)
{
    extern Foo bar;
}

Typedef

alias is the D equivalent to the C typedef:

typedef int foo;

becomes:

alias foo = int;

Structs

Replace declarations like:

typedef struct Foo
{   int a;
    int b;
} Foo, *pFoo, *lpFoo;

with:

struct Foo
{   int a;
    int b;
}
alias pFoo  = Foo*;
alias lpFoo = Foo*;

Struct Member Alignment

A good D implementation by default will align struct members the same way as the C compiler it was designed to work with. But if the .h file has some #pragma's to control alignment, they can be duplicated with the D align attribute:

#pragma pack(1)
struct Foo
{
    int a;
    int b;
};
#pragma pack()

becomes:

struct Foo
{
  align (1):
    int a;
    int b;
}

Nested Structs

struct Foo
{
    int a;
    struct Bar
    {
	int c;
    } bar;
};

struct Abc
{
    int a;
    struct
    {
	int c;
    } bar;
};

becomes:

struct Foo
{
    int a;
    struct Bar
    {
	int c;
    }
    Bar bar;
}

struct Abc
{
    int a;
    struct
    {
	int c;
    }
}

__cdecl, __pascal, __stdcall

int __cdecl x;
int __cdecl foo(int a);
int __pascal bar(int b);
int __stdcall abc(int c);

becomes:

extern (C) int x;
extern (C) int foo(int a);
extern (Pascal) int bar(int b);
extern (Windows) int abc(int c);

__declspec(dllimport)

__declspec(dllimport) int __stdcall foo(int a);

becomes:

export extern (Windows) int foo(int a);

__fastcall

Unfortunately, D doesn't support the __fastcall convention. Therefore, a shim will be needed, either written in C:

int __fastcall foo(int a);

int myfoo(int a)
{
    return foo(int a);
}

and compiled with a C compiler that supports __fastcall and linked in, or compile the above, disassemble it with obj2asm and insert it in a D myfoo shim with inline assembler.