D binding for C

From D Wiki
Jump to: navigation, search

Introduction

While D cannot directly compile C source code, it can easily interface to C code, be linked with C object files, and call C functions in DLLs.

The interface to C code is normally found in C .h files. So, the trick to connecting with C code is in converting C .h files to D modules. This turns out to be difficult to do mechanically since inevitably some human judgement must be applied. This is a guide to doing such conversions.


Preprocessor

.h files can sometimes be a bewildering morass of layers of macros, #include files, #ifdef's, etc. D doesn't include a text preprocessor like the C preprocessor, so the first step is to remove the need for it by taking the preprocessed output. For DMC (the Digital Mars C/C++ compiler), the command:

dmc -c program.h -e -l

will create a file program.lst which is the source file after all text preprocessing.

For gcc (GNU Compiler Collection), use the command:

gcc -E -P program.h > program.lst


Remove all the #if, #ifdef, #include, etc. statements.

Linkage

Generally, surround the entire module with:

extern (C)
{
     /* ...file contents... */
}

to give it C linkage.

Global variables

Global variables need to have an extra extern and the __gshared storage.

The C Way

int a;

The D Way

extern (C) extern __gshared int a;

For TLS variables __gshared is not used.


Types

A little global search and replace will take care of renaming the C types to D types. The following tables show typical mappings for 32 bit and 64 bit C code. Note that there is a difference between them according to the type long. For convencience D offers the type alias core.stdc.config.c_ulong and core.stdc.config.c_long.

Also note that the following lists sometimes show the implicit C variant, e.g., long long instead of its equivalent explicit variant long long int.

For 32 bit systems:

Mapping C type to D type
C type D type
long double real
unsigned long long ulong
long long long
unsigned long uint
long int
unsigned int uint
int int
unsigned short ushort
signed char byte
unsigned char ubyte
wchar_t wchar or dchar
bool bool, byte, int
size_t size_t
ptrdiff_t ptrdiff_t

For 64 bit systems:

Mapping C type to D type
C type D type
long double real
unsigned long long ulong
long long long
unsigned long uint (Windows) / ulong (Unix)
long int (Windows) / long (Unix)
unsigned uint
unsigned int int
unsigned short ushort
signed char byte
unsigned char ubyte
wchar_t wchar or dchar
bool bool, byte, int
size_t size_t
ptrdiff_t ptrdiff_t

NULL

NULL and ((void*)0) should be replaced with null. Numeric Literals Any ‘L’ or ‘l’ numeric literal suffixes should be removed, as a C long is (usually) the same size as a D int. Similarly, ‘LL’ suffixes should be replaced with a single ‘L’. Any ‘u’ suffix will work the same in D.

String Literals

In most cases, any ‘L’ prefix to a string can just be dropped, as D will implicitly convert strings to wide characters if necessary.

However, one can also replace:

The C Way

L"string"

with:

The D Way

"string"w	// for 16 bit wide characters
"string"d	// for 32 bit wide characters

Macros

Lists of macros like:

The C Way

#define FOO	1
#define BAR	2
#define ABC	3
#define DEF	40

can be replaced with:

The D Way

enum
{   FOO = 1,
    BAR = 2,
    ABC = 3,
    DEF = 40
}

or with:

enum int FOO = 1;
enum int BAR = 2;
enum int ABC = 3;
enum int DEF = 40;

Function style macros, such as:

The C Way

#define MAX(a,b) ((a) < (b) ? (b) : (a))

can be replaced with functions:

The D Way

int MAX(int a, int b) { return (a < b) ? b : a; }

The functions, however, won't work if they appear inside static initializers that must be evaluated at compile time rather than runtime. To do it at compile time, a template can be used:

The C Way

#define GT_DEPTH_SHIFT  (0)
#define GT_SIZE_SHIFT   (8)
#define GT_SCHEME_SHIFT (24)
#define GT_DEPTH_MASK   (0xffU << GT_DEPTH_SHIFT)
#define GT_TEXT         ((0x01) << GT_SCHEME_SHIFT)

/* Macro that constructs a graphtype */
#define GT_CONSTRUCT(depth,scheme,size) \
	((depth) | (scheme) | ((size) << GT_SIZE_SHIFT))

/* Common graphtypes */
#define GT_TEXT16  GT_CONSTRUCT(4, GT_TEXT, 16)

The corresponding D version would be:

The D Way

enum uint GT_DEPTH_SHIFT  = 0;
enum uint GT_SIZE_SHIFT   = 8;
enum uint GT_SCHEME_SHIFT = 24;
enum uint GT_DEPTH_MASK   = 0xffU << GT_DEPTH_SHIFT;
enum uint GT_TEXT         = 0x01 << GT_SCHEME_SHIFT;

// Template that constructs a graphtype
template GT_CONSTRUCT(uint depth, uint scheme, uint size)
{
 // notice the name of the const is the same as that of the template
 enum uint GT_CONSTRUCT = (depth | scheme | (size << GT_SIZE_SHIFT));
}

// Common graphtypes
enum uint GT_TEXT16 = GT_CONSTRUCT!(4, GT_TEXT, 16);

Declaration Lists

D doesn't allow declaration lists to change the type. Hence:

The C Way

int *p, q, t[3], *s;

should be written as:

The D Way

int* p, s;
int q;
int[3] t;

Void Parameter Lists

Functions that take no parameters:

The C Way

int foo(void);

are in D:

The D Way

int foo();

Extern Global C Variables

Whenever a global variable is declared in D, it is also defined. But if it's also defined by the C object file being linked in, there will be a multiple definition error. To fix this problem, use the extern storage class. For example, given a C header file named foo.h:

The C Way

struct Foo { };
struct Foo bar;

It can be replaced with the D modules, foo.d:

The D Way

struct Foo { }
extern (C)
{
    extern Foo bar;
}

Typedef

alias is the D equivalent to the C typedef:

The C Way

typedef int foo;

becomes:

The D Way

alias foo = int;

Function pointers

With function pointers there are (at least) two cases where an alias have to be used, instead of a function pointer.

  • When declaring function parameters with a specific linkage.
  • When using a cast with a specific linkage. You won't see this in a binding, if you're not converting inline functions.

Function parameters

The following is syntactically invalid in D:

The C Way

void foo (extern(C) void function () callback);

Use an alias:

The D Way

alias Callback = extern (C) void function(); 
void foo (Callback callback);

Cast

You won't see this in a binding, if you're not converting inline functions.

This is invalid in D as well:

void* foo;
...
auto bar = cast(extern (C) void function ()) foo;

Use the same approach as above:

alias Callback = extern (C) void function(); 
...
auto bar = cast(Callback) foo;

Structs

Replace declarations like:

The C Way

typedef struct Foo
{   int a;
    int b;
} Foo, *pFoo, *lpFoo;

with:

The D Way

struct Foo
{   int a;
    int b;
}
alias pFoo  = Foo*;
alias lpFoo = Foo*;

Anonymous structs

If an anonymous struct is used directly to declare a variable you're forced to invent a name for the struct in D, since D doesn't support anonymous structs.

The C Way

struct
{
   int a;
   int b;
} c;

Translate to:

The D Way

struct _AnonymousStruct1
{
   int a;
   int b;
}
 
_AnonymousStruct1 c;

Any name can be used in this case.

Struct Member Alignment

A good D implementation by default will align struct members the same way as the C compiler it was designed to work with. But if the .h file has some #pragma's to control alignment, they can be duplicated with the D align attribute:

The C Way

#pragma pack(1)
struct Foo
{
    int a;
    int b;
};
#pragma pack()

becomes:

The D Way

struct Foo
{
  align (1):
    int a;
    int b;
}

Nested Structs

The C Way

struct Foo
{
    int a;
    struct Bar
    {
	int c;
    } bar;
};

struct Abc
{
    int a;
    struct
    {
	int c;
    } bar;
};

becomes:

The D Way

struct Foo
{
    int a;
    struct Bar
    {
	int c;
    }
    Bar bar;
}

struct Abc
{
    int a;
    struct
    {
	int c;
    }
}

__cdecl, __stdcall

The C Way

int __cdecl x;
int __cdecl foo(int a);
int __stdcall abc(int c);

becomes:

The D Way

extern (C) int x;
extern (C) int foo(int a);
extern (Windows) int abc(int c);

__declspec(dllimport)

The C Way

__declspec(dllimport) int __stdcall foo(int a);

becomes:

The D Way

export extern (Windows) int foo(int a);

__fastcall

Unfortunately, D doesn't support the __fastcall convention. Therefore, a shim will be needed, either written in C:

The C Way

int __fastcall foo(int a);

int myfoo(int a)
{
    return foo(int a);
}

and compiled with a C compiler that supports __fastcall and linked in, or compile the above, disassemble it with obj2asm and insert it in a D myfoo shim with inline assembler.


See also