Converting C .h Files to D Modules

From D Wiki
Jump to: navigation, search

While D cannot directly compile C source code, it can easily interface to C code, be linked with C object files, and call C functions in DLLs. The interface to C code is normally found in C .h files. So, the trick to connecting with C code is in converting C .h files to D modules. This turns out to be difficult to do mechanically since inevitably some human judgement must be applied. This is a guide to doing such conversions.

See also

Preprocessor

.h files can sometimes be a bewildering morass of layers of macros, #include files, #ifdef's, etc. D doesn't include a text preprocessor like the C preprocessor, so the first step is to remove the need for it by taking the preprocessed output. For DMC (the Digital Mars C/C++ compiler), the command:

dmc -c program.h -e -l

will create a file program.lst which is the source file after all text preprocessing.

For gcc (GNU Compiler Collection), use the command:

gcc -E -P program.h > program.lst


Remove all the #if, #ifdef, #include, etc. statements.

Linkage

Generally, surround the entire module with:

extern (C)
{
     /* ...file contents... */
}

to give it C linkage.

Types

A little global search and replace will take care of renaming the C types to D types. The following tables show typical mappings for 32 bit and 64 bit C code. Note that there is a difference between them according to the type long. For convencience D offers the type alias core.stdc.config.c_ulong and core.stdc.config.c_long.

Also note that the following lists sometimes show the implicit C variant, e.g., long long instead of its equivalent explicit variant long long int.

For 32 bit systems:

Mapping C type to D type
C type D type
long double real
unsigned long long ulong
long long long
unsigned long uint
long int
unsigned int uint
int int
unsigned short ushort
signed char byte
unsigned char ubyte
wchar_t wchar or dchar
bool bool, byte, int
size_t size_t
ptrdiff_t ptrdiff_t

For 64 bit systems:

Mapping C type to D type
C type D type
long double real
unsigned long long ulong
long long long
unsigned long uint (Windows) / ulong (Unix)
long int (Windows) / long (Unix)
unsigned uint
unsigned int int
unsigned short ushort
signed char byte
unsigned char ubyte
wchar_t wchar or dchar
bool bool, byte, int
size_t size_t
ptrdiff_t ptrdiff_t

NULL

NULL and ((void*)0) should be replaced with null. Numeric Literals Any ‘L’ or ‘l’ numeric literal suffixes should be removed, as a C long is (usually) the same size as a D int. Similarly, ‘LL’ suffixes should be replaced with a single ‘L’. Any ‘u’ suffix will work the same in D.

String Literals

In most cases, any ‘L’ prefix to a string can just be dropped, as D will implicitly convert strings to wide characters if necessary. However, one can also replace:

L"string"

with:

"string"w	// for 16 bit wide characters
"string"d	// for 32 bit wide characters

Macros

Lists of macros like:

#define FOO	1
#define BAR	2
#define ABC	3
#define DEF	40

can be replaced with:

enum
{   FOO = 1,
    BAR = 2,
    ABC = 3,
    DEF = 40
}

or with:

const int FOO = 1;
const int BAR = 2;
const int ABC = 3;
const int DEF = 40;

Function style macros, such as:

#define MAX(a,b) ((a) < (b) ? (b) : (a))

can be replaced with functions:

int MAX(int a, int b) { return (a < b) ? b : a; }

The functions, however, won't work if they appear inside static initializers that must be evaluated at compile time rather than runtime. To do it at compile time, a template can be used:

#define GT_DEPTH_SHIFT  (0)
#define GT_SIZE_SHIFT   (8)
#define GT_SCHEME_SHIFT (24)
#define GT_DEPTH_MASK   (0xffU << GT_DEPTH_SHIFT)
#define GT_TEXT         ((0x01) << GT_SCHEME_SHIFT)
 
/* Macro that constructs a graphtype */
#define GT_CONSTRUCT(depth,scheme,size) \
	((depth) | (scheme) | ((size) << GT_SIZE_SHIFT))
 
/* Common graphtypes */
#define GT_TEXT16  GT_CONSTRUCT(4, GT_TEXT, 16)

The corresponding D version would be:

const uint GT_DEPTH_SHIFT  = 0;
const uint GT_SIZE_SHIFT   = 8;
const uint GT_SCHEME_SHIFT = 24;
const uint GT_DEPTH_MASK   = 0xffU << GT_DEPTH_SHIFT;
const uint GT_TEXT         = 0x01 << GT_SCHEME_SHIFT;
 
// Template that constructs a graphtype
template GT_CONSTRUCT(uint depth, uint scheme, uint size)
{
 // notice the name of the const is the same as that of the template
 const uint GT_CONSTRUCT = (depth | scheme | (size << GT_SIZE_SHIFT));
}
 
// Common graphtypes
const uint GT_TEXT16 = GT_CONSTRUCT!(4, GT_TEXT, 16);

Declaration Lists

D doesn't allow declaration lists to change the type. Hence:

int *p, q, t[3], *s;

should be written as:

int* p, s;
int q;
int[3] t;

Void Parameter Lists

Functions that take no parameters:

int foo(void);

are in D:

int foo();

Extern Global C Variables

Whenever a global variable is declared in D, it is also defined. But if it's also defined by the C object file being linked in, there will be a multiple definition error. To fix this problem, use the extern storage class. For example, given a C header file named foo.h:

struct Foo { };
struct Foo bar;

It can be replaced with the D modules, foo.d:

struct Foo { }
extern (C)
{
    extern Foo bar;
}

Typedef

alias is the D equivalent to the C typedef:

typedef int foo;

becomes:

alias foo = int;

Structs

Replace declarations like:

typedef struct Foo
{   int a;
    int b;
} Foo, *pFoo, *lpFoo;

with:

struct Foo
{   int a;
    int b;
}
alias pFoo  = Foo*;
alias lpFoo = Foo*;

Struct Member Alignment

A good D implementation by default will align struct members the same way as the C compiler it was designed to work with. But if the .h file has some #pragma's to control alignment, they can be duplicated with the D align attribute:

#pragma pack(1)
struct Foo
{
    int a;
    int b;
};
#pragma pack()

becomes:

struct Foo
{
  align (1):
    int a;
    int b;
}

Nested Structs

struct Foo
{
    int a;
    struct Bar
    {
	int c;
    } bar;
};
 
struct Abc
{
    int a;
    struct
    {
	int c;
    } bar;
};

becomes:

struct Foo
{
    int a;
    struct Bar
    {
	int c;
    }
    Bar bar;
}
 
struct Abc
{
    int a;
    struct
    {
	int c;
    }
}

__cdecl, __pascal, __stdcall

int __cdecl x;
int __cdecl foo(int a);
int __pascal bar(int b);
int __stdcall abc(int c);

becomes:

extern (C) int x;
extern (C) int foo(int a);
extern (Pascal) int bar(int b);
extern (Windows) int abc(int c);

__declspec(dllimport)

__declspec(dllimport) int __stdcall foo(int a);

becomes:

export extern (Windows) int foo(int a);

__fastcall

Unfortunately, D doesn't support the __fastcall convention. Therefore, a shim will be needed, either written in C:

int __fastcall foo(int a);
 
int myfoo(int a)
{
    return foo(int a);
}

and compiled with a C compiler that supports __fastcall and linked in, or compile the above, disassemble it with obj2asm and insert it in a D myfoo shim with inline assembler.