Difference between revisions of "D binding for C"

Latest revision as of 10:44, 27 March 2024

Introduction

D can easily interface to C code, be linked with C object files, and call C functions in DLLs. With the ImportC compiler extension, a D compiler can directly import or compile C source code. However, because of complex macros and compiler extensions, ImportC (and other automatic tools) might not get you there in one go, in which case manual C bindings must be written.

The interface to C code is normally found in C .h files. So, the trick to connecting with C code is in converting C .h files to D modules. In cases where automatic tools fail, some human judgement must be applied. This is a guide to doing such conversions.


Preprocessor

.h files can sometimes be a bewildering morass of layers of macros, #include files, #ifdef's, etc. D doesn't include a text preprocessor like the C preprocessor, so the first step is to remove the need for it by taking the preprocessed output. For DMC (the Digital Mars C/C++ compiler), the command:

dmc -c program.h -e -l

will create a file program.lst which is the source file after all text preprocessing.

For gcc (GNU Compiler Collection), use the command:

gcc -E -P program.h > program.lst


Remove all the #if, #ifdef, #include, etc. statements.

Linkage

Generally, surround the entire module with:

extern (C)
{
     /* ...file contents... */
}

to give it C linkage.
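
As a minimal sketch of what such a declaration looks like in practice, the following binds one function from the C standard library by hand and calls it. Only strlen itself comes from C; the nothrow and @nogc attributes are optional additions chosen for this example, not something a header dictates.

```d
import std.stdio : writeln;

// Hand-written declaration of a C standard library function.
// extern (C) selects C linkage; no body is given, so the linker
// resolves the symbol from the C runtime.
extern (C) nothrow @nogc
{
    size_t strlen(const(char)* s);
}

void main()
{
    // D string literals are stored zero-terminated, so .ptr can be passed to C.
    const n = strlen("hello".ptr);
    assert(n == 5);
    writeln(n);
}
```

Only literals carry the terminating zero; a D string built at run time would normally go through std.string.toStringz before being passed to C.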

Global variables

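Global variables declared in a C header need an extra extern storage class, so that D does not define them itself, and the __gshared storage class, so that they are not placed in thread-local storage.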

The C Way

int a;

The D Way

extern (C) extern __gshared int a;

For TLS variables __gshared is not used.
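
The reason __gshared matters can be seen without any C code. The sketch below is pure D and uses illustrative variable names: it shows that a module-level variable is thread-local by default, while a __gshared one has a single process-wide copy, which is what a plain C global is.

```d
import core.thread : Thread;

int tlsCounter;               // D default: one copy per thread (TLS)
__gshared int sharedCounter;  // one copy per process, like a C global

void main()
{
    tlsCounter = 1;
    sharedCounter = 1;

    int seenTls, seenShared;
    auto t = new Thread({
        // The new thread starts with its own freshly initialized tlsCounter.
        seenTls = tlsCounter;
        seenShared = sharedCounter;
    });
    t.start();
    t.join();

    assert(seenTls == 0);     // not the main thread's value
    assert(seenShared == 1);  // same storage as in the main thread
}
```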


Types

A little global search and replace will take care of renaming the C types to D types. The following tables show typical mappings for 32 bit and 64 bit C code. Note that the two differ in how they map the C type long. For convenience, D offers the type aliases core.stdc.config.c_long and core.stdc.config.c_ulong, which match the C types on the target platform.

Also note that the following lists sometimes show the implicit C variant, e.g., long long instead of its equivalent explicit variant long long int.

For 32 bit systems:

Mapping C type to D type

C type                D type
long double           real
unsigned long long    ulong
long long             long
unsigned long         uint
long                  int
unsigned int          uint
int                   int
unsigned short        ushort
signed char           byte
unsigned char         ubyte
wchar_t               wchar or dchar
bool                  bool, byte, int
size_t                size_t
ptrdiff_t             ptrdiff_t

For 64 bit systems:

Mapping C type to D type

C type                D type
long double           real
unsigned long long    ulong
long long             long
unsigned long         uint (Windows) / ulong (Unix)
long                  int (Windows) / long (Unix)
unsigned              uint
unsigned int          uint
unsigned short        ushort
signed char           byte
unsigned char         ubyte
wchar_t               wchar or dchar
bool                  bool, byte, int
size_t                size_t
ptrdiff_t             ptrdiff_t
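
A short sketch of how the core.stdc.config aliases avoid hard-coding either mapping of long. It binds labs from the C standard library by hand; the static asserts only record the sizes expected on Windows and on typical Unix-like targets, and other platforms may differ.

```d
import core.stdc.config : c_long, c_ulong;

// C prototype from <stdlib.h>: long labs(long n);
extern (C) c_long labs(c_long n) nothrow @nogc;

// c_long tracks the C compiler's long on the target platform.
version (Windows)
    static assert(c_long.sizeof == 4);              // long stays 32 bits
else
    static assert(c_long.sizeof == size_t.sizeof);  // 4 or 8 on Unix-like targets

static assert(c_ulong.sizeof == c_long.sizeof);

void main()
{
    c_long v = labs(-5);
    assert(v == 5);
}
```

The same declaration compiles unchanged for 32 bit and 64 bit builds, which a plain int or long in the signature would not guarantee.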

NULL

NULL and ((void*)0) should be replaced with null.

Numeric Literals

Any ‘L’ or ‘l’ numeric literal suffixes should be removed where a C long is the same size as a D int (32 bit targets and 64 bit Windows); in D, an ‘L’ suffix always makes the literal a 64 bit long. Similarly, ‘LL’ suffixes should be replaced with a single ‘L’. Any ‘u’ suffix will work the same in D.
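
The literal rules can be sketched on a few made-up constants. The macro names below are hypothetical and exist only for illustration; the translation assumes a target where C long is 32 bits.

```d
// Hypothetical C macros, shown only for illustration:
//   #define TIMEOUT  30L
//   #define BIG      10000000000LL
//   #define FLAGS    0x80u
//   #define NO_DATA  ((void*)0)
enum int   TIMEOUT = 30;               // 'L' dropped: C long mapped to D int
enum long  BIG     = 10_000_000_000L;  // 'LL' becomes 'L'
enum uint  FLAGS   = 0x80u;            // 'u' is unchanged
enum void* NO_DATA = null;             // null pointer constant

// In D the 'L' suffix always means a 64-bit long.
static assert(is(typeof(30L) == long));
static assert(is(typeof(0x80u) == uint));

void main()
{
    assert(TIMEOUT == 30 && NO_DATA is null);
}
```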

String Literals

In most cases, any ‘L’ prefix to a string can just be dropped, as D will implicitly convert strings to wide characters if necessary.
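
A small sketch of that implicit conversion, using wcslen from the C standard library as the consumer. The variable names are illustrative, and the example relies on wchar_t from core.stdc.stddef, which is 16 bits on Windows and 32 bits on most other platforms.

```d
import core.stdc.stddef : wchar_t;
import core.stdc.wchar_ : wcslen;

void main()
{
    // An unsuffixed literal adapts to the character type it is assigned to.
    wstring utf16 = "string";
    dstring utf32 = "string";
    assert(utf16.length == 6 && utf32.length == 6);

    // The same literal initializes a wchar_t string for the C side.
    immutable(wchar_t)[] text = "string";
    assert(wcslen(text.ptr) == 6); // literals are stored zero-terminated
}
```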

However, one can also replace:

The C Way

L"string"

with:

The D Way

"string"w	// for 16 bit wide characters
"string"d	// for 32 bit wide characters

Macros

Lists of macros like:

The C Way

#define FOO	1
#define BAR	2
#define ABC	3
#define DEF	40

can be replaced with:

The D Way

enum
{   FOO = 1,
    BAR = 2,
    ABC = 3,
    DEF = 40
}

or with:

enum int FOO = 1;
enum int BAR = 2;
enum int ABC = 3;
enum int DEF = 40;

Function style macros, such as:

The C Way

#define MAX(a,b) ((a) < (b) ? (b) : (a))

can be replaced with functions:

The D Way

int MAX(int a, int b) { return (a < b) ? b : a; }
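
The C macro accepts any comparable type, while the function above is fixed to int. One possible way to keep that generality, sketched here as an alternative rather than as the required translation, is a function template:

```d
// Generic counterpart of the MAX macro; T is deduced from the arguments.
T MAX(T)(T a, T b)
{
    return (a < b) ? b : a;
}

void main()
{
    assert(MAX(2, 3) == 3);
    assert(MAX(1.5, 0.5) == 1.5);
    assert(MAX("abc", "abd") == "abd");
}
```

Unlike the macro, each argument is evaluated exactly once, and both arguments must have the same type for T to be deduced.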

When such a function appears inside a static initializer, it must be evaluated at compile time rather than at run time. Current D compilers can usually do this through compile-time function execution (CTFE); a template is another way to guarantee compile-time evaluation:

The C Way

#define GT_DEPTH_SHIFT  (0)
#define GT_SIZE_SHIFT   (8)
#define GT_SCHEME_SHIFT (24)
#define GT_DEPTH_MASK   (0xffU << GT_DEPTH_SHIFT)
#define GT_TEXT         ((0x01) << GT_SCHEME_SHIFT)

/* Macro that constructs a graphtype */
#define GT_CONSTRUCT(depth,scheme,size) \
	((depth) | (scheme) | ((size) << GT_SIZE_SHIFT))

/* Common graphtypes */
#define GT_TEXT16  GT_CONSTRUCT(4, GT_TEXT, 16)

The corresponding D version would be:

The D Way

enum uint GT_DEPTH_SHIFT  = 0;
enum uint GT_SIZE_SHIFT   = 8;
enum uint GT_SCHEME_SHIFT = 24;
enum uint GT_DEPTH_MASK   = 0xffU << GT_DEPTH_SHIFT;
enum uint GT_TEXT         = 0x01 << GT_SCHEME_SHIFT;

// Template that constructs a graphtype
template GT_CONSTRUCT(uint depth, uint scheme, uint size)
{
 // notice the name of the const is the same as that of the template
 enum uint GT_CONSTRUCT = (depth | scheme | (size << GT_SIZE_SHIFT));
}

// Common graphtypes
enum uint GT_TEXT16 = GT_CONSTRUCT!(4, GT_TEXT, 16);
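
With a reasonably recent compiler, an ordinary function can often serve the same purpose, because D evaluates it at compile time (CTFE) whenever the result initializes an enum. The following is a sketch under that assumption; gtConstruct is a name invented for this example.

```d
enum uint GT_SIZE_SHIFT   = 8;
enum uint GT_SCHEME_SHIFT = 24;
enum uint GT_TEXT         = 0x01 << GT_SCHEME_SHIFT;

// Plain function; evaluated by CTFE when a compile-time value is required.
uint gtConstruct(uint depth, uint scheme, uint size)
{
    return depth | scheme | (size << GT_SIZE_SHIFT);
}

// The enum initializer forces compile-time evaluation.
enum uint GT_TEXT16 = gtConstruct(4, GT_TEXT, 16);
static assert(GT_TEXT16 == 0x0100_1004);

void main()
{
    // The same function is also callable at run time.
    assert(gtConstruct(4, GT_TEXT, 16) == GT_TEXT16);
}
```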

Declaration Lists

D doesn't allow declaration lists to change the type. Hence:

The C Way

int *p, q, t[3], *s;

should be written as:

The D Way

int* p, s;
int q;
int[3] t;
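
The difference is easy to verify. In this small sketch the static asserts confirm that every name in a D declaration list receives the full type written on the left, including the pointer:

```d
void main()
{
    int* p, s;   // both p and s are int*
    int q;
    int[3] t;

    static assert(is(typeof(p) == int*));
    static assert(is(typeof(s) == int*));
    static assert(is(typeof(q) == int));
    static assert(t.length == 3);
}
```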

Void Parameter Lists

Functions that take no parameters:

The C Way

int foo(void);

are in D:

The D Way

int foo();

Extern Global C Variables

Whenever a global variable is declared in D, it is also defined. But if it's also defined by the C object file being linked in, there will be a multiple definition error. To fix this problem, use the extern storage class. For example, given a C header file named foo.h:

The C Way

struct Foo { };
struct Foo bar;

It can be replaced with the D module foo.d:

The D Way

struct Foo { }
extern (C)
{
    extern __gshared Foo bar;
}

Typedef

alias is the D equivalent to the C typedef:

The C Way

typedef int foo;

becomes:

The D Way

alias foo = int;

Function pointers

With function pointers there are (at least) two cases where an alias has to be used instead of writing the function pointer type inline.

  • When declaring function parameters with a specific linkage.
  • When using a cast with a specific linkage. You won't see this in a binding unless you're converting inline functions.

Function parameters

The following is syntactically invalid in D, because a linkage attribute cannot be written inside a parameter list:

void foo (extern(C) void function () callback);

Use an alias:

The D Way

alias Callback = extern (C) void function(); 
void foo (Callback callback);
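
A self-contained sketch of the alias approach, with illustrative names (apply, twice, dTwice) that do not come from any real library. It also shows that the linkage is part of the function pointer type, so a function with D linkage is rejected where a C callback is expected.

```d
// Function pointer type with C linkage, declared through an alias.
alias Callback = extern (C) int function(int);

// A callback defined in D but given C linkage.
extern (C) int twice(int x) { return 2 * x; }

// Stand-in for a bound C function that takes a callback.
int apply(Callback cb, int value) { return cb(value); }

void main()
{
    assert(apply(&twice, 21) == 42);

    // A function with D linkage does not convert to Callback.
    static int dTwice(int x) { return 2 * x; }
    static assert(!__traits(compiles, apply(&dTwice, 21)));
}
```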

Cast

You won't see this in a binding unless you're converting inline functions.

This is invalid in D as well:

void* foo;
...
auto bar = cast(extern (C) void function ()) foo;

Use the same approach as above:

alias Callback = extern (C) void function(); 
...
auto bar = cast(Callback) foo;

Structs

Replace declarations like:

The C Way

typedef struct Foo
{   int a;
    int b;
} Foo, *pFoo, *lpFoo;

with:

The D Way

struct Foo
{   int a;
    int b;
}
alias pFoo  = Foo*;
alias lpFoo = Foo*;

Anonymous structs

If an anonymous struct is used directly to declare a variable, you're forced to invent a name for the struct in D, since D doesn't allow a variable to be declared directly from an anonymous struct.

The C Way

struct
{
   int a;
   int b;
} c;

Translate to:

The D Way

struct _AnonymousStruct1
{
   int a;
   int b;
}
 
_AnonymousStruct1 c;

Any name can be used in this case.

Struct Member Alignment

A good D implementation by default will align struct members the same way as the C compiler it was designed to work with. But if the .h file has some #pragma's to control alignment, they can be duplicated with the D align attribute:

The C Way

#pragma pack(1)
struct Foo
{
    int a;
    int b;
};
#pragma pack()

becomes:

The D Way

struct Foo
{
  align (1):
    int a;
    int b;
}
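
With two int members the effect of packing is invisible, since both layouts are identical. A sketch with a one-byte member in front (the struct names are made up for this example) makes the difference checkable at compile time:

```d
struct Natural
{
    ubyte tag;
    uint  value;   // padded to a 4-byte boundary
}

struct Packed
{
  align (1):
    ubyte tag;
    uint  value;   // placed directly after tag
}

static assert(Natural.value.offsetof == 4);
static assert(Packed.value.offsetof == 1);

void main() {}
```

Checks like these are a cheap way to confirm that a translated struct matches the offsets the C compiler produces.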

Nested Structs

The C Way

struct Foo
{
    int a;
    struct Bar
    {
	int c;
    } bar;
};

struct Abc
{
    int a;
    struct
    {
	int c;
    } bar;
};

becomes:

The D Way

struct Foo
{
    int a;
    struct Bar
    {
	int c;
    }
    Bar bar;
}

struct Abc
{
    int a;
    struct
    {
	int c;
    }
}
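
One consequence of the second translation is worth spelling out: the memory layout is unchanged, but the member name bar disappears, so the access path differs from the C code. A short sketch:

```d
struct Foo
{
    int a;
    struct Bar { int c; }
    Bar bar;
}

struct Abc
{
    int a;
    struct { int c; }   // anonymous: c becomes a direct member of Abc
}

void main()
{
    Foo foo;
    foo.bar.c = 1;      // same access path as in C

    Abc abc;
    abc.c = 2;          // C code would write abc.bar.c

    // Both translations keep the C layout.
    static assert(Abc.c.offsetof == Foo.bar.offsetof);
    static assert(Abc.sizeof == Foo.sizeof);
}
```

If callers should keep writing abc.bar.c, naming the inner struct as in Foo is the alternative.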

__cdecl, __stdcall

The C Way

int __cdecl x;
int __cdecl foo(int a);
int __stdcall abc(int c);

becomes:

The D Way

extern (C) extern __gshared int x;
extern (C) int foo(int a);
extern (Windows) int abc(int c);

__declspec(dllimport)

The C Way

__declspec(dllimport) int __stdcall foo(int a);

becomes:

The D Way

export extern (Windows) int foo(int a);

__fastcall

Unfortunately, D doesn't support the __fastcall convention. Therefore, a shim will be needed, either written in C:

The C Way

int __fastcall foo(int a);

int myfoo(int a)
{
    return foo(a);
}

and compiled with a C compiler that supports __fastcall and linked in, or written in D: compile the above, disassemble it with obj2asm, and insert the result into a D myfoo shim with inline assembler.


See also

  • Bind D to C (obsolete)
  • Binding generators: tools which can perform such conversions automatically
  • Porting from C gotchas