Win32 DLLs in D

From D Wiki
Jump to: navigation, search

DLLs (Dynamic Link Libraries) are one of the foundations of system programming for Windows. The D programming language enables the creation of several different types of DLLs.

For background information on what DLLs are and how they work Chapter 11 of Jeffrey Richter's book Advanced Windows is indispensible.

This guide will show how to create DLLs of various types with D.

Compiling a DLL

Use the -shared switch to tell the compiler that the generated code is to be put into a DLL. Code compiled for an EXE file will use the optimization assumption that _tls_index==0. Such code in a DLL will crash.

DLLs with a C Interface

A DLL presenting a C interface can connect to any other code in a language that supports calling C functions in a DLL.

DLLs can be created in D in roughly the same way as in C. A DllMain() is required, looking like:

import std.c.windows.windows;
import core.sys.windows.dll;
 
__gshared HINSTANCE g_hInst;
 
extern (Windows)
BOOL DllMain(HINSTANCE hInstance, ULONG ulReason, LPVOID pvReserved)
{
    switch (ulReason)
    {
	case DLL_PROCESS_ATTACH:
	    g_hInst = hInstance;
	    dll_process_attach( hInstance, true );
	    break;
 
	case DLL_PROCESS_DETACH:
	    dll_process_detach( hInstance, true );
	    break;
 
	case DLL_THREAD_ATTACH:
	    dll_thread_attach( true, true );
	    break;
 
	case DLL_THREAD_DETACH:
	    dll_thread_detach( true, true );
	    break;
 
        default:
    }
    return true;
}

Notes:

  • DllMain simply forwards to the appropriate helper functions. These setup the runtime, create thread objects for interaction with the garbage collector and initialize thread local storage data.
  • The DLL does not share its runtime or memory with other DLLs.
  • The first boolean argument to the dll-helper functions specify whether all threads should be controlled by the garbage collector. You might need more control over this behaviour if there are threads in the process that must not be suspended. In this case pass false to disable the automatic handling of all threads.
  • The presence of DllMain() is recognized by the compiler causing it to emit a reference to __acrtused_dll and the phobos.lib runtime library.

Link with a .def (Module Definition File) along the lines of:

LIBRARY         MYDLL
DESCRIPTION     'My DLL written in D'

EXETYPE		NT
CODE            PRELOAD DISCARDABLE
DATA            WRITE

EXPORTS
		DllGetClassObject       @2
		DllCanUnloadNow         @3
		DllRegisterServer       @4
		DllUnregisterServer     @5

The functions in the EXPORTS list are for illustration. Replace them with the actual exported functions from MYDLL. Alternatively, use implib. Here's an example of a simple DLL with a function print() which prints a string:

mydll.d:

module mydll;
import std.c.stdio;
export void dllprint() { printf("hello dll world\n"); }

Note: We use printfs in these examples instead of writefln to make the examples as simple as possible.

mydll.def:

LIBRARY "mydll.dll"
EXETYPE NT
SUBSYSTEM WINDOWS
CODE SHARED EXECUTE
DATA WRITE

Put the code above that contains DllMain() into a file dll.d. Compile and link the dll with the following command:

C:>dmd -ofmydll.dll -L/IMPLIB mydll.d dll.d mydll.def
C:>

which will create mydll.dll and mydll.lib. Now for a program, test.d, which will use the dll:

test.d:

import mydll;
 
int main()
{
   mydll.dllprint();
   return 0;
}

Create an interface file mydll.di that doesn't have the function bodies.

mydll.di:

export void dllprint();

Compile and link with the command:

C:>dmd test.d mydll.lib
C:>

and run:

C:>test
hello dll world
C:>

Memory Allocation

D DLLs use garbage collected memory management. The question is what happens when pointers to allocated data cross DLL boundaries? If the DLL presents a C interface, one would assume the reason for that is to connect with code written in other languages. Those other languages will not know anything about D's memory management. Thus, the C interface will have to shield the DLL's callers from needing to know anything about it.

There are many approaches to solving this problem:

  • Do not return pointers to D gc allocated memory to the caller of the DLL. Instead, have the caller allocate a buffer, and have the DLL fill in that buffer.
  • Retain a pointer to the data within the D DLL so the GC will not free it. Establish a protocol where the caller informs the D DLL when it is safe to free the data.
  • Notify the GC about external references to a memory block by calling GC.addRange.
  • Use operating system primitives like VirtualAlloc() to allocate memory to be transferred between DLLs.
  • Use std.c.stdlib.malloc() (or another non-gc allocator) when allocating data to be returned to the caller. Export a function that will be used by the caller to free the data.

COM Programming

Many Windows API interfaces are in terms of COM (Common Object Model) objects (also called OLE or ActiveX objects). A COM object is an object who's first field is a pointer to a vtbl[], and the first 3 entries in that vtbl[] are for QueryInterface(), AddRef(), and Release().

For understanding COM, Kraig Brockshmidt's Inside OLE is an indispensible resource.

COM objects are analogous to D interfaces. Any COM object can be expressed as a D interface, and every D object with an interface X can be exposed as a COM object X. This means that D is compatible with COM objects implemented in other languages.

While not strictly necessary, the Phobos library provides an Object useful as a super class for all D COM objects, called ComObject. ComObject provides a default implementation for QueryInterface(), AddRef(), and Release().

Windows COM objects use the Windows calling convention, which is not the default for D, so COM functions need to have the attribute extern (Windows).

So, to write a COM object:

import std.c.windows.com;
 
class MyCOMobject : ComObject
{
    extern (Windows):
	...
}

The sample code includes an example COM client program and server DLL.

D code calling D code in DLLs

Having DLLs in D be able to talk to each other as if they were statically linked together is, of course, very desirable as code between applications can be shared, and different DLLs can be independently developed.

The underlying difficulty is what to do about garbage collection (gc). Each EXE and DLL will have their own gc instance. While these gc's can coexist without stepping on each other, it's redundant and inefficient to have multiple gc's running. The idea explored here is to pick one gc and have the DLLs redirect their gc's to use that one. The one gc used here will be the one in the EXE file, although it's also possible to make a separate DLL just for the gc.

The example will show both how to statically load a DLL, and to dynamically load/unload it.

Starting with the code for the DLL, mydll.d:

module mydll;
 
/*
 * MyDll demonstration of how to write D DLLs.
 */
 
import core.runtime;
import std.c.stdio;
import std.c.stdlib;
import std.string;
import std.c.windows.windows;
 
HINSTANCE g_hInst;
 
extern (C)
{
    void gc_setProxy(void* p);
    void gc_clrProxy();
}
 
extern (Windows) BOOL DllMain(HINSTANCE hInstance, ULONG ulReason, LPVOID pvReserved)
{
    switch (ulReason)
    {
        case DLL_PROCESS_ATTACH:
            printf("DLL_PROCESS_ATTACH\n");
            Runtime.initialize();
            break;
 
        case DLL_PROCESS_DETACH:
            printf("DLL_PROCESS_DETACH\n");
            Runtime.terminate();
            break;
 
        case DLL_THREAD_ATTACH:
            printf("DLL_THREAD_ATTACH\n");
            return false;
 
        case DLL_THREAD_DETACH:
            printf("DLL_THREAD_DETACH\n");
            return false;
 
        default:
    }
    g_hInst = hInstance;
    return true;
}
 
export void MyDLL_Initialize(void* gc)
{
    printf("MyDLL_Initialize()\n");
    gc_setProxy(gc);
}
 
export void MyDLL_Terminate()
{
    printf("MyDLL_Terminate()\n");
    gc_clrProxy();
}
 
static this()
{
    printf("static this for mydll\n");
}
 
static ~this()
{
    printf("static ~this for mydll\n");
}
 
/* --------------------------------------------------------- */
 
class MyClass
{
    string concat(string a, string b)
    {
        return a ~ " " ~ b;
    }
}
 
export MyClass getMyClass()
{
    return new MyClass();
}


DllMain

This is the main entry point for any D DLL. It gets called by the C startup code (for DMC++, the source is \dm\src\win32\dllstart.c). The printf's are placed there so one can trace how it gets called. Notice that the initialization and termination code seen in the earlier DllMain sample code is in this version as well. This is because the same DLL should be usable from both C and D programs, so the same initialization process should work for both.


MyDLL_Initialize

When the DLL is dynamically linked via Runtime.loadLibrary() the runtime makes sure that any initialization steps required by the D program are executed after the library is loaded. If the library is statically linked, this routine is not called by the program, so to make sure the DLL is initialized properly we have to do some of the work ourselves. And because the library is being statically linked, we need a function specific to this DLL to perform the initialization. This function takes one argument, a handle to the caller's gc. We'll see how that handle is obtained later. To pass this handle to the runtime and override the DLL's built-in gc we'll call gc_setProxy(). The function is exported as that is how a function is made visible outside of a DLL.


MyDLL_Terminate

Correspondingly, this function terminates the DLL, and is called prior to unloading it. It has only one job: informing the runtime that the DLL will no longer be using the caller's gc via gc_clrProxy(). This is critical, as the DLL will be unmapped from memory, and if the gc continues to scan its data areas it will cause segment faults.


static this, static ~this

These are examples of the module's static constructor and destructor, here with a print in each to verify that they are running and when.


MyClass

This is an example of a class that can be exported from and used by the caller of a DLL. The concat member function allocates some gc memory, and free frees gc memory.


getMyClass

An exported factory that allocates an instance of MyClass and returns a reference to it.

To build the mydll.dll DLL:

1. dmd -c mydll -g

Compiles mydll.d into mydll.obj. -g turns on debug info generation.


2. dmd mydll.obj mydll.def -g -L/map

Links mydll.obj into a DLL named mydll.dll. mydll.def is the Module Definition File, and has the contents:

LIBRARY         MYDLL
DESCRIPTION     'MyDll demonstration DLL'
EXETYPE		NT
CODE            PRELOAD DISCARDABLE
DATA            PRELOAD MULTIPLE


-g turns on debug info generation, and -L/map generates a map file mydll.map.


3. implib /noi /system mydll.lib mydll.dll

Creates an import library mydll.lib suitable for linking in with an application that will be statically loading mydll.dll.


Here's test.d, a sample application that makes use of mydll.dll. There are two versions, one statically binds to the DLL, and the other dynamically loads it.

import core.runtime;
import std.stdio;
import core.memory;
 
import mydll;
 
//version=DYNAMIC_LOAD;
 
version (DYNAMIC_LOAD)
{
    import std.c.windows.windows;
 
    alias MyClass function() getMyClass_fp;
 
    int main()
    {
        HMODULE h;
        FARPROC fp;
 
        getMyClass_fp getMyClass;
        MyClass c;
 
        printf("Start Dynamic Link...\n");
 
        h = cast(HMODULE) Runtime.loadLibrary("mydll.dll");
        if (h is null)
        {
            printf("error loading mydll.dll\n");
            return 1;
        }
 
        fp = GetProcAddress(h, "D5mydll10getMyClassFZC5mydll7MyClass");
        if (fp is null)
        {   printf("error loading symbol getMyClass()\n");
            return 1;
        }
 
        getMyClass = cast(getMyClass_fp) fp;
        c = (*getMyClass)();
        foo(c);
 
        if (!Runtime.unloadLibrary(h))
        {   printf("error freeing mydll.dll\n");
            return 1;
        }
 
        printf("End...\n");
        return 0;
    }
}
else
{   // static link the DLL
    extern (C)
    {
        void* gc_getProxy();
    }
 
    int main()
    {
        printf("Start Static Link...\n");
        MyDLL_Initialize(gc_getProxy());
        foo(getMyClass());
        MyDLL_Terminate();
        printf("End...\n");
        return 0;
    }
}
 
void foo(MyClass c)
{
    string s = c.concat("Hello", "world!");
    writeln(s);
}

Let's start with the statically linked version, which is simpler. It's compiled and linked with the command:

C:>dmd test mydll.lib -g

Note how it is linked with mydll.lib, the import library for mydll.dll. The code is straightforward, it initializes mydll.lib with a call to MyDLL_Initialize(), passing the handle to test.exe's gc. Then, we can use the DLL and call its functions just as if it were part of test.exe. In foo(), gc memory is allocated and freed both by test.exe and mydll.dll. When we're done using the DLL, it is terminated with MyDLL_Terminate().

Running it looks like this:

C:>test
DLL_PROCESS_ATTACH
Start Static Link...
MyDLL_Initialize()
static this for mydll
Hello world!
MyDLL_Terminate()
static ~this for mydll
End...
C:>

The dynamically linked version is a little harder to set up. Compile and link it with the command:

C:>dmd test -version=DYNAMIC_LOAD -g

The import library mydll.lib is not needed. The DLL is loaded with a call to Runtime.loadLibrary(), and each exported function has to be retrieved via a call to GetProcAddress(). An easy way to get the decorated name to pass to GetProcAddress() is to copy and paste it from the generated mydll.map file under the Export heading. Once this is done, we can use the member functions of the DLL classes as if they were part of test.exe. When done, release the DLL with Runtime.unloadLibrary().

Running it looks like this:

C:>test
Start Dynamic Link...
DLL_PROCESS_ATTACH
static this for mydll
Hello world!
static ~this for mydll
DLL_PROCESS_DETACH
End...
C:>



Windows DLL know-how Development