LDC-specific language changes

From D Wiki
Revision as of 15:38, 19 December 2016 by JohanEngelen (Traits)

LDC tries to conform to the D specification as closely as possible. There are a few deviations, and several small extensions, which are documented here.

Violations of the specification

Some parts of the D specification are hard or impossible to implement with LLVM; they are listed here.

Inline assembler

Almost everything works, but there are a few open issues. For instance, the D spec is not at all clear about how asm blocks interact with the normal D code mixed in with them (for example, code between two asm blocks).

Specific issues are:

ret

In short, LLVM inline assembler is not allowed to control program flow outside of the asm blocks. See below for a bit more information.

Gotos into inline assembly

For labels inside inline asm blocks, the D spec says "They can be the target of goto statements." This is not supported at the moment. Basically, LLVM does not allow jumping into or out of an asm block. We work around this for jumps out of asm by converting these branches to assignments to a temporary, which is then used in a switch statement right after the inline asm block to jump to the final destination. The same workaround could be applied to jumps into inline assembly.
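
The jump-out workaround can be pictured in plain D. This is only a conceptual sketch, not the code LDC generates: the rewrite happens on LLVM IR, the asm block is stood in for by a function parameter, and the names JumpTarget and dispatchAfterAsm are hypothetical.

```d
// Conceptual sketch only; all names are hypothetical.
enum JumpTarget { fallThrough, first, second }

string dispatchAfterAsm(JumpTarget requested)
{
    // Inside the asm block, a branch to an outside label is replaced by a
    // store of a marker value into a temporary. Here the marker simply
    // arrives as a parameter.
    JumpTarget target = requested;

    // Right after the asm block, a switch performs the actual jump.
    final switch (target)
    {
        case JumpTarget.fallThrough: break;
        case JumpTarget.first:       goto atFirst;
        case JumpTarget.second:      goto atSecond;
    }
    return "fell through";

atFirst:
    return "reached first label";
atSecond:
    return "reached second label";
}
```

The switch keeps all control flow in ordinary code, so the asm block itself never transfers control to the outside.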

Deviations from the D ABI

The D spec only specifies an ABI for x86 processors on Windows and Linux. On other architectures and platforms LDC is free to do as it pleases, and does. However, on x86 the only parts of the ABI currently implemented are:

  • the callee clears any parameters from the stack
  • floating point values are returned on the x87 FPU stack
  • reversing order of parameters
  • returning delegates and dynamic arrays in EAX/EDX
  • passing last argument in EAX
  • returning small structs in EAX

Extended inline assembly

LDC supports an LLVM-specific variant of GCC's extended inline assembly expressions. See the inline assembly expressions page for more information.

Inline LLVM IR

LDC supports "inlining" LLVM IR. See the LDC inline IR page for more information.

Versions

Besides the predefined versions from the D spec (in particular, of course, LDC), LDC conditionally defines a few more version identifiers for backwards compatibility reasons: LLVM, LLVM64, Thumb, mingw32, darwin, solaris. Please migrate your code to the official identifiers; the old ones might go away soon.

An LDC_LLVM_* version identifier is defined corresponding to the LLVM version used to build LDC, e.g. LDC_LLVM_308 for LLVM 3.8.*.
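
As a small illustration, code can branch on these identifiers at compile time. The sketch below assumes nothing beyond the identifiers named above; llvmInfo is a hypothetical name used only for this example.

```d
// Picks a string at compile time depending on the compiler and on the
// LLVM version LDC was built with. `llvmInfo` is a hypothetical name.
version (LDC)
{
    version (LDC_LLVM_308)
        enum llvmInfo = "LDC built against LLVM 3.8";
    else
        enum llvmInfo = "LDC built against another LLVM version";
}
else
{
    enum llvmInfo = "not compiled with LDC";
}
```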

Attributes

LDC provides a number of extra special attributes (UDAs) that can be applied to functions and variables and that are recognized by LDC for specific compiler behavior. The attributes are defined in the ldc.attributes module:

import ldc.attributes;
@(ldc.attributes.section(".mySection")) int global;


@(ldc.attributes.allocSize)

Applies to: functions only. (LDC >= 1.2.0)

Specifies that the function returns null or a pointer to at least a certain number of allocated bytes. Parameters sizeArgIdx and numArgIdx specify the 0-based index of the function arguments that should be used to calculate the number of bytes returned:

  bytes = arg[sizeArgIdx] * ((numArgIdx >= 0) ? arg[numArgIdx] : 1)
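
As a worked illustration of this computation, the following sketch multiplies the size argument by the count argument when a count index is given, and by 1 otherwise. The helper allocSizeBytes is hypothetical and not part of ldc.attributes; treating a negative numArgIdx as "no count argument" is an assumption of this sketch.

```d
// Hypothetical helper mirroring the size computation; not an LDC API.
// Assumption: a negative numArgIdx means "no count argument".
size_t allocSizeBytes(const size_t[] args, int sizeArgIdx, int numArgIdx = -1)
{
    immutable size_t count = (numArgIdx >= 0) ? args[numArgIdx] : 1;
    return args[sizeArgIdx] * count;
}
```

For a function annotated with @allocSize(0), a call with 100 as its first argument gives allocSizeBytes([100], 0), i.e. 100 bytes. For @allocSize(0,1) and arguments (8, 16), it gives 8 * 16 = 128 bytes.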

The optimizer may assume that an @allocSize function has no other side effects and that it is valid to eliminate calls to the function if the allocated memory is not used. The optimizer will eliminate all code from foo in this example:

@allocSize(0) void* myAlloc(size_t size);
void foo() {
    auto p = cast(ubyte*) myAlloc(100); // cast so the memory can be indexed
    p[0] = 1;
}

See LLVM LangRef "allocsize" for more details.

This attribute has no effect for LLVM < 3.9.

Example:

import ldc.attributes;
@allocSize(0) extern(C) void* malloc(size_t size);
@allocSize(2,1) extern(C) void* reallocarray(void *ptr, size_t nmemb,
                                             size_t size);
@allocSize(0,1) void* my_calloc(size_t element_size, size_t count,
                                bool irrelevant);

@(ldc.attributes.fastmath)

Applies to: functions only. (LDC >= 1.1.0)

Sets "fast math" for a function, enabling aggressive math optimizations.

The fast math optimizations may dramatically change the outcome of floating point calculations (e.g. because of reassociation).

Example:

@fastmath
double dot(double[] a, double[] b) {
    double s = 0;
    foreach(size_t i; 0..a.length)
    {
        // will result in vectorized fused-multiply-add instructions
        s += a[i] * b[i];
    }
    return s;
}

@(ldc.attributes.llvmAttr("key", "value"))

Applies to: functions only

Adds an LLVM attribute to a function, without checking the validity or applicability of the attribute. The attribute is specified as key-value pair. If the value string is empty, just the key is added as attribute.

The attribute is applied directly to the generated LLVM IR and should only be used when absolutely necessary. LDC cannot make any guarantee about whether the attribute is valid, whether it works as expected or not, or what effects it will have on any part of the program.

Example:

@llvmAttr("unsafe-fp-math", "true")
double dot(double[] a, double[] b) {
    double s = 0;
    foreach(size_t i; 0..a.length)
    {
        s = inlineIR!(`
        %p = fmul fast double %0, %1
        %r = fadd fast double %p, %2
        ret double %r`, double)(a[i], b[i], s);
    }
    return s;
}

@(ldc.attributes.optStrategy("strategy"))

Applies to: functions only

Sets the optimization strategy for a function. Valid strategies are "minsize", "none", and "optsize". The strategies are mutually exclusive.

@optStrategy("none") in particular is useful for selectively debugging functions when a fully unoptimized program cannot be used (e.g. because it would be too slow).

Strategies:

"minsize"
Tries to keep the code size of the function low and does optimizations to reduce code size that may reduce runtime performance.
"none"
Disables most optimizations for a function.
It implies pragma(inline, false): the function is never inlined in a calling function, and the attribute cannot be combined with pragma(inline, true). Functions with pragma(inline, true) are still candidates for inlining into the function.
"optsize"
Tries to keep the code size of the function low and does optimizations to reduce code size as long as they do not significantly impact runtime performance.

Example:

@optStrategy("none")
void foo() {
    // ...
}

@(ldc.attributes.section("section_name"))

Applies to: functions and global variables

This attribute overrides the default section for functions and variables and allows you to explicitly name the section ("section_name") into which the function or variable is placed (variables must be global).

Example:

void foo() {/+...+/}
@(section(".interrupt_vector"))
void function()[3] table_interrupt_vector = [ &foo,
                                              &foo,
                                              &foo ];

@(ldc.attributes.target("feature"))

Applies to: functions only

This attribute adds extra target features to the ones specified on the commandline. The recognized features depend on the LLVM version.

The passed string should be a comma-separated list of options. The options are passed to LLVM by adding them to the "target-features" function attribute, after minor processing: negative options (e.g. "no-sse") have the "no" stripped (--> "-sse"), whereas positive options (e.g. "sse") gain a leading "+" (--> "+sse"). Negative options override positive options regardless of their order. The "arch=..." option is a special case and is passed to LLVM via the "target-cpu" function attribute.

@target mimics GCC's __attribute__((target("..."))), and the recognized feature strings are very similar.
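
The string processing described above can be sketched in plain D. This is not LDC's implementation: TargetAttrs and processTargetString are hypothetical names, and emitting negative options after positive ones is only one assumed way of letting negatives take precedence.

```d
import std.algorithm.searching : startsWith;
import std.array : join, split;

// Hypothetical result type; its fields mirror the two LLVM function
// attributes named in the text.
struct TargetAttrs
{
    string targetCpu;      // value for "target-cpu", empty without arch=...
    string targetFeatures; // value for "target-features"
}

TargetAttrs processTargetString(string options)
{
    TargetAttrs result;
    string[] positive, negative;
    foreach (opt; options.split(","))
    {
        if (opt.startsWith("arch="))
            result.targetCpu = opt["arch=".length .. $];
        else if (opt.startsWith("no-"))
            negative ~= opt["no".length .. $]; // "no-sse" -> "-sse"
        else if (opt.length)
            positive ~= "+" ~ opt;             // "sse" -> "+sse"
    }
    // Assumption of this sketch: negatives are emitted last so they win.
    result.targetFeatures = (positive ~ negative).join(",");
    return result;
}
```

With this sketch, the string "no-sse,avx,arch=haswell" yields the target CPU "haswell" and the feature string "+avx,-sse".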

Examples:

 @target("no-sse")
 void foo_nosse(float *A, float* B, float K, uint n) {
     for (int i = 0; i < n; ++i)
         A[i] *= B[i] + K;
 }
 @target("arch=haswell")
 void foo_haswell(float *A, float* B, float K, uint n) {
     for (int i = 0; i < n; ++i)
         A[i] *= B[i] + K;
 }

@(ldc.attributes.weak)

Applies to: functions and global variables

When applied to a global symbol, specifies that the symbol should be emitted with weak linkage. A weak symbol can be thought of as a "default implementation" of that symbol, that is overridable by user code. An example use case is a library function that should be overridable by user code.

"Note that weak linkage does not actually allow the optimizer to inline the body of this function into callers because it doesn’t know if this definition of the function is the definitive definition within the program or whether it will be overridden by a stronger definition." -- the LLVM Language Reference

Example:
In a library:

 // When allocate_memory(...) is used in the library it remains overridable by user code
 // such that allocations *within* the library are also done by the user's version.
 // If the user does not provide allocate_memory, the library's version is used.
 @weak void* allocate_memory(size_t size)
 {
     return malloc(size);
 }

In user code:

 // Override default memory allocation (note: no @weak)
 void* allocate_memory(size_t size)
 {
     return my_own_malloc(size);
 }

Pragmas

LDC provides pragmas that give access to internal functions and that can be used to tweak certain behavior.

LDC_intrinsic

The LDC_intrinsic pragma provides access to LLVM's built-in intrinsic functions. It requires a single string literal parameter with the full name of the intrinsic, for example "llvm.sqrt.f32".

  • It can only be used on function declarations or function template declarations.
  • Any affected function declarations are not allowed to have bodies.
  • The functions must translate to the same signature as the intrinsic.
  • You may not take the address of intrinsics.

Example:

// provide square root intrinsics
pragma(LDC_intrinsic, "llvm.sqrt.f32")
  float sqrt(float);
pragma(LDC_intrinsic, "llvm.sqrt.f64")
  double sqrt(double);
pragma(LDC_intrinsic, "llvm.sqrt.f80")
  real sqrt(real); // x86 only

Overloaded intrinsics can be accessed more easily with a templated version. Currently only one overloaded type is supported.

Example:

// templated atomic swap intrinsic
pragma(LDC_intrinsic, "llvm.atomic.swap.i#.p0i#")
    T llvm_atomic_swap(T)(T* ptr, T val);

The # mark in the name is replaced with the size in bits of the type of the template parameter.
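
The substitution can be mimicked with a small helper. This sketch only illustrates the naming rule for integer types; expandIntrinsicName is a hypothetical name, and LDC performs the replacement internally rather than through any such function.

```d
import std.array : replace;
import std.conv : to;

// Hypothetical helper: replaces every '#' with the bit width of T.
string expandIntrinsicName(T)(string pattern)
{
    return pattern.replace("#", to!string(T.sizeof * 8));
}
```

For T = int, the pattern "llvm.atomic.swap.i#.p0i#" expands to "llvm.atomic.swap.i32.p0i32".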

The LDC_intrinsic pragma should not be used directly in user code. Instead, please refer to the ldc.intrinsics module.
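
A minimal sketch of going through ldc.intrinsics instead of the pragma, assuming the module exposes the square root intrinsic as llvm_sqrt. The wrapper name root is hypothetical, and the std.math fallback is only there so the example also compiles with other D compilers.

```d
version (LDC)
{
    // Assumes ldc.intrinsics provides llvm_sqrt.
    import ldc.intrinsics : llvm_sqrt;

    float root(float x) { return llvm_sqrt(x); }
}
else
{
    import std.math : sqrt;

    // Fallback for compilers without ldc.intrinsics.
    float root(float x) { return sqrt(x); }
}
```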

LDC_no_typeinfo

You can use this pragma to stop typeinfo from being implicitly generated for a declaration.

Example:

pragma(LDC_no_typeinfo) {
    struct Opaque {}
}

LDC_no_moduleinfo

This pragma disables the generation of the ModuleInfo metadata to register the current module with druntime. Note that this, among other things, leads to any static constructors not being run, and should only be used in very specific circumstances.

Example:

module my_bare_metal_module;
pragma(LDC_no_moduleinfo);

LDC_alloca

This pragma allows you to access the alloca instruction of LLVM directly. It only applies to function declarations and the final LLVM type for that declaration must be: i8* (i32/i64). The size parameter will be truncated to i32 if necessary.

Example:

pragma(LDC_alloca) void* alloca(size_t);

Variadic argument handling intrinsics

Example:

alias void* va_list;

pragma(LDC_va_start) void va_start(T)(va_list ap, ref T);

pragma(LDC_va_arg) T va_arg(T)(va_list ap);

pragma(LDC_va_end) void va_end(va_list args);

pragma(LDC_va_copy) void va_copy(va_list dst, va_list src);

LDC_allow_inline

Use this pragma statement inside a non-naked function using inline asm. This will tell the optimizers that it is safe to inline this function.

Example:

int add(int a, int b) {
    pragma(LDC_allow_inline);
    asm { mov EAX, a; add EAX, b; }
}

LDC_never_inline

Use of this pragma statement will tell the optimizers that this function should never be inlined.

Example:

import ldc.intrinsics;
void* getStackTop() {
    pragma(LDC_never_inline);
    return ldc.intrinsics.llvm_frameaddress(0);
}

LDC_inline_ir

This pragma makes it possible to use LLVM assembly language from D. See the LDC inline IR page for more information.

LDC_global_crt_ctor and LDC_global_crt_dtor

If you are doing very low-level work, you may need to execute code before the D runtime is initialized. With these two pragmas it is possible to run code as part of the C runtime construction and destruction. A possible application is the initialization of a global mutex, as is done in monitor_.d. If the pragma is specified on a function or static method, an entry is made in the corresponding list. E.g. in monitor_.d:

    extern (C) {
        pragma(LDC_global_crt_ctor, 1024)
        void _STI_monitor_staticctor()
        {
            // ...
        }
    }

The optional priority is used to order the execution of the functions. (For more information see the LLVM documentation on global ctors and global dtors variables.)

This works on Linux without problems. On Windows with the MS C runtime, ctors always work, but dtors are invoked only when linking against the static C runtime. Dtors on Windows require at least LLVM 3.2.

Traits

targetCPU

(LDC >= 1.1.0)

This trait returns a string describing the CPU that LDC is targeting. The CPU name is what LLVM uses and is the same as what can be passed with the -mcpu= commandline option.
Refer to LLVM's documentation for a list of CPU names.
Note that the string is a compile-time constant (in contrast with run-time information obtainable from e.g. CPUID).

Example:

import std.stdio;
void main() {
  writeln("CPU = ", __traits(targetCPU));
}

which prints "CPU = haswell" when compiled with -mcpu=native on my machine.

targetHasFeature

(LDC >= 1.1.0)

This trait returns a boolean value that indicates whether the target CPU supports a certain feature. The trait only returns true when it is certain that the feature is supported. If false is returned, the feature may or may not be present. The spelling of the feature has to be exactly according to LLVM's convention for feature names (called "attributes", passed on the commandline using -mattr=).
Note that this is compile-time information (in contrast with run-time information obtainable from e.g. CPUID).

Example:

import std.stdio;
void main() {
  writeln("CPU = ", __traits(targetCPU));
  writeln("Has 'sse3' = ", __traits(targetHasFeature, "sse3"));
  writeln("Has 'sse4' = ", __traits(targetHasFeature, "sse4"));
  writeln("Has 'sse4.1' = ", __traits(targetHasFeature, "sse4.1"));
}

which prints this on my machine:

   ❯ ldc2 -mcpu=native -run target_traits.d
   CPU = haswell
   Has 'sse3' = true
   Has 'sse4' = false
   Has 'sse4.1' = true
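
Because the result is a compile-time constant, it can also select a code path with static if. The following is a minimal sketch: useSSE41 and describeKernel are hypothetical names, and the version (LDC) guard is only there so the example also compiles with other D compilers, where the trait does not exist.

```d
// Hypothetical names; the trait itself is LDC-specific.
version (LDC)
    enum bool useSSE41 = __traits(targetHasFeature, "sse4.1");
else
    enum bool useSSE41 = false;

string describeKernel()
{
    // Only one of the two branches is compiled into the function.
    static if (useSSE41)
        return "SSE4.1 code path";
    else
        return "generic code path";
}
```

Since false only means that the feature is not known to be present, the generic branch has to be a safe fallback rather than a statement that the feature is missing.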