LDC-specific language changes
Revision as of 12:22, 25 January 2019
LDC tries to conform to the D specification as closely as possible. There are a few deviations, and several small extensions, which are documented here.
Contents
- 1 Violations of the specification
- 2 Extended inline assembly
- 3 Inline LLVM IR
- 4 Versions
- 5 Attributes
- 5.1 @(ldc.attributes.allocSize)
- 5.2 @(ldc.attributes.dynamicCompile)
- 5.3 @(ldc.attributes.dynamicCompileConst)
- 5.4 @(ldc.attributes.dynamicCompileEmit)
- 5.5 @(ldc.attributes.fastmath)
- 5.6 @(ldc.attributes.llvmAttr("key", "value"))
- 5.7 @(ldc.attributes.llvmFastMathFlag("flag"))
- 5.8 @(ldc.attributes.naked)
- 5.9 @(ldc.attributes.optStrategy("strategy"))
- 5.10 @(ldc.attributes.polly)
- 5.11 @(ldc.attributes.section("section_name"))
- 5.12 @(ldc.attributes.target("feature"))
- 5.13 @(ldc.attributes.weak)
- 6 Pragmas
- 7 Traits
Violations of the specification
Some parts of the D specification are hard or impossible to implement with LLVM. They are listed here.
Inline assembler
Almost everything works, but there are a few open issues. For instance, the D spec is not at all clear about how asm blocks interact with the normal D code they are mixed with (for example, code between two asm blocks).
Specific issues are:
ret
In short, LLVM inline assembler is not allowed to control program flow outside of the asm blocks. See below for a bit more information.
Gotos into inline assembly
For labels inside inline asm blocks, the D spec says "They can be the target of goto statements." This is not supported at the moment. Basically, LLVM does not allow jumping into or out of an asm block. We work around this for jumps out of asm by converting these branches to assignments to a temporary, which is then used in a switch statement right after the inline asm block to jump to the final destination. The same workaround could be applied to jumps into inline assembly.
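The workaround for jumps out of an asm block can be pictured with ordinary D code. The following is only a sketch of the idea, not what LDC actually generates: the function name classify, the labels, and the plain if/else standing in for the asm block are all hypothetical.

```d
// Sketch of the "temporary plus switch" idea described above.
// An if/else chain stands in for the inline asm block (hypothetical).
int classify(int input)
{
    int target = 0; // temporary written where the asm would have branched

    // Stand-in for the asm block: record the destination instead of jumping.
    if (input < 0)
        target = 1;
    else if (input == 0)
        target = 2;

    // Right after the block, dispatch to the final destination.
    switch (target)
    {
        case 1: goto Lnegative;
        case 2: goto Lzero;
        default: break;
    }
    return 1;

Lnegative:
    return -1;
Lzero:
    return 0;
}
```

All control flow leaves through the switch, so the block itself never branches to an outside label.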
Deviations from the D ABI
The D spec only specifies an ABI for x86 processors on Windows and Linux. On other architectures and platforms LDC is free to do as it pleases, and does. However, on x86 the only parts of the ABI currently implemented are:
- the callee clears any parameters from the stack
- floating point values are returned on the x87 FPU stack
- reversing order of parameters
- returning delegates and dynamic arrays in EAX/EDX.
- passing last argument in EAX
- returning small structs in EAX
Extended inline assembly
LDC supports an LLVM-specific variant of GCC's extended inline assembly expressions. See the inline assembly expressions page for more information.
Inline LLVM IR
LDC supports "inlining" LLVM IR. See the LDC inline IR page for more information.
Versions
Besides the predefined versions from the D spec (in particular, of course, LDC), LDC conditionally defines a few more version identifiers for backwards compatibility reasons: LLVM, LLVM64, Thumb, mingw32, darwin, solaris. Please migrate your code to the official identifiers; the old ones might go away soon.
An LDC_LLVM_* version identifier is defined corresponding to the LLVM version used to build LDC, e.g. LDC_LLVM_308 for LLVM 3.8.*.
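As a minimal sketch, the version identifiers can be used for conditional compilation like any other D version. The function name compilerInfo is made up for illustration; LDC_LLVM_308 is the identifier given above for LLVM 3.8.

```d
// Picks a description at compile time based on predefined version identifiers.
string compilerInfo()
{
    version (LDC)
    {
        version (LDC_LLVM_308)
            return "LDC built against LLVM 3.8";
        else
            return "LDC built against another LLVM version";
    }
    else
    {
        return "not LDC";
    }
}
```

Because the branches are selected at compile time, code under an identifier that is not defined is parsed but never compiled in.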
Attributes
LDC provides a number of extra special attributes (UDAs) that can be applied to functions and variables and that are recognized by LDC for specific compiler behavior. The attributes are defined in the ldc.attributes module:
import ldc.attributes;
@(ldc.attributes.section(".mySection")) int global;
@(ldc.attributes.allocSize)
Applies to: functions only. (LDC >= 1.2.0)
Specifies that the function returns null or a pointer to at least a certain number of allocated bytes. Parameters sizeArgIdx and numArgIdx specify the 0-based index of the function arguments that should be used to calculate the number of bytes returned:
bytes = arg[sizeArgIdx] * ((numArgIdx < 0) ? 1 : arg[numArgIdx])
The optimizer may assume that an @allocSize function has no other side effects and that it is valid to eliminate calls to the function if the allocated memory is not used. The optimizer will eliminate all code from foo in this example:
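import ldc.attributes;
@allocSize(0) void* myAlloc(size_t size);
void foo() {
    // Cast to a typed pointer so that the store below compiles.
    auto p = cast(ubyte*) myAlloc(100);
    p[0] = 1; // the memory is never read, so the call may be eliminated
}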
See LLVM LangRef "allocsize" for more details.
This attribute has no effect for LLVM < 3.9.
Example:
import ldc.attributes;
@allocSize(0) extern(C) void* malloc(size_t size);
@allocSize(2,1) extern(C) void* reallocarray(void *ptr, size_t nmemb,
size_t size);
@allocSize(0,1) void* my_calloc(size_t element_size, size_t count,
bool irrelevant);
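The byte-count rule can be modelled in plain D to check how the index parameters are read. This is only an illustrative sketch: the helper allocatedBytes is hypothetical, and it assumes that a negative numArgIdx means "no count argument", which is consistent with @allocSize(0) in the examples above.

```d
// Models the @allocSize rule: bytes = size argument * optional count argument.
// `args` holds the call's argument values; a negative numArgIdx is assumed
// to mean that there is no count argument.
size_t allocatedBytes(const size_t[] args, int sizeArgIdx, int numArgIdx = -1)
{
    immutable size_t count = (numArgIdx < 0) ? 1 : args[numArgIdx];
    return args[sizeArgIdx] * count;
}
```

For the reallocarray declaration above, @allocSize(2,1) reads size from argument 2 and nmemb from argument 1, so a call with nmemb = 4 and size = 8 is treated as returning at least 32 bytes.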
@(ldc.attributes.dynamicCompile)
Applies to: functions only
When applied to a function, marks this function for dynamic compilation. All functions marked with this attribute must be explicitly compiled at run time via the ldc.dynamic_compile API before use.
This attribute has no effect if dynamic compilation wasn't enabled with -enable-dynamic-compile.
Example:
import ldc.attributes;
import ldc.dynamic_compile; // provides compileDynamicCode()
@dynamicCompile int foo() { return 42; }
...
// Somewhere at startup
compileDynamicCode();
...
int val = foo();
@(ldc.attributes.dynamicCompileConst)
Applies to: global variables
When applied to a global variable, this variable will be treated as constant by any dynamically compiled functions and is subject to optimizations. All dynamically compiled functions must be recompiled after any update to such a variable.
This attribute has no effect if dynamic compilation wasn't enabled with -enable-dynamic-compile.
Example:
import ldc.attributes;
import ldc.dynamic_compile; // provides compileDynamicCode()
@dynamicCompileConst __gshared int value;
@dynamicCompile int foo() { return value * 42; }
...
// Somewhere at startup
value = 5;
compileDynamicCode();
...
int val = foo();
@(ldc.attributes.dynamicCompileEmit)
Applies to: functions only. (LDC >= 1.11.0)
When applied to a function, makes this function available for dynamic compilation. In contrast to @dynamicCompile, calls to the function go to the statically compiled version (as with normal functions). The function body is made available for dynamic compilation with the JIT facilities (e.g. jit bind). If both @dynamicCompile and @dynamicCompileEmit are applied to a function, @dynamicCompile takes precedence.
This attribute has no effect if dynamic compilation wasn't enabled with -enable-dynamic-compile.
Example:
import ldc.attributes;
@dynamicCompileEmit int foo() { return 42; }
...
int val = foo(); // compileDynamicCode() is not needed for a direct call
...
@(ldc.attributes.fastmath)
Applies to: functions only. (LDC >= 1.1.0)
Sets "fast math" for a function, enabling aggressive math optimizations.
The fast math optimizations may dramatically change the outcome of floating point calculations (e.g. because of reassociation).
Example:
@fastmath
double dot(double[] a, double[] b) {
    double s = 0;
    foreach(size_t i; 0..a.length)
    {
        // will result in vectorized fused-multiply-add instructions
        s += a[i] * b[i];
    }
    return s;
}
@(ldc.attributes.llvmAttr("key", "value"))
Applies to: functions only
Adds an LLVM attribute to a function, without checking the validity or applicability of the attribute. The attribute is specified as key-value pair. If the value string is empty, just the key is added as attribute.
The attribute is applied directly to the generated LLVM IR and should only be used when absolutely necessary. LDC cannot make any guarantee about whether the attribute is valid, whether it works as expected or not, or what effects it will have on any part of the program.
Example:
@llvmAttr("unsafe-fp-math", "true")
double dot(double[] a, double[] b) {
double s = 0;
foreach(size_t i; 0..a.length)
{
s = inlineIR!(`
%p = fmul fast double %0, %1
%r = fadd fast double %p, %2
ret double %r`, double)(a[i], b[i], s);
}
return s;
}
@(ldc.attributes.llvmFastMathFlag("flag"))
Applies to: functions only. (LDC >= 1.1.0)
Sets an LLVM fast-math flag for all floating point operations in the function this attribute is applied to. See the LLVM LangRef for possible values: http://llvm.org/docs/LangRef.html#fast-math-flags
@llvmFastMathFlag("clear") clears all flags.
Example:
@llvmFastMathFlag("contract")
double fma(double a, double b, double c)
{
return a * b + c;
}
@(ldc.attributes.naked)
Applies to: functions only. (LDC >= 1.11.0)
Adds LLVM's "naked" attribute to a function, disabling function prologue and epilogue emission, incl. LDC's. Intended to be used in combination with a function body defined via ldc.llvmasm.__asm() and/or ldc.simd.inlineIR().
Example:
import ldc.attributes;
import ldc.llvmasm;
void* getStackTop() nothrow @nogc @naked
{
return __asm!(void*)("movl %esp, $0", "=r");
}
@(ldc.attributes.optStrategy("strategy"))
Applies to: functions only. (LDC >= 1.1.0)
Sets the optimization strategy for a function. Valid strategies are "minsize", "none", and "optsize". The strategies are mutually exclusive.
@optStrategy("none") in particular is useful to selectively debug functions when a fully unoptimized program cannot be used (e.g. due to too low performance).
Strategies:
- "minsize"
- Tries to keep the code size of the function low and does optimizations to reduce code size that may reduce runtime performance.
- "none"
- Disables most optimizations for a function.
- It implies pragma(inline, false): the function is never inlined in a calling function, and the attribute cannot be combined with pragma(inline, true). Functions with pragma(inline, true) are still candidates for inlining into the function.
- "optsize"
- Tries to keep the code size of the function low and does optimizations to reduce code size as long as they do not significantly impact runtime performance.
Example:
@optStrategy("none")
void foo() {
// ...
}
@(ldc.attributes.polly)
Applies to: functions only
Experimental! (not guaranteed to work). Requires an LDC built against LLVM built with Polly. Enables advanced polyhedral optimisations on the applied function.
Example:
@polly void someExpensiveFunction(double[] d) {/+...+/}
@(ldc.attributes.section("section_name"))
Applies to: functions and global variables
This attribute overrides the default section for functions and variables, and allows you to explicitly name the section ("section_name") into which the function or variable is placed (variables must be global).
Example:
void foo() {/+...+/}
@(section(".interrupt_vector"))
void function()[3] table_interrupt_vector = [ &foo,
&foo,
&foo ];
@(ldc.attributes.target("feature"))
Applies to: functions only
This attribute adds extra target features to the ones specified on the commandline. The recognized features depend on the LLVM version.
The passed string should be a comma-separated list of options. The options are passed to LLVM by adding them to the "target-features" function attribute, after minor processing: negative options (e.g. "no-sse") have the "no" stripped (--> "-sse"), whereas positive options (e.g. "sse") gain a leading "+" (--> "+sse"). Negative options override positive options regardless of their order. The "arch=..." option is a special case and is passed to LLVM via the "target-cpu" function attribute.
@target mimics GCC's __attribute__((target("..."))) and the recognized feature strings are very alike.
Examples:
@target("no-sse")
void foo_nosse(float *A, float* B, float K, uint n) {
for (int i = 0; i < n; ++i)
A[i] *= B[i] + K;
}
@target("arch=haswell")
void foo_haswell(float *A, float* B, float K, uint n) {
for (int i = 0; i < n; ++i)
A[i] *= B[i] + K;
}
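The string processing described above can be sketched in plain D. This is a simplified model, not LDC's implementation: the TargetAttrs struct and the translate function are hypothetical names, and the sketch leaves out the rule that negative options override positive ones.

```d
import std.algorithm.searching : startsWith;
import std.array : split;
import std.string : strip;

// Hypothetical container for the two LLVM function attributes involved.
struct TargetAttrs
{
    string cpu;        // models "target-cpu"
    string[] features; // models the entries added to "target-features"
}

// Translates a @target-style option list into LLVM-style attribute values.
TargetAttrs translate(string spec)
{
    TargetAttrs result;
    foreach (raw; spec.split(","))
    {
        auto opt = raw.strip;
        if (opt.length == 0)
            continue;
        if (opt.startsWith("arch="))
            result.cpu = opt["arch=".length .. $]; // special case
        else if (opt.startsWith("no-"))
            result.features ~= opt[2 .. $];        // "no-sse" becomes "-sse"
        else
            result.features ~= "+" ~ opt;          // "sse" becomes "+sse"
    }
    return result;
}
```

For instance, translate("no-sse,avx,arch=haswell") yields the features "-sse" and "+avx" and the CPU name "haswell".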
@(ldc.attributes.weak)
Applies to: functions and global variables
When applied to a global symbol, specifies that the symbol should be emitted with weak linkage. A weak symbol can be thought of as a "default implementation" of that symbol, that is overridable by user code. An example use case is a library function that should be overridable by user code.
"Note that weak linkage does not actually allow the optimizer to inline the body of this function into callers because it doesn’t know if this definition of the function is the definitive definition within the program or whether it will be overridden by a stronger definition." -- the LLVM Language Reference
Example:
In a library:
// When allocate_memory(...) is used in the library it remains overridable by user code
// such that allocations *within* the library are also done by the user's version.
// If the user does not provide allocate_memory, the library's version is used.
@weak void* allocate_memory(size_t size)
{
return malloc(size);
}
In user code:
// Override default memory allocation (note: no @weak)
void* allocate_memory(size_t size)
{
return my_own_malloc(size);
}
Pragmas
LDC provides pragmas to tweak certain behavior.
LDC_alloca
This pragma allows you to access the alloca instruction of LLVM directly. It only applies to function declarations and the final LLVM type for that declaration must be: i8* (i32/i64). The size parameter will be truncated to i32 if necessary.
Example:
pragma(LDC_alloca) void* alloca(size_t);
LDC_allow_inline
Use this pragma statement inside a non-naked function using inline asm. This will tell the optimizers that it is safe to inline this function.
Example:
int add(int a, int b) {
pragma(LDC_allow_inline);
asm { mov EAX, a; add EAX, b; }
}
LDC_global_crt_ctor and LDC_global_crt_dtor
If you are doing very low-level work, you may need to execute code before the D runtime is initialized. With these two pragmas it is possible to run code as part of the C runtime construction and destruction. A possible application is the initialization of a global mutex, as is done in monitor_.d. If the pragma is specified on a function or static method, an entry is made in the corresponding list. E.g. in monitor_.d:
extern (C) {
pragma(LDC_global_crt_ctor, 1024)
void _STI_monitor_staticctor()
{
// ...
}
}
The optional priority is used to order the execution of the functions. (For more information see the LLVM documentation on global ctors and global dtors variables.)
This works on Linux without problems. On Windows with the MS C runtime, ctors always work, but dtors are invoked only if the program is linked against the static C runtime. Dtors on Windows require at least LLVM 3.2.
LDC_inline_ir
This pragma makes it possible to use LLVM assembly language from D. See the LDC inline IR page for more information.
LDC_intrinsic
The LDC_intrinsic pragma provides access to LLVM's built-in intrinsic functions. It requires a single string literal parameter with the full name of the intrinsic, for example "llvm.sqrt.f32".
- It can only be used on function declarations or function template declarations.
- Any affected function declarations are not allowed to have bodies.
- The functions must translate to the same signature as the intrinsic.
- You may not take the address of intrinsics.
Example:
// provide square root intrinsics
pragma(LDC_intrinsic, "llvm.sqrt.f32")
float sqrt(float);
pragma(LDC_intrinsic, "llvm.sqrt.f64")
double sqrt(double);
pragma(LDC_intrinsic, "llvm.sqrt.f80")
real sqrt(real); // x86 only
Overloaded intrinsics can also be accessed more easily with a templated version instead. Currently only one overloaded type is supported.
Example:
// templated atomic swap intrinsic
pragma(LDC_intrinsic, "llvm.atomic.swap.i#.p0i#")
T llvm_atomic_swap(T)(T* ptr, T val);
The # mark in the name is replaced with the size in bits of the type of the template parameter.
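The substitution can be illustrated with a small helper in plain D. This is only a model of the naming rule: expandIntrinsicName is a hypothetical function, not part of LDC or druntime.

```d
import std.array : replace;
import std.conv : to;

// Replaces every '#' in the pattern with the size in bits of T.
string expandIntrinsicName(T)(string pattern)
{
    return pattern.replace("#", to!string(T.sizeof * 8));
}
```

With T = int, the pattern "llvm.atomic.swap.i#.p0i#" from the example above expands to "llvm.atomic.swap.i32.p0i32".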
The LDC_intrinsic pragma should not be used directly in user code. Please use the ldc.intrinsics module instead.
LDC_never_inline
Use of this pragma statement will tell the optimizers that this function should never be inlined.
Example:
import ldc.intrinsics; // provides llvm_frameaddress
void* getStackTop() {
    pragma(LDC_never_inline);
    return ldc.intrinsics.llvm_frameaddress(0);
}
LDC_no_moduleinfo
This pragma disables the generation of the ModuleInfo metadata to register the current module with druntime. Note that this, among other things, leads to any static constructors not being run, and should only be used in very specific circumstances.
Example:
module my_bare_metal_module;
pragma(LDC_no_moduleinfo);
LDC_no_typeinfo
You can use this pragma to stop typeinfo from being implicitly generated for a declaration.
Example:
pragma(LDC_no_typeinfo) {
struct Opaque {}
}
LDC_va_* for variadic argument handling intrinsics
Example:
alias void* va_list;
pragma(LDC_va_start) void va_start(T)(va_list ap, ref T);
pragma(LDC_va_arg) T va_arg(T)(va_list ap);
pragma(LDC_va_end) void va_end(va_list args);
pragma(LDC_va_copy) void va_copy(va_list dst, va_list src);
Traits
targetCPU
(LDC >= 1.1.0)
This trait returns a string describing the CPU that LDC is targeting. The CPU name is what LLVM uses and is the same as what can be passed with the -mcpu= commandline option.
Refer to LLVM's documentation for a list of CPU names.
Note that the string is a compile-time constant (in contrast with run-time information obtainable from e.g. CPUID).
Example:
import std.stdio;
void main() {
writeln("CPU = ", __traits(targetCPU));
}
which prints "CPU = haswell" when compiled with -mcpu=native on my machine.
targetHasFeature
(LDC >= 1.1.0)
This trait returns a boolean value that indicates whether the target CPU supports a certain feature. The trait only returns true when it is certain that the feature is supported. If false is returned, the feature may or may not be present.
The spelling of the feature has to be exactly according to LLVM's convention for feature names (called "attributes", passed on the commandline using -mattr=).
Note that this is compile-time information (in contrast with run-time information obtainable from e.g. CPUID).
Example:
import std.stdio;
void main() {
writeln("CPU = ", __traits(targetCPU));
writeln("Has 'sse3' = ", __traits(targetHasFeature, "sse3"));
writeln("Has 'sse4' = ", __traits(targetHasFeature, "sse4"));
writeln("Has 'sse4.1' = ", __traits(targetHasFeature, "sse4.1"));
}
which prints this on my machine:
❯ ldc2 -mcpu=native -run target_traits.d
CPU = haswell
Has 'sse3' = true
Has 'sse4' = false
Has 'sse4.1' = true
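Because the trait is a compile-time constant, it can also drive static if. The sketch below picks a code path by name only. The names hasSse41 and pickKernel are hypothetical, and the fallback for compilers other than LDC is an assumption made so that the example builds everywhere.

```d
version (LDC)
    enum hasSse41 = __traits(targetHasFeature, "sse4.1");
else
    enum hasSse41 = false; // assumption: the trait is unavailable elsewhere

// Selects an implementation at compile time; only one branch is compiled in.
string pickKernel()
{
    static if (hasSse41)
        return "sse4.1";
    else
        return "generic";
}
```

Since false only means "not known to be supported", the generic branch has to be correct on every target.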