LDC inline assembly expressions
Latest revision as of 20:40, 22 April 2016
LDC supports an LLVM-specific variant of GCC's extended inline assembly expressions. They are useful on platforms where the D asm statement is not yet available (i.e. non-x86), or when the limitations of it being a statement are problematic. Being expressions, extended inline assembly expressions can return values.
Additionally, the issues with inlining functions that contain inline asm are mostly not relevant for extended inline assembly expressions. Effectively, extended inline assembly expressions can be used to efficiently implement new intrinsics in the compiler.
Interface
To use them you must import the module containing the magic declarations:
import ldc.llvmasm;
Three different forms exist:
No return value:
void __asm (char[] asmcode, char[] constraints, [ Arguments... ] );
Single return value:
template __asm(T) {
    T __asm (char[] asmcode, char[] constraints, [ Arguments... ] );
}
Multiple return values:
struct __asmtuple_t(T...) {
    T v;
}
template __asmtuple(T...) {
    __asmtuple_t!(T) __asmtuple (char[] asmcode, char[] constraints, [ Arguments... ] );
}
In all cases the constraint list must match the return type and arguments.
The constraint string is a comma-separated list of outputs, inputs and clobbers.
Output constraints must come first, then input constraints, then finally clobbers.
Common output constraints:
- =*m == memory output
- =r == general purpose register output
Common input constraints:
- *m == memory input
- r == general purpose register input
- i == immediate value input
Common clobbers:
- ~{memory} == clobbers memory
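The section on Inline Assembler Expressions (http://llvm.org/docs/LangRef.html#inline-assembler-expressions) in the LLVM Language Reference (http://llvm.org/docs/LangRef.html) has all the details.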
X86-32
X86-32 specific constraints
- a or {ax} or {eax} == EAX
- b or {bx} or {ebx} == EBX
- c or {cx} or {ecx} == ECX
- d or {dx} or {edx} == EDX
- A == EAX:EDX
- {flags} == EFLAGS
- {st} == ST(0)
- {st(N)} == ST(N)
- {fpsw} == floating point status word
Examples
// store val into dst
void store(ref int dst, int val) {
    __asm("movl $1, $0", "=*m,r", &dst, val);
}
// load dst into EAX and return it
int load(ref int dst) {
    return __asm!int("movl $1, $0", "=a,*m", &dst);
}
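The multiple-return form has no example above, so here is a minimal sketch of it. It is not taken from the original page: the helper name swapped is hypothetical, and the code assumes an x86 target, where xchgl exchanges two 32-bit registers. Both inputs are tied to the outputs with the constraints 0 and 1, and the results are read from the v member of the returned struct.

```d
import ldc.llvmasm;

// Hypothetical helper (an illustration, not part of LDC):
// exchange two values in registers and return both results.
uint[2] swapped(uint a, uint b) {
    // $0 and $1 are register outputs; the inputs a and b are tied
    // to them ("0" and "1"), so each starts in its output register.
    auto r = __asmtuple!(uint, uint)("xchgl $0, $1", "=r,=r,0,1", a, b);
    return [r.v[0], r.v[1]];
}

unittest {
    auto s = swapped(1, 2);
    assert(s[0] == 2);
    assert(s[1] == 1);
}
```

The outputs come first in the constraint string, followed by the tied inputs, which matches the ordering rule from the Interface section.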
X86-64
Examples
// write system call
ulong sys_write(long arg1, in void* arg2, long arg3) {
    // The number of the syscall must be passed in rax (write = 1)
    // Returning from the syscall rax contains the result
    // The kernel clobbers rcx and r11
    // The kernel reads the buffer through the pointer, hence ~{memory}
    return __asm!ulong
    (
        "syscall",
        "={rax},{rax},{rdi},{rsi},{rdx},~{rcx},~{r11},~{memory}",
        1, arg1, arg2, arg3
    );
}
See the X86-64 ABI specification (http://www.x86-64.org/documentation/abi.pdf).
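The same pattern works for system calls without arguments. The following sketch is not from the original page; it assumes Linux on x86-64, where getpid has the syscall number 39, and the function name sys_getpid is hypothetical. The result is compared against the libc wrapper from druntime as a sanity check.

```d
import ldc.llvmasm;

// Hypothetical wrapper, assuming Linux on x86-64 (getpid = 39).
long sys_getpid() {
    // rax carries the syscall number in and the result out;
    // the kernel clobbers rcx and r11.
    return __asm!long("syscall",
                      "={rax},{rax},~{rcx},~{r11},~{memory}",
                      39L);
}

unittest {
    import core.sys.posix.unistd : getpid;
    assert(sys_getpid() == getpid());
}
```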
ARM
- {cpsr} == current program status register
Examples
The following code adds two multibyte numbers and a carry and returns the final carry. It uses an interface similar to multibyteAddSub() in biguintnoasm.d (https://github.com/dlang/phobos/blob/master/std/internal/math/biguintnoasm.d).
uint multibyteAdd(uint[] dest, const(uint)[] src1,
                  const(uint)[] src2, uint carry) pure @nogc nothrow
{
    assert(carry == 0 || carry == 1);
    assert(src1.length >= dest.length && src2.length >= dest.length);
    return __asm!uint(`  cmp $2,#0                    @ Check dest.length
                         beq 1f
                         mov r5,#0                    @ Initialize index
                       2:
                         ldr r6,[${3:m},r5,LSL #2]    @ Load *(src1.ptr + index)
                         ldr r7,[${4:m},r5,LSL #2]    @ Load *(src2.ptr + index)
                         lsrs $0,$0,#1                @ Set carry
                         adcs r6,r6,r7                @ Add with carry
                         str r6,[${1:m},r5,LSL #2]    @ Store *(dest.ptr + index)
                         adc $0,$0,#0                 @ Store carry
                         add r5,r5,#1                 @ Increment index
                         cmp $2,r5
                         bhi 2b
                       1:`,
                      "=&r,=*m,r,*m,*m,0,~{r5},~{r6},~{r7},~{cpsr}",
                      dest.ptr, dest.length, src1.ptr, src2.ptr, carry);
}
The constraint string in this example is complex.
- A value is returned in a register which is marked as early clobber (=&r). This register is not used for input values.
- The pointers use the memory constraint (=*m and *m). Each pointer is passed in a register which is considered read-only.
- The carry parameter is tied to output parameter 0 and therefore uses the same register.
- The code clobbers some registers and the current program status register.
Please note the following details:
- Even if you tie an input parameter to an output parameter you may need to mark the output register as early clobber. If LLVM can prove that several input parameters always have the same value (e.g. all array lengths and the carry are 1) then it uses the same register for these inputs. This happens even if one of the input parameters is tied to an output parameter! Marking the output as early clobber prevents this.
- You can use only local labels (1:, 2: and so on) and you have to specify the direction (f = forward, b = backward) when you refer to them.
- The parameter for a memory constraint is replaced with a memory access: [ register ]. E.g. the parameter $1 from the example above may be expanded as [r0]. If you need only the register then you must use the modifier m: ${1:m} is expanded as r0. (The register is chosen by LLVM!)
- @ is used to mark a comment. Other architectures use other characters.
PPC 32
- {cc} == condition code register
Examples
// store val into dst, clobbering r4
void store(ref int dst, int val) {
    __asm("ldw 4, $1 ; stw 4, $0", "=*m,r,~{r4}", &dst, val);
}
PPC 64
- {cc} == condition code register
Examples
// returning the floating point status and control register
uint getFPSCR()
{
    double fspr = __asm!double("mffs 0", "={f0}");
    return cast(uint) *cast(ulong*) &fspr;
}
MIPS 64
MIPS assembly language uses $ to denote registers. Because $ also introduces operand references such as $0 in the assembly string, you have to escape a register's $ with a second $.
Examples
// returning stack pointer
void* getStackTop()
{
    return __asm!(void *)("move $0, $$sp", "=r");
}