Difference between revisions of "LDC inline assembly expressions"

From D Wiki
Latest revision as of 20:40, 22 April 2016

LDC supports an LLVM-specific variant of GCC's extended inline assembly expressions. They are useful on platforms where the D asm statement is not yet available (i.e. non-x86), or when the limitations of it being a statement are problematic. Being expressions, extended inline assembly expressions can return values.

Additionally, issues regarding the inlining of functions that contain inline asm are mostly not relevant for extended inline assembly expressions. Effectively, extended inline assembly expressions can be used to efficiently implement new intrinsics in the compiler.

Interface

To use them you must import the module containing the magic declarations:

import ldc.llvmasm;

Three different forms exist:

No return value:

void __asm (char[] asmcode, char[] constraints, [ Arguments... ] );

Single return value:

template __asm(T) {
  T __asm (char[] asmcode, char[] constraints, [ Arguments... ] );
}

Multiple return values:

struct __asmtuple_t(T...) {
  T v;
}
template __asmtuple(T...) {
  __asmtuple_t!(T) __asmtuple (char[] asmcode, char[] constraints, [ Arguments... ] );
}

In all cases the constraint list must match the return type and arguments.
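
As an illustration of the multiple-return form, here is a minimal sketch for x86-64 only. The helper name mulWide is hypothetical and not part of LDC; the sketch assumes the AT&T operand syntax used elsewhere on this page. Each output constraint corresponds to one tuple type, and each input constraint to one argument, in order.

```d
import ldc.llvmasm;

// Hypothetical helper: full 64x64 -> 128 bit unsigned multiply (x86-64 only).
// mulq multiplies RAX by its operand and leaves the low half in RAX
// and the high half in RDX.
ulong[2] mulWide(ulong a, ulong b)
{
    // Operands: $0 = RAX output, $1 = RDX output, $2 = a in RAX, $3 = b in any register.
    // The two "=" outputs match the two tuple types; the two inputs match a and b.
    auto r = __asmtuple!(ulong, ulong)(
        "mulq $3",
        "={rax},={rdx},{rax},r,~{flags}",
        a, b);
    return [r.v[0], r.v[1]]; // [low half, high half]
}
```

The results are read through the v member of the returned __asmtuple_t, indexed in the order of the output constraints.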

The constraint string is a comma-separated list of outputs, inputs and clobbers.

Output constraints must come first, then input constraints, then finally clobbers.

Common output constraints:

  • =*m == memory output
  • =r == general purpose register output

Common input constraints:

  • *m == memory input
  • r == general purpose register input
  • i == immediate value input

Common clobbers:

  • ~{memory} == clobbers memory

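The section on Inline Assembler Expressions (http://llvm.org/docs/LangRef.html#inline-assembler-expressions) in the LLVM Language Reference (http://llvm.org/docs/LangRef.html) has all the details.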
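
The ordering of outputs, inputs and clobbers can be sketched as follows. This is an x86-only illustration in AT&T syntax; the helper name addAsm is hypothetical, and the tied input constraint 0 (an input that shares the register of output 0) is an addition to the common constraints listed above.

```d
import ldc.llvmasm;

// Hypothetical helper: a + b using the x86 addl instruction (AT&T syntax).
int addAsm(int a, int b)
{
    // "=r"       output $0: result in a general purpose register
    // "0"        input  $1: a, tied to the same register as output $0
    // "r"        input  $2: b in a general purpose register
    // "~{flags}" clobber:   addl modifies EFLAGS
    return __asm!int("addl $2, $0", "=r,0,r,~{flags}", a, b);
}
```

The placeholders $0, $1, $2 number the operands in the order they appear in the constraint string, outputs first; clobbers do not get a number.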

X86-32

X86-32 specific constraints

  • a or {ax} or {eax} == EAX
  • b or {bx} or {ebx} == EBX
  • c or {cx} or {ecx} == ECX
  • d or {dx} or {edx} == EDX
  • A == EAX:EDX
  • {flags} == EFLAGS
  • {st} == ST(0)
  • {st(N)} == ST(N)
  • {fpsw} == floating point status word

Examples

// store val into dst
void store(ref int dst, int val) {
  __asm("movl $1, $0", "=*m,r", &dst, val);
}
// load dst into EAX and return it
int load(ref int dst) {
  return __asm!int("movl $1, $0", "=a,*m", &dst);
}

X86-64

Examples

// write system call
ulong sys_write(long arg1, in void* arg2, long arg3) {
    // The number of the syscall must be passed in rax (write = 1)
    // Returning from the syscall rax contains the result
    // The kernel clobbers rcx and r11
    return __asm!ulong
    (
        "syscall",
        // one output (rax), four inputs, then the clobbers
        "={rax},{rax},{rdi},{rsi},{rdx},~{rcx},~{r11}",
        1, arg1, arg2, arg3
    );
}

See the X86-64 ABI specification (http://www.x86-64.org/documentation/abi.pdf) for the register conventions.
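
A possible way to call such a wrapper is sketched below. It assumes Linux on x86-64; the names sysWrite, printStdout and SYS_write are hypothetical. Unlike the example above, the sketch adds a ~{memory} clobber as a precaution, because the kernel reads the buffer behind the compiler's back, and it passes the syscall number as a long so that the whole of rax is set.

```d
import ldc.llvmasm;

enum long SYS_write = 1; // Linux x86-64 syscall number for write

// Thin wrapper around the write system call (Linux x86-64 only).
// Returns the number of bytes written, or a negative errno value.
long sysWrite(long fd, const(void)* buf, long count)
{
    return __asm!long(
        "syscall",
        "={rax},{rax},{rdi},{rsi},{rdx},~{rcx},~{r11},~{memory}",
        SYS_write, fd, buf, count);
}

// Hypothetical helper: print a string to standard output (file descriptor 1).
long printStdout(string msg)
{
    return sysWrite(1, msg.ptr, cast(long) msg.length);
}
```

Calling printStdout("hello\n") should return the number of bytes written on success. On failure the raw syscall returns the negated errno value, for example -9 (EBADF) for an invalid file descriptor.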

ARM

  • {cpsr} == current program status register

Examples

The following code adds two multibyte numbers and a carry and returns the final carry. It uses an interface similar to multibyteAddSub() in biguintnoasm.d (https://github.com/dlang/phobos/blob/master/std/internal/math/biguintnoasm.d).

uint multibyteAdd(uint[] dest, const(uint) [] src1,
    const (uint) [] src2, uint carry) pure @nogc nothrow
{
    assert(carry == 0 || carry == 1);
    assert(src1.length >= dest.length && src2.length >= dest.length);
    return __asm!uint(`  cmp    $2,#0                @ Check dest.length
                         beq    1f
                         mov   r5,#0                 @ Initialize index
                       2:
                         ldr   r6,[${3:m},r5,LSL #2] @ Load *(src1.ptr + index)
                         ldr   r7,[${4:m},r5,LSL #2] @ Load *(src2.ptr + index)
                         lsrs  $0,$0,#1              @ Set carry
                         adcs  r6,r6,r7              @ Add with carry
                         str   r6,[${1:m},r5,LSL #2] @ Store *(dest.ptr + index)
                         adc   $0,$0,#0              @ Store carry
                         add   r5,r5,#1              @ Increment index
                         cmp   $2,r5
                         bhi   2b
                       1:`,
                      "=&r,=*m,r,*m,*m,0,~{r5},~{r6},~{r7},~{cpsr}",
                      dest.ptr, dest.length, src1.ptr, src2.ptr, carry);
}

The constraint string in this example is complex.

  • A value is returned in a register which is marked as early clobber (=&r). This register is not used for input values.
  • The pointers use the memory constraint (=*m and *m). Each pointer is passed in a register which is considered read-only.
  • The carry parameter is tied to output parameter 0 and therefore uses the same register.
  • The code clobbers some registers and the current program status register.

Please note the following details:

  • Even if you tie an input parameter to an output parameter you may need to mark the output register as early clobber. If LLVM can prove that several input parameters always have the same value (e.g. all array lengths and the carry are 1) then the same register is used for these inputs. This happens even if one of the input parameters is tied to an output parameter! Marking the output as early clobber prevents this.
  • You can use only local labels (1:, 2: and so on) and you have to specify the direction (f = forward, b = backward) when you refer to them.
  • The parameter for a memory constraint is replaced with a memory access: [ register ]. E.g. the parameter $1 from the example above may be expanded as [r0]. If you need only the register then you must use the modifier m: ${1:m} is expanded as r0. (The register is chosen by LLVM!)
  • @ is used to mark a comment. Other architectures use other characters.

PPC 32

  • {cc} == condition code register

Examples

// store val into dst, clobbering r4
void store(ref int dst, int val) {
  // val arrives in a register ("r"), so copy it to r4 with mr, then store r4 to dst
  __asm("mr 4, $1 ; stw 4, $0", "=*m,r,~{r4}", &dst, val);
}

PPC 64

  • {cc} == condition code register

Examples

// returning the floating point status and control register
uint getFPSCR()
{
    double fspr = __asm!double("mffs 0", "={f0}");
    return cast(uint) *cast(ulong*) &fspr;
}

MIPS 64

MIPS assembly language uses $ to denote registers. Because $ also introduces operand placeholders such as $0, you have to escape a literal $ with a second $ (e.g. $$sp).

Examples

// returning stack pointer
void* getStackTop()
{
    return __asm!(void *)("move $0, $$sp", "=r");
}
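
The same escaping is needed on any target whose assembly syntax uses a literal $. As a minimal sketch, assuming x86 with AT&T syntax (where immediates are prefixed with $), the hypothetical helper below loads a constant:

```d
import ldc.llvmasm;

// Hypothetical helper (x86, AT&T syntax): load the constant 42.
int answer()
{
    // "$$42" is emitted as the immediate "$42";
    // "$0" is replaced by the register chosen for the output.
    return __asm!int("movl $$42, $0", "=r");
}
```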