ARM 64bit has come!
Tetsuyuki Kobayashi
2014.5.23 Japan Technical Jamboree
2014.5.25 Updated for Kernel/VM Explorers (カーネル/VM探検隊)

ARM 64bit has come!


First impressions of the A64 instruction set.


Page 1: ARM 64bit has come!

ARM 64bit has come!

Tetsuyuki Kobayashi

2014.5.23 Japan Technical Jamboree
2014.5.25 Updated for Kernel/VM Explorers (カーネル/VM探検隊)

Page 2: ARM 64bit has come!

The latest version of these slides will be available here:

http://www.slideshare.net/tetsu.koba/presentations

Page 3: ARM 64bit has come!

Who am I?

20+ years involved in embedded systems
  10 years in real-time OSes, such as iTRON
  10 years in embedded Java Virtual Machines
  Now GCC, Linux, QEMU, Android, …

Blogs
  http://d.hatena.ne.jp/embedded/ (Personal)
  http://blog.kmckk.com/ (Corporate)
  http://kobablog.wordpress.com/ (English)

Twitter @tetsu_koba

Page 4: ARM 64bit has come!

Today's topics

Introduction to 64-bit ARM
  This does not cover everything, only the parts that interest me :)

Try aarch64 using QEMU

Page 5: ARM 64bit has come!

ARMv8 terminology

AArch64: 64-bit mode
  1 instruction set: A64
  A64: 32-bit fixed-length instructions

AArch32: 32-bit mode
  Upward compatible with the ARMv7-A architecture
  2 instruction sets: A32, T32
  A32: ARM, 32-bit fixed-length instructions
  T32: Thumb2, 16-bit/32-bit instructions
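As a rough illustration of the terminology above, the sketch below reports which of the three instruction sets a C file is being compiled for. It relies on the predefined compiler macros `__aarch64__`, `__arm__` and `__thumb__`, which GCC and Clang define for ARM targets; the function names are made up for this example.

```c
#include <stdio.h>

/* Returns the name of the instruction set this file was compiled for,
 * judged only from the compiler's predefined macros. */
const char *target_instruction_set(void)
{
#if defined(__aarch64__)
    return "A64 (AArch64)";
#elif defined(__arm__) && defined(__thumb__)
    return "T32 (AArch32, Thumb)";
#elif defined(__arm__)
    return "A32 (AArch32, ARM)";
#else
    return "not an ARM target";
#endif
}

/* Prints the instruction set together with the pointer width in bits. */
void print_target(void)
{
    printf("%s, %zu-bit pointers\n",
           target_instruction_set(), sizeof(void *) * 8);
}
```

Calling `print_target()` from a program built with an aarch64 cross compiler should report A64; the same source built for a 32-bit ARM target reports A32 or T32 depending on the compiler options.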

Page 6: ARM 64bit has come!

"ARM64" is not the official name

It is the name used in the Linux kernel source: arch/arm64

Page 7: ARM 64bit has come!

Exception level

4 levels. Typical usage:
  EL0: User application
  EL1: OS kernel
  EL2: Hypervisor
  EL3: Secure monitor

AArch64/AArch32 can be switched only when the exception level changes

cf. PL0-PL2 (privilege levels) in ARMv7

Page 8: ARM 64bit has come!

AArch64 execution model

R0 – R30: 64-bit general purpose registers
  Wn: lower 32 bits
  Xn: full 64 bits
  Register number 31 means the zero register (XZR, WZR) or SP, depending on the instruction

SP: Stack Pointer
  Must be 16-byte aligned
  WSP for the lower 32 bits

PC: Program Counter
  Cannot be used as the destination of a calculation
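The relationship between the Wn and Xn views can be sketched as a toy model in C. This is not hardware access, only an illustration: the struct and function names are invented here, and the model treats register number 31 purely as the zero register (the SP interpretation is left out). The zero-extension on a Wn write is architectural behaviour that the slide does not spell out.

```c
#include <stdint.h>

/* Toy model of the general purpose registers X0 - X30.
 * Callers are assumed to pass n in the range 0..31. */
typedef struct {
    uint64_t x[31];
} gp_regs;

/* Read Xn. Register number 31 reads as zero (XZR). */
uint64_t read_x(const gp_regs *r, unsigned n)
{
    return n == 31 ? 0 : r->x[n];
}

/* Read Wn: the lower 32 bits of Xn. */
uint32_t read_w(const gp_regs *r, unsigned n)
{
    return (uint32_t)read_x(r, n);
}

/* Write Xn. Writes to register number 31 (XZR) are discarded. */
void write_x(gp_regs *r, unsigned n, uint64_t value)
{
    if (n != 31)
        r->x[n] = value;
}

/* Write Wn: the value is zero-extended, so the upper 32 bits of Xn
 * are cleared rather than preserved. */
void write_w(gp_regs *r, unsigned n, uint32_t value)
{
    write_x(r, n, (uint64_t)value);
}
```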

Page 9: ARM 64bit has come!

AArch64 execution model (cont.)

V0 – V31: 128-bit registers
  For floating point and SIMD
  AArch64 must have an FPU. There is no calling standard for soft-float.
  Scalar: Bn, Hn, Sn, Dn, Qn
  Vector: Vn.8B, Vn.16B, Vn.4H, Vn.8H, Vn.2S, Vn.4S, Vn.1D, Vn.2D

FPCR: Floating Point Control Register
FPSR: Floating Point Status Register
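One way to picture the vector arrangements is a C union that overlays the same 128 bits with different lane widths. This is only a mental model with invented names (`vreg`, `vreg_fill`); it uses integer lanes to show the raw bits, whereas the real registers also hold floating point values.

```c
#include <stdint.h>
#include <string.h>

/* The same 128 bits viewed with different lane widths. */
typedef union {
    uint8_t  b[16];  /* Vn.16B: sixteen 8-bit lanes */
    uint16_t h[8];   /* Vn.8H:  eight 16-bit lanes  */
    uint32_t s[4];   /* Vn.4S:  four 32-bit lanes   */
    uint64_t d[2];   /* Vn.2D:  two 64-bit lanes    */
} vreg;

/* Returns a register image with every byte lane set to the same value. */
vreg vreg_fill(uint8_t byte)
{
    vreg v;
    memset(&v, byte, sizeof v);
    return v;
}
```

The 64-bit arrangements (Vn.8B, Vn.4H, Vn.2S, Vn.1D) correspond to using only the lower half of this union.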

Page 10: ARM 64bit has come!

AArch64 addressing model

Without tag: 64-bit virtual address
With tag: 8-bit tag + 56-bit virtual address
  The tag is ignored on load/store/branch
  Good for implementing dynamically typed languages

The effective virtual address length is 48 bits.
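A language runtime could use the top byte to carry a type tag. The helpers below are a minimal sketch of the bit manipulation only, with hypothetical names; they assume a user-space address whose upper bits are zero. When tagging is enabled the hardware ignores the tag on access, so `strip_tag` would only be needed on systems where that is not the case.

```c
#include <stdint.h>

#define TAG_SHIFT 56
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

/* Put an 8-bit tag into bits 63..56 of an address. */
uint64_t tag_address(uint64_t addr, uint8_t tag)
{
    return (addr & ADDR_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/* Extract the 8-bit tag. */
uint8_t address_tag(uint64_t tagged)
{
    return (uint8_t)(tagged >> TAG_SHIFT);
}

/* Recover the untagged address (assumes the original top byte was zero). */
uint64_t strip_tag(uint64_t tagged)
{
    return tagged & ADDR_MASK;
}
```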

Page 11: ARM 64bit has come!

Calling standard (AAPCS64)

R30 = LR (Link Register)
R29 = FP (Frame Pointer)

Parameter passing
  R0 – R7 for integers and pointers
  V0 – V7 for floating point

Callee must preserve
  R19 – R29, SP
  V8 – V15 (only the lower 64 bits)

No calling standard for soft-float
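To make the parameter passing rules concrete, here are two ordinary C functions annotated with where AAPCS64 would place their arguments. The functions themselves are made up for illustration; the register assignments in the comments follow from the rules above, with integer and floating point arguments numbered independently.

```c
/* Eight integer arguments: all of them fit in registers,
 * a0 in x0 through a7 in x7. The result is returned in x0. */
long sum8(long a0, long a1, long a2, long a3,
          long a4, long a5, long a6, long a7)
{
    return a0 + a1 + a2 + a3 + a4 + a5 + a6 + a7;
}

/* Mixed arguments: n is the first integer argument (w0) and
 * scale is the first floating point argument (d0, the low half of v0).
 * The double result is returned in d0. */
double scaled(int n, double scale)
{
    return n * scale;
}
```

A ninth integer argument would no longer fit in R0 – R7 and would be passed on the stack instead.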

Page 12: ARM 64bit has come!

A64 instruction set

Brand-new, clean design for the 64-bit architecture

Conditional execution is not available for all instructions; there is only a very small set of "conditional data processing" instructions

No equivalent of Thumb2's IT instruction.
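The "conditional data processing" group includes instructions such as CSEL and CSINC. The small C functions below are the kind of code a compiler may turn into a compare followed by one of these instructions instead of a branch; whether it actually does depends on the compiler and optimization level, so treat the comments as expectations rather than guarantees.

```c
/* Candidate for "cmp" + "csel": select one of two register values. */
int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Candidate for "cmp" + "csinc" (or its alias "cinc"):
 * conditionally add one. */
unsigned count_if(unsigned count, int cond)
{
    return cond ? count + 1 : count;
}
```

Compiling with `aarch64-linux-gnu-gcc -O2 -S` and reading the output is a quick way to check what a given compiler actually emits.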

Page 13: ARM 64bit has come!

No multiple load/store

There are no instructions that load/store multiple GP registers, such as LDM/STM or PUSH/POP

Instead, there are load/store instructions for 2 registers, such as LDP/STP
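A 16-byte structure copy is a simple case where a compiler targeting A64 may use one LDP and one STP. The type and function names below are invented for the example, and the instruction choice is the compiler's, not something the C code can force.

```c
#include <stdint.h>

/* Two 64-bit fields: exactly one register pair. */
typedef struct {
    uint64_t first;
    uint64_t second;
} pair64;

/* Copies both fields; a candidate for "ldp" from src and "stp" to dst. */
void copy_pair(pair64 *dst, const pair64 *src)
{
    *dst = *src;
}
```

The function prologue and epilogue in the Sample #1 A64 listing show the same idea applied to saving and restoring x29/x30 and x19/x20.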

Page 14: ARM 64bit has come!

YIELD instruction

A NOP that hints that the current thread is not doing anything important

Used in spin loops; can trigger a switch to another hardware thread on SMT (Simultaneous Multi-Threading) cores
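A spin lock is the typical place for YIELD. The sketch below uses C11 atomics and GNU-style inline assembly (an assumption: GCC or Clang); on non-AArch64 targets `cpu_relax` compiles to nothing. The names `spinlock`, `cpu_relax`, `spin_trylock`, `spin_lock` and `spin_unlock` are chosen for this example only.

```c
#include <stdatomic.h>

/* Hint that this thread is only waiting. Emits YIELD on AArch64. */
static inline void cpu_relax(void)
{
#if defined(__aarch64__)
    __asm__ volatile("yield" ::: "memory");
#endif
}

typedef struct {
    atomic_flag locked;
} spinlock;

/* Try once; returns 1 if the lock was acquired, 0 if it was already held. */
int spin_trylock(spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Spin until the lock is acquired, yielding on each failed attempt. */
void spin_lock(spinlock *l)
{
    while (!spin_trylock(l))
        cpu_relax();
}

/* Release the lock. */
void spin_unlock(spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A lock is initialized with `spinlock l = { ATOMIC_FLAG_INIT };`.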

Page 15: ARM 64bit has come!

Sample #1 source

#include <stdio.h>

int main()
{
    int i;

    for (i = 5; i >= 0; i--) {
        printf("count down: %d\n", i);
    }
    return 0;
}

Page 16: ARM 64bit has come!

Sample #1 Thumb2

000083f8 <main>:
    83f8: b570       push {r4, r5, r6, lr}
    83fa: 2405       movs r4, #5
    83fc: f248 456c  movw r5, #33900 ; 0x846c
    8400: f2c0 0500  movt r5, #0
    8404: 2601       movs r6, #1
    8406: 4630       mov r0, r6
    8408: 4629       mov r1, r5
    840a: 4622       mov r2, r4
    840c: f7ff ef7a  blx 8304 <_init+0x38>
    8410: 3c01       subs r4, #1
    8412: f1b4 3fff  cmp.w r4, #4294967295 ; 0xffffffff
    8416: d1f6       bne.n 8406 <main+0xe>
    8418: 2000       movs r0, #0
    841a: bd70       pop {r4, r5, r6, pc}

Page 17: ARM 64bit has come!

Sample #1 A64

0000000000400440 <main>:
    400440: a9be7bfd  stp x29, x30, [sp,#-32]!
    400444: 910003fd  mov x29, sp
    400448: a90153f3  stp x19, x20, [sp,#16]
    40044c: 90000014  adrp x20, 400000 <_init-0x3c0>
    400450: 528000b3  mov w19, #0x5 // #5
    400454: 911a0294  add x20, x20, #0x680
    400458: 2a1303e2  mov w2, w19
    40045c: 52800020  mov w0, #0x1 // #1
    400460: aa1403e1  mov x1, x20
    400464: 97ffffeb  bl 400410 <__printf_chk@plt>
    400468: 51000673  sub w19, w19, #0x1
    40046c: 3100067f  cmn w19, #0x1
    400470: 54ffff41  b.ne 400458 <main+0x18>
    400474: 52800000  mov w0, #0x0 // #0
    400478: a94153f3  ldp x19, x20, [sp,#16]
    40047c: a8c27bfd  ldp x29, x30, [sp],#32
    400480: d65f03c0  ret

Page 18: ARM 64bit has come!

Sample #2 source

int iaload(int *base, int index)
{
    return base[index];
}

long long laload(long long *base, int index)
{
    return base[index];
}

char ibload(char *base, int index)
{
    return base[index];
}

short isload(short *base, int index)
{
    return base[index];
}

Page 19: ARM 64bit has come!

Sample #2 Thumb2

00000000 <iaload>:
     0: f850 0021  ldr.w r0, [r0, r1, lsl #2]
     4: 4770       bx lr
     6: bf00       nop

00000008 <laload>:
     8: eb00 01c1  add.w r1, r0, r1, lsl #3
     c: e9d1 0100  ldrd r0, r1, [r1]
    10: 4770       bx lr
    12: bf00       nop

00000014 <ibload>:
    14: 5c40       ldrb r0, [r0, r1]
    16: 4770       bx lr

00000018 <isload>:
    18: f930 0011  ldrsh.w r0, [r0, r1, lsl #1]
    1c: 4770       bx lr
    1e: bf00       nop

Page 20: ARM 64bit has come!

Sample #2 A64

0000000000000000 <iaload>:
     0: b861d800  ldr w0, [x0,w1,sxtw #2]
     4: d65f03c0  ret

0000000000000008 <laload>:
     8: f861d800  ldr x0, [x0,w1,sxtw #3]
     c: d65f03c0  ret

0000000000000010 <ibload>:
    10: 3861c800  ldrb w0, [x0,w1,sxtw]
    14: d65f03c0  ret

0000000000000018 <isload>:
    18: 7861d800  ldrh w0, [x0,w1,sxtw #1]
    1c: d65f03c0  ret
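The `sxtw #2` operand in the A64 listing does three things inside the load itself: it sign-extends the 32-bit index in w1 to 64 bits, shifts it left by 2, and adds it to the base in x0. The function below spells out that arithmetic step by step in C as an illustration; `iaload_explicit` is a name invented here, and a compiler would of course generate the same single instruction for plain `base[index]`.

```c
#include <stdint.h>

/* Step-by-step equivalent of "ldr w0, [x0, w1, sxtw #2]". */
int32_t iaload_explicit(const int32_t *base, int32_t index)
{
    uintptr_t addr = (uintptr_t)base;   /* x0: 64-bit base address        */
    int64_t offset = (int64_t)index;    /* sxtw: sign-extend w1 to 64 bit */

    offset *= 4;                        /* #2: scale by 4, the element size
                                           (written as a multiply because
                                           shifting a negative value left
                                           is undefined in C)             */
    addr += (uintptr_t)offset;          /* base + scaled offset           */
    return *(const int32_t *)addr;      /* 32-bit load into w0            */
}
```

The sign extension matters because `index` is a signed `int`: a negative index must reach elements below `base`.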

Page 21: ARM 64bit has come!

Sample #3 source

double range(double x, double min, double max)
{
    if (x < min)
        return min;
    else if (x > max)
        return max;
    else
        return x;
}

Page 22: ARM 64bit has come!

Sample #3 Thumb2

00000000 <range>:
     0: eeb4 0bc1  vcmpe.f64 d0, d1
     4: eef1 fa10  vmrs APSR_nzcv, fpscr
     8: d407       bmi.n 1a <range+0x1a>
     a: eeb4 0bc2  vcmpe.f64 d0, d2
     e: eef1 fa10  vmrs APSR_nzcv, fpscr
    12: bfc8       it gt
    14: eeb0 0b42  vmovgt.f64 d0, d2
    18: 4770       bx lr
    1a: eeb0 0b41  vmov.f64 d0, d1
    1e: 4770       bx lr

Page 23: ARM 64bit has come!

Sample #3 A64

0000000000000000 <range>:
     0: 1e612010  fcmpe d0, d1
     4: 540000a4  b.mi 18 <range+0x18>
     8: 1e622010  fcmpe d0, d2
     c: 1e604041  fmov d1, d2
    10: 5400004c  b.gt 18 <range+0x18>
    14: 1e604001  fmov d1, d0
    18: 1e604020  fmov d0, d1
    1c: d65f03c0  ret

Page 24: ARM 64bit has come!

Cache control

Application level cache instructions
  Data cache
    DC CVAU
    DC CVAC
    DC CIVAC
  Instruction cache
    IC IVAU

No need to make a kernel system call
JIT friendly
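From C, a JIT would normally not write these instructions by hand. GCC and Clang provide the builtin `__builtin___clear_cache(begin, end)`, which on AArch64 is expected to expand to the user-level cache maintenance sequence rather than a system call. The helper below is a minimal sketch with a hypothetical name (`publish_code`); it only copies the bytes and synchronizes the caches, and assumes the caller has already obtained a buffer that is mapped executable.

```c
#include <stddef.h>
#include <string.h>

/* Copy freshly generated machine code into buf and make it visible
 * to instruction fetch. Mapping buf as executable is the caller's job
 * and is not shown here. */
void publish_code(void *buf, const void *code, size_t len)
{
    memcpy(buf, code, len);

    /* GCC/Clang builtin: cleans the data cache and invalidates the
     * instruction cache for the range [buf, buf + len). */
    __builtin___clear_cache((char *)buf, (char *)buf + len);
}
```

On ARMv7 Linux the equivalent operation goes through a kernel call, which is the difference the slide points out.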

Page 25: ARM 64bit has come!

Preloading cache

PRFM <prfop>, addr|label
  <prfop>  ::= <type><target><policy>
  <type>   ::= PLD | PST | PLI
  <target> ::= L1 | L2 | L3
  <policy> ::= KEEP | STRM
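In C, the usual way to request a prefetch is the GCC/Clang builtin `__builtin_prefetch(addr, rw, locality)`, which compilers targeting AArch64 generally lower to PRFM. The sketch below prefetches a fixed distance ahead while summing an array; the function name and the distance of 16 elements are arbitrary choices for illustration, not tuned values.

```c
#include <stddef.h>

#define PREFETCH_DISTANCE 16  /* elements ahead; hypothetical tuning value */

/* Sums an array while prefetching data that will be read soon. */
long sum_with_prefetch(const long *data, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* rw = 0: prefetch for load (the PLD type).
             * locality = 3: keep in cache (the KEEP policy). */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += data[i];
    }
    return total;
}
```

Passing `rw = 1` asks for a prefetch for store (PST), and a locality of 0 signals streaming data that is not worth keeping (STRM).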

Page 26: ARM 64bit has come!

Non-temporal load/store

LDNP/STNP
  Hint that the data is unlikely to be accessed again (like streaming)

Page 27: ARM 64bit has come!

AArch32

Upward compatible with ARMv7
Added the cryptography extension
Added some other new instructions, aligned with AArch64
Removed Jazelle and ThumbEE

Page 28: ARM 64bit has come!

Let's try AArch64 using QEMU

QEMU 2.0 supports aarch64 user mode emulation

Ubuntu 14.04 has QEMU 2.0 and a cross compiler for aarch64

$ sudo apt-get install qemu-user-static
$ sudo apt-get install g++-aarch64-linux-gnu

Page 29: ARM 64bit has come!

Prepare gdb for aarch64

$ sudo apt-get build-dep gdb
$ wget http://ftp.gnu.org/gnu/gdb/gdb-7.7.1.tar.bz2
$ tar xf gdb-7.7.1.tar.bz2
$ mkdir obj
$ cd obj
$ ../gdb-7.7.1/configure --target=aarch64-linux-gnu
$ make
$ sudo make install

Page 30: ARM 64bit has come!

Execute by qemu and connect gdb

$ aarch64-linux-gnu-gcc -g a.c
$ export QEMU_LD_PREFIX=/usr/aarch64-linux-gnu/
$ qemu-aarch64-static -g 1234 ./a.out

$ aarch64-linux-gnu-gdb ./a.out
...
(gdb) target remote :1234
(gdb) b main
(gdb) c
(gdb) x/i $pc
=> 0x4005a0 <main>: stp x29, x30, [sp,#-48]!
(gdb)

Page 31: ARM 64bit has come!

DEMO

Page 32: ARM 64bit has come!

References

ARMv8 Technology Preview
ARMv8 Instruction Set Overview
ARM® Architecture Reference Manual
Procedure Call Standard for the ARM 64-bit Architecture (AArch64)
Overview of the ARM 64-bit ARMv8 architecture (in Japanese)
Compiling and running ARM 64-bit (aarch64) code on Ubuntu 14.04 (in Japanese)

Page 33: ARM 64bit has come!

Any comments?

@tetsu_koba

Thank you for listening!