Upload
krishnanand
View
12
Download
0
Embed Size (px)
DESCRIPTION
LLVM
Citation preview
5/26/2018 99_DSLsWithLLVM
1/40
Implementing Domain-Specific Languages with
LLVM
David Chisnall
February 5, 2012
5/26/2018 99_DSLsWithLLVM
2/40
What are Domain-Specific Languages?
UNIX bc / dc
Graphviz
JavaScript
AppleScript / Visual Basic for Applications
Firewall filter rules
EMACS Lisp
Some are alsogeneral-purpose programming languages.
5/26/2018 99_DSLsWithLLVM
3/40
What is LLVM?
A set of libraries for implementing compilers Intermediate representation (LLVM IR) for optimisation
Various helper libraries
5/26/2018 99_DSLsWithLLVM
4/40
How Do I Use LLVM?
Generate LLVM IR from your language
Link to some helper functions written in C and compiled toLLVM IR with clang
Run optimisers
Emit code: object code files, assembly, or machine code in
memory (JIT)
5/26/2018 99_DSLsWithLLVM
5/40
A Typical DSL Implementation
ASTParser Interpreter
LLVM Optimiser JIT
Clang
Runtime Support Code
LLVM Optimiser
LLVM Linker Native Linker
Executable
5/26/2018 99_DSLsWithLLVM
6/40
What Is LLVM IR?
Unlimited Single-Assignment Register machine instruction set
Three common representations: Human-readable LLVM assembly (.ll files) Dense bitcode binary representation (.bc files) C++ classes
5/26/2018 99_DSLsWithLLVM
7/40
The Structure of an LLVM Compilation Unit
Compilation units are LLVM Modules, each one contains one
or more... Functions, each of which contains one or more...
Basic Blocks, each of which contains one or more...
Instructions
5/26/2018 99_DSLsWithLLVM
8/40
LLVM Instructions
alloca allocates space on the stack
add and so on: arithmetic instructions jeq, jne, ret flow control
call, invoke structured flow control
LLVM also provides some intrinsic functions for things like
atomic operations
5/26/2018 99_DSLsWithLLVM
9/40
Instructions are Values
LLVM uses an infinite SSA register set The result of (almost) every instruction is a register
These can be operands to other instructions
5/26/2018 99_DSLsWithLLVM
10/40
Basic Blocks
Start with (zero or more) phi instructions End with a flow control instruction (terminator)
No flow control inside a basic block
5/26/2018 99_DSLsWithLLVM
11/40
Functions
Start with an entry basic block. All locals should be allocas in the entry block.
Must end (if it terminates) with a ret instruction
5/26/2018 99_DSLsWithLLVM
12/40
Hello World in LLVM
@ . s t r = p r iv a te c o ns t an t [12 x i8 ] c " he ll o w or ld
\00"@ . s t r 1 = p r iv at e c o ns t an t [9 x i8 ] c " he ll o % s \0 0"
d ef i ne i 32 @ m a i n ( i32 % a r g c , i8 ** % argv ) {
%1 = icmp eq i32 % argc , 1
br i1 %1 , label % w o r l d , label % n a m e
w o r l d :
%2 = g e t e l e m e n t p t r [12 x i8 ]* @ . str , i64 0 ,i64 0
c al l v oi d @ p u t s ( i8 * %2)
re t i3 2 0
n a m e :
%3 = g e t e l e m e n t p t r i n b o u n d s i8 ** % argv , i64 1%4 = l oa d i8 ** %3 , align 8
%5 = g e t e l e m e n t p t r [9 x i8 ]* @ . str1 , i64 0 ,
i64 0
c al l v oi d ( i8 * , ... ) * @ pr in tf ( i8 * %5 , i8 * %4)
re t i3 2 0}
5/26/2018 99_DSLsWithLLVM
13/40
LLVM Types
LLVM is strongly typed
Types are structural (e.g. 8-bit integer - signed and unsigned
are properties of operations, not typed) Arrays of different length are different types
Pointers and integers are different
Structures with the same layout are different if they have
different names (new in LLVM 3.) Various casts to translate between them
5/26/2018 99_DSLsWithLLVM
14/40
A Worked Example
Full source code:http://cs.swan.ac.uk/~csdavec/FOSDEM12/examples.tbz2
Compiler source file:http://cs.swan.ac.uk/~csdavec/FOSDEM12/compiler.cc.html
http://cs.swan.ac.uk/~csdavec/FOSDEM12/examples.tbz2http://cs.swan.ac.uk/~csdavec/FOSDEM12/examples.tbz2http://cs.swan.ac.uk/~csdavec/FOSDEM12/examples.tbz2http://cs.swan.ac.uk/~csdavec/FOSDEM12/compiler.cc.htmlhttp://cs.swan.ac.uk/~csdavec/FOSDEM12/compiler.cc.htmlhttp://cs.swan.ac.uk/~csdavec/FOSDEM12/compiler.cc.htmlhttp://cs.swan.ac.uk/~csdavec/FOSDEM12/compiler.cc.htmlhttp://cs.swan.ac.uk/~csdavec/FOSDEM12/examples.tbz25/26/2018 99_DSLsWithLLVM
15/40
A Simple DSL
Simple language for implementing cellular automata
Programs run on every cell in a grid Lots of compromises to make it easy to implement!
10 per-instance accumulator registers (a0-a9)
10 shared registers (g0-g9)
Current cell value register (v)
5/26/2018 99_DSLsWithLLVM
16/40
Arithmetic Statements
{operator} {register} {expression}
Arithmetic operations are statements - no operatorprecedence.
5/26/2018 99_DSLsWithLLVM
17/40
Neighbours Statements
neigbours ( {statements} )
Only flow control construct in the language
Executes the statements once for every neighbour of thecurrent cell
5/26/2018 99_DSLsWithLLVM
18/40
Select Expressions
[ {register} |
{value or range) => {expression},
{value or range) => {expression}...]
Maps a value in a register to another value selected from a
range Unlisted ranges are implicitly mapped to 0
5/26/2018 99_DSLsWithLLVM
19/40
Examples
Flash every cell:
= v [ v | 0 => 1 ]
Count the neighbours:
neighbours ( + a1 1)
= v a 1
Connways Game of Life:
neighbours ( + a1 a0 )
= v [ v |0 = > [ a 1 | 3 = > 1 ] ,
1 => [ a1 | (2,3) => 1]
]
5/26/2018 99_DSLsWithLLVM
20/40
AST Representation
Nodes with two children Registers and literals encoded into pointers with low bit set
5/26/2018 99_DSLsWithLLVM
21/40
Implementing the Compiler
One C++ file Uses several LLVM classes
Some parts written in C and compiled to LLVM IR with clang
5/26/2018 99_DSLsWithLLVM
22/40
The Most Important LLVM Classes
Module - A compilation unit.
Function - Can you guess?
BasicBlock - a basic block
GlobalVariable (I hope its obvious)
IRBuilder - a helper for creating IR
Type - superclass for all LLVM concrete types
ConstantExpr - superclass for all constant expressions
PassManagerBuilder- Constructs optimisation passes to run
ExecutionEngine - The thing that drives the JIT
5/26/2018 99_DSLsWithLLVM
23/40
The Runtime Library
void a u to m at o n ( i n t 16 _ t * o l dg ri d , i n t1 6 _t *n ew gr id , i n t1 6 _t w id th , i n t1 6 _t
h ei gh t ) {
i nt 16 _t g [ 10 ] = {0} ;
i n t1 6 _t i = 0;
for ( i nt 16_ t x =0 ; x < width ; x ++) {for ( i nt 16_ t y =0 ; y < height ; y ++ , i ++) {
n e wg ri d [ i ] = c el l ( o l dg ri d , n ew gr id , w id th ,
h ei gh t , x , y , o ld gr id [ i ] , g ) ;
}
}
}
Generate LLVM bitcode that we can link into our language:
$ clang -c -emit-llvm runtime.c -o runtime.bc -O0
5/26/2018 99_DSLsWithLLVM
24/40
Setup
// L oa d the r un ti me m od ul e
O w n i ng P t r < M e m o r y B u f f e r > b u f f e r ;
M e m o r y B u f f e r : : g e t F i l e ("runtime.bc", b uf fe r ) ;
Mo d = P a rs eB i tc o de Fi l e ( b uf fe r . g et ( ) , C ) ;
// Get the s tu b f un ct io n
F = Mod - > g e tF un ct io n ("cell") ;// S to p e xp or ti ng it
F - > s e t L i n k a g e ( G l o b a l V a l u e : : P r i v a t e L i n k a g e ) ;
// Set up the first basic block
B as ic Bl oc k * e nt ry =
B a s i c B l o c k : : C r e a t e ( C , "entry", F ) ;// C re at e the t yp e use d for r eg is te rs
r eg Ty = T yp e : : g e t In t 16 T y ( C ) ;
// Get the IR Bu il de r ready to use
B . S e t I n s e r t P o i n t ( e n t r y ) ;
5/26/2018 99_DSLsWithLLVM
25/40
Creating Space for the Registers
for ( int i =0 ; i
5/26/2018 99_DSLsWithLLVM
26/40
GEP? WTF? BBQ?
GEP is short for GetElementPtr
Returns a pointer to an element of a structure or array Does not dereference the pointer
Architecture-agnostic representation of complex addressingmodes
5/26/2018 99_DSLsWithLLVM
27/40
Compiling Arithmetic Statements
V al ue * r eg = B . C r ea te Lo ad ( a [ v al ] ) ;
V al ue * r e su lt = B . C re at eA dd ( r eg , e xp r ) ;
B . C r e a te S t o r e ( r e su lt , a [ v a l ] ) ;
LLVM IR is SSA, but this isnt
Memory is not part of SSA
The Mem2Reg pass will fix this for us
5/26/2018 99_DSLsWithLLVM
28/40
Flow Control
More complex, requires new basic blocks and PHI nodes Range expressions use one block for each range
Fall through to the next one
5/26/2018 99_DSLsWithLLVM
29/40
Range Expressions
P HI No de * p hi = P HI No de : : C r ea te ( r egT y , cou nt , "
result", co nt ) ;
...
// For e ach r ange :
V al ue * m in = C on st an tI nt :: g et ( r eg Ty , m in Va l ) ;
V al ue * m ax = C on st an tI nt :: g et ( r eg Ty , m ax Va l ) ;
match = B . C re a te A nd ( B . C r e a t e I C m p S G E ( reg , min ) ,B . C r e a te I Cm p S LE ( r eg , m ax ) ) ;
B a si c Bl o ck * e x pr = B a si c Bl o ck : : C r ea te ( C , "
range_result", F ) ;
B a si c Bl o ck * n e xt = B a si c Bl o ck : : C r ea te ( C , "
range_next", F ) ;B . C r e a te C on d Br ( m at ch , e xp r , n ex t ) ;
B . S e t I n s e r t P o i n t ( e x p r ) ; // ( G e ne ra te t he
e x pr e ss i on a ft er t hi s )
phi - > a d d I nc o mi n g ( v al , B . G e t I ns e r tB l oc k ( ) ) ;
B . C r e a t e B r ( c o n t ) ;
5/26/2018 99_DSLsWithLLVM
30/40
Optimising the IR
P a s s M a n a ge r B u i l d e r P M B u il d e r ;P M Bu i ld e r . O p t Le v el = o p t im i se L ev e l ;
P M B u i l d e r . I n l i n e r = c r e a t e F u n c t i o n I n l i n i n g P a s s
( 2 7 5 ) ;
F u nc t io n Pa ss M an a ge r * F PM = n ew
F u n c t i o n P a s s M a n a g e r ( M o d ) ;P M B u i l d e r . p o p u l a t e F u n c t i o n P a s s M a n a g e r ( * F P M ) ;
for ( M o du le : : i t er at or I = Mod - > b eg in ( ) ,
E = M o d - > e n d ( ) ; I ! = E ; + + I ) {
if (! I - > i s D e c la r at i on ( ) ) FPM - > r un ( * I ) ;
}F P M - > d o F i n a l i z a t i o n ( ) ;
P as sM an ag er * MP = n ew P as sM an ag er () ;
P M B u i l d e r . p o p u l a t e M o d u l e P a s s M a n a g e r ( * M P ) ;
M P - > r u n ( * M o d ) ;
5/26/2018 99_DSLsWithLLVM
31/40
Generating Code
s td : : s t r in g e rr or ;
E x e cu t io n E ng i n e * E E = E x e cu t i on E n gi n e : : c r e at e (
M o d , false , & e rr or ) ;
if (! EE ) {
fprintf (stderr , "Error:%s\n", e rr or . c _s tr ( ) ) ;e x i t ( - 1 ) ;
}
return ( a u t o m a t o n ) E E - > g e t P o i n t e r T o F u n c t i o n ( M o d - >
g e t F u n c t i o n ("automaton") ) ;
Now we have a function pointer, just like any other functionpointer!
5/26/2018 99_DSLsWithLLVM
32/40
Some Statistics
Component Lines of Code
Parser 400Interpreter 200Compiler 300
Running 200000 iterations of Connways Game of Life on a 50x50grid:
Interpreter
Compiler
5/26/2018 99_DSLsWithLLVM
33/40
Improving Performance
Can we improve the IR we generate? Can LLVM improve the IR for us?
Can we improve the overall system?
5/26/2018 99_DSLsWithLLVM
34/40
Improving the IR
Optimsers work best when they have lots of information to
work with. Split the inner loop into fixed-size blocks?
Generate special versions of the program for edges andcorners?
M k B U f O i i i
5/26/2018 99_DSLsWithLLVM
35/40
Make Better Use of Optimisations
This version uses the default set of LLVM passes
Try changing the order or explicitly adding others
Writing new LLVM parses is quite easy - maybe you can writesome specific to your language?
W i i N P
5/26/2018 99_DSLsWithLLVM
36/40
Writing a New Pass
Tutorial:http://llvm.org/docs/WritingAnLLVMPass.html
ModulePass subclasses modify a whole module
FunctionPass subclasses modify a function
LoopPass subclasses modify a function
Lots of analysis passes create information your passes can use!
E l L ifi P
http://llvm.org/docs/WritingAnLLVMPass.htmlhttp://llvm.org/docs/WritingAnLLVMPass.html5/26/2018 99_DSLsWithLLVM
37/40
Example Language-specific Passes
ARC Optimisations:
Part of LLVM
Elide reference counting operations in Objective-C code whennot required
Makes heavy use of LLVMs flow control analysis
GNUstep Objective-C runtime optimisations:
Distributed with the runtime.
Can be used by clang (Objective-C) or LanguageKit(Smalltalk)
Cache method lookups, turn dynamic into static behaviour ifsafe
Oth Lib i
5/26/2018 99_DSLsWithLLVM
38/40
Other Libraries
LLVM is not the end, when designing a language
Its trivial to call into other libraries with C APIs from
LLVM-generated code. libdispatch gives cheap concurrency
libgc gives garbage collection
libobjc2 gives a dynamic object model
This language is conceptually parallel - why not use libdispatch torun 16x16 blocks concurrently?
FFI Aid d b Cl
5/26/2018 99_DSLsWithLLVM
39/40
FFI Aided by Clang
libclang allows you to easily parse headers.
Can get names, type encodings for functions. No explicit FFI
Pragmatic Smalltalk uses this to provide a C alien: messagessent to C are turned into function calls
5/26/2018 99_DSLsWithLLVM
40/40
Questions?