18
/ 17 LLVM Pass and Code Instrumentation Prof. Moonzoo Kim CS Dept., KAIST 08/23/202 2 CS453 Automated Software Testing

LLVM Pass and Code Instrumentation

Embed Size (px)

DESCRIPTION

CS453 Automated Software Testing. LLVM Pass and Code Instrumentation. Prof. Moonzoo Kim CS Dept., KAIST. Pass in LLVM. A Pass receives an LLVM IR and performs analyses and/or transformations. Using opt , it is possible to run each Pass. - PowerPoint PPT Presentation

Citation preview

Page 1: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation

Prof. Moonzoo KimCS Dept., KAIST

CS453 Automated Software Testing

Page 2: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 2

Pass in LLVM

• A Pass receives an LLVM IR and performs analyses and/or transformations.– Using opt, it is possible to run each Pass.

• A Pass can be executed in a middle of compiling process from source code to binary code.– The pipeline of Passes is arranged by Pass Manager

C/C++frontend

Pass1 Passn llcIR IR1 … IRnExecutable

codeSource

code

Opt

Clang

Page 3: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 3

LLVM Pass Framework

• The LLVM Pass Framework is the library to manipulate an AST of LLVM IR (http://llvm.org/doxygen/index.html)

• An LLVM Pass is an implementation of a subclass of the Pass class – Each Pass is defined as visitor on a certain type of LLVM AST nodes– There are six subclasses of Pass

• ModulePass: visit each module (file)• CallGraphSCCPass: visit each set of functions with caller-call relations in

a module (useful to draw a call graph)• FunctionPass: visit each function in a module• LoopPass: visit each set of basic blocks of a loop in each function• RegionPass: visit the basic blocks not in any loop in each function• BasicBlockPass: visit each basic block in each function

Page 4: LLVM Pass and Code Instrumentation

/ 1704/19/2023

4

Control Flow Graph (CFG) at LLVM IR int f() { int y; y = (x > 0) ? x : 0 ; return y;}

1 entry: 2 … 3 %0 = load i32* %x 4 %c = icmp sgt i32 %0 0 5 br i1 %c, label %c.t, %c.f

6 c.t: 7 %1 = load i32* %x 8 br label %c.end

9 c.f:10 br label %c.end

11 c.end:12 %cond = phi i32 [%1,%c.t],[0,%c.f]13 store i32 %cond, i32* %y14 return i32 %cond

CFG

2 …3 %0=…4 %c=…5 br i1 %c…

entry:

c.t:

7 %1=load i32* …8 br label %c.end

c.f:

10 br label %c.end

12 %cond=phi13 store …14 return …

c.end:

terminator

terminator

terminatorterminator

Page 5: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 5

Example Pass• Let’s create IntWrite that aim to monitor all history of 32-bit inte-

ger variable updates (definitions)– Implemented as a FunctionPass– Produces a text file where it record which variable is defined as which value

at which code location.

• IntWrite instruments a target program to insert a probe before ev-ery integer writing operation, which extracts runtime information

10 y = x ;11 z = y + x ;

_probe_(10, “y”, x);10 y = x ; _probe_(11, “z”, y+x);11 z = y + x ; … void _probe_(int l,char *,int v){ fprintf(fp, “%d %s %d\n”,…);}

Page 6: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 6

Module Class

• A Module instance stores all information related to the LLVM IR created by a target program file (functions, global variables, etc.)

• APIs (public methods)

– getModuleIdentifier(): return the name of the module

– getFunction(StringRef Name): return the Function in-stance whose identifier is Name in the module

– getOrInsertFunction(StringRef Name, Type *Re-turnType,…): add a new Function instance whose identifier is Name to the module

– getGlobalVariable(StringRef Name): return the Glob-alVariable instance whose identifier is Name in the module

Page 7: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 7

Type Class

• A Type instance is used for representing the data type of reg-isters, variables, and function arguments.

• Static members– Type::getVoidTy(…): void type– Type::getInt8Ty(…): 8-bit unsigned integer (char) type– Type::getInt32Ty(…): 32-bit unsigned integer type– Type::getInt8PtrTy(…): 8-bit pointer type– Type::getDoubleTy(…): 64-bit IEEE floating pointer type

Page 8: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 8

FunctionPass Class (1/2)

• FunctionPass::doInitialization(Module &)– Executed once for a module (file) before any visitor method execution– Do necessary initializations, and modify the given Module instances

(e.g., add a new function declaration)

• FunctionPass::doFinalization(Module &)– Executed once for a module (file) before after all visitor method execu-

tions– Export the information obtained from the analysis or the transforma-

tion, any wrap-up

Page 9: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 9

Example• IntWrite should inserts a new function _init_ at the begin-

ning of the target program’s main function– _init_() is to open an output file

01 virtual bool doInitialization(Module & M) {02 if(M.getFunction(StartingRef(“_init_”))!=NULL){ 03 errs() << “_init_() already exists.” ; 04 exit(1) ;05 }

06 FunctionType *fty = FunctionType::get(Type::getVoidTy(M.getContext()),false) ;07 fp_init_ = M.getOrInsertFunction(“_init_”, fty) ;

...08 return true ;09 }

check if _init_() already exists

add a new declaration _init_()

Page 10: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 10

FunctionPass Class (2/2)

• runOnFunction(Function &)– Executed once for every function defined in the module

• The execution order in different functions is not possible to control.– Read and modify the target function definition

• Function Class– getFunctionType(): returns the FunctionType instance that

contains the information on the types of function arguments.– getEntryBlock(): returns the BasicBlock instance of the en-

try basic block.– begin(): the head of the BasicBlock iterator– end(): the end of the BasicBlock iterator

Page 11: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 11

Example

01 virtual bool runOnFunction(Function &F) {

02 cout << “Analyzing “ << F->getName() << “\n” ;03 for (Function::iterator i = F.begin(); i != F.end(); i++){

04 runOnBasicBlock(*i) ;

05 }

06 return true;//You should return true if F was modified. False otherwise.

07 }

Page 12: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 12

BasicBlock Class

• A BasicBlock instance contains a list of instructions

• APIs– begin(): return the iterator of the beginning of the basic block

– end(): return the iterator of the end of the basic block

– getFirstInsertionPt(): return the first iterator (i.e., the first instruction location) where a new instruction can be added safely (i.e., after phi instruction and debug intrinsic)

– getTerminator(): return the terminator instruction

– splitBasicBlock(iterator I, …): split the basic block into two at the instruction of I.

Page 13: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 13

Instruction Class

• An Instruction instance contains the information of an LLVM IR instruction.

• Each type of instruction has a subclass of Instruction (e.g. LoadInst, BranchInst)

• APIs– getOpcode(): returns the opcode which indicates the instruction

type– getOperand(unsigned i): return the i-th operand– getDebugLoc(): obtain the debugging data that contains the infor-

mation on the corresponding code location– isTerminator(), isBinaryOp() , isCast(), ….

Page 14: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 14

Examplebool runOnBasicBlock(BasicBlock &B) {

for(BasicBlock::iterator i = B.begin(); i != B.end(); i++){

if(i->getOpcode() == Instruction::Store &&

i->getOperand(0)->getType() == Type::getInt32Ty(ctx)){

StoreInst * st = dyn_cast<StoreInst>(i);

int loc = st->getDebugLoc().getLine(); //code location

Value * var = st->getPointerOperand(); //variable

Value * val = st->getOperand(0); // target register

/* insert a function call */

}

}

return true ;

}

01

02

03

04

05

06

07

08

09

10

11

12

13

Page 15: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 15

How to Insert New Instructions

• IBBuilder class provides a uniform API for inserting in-structions to a basic block.– IRBuilder(Instruction *p): create an IRBuilder in-

stance that can insert instructions right before Instruction *p

• APIs– CreateAdd(Value *LHS, Value *RHS, …): create an add

instruction whose operands are LHS and RHS at the predefined loca-tion, and then returns the Value instance of the target operand

– CreateCall(Value *Callee, Value *Arg,…): add a new call instruction to function Callee with the argument as Arg

– CreateSub(), CreateMul(), CreateAnd(), …

Page 16: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 16

Value Class• A Value is a super class of all entities in LLVM IR such as a

constant, a register, a variable, and a function.

• The register defined by an Instruction is represented as a Value instance.

• APIs– getType(): returns the Type instance of a Value instance.– getName(): return the name from the source code.

Page 17: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 17

Example00 if(i->getOpcode() == Instruction::Store &&01 i->getOperand(0)->getType() == Type::getInt32Ty(ctx) {

02 StoreInst * st = dyn_cast<StoreInst>(i);

03 int loc = st->getDebugLoc().getLine(); //code location04 Value * var = st->getPointerOperand(); //variable05 Value * val = st->getOperand(0); // target register

06 IRBuilder<> builder(i) ;07 Value * args[3] ;08 args[0] = ConstantInt::get(intTy, loc, false) ; 09 args[1] = builder.CreateGlobalStringPtr(var->getName(),"");10 args[2] = val ;

11 builder.CreateCall(p_probe, args, Twine("")) ;12 }

Page 18: LLVM Pass and Code Instrumentation

/ 1704/19/2023

LLVM Pass and Code Instrumentation 18

More Information• Writing an LLVM Pass

– http:// llvm.org/docs/WritingAnLLVMPass.html

• LLVM API Documentation– http://llvm.org/doxygen/

• How to Build and Run an LLVM Pass for Homework#4– http://swtv.kaist.ac.kr/courses/s453-14fall/hw4-manual.pdf