Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Control Flow Analysis & Local Optimizations


Page 1: Advanced Compiler Techniques

Advanced Compiler Techniques

LIU Xianhua

School of EECS, Peking University

Control Flow Analysis & Local Optimizations

Page 2: Advanced Compiler Techniques

Levels of Optimizations

• Local
  – Inside a basic block
• Global (intraprocedural)
  – Across basic blocks
  – Whole procedure analysis
• Interprocedural
  – Across procedures
  – Whole program analysis

Page 3: Advanced Compiler Techniques

The Golden Rules of Optimization

Premature Optimization is Evil
• Donald Knuth: premature optimization is the root of all evil
• Optimization can introduce new, subtle bugs
• Optimization usually makes code harder to understand and maintain
• Get your code right first, then, if really needed, optimize it
• Document optimizations carefully
• Keep the non-optimized version handy, or even as a comment in your code

Page 4: Advanced Compiler Techniques

The Golden Rules of Optimization

The 80/20 Rule
• In general, 80% of a program's execution time is spent executing 20% of the code
• 90%/10% for performance-hungry programs
• Spend your time optimizing the important 10 to 20% of your program
• Optimize the common case, even at the cost of making the uncommon case slower

Page 5: Advanced Compiler Techniques

The Golden Rules of Optimization

Good Algorithms Rule
• The best and most important way of optimizing a program is using good algorithms
  – E.g. O(n log n) rather than O(n^2)
• However, we still need lower-level optimization to get more out of our programs
• In addition, asymptotic complexity is not always an appropriate metric of efficiency
  – Hidden constants may be misleading
  – E.g. a linear-time algorithm that runs in 100*n + 100 time is slower than a cubic-time algorithm that runs in n^3 + 10 time if the problem size is small
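To make "if the problem size is small" concrete, the following sketch evaluates the two cost formulas quoted on the slide and looks for the first problem size at which the linear-time algorithm wins. The function names are made up for this illustration.

```python
def linear_cost(n):
    """Cost model from the slide: 100*n + 100."""
    return 100 * n + 100


def cubic_cost(n):
    """Cost model from the slide: n^3 + 10."""
    return n ** 3 + 10


def crossover(limit=1000):
    """Smallest n at which the linear-time algorithm becomes cheaper."""
    for n in range(1, limit + 1):
        if linear_cost(n) < cubic_cost(n):
            return n
    raise ValueError("no crossover found below the limit")


if __name__ == "__main__":
    n = crossover()
    print(n, linear_cost(n), cubic_cost(n))
```

Under these two formulas the cubic algorithm is still cheaper at n = 10 (1010 versus 1100 steps) and loses from n = 11 onward.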

Page 6: Advanced Compiler Techniques

General Optimization Techniques

• Strength reduction
  – Use the fastest version of an operation
  – E.g.
      x >> 2 instead of x / 4
      x << 1 instead of x * 2
• Common subexpression elimination
  – Eliminate redundant calculations
  – E.g.
      double x = d * (lim / max) * sx;
      double y = d * (lim / max) * sy;
    becomes
      double depth = d * (lim / max);
      double x = depth * sx;
      double y = depth * sy;

Page 7: Advanced Compiler Techniques

General Optimization Techniques

• Code motion
  – Invariant expressions should be executed only once
  – E.g.
      for (int i = 0; i < x.length; i++)
        x[i] *= Math.PI * Math.cos(y);
    becomes
      double picosy = Math.PI * Math.cos(y);
      for (int i = 0; i < x.length; i++)
        x[i] *= picosy;

Page 8: Advanced Compiler Techniques

General Optimization Techniques

• Loop unrolling
  – The overhead of the loop control code can be reduced by executing more than one iteration in the body of the loop
  – E.g.
      double picosy = Math.PI * Math.cos(y);
      for (int i = 0; i < x.length; i++)
        x[i] *= picosy;
    becomes
      double picosy = Math.PI * Math.cos(y);
      int i = 0;
      for (; i + 1 < x.length; i += 2) {
        x[i] *= picosy;
        x[i+1] *= picosy;
      }
      if (i < x.length) x[i] *= picosy;  // leftover element when x.length is odd

Page 9: Advanced Compiler Techniques

Compiler Optimizations

• Compilers try to generate good code
  – i.e. fast
• Code improvement is challenging
  – Many problems are NP-hard
• Code improvement may slow down the compilation process
  – In some domains, such as just-in-time compilation, compilation speed is critical

Page 10: Advanced Compiler Techniques

Phases of Compilation

• The first three phases are language-dependent
• The last two are machine-dependent
• The middle two depend on neither the language nor the machine

Page 11: Advanced Compiler Techniques

Phases

[Figure: diagram of the phases of compilation.]

Page 12: Advanced Compiler Techniques

Control Flow

• Control transfer = branch (taken or fall-through)
• Control flow
  – Branching behavior of an application
  – What sequences of instructions can be executed
• Execution: dynamic control flow
  – Direction of a particular instance of a branch
  – Predict, speculate, squash, etc.
• Compiler: static control flow
  – Not executing the program
  – Input not known, so what could happen
• Control flow analysis
  – Determining properties of the program branch structure
  – Determining instruction execution properties

Page 13: Advanced Compiler Techniques

Basic Blocks

• A basic block is a maximal sequence of consecutive three-address instructions with the following properties:
  – The flow of control can only enter the basic block through the first instruction in the block (no jumps into the middle of the block).
  – Control will leave the block without halting or branching, except possibly at the last instruction in the block.
• Basic blocks become the nodes of a flow graph, with edges indicating the order.

Page 14: Advanced Compiler Techniques

Examples

Source:
    for i from 1 to 10 do
      for j from 1 to 10 do
        a[i,j] = 0.0
    for i from 1 to 10 do
      a[i,i] = 1.0

Three-address code:
     1) i = 1
     2) j = 1
     3) t1 = 10 * i
     4) t2 = t1 + j
     5) t3 = 8 * t2
     6) t4 = t3 - 88
     7) a[t4] = 0.0
     8) j = j + 1
     9) if j <= 10 goto (3)
    10) i = i + 1
    11) if i <= 10 goto (2)
    12) i = 1
    13) t5 = i - 1
    14) t6 = 88 * t5
    15) a[t6] = 1.0
    16) i = i + 1
    17) if i <= 10 goto (13)

Page 15: Advanced Compiler Techniques

Identifying Basic Blocks

• Input: sequence of instructions instr(i)
• Output: a list of basic blocks
• Method:
  – Identify leaders: the first instruction of each basic block
  – Iterate: add subsequent instructions to the basic block until we reach another leader

Page 16: Advanced Compiler Techniques

Identifying Leaders

• Rules for finding leaders in code
  – The first instruction in the code is a leader
  – Any instruction that is the target of a (conditional or unconditional) jump is a leader
  – Any instruction that immediately follows a (conditional or unconditional) jump is a leader

Page 17: Advanced Compiler Techniques

Basic Block Partition Algorithm

    leaders = {1}                          // start of program
    for i = 1 to |n|                       // all instructions
        if instr(i) is a branch
            leaders = leaders U targets of instr(i) U {i+1}

    worklist = leaders
    while worklist not empty
        x = first instruction in worklist
        worklist = worklist – {x}
        block(x) = {x}
        for (i = x + 1; i <= |n| && i not in leaders; i++)
            block(x) = block(x) U {i}
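The partition algorithm above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes instructions are numbered from 1 and that the only information needed about the program is a map from each branch instruction to its jump target. The names find_leaders, partition, and EXAMPLE_BRANCHES are hypothetical.

```python
def find_leaders(targets, n):
    """targets: branch instruction number -> jump target; n: instruction count."""
    leaders = {1}                      # the first instruction is a leader
    for branch, target in targets.items():
        leaders.add(target)            # the target of a jump is a leader
        if branch + 1 <= n:
            leaders.add(branch + 1)    # so is the instruction after a jump
    return sorted(leaders)


def partition(targets, n):
    """Return the basic blocks as lists of instruction numbers."""
    leaders = find_leaders(targets, n)
    blocks = []
    for k, start in enumerate(leaders):
        # A block runs from its leader up to just before the next leader.
        end = leaders[k + 1] - 1 if k + 1 < len(leaders) else n
        blocks.append(list(range(start, end + 1)))
    return blocks


# Branches of the 17-instruction example: 9 -> 3, 11 -> 2, 17 -> 13.
EXAMPLE_BRANCHES = {9: 3, 11: 2, 17: 13}

if __name__ == "__main__":
    for block in partition(EXAMPLE_BRANCHES, 17):
        print(block[0], "to", block[-1])
```

On the 17-instruction example this yields six blocks starting at instructions 1, 2, 3, 10, 12, and 13.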

Page 18: Advanced Compiler Techniques

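Basic Block Example

     1. i = 1                    <- leader (first instruction)
     2. j = 1                    <- leader (target of 11)
     3. t1 = 10 * i              <- leader (target of 9)
     4. t2 = t1 + j
     5. t3 = 8 * t2
     6. t4 = t3 - 88
     7. a[t4] = 0.0
     8. j = j + 1
     9. if j <= 10 goto (3)
    10. i = i + 1                <- leader (follows 9)
    11. if i <= 10 goto (2)
    12. i = 1                    <- leader (follows 11)
    13. t5 = i - 1               <- leader (target of 17)
    14. t6 = 88 * t5
    15. a[t6] = 1.0
    16. i = i + 1
    17. if i <= 10 goto (13)

Basic blocks: {1}, {2}, {3–9}, {10–11}, {12}, {13–17} (drawn as boxes A to F in the original figure).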

Page 19: Advanced Compiler Techniques

Control-Flow Graphs

• Control-flow graph:
  – Node: an instruction or sequence of instructions (a basic block)
    • Two instructions i, j are in the same basic block iff execution of i guarantees execution of j
  – Directed edge: potential flow of control
  – Distinguished nodes Entry & Exit
    • First & last instruction in the program

Page 20: Advanced Compiler Techniques

Control-Flow Edges

• Basic blocks = nodes
• Edges:
  – Add a directed edge between P and S if:
    • There is a jump/branch from the last statement of P to the first statement of S, or
    • S immediately follows P in program order and P does not end with an unconditional branch (goto/return/call)
  – Definition of predecessor and successor
    • P is a predecessor of S
    • S is a successor of P

Page 21: Advanced Compiler Techniques

Control-Flow Edge Algorithm

Input: block(i), sequence of basic blocks
Output: CFG where nodes are basic blocks

    for i = 1 to the number of blocks
        x = last instruction of block(i)
        if instr(x) is a branch/jump
            for each target y of instr(x)
                create edge (i -> block that starts at y)
        if instr(x) is not an unconditional branch
            create edge (i -> i+1)
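A small Python sketch of the edge algorithm, under the same assumptions as the pseudocode: blocks are given in program order, and the only branch information is a map from branch instruction to target. Blocks are identified by their 0-based position, Entry/Exit edges are left out, and the names build_cfg, BLOCKS, and BRANCHES are hypothetical.

```python
def build_cfg(blocks, targets, unconditional=frozenset()):
    """blocks: lists of instruction numbers, in program order.
    targets: branch instruction number -> target instruction number.
    unconditional: branch instructions that never fall through."""
    block_of = {instr: b for b, block in enumerate(blocks) for instr in block}
    edges = set()
    for b, block in enumerate(blocks):
        last = block[-1]
        if last in targets:
            # Edge to the block that contains the jump target.
            edges.add((b, block_of[targets[last]]))
        if last not in unconditional and b + 1 < len(blocks):
            # Fall-through edge to the next block in program order.
            edges.add((b, b + 1))
    return sorted(edges)


# Basic blocks and (all conditional) branches of the 17-instruction example.
BLOCKS = [[1], [2], list(range(3, 10)), [10, 11], [12], list(range(13, 18))]
BRANCHES = {9: 3, 11: 2, 17: 13}

if __name__ == "__main__":
    for src, dst in build_cfg(BLOCKS, BRANCHES):
        print(src, "->", dst)
```

The self-edges on blocks 2 and 5 are the two single-block loops of the example; the edge from block 3 back to block 1 closes the outer loop.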

Page 22: Advanced Compiler Techniques

Dominator

• Defn: Dominator. Given a CFG(V, E, Entry, Exit), a node x dominates a node y if every path from the Entry block to y contains x
• In the reverse direction, node x post-dominates block y if every path from y to the Exit has to pass through block x
• Some properties of dominators:
  – Reflexivity, transitivity, anti-symmetry
  – If x dominates z and y dominates z, then either x dominates y or y dominates x
• Intuition
  – Given some BB, which blocks are guaranteed to have executed prior to executing the BB
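One common way to compute dominator sets is the iterative fixed-point scheme dom(n) = {n} ∪ (intersection of dom(p) over all predecessors p of n). The slides do not give an algorithm, so the sketch below is an assumption about how one might do it, and the edge list of the sample graph is hypothetical as well.

```python
def dominators(succ, entry):
    """succ: node -> list of successor nodes (every node must be a key)."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes - {entry}):
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# Hypothetical five-node CFG with entry node 1.
SAMPLE_CFG = {1: [2, 4], 2: [3], 3: [2, 5], 4: [5], 5: [1]}

if __name__ == "__main__":
    for node, ds in sorted(dominators(SAMPLE_CFG, 1).items()):
        print(node, sorted(ds))
```

Node 5 is reachable both through 3 and through 4, so only the entry node dominates it.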

Page 23: Advanced Compiler Techniques

Dominator Tree

• A block x immediately dominates block y if x dominates y, x ≠ y, and there is no intervening block P (other than x and y) such that x dominates P and P dominates y. In other words, x is the last dominator on all paths from entry to y. Each block other than the entry has a unique immediate dominator.
• A dominator tree is a tree where each node's children are those nodes it immediately dominates. Because the immediate dominator is unique, it is a tree. The start node is the root of the tree.

[Figure: a five-node CFG and its dominator tree. The dominator sets shown are {1} for node 1, {1,2} for node 2, {1,2,3} for node 3, {1,4} for node 4, and {1,5} for node 5.]

Page 24: Advanced Compiler Techniques

Loops

• Loops come from
  – while, do-while, for, goto, …
• Many transformations depend on loops
• Back edge: an edge is a back edge if its head (the node it points to) dominates its tail (the node it leaves)
• Loop definition: a set of nodes L in a CFG is a loop if
  1. There is a node called the loop entry: no other node in L has a predecessor outside L.
  2. Every node in L has a nonempty path (within L) to the entry of L.
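The back-edge test can be sketched directly from the definition. The second function builds the natural loop of a back edge, which is the usual textbook construction (all nodes that can reach the tail without passing through the head) and not something stated on the slide. The graph and its dominator sets are hypothetical sample data.

```python
def back_edges(succ, dom):
    """Edges tail -> head whose head dominates their tail."""
    return sorted((t, h) for t, heads in succ.items() for h in heads if h in dom[t])


def natural_loop(succ, tail, head):
    """Head, tail, and every node that reaches the tail without passing the head."""
    preds = {}
    for n, targets in succ.items():
        for t in targets:
            preds.setdefault(t, set()).add(n)
    loop = {head, tail}
    stack = [tail] if tail != head else []
    while stack:
        n = stack.pop()
        for p in preds.get(n, ()):
            if p not in loop:       # the head is already in the set, so the
                loop.add(p)         # backward walk never goes past it
                stack.append(p)
    return loop


# Hypothetical five-node CFG (entry 1) and its dominator sets.
SAMPLE_CFG = {1: [2, 4], 2: [3], 3: [2, 5], 4: [5], 5: [1]}
SAMPLE_DOM = {1: {1}, 2: {1, 2}, 3: {1, 2, 3}, 4: {1, 4}, 5: {1, 5}}

if __name__ == "__main__":
    for tail, head in back_edges(SAMPLE_CFG, SAMPLE_DOM):
        print((tail, head), sorted(natural_loop(SAMPLE_CFG, tail, head)))
```

Both loops found here satisfy the slide's definition: the head is the only node with a predecessor outside the loop, and every node in the loop can reach it.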

Page 25: Advanced Compiler Techniques

Example: Back Edges

[Figure: two panels, CFG (Control Flow Graph) and DAG (Directed Acyclic Graph), each showing nodes 1 to 5 annotated with the dominator sets {1}, {1,2}, {1,2,3}, {1,4}, and {1,5}.]

Page 26: Advanced Compiler Techniques

Loop Examples

• {B3}
• {B6}
• {B2, B3, B4}

Page 27: Advanced Compiler Techniques

Identifying Loops

• Motivation
  – Loops account for the majority of runtime: focus optimization on loop bodies!
    • Remove redundant code, replace expensive operations => speed up the program
• Finding loops:
  – Easy…
      for i = 1 to 1000
        for j = 1 to 1000
          for k = 1 to 1000
            do something
  – …or harder (GOTOs)
          i = 1; j = 1; k = 1;
      A1: if i > 1000 goto L1;
      A2: if j > 1000 goto L2;
      A3: if k > 1000 goto L3;
          do something
          k = k + 1; goto A3;
      L3: j = j + 1; goto A2;
      L2: i = i + 1; goto A1;
      L1: halt

Page 28: Advanced Compiler Techniques

Interval Analysis (T1/T2 Transformations)

• T1 transformation: remove an edge from a node to itself (a self-loop)
• T2 transformation: merge a node that has a single predecessor into that predecessor

[Figure sequence, slides 28 to 32: the T1 and T2 transformations are illustrated and then applied step by step (T2, T2, T1, T2, T1) to the five-node example CFG until it collapses into the single node 12345.]

Page 33: Advanced Compiler Techniques

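Structure Analysis

    No.  Static feature    Description
    1    SS_No.            Unique identifier of the typical substructure
    2    Edge_No.          Unique identifier of a control-flow edge within the typical substructure
    3    I_last_of_head    Opcode of the last instruction in the edge's head basic block
    4    Br_direction      Branch direction of the last instruction in the edge's head basic block
    5    I_pre_last        Opcode of the instruction preceding the last instruction in the edge's head basic block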

Page 34: Advanced Compiler Techniques

Weighted CFG

• Profiling: run the application on one or more sample inputs and record some behavior
  – Control flow profiling
    • Edge profile
    • Block profile
  – Path profiling
  – Cache profiling
  – Memory dependence profiling
• Annotate the control flow profile onto a CFG => weighted CFG
• Optimize more effectively with profile info!
  – Optimize for the common case
  – Make educated guesses

[Figure: weighted CFG with nodes Entry, BB1 to BB7, and Exit; the edges carry profile counts of 20, 10, or 0.]

Page 35: Advanced Compiler Techniques

“Advanced Compiler Techniques”

Local Optimization

• Optimization of basic blocks

• §8.5

35

Page 36: Advanced Compiler Techniques

Transformations on basic blocks

• Eliminating local common subexpressions
• Eliminating dead code
• Reordering statements that do not depend on one another
• Applying algebraic laws to reorder operands of three-address instructions

• All of the above require symbolic execution of the basic block, to obtain def/use information

Page 37: Advanced Compiler Techniques

Simple symbolic interpretation: next-use information

• If x is computed in statement i and is an operand of statement j, j > i, its value must be preserved (in a register or in memory) until j.
• If x is computed again at k, k > i, with no use of x in between, the value computed at i has no further use and can be discarded (i.e. its register can be reused).
• Next-use information is annotated over statements and the symbol table.
• It is computed in one backward pass over the statements.

Page 38: Advanced Compiler Techniques

Next-Use Information

• Definitions
  1. Statement i assigns a value to x;
  2. Statement j has x as an operand;
  3. Control can flow from i to j along a path with no intervening assignments to x;

  Then statement j uses the value of x computed at statement i,
  i.e., x is live at statement i.

Page 39: Advanced Compiler Techniques

Computing next-use

• Use the symbol table to annotate the status of variables
• Each operand in a statement carries additional information:
  – Operand liveness (boolean)
  – Operand next use (later statement)
• On exit from the block, all temporaries are dead (no next-use)

Page 40: Advanced Compiler Techniques

Algorithm

• INPUT: a basic block B
• OUTPUT: at each statement i: x = y op z in B, liveness and next-use information for x, y, z
• METHOD: for each statement i: x = y op z in B, scanning backward
  1. Attach to statement i the liveness and next-use information currently in the symbol table for x, y, z
  2. Set x to "not live" and "no next-use"
  3. Set y, z to "live" and the next uses of y, z to i
• Note: steps 2 and 3 cannot be interchanged.
  – E.g., x = x + y

Page 41: Advanced Compiler Techniques

Example

    1. x = 1
    2. y = 1
    3. x = x + y
    4. z = y
    5. x = y + z

Symbol table after processing each statement in the backward pass:

    Point   x              y              z
    Exit    live, 6        not live       not live
    5       not live, no   live, 5        live, 5
    4       not live, no   live, 4        not live, no
    3       live, 3        live, 3        not live, no
    2       live, 3        not live, no   not live, no
    1       not live, no   not live, no   not live, no
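The backward pass can be sketched as follows. This is an illustrative encoding, not the course's implementation: each statement is a (destination, operand list) pair, constants are simply left out of the operand list, and "not live, no next use" is represented by None. The function returns the symbol-table state after each statement has been processed; attaching the old table contents to the statement (step 1) is omitted to keep the sketch short.

```python
def next_use(block, live_on_exit):
    """block: list of (dest, operands), statements numbered from 1.
    live_on_exit: variable -> position of its next use after the block."""
    variables = {d for d, _ in block} | {v for _, ops in block for v in ops}
    # Symbol table: variable -> next use, or None for "not live".
    table = {v: live_on_exit.get(v) for v in variables}
    snapshots = {}
    for i in range(len(block), 0, -1):      # one backward pass
        dest, operands = block[i - 1]
        table[dest] = None                  # step 2: the definition kills dest
        for v in operands:                  # step 3: operands are used at i
            table[v] = i
        snapshots[i] = dict(table)
    return snapshots


# The five-statement example; x is live on exit with next use 6.
EXAMPLE_BLOCK = [
    ("x", []),          # 1. x = 1
    ("y", []),          # 2. y = 1
    ("x", ["x", "y"]),  # 3. x = x + y
    ("z", ["y"]),       # 4. z = y
    ("x", ["y", "z"]),  # 5. x = y + z
]

if __name__ == "__main__":
    result = next_use(EXAMPLE_BLOCK, {"x": 6})
    for i in sorted(result, reverse=True):
        print(i, result[i])
```

Statement 3 shows why the order of steps 2 and 3 matters: x is both defined and used there, and running step 3 last leaves x live with next use 3.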

Page 42: Advanced Compiler Techniques

Computing dependencies in BB: the DAG

• Use a directed acyclic graph (DAG) to recognize common subexpressions and remove redundant quadruples.
• Intermediate code optimization:
  – basic block => DAG => improved block => assembly
• Leaves are labeled with identifiers and constants.
• Internal nodes are labeled with operators and identifiers.

Page 43: Advanced Compiler Techniques

DAG Representation of Basic Blocks

Construct a DAG for a basic block:

1. There is a node in the DAG for each of the initial values of the variables appearing in the basic block.
2. There is a node N associated with each statement s within the block. The children of N are those nodes corresponding to statements that are the last definitions, prior to s, of the operands used by s.
3. Node N is labeled by the operator applied at s, and also attached to N is the list of variables for which it is the last definition within the block.
4. Certain nodes are designated output nodes. These are the nodes whose variables are live on exit from the block; that is, their values may be used later, in another block of the flow graph.

Page 44: Advanced Compiler Techniques

DAG construction

• Forward pass over the basic block
• For x = y op z;
  – Find the node labeled y, or create one
  – Find the node labeled z, or create one
  – Create a new node for op, or find an existing one with descendants y, z (need hash scheme)
  – Add x to the list of labels for the new node
  – Remove label x from the node on which it appeared
• For x = y;
  – Add x to the list of labels of the node which currently holds y

Example:
    a = b + c
    b = a – d
    c = b + c
    d = a - d

[Figure: DAG for the example, with leaves b0, c0, and d0; a "+" node labeled a over b0 and c0; a "–" node labeled b, d over a and d0; and a "+" node labeled c over the b, d node and c0.]
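A compact way to experiment with the construction is to give every node an integer id and keep a dictionary from (operator, left id, right id) to node id as the hash scheme. The sketch below handles only statements of the form x = y op z; the data layout and the names build_dag and EXAMPLE are choices made for this illustration.

```python
def build_dag(block):
    """block: list of (dest, op, left, right) for statements x = y op z.
    Returns (nodes, current): nodes[i] describes node i, and current maps
    each variable to the node that holds its latest value."""
    nodes = []      # node id -> ("leaf", name) or (op, left_id, right_id)
    index = {}      # node description -> node id (the hash scheme)
    current = {}    # variable -> id of the node it is attached to

    def node_for(key):
        if key not in index:            # reuse an identical node if one exists
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def operand(name):
        if name not in current:         # first use: leaf for the initial value
            current[name] = node_for(("leaf", name + "0"))
        return current[name]

    for dest, op, left, right in block:
        l, r = operand(left), operand(right)
        # Rebinding dest also removes it from the node it was attached to.
        current[dest] = node_for((op, l, r))
    return nodes, current


EXAMPLE = [
    ("a", "+", "b", "c"),
    ("b", "-", "a", "d"),
    ("c", "+", "b", "c"),
    ("d", "-", "a", "d"),
]

if __name__ == "__main__":
    nodes, current = build_dag(EXAMPLE)
    for i, node in enumerate(nodes):
        labels = sorted(v for v, n in current.items() if n == i)
        print(i, node, labels)
```

For the four-statement example, b and d end up attached to the same "-" node, which is exactly the common subexpression a - d.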

Page 45: Advanced Compiler Techniques

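Finding Local Common Subexpr.

Original block:
    a = b + c
    b = a – d
    c = b + c
    d = a - d

[Figure: the DAG of the block, with leaves b0, c0, d0, a "+" node labeled a, a "-" node labeled b, d, and a "+" node labeled c.]

• Suppose b is not live on exit:
    a = b + c
    d = a – d
    c = d + c

• If b is live on exit:
    a = b + c
    d = a – d
    b = d
    c = d + c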

Page 46: Advanced Compiler Techniques

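LCS: another example

    a = b + c
    b = b – d
    c = c + d
    e = b + c

[Figure: DAG with leaves b0, c0, d0; a "+" node labeled a over b0 and c0; a "-" node labeled b over b0 and d0; a "+" node labeled c over c0 and d0; and a "+" node labeled e over the b and c nodes.]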

Page 47: Advanced Compiler Techniques

Dead Code Elimination

• Delete any root that has no live variables attached
• Repeated application of this transformation will remove all nodes from the DAG that correspond to dead code.

    a = b + c
    b = b – d
    c = c + d
    e = b + c

[Figure: the DAG of the block, with leaves b0, c0, d0 and nodes labeled a, b, c, and e.]

On exit: a, b live; c, e not live. The block reduces to:

    a = b + c
    b = b – d

Page 48: Advanced Compiler Techniques

The Use of Algebraic Identities

• Eliminate computations
• Reduction in strength
• Constant folding
  – 2*3.14 = 6.28 evaluated at compile time
• Other algebraic transformations
  – x*y => y*x
  – x>y => x-y>0
  – a=b+c; e=c+d+b; => a=b+c; e=a+d;

Page 49: Advanced Compiler Techniques

Representation of Array References

    x = a[i]
    a[j] = y
    z = a[i]

• Can z = a[i] be replaced by z = x? Not safely: j may equal i, so a[j] = y kills the node for x = a[i].

Page 50: Advanced Compiler Techniques

Representation of Array References

    b = a + 12
    x = b[i]
    b[j] = y

• a is an array. b is a position in the array a.
• x is killed by b[j] = y.

Page 51: Advanced Compiler Techniques

Pointer Assign. & Proc. Calls

• Problem of the following assignments
      x = *p
      *q = y
  – We do not know what p or q point to.
  – x = *p is a use of every variable.
  – *q = y is a possible assignment to every variable.
  – The operator =* must take all nodes that are currently associated with identifiers as arguments, which is relevant for dead-code elimination.
  – The *= operator kills all other nodes so far constructed in the DAG.
  – Global pointer analyses can be used to limit the set of variables.
• Procedure calls behave much like assignments through pointers.
  – Assume that a procedure uses and changes any data to which it has access.
  – If variable x is in the scope of a procedure P, a call to P both uses the node with attached variable x and kills that node.

Page 52: Advanced Compiler Techniques

Reassembling BBs From DAGs

[Figure: two reassembled versions of a block, one for the case where b is not live on exit and one for the case where b is live on exit.]

Page 53: Advanced Compiler Techniques

Reassembling BBs From DAGs

The rules of reassembling:
• The order of instructions must respect the order of nodes in the DAG.
• Assignments to an array must follow all previous assignments to, or evaluations from, the same array.
• Evaluations of array elements must follow any previous assignments to the same array.
• Any use of a variable must follow all previous procedure calls or indirect assignments through a pointer.
• Any procedure call or indirect assignment through a pointer must follow all previous evaluations of any variable.

Page 54: Advanced Compiler Techniques

Peephole Optimization

• Dragon §8.7

• Introduction to peephole
• Common techniques
• Algebraic identities
• An example

Page 55: Advanced Compiler Techniques

Peephole Optimization

• Simple compilers do not perform machine-independent code improvement
  – They generate naive code
• It is possible to take the target code and optimize it
  – Sub-optimal sequences of instructions that match an optimization pattern are transformed into optimal sequences of instructions
  – This technique is known as peephole optimization
  – Peephole optimization usually works by sliding a window of several instructions (a peephole) over the code

Page 56: Advanced Compiler Techniques

Peephole Optimization

Goals:
  - improve performance
  - reduce memory footprint
  - reduce code size

Method:
  1. Examine short sequences of target instructions
  2. Replace the sequence by a more efficient one
     • redundant-instruction elimination
     • algebraic simplifications
     • flow-of-control optimizations
     • use of machine idioms

Page 57: Advanced Compiler Techniques

Peephole Optimization: Common Techniques

[Slides 57 to 60 carry this title over figures that did not survive extraction.]

Page 61: Advanced Compiler Techniques

Algebraic identities

• Worth recognizing single instructions with a constant operand
  – Eliminate computations
    • A * 1 = A
    • A * 0 = 0
    • A / 1 = A
  – Reduce strength
    • A * 2 = A + A
    • A / 2 = A * 0.5
  – Constant folding
    • 2 * 3.14 = 6.28
• More delicate with floating-point

Page 62: Advanced Compiler Techniques

Is this ever helpful?

• Why would anyone write X * 1?
• Why bother to correct such obvious junk code?
• In fact one might write
  – #define MAX_TASKS 1
    ...
    a = b * MAX_TASKS;
• Also, seemingly redundant code can be produced by other optimizations.
  – This is an important effect.

Page 63: Advanced Compiler Techniques

Replace Multiply by Shift

• A := A * 4;
  – Can be replaced by a 2-bit left shift (signed/unsigned)
  – But must worry about overflow if the language does
• A := A / 4;
  – If unsigned, can replace with shift right
  – But arithmetic shift right of signed values is a well-known problem
  – Language may allow it anyway (traditional C)

Page 64: Advanced Compiler Techniques

The Right Shift Problem

• Arithmetic right shift: shift right and use the sign bit to fill the most significant bits
      -5     111111...1111111011
      SAR    111111...1111111101
  which is -3, not -2
• In most languages, -5/2 = -2
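The mismatch is easy to reproduce in Python, whose >> on integers behaves like an arithmetic shift (it rounds toward negative infinity). The sketch compares it with division that rounds toward zero, as in C99 or Java. The bias trick in shift_div_pow2 is a common compiler fix that is not taken from the slides, and all function names are made up for this illustration.

```python
def truncating_div(a, b):
    """Integer division that rounds toward zero, as in C99 and Java."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def shift_div_pow2(a, k):
    """Divide by 2**k with an arithmetic shift, corrected for negatives:
    adding 2**k - 1 first makes the shift round toward zero."""
    bias = (1 << k) - 1 if a < 0 else 0
    return (a + bias) >> k


if __name__ == "__main__":
    print(-5 >> 1)                  # plain arithmetic shift
    print(truncating_div(-5, 2))    # what the language-level division gives
    print(shift_div_pow2(-5, 1))    # shift with the bias correction
```

For non-negative operands the plain shift and the division already agree, which is why the unsigned case on the previous slide is safe.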

Page 65: Advanced Compiler Techniques

Addition chains for multiplication

• If multiply is very slow (or on a machine with no multiply instruction, like the original SPARC), decomposing a constant operand into a sum of powers of two can be effective:
      x * 125 = x * 128 - x * 4 + x
  – Two shifts, one subtract, and one add, which may be faster than one multiply
• Note the similarity with the efficient exponentiation method
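The decomposition from the slide can be checked mechanically. The first function below is the slide's x * 125 written with shifts; the second is a generic shift-and-add multiplier for non-negative constants, added here only to show the parallel with square-and-multiply exponentiation. Both names are hypothetical.

```python
def times_125(x):
    """x * 125 as x*128 - x*4 + x: two shifts, one subtract, one add."""
    return (x << 7) - (x << 2) + x


def shift_add_multiply(x, c):
    """Multiply x by a non-negative constant c, one shifted add per set bit of c."""
    result = 0
    shift = 0
    while c:
        if c & 1:
            result += x << shift
        c >>= 1
        shift += 1
    return result


if __name__ == "__main__":
    print(times_125(7), shift_add_multiply(7, 125))
```

The generic version needs six adds for 125 (binary 1111101), so allowing a subtraction, as the slide does, gives a noticeably shorter chain.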

Page 66: Advanced Compiler Techniques

Flow-of-control optimizations

• Jump to an unconditional jump:
          goto L1
          . . .
      L1: goto L2
  becomes
          goto L2
          . . .
      L1: goto L2

• Conditional jump to an unconditional jump:
          if a < b goto L1
          . . .
      L1: goto L2
  becomes
          if a < b goto L2
          . . .
      L1: goto L2

• Unconditional jump to a conditional jump:
          goto L1
          . . .
      L1: if a < b goto L2
      L3:
  becomes
          if a < b goto L2
          goto L3
          . . .
      L3:
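The first two patterns amount to retargeting any jump whose destination label is immediately followed by an unconditional goto. A minimal sketch over a made-up tuple encoding of the intermediate code is shown below; the encoding and the name thread_jumps are assumptions for illustration only, and the third pattern is not handled.

```python
def thread_jumps(code):
    """code: list of ("label", name), ("goto", target),
    ("if", condition, target) or ("op", text) tuples."""
    # Labels that are immediately followed by `goto X` forward to X.
    forward = {}
    for cur, nxt in zip(code, code[1:]):
        if cur[0] == "label" and nxt[0] == "goto":
            forward[cur[1]] = nxt[1]

    def resolve(label):
        seen = set()                 # guards against cycles such as L1: goto L1
        while label in forward and label not in seen:
            seen.add(label)
            label = forward[label]
        return label

    out = []
    for ins in code:
        if ins[0] == "goto":
            out.append(("goto", resolve(ins[1])))
        elif ins[0] == "if":
            out.append(("if", ins[1], resolve(ins[2])))
        else:
            out.append(ins)
    return out


BEFORE = [("if", "a < b", "L1"), ("op", "..."), ("label", "L1"), ("goto", "L2")]

if __name__ == "__main__":
    for ins in thread_jumps(BEFORE):
        print(ins)
```

Once no jump targets L1 any more, the statement L1: goto L2 can itself be removed if it is unreachable, which is a separate pass not shown here.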

Page 67: Advanced Compiler Techniques

Peephole Opt: an Example

Source Code:
      debug = 0
      . . .
      if (debug) {
        print debugging information
      }

Intermediate Code:
          debug = 0
          . . .
          if debug = 1 goto L1
          goto L2
      L1: print debugging information
      L2:

Page 68: Advanced Compiler Techniques

Eliminate Jump after Jump

Before:
          debug = 0
          . . .
          if debug = 1 goto L1
          goto L2
      L1: print debugging information
      L2:

After:
          debug = 0
          . . .
          if debug != 1 goto L2
          print debugging information
      L2:

Page 69: Advanced Compiler Techniques

Constant Propagation

Before:
          debug = 0
          . . .
          if debug != 1 goto L2
          print debugging information
      L2:

After:
          debug = 0
          . . .
          if 0 != 1 goto L2
          print debugging information
      L2:

Page 70: Advanced Compiler Techniques

Unreachable Code (dead code elimination)

Before:
          debug = 0
          . . .
          if 0 != 1 goto L2
          print debugging information
      L2:

After:
          debug = 0
          . . .

Page 71: Advanced Compiler Techniques

Peephole Optimization Summary

• Peephole optimization is very fast
  – Small overhead per instruction, since it uses a small, fixed-size window
• It is often easier to generate naïve code and run peephole optimization than to generate good code!

Page 72: Advanced Compiler Techniques

Summary

• Introduction to optimization
• Control Flow Analysis
• Basic knowledge
  – Basic blocks
  – Control-flow graphs
• Local Optimizations
  – Peephole optimizations

Page 73: Advanced Compiler Techniques

HW & Next Time

• Homework
  – EX 8.4.1, 8.5.1, 8.5.2
• Next Time: Dataflow analysis
  – Dragon §9.2

Page 74: Advanced Compiler Techniques

If You Want to Get Started …

• Go to http://llvm.org
• Download and install LLVM on your favorite Linux box
  – Read the installation instructions to help you
  – Will need gcc 4.x
• Try to run it on a simple C program

Page 75: Advanced Compiler Techniques

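Compiler support: analysis of loops and function calls

• Automatically identify loop and function-call structures
• Analyze the correlations among long-distance branch instructions
• Insert guide instructions

[Figure: C/C++ source code is compiled into a binary program. Panel "Loop structure transformation and guide-instruction insertion": a loop over BB1 to BB4 with a header, pred-branches, and succ-branches; the transformed version adds preheader and tail blocks and the instructions guide_save_history and guide_restore_history. Panel "Function structure transformation and guide-instruction insertion": the "call proc" in BB1, between pred-branches and succ-branches, is preceded by guide_save_history and followed by guide_restore_history.]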

Page 76: Advanced Compiler Techniques

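Compiler Support for PSA Prediction

Compiler support:
• First, using subroutine structure information and the static data dependence graph, the compiler analyzes which data values are strongly correlated with indirect branch instructions
• Then, the compiler inserts a guide instruction for each of these instructions and schedules it

Switch-case statements use the normalized case value as the correlated data value.

(a) source code
      switch(k){
        case 0...3: ...   //action 1
        case 4...7: ...   //action 2
        case 8...11: ...  //action 3
      }

(b) assembler code
      r <- k            ; C
      r <- k/4          ; R
      r <- [GP + sl]    ; G
      jump r            ; J
      L1: …             ;; action 1
      L2: …             ;; action 2

(c) Execution of switch-case statement
      [Figure: basic c_value (C), normalized c_value (R), r_value looked up in ctable (G), jump (J) to case block 1, 2, 3, ...]

Function pointer calls use the pointer value or the pointer-array index as the correlated data value.

(a) source & assembler code
      ;; (obj->func)();
      r <- o_addr       ; O
      r <- [r + f_off]  ; F
      jump r            ; C

(b) Execution of function pointer call
      [Figure: object obj (O), func pointer obj->func (F), call (C)]

Virtual function calls use the start address of the virtual function table as the correlated data value.

(a) class hierarchy
      Base with +f(), Derive with +f()

(b) source & assembler code
      ;; Base d <= new Derive();
      ;; d.f();
      r <- o_addr       ; O
      r <- [r + v_off]  ; V
      r <- [r + f_off]  ; F
      jump r            ; C

(c) Execution of virtual function call
      [Figure: Obj d (O), V_fun_addr in the vtable (V), entries V_fun1, V_fun2, V_fun3, ... selecting D.f (F), call (C)]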