
PL&C Lab, DongGuk University: Compiler Lecture Note, Intermediate Language


Chapter 9: Intermediate Language

Introduction to Compilers


Contents

• Introduction

• Polish Notation

• Three Address Code

• Tree Structured Code

• Abstract Machine Code

• Concluding Remarks


Introduction

• Compiler Model

– Front-End (language-dependent part): Source Program → Lexical Analyzer → tokens → Syntax Analyzer → AST → Semantic Analyzer → Intermediate Code Generator

– Back-End (machine-dependent part): Code Optimizer → Target Code Generator → Object Program

– The front-end passes intermediate code (labeled IL and IC in the figure) to the back-end.


• Why an IL is needed

– Modular Construction

– Automatic Construction

– Easy Translation

– Portability

– Optimization

– Bootstrapping

• Classification of ILs

– Polish Notation --- Postfix, IR

– Three Address Code --- Quadruple, Triple, Indirect triple

– Tree Structured Code --- PT, AST, TCOL

– Abstract Machine Code --- P-code, EM-code, U-code, Bytecode


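• Two level Code Generation

Source → Front-End → ILS → ILS-ILT → ILT → Back-End → Target

• ILS

– a form that can be obtained automatically from the source

– source-language dependent and high level

• ILT

– a form that an automated back-end can translate to the target machine very easily

– target-machine dependent and low level

• ILS to ILT

– translating ILS to ILT is the main task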


Polish Notation

☞ The Polish mathematician Łukasiewicz invented the parenthesis-free notation.

• Postfix (Suffix) Polish Notation

– earliest IL

– popular for interpreted languages - SNOBOL, BASIC

– general form:

e1 e2 ... ek OP (k ≥ 1)

where OP : k-ary operator

ei : any postfix expression (1 ≤ i ≤ k)
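The general form says that a postfix expression lists its operands e1 ... ek, each already in postfix, followed by the operator. A minimal Python sketch of that rule, assuming expressions are given as nested tuples `(OP, e1, ..., ek)` with plain strings as leaves (a representation chosen here for illustration, not taken from the lecture):

```python
def to_postfix(expr):
    """Return the postfix token list for a nested-tuple expression.

    A leaf is a plain string (variable or constant); an internal node is
    a tuple (OP, e1, ..., ek). Both conventions are illustrative only.
    """
    if isinstance(expr, str):
        return [expr]
    op, *operands = expr
    tokens = []
    for operand in operands:          # e1 e2 ... ek, each already postfix
        tokens.extend(to_postfix(operand))
    tokens.append(op)                 # the k-ary operator comes last
    return tokens


# (a + b) * (c - d) needs parentheses in infix but none in postfix.
tree = ("*", ("+", "a", "b"), ("-", "c", "d"))
print(" ".join(to_postfix(tree)))     # a b + c d - *
```

Because each operator's arity is fixed, the token order alone determines the grouping, which is why no parentheses are needed.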


– example:

if a then if c-d then a+c else a*c else a+b

⇒ a L1 BZ c d - L2 BZ a c + L3 BR

L2: a c * L3 BR L1: a b + L3:

– notes

1) high level: source to IL - fast & easy translation

IL to target - difficult

2) easy evaluation - operand stack

3) unsuitable for optimization - requires translation to another IL

4) parenthesis-free notation - arithmetic expressions

– well suited to interpretive languages

Source → Translator → Postfix → Evaluator → Result
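Note 2 above (easy evaluation with an operand stack) can be sketched as follows. This is a minimal Python illustration for arithmetic postfix only; the token format, the `env` dictionary of variable values, and the function name are assumptions of the sketch, and the branch operators BZ and BR from the example are not handled.

```python
import operator

BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(tokens, env):
    """Evaluate a postfix token list using an operand stack."""
    stack = []
    for tok in tokens:
        if tok in BINARY_OPS:
            right = stack.pop()       # operands come off in reverse order
            left = stack.pop()
            stack.append(BINARY_OPS[tok](left, right))
        elif tok in env:
            stack.append(env[tok])    # variable: push its value
        else:
            stack.append(float(tok))  # anything else is read as a number
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


env = {"a": 2, "c": 5, "d": 3}
print(eval_postfix("a c +".split(), env))      # 7
print(eval_postfix("a c d - *".split(), env))  # 4
```

Each operand is pushed as it is read and each operator pops exactly as many values as its arity, so one left-to-right pass is enough.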


• Internal Representation (IR)

– low-level prefix Polish notation - addressing structure of the target machine

– compiler-compiler IL - table-driven code generation

– IR program - a sequence of root-level IR expressions

– IR expression:

OP e1 e2 ... ek (k ≥ 1)

where OP : k-ary operator - 1-1 correspondence with a target machine instruction.

┌─ root-level operator - does not appear in an operand ⇒ root-level IR expression.
└─ internal operator - appears in an operand ⇒ internal IR expression.

ei : operand --- single symbol or internal IR expression.


– example

D := E ⇔ := + d r ↑ + e r

where r : local base register

d, e : locations of variables D and E

+ : additive operator

↑ : unary operator giving the value of the location

:= : assignment operator (root-level)

– example

FOR D := E TO F DO Loop body;

Equivalent source-level form:

D := E; TEMP := F; GOTO 2;
1: Loop body
D := D + 1;
2: IF D <= TEMP THEN GOTO 1;

IR:

:= + d r ↑ + e r
:= + temp r ↑ + f r
j L2
:L1 Loop body
:= + d r + ↑ + d r 1
:L2 <= L1 ? ↑ + d r ↑ + temp r
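Prefix IR needs no parentheses because each operator's arity tells the reader how many operands follow. A minimal Python sketch of reading such an expression back into a tree, assuming a hand-written arity table for the three operators in the `D := E` example (the table and the function name are illustrative, not part of the lecture):

```python
# Operator arities for the IR operators named in the lecture's example.
ARITY = {":=": 2, "+": 2, "↑": 1}


def parse_prefix(tokens, pos=0):
    """Parse one prefix expression starting at tokens[pos].

    Returns (tree, next_pos); a tree is a symbol string or a tuple
    (OP, operand, ...).
    """
    tok = tokens[pos]
    pos += 1
    if tok not in ARITY:
        return tok, pos               # single-symbol operand
    operands = []
    for _ in range(ARITY[tok]):       # arity alone tells us how many follow
        operand, pos = parse_prefix(tokens, pos)
        operands.append(operand)
    return (tok, *operands), pos


tree, end = parse_prefix(":= + d r ↑ + e r".split())
print(tree)
# (':=', ('+', 'd', 'r'), ('↑', ('+', 'e', 'r')))
```

The outermost `:=` is the root-level operator; `+` and `↑` appear only inside operands, matching the root-level versus internal distinction made earlier.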


– Notes

1) Shift-reduce parser --- prefix: fewer states than postfix

2) Several addressing modes

┌─ prefix: decided by looking at the operator alone (no backup)
└─ postfix: backup required

ex) assumption: first operand computed in register r.

r.1 ::= (/ d.1 r.2)
r.1 ::= (+ r.1 r.2)

┌ prefix - [r -> / . d r]: first operand changed to d and continue
└ postfix - [r -> . d r /] [r -> . r r +]: shift r, shift r and block ([r -> r r . +]) ⇒ backup

3) Easy translation

IR to target - easy

source to IR - difficult


Three Address Code

• most popular IL; used in optimizing compilers

• General form:

A := B op C

where A : result address
B, C : operand addresses
op : operator

(1) Quadruple - 4-tuple notation <operator>,<operand1>,<operand2>,<result>

(2) Triple - 3-tuple notation <operator>,<operand1>,<operand2>

(3) Indirect triple - execution order table & triples


– example

• A ← B + C * D / E
• F ← C * D

Quadruple        Triple             Indirect Triple
                                    Operations   Triples
* C  D  T1       (1) * C   D        1. (1)       (1) * C   D
/ T1 E  T2       (2) / (1) E        2. (2)       (2) / (1) E
+ B  T2 T3       (3) + B   (2)      3. (3)       (3) + B   (2)
← T3    A        (4) ← A   (3)      4. (4)       (4) ← A   (3)
* C  D  T4       (5) * C   D        5. (1)       (5) ← F   (1)
← T4    F        (6) ← F   (5)      6. (5)
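The quadruple and triple columns of this example can be reproduced mechanically. The sketch below is a minimal Python illustration, assuming binary expressions are given as nested tuples `(op, left, right)` and statements as `(target, expr)` pairs; the function names and the use of `None` for an unused operand are assumptions of the sketch. It does not share the common subexpression C * D, so it corresponds to the plain triple column and not to the indirect-triple one.

```python
def gen_quadruples(statements):
    """Emit (op, arg1, arg2, result) quadruples with fresh temporaries."""
    quads = []
    counter = 0

    def walk(expr):
        nonlocal counter
        if isinstance(expr, str):
            return expr
        op, left, right = expr
        lhs, rhs = walk(left), walk(right)
        counter += 1
        temp = f"T{counter}"
        quads.append((op, lhs, rhs, temp))
        return temp

    for target, expr in statements:
        quads.append(("←", walk(expr), None, target))
    return quads


def gen_triples(statements):
    """Emit (op, arg1, arg2) triples; results are referenced by position."""
    triples = []

    def walk(expr):
        if isinstance(expr, str):
            return expr
        op, left, right = expr
        lhs, rhs = walk(left), walk(right)
        triples.append((op, lhs, rhs))
        return f"({len(triples)})"    # 1-based position of this triple

    for target, expr in statements:
        triples.append(("←", target, walk(expr)))
    return triples


program = [
    ("A", ("+", "B", ("/", ("*", "C", "D"), "E"))),
    ("F", ("*", "C", "D")),
]
for quad in gen_quadruples(program):
    print(quad)
for number, triple in enumerate(gen_triples(program), start=1):
    print(number, triple)
```

Quadruples name every intermediate result with a temporary, while triples refer to earlier results by position, which is why triples cannot be reordered without renumbering.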


• Notes

• Quadruple vs. Triple

– quadruple - easy to optimize

– triple - removes temporary addresses

⇒ Indirect Triple

• extensive code optimization is easy

– the IL can be rearranged (except for triples)

• easy translation - source to IL

• difficult to generate good code

– quadruple to two-address machine

– triple to three-address machine


Tree Structured Code

• Abstract Syntax Tree

– removes redundant information from the parse tree.

┌ leaf node --- variable name, constant
└ internal node --- operator

– [Example 8] --- Text p.377

{ x = 0;

y = z + 2 * y;

while ((x<n) and (v[x] != z)) x = x+1;

return x;

}


• Tree Structured Common Language (TCOL)

– Variant of AST - contains the result of semantic analysis.

– TCOL operator - type & context specific operator

– Context

┌ value ----- rhs of assignment statement
├ location ----- lhs of assignment statement
├ boolean ----- conditional control statement
└ statement ----- statement

ex) . : operand --- location; result --- value

while : operand --- boolean, statement; result --- statement


Example) int a; float b;

...

b = a + 1;

AST:  assign
        b
        add
          a
          1

TCOL: assign
        b
        float
          addi
            a
            1.

– Representation ----- graph orientation

┌ internal notation ------ efficient
└ external notation ------ debug, interface (linear graph notation)
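The step from the AST to the TCOL tree in this example (a type-specific `addi` and an explicit `float` coercion) can be sketched in Python. This is only an illustration under stated assumptions: trees are nested tuples, only `assign` and `add` are handled, the operator name `addf` is invented here for the floating-point case, and dereferencing of operands is left out.

```python
def to_tcol(node, types):
    """Return (tcol_tree, type) for an AST given as nested tuples.

    Leaves are variable names (looked up in `types`) or int/float
    literals. The names 'addi' and 'float' follow the slide's example;
    'addf' and the rest of the scheme are assumptions of this sketch.
    """
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "float"
    if isinstance(node, str):
        return node, types[node]
    op, left, right = node
    if op == "assign":
        rhs, rhs_type = to_tcol(right, types)
        if types[left] == "float" and rhs_type == "int":
            rhs = ("float", rhs)                  # coercion made explicit
        return ("assign", left, rhs), types[left]
    if op == "add":
        lhs, lhs_type = to_tcol(left, types)
        rhs, rhs_type = to_tcol(right, types)
        if lhs_type == rhs_type == "int":
            return ("addi", lhs, rhs), "int"      # type-specific operator
        if lhs_type == "int":
            lhs = ("float", lhs)
        if rhs_type == "int":
            rhs = ("float", rhs)
        return ("addf", lhs, rhs), "float"
    raise ValueError(f"unsupported operator: {op}")


ast = ("assign", "b", ("add", "a", 1))
tcol, _ = to_tcol(ast, {"a": "int", "b": "float"})
print(tcol)   # ('assign', 'b', ('float', ('addi', 'a', 1)))
```

The AST says only "add"; the TCOL form records what semantic analysis found, namely an integer addition whose result is converted before the store.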


• Notes

– AST ----- automatic AST generation (output of parser)

Parser Generator ┌ leaf node specification
                 └ operator node specification

– TCOL ----- automatic code generation: PQCC

(1) intermediate level: high level --- parse-tree-like notation, control structure

low level --- data access

(2) semantic specification: dereferencing, coercion, type-specific operators, dynamic subscript and type checking

(3) loop optimization ----- high-level control structure, easy reconstruction

(4) extensibility ----- define new TCOL operators


Abstract Machine Code

• Motivation

┌ rapid development of machine architectures
└ proliferation of programming languages

– portable & adaptable compiler design --- P_CODE

• porting --- rewriting only the back-end

– compiler building system --- EM_CODE

M front-ends + N back-ends: M compilers for N target machines


• Model

– source program → front-end → abstract machine code (the interface)

– abstract machine code → back-end → target code → target machine

– abstract machine code → abstract machine interpreter


• Pascal-P Code

• Pascal P Compiler --- portable compiler producing P_CODE for an abstract machine (P_Machine).

• P_Machine ----- hypothetical stack machine designed for the Pascal language.

(1) Instructions --- closely related to the Pascal language.

(2) Registers ┌ PC --- program counter
              │ NP --- new pointer
              │ SP --- stack pointer
              └ MP --- mark pointer

(3) Memory ┌ CODE --- instruction part
           └ STORE --- data part (constant area, stack, heap)


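[Figure: P_Machine memory. CODE is addressed by PC. STORE holds the stack (MP points to the current activation record, SP to the stack top), the heap (NP), and the constant area.]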


Ucode

• Ucode

– the intermediate form used by the Stanford Portable Pascal compiler.

– stack-based; defined in terms of a hypothetical stack machine.

– Ucode Interpreter: Appendix B.

• Addressing

– stack addressing ⇒ a tuple: (B, O)

B : the block number containing the address

O : the offset in words from the beginning of the block; offsets start at 1.

• Label

– any Ucode instruction can be labeled through its label field.

– All targets of jumps and procedures must be labeled.

– All labels must be unique for the entire program.


Example:

Consider the following skeleton:

program main
procedure P
procedure Q
var i : integer;
    j : integer;

– block number: main : 1, P : 2, Q : 3

– variable addressing: i : (3,1), j : (3,2)
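The addressing rule in this example (blocks numbered from 1 in order of appearance, offsets counted in words from 1) can be written down directly. A minimal Python sketch, assuming every variable occupies one word and variable names are unique across blocks; the function name and input format are illustrative:

```python
def assign_addresses(blocks):
    """Map each variable to a (block number, offset) pair.

    `blocks` lists (block_name, [variable names]) in declaration order.
    Blocks are numbered from 1 and offsets start at 1, as in the lecture;
    every variable is assumed to occupy one word.
    """
    addresses = {}
    for block_no, (_, variables) in enumerate(blocks, start=1):
        for offset, name in enumerate(variables, start=1):
            addresses[name] = (block_no, offset)
    return addresses


skeleton = [("main", []), ("P", []), ("Q", ["i", "j"])]
print(assign_addresses(skeleton))   # {'i': (3, 1), 'j': (3, 2)}
```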


Ucode Operations (35 in total)

– Unary --- notop, neg

– Binary --- add, sub, mult, divop, modop, swp, andop, orop, gt, lt, ge, le, eq, ne

– Stack Operations --- lod, str, ldr, ldp

– Immediate Operation --- ldc

– Control Flow --- ujp, tjp, fjp, cal, ret

– Range Checking --- chkh, chkl

– Indirect Addressing --- ixa, sta

– Procedure Specification --- proc, endop

– Program Specification --- bgn

– Procedure Calling Sequence --- cal

– Symbol Table Information --- sym


Example:

x = a + b * c;

    lod 1 1 /* a */
    lod 1 2 /* b */
    lod 1 3 /* c */
    mult
    add
    str 1 4 /* x */

if (a>b) a = a + b;

    lod 1 1 /* a */
    lod 1 2 /* b */
    gt
    fjp next
    lod 1 1 /* a */
    lod 1 2 /* b */
    add
    str 1 1 /* a */
next
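To see how these two fragments execute, here is a minimal Python sketch of a stack interpreter for just the opcodes they use. It is not the ucodei interpreter from Appendix B: the instruction tuples, the `memory` dictionary keyed by (block, offset), and the `nop` placeholder that carries the trailing label `next` are all assumptions of the sketch, and the opcode semantics are inferred from their names and the comments in the example.

```python
def run_ucode(program, memory):
    """Interpret a tiny Ucode subset on an operand stack.

    `program` is a list of (label, opcode, operands) tuples and `memory`
    maps (block, offset) pairs to values.
    """
    labels = {lab: i for i, (lab, _, _) in enumerate(program) if lab}
    stack = []
    pc = 0
    while pc < len(program):
        _, op, args = program[pc]
        pc += 1
        if op == "lod":                   # push the value at (block, offset)
            stack.append(memory[tuple(args)])
        elif op == "str":                 # pop into (block, offset)
            memory[tuple(args)] = stack.pop()
        elif op in ("add", "mult", "gt"):
            right, left = stack.pop(), stack.pop()
            if op == "add":
                stack.append(left + right)
            elif op == "mult":
                stack.append(left * right)
            else:
                stack.append(int(left > right))
        elif op == "fjp":                 # jump when the popped value is false
            if not stack.pop():
                pc = labels[args[0]]
        elif op == "nop":                 # placeholder that only carries a label
            pass
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return memory


# x = a + b * c;  with a, b, c, x at (1,1) .. (1,4)
assign = [
    ("", "lod", (1, 1)),
    ("", "lod", (1, 2)),
    ("", "lod", (1, 3)),
    ("", "mult", ()),
    ("", "add", ()),
    ("", "str", (1, 4)),
]
mem = {(1, 1): 2, (1, 2): 3, (1, 3): 4, (1, 4): 0}
print(run_ucode(assign, mem)[(1, 4)])                      # 14

# if (a > b) a = a + b;
cond = [
    ("", "lod", (1, 1)),
    ("", "lod", (1, 2)),
    ("", "gt", ()),
    ("", "fjp", ("next",)),
    ("", "lod", (1, 1)),
    ("", "lod", (1, 2)),
    ("", "add", ()),
    ("", "str", (1, 1)),
    ("next", "nop", ()),
]
print(run_ucode(cond, {(1, 1): 5, (1, 2): 3})[(1, 1)])     # 8
```

Operands are loaded in source order, so the right operand is on top of the stack when a binary operator runs.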


Indirect Addressing

Indirect addressing is used to access both array elements and var parameters.

• ixa --- indirect load

– ixa replaces stacktop by the value of the item at location stacktop.

– to retrieve A[i] :

lod i /* actually (Bi, Oi) */

ldr A /* also (block number, offset) */

add /* effective address */

ixa /* indirect load gets contents of A[i] */

to retrieve var parameter x :

lod x /* loads address of actual - since x is var */

ixa /* indirect load */


• sta --- indirect store

– sta stores stacktop into the address at stack[stacktop-1]; both items are popped.

– A[i] = j;

lod i

ldr A

add

lod j

sta

– x := y, where x is a var parameter

lod x

lod y

sta
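The A[i] sequences for ixa and sta can be traced with a small simulation. The Python sketch below follows the lecture's descriptions of the two instructions, but simplifies the machine: the store is a flat list of words, names map to absolute addresses instead of (block, offset) pairs, and the array is indexed from 0. Those choices, the memory layout, and the function name are assumptions made for the demo.

```python
def run_indirect(code, store, symbols):
    """Run lod/ldr/add/ixa/sta over a flat word-addressed store.

    `symbols` maps a name to its absolute address in `store`; using
    absolute addresses instead of (block, offset) pairs is a
    simplification made for this sketch.
    """
    stack = []
    for op, *args in code:
        if op == "lod":                   # push the value stored at the name
            stack.append(store[symbols[args[0]]])
        elif op == "ldr":                 # push the address of the name
            stack.append(symbols[args[0]])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "ixa":                 # replace stacktop by store[stacktop]
            stack.append(store[stack.pop()])
        elif op == "sta":                 # store stacktop at stack[stacktop-1]
            value, address = stack.pop(), stack.pop()
            store[address] = value
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack


# Layout assumed for the demo: i at 0, j at 1, array A at 2..5.
symbols = {"i": 0, "j": 1, "A": 2}
store = [2, 99, 10, 11, 12, 13]

# retrieve A[i]
print(run_indirect([("lod", "i"), ("ldr", "A"), ("add",), ("ixa",)],
                   store, symbols))       # [12]

# A[i] = j
run_indirect([("lod", "i"), ("ldr", "A"), ("add",), ("lod", "j"), ("sta",)],
             store, symbols)
print(store)                              # [2, 99, 10, 11, 99, 13]
```

The only difference between lod and ldr is whether the value or the address is pushed; ixa and sta then turn an address on the stack into a load or a store.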


Procedure Calling Sequence

procedure definition : procedure A(var a : integer; b,c : integer);

procedure call : A(x, expr1, expr2);

calling sequence :

ldp

ldr x /* load the address of actual for var parameter */

… /* code to evaluate expr1 --- left on the stack */

… /* code to evaluate expr2 --- left on the stack */

cal A
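A code generator could emit this calling sequence with a small helper. The Python sketch below is one possible shape, not the lecture's implementation: the helper name, the `(name, is_var)` description of formal parameters, and the convention that value actuals arrive as already generated Ucode lines are all assumptions.

```python
def emit_call(proc, formals, actuals):
    """Return the Ucode lines for a call, following the sequence above.

    `formals` lists (name, is_var) pairs. A var parameter's actual is a
    variable name whose address is loaded with ldr; a value parameter's
    actual is a list of already generated Ucode lines that leave the
    value on the stack.
    """
    code = ["ldp"]
    for (_, is_var), actual in zip(formals, actuals):
        if is_var:
            code.append(f"ldr {actual}")  # address of the actual
        else:
            code.extend(actual)           # value left on the stack
    code.append(f"cal {proc}")
    return code


# procedure A(var a : integer; b, c : integer);  call: A(x, y + 1, z)
formals = [("a", True), ("b", False), ("c", False)]
actuals = ["x", ["lod y", "ldc 1", "add"], ["lod z"]]
print("\n".join(emit_call("A", formals, actuals)))
```

Here `y + 1` and `z` stand in for expr1 and expr2; the helper only orders the pieces as ldp, parameters left to right, then cal.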


Ucode Interpreter

– The Ucode interpreter is called ucodei; its source is on plac.dongguk.ac.kr.

– The interpreter uses the following files:

*.ucode : file containing the Ucode program.

*.lst : Ucode listing and output from the program.

Ucode format

label-field    op-code    operand-field

1-10           12-m       m+2

– m is exactly enough to hold the opcode.

– label field --- a 10-character label (make sure it is 10 characters; pad with blanks).

– op-code --- starts at column 12.
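The column layout is easy to get wrong by hand, so a small formatter helps when generating Ucode files. A minimal Python sketch, assuming multiple operands are separated by single blanks (the lecture only fixes where the operand field starts); the function name is illustrative:

```python
def format_ucode_line(label, opcode, *operands):
    """Lay out one Ucode line: label in columns 1-10, opcode from column 12."""
    if len(label) > 10:
        raise ValueError("label longer than 10 characters")
    line = label.ljust(10) + " " + opcode     # column 11 is left blank
    if operands:
        # operand field starts at column m+2, one blank after the opcode
        line += " " + " ".join(str(x) for x in operands)
    return line


line = format_ucode_line("next", "fjp", "exit")
print(line.index("fjp") + 1)    # 12 (opcode starts at column 12)
print(line.index("exit") + 1)   # 16 (opcode ends at m = 14, operand at m + 2)
```

An unlabeled instruction is produced by passing an empty label, which pads the label field with ten blanks.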


Programming Assignment #3

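• Install the Ucode interpreter listed in Appendix B on your own PC, and write a Ucode program that finds the prime numbers up to 100.

– You may submit a program for a different problem instead.

– Submit the output listing from the Ucode interpreter.

• Reference:

– #1 : recursive-descent parser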

– #2 : MiniPascal LR parser


Concluding Remarks

• IL criteria

– intermediate level

  – input language --- high level

  – output machine --- low level

– efficient processing

  – translation --- source to IL, IL to target

  – interpretation

  – optimization

– extensibility

– external representation

– clean separation

  – language dependence & machine dependence


                              Polish Notation    Three Address Code    Tree Structured Code    Abstract
IL Criteria                   Postfix   IR       Quadruple   Triple    AST      TCOL           Machine Code
intermediate level            C         B        B           B         C        A              B
efficient processing
  source to IL translation    A         C        B           B         A        B              C
  IL to target translation    C         A        B           B         C        A              A
  interpretation              B         B        B           B         C        C              A
  optimization                C         B        A           C         A        A              B
external representation       A         A        A           A         C        B              A
extensibility                 A         A        A           A         A        A              B
clean separation              C         B        B           B         C        A              A

A : good, B : fair, C : poor