
Winter 2003/4 Pls – syntax – Catriel Beeri 1

SYNTAX

Syntax: form, structure

The syntax of a pl:
• The set of its well-formed programs
• The rules that define these programs

Two views:
• Concrete syntax: program as text
• Abstract syntax: program as composite structure, a tree


Concrete syntax

The common view – program as text (a string)

Common practice in compilers: divide into two levels
• Lexical structure: the words
  – Lexical specification / lexical analysis
• Phrase structure: the sentences
  – Phrase structure specification / parsing


Lexical

A word: lexeme

A class of words: token

For example: int, ident, real, leftpar, if, …

2.3 → (real, 2.3)

(4+5) → leftpar (int, 4) plus (int, 5) rightpar

Lexical analysis: convert text to a stream of (token, lexeme) pairs
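The conversion from text to a token stream can be sketched in OCaml as below. This is a minimal hand-written scanner, not the course's lexer: the `token` type, the helper names and the choice of recognized characters are assumptions made for illustration, and keywords such as `if` are not treated specially.

```ocaml
(* Token classes modeled on the slide; a token carries its lexeme's value
   where that value matters. All names here are illustrative assumptions. *)
type token =
  | Int of int
  | Real of float
  | Ident of string
  | LeftPar
  | RightPar
  | Plus

let is_digit c = c >= '0' && c <= '9'
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

(* Scan the string left to right, always taking the longest possible lexeme. *)
let tokenize (s : string) : token list =
  let n = String.length s in
  (* First index >= i at which predicate p fails (or n if it never fails). *)
  let rec span p i = if i < n && p s.[i] then span p (i + 1) else i in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      match s.[i] with
      | ' ' | '\t' | '\n' -> go (i + 1) acc
      | '(' -> go (i + 1) (LeftPar :: acc)
      | ')' -> go (i + 1) (RightPar :: acc)
      | '+' -> go (i + 1) (Plus :: acc)
      | c when is_digit c ->
          let j = span is_digit i in
          if j + 1 < n && s.[j] = '.' && is_digit s.[j + 1] then begin
            (* digits '.' digits: a real *)
            let k = span is_digit (j + 1) in
            go k (Real (float_of_string (String.sub s i (k - i))) :: acc)
          end else
            go j (Int (int_of_string (String.sub s i (j - i))) :: acc)
      | c when is_alpha c ->
          let j = span (fun c -> is_alpha c || is_digit c) i in
          go j (Ident (String.sub s i (j - i)) :: acc)
      | c -> failwith (Printf.sprintf "unexpected character %c" c)
  in
  go 0 []
```

With these definitions, `tokenize "(4+5)"` yields `[LeftPar; Int 4; Plus; Int 5; RightPar]`, mirroring the example on the slide.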


Lexical analysis: implementation

Token specified by regular expression

Regular expression → nondeterministic finite automaton → deterministic finite automaton → a program (a lexical analyzer)

Issues:
• Many tokens
• Where to stop
• …


Phrase structure/analysis

Specified by a context-free grammar (CFG), written in BNF (Backus-Naur form)

• T: terminals (here, tokens, i.e. sets of lexemes)
• N: non-terminals = names of syntactic categories
• P: production rules
  – Rule: A → w (w is a string over N ∪ T)
• S: start non-terminal

A CFG as a generative device:
• Start from S
• Replace non-terminals by strings, using rules


Example: CFG for simple arithmetic expressions

T = {int, op}, N = {E}, S = E

Rules:

E ::= int | E op E (2 rules, | means 'or')

Generation by a derivation:

E => E op E => int op E => int op E op E => int op int op E => int op int op int

Could represent the expression 2 - 3 - 4


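Cont'd. Here are two derivations:

E => E op E => int op E => int op E op E => int op int op E => int op int op int

E => E op E => E op E op E => int op E op E => int op int op E => int op int op int

Both reach int op E op E and continue identically from there.

Are they really different?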


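A derivation corresponds to a derivation tree:

* E => E op E => E op E op E => int op E op E => int op int op E => int op int op int

E => E op E => int op E => int op E op E => int op int op E => int op int op int

[Figure: derivation tree with root E expanded to E op E; one child E derives int, the other is expanded to E op E with two int leaves]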


Derivations vs. derivation trees

A derivation tree represents many derivations

If there is a word with several derivation trees, the CFG is ambiguous.

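Example:

[Figure: the two derivation trees of int op int op int. In one, the left operand of the root is expanded to E op E; in the other, the right operand is.]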
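Why ambiguity matters can be made concrete with a small OCaml sketch. It assumes, purely for illustration, that `op` is subtraction and that a tree records only its operands; the `tree` type and the functions `eval` and `trees` are hypothetical names, not part of the slides.

```ocaml
(* Trees for the ambiguous grammar E ::= int | E op E, with op fixed to '-'. *)
type tree =
  | Leaf of int
  | Node of tree * tree   (* left operand, right operand *)

let rec eval = function
  | Leaf n -> n
  | Node (l, r) -> eval l - eval r

(* All trees whose leaves, read left to right, are exactly the list xs. *)
let rec trees (xs : int list) : tree list =
  match xs with
  | [] -> []
  | [x] -> [Leaf x]
  | _ ->
      (* Every way to cut xs into a non-empty prefix and a non-empty suffix. *)
      let rec splits pre = function
        | [] | [_] -> []
        | y :: rest ->
            let pre' = pre @ [y] in
            (pre', rest) :: splits pre' rest
      in
      List.concat
        (List.map
           (fun (l, r) ->
             List.concat
               (List.map
                  (fun lt -> List.map (fun rt -> Node (lt, rt)) (trees r))
                  (trees l)))
           (splits [] xs))
```

For the leaves 2, 3, 4 there are exactly two trees, and they evaluate differently: 2 - (3 - 4) = 3 and (2 - 3) - 4 = -5. The grammar alone does not say which one is meant.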


The problem is addressed by:
• Adopting left associativity
• Allowing parentheses in expressions
• Changing the CFG:

– New non-terminal T (for term)

– New rules:

E ::= E op T | T

T ::= int | (E)


This CFG is unambiguous, and reflects left associativity

E => E op T => E op T op T => T op T op T => int op T op T => int op int op T => int op int op int
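A parser for this grammar can be sketched as follows. It is a minimal recursive-descent sketch under stated assumptions: the `token` and `expr` types and all function names are hypothetical, the left-recursive rule E ::= E op T is implemented as a loop over "T (op T)*" (a standard rewriting, not something the slides prescribe), and the result is an expression tree rather than the full derivation tree.

```ocaml
(* Hypothetical token and result types for the grammar
     E ::= E op T | T
     T ::= int | ( E )                                   *)
type token = INT of int | OP of char | LPAR | RPAR

type expr =
  | Num of int
  | BinOp of expr * char * expr

exception Parse_error of string

(* T ::= int | ( E ) *)
let rec parse_t = function
  | INT n :: rest -> (Num n, rest)
  | LPAR :: rest ->
      (match parse_e rest with
       | e, RPAR :: rest' -> (e, rest')
       | _ -> raise (Parse_error "expected )"))
  | _ -> raise (Parse_error "expected int or (")

(* E ::= E op T | T, read as T (op T)*; the accumulator grows to the left,
   which is what makes the operators left associative. *)
and parse_e tokens =
  let first, rest = parse_t tokens in
  let rec loop left = function
    | OP o :: rest ->
        let right, rest' = parse_t rest in
        loop (BinOp (left, o, right)) rest'
    | rest -> (left, rest)
  in
  loop first rest

let parse tokens =
  match parse_e tokens with
  | e, [] -> e
  | _ -> raise (Parse_error "trailing tokens")
```

Parsing the tokens of 2 - 3 - 4 gives `BinOp (BinOp (Num 2, '-', Num 3), '-', Num 4)`; the right-nested reading is only reachable with explicit parentheses, as in 2 - (3 - 4).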


A derivation tree

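More complex than an expression tree

[Figure: derivation tree, under the grammar with E and T, of an expression with two operators, three int leaves and one parenthesized subexpression]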


Phrase structure: summary

• A language is specifiable by many CFGs
• A CFG needs to address:
  – Ambiguity (avoid)
  – Associativity (express)
  – Precedence (express)
  – Efficient parsing (ensure)

Methodologies for transforming CFGs to account for the above are known

The resulting CFGs are complex; so are the derivation trees.


Abstract Syntax

Consider:
• (if (< x 3) 4 7) (Scheme) and x < 3 ? 4 : 7 (C)
• (let ((x 5)) (+ x 3)) (Scheme) and let x = 5 in x + 3 (OCaml)

Each pair is the "same" expression, with the same components.
The meaning is explained in the same way. E.g., for the conditional: evaluate the test; if its value is true, evaluate the 1st branch, else evaluate the 2nd.
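The idea that both concrete forms share one structure and one meaning can be sketched in OCaml. The `expr` and `value` types, the constructor names and the environment representation below are assumptions chosen for illustration; they are not taken from the course.

```ocaml
(* One abstract syntax for both the Scheme and the C/OCaml concrete forms. *)
type expr =
  | Const of int
  | Var of string
  | Less of expr * expr
  | Add of expr * expr
  | If of expr * expr * expr        (* test, 1st branch, 2nd branch *)
  | Let of string * expr * expr     (* variable, bound expression, body *)

type value = IntV of int | BoolV of bool

(* env maps variable names to values; List.assoc raises Not_found
   for an unbound variable. *)
let rec eval (env : (string * value) list) (e : expr) : value =
  match e with
  | Const n -> IntV n
  | Var x -> List.assoc x env
  | Less (a, b) ->
      (match eval env a, eval env b with
       | IntV m, IntV n -> BoolV (m < n)
       | _ -> failwith "< expects two integers")
  | Add (a, b) ->
      (match eval env a, eval env b with
       | IntV m, IntV n -> IntV (m + n)
       | _ -> failwith "+ expects two integers")
  | If (test, branch1, branch2) ->
      (* Evaluate the test, then exactly one of the branches. *)
      (match eval env test with
       | BoolV true -> eval env branch1
       | BoolV false -> eval env branch2
       | IntV _ -> failwith "test must be a boolean")
  | Let (x, bound, body) -> eval ((x, eval env bound) :: env) body

(* (if (< x 3) 4 7)  and  x < 3 ? 4 : 7 *)
let cond = If (Less (Var "x", Const 3), Const 4, Const 7)

(* (let ((x 5)) (+ x 3))  and  let x = 5 in x + 3 *)
let binding = Let ("x", Const 5, Add (Var "x", Const 3))
```

Whichever concrete form a program was written in, `eval` sees only `cond` or `binding`, so the explanation of meaning is given once.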


In abstract syntax, a program/expression is viewed as a labeled tree, i.e. a compound structure

• A labeled leaf represents an atomic phrase; the label represents its category, e.g. int (3)
• A larger tree represents a compound phrase
  – The root label is its category
  – The children are its components


Typical building blocks:
• Record:

[Figure: tree with root IfExpr and children E1, E2, E3 on edges labeled test, branch1, branch2]

IfExpr : {test = E1, branch1 = E2, branch2 = E3}

The type can be expressed as an OCaml datatype:

type ifexpr = IfExpr of {test : expr; branch1 : expr; branch2 : expr}


• Tuple:

[Figure: tree with root IfExpr and ordered children E1, E2, E3]

IfExpr : (E1, E2, E3)

type ifexpr = IfExpr of expr * expr * expr

Tuple vs. Record: field name vs. ordering


• Sequence:

CmpdStmt : (S1, S2, … , Sn)

Tuple vs. sequence: in a tuple type, the number of fields is known and fixed; a sequence may have any length

type cmpd_stmt = CmpdStmt of stmt list
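The three building blocks can be put side by side in one compilable sketch. The slides leave `expr` and `stmt` undefined, so the surrounding types below (including the constructors `Int`, `IfRec`, `IfTup` and `ExprStmt`) are assumptions added only to make the definitions self-contained.

```ocaml
type expr =
  | Int of int
  (* Record: components are found by field name. *)
  | IfRec of { test : expr; branch1 : expr; branch2 : expr }
  (* Tuple: components are found by position. *)
  | IfTup of expr * expr * expr

type stmt =
  | ExprStmt of expr
  (* Sequence: any number of components. *)
  | CmpdStmt of stmt list

(* Extract the test of a conditional, whichever representation is used. *)
let test_of (e : expr) : expr option =
  match e with
  | IfRec r -> Some r.test
  | IfTup (t, _, _) -> Some t
  | Int _ -> None

(* Count the non-compound statements nested inside a statement. *)
let rec count_stmts = function
  | ExprStmt _ -> 1
  | CmpdStmt ss -> List.fold_left (fun acc s -> acc + count_stmts s) 0 ss
```

The record form is self-documenting at the use site (`r.test`), while the tuple form relies on the reader remembering that the test comes first.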


Summary of abstract syntax

Abstract syntax is the structure of the program
• Keywords, separators, conventions: not included
• Associativity, precedence, unambiguity: non-issues

Parsing: convert from concrete to abstract syntax

Type-checking, semantics, compiler translation use abstract syntax

In the rest of the course: abstract syntax


Q: Can a CFG derivation tree serve as an abstract syntax tree?


Syntax (concrete/abstract) is an inductive definition

Example : E ::= int | id | E op E

As rules:

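\[
\frac{i \in \mathit{int}}{i \in \mathit{expr}}
\qquad
\frac{x \in \mathit{Id}}{x \in \mathit{expr}}
\qquad
\frac{e_1 \in \mathit{expr} \quad e_2 \in \mathit{expr} \quad o \in \mathit{op}}{e_1 \; o \; e_2 \in \mathit{expr}}
\]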

What will the rules look like for type expr = Int of int | Id of string | Expr of expr * expr ?
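One consequence of an inductive definition is that functions over expressions can be written with one clause per rule. The sketch below assumes a variant of the datatype that also records the operator; the `op` type, its two constructors and the name `BinExpr` are hypothetical and chosen only for illustration.

```ocaml
type op = Plus | Minus   (* hypothetical operator set *)

type expr =
  | Int of int                      (* rule for i in int *)
  | Id of string                    (* rule for x in Id *)
  | BinExpr of expr * op * expr     (* rule for e1 o e2 *)

(* Number of nodes in an expression, by structural recursion:
   one clause for each rule of the inductive definition. *)
let rec size = function
  | Int _ -> 1
  | Id _ -> 1
  | BinExpr (e1, _, e2) -> 1 + size e1 + size e2
```

The two base rules become non-recursive clauses, and the rule with premises on e1 and e2 becomes the clause with two recursive calls.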


Common informal approach to abstract syntax specification

Use a string CFG, interpret it as a tree grammar
• Ignore keywords
• Labels and structures: left to the reader to decide

This shows the category and the components, which is sufficient for semantics

Example : If-Expr ::= if Expr then Expr else Expr

This is the approach in the course


A convention for abstract syntax

Use variables, declare them before rules, omit indices

Example:

i ∈ int, x ∈ id, o ∈ op, e ∈ expr

e ::= i | x | e o e

A similar convention is often used for inductive definitions