
Com S 342

Principles of Programming Languages

Spring 2007


Lecturer: Dr. Markus Lumpe
Department of Computer Science
113 Atanasoff Hall
http://www.cs.iastate.edu/~cs342
TR 2-3:30, Gilman 1104
W9 (1), Gilman 0312
W4 (2), Gilman 0312

TA’s: N/A

Grading: problem sets, 2 tests, final exam

Assignments: on a weekly basis


Overview

Tentative course program:

Introduction – basic concepts
The algorithmic programming language Scheme
Inductive sets of data
The lambda-calculus – the core language of sequential programming
Recursion and fixed points
Type reconstruction algorithms
Data abstraction, representation strategies for data types
Interpreters
Types, type checking, and type inference
Objects and classes (optional)
Continuation-Passing style (optional)


The Algorithmic Language Scheme

Overview
DrScheme, Chez Scheme, SLIB, and EOPL2-extensions
Features of Scheme
Expressions
Data types
Procedures

References
Revised^5 Report on the Algorithmic Language Scheme.
Daniel P. Friedman and Matthias Felleisen, “The Little Schemer”, Fourth Edition, MIT Press, 1996
Harold Abelson et al., “Structure and Interpretation of Computer Programs”, MIT Press, 1996
Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001


Chez Scheme, SLIB, and EOPL2-extensions

DrScheme (or Chez Scheme) is a complete high-performance implementation of ANSI/IEEE standard Scheme.

SLIB is a portable Scheme library meant to provide compatibility and utility functions for all standard Scheme implementations.

Running Chez Scheme (Com S 342):
SCHEME_LIBRARY_PATH: The SLIB library directory.
CS342LIB: The EOPL2-extensions path.
On every platform use the command “scheme” to start an interactive Scheme session.
The file “chez.prelude” defines everything you need to use Chez Scheme in Com S 342. This file is loaded after “chez.init”, the SLIB initialization file.

DrScheme

CS342 Chez Scheme

Scheme

Scheme is a statically scoped programming language.
Each use of a variable is associated with a lexically apparent binding of that variable.

Scheme is a dynamically typed language.
Types are associated with values rather than with variables. Scheme has latent as opposed to manifest types.

All objects in the course of a Scheme computation have unlimited extent.
No Scheme object is ever destroyed. However, if an object becomes inaccessible, then it can be garbage-collected.

Procedures in Scheme are first-class.
Procedures can be created dynamically, stored in data structures, returned as results of procedures, etc.
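A minimal sketch of what "first-class" means in practice. The names compose and ops are illustrative only and are not part of the course material.

```scheme
; compose creates a new procedure at run time and returns it as a result
(define compose
  (lambda (f g)
    (lambda (x) (f (g x)))))

; procedures stored in a data structure (a list)
(define ops (list + *))

((compose car cdr) '(1 2 3))   ; behaves like (cadr '(1 2 3)), gives 2
((car ops) 3 4)                ; applies +, gives 7
((cadr ops) 3 4)               ; applies *, gives 12
```

The procedure returned by compose keeps access to f and g because of static scoping.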


General Facts and Conventions

Scheme employs a fully parenthesized prefix notation for programs and data.

(define fact (lambda (n) (if (= n 0) 1 (* n (fact (- n 1)))) ) )

Names of procedures that always return a boolean value usually end with “?”. These procedures are called predicates.

Names of procedures that store values into previously allocated locations usually end with “!”.

“->” appears within the names of procedures that take an object of one type and return an analogous object of another type.

Upper and lower case forms of a letter are never distinguished except within characters and string constants. For example, ‘Foo’ is the same identifier as ‘FOO’, and ‘#x1AB’ is the same number as ‘#X1ab’.
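The naming conventions can be seen in standard procedures. A short sketch, where the variable v is introduced only for illustration:

```scheme
(number? 42)               ; predicate: the name ends with "?", gives #t
(list->vector '(1 2 3))    ; conversion: "->" in the name, gives #(1 2 3)

(define v (vector 1 2 3))  ; a freshly allocated, mutable vector
(vector-set! v 0 'x)       ; mutation: the name ends with "!"
v                          ; now #(x 2 3)
```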


Identifiers

Most identifiers allowed by other programming languages are also acceptable to Scheme.

Identifiers begin with a letter or a special initial character, followed by a sequence of letters, special characters, or digits.

Lexical structure:
<identifier> ::= <initial> <subsequent>* | <peculiar identifier>
<initial> ::= <letter> | <special initial>
<special initial> ::= ‘!’ | ‘$’ | ‘%’ | ‘&’ | ‘*’ | ‘/’ | ‘:’ | ‘<’ | ‘=’ | ‘>’ | ‘?’ | ‘^’ | ‘_’ | ‘~’
<subsequent> ::= <initial> | <digit> | <special subsequent>
<special subsequent> ::= ‘+’ | ‘-’ | ‘.’ | ‘@’
<peculiar identifier> ::= ‘+’ | ‘-’ | ‘...’


Examples of Identifiers

lambda
q
list->vector
soup
+
V17a
<=?
a34kTMNs
the-word-recursion-has-many-meanings

Any identifier may be used as a variable or as a syntactic keyword.

When an identifier appears as a literal or within a literal, it is being used to denote a symbol, that is, equality (eqv?) is defined for identifiers.


Variables, Syntactic Keywords, and Regions

An identifier may name a type of syntax, or it may name a location where a value can be stored.
An identifier that names a location is called a variable and is said to be bound to that location.
An identifier that names a type of syntax is called a syntactic keyword and is said to be bound to that syntax.

Like C/C++, Pascal, and Java, Scheme is a statically scoped language with block structure.
To each place where an identifier is bound in a program there corresponds a region of the program text within which the binding is visible. The region is determined by the particular binding construct that establishes the binding.


Type Predicates

No object satisfies more than one of the following predicates:

boolean? pair?
symbol? number?
char? string?
vector? port?
procedure?

These predicates define the types boolean*, pair, symbol, number, character, string, vector, port, and procedure.
* All values in Scheme count as true except ‘#f’.

Note: the empty list is a special object of its own type; it satisfies none of the type predicates listed above.
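Because the type predicates are disjoint, they can be used to classify any value. A minimal sketch; the procedure name type-of and the symbols it returns are assumptions made for this illustration:

```scheme
; type-of is a hypothetical helper, not a standard Scheme procedure
(define type-of
  (lambda (obj)
    (cond ((boolean? obj) 'boolean)
          ((pair? obj) 'pair)
          ((symbol? obj) 'symbol)
          ((number? obj) 'number)
          ((char? obj) 'character)
          ((string? obj) 'string)
          ((vector? obj) 'vector)
          ((procedure? obj) 'procedure)
          ((null? obj) 'empty-list)   ; () satisfies none of the tests above
          (else 'other))))

(type-of '(1 2))   ; pair
(type-of '())      ; empty-list
(type-of car)      ; procedure
```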


Variable References

An expression consisting of a variable is a variable reference. The value of the variable reference is the value stored in the location to which the variable is bound.

Syntax: <variable>

Example:
> (define x 28)
> x
28


Literal Expressions

Literals are either external representations of Scheme objects or constants that evaluate “to themselves”.

Syntax: (quote <datum>) | '<datum> | <constant>

Examples:
> (quote a) ; evaluates to <datum>
a
> (quote (+ 1 2)) ; evaluates to (+ 1 2)
(+ 1 2)
> '(+ 1 2) ; abbreviation of (quote (+ 1 2))
(+ 1 2)
> "string" ; evaluates to itself, it does not need to be quoted
"string"


Procedure Calls

A procedure call is written by simply enclosing in parentheses expressions for the procedure to be called and the arguments to be passed to it.

Syntax: ( <operator> {<operand>}* )

Examples:
> (+ 3 4)
7
> ((if #f + *) 3 4)
12

Scheme uses the call-by-value parameter passing mechanism; every argument is evaluated, including the expression that denotes the procedure, before the procedure is called.


Load a Script and Exit

To load a script, the Scheme system provides the built-in procedure “load”.

Syntax: (load <filename-string>)

Example:
> (load "test.ss")
> (load "fib.scm")

To leave the Scheme interpreter, call “exit”.

Syntax: (exit)


Procedures

Procedures are parameterized abstractions over expressions.

Syntax: (lambda <formals> <body>)
<formals> should be a formal argument list.

<formals> ::= ( {<variable>}* ) | <variable> | ( <variable1> ... <variablen> . <variablen+1> )

<body> should be a sequence of one or more expressions.

Example:
> ((lambda (x) (+ x x)) 4)
8
> ((lambda x x) 3 4 5 6)
(3 4 5 6)
> ((lambda (x y . z) z) 3 4 5 6)
(5 6)


Definitions

Definitions are used to bind names to the values of expressions.

Syntax: (define <variable> <expression>)

Examples:
> (define add3
    (lambda (x) (+ x 3)))
> (add3 4)
7
> (define first car)
> (first `(1 2))
1


Assignment

The assignment expression is used to store the value of an expression in a location, which is bound to a variable.

Syntax: (set! <variable> <expression>)

Example:
> (define x 2)
> (+ x 1)
3
> (set! x 4)
> (+ x 1)
5

Note: <variable> must be bound either in some enclosing region (as in the example) or at top level (in the program).


Conditionals

(if <test> <consequent> <alternative>)
(if <test> <consequent>)

(cond {<clause>}+ )
<clause> ::= (<test> {<expression>}* )
           | (else {<expression>}+)

(case <key> {<case-clause>}+ )
<case-clause> ::= (({<datum>}+) {<expression>}* )
                | (else {<expression>}+)
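The three conditional forms can be sketched as follows. The procedure names abs-val, sign, and letter-kind are made up for this example.

```scheme
; if: two-way choice
(define abs-val
  (lambda (n)
    (if (< n 0) (- n) n)))

; cond: the clauses are tried in order; else catches the rest
(define sign
  (lambda (n)
    (cond ((< n 0) 'negative)
          ((= n 0) 'zero)
          (else 'positive))))

; case: the key is compared (using eqv?) with the data of each clause
(define letter-kind
  (lambda (c)
    (case c
      ((#\a #\e #\i #\o #\u) 'vowel)
      (else 'consonant))))

(abs-val -3)        ; 3
(sign 0)            ; zero
(letter-kind #\e)   ; vowel
```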


Other Expressions

Sequencing: (begin {<expression>}+ )

Iteration: (do ( {( <variable> <init> <step> )}* ) (<test> {<expression>}* ) {<command>}* )

Binding: (let ({<binding>}*) <body>)
<binding> ::= (<variable> <init>)
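A short sketch of the three forms in use. The procedures sum-to and area are hypothetical examples, not taken from the course material.

```scheme
; do: sum the integers 1..n; all <step> expressions see the old values
(define sum-to
  (lambda (n)
    (do ((i 1 (+ i 1))
         (acc 0 (+ acc i)))
        ((> i n) acc))))        ; when the test holds, acc is the result

; let introduces a local binding; begin evaluates expressions in order
; and returns the value of the last one
(define area
  (lambda (w h)
    (let ((a (* w h)))
      (begin
        (display "area computed")
        (newline)
        a))))

(sum-to 4)   ; 10
(area 2 3)   ; prints "area computed", then gives 6
```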


A Small Example – Bubble Sort

(define bubble-sort                   ; procedure bubble-sort
  (lambda (lon)                       ; procedure with one argument
    (if (null? lon)                   ; test whether list is empty
        `()                           ; if list is empty, we are done
        ; set variable lon-with-max-elem-left
        (let ((lon-with-max-elem-left (reverse (put-max-last lon))))
          ; build sorted list using recursion
          (cons (car lon-with-max-elem-left)   ; select head of lon-with-max-elem-left
                ; apply bubble-sort to the tail of lon-with-max-elem-left
                (bubble-sort (cdr lon-with-max-elem-left))))
    ) ; end if
  ) ; end lambda
) ; end define


Procedure put-max-last

(define put-max-last                  ; define procedure name “put-max-last”
  (lambda (lon)                       ; define procedure
    (if (null? lon)                   ; test whether lon is empty
        `()                           ; if list is empty, we are done
        (if (null? (cdr lon))         ; test whether lon is a singleton
            lon                       ; if yes, we are done
            (if (< (car lon) (cadr lon))   ; left < right?
                (cons (car lon) (put-max-last (cdr lon)))                      ; true
                (cons (cadr lon) (put-max-last (cons (car lon) (cddr lon))))   ; false
            ) ; end if
        ) ; end if
    ) ; end if
  ) ; end lambda
) ; end define



Trace of bubble-sort

> (trace bubble-sort)
(bubble-sort)
> (bubble-sort `(3 6 2 8 9 8))
|(bubble-sort (3 6 2 8 9 8))
| (bubble-sort (8 8 6 2 3))
| |(bubble-sort (3 2 6 8))
| | (bubble-sort (6 3 2))
| | |(bubble-sort (2 3))
| | | (bubble-sort (2))
| | | |(bubble-sort ())
| | | |()
| | | (2)
| | |(3 2)
| | (6 3 2)
| |(8 6 3 2)
| (8 8 6 3 2)
|(9 8 8 6 3 2)
(9 8 8 6 3 2)


Lists and Pairs

A pair (sometimes called a dotted pair) is a record structure with two fields called the car and cdr fields.

Pairs are created by the procedure cons.

The car and the cdr fields are accessed by the procedures car and cdr.

Pairs are used primarily to represent lists. A list can be defined recursively as either the empty list or a pair whose cdr is a list.

Example: (a b c d e) and (a . (b . (c . (d . (e . ()))))) are equivalent notations for lists of symbols.
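The relation between pairs and lists can be checked directly. A small sketch; the variable names p and l are chosen only for this example.

```scheme
(define p (cons 1 2))          ; a dotted pair, printed as (1 . 2)
(car p)                        ; 1
(cdr p)                        ; 2

; a list built pair by pair, ending in the empty list
(define l (cons 'a (cons 'b (cons 'c '()))))

(equal? l '(a b c))                        ; #t
(equal? '(a . (b . (c . ()))) '(a b c))    ; #t, both notations denote the same list
(list? p)                                  ; #f, the cdr of p is not a list
```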


Check for Pairs and Lists

(pair? obj)
pair? returns #t if obj is a pair, otherwise returns #f.

(pair? `(a . b)) ⇒ #t
(pair? `(a b c)) ⇒ #t ; a list is a pair: (a . (b . (c . ())))
(pair? `()) ⇒ #f ; the empty list is not a pair

(null? obj)
null? returns #t if obj is the empty list, otherwise returns #f.

(list? obj)
list? returns #t if obj is a list, otherwise returns #f.


list? & null?

Given a list, the procedure reverse returns a newly allocated list consisting of the elements of a-list in reverse order:

(define reverse                       ; procedure definition
  (lambda (a-list)
    (if (list? a-list)                ; is a-list indeed a list?
        (if (null? a-list)            ; test for ()
            `()
            (append (reverse (cdr a-list)) (list (car a-list)))   ; build new list
        ) ; end if
    ) ; end if
  ) ; end lambda
) ; end define


car, cdr

(car pair)
Returns the contents of the car field of pair.

(car `(a b c)) ⇒ a
(car `((a) b c)) ⇒ (a)
(car `()) ⇒ error, () has no car field

(cdr pair)
Returns the contents of the cdr field of pair.

(cdr `((a) b c)) ⇒ (b c) ; tail of list ((a) b c) is again a list
(cdr `(1 . 2)) ⇒ 2
(cdr `()) ⇒ error, () has no cdr field


Variations of car & cdr

The Scheme library provides procedures that represent the arbitrary composition of up to four car and cdr procedures:

(caar pair)
(cadr pair)
...
(cdddar pair)
(cddddr pair)

There are twenty-eight of these procedures.

Example: (define caddr (lambda (x) (car (cdr (cdr x)))))


car, cadr, cdr, and cddr

Example: lon = `(1 2 3 4)

…
(if (< (car lon) (cadr lon))
    (cons (car lon) (put-max-last (cdr lon)))
    (cons (cadr lon) (put-max-last (cons (car lon) (cddr lon))))
…

(car lon) ⇒ 1
(cadr lon) = (car (cdr lon)) ⇒ 2
(cdr lon) ⇒ (2 3 4)
(cddr lon) = (cdr (cdr lon)) ⇒ (3 4)


List Builder cons

(cons obj1 obj2)
Returns a newly allocated pair whose car is obj1 and whose cdr is obj2.

(cons `a `()) ⇒ (a)
(cons `(a) `(b c d)) ⇒ ((a) b c d)
(cons `(a b) `c) ⇒ ((a b) . c)

Note: The procedure cons preserves the structure of both the car and the cdr field. Therefore, if obj1 is a list, then car applied to the result of cons returns a list as well.

Example: (car (cons `(a) `(b c d))) ⇒ (a)


List Builder append

(append list …)
Returns a list consisting of the elements of the first list followed by the elements of the other lists.

(append `(x) `(y)) ⇒ (x y)
(append `(a) `(b c d)) ⇒ (a b c d)
(append `(a (b)) `((c))) ⇒ (a (b) (c))

Note: The procedure append removes the top-level pair of parentheses from the argument lists.

Example: (car (append `(a) `(b c d))) ⇒ a


List Builder list

(list obj …)
Returns a newly allocated list of its arguments.

(list `a (+ 3 4) `c) ⇒ (a 7 c)
(list) ⇒ ()
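The three list builders differ in how they treat their arguments. Applying each of them to the same two lists, as a small sketch, makes the difference visible:

```scheme
(cons   '(a) '(b c))    ; ((a) b c)    the first argument becomes the new head
(append '(a) '(b c))    ; (a b c)      the elements of both lists are joined
(list   '(a) '(b c))    ; ((a) (b c))  each argument becomes one element

; consequently, the lengths of the results differ
(length (cons   '(a) '(b c)))   ; 3
(length (append '(a) '(b c)))   ; 3
(length (list   '(a) '(b c)))   ; 2
```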


Equivalence Predicates

Scheme provides three equivalence predicates:

(eq? obj1 obj2) – compares object identities
This is the finest or most discriminating equivalence predicate.
(eq? `a `a) ⇒ #t

(eqv? obj1 obj2) – compares primitive values
Returns #t if obj1 and obj2 should normally be regarded as the same object.
(eqv? #t #t) ⇒ #t

(equal? obj1 obj2) – structural equivalence
The predicate equal? recursively compares the contents of pairs, vectors, and strings, applying eqv? to other objects such as numbers and symbols.
(equal? `(a b c) `(a b c)) ⇒ #t
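The difference between the three predicates shows up on freshly allocated lists. A minimal sketch; x and y are illustrative names:

```scheme
(define x (list 'a 'b))   ; two separately allocated lists
(define y (list 'a 'b))   ; with the same contents

(eq? x x)        ; #t  the very same object
(eq? x y)        ; #f  different objects
(eqv? x y)       ; #f  eqv? does not look inside pairs
(equal? x y)     ; #t  same structure and contents

(eqv? 2 2)       ; #t  equal exact numbers
(eqv? 2 2.0)     ; #f  exact and inexact numbers are distinguished
```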


Why Do We Study Programming Languages?

Some have suggested that there is no need to develop new computer languages nor even to teach language design and compiler theory.

Correct/Wrong


What Is a Programming Language?

A formal notation for describing computation
A “user interface” to a computer
A more precise tool than any natural language
Programming paradigms – different expressive power
Syntax + semantics
Compiler, or interpreter, or translator


Core Properties of Programming Languages

Languages provide the framework for the way we organize complexity in our own minds.

Languages are the means by which we communicate our understanding.


Reasons for Studying Concepts of Programming Languages

The potential benefits of studying language concepts are:

Increased capacity to express ideas.
The way we think is greatly influenced by the expressive power of the language in which we communicate our thoughts.

Improved background for choosing appropriate languages.
Many programmers, when given a choice of languages for a new project, continue to use the language with which they are most familiar, even if it is poorly suited to the new project.

Increased ability to learn, to design, and to implement a new language.


How Do Programming Languages Differ?

Generations:
1GL: machine code
2GL: symbolic assemblers
3GL: (machine independent) languages: Fortran, Algol, Pascal, Smalltalk, C++, Java, Lisp, Haskell, Scheme, Prolog
4GL: domain specific application generators

Common Constructs: basic data types (numbers, etc.); variables; expressions; statements; keywords; control constructs; procedures; comments …

Uncommon Constructs: type declarations; special types (strings, arrays, matrices, …); concurrency constructs; packages/modules; generics; exceptions; …


Key Theses

Thesis 1: Speak the programming language that you need to work with.

Every programming language meets some specialized goal.

Thesis 2: Programming languages are invented while you sleep, and spread before you wake up.

Many languages already address your problem; the user only needs to find the appropriate language.

Thesis 3: Understanding programming languages is the key to your job.


Programming Domains

All programming languages have been developed with different goals in mind. Every language has its designated application domain, which, in general, requires a specific set of programming abstractions and/or runtime models.

Scientific applications: floating-point arithmetic (Fortran, Algol)
Business applications: reports, decimal arithmetic (Cobol)
Artificial intelligence: symbolic computation (Lisp, Prolog)
System programming: operating systems (C, Pascal)
Scripting languages: system configuration (sh, awk, Perl, Tcl)


Programming Paradigms

Imperative style: program = algorithms + data

Functional style: program = function · function

Logic programming style: program = facts + rules

Object-oriented style: program = objects + messages

Other styles and paradigms: blackboard, events, pipes and filters, constraints, lists, …


Imperative Programming

This is the oldest style of programming, in which the algorithm for the computation is expressed explicitly in terms of instructions such as assignments, tests, branching and so on.

Execution of the algorithm requires data values to be held in variables which the program can access and modify.

Languages so classified include assembly languages, Fortran, Algol, Pascal, C and Ada.

Imperative programming corresponds naturally to the earliest, basic and still used model for the architecture of the computer, the von Neumann model.


Functional Programming

Functional programming takes a much more mathematical approach, based on the lambda calculus.

The concept of variables is not used in pure functional programming. Instead, the computation is described as a function, which is applied to the input data and which gives the result(s) as output data.

This style is more abstract since it requires the algorithm to be described in a way that is independent of the data.

However, there are only a few pure functional languages, because this concept is often considered to be cumbersome and programs are very tedious to write.

The most prominent languages of this style are Lisp, ML, Scheme, and Haskell.


Logic Programming

Like functional programming, logic programming takes a mathematical approach, but this time through formal logic.

A program is described in terms of predicates, which are the rules that govern the problem. At run-time the use of logical inference enables new formulae to be deduced from those given, or the truth or falsehood of a formula to be deduced from the predicates (full unification).

Logical inference is very much like the process of proving a theorem in mathematics, starting from the axioms and theorems already proved.

The best-known logic language is Prolog.


Object-oriented Programming

In general, object-oriented languages are based on the concepts of class and inheritance, which may be compared to those of type and variable respectively in a language like Pascal.

A class describes the characteristics common to all its instances, in a form similar to the record of Pascal, and thus defines a set of fields.

In object-oriented programming, instead of applying global procedures or functions to variables, we invoke the methods associated with the instances, an action called “message passing”.

The basic concept of inheritance is used to derive new classes from existing ones by modifying or extending the inherited class(es).

The most prominent object-oriented languages are Smalltalk, C++, Eiffel, Java, and ObjectPascal.


Sequential Languages

Instructions are executed one after another in an order that is deduced from the text of the program.

These are the most widely used languages, since they correspond to the classic von Neumann architecture.

Pascal, Haskell, Smalltalk, and Java, for example, are members of the class of sequential languages.


Parallel Languages

In contrast to sequential languages, several program instructions can be executed simultaneously.

These languages are designed to develop programs for multi-processor (distributed memory) architectures.

Parallel languages demand special constructs for communication and synchronization.

The general model for programming in terms of objects can easily be made parallel – actor languages.

The most prominent parallel languages are Occam and Actor.


Special-purpose Languages

Shell, Awk, Perl, Python, JavaScript:
Rapid prototyping
System administration
Program configuration

Postscript, HPGL, TeX, RTF:
Text setting
Description of text, graphical shapes, and images

HTML, XML:
Markup languages


A Brief History

Fortran, 1957
Algol-60
Cobol, 1960
Lisp, 1960
Simula, 1962
Basic, 1964
PL/I, 1965
Algol-68
Prolog, 1970
C, 1972
Pascal, 1975
Scheme, 1975
Ada, 1983
Smalltalk, 1983
C++, 1986
Haskell, 1990
Java, 1993


Syntax

The syntax of a programming language is concerned with the form of programs, i.e., how expressions, commands, declarations, etc., are put together to form programs.

A well-designed programming language will have a well-designed syntax. However, the syntax definition given for a specific language is not powerful enough to define a programming language completely. The purpose of a well-defined syntax is to guide the programmer to understand the language’s semantics.


Semantics

The semantics of a programming language is concerned with the meaning of programs, i.e., how they behave when executed on computers.

The semantics of a programming language assigns a precise meaning to every sentence of the language that can be formed using the given syntax definition. There are three approaches to define the semantics of a programming language:
Axiomatic semantics,
Operational semantics,
Denotational semantics.


The Hilbert-style Proof System

A Hilbert-style proof system consists of axioms and proof rules.
An axiom of a proof system is a formula that is provable by definition.
An inference rule asserts that if some list of formulas is provable, then so is another formula.
A proof is a structured object built from formulas according to constraints established by a set of axioms and inference rules.

The rule format:

$$\frac{\text{Premise}_1 \quad \text{Premise}_2 \quad \ldots \quad \text{Premise}_n}{\text{Conclusion}}$$

We construct a proof from proofs:

$$\frac{\dfrac{\text{Premise}_1}{\text{Conclusion}_1} \quad \dfrac{\text{Premise}_2}{\text{Conclusion}_2} \quad \ldots \quad \dfrac{\text{Premise}_n}{\text{Conclusion}_n}}{\text{Conclusion}}$$


Axiomatic Semantics

The axiomatic semantics is a formal (proof) system for deriving equations between expressions.

The basic idea of the axiomatic method is to define the meaning of language elements indirectly using logical assertions. For example, we can write {E1} C {E2}, called a Hoare triple, to state that if the boolean expression E1 holds prior to the computation of C, and if C terminates, then the boolean expression E2 must hold as well.

Examples:

$$\{a > 0\}\ a := a + 1\ \{a > 1\}$$

$$\frac{\{E_1\}\ C_1\ \{E_2\} \quad \{E_2\}\ C_2\ \{E_3\}}{\{E_1\}\ C_1; C_2\ \{E_3\}}$$


Example Rules

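Statement Type: C =def C.Target := C.Source
Proof Rule:

$$\frac{\text{true}}{\{Q[\text{C.Source}\backslash\text{C.Target}]\}\ C\ \{Q\}}$$

Statement Type: C =def if C.Test then C.Then else C.Else
Proof Rule:

$$\frac{\{\text{C.Test} \land P\}\ \text{C.Then}\ \{Q\} \quad \{\lnot\text{C.Test} \land P\}\ \text{C.Else}\ \{Q\}}{\{P\}\ C\ \{Q\}}$$

Statement Type: Rule of Consequence
Proof Rule:

$$\frac{P \Rightarrow P' \quad \{P'\}\ C\ \{Q'\} \quad Q' \Rightarrow Q}{\{P\}\ C\ \{Q\}}$$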


Example

Using axiomatic semantics, we need to prove the validity of a given Hoare triple.

Example:
{true}
C =def if (a >= b) then
         m = a;
       else
         m = b;
{m = max(a, b)}


Proof

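Premise I:

$$\frac{a \ge b \Rightarrow a = \max(a, b) \quad \dfrac{\text{true}}{\{a = \max(a, b)\}\ m = a;\ \{m = \max(a, b)\}}}{\{a \ge b\}\ m = a;\ \{m = \max(a, b)\}}$$

Premise II:

$$\frac{a < b \Rightarrow b = \max(a, b) \quad \dfrac{\text{true}}{\{b = \max(a, b)\}\ m = b;\ \{m = \max(a, b)\}}}{\{a < b\}\ m = b;\ \{m = \max(a, b)\}}$$

Conclusion:

$$\frac{\{a \ge b \land \text{true}\}\ m = a;\ \{m = \max(a, b)\} \quad \{a < b \land \text{true}\}\ m = b;\ \{m = \max(a, b)\}}{\{\text{true}\}\ C\ \{m = \max(a, b)\}}$$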
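The Hoare triple {true} C {m = max(a, b)} can also be spot-checked by running C on sample inputs. This is only a test on a few values and not a substitute for the proof. The procedure names run-C, post-holds?, and all-hold? are assumptions made for this sketch.

```scheme
; run-C mirrors the statement C: if (a >= b) then m = a; else m = b;
; and returns the final value of m
(define run-C
  (lambda (a b)
    (if (>= a b) a b)))

; the postcondition {m = max(a, b)} for one pair of inputs
(define post-holds?
  (lambda (a b)
    (= (run-C a b) (max a b))))

; check the postcondition for every (a . b) pair in a list
(define all-hold?
  (lambda (pairs)
    (or (null? pairs)
        (and (post-holds? (car (car pairs)) (cdr (car pairs)))
             (all-hold? (cdr pairs))))))

(all-hold? '((1 . 2) (2 . 1) (3 . 3) (-5 . 0)))   ; #t
```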


Operational Semantics

The operational semantics is based on a directed form of equational reasoning called “reduction”. Reduction may be regarded as a form of symbolic evaluation.

The basic idea of the operational method is to define the meaning of the language elements by means of a (labeled) transition system.

The operational semantics definition provides means to display the computation steps undertaken when a program is evaluated to its output.

Some forms of operational semantics are interpreter-based, with instruction counters, data structures, and the like, and others are inference rule-based, with proof trees that show control flows and data dependencies.
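An interpreter-based reading of this idea can be sketched in a few lines. The statement representation, the association-list state, and the name execute are all assumptions made for this illustration; expressions are restricted to constants to keep the sketch short.

```scheme
; A state is an association list of (variable . value) pairs.
; Statements (hypothetical representation):
;   (assign <variable> <constant>)
;   (if <boolean-constant> <statement> <statement>)
(define execute
  (lambda (stmt state)
    (cond ((eq? (car stmt) 'assign)
           ; transition: extend the state with the new binding
           (cons (cons (cadr stmt) (caddr stmt)) state))
          ((eq? (car stmt) 'if)
           ; transition: continue with the branch selected by the test
           (if (cadr stmt)
               (execute (caddr stmt) state)
               (execute (cadddr stmt) state)))
          (else 'stuck))))   ; no transition applies

(execute '(if #f (assign m 1) (assign m 2)) '())   ; ((m . 2))
```

The resulting state can be queried with assq, for example (cdr (assq 'm (execute '(assign m 5) '()))) gives 5.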


Example Transition Rules



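Statement Type: C.Target := C.Source
Transition Rule:

$$\frac{\sigma(\text{C.Source}) = v}{\sigma(\text{C.Target} := \text{C.Source}) \Rightarrow \sigma \oplus \{(\text{C.Target}, v)\}}$$

Statement Type: if C.Test then C.Then else C.Else
Transition Rules:

$$\frac{\sigma(\text{C.Test}) = \text{true} \quad \sigma(\text{C.Then}) \Rightarrow \sigma'}{\sigma(\text{if C.Test then C.Then else C.Else}) \Rightarrow \sigma'}$$

$$\frac{\sigma(\text{C.Test}) = \text{false} \quad \sigma(\text{C.Else}) \Rightarrow \sigma'}{\sigma(\text{if C.Test then C.Then else C.Else}) \Rightarrow \sigma'}$$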


Denotational Semantics

The denotational semantics, or model theory, is defined in the spirit of equational logic or first-order logic. A denotational semantics definition (model) consists of a family of sets, one for each type, with the property that each well-typed expression may be interpreted as a specific element of the appropriate set.

The denotational semantics is a recursive definition that maps well-typed derivation trees to their mathematical meanings. For example, the set Bool consists of two meanings: Bool = {true, false} and an operation not : Bool → Bool with not(false) = true, not(true) = false.

The denotational method does not maintain states, but the meaning of a program is given as a function that interprets all language elements of a given program as elements of a corresponding set of values.


Example Meaning Functions



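Statement Type: C.Target := C.Source
Meaning Function:

$$f(\text{C.Target} := \text{C.Source}, \sigma) = \begin{cases} \sigma \oplus \{(f(\text{C.Target}, \sigma),\ f(\text{C.Source}, \sigma))\} & \text{if } f(\text{C.Target}, \sigma) \ne \text{nil} \\ \text{ERROR} & \text{else} \end{cases}$$

Statement Type: if C.Test then C.Then else C.Else
Meaning Function:

$$f(\text{if C.Test then C.Then else C.Else}, \sigma) = \begin{cases} f(\text{C.Then}, \sigma) & \text{if } f(\text{C.Test}, \sigma) \\ f(\text{C.Else}, \sigma) & \text{else} \end{cases}$$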


Types and Type Systems

Types are collections of values that share some common properties. When we say that v is a value of type T, we mean that v ∈ T.

In some systems, there may be types with types as members. Types with types as members are usually called something else, such as universes, orders or kinds, to avoid the impression of circularity.

In a type system, types provide a division or classification of some universe of possible values. A type system defines in a mathematical way (axioms and deduction rules) which expressions are typable, i.e., which expressions can be assigned a valid type using the underlying type system.

In most programming languages, types are “checked” in some way, either during program compilation, or during execution. The main purposes of type checking are the detection of errors, documentation, program optimization, etc.


Values

In computer science we classify as a value everything that may be evaluated, stored, incorporated in a data structure, passed as an argument to a procedure or function, returned as a function result, and so on.

In computer science, as in mathematics, an “expression” is used (solely) to denote a value.

Which kinds of values are supported by a specific programming language depends heavily on the underlying paradigm and its application domain.

Most programming languages share some basic sets of values like truth values, integers, real numbers, records, lists, etc.


Constants

Constants are named abstractions of values.

Constants are used to assign a user-defined meaning to a value.

Examples:
EOF = -1
TRUE = 1
FALSE = 0
PI = 3.1415927
MESSAGE = "Welcome to Com S 342"

Constants do not have an address, i.e., they do not have a location.

At compile time, applications of constants are substituted by their corresponding definition.


Primitive Values

Primitive values are those values that cannot be further decomposed. Some of these values are implementation and platform dependent.

Examples:
Truth values,
Integers,
Characters,
Strings,
Enumerands,
Real numbers.


Composite Values

Composite values are built up using primitive values and composite values. The layout of composite values is in general implementation dependent.

Examples:
Records,
Arrays,
Enumerations,
Sets,
Lists,
Tuples,
Files.


Pointers

Pointers are references to values, i.e., they denote locations of values.

Pointers are used to store the address of a value (variable or function) – pointer to a value, and pointers are also used to store the address of another pointer – pointer to pointer.

In general, it is not necessary to define pointers with a greater reference level than pointer to pointer.

In modern programming languages, we find pointers to variables, pointers to pointers, function pointers, and object pointers, but not all programming languages provide means to use pointers directly (e.g. Java and Scheme do not).


Inductive Sets of Data

Overview
Inductive Specification
Backus-Naur Form
Proof by Induction

References
Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
John C. Mitchell, “Foundations of Programming Languages”, MIT Press, 1996


Sets

A set is a collection of elements (or values), possibly empty.

All elements satisfy a possibly complex characterizing property. Formally, we write:

{ x ∈ A | P(x) = True }

to define a set, where all elements satisfy the property P.

The basic axiom of set theory is that there exists an empty set, ∅, with no elements. Formally,

∀x, x ∉ ∅

In words, “for every x, x is not an element of ∅”.


Inductive Specification

Sometimes it is difficult to define a set explicitly, in particular if the elements of the set have a complex structure.

However, it may be easy to define the set in terms of itself. This process is called inductive specification or recursion.

Example:
Let the set S be the smallest set of natural numbers satisfying the following two properties:
0 ∈ S, and
Whenever x ∈ S, then x + 3 ∈ S.

The first property is called the base clause and the second property is called the inductive/recursive clause. An inductive specification may have multiple base and inductive clauses.
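The inductive specification of S translates almost directly into a recursive membership test. A minimal sketch; the name in-S? is chosen for this example, and the argument is assumed to be an integer.

```scheme
; in-S? follows the two clauses of the specification:
;   base clause:      0 is in S
;   inductive clause: n is in S whenever n - 3 is in S
(define in-S?
  (lambda (n)
    (cond ((< n 0) #f)                 ; stepped past 0: not in S
          ((= n 0) #t)                 ; base clause
          (else (in-S? (- n 3))))))    ; inductive clause, read backwards

(in-S? 9)   ; #t
(in-S? 7)   ; #f
```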


The “Smallest Set”

If we use inductive specification, we always define the smallest set that satisfies all given properties. That is, inductive specification is free of redundancy.

It is easy to see that there can only be one such set:
If S1 and S2 both satisfy all given properties, and both are the smallest, then we have S1 ⊆ S2 (since S1 is the smallest), and S2 ⊆ S1 (since S2 is the smallest), hence S1 = S2.


List of Numbers

The set list-of-numbers is the smallest set of values satisfying the two properties:
The empty list is a list-of-numbers, and
If l is a list-of-numbers and n is a number, then the pair (n . l) is a list-of-numbers.

A pair “(x . y)” (also called dotted pair) is a record structure with two fields called the car (head) and cdr (tail) field. Pairs are created using the procedure cons.

Examples:
() is a list-of-numbers, since () satisfies property 1.
(14 . ()) is a list-of-numbers, since 14 is a number and () is a list-of-numbers.
(4 . (14 . ())) is a list-of-numbers, since 4 is a number and (14 . ()) is a list-of-numbers.
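The two properties can be turned into a recursive predicate, one cond clause per property. This is a sketch; the name list-of-numbers? is an assumption made for this example.

```scheme
(define list-of-numbers?
  (lambda (lst)
    (cond ((null? lst) #t)                          ; property 1: the empty list
          ((pair? lst)                              ; property 2: (n . l)
           (and (number? (car lst))
                (list-of-numbers? (cdr lst))))
          (else #f))))                              ; anything else is not in the set

(list-of-numbers? '())                 ; #t
(list-of-numbers? '(4 . (14 . ())))    ; #t
(list-of-numbers? '(4 a 14))           ; #f
```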


Well-formed Formulae

Well-formed formulae for compound boolean propositions are defined as follows:
True and False are well-formed formulae,
p, where p is a propositional variable, is a well-formed formula,
(¬ p) is a well-formed formula, if p is a well-formed formula,
(p ∧ q), (p ∨ q), (p → q), (p ↔ q) are well-formed formulae, if both p and q are well-formed formulae.

Examples:
p → ¬ q
(p → q) ↔ ((¬ p ∨ q) → q)

Note: This inductive specification of well-formed boolean propositions also defines an extended boolean term algebra TΣ(X), where the carrier set consists precisely of all the terms which can be generated from the constants, variables and operations of the signature Σ (i.e., the inductive specification).


The Backus-Naur Form

The process of describing more complex data types becomes quite cumbersome. In order to simplify this process, we specify complex values using a context-free grammar (or type 2 grammar).

We use a notation called Backus-Naur Form, or BNF, to specify values using a context-free grammar:
- The general rule format is lhs ::= rhs, where lhs is a nonterminal and rhs may be a list, separated with “|”, of strings of terminals and nonterminals.
- All nonterminals are enclosed in angle brackets, <>.

Example: list-of-numbers

<list-of-numbers> ::= ()
                    | (<number> . <list-of-numbers>)

Note: In BNF, some nonterminals (e.g. <number>) are left undefined when their meaning is sufficiently clear from the context.


Types of Grammars

Type 0: A grammar that has no restrictions on its productions.

Type 1 – context-sensitive: A grammar can only have productions of the form w1 → w2, where the length of w2 is greater than or equal to the length of w1, or of the form w1 → ε.

Type 2 – context-free: A grammar can only have productions of the form w1 → w2, where w1 is a single nonterminal.

Type 3 – regular: A grammar can only have productions of the form w1 → w2, where w1 is a nonterminal and w2 is either aB, Ba, a, or ε, where B is a nonterminal and a is a terminal.


Kleene Star

The Kleene star, written { … }*, is used to specify a sequence of any number of instances of a given string.

Example:

<s-list> ::= ({<symbol-expression>}*)
<symbol-expression> ::= <symbol> | <s-list>

()
(a b c)
(fun1 (fun2 arg1 arg2) arg3 arg4)
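The s-list grammar above translates almost directly into a pair of mutually recursive predicates, one per nonterminal. This is a sketch under the assumption that s-lists are represented as ordinary Scheme lists; the predicate names are illustrative.

```scheme
;; s-list? : any -> boolean
;; A value is an <s-list> if it is a proper list of symbol-expressions.
(define s-list?
  (lambda (x)
    (or (null? x)                             ; zero instances
        (and (pair? x)
             (symbol-expression? (car x))     ; one more instance ...
             (s-list? (cdr x))))))            ; ... followed by the rest

;; symbol-expression? : any -> boolean
(define symbol-expression?
  (lambda (x)
    (or (symbol? x) (s-list? x))))
```

All three example values above satisfy s-list?, while (a 1) does not, because 1 is not a symbol.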


Kleene Plus

The Kleene plus, written { … }+, is used to specify a sequence of one or more instances of a given string.

Example:

<nonempty-list> ::= ({<datum>}+)
<datum> ::= <number> | <symbol> | <string>

(a b “ComS342”)
(fun1 (fun2 arg1 arg2) 3 “An argument”)


Separated List Notation

The separated list notation, written { … }*(c) or { … }+(c), can be used to specify any number of instances of a given string that are separated with a non-empty character sequence.

Example:

<list-of-expressions> ::= ({<expression>}*(,))

(1 , 2 , 3)

NOTE: This form is not used in the syntax specification of Scheme!


Induction

Having defined a set inductively, we can use the inductive definition to prove properties about members of the set.

The proof technique used is called mathematical induction.

The most common forms are induction on the structure of expressions and induction on the length or structure of proofs.

A simple and intuitive way to think of induction is that it is a method for writing down an infinite proof in a finite way.

Note: We can construct infinitely many values from a given inductive specification.


Mathematical Induction

A proof by mathematical induction that a given property P is true for every positive integer n, written P(n), consists of two steps:

1. Basis step. The proposition P(1) (or P(0)) is shown to be true.

2. Inductive step. The implication P(n) → P(n+1) is shown to be true for every positive integer n.

Note: In a proof by mathematical induction it is not assumed that P(n) is true for all positive integers! It is only shown that if it is assumed that P(n) is true, then P(n+1) is also true. In general, we use an inference rule called “Modus ponens”, that is, [p ∧ (p → q)] → q. In words, if a property “p” is true and “p” implies “q”, then “q” is also true. The rule “Modus ponens” is a “tautology”.


Example: Sum(n) = n(n+1)/2

As a young boy, the later mathematician Carl Friedrich Gauss was asked by his teacher to add up the first hundred numbers, in order to keep him quiet for a while. As we know today, this did not work out, since:

sum(n) = n(n+1)/2

Proof:
Base case: We must show that sum(0) = 0(0+1)/2. This is an easy calculation, and we have sum(0) = 0.

Inductive step: Assume sum(n) = n(n+1)/2 holds. We must show that sum(n+1) = (n+1)(n+2)/2 holds as well. First, sum(n+1) is just the sum of the first n numbers plus (n+1). Therefore, we have sum(n+1) = sum(n) + (n+1). Using the induction hypothesis, we have

sum(n+1) = n(n+1)/2 + (n+1) = (n(n+1) + 2(n+1))/2 = (n^2 + n + 2n + 2)/2 = (n+1)(n+2)/2

as required.
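The recursive structure of the proof can be mirrored in a short Scheme sketch that computes sum(n) by recursion and compares it with the closed form. The procedure names sum and closed-form are chosen here for illustration.

```scheme
;; sum : nonnegative integer -> integer
;; Follows the proof: sum(0) = 0 and sum(n) = sum(n-1) + n.
(define sum
  (lambda (n)
    (if (zero? n)
        0
        (+ (sum (- n 1)) n))))

;; closed-form : nonnegative integer -> integer
;; The formula n(n+1)/2 established by the induction.
(define closed-form
  (lambda (n)
    (/ (* n (+ n 1)) 2)))
```

Evaluating (sum 100) and (closed-form 100) both give 5050, the answer Gauss is said to have found.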


Theorem 1.1.1

Let s ∈ <binary-tree>, where <binary-tree> is defined by

<binary-tree> ::= <number> | (<symbol> <binary-tree> <binary-tree>)

Then s contains an odd number of nodes.

We prove this theorem by induction on the size of s, where we take the size of s to be the number of nodes in s.


Proof of Theorem 1.1.1

The statement P(n) for a fixed positive integer n is called the “induction hypothesis”. When we complete both steps of a proof by mathematical induction, we have shown that P(n) is true for all positive integers n, that is, we have shown that ∀n P(n) is true.

[P(1) ∧ ∀n (P(n) → P(n+1))] → ∀n P(n)

To prove Theorem 1.1.1, first we have to show that P(0) is true (s has no nodes at all), and then we prove that whenever n is a number of nodes such that P(n) is true for n, then P(n+1) is also true.


The Proof Steps

The induction hypothesis, P(n), is that any tree of size ≤ n has an odd number of nodes. We use induction on the size of binary trees.
- There are no trees with 0 nodes, so P(0) holds trivially.
- Let n be a size such that P(n) holds, that is, a tree with ≤ n nodes actually has an odd number of nodes. We need to show that P(n+1) holds as well, that is, any tree with ≤ n+1 nodes has an odd number of nodes. We proceed by case analysis of the structure of binary trees:

s ≡ n (s is a number):
In this case s has exactly one node, and one is odd.

s ≡ (sym s1 s2):
By assumption, s has ≤ n+1 nodes. Therefore, both s1 and s2 must have fewer nodes than s, that is, s1 and s2 must have ≤ n nodes. Using the induction hypothesis, their numbers of nodes must be odd, say 2n1+1 and 2n2+1. Hence, the total number of nodes in the tree s is

((2n1+1) + (2n2+1)) + 1 = (2n1 + 2n2 + 2) + 1 = 2(n1 + n2 + 1) + 1

which is odd again.


Structural Induction

Structural induction uses the fact that substructures of a given object are always smaller than the object itself.

We have used this fact in the proof of Theorem 1.1.1.

Structural induction is done as follows:
- Base step: The induction hypothesis is true on simple structures (those without substructures).
- Induction step: If the induction hypothesis is true on the substructures of a given object, say s, then it is true on s itself.


The Hilbert-style Proof System

A Hilbert-style proof system consists of axioms and proof rules.
- An axiom of a proof system is a formula that is provable by definition.
- An inference rule asserts that if some list of formulas is provable, then so is another formula.
- A proof is a structured object built from formulas according to constraints established by a set of axioms and inference rules.

The rule format:

$$\frac{\text{Premise}_1 \quad \text{Premise}_2 \quad \ldots \quad \text{Premise}_n}{\text{Conclusion}}$$

We construct a proof from proofs:

$$\frac{\dfrac{\text{Premise}_1}{\text{Conclusion}_1} \quad \dfrac{\text{Premise}_2}{\text{Conclusion}_2} \quad \ldots \quad \dfrac{\text{Premise}_n}{\text{Conclusion}_n}}{\text{Conclusion}}$$


Recursive Program Specification

Overview:
- Inductive program specification
- Deriving programs from a Backus-Naur form
- Pattern of recursion

References:
- Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
- Harold Abelson et al., “Structure and Interpretation of Computer Programs”, MIT Press, 1996


From BNF to a Program

With the help of BNF rules, starting with simple members of a data set, we are able to specify complex data structures inductively.

We can use the same approach to construct programs that manipulate these data structures.

First we define the program’s behavior on simple inputs, and then we use this behavior to inductively build programs that can process more complex arguments.


Exponentiation

Consider the problem of computing the integer exponential of a given integer number.

A program that solves this problem should take as arguments a base b and a nonnegative integer exponent n and compute b^n:

b * b * ... * b = b^n   (n factors)

or

b^0 = 1, b^1 = b, b^2 = b * b, ..., b^n = b^(n-1) * b

In general,

$$e(b, n) = \begin{cases} 1 & \text{if } n = 0 \\ b * e(b, n-1) & \text{if } n > 0 \end{cases}$$


Is e( b, n ) = b^n correct?

To show that e( b, n ) = b^n is indeed correct, we proceed by induction on n:

Base step: n = 0
Then we have e( b, 0 ) = 1 = b^0.

Induction step:
Assume e( b, n ) = b^n is correct. We must show that e( b, n+1 ) = b^(n+1). By the definition of e, it holds that e( b, n+1 ) = b * e( b, n ). Using the induction hypothesis, we have e( b, n+1 ) = b * b^n = b^1 * b^n = b^(n+1), as desired.


Procedure Exponential

The Scheme procedure for e( b, n ) is defined as follows:

(define exponential
  (lambda (b n)
    (if (zero? n)
        1
        (* b (exponential b (- n 1))))))

The two branches of the if expression correspond to the two cases of the inductive definition of e( b, n ).

If we can reduce a given problem to a subproblem, we can recursively call the procedure that solves the original problem to solve the subproblem.


Exponentiation with negative Exponent

(define exponential
  (lambda (b n)
    (if (zero? n)
        1
        (if (negative? n)
            (* (/ 1 b) (exponential b (+ n 1)))
            (* b (exponential b (- n 1)))))))

This procedure works on all integers (including negative integers).

It holds that b^(-n) = 1/b^n for all integers n. Moreover, we can use inductive program specification, since b^(-n) = 1/b * 1/b^(n-1).


Recursion

If a procedure contains calls to itself within its body, then this procedure is said to be recursively defined.

This approach to program specification is called recursion and is found not only in programming.

If we define a procedure recursively, then there must exist at least one subproblem that can be solved directly, that is, without calling the procedure again.

Note: A recursively defined procedure must always contain a directly solvable subproblem. Otherwise, this procedure does not terminate.


Direct Program Derivation

An inductive proof can often be used to directly derive the corresponding computer program.

For example, the proof of Theorem 1.1.1 (a binary tree contains an odd number of nodes) leads directly to the following program:

(define count-nodes
  (lambda (s)                          ;; s in <binary-tree>
    (if (number? s)                    ;; s = <number>
        1
        (+ (count-nodes (cadr s))      ;; s = (sym s1 s2), cadr = cdr+car
           (count-nodes (caddr s))     ;; caddr = cdr+cdr+car
           1))))

> (count-nodes '(s 1 1))
==> 3


Rule of Thumb

When defining a program based on structural induction, the structure of the program must be patterned according to the structure of the data.

In general, this means that we have to define one procedure for each syntactic category used to specify our data. Then each procedure has to examine the input to see which right-hand side it corresponds to. Furthermore, for every nonterminal that appears in the right-hand side, there will be a recursive call to a procedure for that nonterminal. This approach is also called recursive-descent parsing.


Always Remember

FOLLOW THE GRAMMAR


Predicate list-of-numbers?

<list-of-numbers> ::= ()
                    | (<number> . <list-of-numbers>)

The predicate list-of-numbers? is a recursively defined procedure, which analyses a given list l according to the BNF specification:

(define list-of-numbers?
  (lambda (l)
    (if (null? l)                        ;; null? returns #t if the argument is ()
        #t
        (and                             ;; second case: check the pair
          (number? (car l))              ;; (car '(1 . 2)) = 1
          (list-of-numbers? (cdr l)))))) ;; (cdr '(1 . 2)) = 2

> (list-of-numbers? '(1 . (1 2)))        ;; (1 2) = (1 . (2 . ())), see R5RS page 25f
#t


Introduction to the Lambda Calculus

Overview:
- What is Computability? – Church’s Thesis
- The Lambda Calculus
- Scope and lexical address
- The Church-Rosser Property

References:
- Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
- H.P. Barendregt, “The Lambda Calculus – Its Syntax and Semantics”, North-Holland, 1984
- David A. Schmidt, “The Structure of Typed Programming Languages”, MIT Press, 1994
- Carl A. Gunter, “Semantics of Programming Languages”, MIT Press, 1992


What Is Computable?

Computation is usually modeled as a mapping from inputs to outputs, carried out by a formal “machine”, or program, which processes its input in a sequence of steps.

An “effectively computable” function is one that can be computed in a finite amount of time using finite resources.

[Figure: the input of a problem is passed to a program/machine that computes an “effectively computable” function and produces the output (yes or no).]


Church’s Thesis

Effectively computable functions [from positive integers to positive integers] are just those definable in the lambda calculus.

Or, equivalently:

It is not possible to build a machine that is more powerful than a Turing machine.

Church’s thesis cannot be proven because “effectively computable” is an intuitive notion, not a mathematical one. It can only be refuted by giving a counter-example – a machine that can solve a problem not computable by a Turing machine.

So far, all models of effectively computable functions have been shown to be equivalent to Turing machines (or the lambda calculus).


Turing Machine

A Turing machine is an abstract representation of a computing device. It consists of a read/write head that scans a (possibly infinite) one-dimensional (bi-directional) tape divided into squares, each of which is inscribed with a 0 or 1.

Computation begins with the machine, in a given "state", scanning a square. It erases what it finds there, prints a 0 or 1, moves to an adjacent square, and goes into a new state.

This behavior is completely determined by three parameters:
- the state the machine is in,
- the number on the square it is scanning, and
- a table of instructions.

A Turing machine is more like a computer program (software) than a computer (hardware).


Example

         0        1
0      (1,0)    (3,1)
1      (2,0)    (3,1)
2      (3,0)    (3,1)
3      (3,0)    (3,1)

[Figure: state diagram with the states 0, 1, 2, and 3, whose transitions are labeled 0/0 and 1/1.]

Both specifications describe the same Turing machine.


Uncomputability

A problem that cannot be solved by any Turing machine (or any equivalent formalism) in finite time is called uncomputable.

Assuming Church’s thesis is true, an uncomputable problem cannot be solved by any real computer.

The Halting Problem: Given an arbitrary Turing machine and its input tape, will the machine eventually halt?

The Halting Problem is provably uncomputable – which means that it cannot be solved in practice.


Ackermann Function

The Ackermann function is the simplest example of a well-defined total function which is computable but not primitive recursive.

The function f(x) = A(x, x), while Turing computable, grows much faster than polynomials or exponentials. The definition is:

A(0, y) = y + 1
A(m+1, 0) = A(m, 1)
A(m+1, n+1) = A(m, A(m+1, n))

Examples:
A(2, 3) = 9
A(3, 5) = 253
A(4, 1) = 65533
A(4, 3) = 2^(2^65536) - 3
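The three defining equations carry over to Scheme one clause at a time. This is a direct sketch of the definition; the procedure name ackermann is chosen here, and arguments beyond the small examples quickly become infeasible to evaluate.

```scheme
;; ackermann : nonnegative integer * nonnegative integer -> integer
(define ackermann
  (lambda (m n)
    (cond ((zero? m) (+ n 1))                      ; A(0, y) = y + 1
          ((zero? n) (ackermann (- m 1) 1))        ; A(m+1, 0) = A(m, 1)
          (else (ackermann (- m 1)                 ; A(m+1, n+1) =
                           (ackermann m (- n 1))))))) ;   A(m, A(m+1, n))
```

Evaluating (ackermann 2 3) gives 9 and (ackermann 3 5) gives 253, matching the first two examples.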


The Lambda Calculus

Lambda calculus is a language with clear operational and denotational semantics capable of expressing algorithms. It also forms a compact language to denote mathematical proofs.

Logic provides a formal language in which mathematical statements can be formulated and provides deductive power to derive these. Type theory is a formal system, based on lambda calculus and logic, in which statements, computable functions and proofs can all be naturally represented.

The lambda calculus is a good medium to represent mathematics on a computer with the aim to exchange and store reliable mathematical knowledge.


The Definition of the Lambda Calculus

The lambda calculus was invented by Alonzo Church [1932] as a mathematical formalism for expressing computation by functions.

Syntax:
e ::= x          a variable
    | λx . e     an abstraction (function)
    | e1 e2      a (function) application

(Operational) Semantics:
α-conversion (renaming):    λx . e ↔ λy . [y/x]e       where y is fresh (in e)
β-reduction (application):  (λx . e1) e2 → [e2/x]e1    avoiding name capture
η-reduction:                λx . (e x) → e             if x is not free in e

The lambda calculus can be viewed as the simplest possible pure functional programming language.


The Scheme Syntax

<expression> ::= <identifier>
               | (lambda (<identifier>) <expression>)
               | (<expression> <expression>)

Examples:
id = (lambda (x) x)
Ω = ((lambda (x) (x x)) (lambda (x) (x x)))
pair(x, y) = (lambda (x) (lambda (y) (lambda (z) ((z x) y))))


Reference or Declaration

In a program, variables can appear in two different ways:

as declarations: (lambda (x) …) or (let ((x …)) …)

The occurrence of x in both the lambda abstraction and the let clause introduces the variable as a name for some value. In particular, in the lambda expression, the value of the variable x will be supplied when the procedure is called, whereas in the let expression the value of the variable is determined by the value of the first “…” (init expression).

as references: (f x y)

Here all variables, f, x, y, appear as references, whose meanings are defined by an enclosing declaration.


Binding

A value named by a variable is also called its denotation (meaning). The denotation must come from some declaration; we say the variable is bound by that declaration, or it refers to that declaration.

Declarations in most programming languages, including Scheme, have limited scope (the area where the variable is applicable). Therefore, the same variable name may occur multiple times in the program text while being used for different purposes. We use binding rules to determine the declaration to which a concrete variable use refers.

Scoping rules:
- We call a language statically scoped if we can determine the declaration of a variable by analyzing the program text alone.
- We call a language dynamically scoped if we cannot determine the declaration of a variable until the program is executed.
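A small experiment makes the difference between the two scoping rules observable. The sketch below uses illustrative names (add-x, call-with-local-x); since Scheme is statically scoped, the reference to x inside add-x is resolved from the program text alone.

```scheme
(define x 10)

;; The x in the body refers to the top-level declaration above,
;; because that is the declaration enclosing this lambda in the text.
(define add-x
  (lambda (y) (+ x y)))

;; The let introduces a new, unrelated declaration of x.
;; It is in effect while add-x runs, but add-x cannot see it.
(define call-with-local-x
  (lambda ()
    (let ((x 1000))
      (add-x 1))))
```

Evaluating (call-with-local-x) yields 11. A dynamically scoped language would instead find the most recent declaration at run time and yield 1001.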


Binding Rules in Lambda Calculus

In (lambda (<identifier>) <expression>), the occurrence of <identifier> is a declaration that binds all occurrences of that variable in <expression> unless some intervening declaration of the same variable occurs.

Examples:
(lambda (x) (lambda (y) (y x)))
(lambda (x) (lambda (y) ((lambda (x) (lambda (y) (x y))) x) y))


Occurs Free, Occurs Bound

A variable x occurs free in e if and only if there is some use of x in e that is not bound by any declaration of x in e.

A variable x occurs bound in an expression e if and only if there is some use of x in e that is bound by a declaration of x in e.

Examples:
((lambda (x) x) y): x is bound, but y is free
(lambda (f) (lambda (x) (f x))): both f and x are bound

Note: Lambda expressions with no free variables are called combinators. Every procedure, when applied to all its necessary arguments, is a combinator. Therefore, procedure calls are called combinations in Scheme.
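The definition of "occurs free" follows the three-case grammar of lambda expressions, so it can be sketched as a procedure with one clause per case. The sketch assumes expressions are quoted lists in the one-parameter syntax (lambda (x) e) and (e1 e2), and that no variable is itself named lambda.

```scheme
;; occurs-free? : symbol * lambda-expression -> boolean
(define occurs-free?
  (lambda (var exp)
    (cond ((symbol? exp)                      ; a variable reference
           (eq? var exp))
          ((eq? (car exp) 'lambda)            ; (lambda (x) body)
           (and (not (eq? var (car (cadr exp))))
                (occurs-free? var (caddr exp))))
          (else                               ; an application (e1 e2)
           (or (occurs-free? var (car exp))
               (occurs-free? var (cadr exp)))))))
```

Applied to the first example, (occurs-free? 'y '((lambda (x) x) y)) is #t and (occurs-free? 'x '((lambda (x) x) y)) is #f.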


Free and Bound Variables

The variable x is bound by the enclosing λ in the expression λx . e. A variable that is not bound is free:

fv( x ) = { x }                        bv( x ) = ∅
fv( λx . e ) = fv( e ) \ { x }         bv( λx . e ) = bv( e ) ∪ { x }
fv( e1 e2 ) = fv( e1 ) ∪ fv( e2 )      bv( e1 e2 ) = bv( e1 ) ∪ bv( e2 )

An expression with no free variables is closed (otherwise it is open). For example, y is bound and x is free in the (open) expression λy . x y.

Syntactic substitution will not always work:
(λx . λy . x y) y → [y/x](λy . x y)    β-reduction
                  ≠ (λy . y y)         incorrect substitution!
Since y is already bound in (λy . x y), we cannot directly substitute y for x.


The Scope of a Variable

Problem: For each variable reference, find the corresponding declaration to which it refers.

This problem is easier to solve when we ask: Given a declaration, which variable references refer to it?

In the definition of programming languages, binding rules for variables typically associate with each declaration of a variable a region of the program within which the declaration is effective.

Examples:
(lambda (x) …): The region of x is the body of the lambda expression.
(define x …): The region of x is the whole program.


Blocks

In the lambda calculus, as in many modern programming languages, regions can be nested within each other. We call these languages block-structured, and regions are also called blocks.

> (define x                 ; first declaration of x
    (lambda (x)             ; second declaration of x
      (map
        (lambda (x)         ; third declaration of x
          (+ x 1))          ; refers to third
        x)))                ; refers to second

> (x '(1 2 3))              ; refers to first
(2 3 4)


Visibility

The scope of a variable, say x, can include inner regions that hide the variable x. Within these inner regions the outer declaration of the variable x is hidden, that is, the scope of x has a hole.

We say the declaration of a variable is visible at the point of a variable reference if the variable reference lies within the scope of this declaration.

Example: (lambda (x) (lambda (y) ((lambda (x) (lambda (y) (x y))) x) y))


Contour Diagrams

We use contour diagrams to picture the borders of a region:

The lexical (or static) depth of a variable reference is the number of contours crossed to find the associated declaration.

The lexical depth is used in compilers to tell how many static links to traverse to find a variable.



Lexical Address

The declarations associated with a region may be numbered in the order of their appearance in the text. Each variable reference may then be associated with two numbers: its lexical depth and its position (both start with 0).

To illustrate lexical addresses, we replace every variable reference x with an expression (x : d p), where d is the lexical depth and p is the declaration position of x.

(lambda (x y) ((lambda (a) ((x : 1 0) ((a : 0 0) (y : 1 1)))) (x : 0 0)))

Note: The lexical address can be used by a compiler: lexical depth = number of static links, and lexical position = offset within the activation frame.


Beta Reduction

Beta reduction is the computational engine of the lambda calculus:

Define: I ≡ λx . x

Now consider:
I I = (λx . x) (λx . x)
    → [(λx . x)/x]x       β-reduction
    = (λx . x)            substitution
    = I

We can implement most lambda expressions directly in Scheme:
> (define i (lambda (x) x))
> (i 5)
5
> (i (i 5))
5


Substitution

We must define substitution carefully to avoid name capture:

[e/x]x = e
[e/x]y = y                                if x ≠ y
[e/x](e1 e2) = ([e/x]e1 [e/x]e2)
[e/x](λx . e1) = (λx . e1)
[e/x](λy . e1) = (λy . [e/x] e1)          if x ≠ y and y ∉ fv( e )
[e/x](λy . e1) = (λz . [e/x] [z/y] e1)    if x ≠ y and z ∉ (fv( e ) ∪ fv( e1 ))

Consider:
(λx . ((λy . x ) (λx . x)) x ) y → [y/x]((λy . x) (λx . x)) x
                                 = ((λz . y) (λx . x)) y


Alpha Conversion

Alpha conversion allows one to rename bound variables.

A bound name x in the lambda abstraction (λx . e) may be substituted by any other name y, as long as there are no free occurrences of y in e:

Consider:
(λx . λy . x y) y → (λx . λz . x z) y    α-conversion
                  → [y/x](λz . x z)      β-reduction
                  = (λz . y z) = y       η-reduction


Eta Reduction

η-reduction allows one to remove “redundant lambdas”.

Suppose that f is a closed expression (so, in particular, x does not occur free in f). Then:

(λx . f x) y → ([y/x]f ) ([y/x]x) = f y    β-reduction

More generally, this will hold whenever x does not occur free in f. In such cases, we can always rewrite (λx . f x) as f.


Currying

Since a lambda abstraction only binds a single variable, functions with multiple parameters must be modeled as curried higher-order functions. This method is named after the logician H. B. Curry, who popularized the approach.

To improve readability, multiple lambdas can be suppressed, so:
λx y . x = λx . λy . x
λb x y . b x y = λb . λx . λy . (b x) y

Scheme:
(lambda (x y) x) = (lambda (x) (lambda (y) x))
(lambda (b x y) (b x y)) = (lambda (b) (lambda (x) (lambda (y) ((b x) y))))
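In actual Scheme code the two forms compute the same result but are called differently, since Scheme does not curry procedures automatically. The following sketch uses illustrative names to show both calling conventions and a partial application.

```scheme
;; Two-parameter procedure: both arguments are passed in one call.
(define first-of-two
  (lambda (x y) x))

;; Curried version: one argument per call.
(define first-of-two-curried
  (lambda (x) (lambda (y) x)))

;; Currying allows partial application: supply x now, y later.
(define add-curried
  (lambda (x) (lambda (y) (+ x y))))

(define add-three (add-curried 3))
```

Here (first-of-two 1 2) and ((first-of-two-curried 1) 2) both yield 1, and (add-three 4) yields 7.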


Normal Forms

A lambda expression is in normal form if it can no longer be reduced by the β- or η-reduction rules.

But not all lambda expressions have normal forms!

Ω = (λx . x x) (λx . x x)
  → [(λx . x x)/x] (x x) = (λx . x x) (λx . x x)    β-reduction
  → (λx . x x) (λx . x x)                           β-reduction
  → (λx . x x) (λx . x x)                           β-reduction
  → (λx . x x) (λx . x x)                           β-reduction
  ...

Reduction of a lambda expression to a normal form is analogous to the fact that a Turing machine halts or a program terminates.


Evaluation Order

Most programming languages are strict, that is, all expressions passed to a function call are evaluated before control is passed to the function (e.g. Scheme).

Most modern functional languages, on the other hand, use lazy evaluation, that is, expressions are only evaluated when they are needed.

Consider: square n = n * n

Applicative-order reduction:
square (2 + 5) → square 7 → 7 * 7 → 49

Normal-order reduction:
square (2 + 5) → (2 + 5) * (2 + 5) → 7 * (2 + 5) → 7 * 7 → 49


Applicative-Order Reduction

Motivation: Modeling call-by-value in programming languages
- In function calls, evaluate the arguments, then invoke the function.

In the lambda calculus, this means:
- In (e1 e2), reduce e2 to normal form using applicative-order reduction.
- Then reduce e1 to normal form using applicative-order reduction.
- If e1 is a lambda abstraction, do beta reduction, and reduce the result to normal form using applicative-order reduction.

Syntax makes it easy:
- Write the expression using fully parenthesized notation.
- Always perform the rightmost beta reduction by repeatedly scanning for the rightmost (left parenthesis) occurrence of ((λx . e1) e2).
- Note, this includes reduction of primitives, e.g. ((add 1) 2).


Applicative-Order Example

Consider:
((λx . ((λy . add y y) (mul x x))) (sub 3 1))

Applicative-order reduction gives
((λx . ((λy . add y y) (mul x x))) (sub 3 1))
((λx . ((λy . add y y) (mul x x))) 2)
((λx . (add (mul x x) (mul x x))) 2)
(add (mul 2 2) (mul 2 2))
(add (mul 2 2) 4)
(add 4 4)
8


Applicative-Order Example - Scheme

Consider:
((lambda (x) ((lambda (y) (+ y y)) (* x x))) (- 3 1))

Applicative-order reduction gives
((lambda (x) ((lambda (y) (+ y y)) (* x x))) (- 3 1))
((lambda (x) ((lambda (y) (+ y y)) (* x x))) 2)
((lambda (y) (+ y y)) (* 2 2))
((lambda (y) (+ y y)) 4)
(+ 4 4)
8


Normal-Order Example

Consider:
((λx . ((λy . add y y) (mul x x))) (sub 3 1))

Normal-order reduction gives
((λx . ((λy . add y y) (mul x x))) (sub 3 1))
((λy . add y y) (mul (sub 3 1) (sub 3 1)))
(add (mul (sub 3 1) (sub 3 1)) (mul (sub 3 1) (sub 3 1)))
(add (mul 2 2) (mul 2 2))
(add 4 4)
8


The Church-Rosser Property

“If an expression can be evaluated at all, it can be evaluated by consistently using normal-order evaluation. If an expression can be evaluated in several different orders (mixing normal-order and applicative-order reduction), then all of these evaluation orders yield the same result”.

So, evaluation order “does not matter” in the lambda calculus. However, applicative-order reduction may not terminate, even if a normal form exists!

(λx . y) ((λx . x x) (λx . x x))

Applicative-order reduction:
(λx . y) ((λx . x x) (λx . x x))
→ (λx . y) ((λx . x x) (λx . x x))
→ . . .

Normal-order reduction:
(λx . y) ((λx . x x) (λx . x x))
→ y
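Because Scheme itself uses applicative order, writing this expression directly in Scheme would loop forever. One common way to imitate normal-order behavior is to wrap the argument in a parameterless procedure (a thunk) so that it is evaluated only if the body asks for it. The sketch below uses illustrative names and returns the symbol y in place of the free variable y.

```scheme
;; omega is a thunk: the diverging expression is only evaluated
;; if omega is called. Do not evaluate (omega): it never returns.
(define omega
  (lambda ()
    ((lambda (x) (x x)) (lambda (x) (x x)))))

;; Corresponds to (λx . y): the argument is ignored, so the
;; thunk passed in is never called.
(define const-y
  (lambda (arg-thunk) 'y))
```

Evaluating (const-y omega) returns y immediately, mirroring the one-step normal-order reduction above.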


SKI Combinator Reduction

SKI combinator reduction is an implementation technique that yields normal-order (lazy) evaluation in the most natural way.

A lambda calculus expression (that denotes a program) can be transformed into an equivalent combinator expression that contains only constants and applications. Moreover, this combinator expression will contain neither lambda abstractions nor variables.

The reduction of combinator expressions is based on a combinator calculus that does not have beta reduction, hence term rewriting does not need to manipulate variables and environments explicitly.


Combinators & Combinator Reduction

Combinator                     Name                    Reduction
I = λx . x                     Identity                I x = x
K = λx . λy . x                Constant function       K x y = x
S = λx . λy . λz . x z (y z)   Distribution function   S x y z = x z (y z)
B = λx . λy . λz . x (y z)     Composition function    B x y z = x (y z)
C = λx . λy . λz . x z y       Swap function           C x y z = x z y

The first three combinators I, K, and S are sufficient to transform every lambda expression into an equivalent combinator expression.
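The three basic combinators can be written as curried Scheme procedures, which makes their reduction rules directly testable. This is a sketch; the name SKK is introduced here to illustrate the well-known fact that S K K behaves like I.

```scheme
(define I (lambda (x) x))                           ; I x = x
(define K (lambda (x) (lambda (y) x)))              ; K x y = x
(define S (lambda (x)                               ; S x y z = x z (y z)
            (lambda (y)
              (lambda (z) ((x z) (y z))))))

;; S K K z = K z (K z) = z, so SKK acts as the identity.
(define SKK ((S K) K))
```

For any value v, (SKK v) returns v, just as (I v) does.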


A Combinator Language

Syntax:
<expression> ::= k                              ; constant
               | S | K | I
               | ( <expression> <expression> )

Let e be a lambda calculus expression. Then the function U( e ) translates e into an equivalent combinator expression:

U( e ) = e                       ; e does not contain any λ
U( λx . e ) = [x]( U( e ) )
U( e1 e2 ) = U( e1 ) U( e2 )


[x]( e )

The function [x]( e ) is defined as follows:

[x]( k ) = K k
[x]( x ) = I
[x]( y ) = K y          (y ≠ x)
[x]( e1 e2 ) = S ( [x]( e1 ) ) ( [x]( e2 ) )


Building a Combinator Expression

U( λx . λy . x y ) = [x]( U( λy . x y ) )
= [x]( [y]( U( x y ) ) )
= [x]( [y]( x y ) )
= [x]( S ( [y]( x ) ) ( [y]( y ) ) )
= [x]( S ( K x ) I )
= S ( [x]( S ( K x ) ) ) ( [x]( I ) )
= S ( S ( [x]( S ) ) ( [x]( K x ) ) ) ( K I )
= S ( S ( K S ) ( S ( [x]( K ) ) ( [x]( x ) ) ) ) ( K I )
= S ( S ( K S ) ( S ( K K ) I ) ) ( K I )
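The translation can be mechanized in a few lines. The sketch below assumes lambda expressions are quoted lists in the one-parameter syntax (lambda (x) e) and (e1 e2), treats every symbol other than the abstracted variable as a constant, and represents combinator applications as nested two-element lists, so S a b is written ((S a) b). The names abstract and translate are chosen here for [x]( e ) and U( e ).

```scheme
;; abstract : symbol * combinator-expression -> combinator-expression
;; Implements [x]( e ).
(define abstract
  (lambda (x e)
    (cond ((symbol? e)
           (if (eq? e x)
               'I                              ; [x]( x ) = I
               (list 'K e)))                   ; [x]( k ) = K k, [x]( y ) = K y
          (else                                ; [x]( e1 e2 ) = S [x](e1) [x](e2)
           (list (list 'S (abstract x (car e)))
                 (abstract x (cadr e)))))))

;; translate : lambda-expression -> combinator-expression
;; Implements U( e ).
(define translate
  (lambda (e)
    (cond ((symbol? e) e)
          ((eq? (car e) 'lambda)               ; U( λx . e ) = [x]( U( e ) )
           (abstract (car (cadr e)) (translate (caddr e))))
          (else                                ; U( e1 e2 ) = U( e1 ) U( e2 )
           (list (translate (car e))
                 (translate (cadr e)))))))
```

Evaluating (translate '(lambda (x) (lambda (y) (x y)))) produces ((S ((S (K S)) ((S (K K)) I))) (K I)), which is the final line of the derivation above in fully parenthesized form.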


Reducing a Combinator Expression

(S (S (K S) (S (K K) I)) (K I)) A B           ; S x y z = x z (y z)
= S (K S) (S (K K) I) A (K I A) B             ; S x y z = x z (y z)
= K S A (S (K K) I A) (K I A) B               ; K x y = x
= S (S (K K) I A) (K I A) B                   ; S x y z = x z (y z)
= S (K K) I A B ((K I A) B)                   ; S x y z = x z (y z)
= K K A (I A) B ((K I A) B)                   ; K x y = x
= K (I A) B ((K I A) B)                       ; K x y = x
= I A ((K I A) B)                             ; I x = x
= A ((K I A) B)                               ; K x y = x
= A (I B)                                     ; I x = x
= A B

( λx . λy . x y ) A B → A B


Data Abstraction

Overview:
- Abstract data types
- The procedure define-datatype
- Abstract syntax
- Representation strategies for data types

References:
- Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
- David A. Schmidt, “Denotational Semantics”, MIT Press, 1986
- Carl A. Gunter, “Semantics of Programming Languages”, MIT Press, 1992


New Sets of Values

The definition of a new data type (i.e. a new set of values) consists of two ingredients:
- Some set of values, called the interface, that serves as representation of the newly defined data type, and
- Some set of procedures, called the implementation, that provides the operations which can be used to manipulate entities of the newly defined data type.

Example:
<s-list> ::= ( {<symbol> . <s-list>}* )
(define up …), (define swapper …), (define flatten …)


Representation Independence

The representation of new data types can often be very complex.

When working with new data types, we usually do not want to be concerned with their actual representation. In fact, programs become more reliable and robust if they do not depend on the actual representation of a data type. Data types that do not expose their actual representation are called representation transparent.

Data types in C/C++ and Scheme are in general not representation transparent (e.g. the size of integers in C/C++ is platform dependent, boolean values in Scheme are represented by #t and #f).

Data types in Java are basically representation transparent (arrays are an exception, since they are represented by objects).


Opaque vs. Transparent Implementations

A data type is opaque if there is no way to find out its representation, even by printing.

Example:
;; initialize a location with
;; some value x
(define make-cell
  (lambda (x)
    (vector x)))

;; extract value from location
(define cell-ref
  (lambda (cell)
    (vector-ref cell 0)))

> (define my-cell (make-cell 342))
> (vector? my-cell)
#t
> my-cell
#(342)
> (cell-ref my-cell)
342


Pros & Cons

Opaque data types enforce the use of the defining procedures.

Opaque data types are more secure. Access to values of opaque data types is only possible by means of access procedures defined in an interface.

Transparent data types are easier to debug and to extend.

The fact that transparent data types expose their internal representation is also a disadvantage (limited security).

Com S 342

Abstract Data Type

The technique used to define new data types independently of their actual representation is called data abstraction.

Data abstraction divides data types into interfaces and implementations.
- Interfaces specify the set of values the data type represents, the operations that are available for that data type, and the properties these operations may be guaranteed to have.
- Implementations provide a specific representation of the data and code for the operations.

A data type that has been defined in this way is called an abstract data type. A client (program) can use values of an abstract data type by means of the interface without knowing their actual representation (which can change over time). Data abstraction enforces representation independence.

Examples of Abstract Data Types

Files

Lists, hash tables, vectors, bags

Strings, records, arrays

Objects with private instance variables and public methods

Standardized integers (e.g. in Java the type int is represented using 32 bits and big endian format, on every platform)

An Abstraction for Inductive Data Types

Data types can be defined inductively using a BNF-grammar.

Problem: What is a suitable representation of an inductively specified set of values?

Example: <bintree> ::= <number> | (<symbol> <bintree> <bintree>)

What should the interface for this data type look like?


Constructors and Access Procedures

In order to create and manipulate values, and to verify that a given value is of the desired data type, we need the following ingredients:
- Constructors that allow us to build values of a given data type,
- A predicate that tests whether a given value is a representation of a particular data type, and
- Some access procedures that allow us to extract particular information from a given representation of a data type.

Solution: a tool that provides a standard representation for inductively specified data types: define-datatype

Bintree

(define-datatype bintree bintree?
  (leaf-node
    (datum number?))
  (interior-node
    (key symbol?)
    (left bintree?)
    (right bintree?)))

This says that a bintree is either
- a leaf-node consisting of a number called datum, or
- an interior-node consisting of a key that is a symbol and two bintrees called left and right.

The Elements of define-datatype

The abstraction (define-datatype bintree bintree? ... ) defines:
- A representation for the data type bintree.
- A 1-argument constructor, leaf-node, to build a leaf-node. This procedure tests whether the argument is a number; if this test fails, an error is reported.
- A 3-argument constructor, interior-node, to build an interior-node. This procedure tests the first argument with symbol? and its second and third arguments with bintree? to ensure that the values are of an appropriate type.
- A 1-argument predicate, bintree?, that returns true (#t) if the passed argument is either a leaf-node or an interior-node. For all other arguments bintree? returns false (#f).
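To make the three generated procedures concrete, here is a hand-written sketch in plain Scheme. It assumes a tagged-list representation, which is only an illustration; the representation chosen by the real define-datatype is implementation specific. It also assumes an error procedure as found in Chez Scheme, Racket, or Guile.

```scheme
;; Hand-written stand-in for the procedures that define-datatype provides.
;; Assumption: variants are represented as tagged lists.
(define leaf-node
  (lambda (datum)
    (if (number? datum)
        (list 'leaf-node datum)
        (error 'leaf-node "datum is not a number"))))

(define interior-node
  (lambda (key left right)
    (if (and (symbol? key) (bintree? left) (bintree? right))
        (list 'interior-node key left right)
        (error 'interior-node "bad field value"))))

;; Shallow check: only the variant tag is inspected.
(define bintree?
  (lambda (v)
    (and (pair? v)
         (or (eq? (car v) 'leaf-node)
             (eq? (car v) 'interior-node)))))

(define t (interior-node 'a (leaf-node 1) (leaf-node 2)))
(bintree? t)    ; #t
(bintree? 42)   ; #f
```

Because the constructors check their arguments, a value carrying one of the two tags is normally well formed, which is why the predicate in this sketch gets away with looking at the tag only.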


Arrays, Records, and Unions

A data type that contains values of other types is called a composite or aggregate type.

Arrays and records are composite types:
C/C++:
  struct { int f1; struct { int *a; int *b; } f2; char f3; } arecord;
  int (* afp[100])(int, char);

A union type is one whose values are one or the other of multiple given types:
C/C++:
  union { int f1; char f2; struct { int *a; int *b; } f3; } aunion;

Disjoint Union

A disjoint union (sum) type is a union type in which every value is annotated with the type the value comes from (see discriminated union type, EOPL2 page 44).

Scheme values belong to a disjoint (discriminated) union of all the primitive types provided by the Scheme implementation.

Inductively specified data types can be represented as a disjoint union of record types, called a variant record:
- C/C++: union { struct { char a; int b [7]; } f1; struct { int *a; int *b; } f2; }
- VB/Delphi/COM+: The OleVariant type represents variants that contain only COM-compatible types.

define-datatype

A define-datatype declaration, which can only appear at the top level of a program, has the following general form:

(define-datatype <type-name> <type-predicate-name>
  { ( <variant-name> { (<field-name> <predicate>) }* ) }* )

This abstraction creates a variant-record data type named type-name. Each variant has a variant-name and zero or more fields, each with its own field-name and associated predicate.

Note:
- No two variant-record types may have the same name.
- No two variants may have the same name, even if they belong to different types.

Abstract Syntax

BNF-specifications are used to describe the concrete syntax, or external representation, of values.

Abstract syntax specifications are used to describe the internal representation of values.

In abstract syntax specifications terminal symbols disappear entirely.

The building blocks of abstract syntax specifications are tokens rather than terminals.

Unlike BNF-specifications, abstract syntax specifications are allowed to generate ambiguous syntax trees.

Simple Expressions

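BNF-specification:
<expression> := <expression> (+|-) <term> | <term>
<term> := <term> (*|/) <number> | <number>

[Figure: parse tree of 4 * 2 + 1. The root <expression> expands to <expression> + <term>. The left <expression> is a <term> that expands to <term> * <number>, covering 4 * 2; the right <term> is the <number> 1.]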

Abstract Syntax of Simple Expressions

<expression> := <operator> <expression> <expression> | <number>

<operator> := + | - | * | /

Example: 4 * 2 + 1 becomes (+ 1 (* 4 2))

Note: There is no syntactic sugar in the abstract syntax specification. A parser uses the concrete syntax (i.e., it generates a unique syntax tree for every input) and produces a syntax tree whose structure is given by the abstract syntax specification.
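Since the abstract syntax fixes the tree structure, an evaluator no longer needs to know about operator precedence. The following sketch evaluates a tree written as a prefix list; the name eval-simple is hypothetical, and an error procedure as in Chez Scheme, Racket, or Guile is assumed.

```scheme
;; Evaluate an abstract syntax tree written as a prefix list,
;; e.g. (+ 1 (* 4 2)).
(define eval-simple
  (lambda (tree)
    (if (number? tree)
        tree
        (let ((op (car tree))
              (left (eval-simple (cadr tree)))
              (right (eval-simple (caddr tree))))
          (cond ((eq? op '+) (+ left right))
                ((eq? op '-) (- left right))
                ((eq? op '*) (* left right))
                ((eq? op '/) (/ left right))
                (else (error 'eval-simple "unknown operator")))))))

(eval-simple '(+ 1 (* 4 2)))   ; 9
```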


Lambda Calculus Expressions

<expression> ::= <identifier>
             | (lambda (<identifier>) <expression>)
             | (<expression> <expression>)

(define-datatype expression expression?
  (variable
    (id symbol?))
  (abstraction
    (id symbol?)
    (body expression?))
  (application
    (function expression?)
    (argument expression?)))

> (expression? (variable `x))
#t
> (abstraction `x (variable `x))
(abstraction x (variable x))

BNF vs. Abstract Syntax

<expression> ::= <identifier>

(variable (id symbol?))

::= (lambda (<identifier>) <expression>)

(abstraction (id symbol?) (body expression?))

::= (<expression> <expression>)

(application (function expression?) (argument expression?))


The Syntactic Form Cases

The form cases is used to determine the variant to which a given object of a data type belongs, and to extract its components.

The general syntax of cases is:

(cases <type-name> <expression>
  { (<variant-name> ( {<field-name>}* ) <consequent> ) }*
  (else <default> ) )

is-abstraction?

With cases, on the abstract syntax:

(cases expression e
  …
  (abstraction (id body)
    <consequent>)
  …)

Without cases, directly on the concrete syntax:

(define is-abstraction?
  (lambda (e)
    (and (list? e)
         (= (length e) 3)
         (eqv? `lambda (car e))
         (let ((arg (cadr e)))
           (and (list? arg)
                (= (length arg) 1)
                (is-variable? (car arg))))
         (is-expression? (caddr e)))))

free-variables

(define free-variables
  (lambda (expr)
    (cases expression expr
      (variable (id)
        (list id))
      (abstraction (id body)
        (difference (free-variables body) (list id)))
      (application (e1 e2)
        (union (free-variables e1) (free-variables e2))))))

> (free-variables (abstraction `x (application (variable `x) (variable `y))))
(y)

For comparison, the abstraction clause of the version that works on the concrete syntax:

(define free-variables …
  ((is-abstraction? e)
   (difference (free-variables (caddr e))
               (list (caadr e))))
  …)

Parse Expression

(define parse-expression
  (lambda (datum)
    (cond
      ((symbol? datum) (variable datum))
      ((pair? datum)
       (if (eqv? (car datum) 'lambda)
           (abstraction (caadr datum)
                        (parse-expression (caddr datum)))
           (application (parse-expression (car datum))
                        (parse-expression (cadr datum)))))
      (else (eopl:error 'parse-expression
                        "Invalid concrete syntax ~s" datum)))))

> (parse-expression `(lambda (x) (x y)))
(abstraction x (application (variable x) (variable y)))

WARNING: Accepts ill-formed expressions!
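For instance, parse-expression treats (x y z) as an application of x to y and silently ignores z. One possible remedy, sketched here under the assumption that the keyword lambda may not be used as an identifier, is to check the concrete syntax before parsing. The name well-formed? is hypothetical and not part of the slides.

```scheme
;; Hypothetical check that a datum follows the lambda calculus grammar.
(define well-formed?
  (lambda (datum)
    (cond
      ;; <identifier>, excluding the keyword lambda
      ((symbol? datum) (not (eqv? datum 'lambda)))
      ;; (lambda (<identifier>) <expression>)
      ((and (list? datum) (= (length datum) 3) (eqv? (car datum) 'lambda))
       (and (list? (cadr datum))
            (= (length (cadr datum)) 1)
            (symbol? (caadr datum))
            (well-formed? (caddr datum))))
      ;; (<expression> <expression>)
      ((and (list? datum) (= (length datum) 2))
       (and (well-formed? (car datum))
            (well-formed? (cadr datum))))
      (else #f))))

(well-formed? '(lambda (x) (x y)))   ; #t
(well-formed? '(x y z))              ; #f
(well-formed? '(lambda x x))         ; #f
```

A parser could call such a predicate first and report an error for any datum that fails it.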


Unparse Expression

(define unparse-expression
  (lambda (expr)
    (cases expression expr
      (variable (id) id)
      (abstraction (id body)
        (list 'lambda (list id) (unparse-expression body)))
      (application (function argument)
        (list (unparse-expression function)
              (unparse-expression argument))))))

> (unparse-expression (abstraction `x (application (variable `x) (variable `y))))
(lambda (x) (x y))

Representation Strategies for Data Types

Given an interface for a data type we can change the underlying representation if needed using different strategies.

[Figure: an Abstract Data Type can be realized by a Procedural Representation or by a Record-based Representation.]

Booleans

We can represent boolean values and operations that manipulate boolean values as functions:

TRUE ≡ (define my_true (lambda (x y) x))

FALSE ≡ (define my_false (lambda (x y) y))

not b ≡ (define my_not (lambda (b) (b my_false my_true)))

if b then x else y ≡ (define my_if (lambda (b x y) (b x y)))

Example: if TRUE then x else y ≡ (my_if my_true `x `y) = (my_true x y) = x
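Further operations can be built in the same style. The sketch below repeats the definitions from this slide and adds my_and and my_or, which are hypothetical names that do not appear in the original material.

```scheme
(define my_true  (lambda (x y) x))
(define my_false (lambda (x y) y))
(define my_if    (lambda (b x y) (b x y)))

;; Hypothetical additions in the same style:
;; a AND b selects b when a is true, otherwise false.
(define my_and (lambda (a b) (a b my_false)))
;; a OR b selects true when a is true, otherwise b.
(define my_or  (lambda (a b) (a my_true b)))

(my_if (my_and my_true my_false) 'yes 'no)   ; no
(my_if (my_or  my_true my_false) 'yes 'no)   ; yes
```

Note that Scheme evaluates all arguments before a call, so my_if evaluates both branches, unlike the built-in if.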


Pairs

Although tuples are not supported by the lambda calculus, they can easily be modeled as higher-order functions that “wrap” pairs of values. n-tuples can be modeled by composing pairs ...

pair ≡ (define PAIR (lambda (x y) (lambda (z) (z x y))))
first ≡ (define FIRST (lambda (p) (p TRUE)))
second ≡ (define SECOND (lambda (p) (p FALSE)))

Here TRUE and FALSE stand for the functional booleans my_true and my_false.

> (define a-pair (PAIR 1 2))
> (FIRST a-pair)
1
> (SECOND a-pair)
2

Church Numbers

A number n is represented by a functional, which applies an argument function n times to another argument. The number zero (0) is represented by a functional that yields the identity function for its argument.

Define:
n ≡ λs . λz . sⁿ z
0 ≡ λs . λz . z
succ ≡ λn . λs . λz . s (n s z)
iszero ≡ λn . n (λx . FALSE) TRUE
add ≡ λm . λn . m succ n

Then:
1 = succ 0
  = (λn . λs . λz . s (n s z)) (λs . λz . z)
  = λs . λz . s ((λf . λx . x) s z)
  = λs . λz . s ((λx . x) z)
  = λs . λz . s z

Church Numbers in Scheme

0 ≡ (define NULL (lambda (s z) z))
succ ≡ (define SUCC (lambda (n) (lambda (s z) (s (n s z)))))
iszero ≡ (define ISZERO (lambda (n) (n (lambda (x) FALSE) TRUE)))
add ≡ (define ADD (lambda (m n) (m SUCC n)))

Here TRUE, FALSE, and IF stand for the functional booleans my_true, my_false, and my_if.

> (IF (ISZERO NULL) "is zero" "not zero")
"is zero"
> (IF (ISZERO (SUCC NULL)) "is zero" "not zero")
"not zero"
> (IF (ISZERO (ADD NULL NULL)) "is zero" "not zero")
"is zero"
> (IF (ISZERO (ADD NULL (SUCC NULL))) "is zero" "not zero")
"not zero"
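To inspect a Church number it helps to convert it back to an ordinary integer. The helper church->integer below is a hypothetical addition; it passes "add one" as s and 0 as z, so the result counts how often s is applied.

```scheme
(define NULL (lambda (s z) z))
(define SUCC (lambda (n) (lambda (s z) (s (n s z)))))
(define ADD  (lambda (m n) (m SUCC n)))

;; Hypothetical helper: count the applications of s.
(define church->integer
  (lambda (n)
    (n (lambda (k) (+ k 1)) 0)))

(define ONE (SUCC NULL))
(define TWO (SUCC ONE))

(church->integer NULL)            ; 0
(church->integer (ADD TWO ONE))   ; 3
```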


Functional Sets

In order to build a set of elements of some data type we can use a property function f?, where f? returns #t (true) if and only if the given argument satisfies the property defined by f?.

We write { x | f(x) }, called a set builder, to define the set of all elements x of some data type that satisfy property f?, that is, x belongs to the set if and only if

f(x) = true.

In fact, the “characterizing” function f? yields true for elements of the set and it yields false for all other arguments.

Example: f? = isPrime, (f? 3) = #t, (f? 6) = #f

Note: a set builder is a function that uses a predicate to build a concrete set.

Representation of Functional Sets

(define make-set
  (lambda (pred) pred))

(define fs-union
  (lambda (fs1 fs2)
    (lambda (elem)
      (or (fs1 elem) (fs2 elem)))))

(define fs-intersection
  (lambda (fs1 fs2)
    (lambda (elem)
      (and (fs1 elem) (fs2 elem)))))

(define fs-difference
  (lambda (fs1 fs2)
    (lambda (elem)
      (and (fs1 elem) (not (fs2 elem))))))

;; fs-union already returns a set, so no extra (lambda (elem) ...) is needed
(define fs-symdiff
  (lambda (fs1 fs2)
    (fs-union (fs-difference fs1 fs2)
              (fs-difference fs2 fs1))))

(define is-fs-member?
  (lambda (elem fs)
    (fs elem)))

Application of Functional Sets

> (make-set number?)
#<procedure>
> (is-fs-member? 2 (make-set number?))
#t
> (is-fs-member? 2 (fs-intersection (make-set number?) (make-set (lambda (x) (= x 2)))))
#t
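A further illustration, sketched with two hypothetical example sets (evens and small): the symmetric difference contains exactly the elements that belong to one of the sets but not to both. The block repeats the definitions it needs so that it can be run on its own.

```scheme
(define fs-union
  (lambda (fs1 fs2)
    (lambda (elem) (or (fs1 elem) (fs2 elem)))))

(define fs-difference
  (lambda (fs1 fs2)
    (lambda (elem) (and (fs1 elem) (not (fs2 elem))))))

(define fs-symdiff
  (lambda (fs1 fs2)
    (fs-union (fs-difference fs1 fs2) (fs-difference fs2 fs1))))

(define is-fs-member? (lambda (elem fs) (fs elem)))

;; Hypothetical example sets over the integers.
(define evens even?)
(define small (lambda (x) (< x 10)))
(define either-not-both (fs-symdiff evens small))

(is-fs-member? 4 either-not-both)    ; #f, 4 is in both sets
(is-fs-member? 12 either-not-both)   ; #t, even but not small
(is-fs-member? 3 either-not-both)    ; #t, small but not even
```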


A Real Set

(define fs-filter
  (lambda (fs lst)
    (if (null? lst)
        `()
        (append (if (fs (car lst))        ; merge lists
                    (list (car lst))
                    `())
                (fs-filter fs (cdr lst))))))

> (fs-filter (make-set number?) `(1 a 3))
(1 3)

Record Representation of Booleans

(define-datatype my_bool my_bool?
  (my_true)
  (my_false))

(define my_not
  (lambda (b)
    (cases my_bool b
      (my_true () (my_false))
      (my_false () (my_true)))))

(define my_if
  (lambda (b x y)
    (cases my_bool b
      (my_true () x)
      (my_false () y))))

> (my_if (my_true) `x `y)
x
> (my_not (my_true))
(my_false)

A Data Type for Environments

An environment maps (free) symbols (of an expression) to values.

An environment is a function whose domain is the set of symbols, and whose codomain (range) is the set of all values.

In general, the environment function is a total function, since the domain of the function is restricted to free symbols of the corresponding expression, that is dom(f) = free-vars(e).

If we adopt the usual mathematical convention that a function is a set of ordered pairs, then we need to represent all sets of the form

{(s1,v1), …, (sn,vn)}

where all si are pairwise distinct symbols and vi are any values.

The Environment Interface

The interface for environments has three procedures:

(empty-env) = ∅
(apply-env f s) = f(s)
(extend-env `(s1 … sn) `(v1 … vn) f) = g,

where g(s') = vi,     if s' = si for some i, 1 ≤ i ≤ n
              f(s'),  otherwise

Procedural Representation

(define empty-env
  (lambda ()
    (lambda (sym)
      (eopl:error 'apply-env "No binding for ~s" sym))))

(define extend-env
  (lambda (syms vals env)
    (lambda (sym)
      (let ((pos (list-find-position sym syms)))
        (if (number? pos)
            (list-ref vals pos)
            (apply-env env sym))))))

(define apply-env
  (lambda (env sym)
    (env sym)))

> (apply-env (extend-env `(a b c) `(1 2 3) (empty-env)) `c)
3
> (apply-env (extend-env `(a b c) `(1 2 3) (empty-env)) `d)
Error reported by apply-env:
No binding for d

Call Trace

> (apply-env (extend-env `(a b c) `(1 2 3) (empty-env)) `c)

| (empty-env)
| (extend-env (a b c) (1 2 3) (lambda (sym) (eopl:error …)))
| (apply-env (lambda (sym)
|              (let ((pos (list-find-position sym (a b c))))
|                (if (number? pos)
|                    (list-ref (1 2 3) pos)
|                    (apply-env (lambda (sym) (eopl:error …)) sym))))
|            c)
| (let ((pos (list-find-position c (a b c))))          ; pos = 2
|   (if (number? pos)                                   ; #t
|       (list-ref (1 2 3) pos)                          ; (1 2 3)[2] = 3
|       (apply-env (lambda (sym) (eopl:error …)) c)))
3

Helper

(define list-find-position
  (lambda (sym los)
    (list-index (lambda (sym1) (eqv? sym1 sym)) los)))

(define list-index
  (lambda (pred ls)
    (cond
      ((null? ls) #f)
      ((pred (car ls)) 0)
      (else (let ((list-index-r (list-index pred (cdr ls))))
              (if (number? list-index-r)
                  (+ list-index-r 1)
                  #f))))))
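The procedural environment and its helper can be combined into one self-contained sketch that also shows shadowing: an inner extension hides an outer binding of the same symbol. The sketch uses error in place of eopl:error, which is an assumption about the host Scheme (Chez Scheme, Racket, and Guile provide it), and the names outer and inner are hypothetical.

```scheme
;; Self-contained copy of the procedural environment, with error
;; standing in for eopl:error.
(define list-index
  (lambda (pred ls)
    (cond ((null? ls) #f)
          ((pred (car ls)) 0)
          (else (let ((r (list-index pred (cdr ls))))
                  (if (number? r) (+ r 1) #f))))))

(define empty-env
  (lambda ()
    (lambda (sym) (error 'apply-env "no binding for symbol"))))

(define extend-env
  (lambda (syms vals env)
    (lambda (sym)
      (let ((pos (list-index (lambda (s) (eqv? s sym)) syms)))
        (if (number? pos)
            (list-ref vals pos)
            (env sym))))))

(define apply-env (lambda (env sym) (env sym)))

;; The inner frame shadows the outer binding of a.
(define outer (extend-env '(a b) '(1 2) (empty-env)))
(define inner (extend-env '(a) '(10) outer))

(apply-env inner 'a)   ; 10
(apply-env inner 'b)   ; 2
(apply-env outer 'a)   ; 1
```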


BNF & Abstract Syntax Specification

<env-rep> ::= (empty-env)
              (empty-env-record)
          ::= (extend-env ({<symbol>}*) ({<value>}*) <env-rep> )
              (extended-env-record (syms (list-of symbol?)) (vals (list-of scheme-value?)) (env environment?))

The Environment Data Type

(define-datatype environment environment?
  (empty-env-record)
  (extended-env-record
    (syms (list-of symbol?))
    (vals (list-of scheme-value?))
    (env environment?)))

(define scheme-value? (lambda (v) #t))

> (extended-env-record `(a b c) `(1 2 3) (empty-env-record))
(extended-env-record (a b c) (1 2 3) (empty-env-record))

List-Of

(define list-of
  (lambda (pred)
    (lambda (val)
      (or (null? val)
          (and (pair? val)
               (pred (car val))
               ((list-of pred) (cdr val)))))))

> ((list-of number?) `(1 2 3 4))
#t

list-of is a procedure that, when applied to a predicate, yields a procedure as its value.
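Because the result of list-of is itself a predicate, it can be handed to list-of again. The following sketch uses this to describe lists of lists of numbers; the name list-of-number-lists? is hypothetical.

```scheme
(define list-of
  (lambda (pred)
    (lambda (val)
      (or (null? val)
          (and (pair? val)
               (pred (car val))
               ((list-of pred) (cdr val)))))))

;; A predicate built from a predicate built by list-of.
(define list-of-number-lists? (list-of (list-of number?)))

(list-of-number-lists? '((1 2) () (3)))   ; #t
(list-of-number-lists? '((1 2) (a)))      ; #f
((list-of symbol?) '(a b . c))            ; #f, not a proper list
```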


The Environment Operations

(define empty-env
  (lambda ()
    (empty-env-record)))

(define extend-env
  (lambda (syms vals env)
    (extended-env-record syms vals env)))

(define apply-env
  (lambda (env sym)
    (cases environment env
      (empty-env-record ()
        (eopl:error 'apply-env "No binding for ~s" sym))
      (extended-env-record (syms vals env)
        (let ((pos (list-find-position sym syms)))
          (if (number? pos)
              (list-ref vals pos)
              (apply-env env sym)))))))

The procedures empty-env and extend-env are data type constructors.

The procedure apply-env is a data type observer.

Environment-Passing Interpreters

Overview:
- Semantics of fundamental programming language features
- The construction of an interpreter
- Scanning and parsing
- Local binding, closures, recursion, parameter-passing

References:
- Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
- Andrew W. Appel, “Modern Compiler Implementation in [C,Java,ML]”, Cambridge University Press, 1998
- Carl A. Gunter, “Semantics of Programming Languages”, MIT Press, 1992

A Language Interpreter

An interpreter consists of two parts:
- A front end that converts program text (a program in the source language) to an abstract syntax tree (the internal representation of the program text), and
- An evaluator (the actual interpreter) that looks at a data structure and performs some associated actions, which depend on the actual data structure. In the case of a language-processing system, the interpreter takes the abstract syntax tree and converts it, possibly using external inputs, to an answer.

Examples:
- A calculator
- Basic
- Perl, Python, sh, awk, Tcl
- JVM

Execution via Interpreter

[Figure: execution via interpreter. Program Text → Front End → Abstract Syntax Tree → Interpreter → Output; the Interpreter also receives Input. The cycle is labeled read-eval-loop.]

A Language Compiler

A compiler translates program text into some other language (the target language).

The building blocks of a compiler are:
- A front end that converts program text (a program in the source language) to an abstract syntax tree (the internal representation of the program text),
- A set of independent compiler phases, each of which is assigned a particular task in the compilation process (e.g. semantic analysis, optimization, register allocation, code emission), and
- An evaluator for the compiled language, which may be an interpreter (e.g. JVM) or simply a hardware machine (e.g. von Neumann computer).

Examples of compiled languages:
- C/C++/C#
- Pascal, Java
- Haskell

Execution via Compiler

[Figure: execution via compiler. Program Text → Front End → Abstract Syntax Tree → Semantic Analysis → Optimization → Code Emission → Machine Code → Abstract Machine or Hardware Machine, which maps Input to Output. The phases share a Symbol Table and are grouped into Analyzer Phases and Translator Phases.]

Simple Interpreters

We have already developed interpreters for small languages:
- <list-of-numbers>
- <expression>

Each interpreter is a data-driven procedure that assigns an interpretation (meaning) to every element of the abstract syntax.

Examples of interpreters:
- occurs-free?, occurs-bound?
- parse-expression, unparse-expression
- alpha, substitution

Programming Language Values

In the specification of programming languages we always have at least two sets of values:
- Expressed values – values that can be specified (by means of expressions) in the given programming language
  Examples: numbers, pairs, characters, strings
- Denoted values – values that are bound to variables
  Examples: locations containing expressed values (Scheme)

Note: In general, a denotation assigns a term (symbol, string, or expression) in a language a precise mathematical meaning.
- The symbol “1” is assigned the meaning 1 – the number 1.
- The expression “1 * v” is assigned the meaning (times 1 (loc v)) with “times” being the usual operation for multiplication, and “loc” being an environment function that maps “v” to a value defined in the environment.

Source, Host, and Target Language

The source language (or defined language) is the language in which we write programs that should be evaluated by an interpreter.

The host language (or defining language) is the language in which we specify the interpreter.

The target language is the language a source language is translated into by a compiler. A target language may be a higher-level programming language (e.g. C) or assembly language (or machine language).

A First Interpreter

In a first language the set of expressed values is equal to the set of integers and the set of denoted values is the same as the set of expressed values:

Expressed Value = Number
Denoted Value = Number

Note: We always use an equational specification to define both the set of expressed and the set of denoted values.

XML

What is XML:
- XML stands for EXtensible Markup Language.
- XML was designed to describe data.
- XML tags are not predefined. You must define your own tags.

XML is not a programming language:
- XML does not do anything. XML was not designed to do anything.
- XML was designed to structure data.

When should you use XML?
- When you need a buzzword in your resume.

An XML-based Programming Language

We use XML to define data:
- Numbers, Strings, Records, Lists, and even Expressions or whole Programs can be considered as data.

Has XML been used to define a programming language before?
- Yes: Extensible Stylesheet Language Transformations (XSLT)

Our goal: XMLScheme
- XMLScheme is an XML-based programming language, whose semantics is given in Scheme.
- XMLScheme uses a strict order of tags to facilitate parsing.

A Small Language

<program> ::= <expression>
              a-program (exp)

<expression> ::= "<integer" "value" "=" <number> "/>"
                 lit-exp (num)
             ::= "<reference" "value" "=" <identifier> "/>"
                 var-exp (id)
             ::= "<" <prim-op> "<arguments" {<expression>}* "/>" "/>"
                 primapp-exp (prim rands)

<prim-op> ::= "add" | "sub" | "mul" | "inc" | "dec"

Language Characteristics

A program is just an expression.

An expression is either a number, an identifier, or a primitive application consisting of a primitive operator tag that encloses an <arguments … /> element with a list of expressions.

Example:
<inc <arguments
  <add <arguments
    <integer value = 3 />
    <reference value = x />
  /> />
/> />

This corresponds to (inc (add 3 x)).

The Abstract Syntax

We use variant records to specify the abstract syntax:

(define-datatype program program?
  (a-program (exp expression?)))

(define-datatype expression expression?
  (lit-exp (num number?))
  (var-exp (id symbol?))
  (primapp-exp
    (prim prim-op?)
    (rands (list-of expression?))))

(define-datatype prim-op prim-op?
  (add-prim)
  (sub-prim)
  (mult-prim)
  (inc-prim)
  (dec-prim))

eval-program

(define eval-program
  (lambda (pgm)
    (cases program pgm
      (a-program (body)
        (eval-expression body (init-env))))))

The main procedure, eval-program, is passed an abstract syntax tree of a program and returns its value.

We use the rule “follow the grammar” to define all evaluation procedures.

We need to use “cases”, even though there is only one case.

The procedure eval-expression is passed a “suitable” environment that maps all free variables in the abstract syntax tree to denoted values.

eval-expression

(define eval-expression
  (lambda (exp env)
    (cases expression exp
      (lit-exp (datum) datum)
      (var-exp (id) (apply-env env id))
      (primapp-exp (prim rands)
        (let ((args (eval-rands rands env)))
          (apply-primitive prim args))))))

The procedure eval-expression takes an expression and an environment, and returns the denoted value of the expression using the environment to map all free variables to denoted values.

There are three cases: lit-exp, var-exp, and primapp-exp.

eval-rands

(define eval-rands
  (lambda (rands env)
    (map (lambda (x) (eval-rand x env)) rands)))

(define eval-rand
  (lambda (rand env)
    (eval-expression rand env)))

The procedure eval-rands applies the procedure

(lambda (x) (eval-rand x env))

to each element of rands (a list of expressions), and returns a list of denoted values.

apply-primitive

(define apply-primitive
  (lambda (prim args)
    (cases prim-op prim
      (add-prim () (+ (car args) (cadr args)))
      (sub-prim () (- (car args) (cadr args)))
      (mult-prim () (* (car args) (cadr args)))
      (inc-prim () (+ (car args) 1))
      (dec-prim () (- (car args) 1)))))

The procedure apply-primitive takes a primitive operation and a list of denoted values and returns the value of applying the primitive operator to the given arguments.

The procedure apply-primitive does not need an environment, because all variable references have already been replaced with denoted values.

Comments

Our interpreter needs an initial (predefined) environment that maps all free variables of a program to denoted values:

(define init-env
  (lambda ()
    (extend-env `(i v x) `(1 5 10) (empty-env))))

The procedure apply-primitive assigns a meaning to all operators. Moreover, this procedure maps all operators to their usual mathematical interpretation (unary inc and dec, binary +, -, and *). If we want to change the arity of the operators, we need to change apply-primitive.

The interpreter assigns an operational semantics to our language. The meaning of both the expressions and the operators is defined using ℤ, the integers.
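The pieces of the interpreter can be tried out without the EOPL extensions. The sketch below is a simplified stand-in, not the course implementation: it assumes tagged lists in place of define-datatype and cases, an association list in place of the environment data type, and an error procedure as in Chez Scheme, Racket, or Guile.

```scheme
;; Abstract syntax as tagged lists (an assumption replacing define-datatype):
;;   (lit-exp num) | (var-exp id) | (primapp-exp prim rand ...)
(define init-env
  (lambda () '((i . 1) (v . 5) (x . 10))))

(define apply-env
  (lambda (env id)
    (let ((binding (assq id env)))
      (if binding
          (cdr binding)
          (error 'apply-env "no binding for identifier")))))

(define apply-primitive
  (lambda (prim args)
    (cond ((eq? prim 'add) (+ (car args) (cadr args)))
          ((eq? prim 'sub) (- (car args) (cadr args)))
          ((eq? prim 'mul) (* (car args) (cadr args)))
          ((eq? prim 'inc) (+ (car args) 1))
          ((eq? prim 'dec) (- (car args) 1))
          (else (error 'apply-primitive "unknown primitive")))))

(define eval-expression
  (lambda (exp env)
    (let ((tag (car exp)))
      (cond ((eq? tag 'lit-exp) (cadr exp))
            ((eq? tag 'var-exp) (apply-env env (cadr exp)))
            ((eq? tag 'primapp-exp)
             (apply-primitive
              (cadr exp)
              (map (lambda (rand) (eval-expression rand env))
                   (cddr exp))))
            (else (error 'eval-expression "unknown expression"))))))

;; (inc (add 3 x)) with x bound to 10 in the initial environment
(eval-expression
 '(primapp-exp inc (primapp-exp add (lit-exp 3) (var-exp x)))
 (init-env))   ; 14
```

As in the full interpreter, variables are looked up only in eval-expression; apply-primitive receives values that are already evaluated.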


The Front End

The front end of an interpreter/compiler translates the program text into an abstract syntax tree.

As far as common programming languages are concerned, programs are just strings of characters.

The front end groups the characters of the program into meaningful units, which are called tokens.

The front end is usually divided into two stages:
- Scanning: The process of dividing a sequence of characters into words, numbers, punctuation, operators, comments, and the like. These units are called tokens.
- Parsing: The process of organizing the sequence of tokens into hierarchical syntactic structures such as expressions, statements, and blocks. The parser takes a sequence of tokens and produces an abstract syntax tree.

Lexical Analysis

Lexical analysis is in general not very complicated.

A programming language classifies lexical tokens into a finite set of token types: identifiers, numbers, punctuation, comments.

A language is a set of strings; a string is a finite sequence of symbols. The symbols themselves are taken from a finite alphabet (e.g. the ASCII character set).

We use regular expressions to specify the set of strings of a language:
- A symbol “a” in the alphabet is a regular expression and denotes just the string a.
- Alternation (|), concatenation (.), not (¬), epsilon (ε), or repetition (*) applied to regular expressions are regular expressions.
- There are no other forms of regular expressions.

Parsing

The definition of a parser can be a very complicated and tedious task.

Several different techniques exist to construct a parser:
- Table-based parsing
  - LL(k)-parsing (top-down parsing)
  - LR(k)-parsing (bottom-up parsing)
- Recursive-descent parsing

When defining a language, we use a context-free grammar (type 2 or BNF) to specify the building blocks of the language.

The grammar must not be ambiguous in order to define a parser (exceptions possible).

The standard approach to build a front end (which is also the easiest approach available) is to use a parser generator (e.g. YACC, LEX, SLLGEN).

SLLGEN

SLLGEN stands for Scheme LL(1) parser GENerator.

This parser generator takes as input a lexical specification and a grammar, and produces as output a scanner and a parser for them.

SLLGEN operations:
- (sllgen:make-string-parser scanner-spec grammar) generates a parser.
- (sllgen:make-string-scanner scanner-spec grammar) generates a scanner (mainly used for debugging).
- (sllgen:make-define-datatypes scanner-spec grammar) generates each of the define-datatype expressions from the grammar for use by cases.

Scanner Specification in SLLGEN

<scanner-spec> ::= ( {<regexp-and-action>}* )
<regexp-and-action> ::= ( <name> ( {<regexp>}* ) <outcome> )
<name> ::= <symbol>
<regexp> ::= <string> | letter | digit | whitespace | any
         ::= (not <character> ) | (or {<regexp>}* )
         ::= (arbno <regexp> ) | (concat {<regexp>}* )
<outcome> ::= skip | symbol | number | string

Outcome:
- skip: This means this is the end of the token, but no token is emitted.
- symbol: The characters in the buffer are converted into a Scheme symbol.
- number: The characters in the buffer are converted into a Scheme number.
- string: The characters in the buffer are converted into a Scheme string.

Note: If there is a tie for the longest match between two regular expressions, string takes precedence over symbol.

Grammar Specification in SLLGEN

<grammar> ::= ( {<production>}* )
<production> ::= ( <lhs> ( {<rhs-item>}* ) <prod-name> )
<lhs> ::= <symbol>
<rhs-item> ::= <symbol> | <string>
           ::= (arbno {<rhs-item>}* )
           ::= (separated-list {<rhs-item>}* <string> )
<prod-name> ::= <symbol>

A grammar specification in SLLGEN must allow the parser to determine which production to use knowing only:
- What nonterminal it is looking for, and
- The first symbol (token) of the string being parsed.

LL(1)-Grammar

LL(1) means:
- We use only one lookahead symbol to determine which action is to be performed next.
- The sets of leftmost symbols in the {<rhs-item>}* of all productions for the same nonterminal must be pairwise disjoint (FIRST(rule1) ∩ FIRST(rule2) = ∅).
- No production may apply itself left-recursively, either directly or indirectly.
  Example of a rule that is not LL(1):
  (term (term "+" number) sum-term)

SLLGEN produces a warning if the input grammar fails to meet any of these restrictions.

FIRST

FIRST(A) = {A}, if A is a terminal
FIRST(A) = {ε}, if A = ε
FIRST(A) = { a | a ∈ FIRST(Bi) and a ≠ ε, for some i with 1 <= i <= n and ε ∈ FIRST(B1), …, FIRST(Bi-1),
             or a = ε and ε ∈ FIRST(B1), …, FIRST(Bn) },
           if A is a nonterminal with a production A ::= B1 B2 … Bn


The Scanner Specification

(define scanner-spec
  '((white-sp (whitespace) skip)
    (comment ("%" (arbno (not #\newline))) skip)
    (identifier (letter (arbno (or letter digit "?"))) symbol)
    (number (digit (arbno digit)) number)))

The Grammar Specification

(define grammar
  '((program (expression) a-program)
    (expression ("<integer" "value" "=" number "/>") lit-exp)
    (expression ("<reference" "value" "=" identifier "/>") var-exp)
    (expression ("<" prim-op "<arguments" (arbno expression) "/>" "/>") primapp-exp)
    (prim-op ("add") add-prim)
    (prim-op ("sub") sub-prim)
    (prim-op ("mul") mult-prim)
    (prim-op ("inc") inc-prim)
    (prim-op ("dec") dec-prim)))

The Interpreter

;; define datatypes here

;; build the scanner and parser
(define front-end
  (sllgen:make-string-parser scanner-spec grammar))

;; load the functional environment definition
(load "environment.scm")

;; define initial environment here

;; define interpreter
(define interpreter
  (lambda (string)
    (eval-program (front-end string))))

A-Read-Eval-Loop

(define read-eval-loop
  (sllgen:make-rep-loop "$ " eval-program
    (sllgen:make-stream-parser scanner-spec grammar)))

(sllgen:make-rep-loop prompt eval-fn stream-parser) takes a prompt string, a 1-argument procedure, and a stream parser, and produces a read-eval-print loop.

(sllgen:make-stream-parser scanner-spec grammar) generates a stream parser.

Example:
> (read-eval-loop)
$ <reference value = x />
10
$

Run-From-File

(define read-file
  (lambda (fname)
    (let* ((fp (open-input-file fname))
           (contents (read-source fp)))
      (close-input-port fp)
      contents)))

(define read-source
  (lambda (in-port)
    (let ((char (read-char in-port)))
      (if (eof-object? char)
          ""
          (string-append (string char) (read-source in-port))))))

(define run-from-file
  (lambda (fname)
    (interpreter (read-file fname))))

Language Extensions

To study the semantics and implementation of a wide range of programming language features, we add these features to our already defined language step by step.

For each feature, we
- add a production to the grammar,
- specify an abstract syntax for that production, and
- add an appropriate evaluation function (new procedure or cases clause) to handle the new language feature.

Conditional Evaluation

<expression> ::= "<if" "<condition" <expression> "/>" "<then" <expression> "/>" "<else" <expression> "/>" "/>"

if-exp (test-exp then-exp else-exp)

We use the C/C++ style, that is, the number 0 means false, any other number means true.

(define is-true?
  (lambda (x) (not (zero? x))))


Evaluation of If-Then-Else

(define eval-expression
  (lambda (exp env)
    (cases expression exp
      (if-exp (test-exp then-exp else-exp)
        (if (is-true? (eval-expression test-exp env))
            (eval-expression then-exp env)
            (eval-expression else-exp env)))
      …)))

We use the Scheme if-form to define the meaning of if-then-else in our language. Therefore, our understanding of the defined language depends on our understanding of the defining language.
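The clause above can be tried out without the EOPL datatypes. The following is a minimal sketch, assuming expressions are plain tagged lists such as (lit 3), (sub e1 e2) and (if-exp test then else); these tags and the name eval-mini are illustrative assumptions, not part of the course code.

```scheme
;; 0 is false, any other number is true (as in the slides)
(define is-true?
  (lambda (x) (not (zero? x))))

;; Hypothetical mini-evaluator over tagged lists (no environment needed here)
(define eval-mini
  (lambda (exp)
    (cond
      ((eq? (car exp) 'lit) (cadr exp))
      ((eq? (car exp) 'sub)
       (- (eval-mini (cadr exp)) (eval-mini (caddr exp))))
      ((eq? (car exp) 'if-exp)
       ;; the defining language's if decides which single branch is evaluated
       (if (is-true? (eval-mini (cadr exp)))
           (eval-mini (caddr exp))
           (eval-mini (cadddr exp))))
      (else (error 'eval-mini "Unknown expression ~s" exp)))))

;; test is (3 - 3) = 0, so the else branch is taken
(eval-mini '(if-exp (sub (lit 3) (lit 3)) (lit 6) (lit 7)))
;; test is 3 (non-zero), so the then branch is taken
(eval-mini '(if-exp (lit 3) (lit 6) (lit 7)))
```

Because the host if evaluates only one branch, the defined language inherits that behavior, which is the dependence on the defining language noted above.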


If-Then-Else Examples

<if <condition <sub <arguments <integer value = 3 /> <integer value = 3 /> /> /> /> <then <mul <arguments <integer value = 2 /> <integer value = 3 /> /> /> /> <else <add <arguments <integer value = 3 /> <integer value = 4 /> /> /> />/>

Result: 7 (the condition evaluates to 0, so the else branch is taken)

<if <condition <integer value = 3 /> /> <then <mul <arguments <integer value = 2 /> <integer value = 3 /> /> /> /> <else <add <arguments <integer value = 3 /> <integer value = 4 /> /> /> />/>

Result: 6 (the condition evaluates to 3, so the then branch is taken)


The Dangling If-Then-Else Conflict

The “dangling” if-then-else conflict is a problem in language design, where the grammar for the language contains alternative rules of the form:

A ::= B A D A
A ::= B A

C/C++, Java, and Pascal are languages whose grammars contain rules that generate the dangling if-then-else conflict.

Grammars with this conflict are not LL(1), that is, we cannot define a parser for them with the SLLGEN parser generator.


The Conflict Illustrated

<expression> ::= if <expression> then <expression> else <expression>
             ::= if <expression> then <expression>

How do we parse the following construct:

if 3 then if 5 then 8 else 9 ?


Multiple Parsing Strategies

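The construct "if 3 then if 5 then 8 else 9" has two parse trees:

Parse tree 1 (the else belongs to the inner if):
  expression:    if 3 then <expression>
  <expression>:  if 5 then 8 else 9

Parse tree 2 (the else belongs to the outer if):
  expression:    if 3 then <expression> else 9
  <expression>:  if 5 then 8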

Solution: The innermost else is associated with the innermost then.


Local Binding

We can create new variable bindings with a let-form.

<expression> ::= "<let" <declarations> <expression> "/>"

let-exp (decls body)

<declarations> ::= "<declarations" {"<declaration" "<variable" "value" "=" <identifier> "/>" <expression> "/>"}* "/>"

let-decls (ids rands)


Facts about Let-Bindings

The let-form introduces “named value abstractions”.

The scope of the variable bindings is the body of the let-form.

The entire let-form is an expression. Therefore, let-forms may be nested.


Abstract Syntax of the Let-Form

A variant-record for the let-form:

(let-exp
  (decls declarations?)
  (body expression?))

(define-datatype declarations declarations?
  (let-decls
    (ids (list-of symbol?))
    (rands (list-of expression?))))


Evaluation of the Let-Form

A new case in eval-expression:

(let-exp (decls body)
  (let ((args (eval-rands (get-rands decls) env)))
    (eval-expression body (extend-env (get-ids decls) args env))))

We use the Scheme let-form to define the meaning of local binding. First, we evaluate all expressions that are to be bound to the newly introduced variables. Then, we extend the original environment with the new bindings and evaluate the body of the let-form in the extended environment.
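The two steps (evaluate the operands in the original environment, then evaluate the body in the extended one) can be sketched in plain Scheme. This is an illustration under assumptions: environments are lists of frames of the form (ids vals), expressions are tagged lists, and the name eval-let-demo is hypothetical rather than taken from the course interpreter.

```scheme
;; Environment: list of frames, innermost first; a frame is (ids vals)
(define empty-env (lambda () '()))
(define extend-env
  (lambda (ids vals env) (cons (list ids vals) env)))
(define apply-env
  (lambda (env sym)
    (if (null? env)
        (error 'apply-env "No binding for ~s" sym)
        (let loop ((ids (car (car env))) (vals (cadr (car env))))
          (cond ((null? ids) (apply-env (cdr env) sym))
                ((eq? (car ids) sym) (car vals))
                (else (loop (cdr ids) (cdr vals))))))))

;; Expressions: (lit n), (var x), (mul e1 e2), (let-exp ids rands body)
(define eval-let-demo
  (lambda (exp env)
    (case (car exp)
      ((lit) (cadr exp))
      ((var) (apply-env env (cadr exp)))
      ((mul) (* (eval-let-demo (cadr exp) env)
                (eval-let-demo (caddr exp) env)))
      ((let-exp)
       (let ((ids (cadr exp))
             ;; step 1: operands are evaluated in the ORIGINAL environment
             (args (map (lambda (rand) (eval-let-demo rand env))
                        (caddr exp)))
             (body (cadddr exp)))
         ;; step 2: the body is evaluated in the extended environment
         (eval-let-demo body (extend-env ids args env))))
      (else (error 'eval-let-demo "Unknown expression ~s" exp)))))

;; f = 4, t = 7, body multiplies them
(eval-let-demo '(let-exp (f t) ((lit 4) (lit 7)) (mul (var f) (var t)))
               (empty-env))
```

Since the operands see only the original environment, one declaration in a let-form cannot refer to another declaration of the same form.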


Auxiliaries

(define get-rands
  (lambda (decls)
    (cases declarations decls
      (let-decls (ids rands) rands))))

(define get-ids
  (lambda (decls)
    (cases declarations decls
      (let-decls (ids rands) ids))))


A Let Example

<let <declarations <declaration <variable value = f /> <integer value = 4 /> /> <declaration <variable value = t /> <add <arguments <integer value = 3 /> <integer value = 4 /> /> /> /> /> <mul <arguments <reference value = f /> <reference value = t /> /> />/>

Result: 28

<let <declarations <declaration <variable value = f /> <integer value = 4 /> /> <declaration <variable value = t /> <add <arguments <integer value = 3 /> <integer value = 4 /> /> /> /> /> <mul <arguments <reference value = f /> <reference value = t /> <reference value = v /> /> />/>

Result: 140, given that v is bound in the initial environment: (apply-env env 'v) yields 5


Procedures

Procedures introduce “named expression abstractions”.

We represent procedures as first-class values in our language.

Expressed Value = Denoted Value = Number + ProcVal

ProcVal is the set of values representing procedures.

We need two new constructs:
  declaration of procedures
  procedure calls


Syntax for Procedures

<expression> ::= "<proc" <formals> <expression> "/>"

proc-exp (ids body)

<expression> ::= "<invoke" <expression> <arguments> "/>"

app-exp (rator rands)

<arguments> ::= "<arguments" {<expression>}* "/>"

<formals> ::= "<params" {<param>}* "/>"

param-decls (ids)


Grammar Specification

(expression ("<proc" formals expression "/>") proc-exp)

(expression ("<invoke" expression "<arguments" (arbno expression) "/>" "/>") app-exp)

(formals ("<params" (arbno "<param" "value" "=" identifier "/>") "/>") param-decls)


Requirements for Procedure Application

When a procedure is applied, its body is evaluated in an environment that binds the formal parameters of the procedure to the arguments of the application.

Variables that occur free (references without declarations within the scope of the procedure) must obey the lexical binding rule, that is, they need to be defined in the enclosing region.

The mechanism that resolves all variable references at the time the procedure is created is called static scoping.


Example of Static Scoping

If f is called, its body should be evaluated in the following environment:

(((y z) (2 28)) . (((x) (5)) . ()))

y = 2, z = 28, x = 5

<let <declarations <declaration <variable value = x /> <integer value = 5 /> /> /> <let <declarations <declaration <variable value = f /> <proc <params <param value = y /> <param value = z /> /> <add <arguments <reference value = y /> <sub <arguments <reference value = z /> <reference value = x /> /> /> /> /> /> /> <declaration <variable value = x /> <integer value = 28 /> /> /> <invoke <reference value = f /> <arguments <integer value = 2 /> <reference value = x /> /> /> />/>


Nested Procedures

(let ((cadd (lambda (n)
              (let ((h (lambda (m) (+ n m))))
                h)))
      (twice (lambda (f)
               (let ((g (lambda (x) (f (f x)))))
                 g))))
  (let ((seventeen ((twice (cadd 5)) 7))
        (addTwentyFour (twice (twice (cadd 6)))))
    (addTwentyFour seventeen)))

Result: 41

Both cadd and twice return a procedure!

(cadd 4) => (lambda (m) (+ 4 m))

(twice cadd) => (lambda (x) (cadd (cadd x)))


Nested Procedures

In languages without nested functions (such as C), the runtime representation of a function value can be the address of the machine code for that function. This address can be passed as an argument, stored in a variable, and so on.

But this does not work for nested procedures; if we represent the procedure h by an address, in what outer frame can it access the variable n? Similarly, how does the procedure g access the variable f?


Closures

A closure is a package that contains
  the procedure body,
  the list of all formal parameters, and
  the bindings of its free variables.

(define closure
  (lambda (ids body env)
    (lambda (args)
      (eval-expression body (extend-env ids args env)))))

In general, it is convenient to store the entire creation environment of a procedure, rather than just the bindings of the free variables.


The Representation of ProcVal

We define an abstract data type for ProcVal:

(define-datatype procval procval?
  (closure
    (ids (list-of symbol?))
    (body expression?)
    (env environment?)))


Procedure Call Evaluation

(define apply-procval
  (lambda (proc args)
    (cases procval proc                      ; is proc a procedure?
      (closure (ids body env)
        ;; evaluate the procedure body using the creation environment
        (eval-expression body (extend-env ids args env))))))


Extensions of eval-expression

(define eval-expression
  (lambda (exp env)
    (cases expression exp
      …
      (proc-exp (params body)
        (closure (get-ids params) body env))
      …)))

The body of the procedure is not yet evaluated. We memorize only the creation environment!


Get-Ids

Extract parameter name list:

(define-datatype pdeclarations pdeclarations?
  (param-decls
    (ids (list-of symbol?))))

(define get-ids
  (lambda (decls)
    (cond
      ((pdeclarations? decls)
       (cases pdeclarations decls
         (param-decls (ids) ids))))))


Evaluation of Procedure Call

(define eval-expression
  (lambda (exp env)
    (cases expression exp
      …
      (app-exp (rator rands)
        (let ((proc (eval-expression rator env))
              (args (eval-rands rands env)))        ; call-by-value
          (if (procval? proc)
              (apply-procval proc args)
              ;; like Scheme's "x is not a procedure"
              (eopl:error 'eval-expression
                          "Attempt to apply non-procedure ~s" proc))))
      …)))


Dynamic Scoping

Dynamic scoping means that the procedure body is evaluated in an environment obtained by extending the environment at the point of the procedure call.

(let ((a 3))
  (let ((p (lambda (x) (+ x a)))
        (a 5))
    (* a (p 2))))

Under dynamic scoping: (* 5 (+ 2 5)) = (* 5 7) = 35

How can we implement dynamic scoping?


Dynamic Scoping

(define apply-procval
  (lambda (proc args calling-env)
    (cases procval proc
      (closure (ids body env)
        ;; extend the closure environment with the call environment!
        (eval-expression body
          (extend-env ids args (extend-env2 calling-env env)))))))

(app-exp (rator rands)
  (let ((proc (eval-expression rator env))
        (args (eval-rands rands env)))
    (if (procval? proc)
        (apply-procval proc args env)
        (eopl:error 'eval-expression
                    "Attempt to apply non-procedure ~s" proc))))


extend-env2

(define extend-env2
  (lambda (calling-env creation-env)
    (cases environment calling-env
      (empty-env-rec () creation-env)
      (extend-env-rec (syms vals old-env)
        (extend-env syms vals (extend-env2 old-env creation-env))))))
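To compare static and dynamic scoping on the (let ((a 3)) …) program, here is a small self-contained sketch in plain Scheme. It rests on assumptions: expressions are S-expressions with hypothetical proc and call forms, closures are tagged lists, and the dynamic mode uses the calling environment alone. The slides instead layer the calling environment over the creation environment with extend-env2, which gives the same answer for this program.

```scheme
;; Environment: list of frames (ids vals), innermost first
(define extend-env (lambda (ids vals env) (cons (list ids vals) env)))
(define apply-env
  (lambda (env sym)
    (if (null? env)
        (error 'apply-env "No binding for ~s" sym)
        (let loop ((ids (car (car env))) (vals (cadr (car env))))
          (cond ((null? ids) (apply-env (cdr env) sym))
                ((eq? (car ids) sym) (car vals))
                (else (loop (cdr ids) (cdr vals))))))))

;; Expressions: numbers, symbols, (+ e1 e2), (* e1 e2),
;; (proc (id ...) body), (let ((id e) ...) body), (call rator rand ...)
;; mode is 'static or 'dynamic.
(define ev
  (lambda (exp env mode)
    (cond
      ((number? exp) exp)
      ((symbol? exp) (apply-env env exp))
      ((eq? (car exp) '+)
       (+ (ev (cadr exp) env mode) (ev (caddr exp) env mode)))
      ((eq? (car exp) '*)
       (* (ev (cadr exp) env mode) (ev (caddr exp) env mode)))
      ((eq? (car exp) 'proc)
       ;; remember the creation environment, do not evaluate the body yet
       (list 'closure (cadr exp) (caddr exp) env))
      ((eq? (car exp) 'let)
       (let ((ids (map car (cadr exp)))
             (vals (map (lambda (b) (ev (cadr b) env mode)) (cadr exp))))
         (ev (caddr exp) (extend-env ids vals env) mode)))
      ((eq? (car exp) 'call)
       (let ((proc (ev (cadr exp) env mode))
             (args (map (lambda (e) (ev e env mode)) (cddr exp))))
         (let ((ids (cadr proc))
               (body (caddr proc))
               (creation-env (cadddr proc)))
           ;; static: extend the creation environment
           ;; dynamic: extend the environment at the point of the call
           (ev body
               (extend-env ids args
                           (if (eq? mode 'static) creation-env env))
               mode))))
      (else (error 'ev "Unknown expression ~s" exp)))))

(define example
  '(let ((a 3))
     (let ((p (proc (x) (+ x a)))
           (a 5))
       (* a (call p 2)))))

(ev example '() 'static)    ; p sees a = 3: (* 5 (+ 2 3))
(ev example '() 'dynamic)   ; p sees a = 5: (* 5 (+ 2 5))
```

The only difference between the two modes is which environment is extended when a closure is applied.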


Dynamic Scoping Example

<let <declarations <declaration <variable value = a /> <integer value = 3 /> /> /> <let <declarations <declaration <variable value = p /> <proc <params <param value = x /> /> <add <arguments <reference value = x /> <reference value = a /> /> /> /> /> <declaration <variable value = a /> <integer value = 5 /> /> /> <mul <arguments <reference value = a /> <invoke <reference value = p /> <arguments <integer value = 2 /> /> /> /> /> /> />


Split Programs

XMLScheme is a very verbose language.

To facilitate the writing of programs, XMLScheme should also support a mechanism that allows programmers to divide a given program into different compilation units.

Languages like C/C++, Scheme, and HTML support source code inclusion. In XMLScheme we will adopt this approach and introduce a new form of expression: hyper-references.


Hyper-References

<expression> ::= "<href" "value" "=" <identifier> "/>"

extern-exp (unit)

A hyper-reference is an expression that is defined in a separate compilation unit.

Hyper-references may also be nested.

The compilation unit is loaded when a hyper-reference is evaluated.


Hyper-Reference Evaluation

(extern-exp (unit-name)
  (let ((contents
          ;; build the filename and load the expression definition
          (read-file (string-append (symbol->string unit-name) ".xml"))))
    ;; call the expression parser, then evaluate the result
    (eval-expression (expression-parser contents) env)))


Expression Parser

We define a new grammar for expressions.

The root symbol in the new grammar is <expression>.

The new grammar will contain only rules that are reachable from the root symbol <expression>.

(define expression-parser (sllgen:make-string-parser scanner-spec expression-spec))


Hyper-Reference Example

h.xml:

<let <declarations <declaration <variable value = h /> <proc <params <param value = m /> /> <add <arguments <reference value = n /> <reference value = m /> /> /> /> /> /> <reference value = h />/>

… <declaration <variable value = cadd /> <proc <params <param value = n /> /> <href value = h /> /> />…


Recursion

Most programming languages support the definition of recursive abstractions:
  records (to construct linked lists),
  procedures (to implement inductively specified data types),
  mutually dependent data structures like classes.

Recursion is a challenging mechanism and may often lead to complications in program understanding.

We will study only one form of recursive definition here: letrec, the recursive procedure definition à la Scheme.


A Recursive Problem

Suppose we want to define the operation plus using only the operators increment and decrement.

We may write:

$ let plus = proc (n, m) if n then (plus dec(n) inc(m)) else m
  in (plus 2 3)
Error reported by apply-env
No binding for plus

Unfortunately this is not a definition, since we are trying to use “plus” before it is defined.

Task: Although recursion is fundamental to programming, it is not yet primitive in our language, so we must find a way to “program” it!


Recursive Functions As Fixed Points

However, we can obtain a closed expression by abstracting over plus:

rplus = proc (plus, n, m) if n then (plus dec(n) inc(m)) else m

Now, let “fplus” be the actual addition function we want. We must pass it to “rplus” as a parameter before we can perform any additions.

(rplus fplus) is the function we want. In other words, we are looking for an fplus such that:

rplus fplus ↔ fplus

That is, we are searching for a fixed point of “rplus”.


Fixed Points

In general, a fixed point of a function is a value in the function’s domain that is mapped to itself by the function. That is, a fixed point of a function f is a value p such that (f p) = p.

Examples:
(factorial 1) = 1
(factorial 2) = 2
(fibonacci 0) = 0
(fibonacci 1) = 1

However, not all functions have exactly one fixed point: factorial has two, whereas “inc(n) = n + 1” has none.

We need to represent the fixed-point operation in our language.


Fixed-Point Theorem

Fixed-point Theorem:
For every F there exists a fixed point X such that F X ↔ X.

Proof:
Let

Y ≡ λf . (λx . f (x x)) (λx . f (x x))

Now consider:

X ≡ Y F
  → (λx . F (x x)) (λx . F (x x))
  → F ((λx . F (x x)) (λx . F (x x)))
  ↔ F X

Therefore, the “Y combinator” can always be used to find a fixed point of an arbitrary lambda expression, if such a fixed point exists.


Unfolding Recursive Lambda Expressions


plus is a fixed point of rplus. By the fixed-point theorem, we can take:

plus ≡ Y rplus

plus 1 1
  = (Y rplus) 1 1
  → rplus plus 1 1
  → if 1 then (plus (pred 1) (succ 1)) else 1
  → (plus (pred 1) (succ 1))
  → (rplus plus (pred 1) (succ 1))
  → if (pred 1) then (plus (pred (pred 1)) (succ (succ 1))) else (succ 1)
  → if 0 then (plus (pred (pred 1)) (succ (succ 1))) else (succ 1)
  → (succ 1)
  → 2


Strict Fixed-Point Operator

The fixed-point operator Y is useless in a call-by-value setting, since the expression Y g diverges for any g. In call-by-value settings we therefore use the operator fix:

fix ≡ λf . (λx . f (λy . x x y)) (λx . f (λy . x x y))
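Since Scheme itself is call-by-value, the operator fix can be written down directly. The sketch below assumes a curried rplus (one argument at a time) and uses zero?, - and + in place of the defined language's test-for-zero, dec and inc; the curried form is an assumption made so that the one-argument wrapper λy . x x y suffices.

```scheme
;; fix = λf . (λx . f (λy . x x y)) (λx . f (λy . x x y))
;; The inner (lambda (y) ...) delays the self-application (x x)
;; until the recursive procedure is actually called.
(define fix
  (lambda (f)
    ((lambda (x) (f (lambda (y) ((x x) y))))
     (lambda (x) (f (lambda (y) ((x x) y)))))))

;; rplus abstracts over the recursive reference "plus" (curried)
(define rplus
  (lambda (plus)
    (lambda (n)
      (lambda (m)
        (if (zero? n)
            m
            ((plus (- n 1)) (+ m 1)))))))

;; plus is the fixed point of rplus
(define plus (fix rplus))

((plus 2) 3)
```

Replacing (lambda (y) ((x x) y)) by a bare (x x) would give the operator Y, and evaluating (Y rplus) in Scheme would then loop forever, because (x x) would be evaluated before f is ever applied.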


Unfolding Recursive Lambda Expressions II


We can take: plus ≡ fix rplus

Under call-by-value:

plus 1 1
  = (fix rplus) 1 1
  → (h h) 1 1                  where h = (λx . rplus (λy . x x y))
  → rplus fct 1 1              where fct = λy . h h y
  → if 1 then (fct 0 2) else 1
  → fct 0 2
  → h h 0 2
  → rplus fct 0 2
  → if 0 then (fct (pred 0) (succ 2)) else 2
  → 2


Syntax for Mutually Recursive Definitions

<expression> ::= "<reclet" <declarations> <expression> "/>"

reclet-exp (decls body)

The syntax for mutually recursive definitions is like the let-syntax.

However, the evaluation of mutually recursive definitions requires the application of a corresponding fixed-point operator.


Applications of a Recursive Definition

<reclet <declarations <declaration <variable value = even /> <href value = even /> /> <declaration <variable value = odd /> <href value = odd /> /> /> <invoke <reference value = odd /> <arguments <integer value = 13 /> /> />/>

odd ∈ fv(even) and even ∈ fv(odd)


Even & Odd

even.xml:

<proc <params <param value = x /> /> <if <condition <reference value = x /> /> <then <invoke <reference value = odd /> <arguments <dec <arguments <reference value = x /> /> /> /> /> /> <else <integer value = 1 /> /> />/>

odd.xml:

<proc <params <param value = x /> /> <if <condition <reference value = x /> /> <then <invoke <reference value = even /> <arguments <dec <arguments <reference value = x /> /> /> /> /> /> <else <integer value = 0 /> /> />/>


Building the Fixed-Point

Fixed-point semantics:

Let e’ be (extend-env-recursively proc-names bodies e). Then:

If name is one of the names in proc-names, then

(apply-env e’ name) = (closure ids body e’),

where ids and body are the formal parameters and the body of the recursive procedure, respectively.

If not, then (apply-env e’ name) = (apply-env e name).
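The two equations can be realized almost literally when environments are represented as procedures. The following is a sketch under assumptions, not the course's implementation: environments map symbols to values, closures are tagged lists, the helper list-index-of is hypothetical, and extend-env-recursively takes the parameter lists as an extra argument idss.

```scheme
;; Environments as procedures from symbols to values (assumed representation)
(define empty-env
  (lambda ()
    (lambda (sym) (error 'apply-env "No binding for ~s" sym))))
(define apply-env (lambda (env sym) (env sym)))

;; Closures as tagged lists: (closure ids body env)
(define make-closure
  (lambda (ids body env) (list 'closure ids body env)))

;; Position of sym in lst, or #f if it does not occur
(define list-index-of
  (lambda (sym lst)
    (let loop ((l lst) (i 0))
      (cond ((null? l) #f)
            ((eq? (car l) sym) i)
            (else (loop (cdr l) (+ i 1)))))))

(define extend-env-recursively
  (lambda (proc-names idss bodies old-env)
    (letrec ((rec-env
              (lambda (sym)
                (let ((pos (list-index-of sym proc-names)))
                  (if pos
                      ;; (apply-env e' name) = (closure ids body e'):
                      ;; the closure captures rec-env itself
                      (make-closure (list-ref idss pos)
                                    (list-ref bodies pos)
                                    rec-env)
                      ;; otherwise (apply-env e' name) = (apply-env e name)
                      (apply-env old-env sym))))))
      rec-env)))

;; Bodies are placeholders here; only the environment structure matters
(define e1
  (extend-env-recursively '(even odd) '((x) (x)) '(even-body odd-body)
                          (empty-env)))
(define odd-closure (apply-env e1 'odd))
(eq? (cadddr odd-closure) e1)   ; the closure's environment is e1 itself
```

The letrec ties the knot: every lookup of a recursive name builds a closure over the very environment being defined, so even and odd can refer to each other.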


Evaluation of reclet

…
(reclet-exp (decls body)
  (let* ((args (eval-rands (get-rands decls) env))
         ;; filter recursive procedure ids
         (rec-proc-ids
           (map car
                (filter (lambda (p) (procval? (cadr p)))
                        (zip (get-ids decls) args))))
         ;; now change closure to closure-rec
         (new-args
           (map (lambda (v)
                  (if (procval? v)
                      (build-rec-proc v rec-proc-ids)
                      v))
                args)))
    (eval-expression body (extend-env (get-ids decls) new-args env))))
…


Filter

(define filter
  (lambda (p lst)
    (if (null? lst)
        '()
        (if (p (car lst))
            (cons (car lst) (filter p (cdr lst)))
            (filter p (cdr lst))))))

(define odd?
  (lambda (n) (= (modulo n 2) 1)))

> (filter odd? '(1 2 3 4 5 6 7 8))
'(1 3 5 7)


build-rec-proc

(define build-rec-proc
  (lambda (v rec-ids)
    (cases procval v
      (closure (ids body env)
        (closure-rec ids body rec-ids env))
      (else v))))   ; we should never reach this case

(define-datatype procval procval?
  (closure
    (ids (list-of symbol?))
    (body expression?)
    (env environment?))
  (closure-rec
    (ids (list-of symbol?))
    (body expression?)
    (rec-proc-ids (list-of symbol?))
    (env environment?)))


New Approach to Call Procedures

(app-exp (rator rands)
  (let ((proc (eval-expression rator env))
        (args (eval-rands rands env)))
    (if (procval? proc)
        ;; add calling-env to resolve occurring recursive procedures
        (apply-procval proc args env)
        (eopl:error 'eval-expression
                    "Attempt to apply non-procedure ~s" proc))))


New apply-procval

(define apply-procval
  (lambda (proc args calling-env)
    (cases procval proc
      (closure (ids body env)
        (eval-expression body (extend-env ids args env)))
      (closure-rec (ids body rec-proc-ids env)
        (eval-expression body
          (extend-env ids args
            (extend-rec-env rec-proc-ids calling-env env)))))))


extend-rec-env

;; build the fixed point for all recursive procedures
(define extend-rec-env
  (lambda (rec-proc-ids calling-env creation-env)
    (if (not (null? rec-proc-ids))
        (extend-env rec-proc-ids
                    (map (lambda (id) (apply-env calling-env id)) rec-proc-ids)
                    creation-env)
        creation-env)))


Expression Sequences

Command expression:

<expression> ::= "<sequence" {<expression>}+ "/>"

Kleene plus: A+ = A A* (one A followed by zero or more A)

<expression> ::= "<sequence" <expression> {<expression>}* "/>"

seq-exp (exp exps)


Implementation of Sequencing

(define eval-expression
  (lambda (exp env)
    (cases expression exp
      …
      (seq-exp (exp exps)
        (eval-exp-sequence exp exps env))
      …)))


Sequence Evaluation

(define eval-exp-sequence
  (lambda (exp exps env)
    ;; while loop, written as a Scheme named let
    (let continue ((res (eval-expression exp env))
                   (tail exps))
      (if (null? tail)
          res                                ; return value of last expression
          (continue (eval-expression (car tail) env) (cdr tail))))))


Control Context

The standard recursive implementation of factorial uses a call to procedure factorial as an operand, which requires the creation of a control context:

(define factorial
  (lambda (n)
    (if (zero? n)
        1
        (* n (factorial (- n 1))))))

(factorial 6)
| (factorial 5)
| |(factorial 4)
| | (factorial 3)
| | |(factorial 2)
| | | (factorial 1)
| | | |(factorial 0)
| | | |1
| | | 1
| | |2
| | 6
| |24
| 120
720

Each recursive call requires a new context.


Tail Form

A procedure call that does not grow control context is the same as a jump. Such a procedure call is said to be a tail call.

Iterative factorial:

(define factorial
  (lambda (n)
    (let fact-iter ((rest n) (res 1))
      (if (zero? rest)
          res
          (fact-iter (- rest 1) (* res rest))))))

(factorial 6)
| (fact-iter 6 1)
| |(fact-iter 5 6)
| | (fact-iter 4 30)
| | |(fact-iter 3 120)
| | | (fact-iter 2 360)
| | | |(fact-iter 1 720)
| | | | (fact-iter 0 720)
| | | | 720
| | | |720
| | | 720
| | |720
| | 720
| |720
| 720
720


Sequencing Example

<sequence <add <arguments <integer value = 2 /> <integer value = 3 /> /> /> <mul <arguments <integer value = 2 /> <integer value = 3 /> /> />/>

6


New Operators

<prim-op> ::= "add" | "sub" | "mul" | "inc" | "dec"
            | "div" | "equal" | "less" | "greater"
            | "not" | "and" | "or"

(define-datatype primitive primitive?
  (add-prim) (sub-prim) (mult-prim) (inc-prim) (dec-prim)
  (div-prim) (equal-prim) (less-prim) (greater-prim)
  (not-prim) (and-prim) (or-prim))


New Operator Evaluation Approach

(add-prim  () (eval (append '(+) args)))
(sub-prim  () (eval (append '(-) args)))
(mult-prim () (eval (append '(*) args)))
(inc-prim  () (car (reverse (map (lambda (n) (+ n 1)) args))))   ; return last expression
(dec-prim  () (car (reverse (map (lambda (n) (- n 1)) args))))   ; return last expression
(div-prim  () (eval (append '(/) args)))
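For the arithmetic and comparison operators, the same variable-arity behavior can also be obtained with apply, which passes a list of values to a procedure without building an expression first. This is an alternative sketch, not the approach of the slides; the helper name apply-prim-demo and the use of symbols for operators are assumptions. Note that in R5RS, eval takes an environment specifier as a second argument, so the one-argument form relies on the implementation accepting it.

```scheme
;; 1 stands for true, 0 for false (as in the slides)
(define b->n (lambda (b) (if b 1 0)))

;; Hypothetical helper: op is a symbol, args is a list of numbers
(define apply-prim-demo
  (lambda (op args)
    (case op
      ((add) (apply + args))
      ((sub) (apply - args))
      ((mul) (apply * args))
      ((div) (apply / args))
      ((less) (b->n (apply < args)))
      ((greater) (b->n (apply > args)))
      (else (error 'apply-prim-demo "Unknown operator ~s" op)))))

(apply-prim-demo 'add '(1 2 3))    ; (+ 1 2 3)
(apply-prim-demo 'less '(1 2 3))   ; (< 1 2 3) is true, so 1
```

This does not carry over to and and or, which are special forms in Scheme and cannot be passed to apply.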


Evaluation of Relational Operators

(equal-prim ()
  (let cont ((tail (cdr args)))
    (if (null? tail)
        1                         ; identity ==> true
        (if (eqv? (car args) (car tail))
            (cont (cdr tail))
            0))))                 ; not all elements are equal

(less-prim    () (b->n (eval (append '(<) args))))
(greater-prim () (b->n (eval (append '(>) args))))

(define b->n
  (lambda (b) (if b 1 0)))


Evaluation of Boolean Operators

(not-prim ()
  (b->n (car (reverse (map (lambda (v)
                             (if (number? v)
                                 (not (is-true? v))
                                 (not v)))
                           args)))))

(and-prim () (b->n (eval (append '(and) (map is-true? args)))))
(or-prim  () (b->n (eval (append '(or) (map is-true? args)))))

(define b->n
  (lambda (b) (if b 1 0)))


Equal & Or

<equal <arguments <integer value = 3 /> <dec <arguments <integer value = 4 /> /> /> <inc <arguments <integer value = 2 /> /> /> <add <arguments <integer value = 2 /> <integer value = 1 /> /> /> /> />

<or <arguments <integer value = 3 /> <dec <arguments <integer value = 4 /> /> /> <inc <arguments <integer value = 2 /> /> /> <add <arguments <integer value = 2 /> <integer value = 1 /> /> /> /> />


Loop Expression

A loop expression provides a general looping construct similar to the for statement in C/C++ or Java.

<expression> ::= "<loop" [ <declarations> ]
                         [ "<conditions" {<expression>}+ "/>" ]
                         [ "<increments" {<expression>}+ "/>" ]
                         <expression> "/>"

loop-exp (decls conds incrs body)


Loop Syntax

(opt-declarations () empty-decl-list)

(opt-declarations (declarations) decl-list)

(opt-conditions () empty-exp-list)

(opt-conditions ("<conditions" expression (arbno expression) "/>") exp-list)

(opt-increments () empty-exp-list)

(opt-increments ("<increments" expression (arbno expression) "/>") exp-list)

(expression ("<loop" opt-declarations opt-conditions opt-increments expression "/>") loop-exp)


Loop Evaluation

(loop-exp (decls conds incrs body)
  (let ((new-decls (eval-declaration-list decls env))
        (conditions (make-exp-list conds))
        (increments (make-exp-list incrs)))
    ;; check for correct increments arity
    (if (= (length new-decls) (length increments))
        (eval-loop new-decls conditions increments body env)
        ;; arity error
        (eopl:error 'eval-expression "Arity mismatch in loop increments"))))


make-exp-list

(define make-exp-list
  (lambda (exps)
    (cases expression-list exps
      (empty-exp-list () '())
      (exp-list (exp tail) (cons exp tail)))))


Evaluate Loop Body

(define eval-loop
  (lambda (loop-decls conditions increments body env)
    (let ((p (unzip loop-decls))
          ;; map conditions to or-prim
          (loop-test (primapp-exp (or-prim) (car conditions) (cdr conditions))))
      (let continue ((new-env (extend-env (car p) (cadr p) env))   ; loop env
                     (res 0))                                       ; res #f
        ;; do loop
        (if (is-true? (eval-expression loop-test new-env))
            (let ((new-res (eval-expression body new-env))          ; eval body => res
                  (step-res (map (lambda (e) (eval-expression e new-env))
                                 increments)))
              ;; next iteration
              (continue (extend-env (car p) step-res env) new-res))
            res)))))


A Loop Example

<loop <declarations <declaration <variable value = i /> <integer value = 5 /> /> <declaration <variable value = j /> <integer value = 6 /> /> <declaration <variable value = k /> <integer value = 7 /> /> /> <conditions <greater <arguments <reference value = i /> <integer value = 0 /> /> /> <greater <arguments <reference value = j /> <integer value = 0 /> /> /> <greater <arguments <reference value = k /> <integer value = 0 /> /> /> /> <increments <dec <arguments <reference value = i /> /> /> <dec <arguments <reference value = j /> /> /> <dec <arguments <reference value = k /> /> /> /> <add <arguments <reference value = i /> <reference value = j /> <reference value = k /> /> />/>

[i j k]
(5 6 7)
(4 5 6)
(3 4 5)
(2 3 4)
(1 2 3)
(0 1 2)
(-1 0 1)
(-2 -1 0)

Result: 0


Variable Assignment

In a language that supports variable assignment, every identifier denotes an address of a mutable location in memory.

The address is called a reference, and it is the contents of the reference that is modified by a variable assignment.

References or locations are called L-values, which reflects their association with variables appearing on the left-hand side of assignment statements.

Analogously, expressed values, such as the values of the right-hand side expressions of assignment statements, are called R-values.


Interpreter Values

Denoted Values = Ref(Expressed Values)
Expressed Values = Number + ProcVal


Syntax for Variable Assignment

<expression> ::= "<set" "<variable" "value" "=" <identifier> "/>" <expression> "/>"

assign-exp (id exp)

variant-record type: (assign-exp (id symbol?) (exp expression?))


Semantics of Assignment

What is the difference between assignment and binding?
  A binding creates an immutable association of a name with a value.
  An assignment changes the value of an existing binding.

Variable assignment enables the sharing of values between different parts of a program (e.g., procedures).

Variable assignment is not transparent, i.e., a change of the value of a variable by an assignment is seen by all parts of the program that refer to the variable.

If a language supports variable assignments, then procedures do not, in general, have referential transparency.


Sharing

<let <declarations <declaration <variable value = x /> <integer value = 0 /> /> <declaration <variable value = zero? /> <href value = zero /> /> /> <reclet <declarations <declaration <variable value = even /> <href value = evensharing /> /> <declaration <variable value = odd /> <href value = oddsharing /> /> /> <sequence <set <variable value = x /> <integer value = 13 /> /> <invoke <reference value = odd /> <arguments /> /> /> />/>


oddsharing

<proc <params /> <if <condition <invoke <reference value = zero? /> <arguments <reference value = x /> /> /> /> <then <integer value = 0 /> /> <else <sequence <set <variable value = x /> <dec <arguments <reference value = x /> /> /> /> <invoke <reference value = even /> <arguments /> /> /> /> />/>


evensharing

<proc <params /> <if <condition <invoke <reference value = zero? /> <arguments <reference value = x /> /> /> /> <then <integer value = 1 /> /> <else <sequence <set <variable value = x /> <dec <arguments <reference value = x /> /> /> /> <invoke <reference value = odd /> <arguments /> /> /> /> />/>


Private State

(let ((g (let ((count 0))
           (lambda ()
             (begin
               (set! count (+ count 1))
               count)))))
  (+ (g) (g)))

The procedure g maintains a private variable count that stores the number of times g has been called, so this program evaluates to 3.


Call-By-Value

Every time a procedure is called, we can create a new reference for each formal parameter, a policy called call-by-value.

(let ((x 100))
  (let ((p (lambda (x)
             (begin
               (set! x (+ x 1))
               x))))
    (+ (p x) (p x))))

This program evaluates to 202, because a new reference is created for x at each of the procedure calls.

At each procedure call, the assignment affects only the inner binding.


The Reference Data Type

References can be represented by indices of a vector:

(define-datatype reference reference?
  (a-ref
    (position number?)
    (vec vector?)))

We need two operations:
  deref to access the value stored in a location
  setref! to set the value in a location


deref & setref!

(define deref
  (lambda (ref)
    (cases reference ref
      (a-ref (pos vec) (vector-ref vec pos)))))

(define setref!
  (lambda (ref value)
    (cases reference ref
      (a-ref (pos vec) (vector-set! vec pos value)))))

We use the Scheme vector procedures!
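The same idea can be tried in plain Scheme without define-datatype. In this sketch a reference is simply a pair of a position and a vector; the pair representation and the names make-ref, store, r1 and r1-alias are assumptions made for illustration.

```scheme
;; A reference is (position . vector), an assumed stand-in for a-ref
(define make-ref (lambda (pos vec) (cons pos vec)))

(define deref
  (lambda (ref) (vector-ref (cdr ref) (car ref))))

(define setref!
  (lambda (ref val) (vector-set! (cdr ref) (car ref) val)))

;; Three locations; r1 and r1-alias both name location 1
(define store (vector 10 20 30))
(define r1 (make-ref 1 store))
(define r1-alias (make-ref 1 store))

(setref! r1 99)
(deref r1-alias)   ; reads the value just written through r1
```

Two references with the same position and vector denote the same location, so an assignment through one is visible through the other. This is what makes sharing between procedures possible.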


New Environment Data Type

(define-datatype environment environment?
  (empty-env-rec)
  (extended-env-rec
    (syms (list-of symbol?))
    (vec vector?)
    (env environment?)))

(define empty-env
  (lambda ()
    (empty-env-rec)))


A New Environment Representation

(define extend-env
  (lambda (syms vals env)
    (extended-env-rec syms (list->vector vals) env)))   ; convert a list to a vector

(define apply-env                                        ; returns a value
  (lambda (env sym)
    (deref (apply-env-ref env sym))))

(define apply-env-ref                                    ; returns a reference
  (lambda (env sym)
    (cases environment env
      (empty-env-rec ()
        (eopl:error 'apply-env-ref "No binding for ~s" sym))
      (extended-env-rec (syms vals old-env)
        (let ((pos (list-find-position sym syms)))
          (if (number? pos)
              (a-ref pos vals)
              (apply-env-ref old-env sym)))))))


Implementation of Variable Assignment

(define eval-expression
  (lambda (exp env)
    (cases expression exp
      …
      (assign-exp (id r-value)
        (let ((val (eval-expression r-value env)))
          (setref! (apply-env-ref env id) val)
          val))
      …)))

We need to return a value explicitly, because the return value of setref! is unspecified.


Analysis of Solution

The new environment definition immediately provides a mechanism for call-by-value (parameters are elements of a vector).

<let <declarations <declaration <variable value = x /> <integer value = 100 /> /> /> <let <declarations <declaration <variable value = p /> <proc <params <param value = x /> /> <sequence <set <variable value = x /> <inc <arguments <reference value = x /> /> /> /> <reference value = x /> /> /> /> /> <add <arguments <invoke <reference value = p /> <arguments <reference value = x /> /> /> <invoke <reference value = p /> <arguments <reference value = x /> /> /> /> /> />/>

Result: 202


Parameter-Passing Variations

Call-by-value is the most commonly used form of parameter passing, and is the standard against which other parameter-passing mechanisms are usually compared.

(let ((a 3)
      (p (lambda (x) (set! x 4))))
  (begin (p a) a))

Result: 3

Under call-by-value semantics, the denoted value associated with “x” is a reference that initially contains the same value as the reference associated with “a”, but these references are distinct. Therefore, any assignment to “x” has no effect on the contents of “a”.


Call-By-Reference

The isolation between the caller and the callee, as in call-by-value, is generally desirable.

But it is also valuable to allow a procedure to be passed variables with the expectation that they will be assigned by the procedure. In particular, we may want to use this approach when the procedure returns multiple values.

The parameter-passing mechanism is called “call-by-reference”.


Semantics of Call-By-Reference

If an operand is a variable reference, then a reference to the variable’s location is passed. The formal parameter of the procedure is then bound to this location.

If the operand is some other kind of expression, then the formal parameter is bound to a new location containing the value of the operand, just as in call-by-value.


The Procedure swap

(let ((a 3)
      (b 4)
      (swap (lambda (x y)
              (let ((temp x))
                (begin
                  (set! x y)
                  (set! y temp))))))
  (begin
    (swap a b)
    (- a b)))

Under call-by-reference, this swaps the values of “a” and “b”, so it returns 1.

Under call-by-value, this program returns -1, because the assignments inside the swap procedure have no effect on the variables “a” and “b”.


Expressed Values and Denoted Values

Under call-by-reference, identifiers still denote references to expressed values, just as they did under call-by-value:

Denoted Value = Ref(Expressed Value)
Expressed Value = Number + ProcVal

The only change occurs when new references are created.
Under call-by-value, a new reference is created for every evaluation of an operand.
Under call-by-reference, a new reference is created for every evaluation of an operand other than a variable.


A Problem

In our approach, call-by-value creates a new location for every operand in a procedure application. We have put the values of all the operands in a vector, and have “apply-env-ref” create a reference to the location at variable-lookup time.

Under call-by-reference, we will need a new location for some operands and not for others. So, we need a different representation for references.

A reference will be, as before, a reference to a location within a vector. But the vector, instead of containing expressed values, will contain either expressed values or references to expressed values. We call these two kinds of targets direct targets and indirect targets, respectively.

Indirect targets are ref-2 pointers.


Targets

(define-datatype target target?
  (direct-target
    (expval expval?))
  (indirect-target
    (ref ref-to-direct-target?)))


expval? & ref-to-direct-target?

(define expval?
  (lambda (val)
    (or (number? val) (procval? val))))

(define ref-to-direct-target?
  (lambda (val)
    (and (reference? val)
         (cases reference val
           (a-ref (pos vec)
             (cases target (vector-ref vec pos)
               (direct-target (v) #t)
               (indirect-target (v) #f)))))))


deref

(define primitive-deref deref)

(define deref
  (lambda (ref)
    (cases target (primitive-deref ref)
      (direct-target (expval) expval)
      (indirect-target (ref-ref)
        (cases target (primitive-deref ref-ref)
          (direct-target (expval) expval)
          (indirect-target (p)
            (eopl:error 'deref "Illegal reference: ~s" ref-ref)))))))

Ref-2 pointer


setref!

(define primitive-setref! setref!)

(define setref!
  (lambda (ref expval)
    (let ((target-ref
            (cases target (primitive-deref ref)
              (direct-target (aval) ref)          ; use new reference
              (indirect-target (aref) aref))))    ; use old reference
      (primitive-setref! target-ref (direct-target expval)))))
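The interplay of direct and indirect targets can be tried out without the EOPL datatype machinery. The sketch below is a simplified stand-in, not the course code: a reference is assumed to be a (vector . position) pair, targets are tagged lists, and every name in it (make-ref, raw-deref, raw-setref!, direct, indirect, deref*, setref!*, demo-targets) is hypothetical. It also assumes an error procedure in the (error who message) style accepted by Chez Scheme and Racket.

```scheme
;; Assumption: a reference is (vector . position).
(define (make-ref vec pos) (cons vec pos))
(define (raw-deref ref) (vector-ref (car ref) (cdr ref)))
(define (raw-setref! ref target) (vector-set! (car ref) (cdr ref) target))

;; Assumption: targets are tagged lists instead of define-datatype variants.
(define (direct val) (list 'direct val))
(define (indirect ref) (list 'indirect ref))
(define (direct? target) (eq? (car target) 'direct))
(define (payload target) (cadr target))

;; Follow at most one level of indirection, as deref does.
(define (deref* ref)
  (let ((t (raw-deref ref)))
    (if (direct? t)
        (payload t)
        (let ((t2 (raw-deref (payload t))))
          (if (direct? t2)
              (payload t2)
              (error 'deref* "Illegal reference"))))))

;; Assign through the indirection, so the caller's slot is updated.
(define (setref!* ref val)
  (let* ((t (raw-deref ref))
         (target-ref (if (direct? t) ref (payload t))))
    (raw-setref! target-ref (direct val))))

(define (demo-targets)
  (let* ((caller (vector (direct 7)))          ; caller frame: v = 7
         (v-ref (make-ref caller 0))
         (callee (vector (indirect v-ref)))    ; callee frame: y points to v
         (y-ref (make-ref callee 0)))
    (setref!* y-ref 13)                        ; y := 13
    (list (deref* v-ref) (deref* y-ref))))     ; both now read 13
```

Because the callee slot holds an indirect target, the assignment lands in the caller's vector, which is exactly the effect call-by-reference needs.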


Environments in call-by-reference

Pseudo code:

(proc (&t, &u, &v, &w)
  (proc (&a, &b)
    (proc (&x, &y, &z)
      y := 13;
      a b 6)
    3 v)
  5 6 7 8)

(x y z) #[ , , 6 ]

(a b) #[ 3, ]

(t u v w) #[ 5, 6, 7, 8 ]

Both b and y point to the location denoted by v.


Specification of call-by-reference

We add different parameter modes:

<formals> ::= "<params" {"<param" <formal-mode> "=" <identifier> "/>"}* "/>"

param-decls (modes ids)

<formal-mode> ::= "value" | "byref"

mode-value | mode-byref


Swap

<let
  <declarations
    <declaration <variable value = a /> <integer value = 3 /> />
    <declaration <variable value = b /> <integer value = 4 /> />
    <declaration
      <variable value = swap />
      <proc
        <params <param byref = x /> <param byref = y /> />
        <let
          <declarations
            <declaration <variable value = temp /> <reference value = x /> />
          />
          <sequence
            <set <variable value = x /> <reference value = y /> />
            <set <variable value = y /> <reference value = temp /> />
          />
        />
      />
    />
  />
  <sequence
    <invoke <reference value = swap /> <arguments <reference value = a /> <reference value = b /> /> />
    <sub <arguments <reference value = a /> <reference value = b /> /> />
  />
/>

1


Implementation of call-by-reference

In order to implement call-by-reference we have to analyze each place where sub-expressions are evaluated:
Primitive application expression
Let expression
Reclet expression
For expression
Application expression


Primitive Application Expression

We do not need to change the behavior of primitive application expressions, since we require them to yield a pure expressed value:

eval-expression:
  (primapp-exp (prim rands)
    (let ((args (eval-prim-rands rands env)))
      (apply-primitive prim args)))

(define eval-prim-rands
  (lambda (rands env)
    (map (lambda (x) (eval-prim-rand x env)) rands)))

(define eval-prim-rand
  (lambda (rand env)
    (eval-expression rand env)))


Let Expression

For let-bound variables, we choose to retain the call-by-value semantics:

eval-expression:
  (let-exp (ids rands body)
    (let ((args (eval-let-rands rands env)))
      (eval-expression body (extend-env ids args env))))

(define eval-let-rands
  (lambda (rands env)
    (map (lambda (x) (eval-let-rand x env)) rands)))

(define eval-let-rand
  (lambda (rand env)
    (direct-target (eval-expression rand env))))


Reclet Expression

For reclet-bound variables, we use the same approach:

…
(reclet-exp (decls body)
  (let* (;; build expressed values first
         (args (eval-rands (get-rands decls) env))
         ;; filter recursive procedure ids
         (rec-proc-ids
           (map car
                (filter (lambda (p) (procval? (cadr p)))
                        (zip (get-ids decls) args))))
         ;; now change closure to closure-rec
         (new-args
           (map (lambda (v)
                  (if (procval? v) (build-rec-proc v rec-proc-ids) v))
                args))
         ;; reclet: binder is direct-target
         (new-d-args (map (lambda (v) (direct-target v)) new-args)))
    (eval-expression body (extend-env (get-ids decls) new-d-args env))))
…


For Expression

(define eval-loop
  (lambda (loop-decls conditions increments body env)
    (let ((p (unzip loop-decls))
          (loop-test (primapp-exp (or-prim) (car conditions) (cdr conditions))))
      (let continue
        ;; loop variables denote direct targets
        ((new-env (extend-env (car p) (map direct-target (cadr p)) env))
         (res 0))
        (if (is-true? (eval-expression loop-test new-env))
            (let ((new-res (eval-expression body new-env))
                  (step-res (map (lambda (e) (eval-expression e new-env)) increments)))
              ;; next iteration, direct targets
              (continue (extend-env (car p) (map direct-target step-res) env) new-res))
            res)))))


Application Expression

For procedure applications we need to analyze the parameter modes:

…
(app-exp (rator rands)
  ;; check procval first in order to extract parameter modes
  (let ((proc (eval-expression rator env)))
    (if (procval? proc)
        ;; we need parameter modes here
        (let ((args (eval-proc-rands (get-param-modes (closure-params proc)) rands env)))
          ;; add calling-env to resolve occurring recursive procedures
          (apply-procval proc args env))
        (eopl:error 'eval-expression "Attempt to apply non-procedure ~s" proc))))
…


eval-proc-rands

(define eval-proc-rands
  (lambda (modes rands env)
    (if (= (length rands) (length modes))
        (map (lambda (p) (eval-proc-rand p env)) (zip modes rands))
        (eopl:error 'eval-proc-rands "Parameter mismatch"))))
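The helpers zip and unzip used by eval-proc-rands and eval-loop are not part of standard Scheme and are not shown on these slides. A minimal sketch of how they might be defined follows, under the assumption that zip pairs elements as two-element lists (which matches the way eval-proc-rand reads (car argument) and (cadr argument)) and that unzip is its inverse:

```scheme
;; Assumed helper: pair the i-th elements of two equal-length lists as (x y).
(define (zip xs ys)
  (map list xs ys))

;; Assumed helper: split a list of (x y) entries into (all-xs all-ys).
(define (unzip pairs)
  (list (map car pairs) (map cadr pairs)))

;; Example data: two parameter modes paired with two operand names.
(define zipped (zip '(mode-value mode-byref) '(a b)))
;; zipped is ((mode-value a) (mode-byref b))
```

With this reading, (car (unzip loop-decls)) gives the loop identifiers and (cadr (unzip loop-decls)) their initial values, as eval-loop expects.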


eval-proc-rand

(define eval-proc-rand
  (lambda (argument env)
    (cases pmode (car argument)
      (mode-value ()
        (direct-target (eval-expression (cadr argument) env)))
      (mode-byref ()
        (cases expression (cadr argument)
          (var-exp (id)
            ;; build new reference
            (indirect-target
              (let ((ref (apply-env-ref env id)))
                (cases target (primitive-deref ref)
                  (direct-target (val) ref)            ; return new reference
                  (indirect-target (a-ref) a-ref)))))  ; return old reference
          (else
            (direct-target (eval-expression (cadr argument) env))))))))


Mixing Parameter Modes

<let
  <declarations
    <declaration <variable value = a /> <integer value = 3 /> />
    <declaration <variable value = b /> <integer value = 4 /> />
    <declaration
      <variable value = swap />
      <proc
        <params <param value = x /> <param byref = y /> />
        <let
          <declarations
            <declaration <variable value = temp /> <reference value = x /> />
          />
          <sequence
            <set <variable value = x /> <reference value = y /> />
            <set <variable value = y /> <reference value = temp /> />
          />
        />
      />
    />
  />
  <sequence
    <invoke <reference value = swap /> <arguments <reference value = a /> <reference value = b /> /> />
    <sub <arguments <reference value = a /> <reference value = b /> /> />
  />
/>

0
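The result 0 follows because only the second parameter shares a location with the caller. A minimal sketch in plain Scheme, again assuming one-element vectors as locations (mixed-swap and demo-mixed are hypothetical names, not part of the interpreter):

```scheme
;; x is passed by value (a plain value), y by reference (a location).
(define (mixed-swap x-val y-loc)
  (let* ((x-loc (vector x-val))          ; by value: fresh location for x
         (temp (vector-ref x-loc 0)))    ; temp = 3
    (vector-set! x-loc 0 (vector-ref y-loc 0))  ; x := y, affects only the copy
    (vector-set! y-loc 0 temp)))                ; y := temp, updates the caller's b

(define (demo-mixed)
  (let ((a (vector 3)) (b (vector 4)))
    (mixed-swap (vector-ref a 0) b)
    (- (vector-ref a 0) (vector-ref b 0))))     ; a = 3, b = 3, so 0
```

After the call, a still holds 3 while b has been overwritten with 3, so the difference is 0.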


Object-Oriented Extensions

The object-oriented programming paradigm enables us to associate functions with data more directly.

To add objects to XMLScheme, we need to add a new form of value to the set of expressed values:

Expressed Value = Number + ProcVal + ObjectVal


Object Values

(define-datatype objectval objectval?
  (object
    (ivars environment?)
    (methods environment?)
    (image environment?)))


Object-Oriented Language Extensions

<expression> ::= "<class" {<i-vars>}* {<methods>}* "/>"
<expression> ::= "<send" "<message" "name" "=" <identifier> "/>" <expression> <arguments> "/>"

<i-vars> ::= "<variables" {<i-var>}* "/>"
<i-var> ::= "<instance" "<variable" "value" "=" <identifier> "/>" <expression> "/>"

<methods> ::= "<methods" {<method>}* "/>"
<method> ::= "<method" "<name" "value" "=" <identifier> "/>" <formals> <expression> "/>"


Example

<let
  <declarations
    <declaration
      <variable value = listobj />
      <invoke <reference value = List /> <arguments <integer value = 1 /> <integer value = 0 /> /> />
    />
  />
  <sequence
    <send <message name = consI /> <reference value = listobj /> <arguments <integer value = 3 /> <integer value = 0 /> /> />
    <send <message name = hd /> <reference value = listobj /> <arguments /> />
  />
/>


Evaluation of Classes

(class-exp (params i-decls m-decls)
  (let ((vtable (evaluate-methods m-decls env)))
    (closure-class params i-decls (car vtable) (cadr vtable) env)))

A class is encoded as a function.


Classes As Procedures

(define-datatype procval procval?
  …
  (closure-class
    (params pdeclarations?)
    (i-decls i-vars?)
    (mids (list-of symbol?))
    (m-decls (list-of target?))
    (env environment?)))


closured.scm

(closure-class (params i-decls mids mprocs env) …


Class List

<class
  <params <param value = car /> <param value = cdr /> />
  <instances
    <instance <variable value = head /> <reference value = car /> />
    <instance <variable value = tail /> <reference value = cdr /> />
  />
  <methods
    <method
      <name value = cons />
      <params <param value = car /> <param value = cdr /> />
      <sequence
        <set <variable value = head /> <reference value = car /> />
        <set <variable value = tail /> <reference value = cdr /> />
      />
    />
    <method <name value = hd /> <params /> <reference value = head /> />
    <method <name value = tl /> <params /> <reference value = tail /> />
    <method
      <name value = consI />
      <params <param value = car /> <param value = cdr /> />
      <send
        <message name = cons />
        <reference value = self />
        <arguments <reference value = car /> <reference value = cdr /> />
      />
    />
  />
/>


Evaluation of Methods

(send-exp (mid obj rands)
  (let (;; evaluate object
        (receiver (eval-expression obj env)))
    (if (objectval? receiver)
        (let* (;; find method
               (mproc (lookup-method receiver mid))
               ;; evaluate arguments
               (args (eval-proc-rands (get-param-modes (closure-params mproc)) rands env)))
          ;; call method using 'self' as calling env
          (apply-method mproc args (self receiver)))
        (eopl:error 'eval-expression "Receiver is not an object"))))


Method Lookup & Self

(define lookup-method
  (lambda (obj mid)
    (cases objectval obj
      (object (ivars methods image)
        (apply-env methods mid)))))

(define self
  (lambda (obj)
    (cases objectval obj
      (object (ivars methods image)
        image))))


apply-method

(define apply-method
  (lambda (mproc args object-env)
    (cases procval mproc
      (closure (params body env)
        (eval-expression body
          (extend-env (get-param-ids params) args (link-env object-env env))))
      (else (eopl:error 'apply-method "Illegal method call")))))

Add receiver object to calling environment of method.
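The idea that a class is encoded as a function, with methods resolved against the receiver, can be imitated directly in plain Scheme with closures. The sketch below is an analogy, not the interpreter's objectval representation: make-list-object plays the role of the class List, the inner procedure self plays the role of the receiver, and all names are hypothetical. It assumes an error procedure in the (error who message irritant) style.

```scheme
;; The "class": calling it creates an object with instance variables
;; head and tail, mirroring the params car and cdr of class List.
(define (make-list-object car-val cdr-val)
  (let ((head car-val)
        (tail cdr-val))
    ;; The object is a dispatch procedure; methods can send messages
    ;; to the receiver through the name self.
    (define (self msg . args)
      (cond ((eq? msg 'cons)
             (set! head (car args))
             (set! tail (cadr args)))
            ((eq? msg 'hd) head)
            ((eq? msg 'tl) tail)
            ((eq? msg 'consI)
             (self 'cons (car args) (cadr args)))   ; send cons to self
            (else (error 'make-list-object "Unknown message" msg))))
    self))

;; Mirrors the earlier example: create List(1, 0), send consI(3, 0), send hd.
(define (demo-list-object)
  (let ((listobj (make-list-object 1 0)))
    (listobj 'consI 3 0)
    (listobj 'hd)))
```

Here (demo-list-object) returns 3, because consI forwards to cons on the same receiver and thereby overwrites head.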


Type Systems

Overview
What is a Type?
Static vs. Dynamic Typing
Kinds of Typing
Polymorphic types
Overloading

References
Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
David Watt, “Programming Language Concepts and Paradigms”, Prentice Hall, 1990
Luca Cardelli and Peter Wegner, “On Understanding Types, Data Abstraction, and Polymorphism”, ACM Computing Surveys, 17(4), Dec. 1985, pp. 471-522


What Is a Type?

Type errors:
> (+ 5 `())
Error in +: () is not a number.
Type (debug) to enter the debugger.

A type is a set of values:
Integer = {…, -2, -1, 0, 1, 2, …}
Boolean = {True, False}
Point = { (x y) | x,y ∈ Integer }

A type is a partial specification of behavior:
n, m ∈ Integer ⇒ (+ n m) is valid, but (not n) is an error


Static Typing

Values have static types defined by the programming language.

A language is statically typed if it is always possible to determine the (static) type of an expression based on the program text alone.


Dynamic Typing

Variables and expressions have dynamic types determined by the values they assume at run-time.

A language is dynamically typed if only values have a fixed type. Variables and parameters may take on different types at run-time, and must be checked immediately before they are used.


Type Consistency

A language is strongly typed if it is possible to ensure that every expression is type consistent based on the program text alone.

Type consistency may be assured by compile-time type-checking, type inference, or dynamic type-checking.


Kinds of Types

All programming languages provide some set of built-in types.

Most strongly-typed modern languages provide for additional user-defined types:
Primitive types: Booleans, Integers, Floats, Chars, ...
Composite types: Functions, Lists, Tuples, ...
User-defined types: Enumerations, Recursive Types, Generic Types, ...

The Type Completeness Principle (Watt):
No operation should be arbitrarily restricted in the types of values involved.


Types in Scheme

Scheme is a dynamically typed language. However, no object satisfies more than one of the following predicates:

boolean?   pair?   symbol?
number?    char?   string?
vector?    port?   procedure?

These predicates define the types Boolean*, Pair, Symbol, Number, Character, String, Vector, Port, and Procedure.
* All values in Scheme count as true except ‘#f’.

The empty list is a special object of its own type; it satisfies none of the type predicates.
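Since the predicates are disjoint, a value's type can be named by testing them one after another. A minimal sketch, assuming the hypothetical name type-name; port? is left out because not every implementation provides it as a procedure, and the empty list gets its own case:

```scheme
;; Map a value to a symbol naming its (dynamic) type.
(define (type-name v)
  (cond ((boolean? v) 'boolean)
        ((pair? v) 'pair)
        ((symbol? v) 'symbol)
        ((number? v) 'number)
        ((char? v) 'char)
        ((string? v) 'string)
        ((vector? v) 'vector)
        ((procedure? v) 'procedure)
        ((null? v) 'empty-list)    ; satisfies none of the predicates above
        (else 'other)))
```

For example, (type-name 42) gives number, (type-name '(1 2)) gives pair, and (type-name '()) gives empty-list.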


A Language for Scheme Types

<Type> ::= bool | int | symbol | char | string
         | <Identifier>
         | ( <Type> )                    lists
         | ( {<Type>}*(+) )              tuples
         | ( {<Type>}*(*) -> <Type> )    functions

<Typed Expression> ::= ( <Expression> <Type> )

This type language is used solely to illustrate a possible approach to adding type assignments to Scheme.


Function Types

Function types allow one to deduce the types of expressions without the need to evaluate them:

(define increment (lambda ((x int)) (+ x 1)))

(+ ((int * int) -> int))
(1 int)
(increment (int -> int))
(42 int)
⇒ ((increment 42) int)

type binding


List and Tuple Types

List types:
A list of values of type “a” has the type “(a)”:
((1 2 3) (int))

Note: All elements in a list must have the same type!
(“Hello world!” 2 #f) is illegal: it cannot be typed!

Tuple types:
If the expressions x1, x2, …, xn have types a1, a2, …, an respectively, then the tuple (x1 x2 … xn) has type (a1 + a2 + … + an):

((1 (2) 3) (int + (int) + int))
((“Hello world!” #f) (string + bool))
(((1 2) (3 4)) ((int + int) + (int + int)))


Polymorphism

Languages like Pascal have monomorphic type systems: every constant, variable, parameter and function result has a unique type. Such languages, however, hinder the definition of generic abstractions, if they allow it at all.

Modern languages also incorporate (universally quantified) polymorphic types. Polymorphic type expressions describe families of types. For example, “(∀ a) (a)” is the family of types consisting of, for every type “a”, the type of lists of “a”.

Scheme also allows the definition of expressions that can be assigned universally quantified polymorphic types. In a type language, polymorphic types are represented by type variables.


Polymorphic Types

We can deduce the types of expressions using polymorphic functions by simply binding type variables to concrete types:

Consider:
(length ((a) -> int))
(string-length (string -> int))
(map ((a -> b) -> (a) -> (b)))

Then:
((map string-length) ((string) -> (int)))
((“Hello” “World”) (string))
((map string-length '("Hello" "World" "!")) (int))

The Scheme version of map does not allow this form!
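The curried type ((a -> b) -> (a) -> (b)) can still be realized in Scheme by wrapping the built-in map. A minimal sketch, where curried-map and lengths are hypothetical names introduced for illustration:

```scheme
;; Takes the function first and returns a procedure awaiting the list,
;; matching the type ((a -> b) -> (a) -> (b)).
(define (curried-map f)
  (lambda (lst) (map f lst)))

;; (curried-map string-length) has type ((string) -> (int)).
(define lengths ((curried-map string-length) '("Hello" "World" "!")))
;; lengths is (5 5 1)
```

The partial application (curried-map string-length) corresponds to the form (map string-length) above, which the built-in map rejects because it expects at least one list argument.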


Kinds of Polymorphism

Universal:
  Parametric: polymorphic map function in Scheme, “void *” in C, “Object” in Java
  Inclusion: subtyping (e.g., graphic objects)

Ad Hoc:
  Overloading: The operator + applies to both integers and floating point numbers.
  Coercion: Integer values can be used where floating point numbers are expected and vice versa.


Coercion or Overloading

How does one distinguish?

3 + 4
3.0 + 4
3 + 4.0
3.0 + 4.0
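Scheme offers one way to probe the four sums: its + accepts exact and inexact numbers alike, and a result is inexact as soon as one operand is inexact. The sketch below only observes this behavior; whether to describe it as overloading of + or as coercion of the exact operand remains the question posed above. The names sums and exactness are illustrative.

```scheme
;; The four sums in Scheme notation.
(define sums
  (list (+ 3 4) (+ 3.0 4) (+ 3 4.0) (+ 3.0 4.0)))

;; Only the first sum stays exact; mixing in an inexact operand
;; makes the result inexact.
(define exactness (map exact? sums))
;; exactness is (#t #f #f #f)
```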


The Typed Lambda Calculus

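There are many variants of the lambda calculus. The typed lambda calculus decorates terms with type annotations:

Syntax:
$e ::= x^{\tau} \mid e_1^{(\sigma \to \tau)}\, e_2^{\sigma} \mid (\lambda x^{\sigma} .\, e^{\tau})^{(\sigma \to \tau)}$

Operational Semantics:
α-conversion: $\lambda x^{\sigma} .\, e^{\tau} \leftrightarrow \lambda y^{\sigma} .\, [y^{\sigma}/x^{\sigma}]e^{\tau}$ where $y^{\sigma}$ is fresh (in $e^{\tau}$)
β-reduction: $(\lambda x^{\sigma} .\, e_1^{\tau})\, e_2^{\sigma} \to [e_2^{\sigma}/x^{\sigma}]e_1^{\tau}$ avoiding name capture
η-reduction: $\lambda x^{\sigma} .\, (e_1^{\tau}\, x^{\sigma}) \to e_1^{\tau}$ if $x^{\sigma}$ is not free in $e_1^{\tau}$

Examples:
$T \equiv (\lambda x^{\sigma} .\, (\lambda y^{\tau} .\, x^{\sigma})^{(\tau \to \sigma)})^{(\sigma \to (\tau \to \sigma))}$
$F \equiv (\lambda x^{\sigma} .\, (\lambda y^{\tau} .\, y^{\tau})^{(\tau \to \tau)})^{(\sigma \to (\tau \to \tau))}$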


A Type System

A type assumption is a partial function Γ : V → T with a finite domain, where V is the set of variables and T the set of types.

A type assertion or type judgement is a triple (Γ, e, τ), where Γ is a type assumption, e is a lambda-term (either typed or untyped, depending on the context), and τ is a type. The domain of Γ is exactly the set of free variables of e (fv(e)).

A type system for the lambda-calculus:

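\[
\frac{\Gamma \vdash e_1 : (\tau_1 \to \tau_2) \qquad \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash (e_1\ e_2) : \tau_2}
\]

\[
\frac{\Gamma;\, x : \tau_1 \vdash e : \tau_2}{\Gamma \vdash (\lambda x : \tau_1 .\, e) : (\tau_1 \to \tau_2)}
\]

\[
\Gamma;\, x : \tau \vdash x : \tau
\]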


Find Type of an Expression

(define type-check
  (lambda (gamma e)
    (if (is-expression? e)
        (cond
          ((is-constant? e) 'int)
          ((is-variable? e) (type-of-variable gamma e))
          ((is-abstraction? e) … )
          ((is-application? e) … ))
        (eopl:error 'type-check "Argument (~s) is not an expression" e))))

Rule for variables: $\Gamma;\, x : \tau \vdash x : \tau$

We add Integer constants.


Type of a Variable: t = Γ(x)

(define type-of-variable
  (lambda (gamma id)
    (if (null? gamma)
        (eopl:error 'type-of-variable "Undefined identifier ~s" id)
        (if (equal? (caar gamma) id)
            (cadar gamma)
            (type-of-variable (cdr gamma) id)))))


Built-in Functions

(define gamma-zero
  '((add (int -> (int -> int)))
    (sub (int -> (int -> int)))
    (mul (int -> (int -> int)))
    (div (int -> (int -> int)))))


Abstraction

…
((is-abstraction? e)
 (let* ((formal-type (get-formal-type e))
        (type-of-e1 (type-check (cons (list (caadr e) formal-type) gamma)
                                (caddr e))))
   (list formal-type '-> type-of-e1)))
…

\[
\frac{\Gamma;\, x : \tau_1 \vdash e : \tau_2}{\Gamma \vdash (\lambda x : \tau_1 .\, e) : (\tau_1 \to \tau_2)}
\]


Application

…
((is-application? e)
 (let ((type-of-e1 (type-check gamma (car e)))
       (type-of-e2 (type-check gamma (cadr e))))
   (if (is-function-type? type-of-e1)
       (if (equal? (get-argument-type type-of-e1) type-of-e2)
           (get-result-type type-of-e1)
           (eopl:error 'type-check "Wrong argument type ~s" type-of-e2))
       (eopl:error 'type-check "~s is not a function" (car e)))))
…

\[
\frac{\Gamma \vdash e_1 : (\tau_1 \to \tau_2) \qquad \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash (e_1\ e_2) : \tau_2}
\]


Examples

Success:
> (type-check gamma-zero '((add 1) 2))
int
> (type-check gamma-zero '(add 1))
(int -> int)
> (define t '(lambda (x (int -> (int -> int)))
               (lambda (y (int -> int))
                 (lambda (z int)
                   ((x z) (y z))))))
> (type-check '() t)
((int -> (int -> int)) -> ((int -> int) -> (int -> int)))

gamma-zero:
(add (int -> (int -> int)))
(sub (int -> (int -> int)))
(mul (int -> (int -> int)))
(div (int -> (int -> int)))


Examples

Type errors:
> (type-check gamma-zero '(1 2))
Error reported by type-check:
1 is not a function
> (type-check gamma-zero '(add div))
Error reported by type-check:
Wrong argument type (int -> (int -> int))

gamma-zero:
(add (int -> (int -> int)))
(sub (int -> (int -> int)))
(mul (int -> (int -> int)))
(div (int -> (int -> int)))
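The slides show type-check only in fragments and leave helpers such as is-constant? and get-formal-type undefined. The following self-contained sketch fills them in with simple assumptions so the examples can be tried: integers are the only constants, abstractions have the form (lambda (x type) body), and applications are two-element lists. The names check, lookup-type, function-type?, abstraction?, base-env, and term-t are hypothetical, and errors use an assumed (error who message irritant) procedure instead of eopl:error.

```scheme
;; Gamma is an association list of (identifier type) entries.
(define (lookup-type gamma id)
  (cond ((null? gamma) (error 'lookup-type "Undefined identifier" id))
        ((equal? (caar gamma) id) (cadar gamma))
        (else (lookup-type (cdr gamma) id))))

;; A function type is written (argument -> result).
(define (function-type? t)
  (and (pair? t) (= (length t) 3) (eq? (cadr t) '->)))

(define (abstraction? e)
  (and (pair? e) (eq? (car e) 'lambda)))

(define (check gamma e)
  (cond ((integer? e) 'int)                         ; constants
        ((symbol? e) (lookup-type gamma e))         ; variables
        ((abstraction? e)                           ; (lambda (x type) body)
         (let* ((formal (cadr e))
                (formal-type (cadr formal))
                (body-type (check (cons (list (car formal) formal-type) gamma)
                                  (caddr e))))
           (list formal-type '-> body-type)))
        ((and (pair? e) (= (length e) 2))           ; (e1 e2)
         (let ((t1 (check gamma (car e)))
               (t2 (check gamma (cadr e))))
           (cond ((not (function-type? t1))
                  (error 'check "Not a function" (car e)))
                 ((equal? (car t1) t2) (caddr t1))
                 (else (error 'check "Wrong argument type" t2)))))
        (else (error 'check "Not an expression" e))))

;; Same content as gamma-zero on the slides.
(define base-env
  '((add (int -> (int -> int)))
    (sub (int -> (int -> int)))
    (mul (int -> (int -> int)))
    (div (int -> (int -> int)))))

;; The term t from the success examples.
(define term-t
  '(lambda (x (int -> (int -> int)))
     (lambda (y (int -> int))
       (lambda (z int)
         ((x z) (y z))))))
```

With these definitions, (check base-env '((add 1) 2)) gives int and (check '() term-t) gives the same type as reported for t above, while the two ill-typed inputs signal an error.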


The Polymorphic Lambda Calculus

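Polymorphic functions like “map” cannot be typed in the typed lambda calculus!

We need type variables to capture polymorphism:

β-reduction (ii): $(\lambda x^{A} .\, e_1^{\tau})\, e_2^{\sigma} \to [\sigma/A]([e_2^{\sigma}/x^{A}]e_1^{\tau})$

Example:
$T \equiv (\lambda x^{A} .\, (\lambda y^{B} .\, x^{A})^{(B \to A)})^{(A \to (B \to A))}$

$T^{(A \to (B \to A))}\, a^{\tau}\, b^{\sigma} \to (\lambda y^{B} .\, a^{\tau})^{(B \to \tau)}\, b^{\sigma} \to a^{\tau}$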


Polymorphism and Self Application

Even the polymorphic lambda calculus is not powerful enough to express certain lambda terms.

Recall both Ω and the Y combinator, which make use of “self application”:

Ω ≡ (λx . x x) (λx . x x)
Y ≡ λf . (λx . f (x x)) (λx . f (x x))

What type annotation would you assign to these expressions? Are these terms typable at all?


Type Inference

Overview:
The type inference problem
Typed lambda terms, type assertions, and typing rules
Wand’s algorithm
Unification of type equations

References:
Mitchell Wand, “A Simple Algorithm and Proof for Type Inference”, Fundamenta Informaticae, 10:115-122, 1987
J. Roger Hindley, “Basic Simple Type Theory”, Cambridge University Press, 1997
John C. Mitchell, “Foundations for Programming Languages”, MIT Press, 1996


The Type Inference Problem

The type inference problem can be stated as follows:

“Given a term of the untyped lambda calculus, find all terms of the typed lambda calculus, which yield the given term when the type information on bound variables is deleted.”

Since such terms can differ only in their types, this problem is sometimes referred to as finding the “possible typings” of a term.

This problem was first formulated and solved by Curry (in the 1930’s) and Hindley (1969).

Milner (1978) was the first to make the connection with the unification problem formulated by Robinson (1965).


Typed Lambda-Terms

The set e of untyped lambda-terms is defined as follows:
e ::= x | (λx . e) | (e1 e2)

The set T of types is defined as follows:
t ::= K            - basic types
    | (t1 → t2)    - function types

The set eT of typed lambda-terms is obtained by modifying the second clause in the definition of the untyped lambda-terms:

(λx : t . e)


Type Inference for Closed Terms

The type inference problem for closed terms can be stated as follows:

Given a closed lambda-term e, find all types t such that (∅, e, t) holds.

The set of type expressions is defined by adding type variables, written τ, to the set t of types.

Theorem:
Given a closed lambda-term e, it is decidable whether there exists a type t such that (∅, e, t) holds.
If there is any such t, then there is a type expression u such that the typings of e are precisely the types of the form σu for all substitutions σ.


Wand’s Algorithm - Skeleton

Input:
A lambda-term e0.

Initialization:
Set E = ∅ and G = {(Γ0, e0, t0)}, where t0 is a type variable and Γ0 maps the free variables of e0 to other distinct type variables.

Loop Step:
If G = ∅, then halt and return E. Otherwise, choose a subgoal (Γ, e, t) from G, delete it from G, and add to E and G new verification conditions and subgoals, as specified in an action table.

End of Skeleton


The Action Table

Case (Γ, x, t):
Generate the equation t = Γ(x).

Case (Γ, (λx . e), t):
Let τ1 and τ2 be fresh type variables. Generate the equation t = (τ1 → τ2) and the subgoal (Γ; x : τ1, e, τ2).

Case (Γ, (e1 e2), t):
Let τ1 be a fresh type variable. Then generate the subgoals (Γ, e1, τ1 → t) and (Γ, e2, τ1).
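The three cases of the action table translate almost directly into a recursive procedure. The sketch below makes several assumptions that are not on the slides: terms are written as symbols, (lambda (x) body), or (e1 e2); type variables are represented by integers; function types are lists (t1 -> t2); and an equation is a pair (lhs . rhs). The subgoal set G is handled implicitly by recursion, processing the operator before the operand. All names (generate, constraints-for, fresh-type-var!) are hypothetical, and only closed terms are supported.

```scheme
;; Fresh type variables are consecutive integers.
(define type-var-counter 0)
(define (fresh-type-var!)
  (set! type-var-counter (+ type-var-counter 1))
  type-var-counter)

;; gamma is an association list of (identifier . type).
;; Returns the list of equations generated for the goal (gamma, e, t).
(define (generate gamma e t)
  (cond ((symbol? e)                                  ; t = gamma(x)
         (list (cons t (cdr (assq e gamma)))))
        ((eq? (car e) 'lambda)                        ; t = (t1 -> t2)
         (let* ((x (car (cadr e)))
                (t1 (fresh-type-var!))
                (t2 (fresh-type-var!)))
           (cons (cons t (list t1 '-> t2))
                 (generate (cons (cons x t1) gamma) (caddr e) t2))))
        (else                                         ; application (e1 e2)
         (let* ((t1 (fresh-type-var!))
                (eqs1 (generate gamma (car e) (list t1 '-> t)))
                (eqs2 (generate gamma (cadr e) t1)))
           (append eqs1 eqs2)))))

;; Start from the goal (empty, e, 0) with a reset counter.
(define (constraints-for e)
  (set! type-var-counter 0)
  (generate '() e 0))
```

For the term (lambda (x) (lambda (y) (lambda (z) ((x z) (y z))))), constraints-for produces seven equations: three of the form t = (t1 -> t2) for the abstractions and four variable equations for the occurrences of x, z, y, and z.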


Example

Consider (λx . (λy . (λz . ((x z) (y z)) ) ) )

{(∅, (λx . (λy . (λz . ((x z) (y z))))), τ0)}
{((x : τ1), (λy . (λz . ((x z) (y z)))), τ2)}; τ0 = (τ1 → τ2)
{((x : τ1, y : τ3), (λz . ((x z) (y z))), τ4)}; τ2 = (τ3 → τ4)
{((x : τ1, y : τ3, z : τ5), ((x z) (y z)), τ6)}; τ4 = (τ5 → τ6)
{((x : τ1, y : τ3, z : τ5), (x z), (τ7 → τ6)), ((x : τ1, y : τ3, z : τ5), (y z), τ7)}
{((x : τ1, y : τ3, z : τ5), x, (τ8 → (τ7 → τ6))), ((x : τ1, y : τ3, z : τ5), z, τ8), ((x : τ1, y : τ3, z : τ5), (y z), τ7)}
{((x : τ1, y : τ3, z : τ5), z, τ8), ((x : τ1, y : τ3, z : τ5), (y z), τ7)}; (τ8 → (τ7 → τ6)) = τ1
{((x : τ1, y : τ3, z : τ5), (y z), τ7)}; τ8 = τ5
{((x : τ1, y : τ3, z : τ5), y, (τ9 → τ7)), ((x : τ1, y : τ3, z : τ5), z, τ9)}
{((x : τ1, y : τ3, z : τ5), z, τ9)}; (τ9 → τ7) = τ3
∅; τ9 = τ5


The Equation Set

The generated equations are:
τ0 = (τ1 → τ2)
τ2 = (τ3 → τ4)
τ4 = (τ5 → τ6)
(τ8 → (τ7 → τ6)) = τ1
τ8 = τ5
(τ9 → τ7) = τ3
τ9 = τ5

Solving these equations by the unification algorithm gives the solution:

τ0 = ((τ5 → (τ7 → τ6)) → ((τ5 → τ7) → (τ5 → τ6)))

which is the so-called principal type of the term λx . λy . λz . x z (y z).


The Algorithm unify

unify(∅) = ∅

unify(E ∪ {K1 = K2}) =
  if K1 ≠ K2 then fail
  else unify(E)

unify(E ∪ {τ = t}) =
  if τ ≡ t then unify(E)
  else if τ occurs in t then fail
  else unify([t/τ]E) ∘ [t/τ]

unify(E ∪ {t = τ}) = unify(E ∪ {τ = t})

unify(E ∪ {(t1 → t2) = (t3 → t4)}) = unify(E ∪ {t1 = t3} ∪ {t2 = t4})
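The five clauses above can be turned into a small Scheme procedure. This is a sketch under assumptions that are not on the slides: type variables are integers, basic types are symbols, function types are lists (t1 -> t2), equations are pairs (lhs . rhs), a substitution is an association list from variables to types, and failure is signalled by returning #f. The names solve-equations, occurs?, subst-type, apply-subst, and s-combinator-equations are hypothetical.

```scheme
(define (tvar? t) (number? t))
(define (arrow? t) (pair? t))
(define (arrow a b) (list a '-> b))
(define (arrow-from t) (car t))
(define (arrow-to t) (caddr t))

;; Does type variable v occur in type t?
(define (occurs? v t)
  (cond ((tvar? t) (= v t))
        ((arrow? t) (or (occurs? v (arrow-from t)) (occurs? v (arrow-to t))))
        (else #f)))

;; [s/v]t: replace variable v by type s inside t.
(define (subst-type v s t)
  (cond ((tvar? t) (if (= v t) s t))
        ((arrow? t) (arrow (subst-type v s (arrow-from t))
                           (subst-type v s (arrow-to t))))
        (else t)))

;; Apply a whole substitution (association list) to a type.
(define (apply-subst subst t)
  (cond ((tvar? t) (let ((binding (assv t subst)))
                     (if binding (cdr binding) t)))
        ((arrow? t) (arrow (apply-subst subst (arrow-from t))
                           (apply-subst subst (arrow-to t))))
        (else t)))

(define (solve-equations eqs)
  (if (null? eqs)
      '()                                            ; unify(empty) = empty
      (let ((l (caar eqs)) (r (cdar eqs)) (rest (cdr eqs)))
        (cond ((equal? l r) (solve-equations rest))  ; identical sides
              ((tvar? l)
               (if (occurs? l r)
                   #f                                ; occurs check fails
                   (let ((s (solve-equations
                              (map (lambda (eq)
                                     (cons (subst-type l r (car eq))
                                           (subst-type l r (cdr eq))))
                                   rest))))
                     ;; compose: solution of the rest, plus l bound to r
                     (and s (cons (cons l (apply-subst s r)) s)))))
              ((tvar? r) (solve-equations (cons (cons r l) rest)))
              ((and (arrow? l) (arrow? r))
               (solve-equations
                 (cons (cons (arrow-from l) (arrow-from r))
                       (cons (cons (arrow-to l) (arrow-to r)) rest))))
              (else #f)))))                          ; clash, e.g. K1 with K2

;; The seven equations from "The Equation Set", with tau-i written as i.
(define s-combinator-equations
  (list (cons 0 '(1 -> 2))
        (cons 2 '(3 -> 4))
        (cons 4 '(5 -> 6))
        (cons '(8 -> (7 -> 6)) 1)
        (cons 8 5)
        (cons '(9 -> 7) 3)
        (cons 9 5)))
```

Looking up variable 0 in (solve-equations s-combinator-equations) gives ((5 -> (7 -> 6)) -> ((5 -> 7) -> (5 -> 6))), the principal type stated earlier. An equation such as 1 = (1 -> 2), which arises from the self application (x x), fails the occurs check and yields #f.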


The Algorithm PT

PT(x) = {x : τ} ∴ x : τ

PT(e1 e2) =
  let Γ ∴ e1’ : τ = PT(e1)
      Γ’ ∴ e2’ : σ = PT(e2)
      S = unify({α = β | x : α ∈ Γ and x : β ∈ Γ’} ∪ {τ = (σ → ρ)})
        where ρ is a fresh type variable
  in SΓ ∪ SΓ’ ∴ S(e1’ e2’) : Sρ

PT(λx . e) =
  let Γ ∴ e’ : ρ = PT(e)
  in if x : τ ∈ Γ
     then Γ - {x : τ} ∴ λx : τ. e’ : (τ → ρ)
     else Γ ∴ λx : σ. e’ : (σ → ρ)
       where σ is a fresh type variable

S: set of substitutions
