Com S 342
Principles of Programming Languages
Spring 2007
Lecturer: Dr. Markus Lumpe
Department of Computer Science
113 Atanasoff Hall
http://www.cs.iastate.edu/~cs342
TR 2-3:30, Gilman 1104
W9 (1), Gilman 0312
W4 (2), Gilman 0312
TAs: N/A
Grading: problem sets, 2 tests, final exam
Assignments: on a weekly basis
Overview
Tentative course program:
Introduction – basic concepts
The algorithmic programming language Scheme
Inductive sets of data
The lambda-calculus – the core language of sequential programming
Recursion and fixed points
Type reconstruction algorithms
Data abstraction, representation strategies for data types
Interpreters
Types, type checking, and type inference
Objects and classes (optional)
Continuation-passing style (optional)
The Algorithmic Language Scheme
Overview
DrScheme, Chez Scheme, SLIB, and EOPL2-extensions
Features of Scheme
Expressions
Data types
Procedures
References
Revised^5 Report on the Algorithmic Language Scheme.
Daniel P. Friedman and Matthias Felleisen, “The Little Schemer”, Fourth Edition, MIT Press, 1996
Harold Abelson et al., “Structure and Interpretation of Computer Programs”, MIT Press, 1996
Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
Chez Scheme, SLIB, and EOPL2-extensions
DrScheme (or Chez Scheme) is a complete high-performance implementation of ANSI/IEEE standard Scheme.
SLIB is a portable Scheme library meant to provide compatibility and utility functions for all standard Scheme implementations.
Running Chez Scheme (Com S 342):
SCHEME_LIBRARY_PATH: the SLIB library directory.
CS342LIB: the EOPL2-extensions path.
On every platform, use the command “scheme” to start an interactive Scheme session.
The file “chez.prelude” defines everything you need to use Chez Scheme in Com S 342. This file is loaded after “chez.init”, the SLIB initialization file.
DrScheme
CS342 Chez Scheme
Scheme
Scheme is a statically scoped programming language.
Each use of a variable is associated with a lexically apparent binding of that variable.
Scheme is a dynamically typed language.
Types are associated with values rather than with variables. Scheme has latent as opposed to manifest types.
All objects created in the course of a Scheme computation have unlimited extent.
No Scheme object is ever destroyed. However, if an object becomes inaccessible, its storage can be reclaimed by the garbage collector.
Procedures in Scheme are first-class.
Procedures can be created dynamically, stored in data structures, returned as results of procedures, etc.
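The following minimal sketch illustrates static scoping and first-class procedures together. The names make-adder, add3, and procs are hypothetical and chosen only for this illustration.

```scheme
; make-adder returns a new procedure each time it is called.
; The inner lambda refers to n, whose binding is found in the
; lexically enclosing lambda (static scoping).
(define make-adder
  (lambda (n)
    (lambda (x) (+ x n))))

(define add3 (make-adder 3))                 ; a procedure created at run time
(define procs (list add3 (make-adder 10)))   ; procedures stored in a list

(add3 4)            ; => 7
((cadr procs) 4)    ; => 14
```

Because extent is unlimited, the binding of n survives after make-adder has returned, for as long as the returned procedure is accessible.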
General Facts and Conventions
Scheme employs a fully parenthesized prefix notation for programs and data.
(define fact
  (lambda (n)
    (if (= n 0)
        1
        (* n (fact (- n 1))))))
Names of procedures that always return a boolean value usually end with “?”. These procedures are called predicates.
Names of procedures that store values into previously allocated locations usually end with “!”.
“->” appears within the names of procedures that take an object of one type and return an analogous object of another type.
Upper and lower case forms of a letter are never distinguished except within characters and string constants. For example, ‘Foo’ is the same identifier as ‘FOO’, and ‘#x1AB’ is the same number as ‘#X1ab’.
Identifiers
Most identifiers allowed by other programming languages are also acceptable to Scheme.
Identifiers begin with a letter or a special character, followed by a sequence of letters, special characters, or digits.
Lexical structure:
<identifier> ::= <initial> <subsequent>* | <peculiar identifier>
<initial> ::= <letter> | <special initial>
<special initial> ::= ‘!’ | ‘$’ | ‘%’ | ‘&’ | ‘*’ | ‘/’ | ‘:’ | ‘<’ | ‘=’ | ‘>’ | ‘?’ | ‘^’ | ‘_’ | ‘~’
<subsequent> ::= <initial> | <digit> | <special subsequent>
<special subsequent> ::= ‘+’ | ‘-’ | ‘.’ | ‘@’
<peculiar identifier> ::= ‘+’ | ‘-’ | ‘...’
Examples of Identifiers
lambda   q   list->vector   soup   +   V17a   <=?   a34kTMNs   the-word-recursion-has-many-meanings
Any identifier may be used as a variable or as a syntactic keyword.
When an identifier appears as a literal or within a literal, it is being used to denote a symbol; that is, equality (eqv?) is defined for identifiers.
Variables, Syntactic Keywords, and Regions
An identifier may name a type of syntax, or it may name a location where a value can be stored.
An identifier that names a location is called a variable and is said to be bound to that location.
An identifier that names a type of syntax is called a syntactic keyword and is said to be bound to that syntax.
Like C/C++, Pascal, and Java, Scheme is a statically scoped language with block structure.
To each place where an identifier is bound in a program there corresponds a region of the program text within which the binding is visible. The region is determined by the particular binding construct that establishes the binding.
Type Predicates
No object satisfies more than one of the following predicates:
boolean?   pair?
symbol?    number?
char?      string?
vector?    port?
procedure?
These predicates define the types boolean*, pair, symbol, number, character, string, vector, port, and procedure.
* All values in Scheme count as true except ‘#f’.
Note: the empty list is a special object of its own type; it satisfies none of the type predicates.
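As a small sketch of how the disjoint type predicates can be used, the hypothetical helper classify below maps a value to a symbol naming its type. The helper and its result symbols are assumptions made for illustration; port? is omitted to keep the example short, so ports fall into the else clause.

```scheme
; classify is a hypothetical helper, not a standard procedure.
; Since the predicates are disjoint, at most one of the first
; eight clauses can succeed for any given object.
(define classify
  (lambda (obj)
    (cond ((boolean? obj)   'boolean)
          ((pair? obj)      'pair)
          ((symbol? obj)    'symbol)
          ((number? obj)    'number)
          ((char? obj)      'character)
          ((string? obj)    'string)
          ((vector? obj)    'vector)
          ((procedure? obj) 'procedure)
          ((null? obj)      'empty-list)   ; satisfies none of the above
          (else             'other))))

(classify '(1 2))   ; => pair
(classify '())      ; => empty-list
(classify car)      ; => procedure
(classify #f)       ; => boolean
```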
Variable References
An expression consisting of a variable is a variable reference. The value of the variable reference is the value stored in the location to which the variable is bound.
Syntax: <variable>
Example:
> (define x 28)
> x
28
Literal Expressions
A literal expression is either a quoted external representation of a Scheme object or a constant that evaluates “to itself”.
Syntax: (quote <datum>) | '<datum> | <constant>
Examples:
> (quote a)          ; evaluates to the datum a
a
> (quote (+ 1 2))    ; evaluates to (+ 1 2)
(+ 1 2)
> '(+ 1 2)           ; abbreviation of (quote (+ 1 2))
(+ 1 2)
> "string"           ; evaluates to itself, it does not need to be quoted
"string"
Procedure Calls
A procedure call is written by simply enclosing in parentheses expressions for the procedure to be called and the arguments to be passed to it.
Syntax: ( <operator> {<operand>}* )
Examples:
> (+ 3 4)
7
> ((if #f + *) 3 4)
12
Scheme uses the call-by-value parameter passing mechanism; every argument is evaluated, including the expression that denotes the procedure, before the procedure is called.
Load a Script and Exit
To load a script, the Scheme system provides the built-in procedure “load”.
Syntax: (load <filename-string>)
Example:
> (load "test.ss")
> (load "fib.scm")
To leave the Scheme interpreter, call “exit”.
Syntax: (exit)
Procedures
Procedures are parameterized abstractions over expressions.
Syntax: (lambda <formals> <body>)
<formals> should be a formal argument list.
<formals> ::= ( {<variable>}* )
            | <variable>
            | ( <variable1> ... <variablen> . <variablen+1> )
<body> should be a sequence of one or more expressions.
Example:
> ((lambda (x) (+ x x)) 4)
8
> ((lambda x x) 3 4 5 6)
(3 4 5 6)
> ((lambda (x y . z) z) 3 4 5 6)
(5 6)
Definitions
Definitions are used to bind names to the values of expressions.
Syntax: (define <variable> <expression>)
Examples:
> (define add3
    (lambda (x) (+ x 3)))
> (add3 4)
7
> (define first car)
> (first '(1 2))
1
Assignment
The assignment expression is used to store the value of an expression in a location, which is bound to a variable.
Syntax: (set! <variable> <expression>)
Example:
> (define x 2)
> (+ x 1)
3
> (set! x 4)
> (+ x 1)
5
Note: <variable> must be bound either in some enclosing region (as in the example) or at top level (in the program).
Conditionals
(if <test> <consequent> <alternative>)
(if <test> <consequent>)
(cond {<clause>}+ )
<clause> ::= (<test> {<expression>}* )
           | (else {<expression>}+)
(case <key> {<case-clause>}+ )
<case-clause> ::= (({<datum>}+) {<expression>}* )
                | (else {<expression>}+)
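The three conditional forms above can be sketched as follows. The procedure names abs-value, sign, and vowel? are hypothetical and used only for illustration.

```scheme
; if: two-way choice
(define abs-value
  (lambda (n)
    (if (< n 0) (- n) n)))

; cond: clauses are tried in order; else catches the rest
(define sign
  (lambda (n)
    (cond ((< n 0) 'negative)
          ((= n 0) 'zero)
          (else    'positive))))

; case: the key is compared (with eqv?) against the data of each clause
(define vowel?
  (lambda (ch)
    (case ch
      ((#\a #\e #\i #\o #\u) #t)
      (else #f))))

(abs-value -3)   ; => 3
(sign -5)        ; => negative
(sign 0)         ; => zero
(vowel? #\e)     ; => #t
(vowel? #\z)     ; => #f
```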
Other Expressions
Sequencing: (begin {<expression>}+ )
Iteration: (do ( {( <variable> <init> <step> )}+ )
               (<test> {<expression>}* )
             {<command>}+ )
Binding: (let ({<binding>}+) <body>)
<binding> ::= (<variable> <init>)
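A minimal sketch combining the three forms: let introduces a local accumulator, do iterates, and begin sequences side effects. The names sum-to and counter are hypothetical.

```scheme
; sum-to adds the integers 1..n using let, do, and set!.
(define sum-to
  (lambda (n)
    (let ((acc 0))                   ; binding: acc is local to the let body
      (do ((i 1 (+ i 1)))            ; iteration: i starts at 1, steps by 1
          ((> i n) acc)              ; when the test holds, return acc
        (set! acc (+ acc i))))))     ; command executed on each iteration

(sum-to 4)   ; => 10

; begin evaluates its expressions in order and returns the last value.
(define counter 0)
(begin
  (set! counter (+ counter 1))
  (set! counter (* counter 10))
  counter)   ; => 10
```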
A Small Example – Bubble Sort
(define bubble-sort                   ; procedure bubble-sort
  (lambda (lon)                       ; procedure with one argument
    (if (null? lon)                   ; test whether list is empty
        '()                           ; if list is empty, we are done
        ; bind variable lon-with-max-elem-left
        (let ((lon-with-max-elem-left (reverse (put-max-last lon))))
          ; build sorted list using recursion
          (cons (car lon-with-max-elem-left)   ; select head of lon-with-max-elem-left
                ; apply bubble-sort to the tail of lon-with-max-elem-left
                (bubble-sort (cdr lon-with-max-elem-left))))
    ) ; end if
  ) ; end lambda
) ; end define
Procedure put-max-last
(define put-max-last                  ; define procedure name "put-max-last"
  (lambda (lon)                       ; define procedure
    (if (null? lon)                   ; test whether lon is empty
        '()                           ; if list is empty, we are done
        (if (null? (cdr lon))         ; test whether lon is a singleton
            lon                       ; if yes, we are done
            (if (< (car lon) (cadr lon))   ; left < right?
                (cons (car lon) (put-max-last (cdr lon)))   ; true
                (cons (cadr lon)
                      (put-max-last (cons (car lon) (cddr lon))))   ; false
            ) ; end if
        ) ; end if
    ) ; end if
  ) ; end lambda
) ; end define
Trace of bubble-sort
> (trace bubble-sort)
(bubble-sort)
> (bubble-sort '(3 6 2 8 9 8))
|(bubble-sort (3 6 2 8 9 8))
| (bubble-sort (8 8 6 2 3))
| |(bubble-sort (3 2 6 8))
| | (bubble-sort (6 3 2))
| | |(bubble-sort (2 3))
| | | (bubble-sort (2))
| | | |(bubble-sort ())
| | | |()
| | | (2)
| | |(3 2)
| | (6 3 2)
| |(8 6 3 2)
| (8 8 6 3 2)
|(9 8 8 6 3 2)
(9 8 8 6 3 2)
Lists and Pairs
A pair (sometimes called a dotted pair) is a record structure with two fields called the car and cdr fields.
Pairs are created by the procedure cons.
The car and the cdr fields are accessed by the procedures car and cdr.
Pairs are used primarily to represent lists. A list can be defined recursively as either the empty list or a pair whose cdr is a list.
Example: (a b c d e) and (a . (b . (c . (d . (e . ()))))) are equivalent notations for lists of symbols.
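The equivalence of the two notations can be checked directly. The sketch below builds a list from pairs and compares it with the list notation; the variable name abc is hypothetical.

```scheme
; Build the list (a b c) explicitly from three pairs.
(define abc (cons 'a (cons 'b (cons 'c '()))))

(equal? abc '(a b c))                     ; => #t
(equal? '(a . (b . (c . ()))) '(a b c))   ; => #t, same structure
(car abc)                                 ; => a
(cdr abc)                                 ; => (b c)

; A pair whose cdr is not a list is a pair but not a list.
(pair? (cons 1 2))                        ; => #t
(list? (cons 1 2))                        ; => #f
```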
Check for Pairs and Lists
(pair? obj)
pair? returns #t if obj is a pair, otherwise returns #f.
(pair? '(a . b)) => #t
(pair? '(a b c)) => #t ; a list is a pair: (a . (b . (c . ())))
(pair? '()) => #f ; the empty list is not a pair
(null? obj)
null? returns #t if obj is the empty list, otherwise returns #f.
(list? obj)
list? returns #t if obj is a list, otherwise returns #f.
list? & null?
Given a list, the procedure reverse returns a newly allocated list consisting of the elements of a-list in reverse order:
(define reverse                              ; procedure definition
  (lambda (a-list)
    (if (list? a-list)                       ; is a-list indeed a list?
        (if (null? a-list)                   ; test for ()
            '()
            (append (reverse (cdr a-list))   ; build new list
                    (list (car a-list)))
        ) ; end if
    ) ; end if
  ) ; end lambda
) ; end define
car, cdr
(car pair)
Returns the contents of the car field of pair.
(car '(a b c)) => a
(car '((a) b c)) => (a)
(car '()) => error, () has no car field
(cdr pair)
Returns the contents of the cdr field of pair.
(cdr '((a) b c)) => (b c) ; tail of list ((a) b c) is again a list
(cdr '(1 . 2)) => 2
(cdr '()) => error, () has no cdr field
Variations of car & cdr
The Scheme library provides procedures that represent the arbitrary composition of up to four car and cdr procedures:
(caar pair)
(cadr pair)
...
(cdddar pair)
(cddddr pair)
There are twenty-eight of these procedures.
Example: (define caddr (lambda (x) (car (cdr (cdr x)))))
car, cadr, cdr, and cddr
Example: lon = '(1 2 3 4)
…
(if (< (car lon) (cadr lon))
    (cons (car lon) (put-max-last (cdr lon)))
    (cons (cadr lon) (put-max-last (cons (car lon) (cddr lon)))))
…
(car lon) => 1
(cadr lon) = (car (cdr lon)) => 2
(cdr lon) => (2 3 4)
(cddr lon) = (cdr (cdr lon)) => (3 4)
List Builder cons
(cons obj1 obj2)
Returns a newly allocated pair whose car is obj1 and whose cdr is obj2.
(cons 'a '()) => (a)
(cons '(a) '(b c d)) => ((a) b c d)
(cons '(a b) 'c) => ((a b) . c)
Note: The procedure cons preserves the structure of both the car and the cdr field. Therefore, if obj1 is a list, then car applied to the result of cons returns a list as well.
Example: (car (cons '(a) '(b c d))) => (a)
List Builder append
(append list …)
Returns a list consisting of the elements of the first list followed by the elements of the other lists.
(append '(x) '(y)) => (x y)
(append '(a) '(b c d)) => (a b c d)
(append '(a (b)) '((c))) => (a (b) (c))
Note: The procedure append removes the top-level pair of parentheses from the argument lists.
Example: (car (append '(a) '(b c d))) => a
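To make the difference between cons and append concrete, here is a sketch of a two-argument append written with cons. The name append2 is hypothetical; the built-in append accepts any number of lists and may be implemented differently.

```scheme
; append2: copy the elements of l1 in front of l2.
(define append2
  (lambda (l1 l2)
    (if (null? l1)
        l2                                        ; nothing left to copy
        (cons (car l1)                            ; keep the head of l1 ...
              (append2 (cdr l1) l2)))))           ; ... and append the rest

(append2 '(a) '(b c d))      ; => (a b c d)
(append2 '(a (b)) '((c)))    ; => (a (b) (c))
(cons '(a) '(b c d))         ; => ((a) b c d), for comparison
```

Only the top-level structure of the first list is taken apart; nested lists such as (b) are carried over unchanged.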
List Builder list
(list obj …)
Returns a newly allocated list of its arguments.
(list 'a (+ 3 4) 'c) => (a 7 c)
(list) => ()
Equivalence Predicates
Scheme provides three equivalence predicates:
(eq? obj1 obj2) – compares object identities
This is the finest or most discriminating equivalence predicate.
(eq? 'a 'a) => #t
(eqv? obj1 obj2) – compares primitive values
Returns #t if obj1 and obj2 should normally be regarded as the same object.
(eqv? #t #t) => #t
(equal? obj1 obj2) – structural equivalence
The predicate equal? recursively compares the contents of pairs, vectors, and strings, applying eqv? to other objects such as numbers and symbols.
(equal? '(a b c) '(a b c)) => #t
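The sketch below contrasts the three predicates on the same data. The variables p and q are hypothetical; list is used rather than quoted constants because it is guaranteed to allocate fresh pairs.

```scheme
(define p (list 'a 'b))
(define q (list 'a 'b))   ; same contents, separately allocated

(eq? p p)               ; => #t, the very same object
(eq? p q)               ; => #f, two distinct objects
(eqv? p q)              ; => #f, eqv? does not look inside pairs
(equal? p q)            ; => #t, same structure and contents

(eqv? 2 2)              ; => #t
(eqv? 2 2.0)            ; => #f, exact and inexact numbers differ
(equal? "abc" "abc")    ; => #t, strings are compared by contents
```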
Why Do We Study Programming Languages?
Some have suggested that there is no need to develop new computer languages nor even to teach language design and compiler theory.
Correct/Wrong
What Is a Programming Language?
A formal notation for describing computation
A “user interface” to a computer
A more precise tool than any natural language
Programming paradigms – different expressive power
Syntax + semantics
Compiler, or interpreter, or translator
Core Properties of Programming Languages
Languages provide the framework for the way we organize complexity in our own minds.
Languages are the means by which we communicate our understanding.
Reasons for Studying Concepts of Programming Languages
The potential benefits of studying language concepts are:
Increased capacity to express ideas.
The way we think is greatly influenced by the expressive power of the language in which we communicate our thoughts.
Improved background for choosing appropriate languages.
Many programmers, when given a choice of languages for a new project, continue to use the language with which they are most familiar, even if it is poorly suited to the new project.
Increased ability to learn, to design, and to implement a new language.
How Do Programming Languages Differ?
Generations:
1GL: machine code
2GL: symbolic assemblers
3GL: (machine independent) languages: Fortran, Algol, Pascal, Smalltalk, C++, Java, Lisp, Haskell, Scheme, Prolog
4GL: domain specific application generators
Common Constructs: basic data types (numbers, etc.); variables; expressions; statements; keywords; control constructs; procedures; comments; …
Uncommon Constructs: type declarations; special types (strings, arrays, matrices, …); concurrency constructs; packages/modules; generics; exceptions; …
Key Theses
Thesis 1: Speak the programming language that you need to work with.
Every programming language meets some specialized goal.
Thesis 2: Programming languages are invented while you sleep, and spread before you wake up.
Many languages already address your problem; the user only needs to find the appropriate language.
Thesis 3: Understanding programming languages is the key to your job.
Programming Domains
All programming languages have been developed with different goals in mind. Every language has its designated application domain, which, in general, requires a specific set of programming abstractions and/or runtime models.
Scientific applications: floating-point arithmetic (Fortran, Algol)
Business applications: reports, decimal arithmetic (Cobol)
Artificial intelligence: symbolic computation (Lisp, Prolog)
System programming: operating systems (C, Pascal)
Scripting languages: system configuration (sh, awk, Perl, Tcl)
Programming Paradigms
Imperative style: program = algorithms + data
Functional style: program = function · function
Logic programming style: program = facts + rules
Object-oriented style: program = objects + messages
Other styles and paradigms: blackboard, events, pipes and filters, constraints, lists, …
Imperative Programming
This is the oldest style of programming, in which the algorithm for the computation is expressed explicitly in terms of instructions such as assignments, tests, branching and so on.
Execution of the algorithm requires data values to be held in variables which the program can access and modify.
Languages so classified include assembly languages, Fortran, Algol, Pascal, C and Ada.
Imperative programming corresponds naturally to the earliest, basic and still used model for the architecture of the computer, the von Neumann model.
Functional Programming
Functional programming takes a much more mathematical approach, based on the lambda calculus.
The concept of variables is not used in pure functional programming. Instead, the computation is described as a function, which is applied to the input data and which gives the result(s) as output data.
This style is more abstract since it requires the algorithm to be described in a way that is independent of the data.
However, there are only a few pure functional languages, because this concept is often considered to be cumbersome and programs are very tedious to write.
The most prominent languages of this style are Lisp, ML, Scheme, and Haskell.
Logic Programming
Like functional programming, logic programming takes a mathematical approach, but this time through formal logic.
A program is described in terms of predicates, which are the rules that govern the problem. At run-time the use of logical inference enables new formulae to be deduced from those given, or the truth or falsehood of a formula to be deduced from the predicates (full unification).
Logical inference is very much like the process of proving a theorem in mathematics, starting from the axioms and theorems already proved.
The best-known logic language is Prolog.
Object-oriented Programming
In general, object-oriented languages are based on the concepts of class and inheritance, which may be compared to those of type and variable respectively in a language like Pascal.
A class describes the characteristics common to all its instances, in a form similar to the record of Pascal, and thus defines a set of fields.
In object-oriented programming, instead of applying global procedures or functions to variables, we invoke the methods associated with the instances, an action called “message passing”.
The basic concept of inheritance is used to derive new classes from existing ones by modifying or extending the inherited class(es).
The most prominent object-oriented languages are Smalltalk, C++, Eiffel, Java, and Object Pascal.
Sequential Languages
Instructions are executed one after another in an order that is deduced from the text of the program.
These are the most widely used languages, since they correspond to the classic von Neumann architecture.
Pascal, Haskell, Smalltalk, Java, for example, are members of the class of sequential languages.
Parallel Languages
In contrast to sequential languages, several program instructions can be executed simultaneously.
These languages are designed to develop programs for multi-processor (distributed memory) architectures.
Parallel languages demand special constructs for communication and synchronization.
The general model for programming in terms of objects can easily be made parallel – actor languages.
The most prominent parallel languages are Occam and Actor.
Special-purpose Languages
Shell, Awk, Perl, Python, JavaScript:
Rapid prototyping
System administration
Program configuration
Postscript, HPGL, TeX, RTF:
Text setting
Description of text, graphical shapes, and images
HTML, XML:
Markup languages
A Brief History
Fortran, 1957
Algol-60
Cobol, 1960
Lisp, 1960
Simula, 1962
Basic, 1964
PL/I, 1965
Algol-68
Prolog, 1970
C, 1972
Pascal, 1975
Scheme, 1975
Ada, 1983
Smalltalk, 1983
C++, 1986
Haskell, 1990
Java, 1993
Syntax
The syntax of a programming language is concerned with the form of programs, i.e., how expressions, commands, declarations, etc., are put together to form programs.
A well-designed programming language will have a well-designed syntax. However, the syntax definition given for a specific language is not powerful enough to define a programming language completely. The purpose of a well-defined syntax is to guide the programmer to understand the language’s semantics.
Semantics
The semantics of a programming language is concerned with the meaning of programs, i.e., how they behave when executed on computers.
The semantics of a programming language assigns a precise meaning to every sentence of the language that can be formed using the given syntax definition. There are three approaches to defining the semantics of a programming language:
Axiomatic semantics,
Operational semantics,
Denotational semantics.
The Hilbert-style Proof System
A Hilbert-style proof system consists of axioms and proof rules.
An axiom of a proof system is a formula that is provable by definition.
An inference rule asserts that if some list of formulas is provable, then so is another formula.
A proof is a structured object built from formulas according to constraints established by a set of axioms and inference rules.
The rule format:
\[ \frac{\mathit{Premise}_1 \quad \mathit{Premise}_2 \quad \ldots \quad \mathit{Premise}_n}{\mathit{Conclusion}} \]
We construct a proof from proofs:
\[ \frac{\dfrac{\mathit{Premise}_1}{\mathit{Conclusion}_1} \quad \dfrac{\mathit{Premise}_2}{\mathit{Conclusion}_2} \quad \ldots \quad \dfrac{\mathit{Premise}_n}{\mathit{Conclusion}_n}}{\mathit{Conclusion}} \]
Axiomatic Semantics
The axiomatic semantics is a formal (proof) system for deriving equations between expressions.
The basic idea of the axiomatic method is to define the meaning of language elements indirectly using logical assertions. For example, we can write {E1} C {E2}, called a Hoare triple, to state that if the boolean expression E1 holds prior to the computation of C, and if C terminates, then the boolean expression E2 must hold as well.
Examples:
{ a > 0 } a := a + 1 { a > 1 }
\[ \frac{\{E1\}\; C1\; \{E2\} \qquad \{E2\}\; C2\; \{E3\}}{\{E1\}\; C1;C2\; \{E3\}} \]
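Example Rules
Statement Type: C =def C.Target := C.Source
Proof Rule:
\[ \frac{\mathit{true}}{\{Q[\text{C.Source} \backslash \text{C.Target}]\}\; C\; \{Q\}} \]
Statement Type: C =def if C.Test then C.Then else C.Else
Proof Rule:
\[ \frac{\{\text{C.Test} \wedge P\}\; \text{C.Then}\; \{Q\} \qquad \{\neg \text{C.Test} \wedge P\}\; \text{C.Else}\; \{Q\}}{\{P\}\; C\; \{Q\}} \]
Statement Type: Rule of Consequence
Proof Rule:
\[ \frac{P \Rightarrow P' \qquad \{P'\}\; C\; \{Q'\} \qquad Q' \Rightarrow Q}{\{P\}\; C\; \{Q\}} \]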
Example
Using axiomatic semantics, we need to prove the validity of a given Hoare triple.
Example:
{true}
C =def if (a >= b) then
         m = a;
       else
         m = b;
{m = max(a, b)}
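Proof
Premise I:
\[ \frac{\dfrac{\mathit{true}}{\{a = \max(a, b)\}\; m = a;\; \{m = \max(a, b)\}} \qquad a \ge b \Rightarrow a = \max(a, b)}{\{a \ge b\}\; m = a;\; \{m = \max(a, b)\}} \]
Premise II:
\[ \frac{\dfrac{\mathit{true}}{\{b = \max(a, b)\}\; m = b;\; \{m = \max(a, b)\}} \qquad a < b \Rightarrow b = \max(a, b)}{\{a < b\}\; m = b;\; \{m = \max(a, b)\}} \]
Conclusion:
\[ \frac{\{a \ge b \wedge \mathit{true}\}\; m = a;\; \{m = \max(a, b)\} \qquad \{a < b \wedge \mathit{true}\}\; m = b;\; \{m = \max(a, b)\}}{\{\mathit{true}\}\; C\; \{m = \max(a, b)\}} \]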
Operational Semantics
The operational semantics is based on a directed form of equational reasoning called “reduction”. Reduction may be regarded as a form of symbolic evaluation.
The basic idea of the operational method is to define the meaning of the language elements by means of a (labeled) transition system.
The operational semantics definition provides means to display the computation steps undertaken when a program is evaluated to its output.
Some forms of operational semantics are interpreter-based, with instruction counters, data structures, and the like, and others are inference rule-based, with proof trees that show control flows and data dependencies.
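Example Transition Rules
Statement Type: C =def C.Target := C.Source
Transition Rule:
\[ \frac{\sigma(\text{C.Source}) = \nu}{\sigma(\text{C.Target := C.Source}) \Rightarrow \sigma \oplus \{(\text{C.Target}, \nu)\}} \]
Statement Type: C =def if C.Test then C.Then else C.Else
Transition Rules:
\[ \frac{\sigma(\text{C.Test}) = \mathit{true} \qquad \sigma(\text{C.Then}) \Rightarrow \sigma'}{\sigma(\text{if C.Test then C.Then else C.Else}) \Rightarrow \sigma'} \]
\[ \frac{\sigma(\text{C.Test}) = \mathit{false} \qquad \sigma(\text{C.Else}) \Rightarrow \sigma'}{\sigma(\text{if C.Test then C.Then else C.Else}) \Rightarrow \sigma'} \]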
Denotational Semantics
The denotational semantics, or model theory, is defined in the spirit of equational logic or first-order logic. A denotational semantics definition (model) consists of a family of sets, one for each type, with the property that each well-typed expression may be interpreted as a specific element of the appropriate set.
The denotational semantics is a recursive definition that maps well-typed derivation trees to their mathematical meanings. For example, the set Bool consists of two meanings: Bool = {true, false} and an operation not : Bool → Bool with not(false) = true, not(true) = false.
The denotational method does not maintain states, but the meaning of a program is given as a function that interprets all language elements of a given program as elements of a corresponding set of values.
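As a rough sketch of the idea, the procedure below maps a tiny boolean expression language to elements of Bool, represented by the Scheme values #t and #f. The expression syntax (the symbols true and false and lists of the form (not e) and (and e1 e2)), the name meaning, and the result symbol ERROR for unknown forms are all assumptions made for illustration.

```scheme
; Assumed syntax: <bexp> ::= true | false | (not <bexp>) | (and <bexp> <bexp>)
; meaning is defined by recursion on the structure of the expression;
; it keeps no state and simply returns an element of Bool.
(define meaning
  (lambda (e)
    (cond ((eq? e 'true)  #t)
          ((eq? e 'false) #f)
          ((and (pair? e) (eq? (car e) 'not))
           (not (meaning (cadr e))))
          ((and (pair? e) (eq? (car e) 'and))
           (and (meaning (cadr e)) (meaning (caddr e))))
          (else 'ERROR))))                 ; not a well-formed expression

(meaning 'true)                      ; => #t
(meaning '(not false))               ; => #t
(meaning '(not (and true false)))    ; => #t
(meaning '(and true (not true)))     ; => #f
```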
Example Meaning Functions
Statement Type: C =def C.Target := C.Source
Meaning Function:
\[ f(\text{C.Target := C.Source}, \sigma) = \begin{cases} \sigma \oplus \{(f(\text{C.Target}, \sigma),\; f(\text{C.Source}, \sigma))\} & \text{if } f(\text{C.Target}, \sigma) \neq \text{nil} \\ \text{ERROR} & \text{else} \end{cases} \]
Statement Type: C =def if C.Test then C.Then else C.Else
Meaning Function:
\[ f(\text{if C.Test then C.Then else C.Else}, \sigma) = \begin{cases} f(\text{C.Then}, \sigma) & \text{if } f(\text{C.Test}, \sigma) \\ f(\text{C.Else}, \sigma) & \text{else} \end{cases} \]
Types and Type Systems
Types are collections of values that share some common properties. When we say that v is a value of type T, we mean that v ∈ T.
In some systems, there may be types with types as members. Types with types as members are usually called something else, such as universes, orders or kinds, to avoid the impression of circularity.
In a type system, types provide a division or classification of some universe of possible values. A type system defines in a mathematical way (axioms and deduction rules) which expressions are typable, i.e., which expressions can be assigned a valid type using the underlying type system.
In most programming languages, types are “checked” in some way, either during program compilation or during execution. The main purposes of type checking are the detection of errors, documentation, program optimization, etc.
Values
In computer science we classify as a value everything that may be evaluated, stored, incorporated in a data structure, passed as an argument to a procedure or function, returned as a function result, and so on.
In computer science, as in mathematics, an “expression” is used (solely) to denote a value.
Which kinds of values are supported by a specific programming language depends heavily on the underlying paradigm and its application domain.
Most programming languages share some basic sets of values like truth values, integers, real numbers, records, lists, etc.
Constants
Constants are named abstractions of values.
Constants are used to assign a user-defined meaning to a value.
Examples:
EOF = -1
TRUE = 1
FALSE = 0
PI = 3.1415927
MESSAGE = "Welcome to Com S 342"
Constants do not have an address, i.e., they do not have a location.
At compile time, applications of constants are substituted by their corresponding definition.
Primitive Values
Primitive values are those values that cannot be further decomposed. Some of these values are implementation and platform dependent.
Examples:
Truth values,
Integers,
Characters,
Strings,
Enumerands,
Real numbers.
Composite Values
Composite values are built up using primitive values and composite values. The layout of composite values is in general implementation dependent.
Examples:
Records,
Arrays,
Enumerations,
Sets,
Lists,
Tuples,
Files.
Pointers
Pointers are references to values, i.e., they denote locations of values.
Pointers are used to store the address of a value (variable or function) – pointer to a value, and pointers are also used to store the address of another pointer – pointer to pointer.
In general, it is not necessary to define pointers with a greater reference level than pointer to pointer.
In modern programming languages, we find pointers to variables, pointers to pointers, function pointers, and object pointers, but not all programming languages provide means to use pointers directly (e.g. Java, Scheme).
Inductive Sets of Data
Overview
Inductive Specification
Backus-Naur Form
Proof by Induction
References
Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
John C. Mitchell, “Foundations of Programming Languages”, MIT Press, 1996
Sets
A set is a collection of elements (or values), possibly empty.
All elements satisfy a possibly complex characterizing property. Formally, we write:
{ x ∈ A | P(x) = True }
to define a set, where all elements satisfy the property P.
The basic axiom of set theory is that there exists an empty set, ∅, with no elements. Formally,
∀x, x ∉ ∅
In words, “for every x, x is not an element of ∅”.
Inductive Specification
Sometimes it is difficult to define a set explicitly, in particular if the elements of the set have a complex structure.
However, it may be easy to define the set in terms of itself. This process is called inductive specification or recursion.
Example:
Let the set S be the smallest set of natural numbers satisfying the following two properties:
0 ∈ S, and
Whenever x ∈ S, then x + 3 ∈ S.
The first property is called the base clause and the second property is called the inductive/recursive clause. An inductive specification may have multiple base and inductive clauses.
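The inductive specification of S translates almost directly into a recursive membership test. The sketch below assumes its argument is a natural number; the name in-S? is hypothetical.

```scheme
; in-S? decides whether the natural number n belongs to S.
(define in-S?
  (lambda (n)
    (cond ((= n 0) #t)                 ; base clause: 0 is in S
          ((< n 3) #f)                 ; 1 and 2 cannot be reached from 0
          (else (in-S? (- n 3))))))    ; inductive clause, read backwards

(in-S? 0)   ; => #t
(in-S? 9)   ; => #t
(in-S? 7)   ; => #f
```

The test terminates because each recursive call is made on a strictly smaller natural number.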
The “Smallest Set”
If we use inductive specification, we always define the smallest set that satisfies all given properties. That is, inductive specification is free of redundancy.
It is easy to see that there can only be one such set:
If S1 and S2 both satisfy all given properties, and both are the smallest, then we have S1 ⊆ S2 (since S1 is the smallest), and S2 ⊆ S1 (since S2 is the smallest), hence S1 = S2.
List of Numbers
The set list-of-numbers is the smallest set of values satisfying the two properties:
The empty list is a list-of-numbers, and
If l is a list-of-numbers and n is a number, then the pair (n . l) is a list-of-numbers.
A pair “(x . y)” (also called dotted pair) is a record structure with two fields called the car (head) and cdr (tail) field. Pairs are created using the procedure cons.
Examples:
() is a list-of-numbers, since () satisfies property 1.
(14 . ()) is a list-of-numbers, since 14 is a number and () is a list-of-numbers.
(4 . (14 . ())) is a list-of-numbers, since 4 is a number and (14 . ()) is a list-of-numbers.
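The two properties can be turned into a recursive predicate, one cond clause per property. This is a sketch; the name list-of-numbers? is chosen here to match the set it tests.

```scheme
; list-of-numbers? follows the inductive definition clause by clause.
(define list-of-numbers?
  (lambda (v)
    (cond ((null? v) #t)                          ; property 1: the empty list
          ((pair? v)                              ; property 2: (n . l)
           (and (number? (car v))
                (list-of-numbers? (cdr v))))
          (else #f))))                            ; anything else is not in the set

(list-of-numbers? '())          ; => #t
(list-of-numbers? '(4 14))      ; => #t
(list-of-numbers? '(4 a))       ; => #f
(list-of-numbers? '(4 . 14))    ; => #f, the cdr is not a list-of-numbers
```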
Well-formed Formulae
Well-formed formulae for compound boolean propositions are defined as follows:
True and False are well-formed formulae,
p, where p is a propositional variable, is a well-formed formula,
(¬ p) is a well-formed formula, if p is a well-formed formula,
(p ∧ q), (p ∨ q), (p → q), (p ↔ q) are well-formed formulae, if both p and q are well-formed formulae.
Examples:
p → ¬ q
(p → q) ↔ ((¬ p ∨ q) → q)
Note: This inductive specification of well-formed boolean propositions also defines an extended boolean term algebra TΣ(X), where the carrier set consists precisely of all the terms which can be generated from the constants, variables and operations of the signature Σ (i.e., the inductive specification).
The Backus-Naur Form
The process of describing more complex data types becomes quite cumbersome. In order to simplify this process, we specify complex values using a context-free grammar (or type 2 grammar).
We use a notation called Backus-Naur Form, or BNF, to specify values using a context-free grammar:
  - The general rule format is: lhs ::= rhs, where lhs is a nonterminal, and rhs may be a list, separated with “|”, of strings of terminals and nonterminals.
  - All nonterminals are enclosed in angle brackets, <>.
Example: list-of-numbers
<list-of-numbers> ::= ()
                    | (<number> . <list-of-numbers>)
Note: In BNF, some nonterminals (e.g. <number>) are left undefined when their meaning is sufficiently clear from the context.
Types of Grammars
Type 0: A grammar that has no restrictions on its productions.
Type 1 – context-sensitive: A grammar can only have productions of the form w1 → w2, where the length of w2 is greater than or equal to the length of w1, or of the form w1 → ε.
Type 2 – context-free: A grammar can only have productions of the form w1 → w2, where w1 is a single nonterminal.
Type 3 – regular: A grammar can only have productions of the form w1 → w2, where w1 is a nonterminal and w2 is either aB, Ba, a, or ε, where B is a nonterminal and a is a terminal.
Kleene Star
The Kleene star, written { … }*, is used to specify a sequence of any number of instances of a given string.
Example:
<s-list> ::= ({<symbol-expression>}*)
<symbol-expression> ::= <symbol> | <s-list>
()
(a b c)
(fun1 (fun2 arg1 arg2) arg3 arg4)
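The <s-list> grammar can be turned into a pair of mutually recursive predicates, one per nonterminal. This is a minimal sketch; the names s-list? and symbol-expression? are chosen here for illustration, and the Kleene star is handled by walking the list one element at a time.

```scheme
;; s-list? : any -> boolean
;; <s-list> ::= ({<symbol-expression>}*)
(define s-list?
  (lambda (x)
    (or (null? x)                              ; zero instances
        (and (pair? x)
             (symbol-expression? (car x))      ; one more instance ...
             (s-list? (cdr x))))))             ; ... followed by the rest

;; symbol-expression? : any -> boolean
;; <symbol-expression> ::= <symbol> | <s-list>
(define symbol-expression?
  (lambda (x)
    (or (symbol? x)
        (s-list? x))))
```

All three example values above satisfy s-list?, whereas (a 1) does not, because 1 is neither a symbol nor an s-list.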
Kleene Plus
The Kleene plus, written { … }+, is used to specify a sequence of one or more instances of a given string.
Example:
<nonempty-list> ::= ({<datum>}+)
<datum> ::= <number> | <symbol> | <string>
(a b “ComS342”)
(fun1 (fun2 arg1 arg2) 3 “An argument”)
Separated List Notation
The separated list notation, written { … }*(c) or { … }+(c), can be used to specify any number of instances of a given string that are separated by a non-empty character sequence c.
Example:
<list-of-expressions> ::= ({<expression>}*(,))
(1 , 2 , 3)
NOTE: This form is not used in the syntax specification of Scheme!
Induction
Having defined a set inductively, we can use the inductive definition to prove properties about members of the set.
The proof technique used is called mathematical induction.
The most common forms are induction on the structure of expressions and induction on the length or structure of proofs.
A simple and intuitive way to think of induction is that it is a method for writing down an infinite proof in a finite way.
Note: We can construct infinitely many values from a given inductive specification.
Mathematical Induction
A proof by mathematical induction that a given property P is true for every positive integer n (we write P(n)) consists of two steps:
1. Base step. The proposition P(1) (or P(0)) is shown to be true.
2. Inductive step. The implication P(n) → P(n+1) is shown to be true for every positive integer n.
Note: In a proof by mathematical induction it is not assumed that P(n) is true for all positive integers! It is only shown that if P(n) is assumed to be true, then P(n+1) is also true. In general, we use an inference rule called “modus ponens”, that is, [p ∧ (p → q)] → q. In words, if a property “p” is true and “p” implies “q”, then “q” is also true. The rule “modus ponens” is a “tautology”.
Example: Sum(n) = n(n+1)/2
As a young boy, the later mathematician Carl Friedrich Gauss was asked by his teacher to add up the first hundred numbers, in order to keep him quiet for a while. As we know today, this did not work out, since:
sum(n) = n(n+1)/2
Proof:
Base case: We must show that sum(0) = 0(0+1)/2. This is an easy calculation, and we have sum(0) = 0.
Inductive step: Assume sum(n) = n(n+1)/2 holds. We must show that sum(n+1) = (n+1)(n+2)/2 holds as well. First, sum(n+1) is just the sum of the first n numbers plus (n+1). Therefore, we have sum(n+1) = sum(n) + (n+1). Using the induction hypothesis, we have
sum(n+1) = n(n+1)/2 + (n+1)
         = (n(n+1) + 2(n+1))/2
         = (n^2 + n + 2n + 2)/2
         = (n+1)(n+2)/2
as required.
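The recursive view of sum used in the inductive step, sum(n+1) = sum(n) + (n+1), can be written down directly and compared against the closed form. This is a small sketch for non-negative integers; the names sum and closed-form are chosen here for illustration.

```scheme
;; sum : non-negative integer -> integer
;; Adds up 0 + 1 + ... + n by following the inductive step.
(define sum
  (lambda (n)
    (if (zero? n)
        0                          ; base case: sum(0) = 0
        (+ (sum (- n 1)) n))))     ; sum(n) = sum(n-1) + n

;; closed-form : non-negative integer -> integer
;; The formula n(n+1)/2 proved above.
(define closed-form
  (lambda (n)
    (/ (* n (+ n 1)) 2)))
```

Both (sum 100) and (closed-form 100) evaluate to 5050, the number Gauss was asked for.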
Theorem 1.1.1
Let s ∈ <binary-tree>, where <binary-tree> is defined by
<binary-tree> ::= <number> | (<symbol> <binary-tree> <binary-tree>)
Then s contains an odd number of nodes.
We prove this theorem by induction on the size of s, where we take the size of s to be the number of nodes in s.
Proof of Theorem 1.1.1
The statement P(n) for a fixed positive integer n is called the “induction hypothesis”. When we complete both steps of a proof by mathematical induction, we have shown that P(n) is true for all positive integers n, that is, we have shown that ∀n P(n) is true.
[P(1) ∧ ∀n (P(n) → P(n+1))] → ∀n P(n)
To prove Theorem 1.1.1, first we have to show that P(0) is true (s has no nodes at all), and then we prove that whenever n is a number of nodes such that P(n) is true, then P(n+1) is also true.
The Proof Steps
The induction hypothesis, P(n), is that any tree of size ≤ n has an odd number of nodes. We use induction on the size of binary trees.
  - There are no trees with 0 nodes, so P(0) holds trivially.
  - Let n be a size such that P(n) holds, that is, a tree with ≤ n nodes actually has an odd number of nodes. We need to show that P(n+1) holds as well, that is, any tree with ≤ n+1 nodes has an odd number of nodes. We proceed by case analysis of the structure of binary trees:
    s ≡ k, where k is a number:
      In this case s has exactly one node, and one is odd.
    s ≡ (sym s1 s2):
      By assumption, s has ≤ n+1 nodes. Therefore, both s1 and s2 must have fewer nodes than s, that is, s1 and s2 must have ≤ n nodes. Using the induction hypothesis, their numbers of nodes must be odd, say 2n1+1 and 2n2+1. Hence, the total number of nodes in the tree s is
      ((2n1+1) + (2n2+1)) + 1 = (2n1 + 2n2 + 2) + 1 = 2(n1 + n2 + 1) + 1
      which is odd again.
Structural Induction
Structural induction uses the fact that substructures of a given object are always smaller than the object itself.
We have used this fact in the proof of Theorem 1.1.1.
Structural induction is done as follows:
  - Base step: The induction hypothesis is true on simple structures (those without substructures).
  - Induction step: If the induction hypothesis is true on the substructures of a given object, say s, then it is true on s itself.
The Hilbert-style Proof System
A Hilbert-style proof system consists of axioms and proof rules.
  - An axiom of a proof system is a formula that is provable by definition.
  - An inference rule asserts that if some list of formulas is provable, then so is another formula.
  - A proof is a structured object built from formulas according to constraints established by a set of axioms and inference rules.
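The rule format:

  Premise1   Premise2   ...   Premisen
  ------------------------------------
               Conclusion

We construct a proof from proofs:

   Premise1        Premise2              Premisen
  -----------     -----------    ...    -----------
  Conclusion1     Conclusion2           Conclusionn
  --------------------------------------------------
                     Conclusion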
Recursive Program Specification
Overview
  - Inductive program specification
  - Deriving programs from a Backus-Naur form
  - Pattern of recursion
References
  - Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
  - Harold Abelson et al., “Structure and Interpretation of Computer Programs”, MIT Press, 1996
From BNF to a Program
With the help of BNF rules, starting with simple members of a data set, we are able to inductively specify complex data structures.
We can use the same approach to construct programs that manipulate these data structures.
First we define the program’s behavior on simple inputs, and then we use this behavior to inductively build programs that can process more complex arguments.
Exponentiation
Consider the problem of computing the integer exponential of a given integer number.
A program that solves this problem should take as arguments a base b and a non-negative integer exponent n and compute b^n.
b * b * ... * b = b^n
or
b^0 = 1, b^1 = b, b^2 = b * b, ..., b^n = b^(n-1) * b
In general,
e( b, n ) = 1                  if n = 0
e( b, n ) = b * e( b, n-1 )    if n > 0
Is e( b, n ) = b^n correct?
To show that e( b, n ) = b^n is indeed correct, we proceed by induction on n:
Base step: n = 0
Then we have e( b, 0 ) = 1 = b^0.
Induction step:
Assume e( b, n ) = b^n is correct. We must show that e( b, n+1 ) = b^(n+1). By the definition of e, it holds that e( b, n+1 ) = b * e( b, n ). Using the induction hypothesis, we have e( b, n+1 ) = b * b^n = b^1 * b^n = b^(n+1), as desired.
Procedure Exponential
The Scheme procedure for e( b, n ) is defined as follows:
(define exponential
  (lambda (b n)
    (if (zero? n)
        1
        (* b (exponential b (- n 1))))))
The two branches of the if expression correspond to the two cases of the inductive definition of e( b, n ).
If we can reduce a given problem to a subproblem, we can recursively call the procedure that solves the original problem to solve the subproblem.
Exponentiation with negative Exponent
(define exponential
  (lambda (b n)
    (if (zero? n)
        1
        (if (negative? n)
            (* (/ 1 b) (exponential b (+ n 1)))
            (* b (exponential b (- n 1)))))))
This procedure works on all integers (including negative integers).
It holds that b^(-n) = 1/b^n for all integers n. Moreover, we can use inductive program specification, since b^(-n) = 1/b * 1/b^(n-1).
Recursion
If a procedure contains calls to itself within its body, then this procedure is said to be recursively defined.
This approach to program specification is called recursion and is found not only in programming.
If we define a procedure recursively, then there must exist at least one subproblem that can be solved directly, that is, without calling the procedure again.
Note: A recursively defined procedure must always contain a directly solvable subproblem. Otherwise, this procedure does not terminate.
Direct Program Derivation
An inductive proof can often be used to directly derive the corresponding computer program.
For example, the proof of Theorem 1.1.1 (a binary tree contains an odd number of nodes) leads directly to the following program:
(define count-nodes
  (lambda (s)                          ;; s in <binary-tree>
    (if (number? s)                    ;; s = <number>
        1
        (+ (count-nodes (cadr s))      ;; s = (sym s1 s2), cadr = cdr+car
           (count-nodes (caddr s))     ;; caddr = cdr+cdr+car
           1))))
> (count-nodes '(s 1 1)) ==> 3
Rule of Thumb
When defining a program based on structural induction, the structure of the program must be patterned after the structure of the data.
In general, this means that we have to define one procedure for each syntactic category used to specify our data. Each procedure then has to examine the input to see which right-hand side it corresponds to. Furthermore, for every nonterminal that appears in the right-hand side, there will be a recursive call to a procedure for that nonterminal. This approach is also called recursive-descent parsing.
Always Remember
FOLLOW THE GRAMMAR
Predicate list-of-numbers?
<list-of-numbers> ::= ()
                    | (<number> . <list-of-numbers>)
The predicate list-of-numbers? is a recursively defined procedure, which analyses a given list l according to the BNF specification:
(define list-of-numbers?
  (lambda (l)
    (if (null? l)                        ;; null? returns #t if the argument is ()
        #t
        (and                             ;; second case: check the pair
          (number? (car l))              ;; (car '(1 . 2)) = 1
          (list-of-numbers? (cdr l)))))) ;; (cdr '(1 . 2)) = 2
> (list-of-numbers? '(1 . (1 2)))        ;; (1 2) = (1 . (2 . ())), see R5RS page 25f
#t
Introduction to the Lambda Calculus
Overview:
  - What is Computability? – Church’s Thesis
  - The Lambda Calculus
  - Scope and lexical address
  - The Church-Rosser Property
References:
  - Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
  - H.P. Barendregt, “The Lambda Calculus – Its Syntax and Semantics”, North-Holland, 1984
  - David A. Schmidt, “The Structure of Typed Programming Languages”, MIT Press, 1994
  - Carl A. Gunter, “Semantics of Programming Languages”, MIT Press, 1992
What Is Computable?
Computation is usually modeled as a mapping from inputs to outputs, carried out by a formal “machine”, or program, which processes its input in a sequence of steps.
An “effectively computable” function is one that can be computed in a finite amount of time using finite resources.
[Figure: the input of a problem is passed to a program/machine, which realizes an “effectively computable” function and produces an output (yes/no).]
Church’s Thesis
Effectively computable functions [from positive integers to positive integers] are just those definable in the lambda calculus.
Or, equivalently:
It is not possible to build a machine that is more powerful than a Turing machine.
Church’s thesis cannot be proven because “effectively computable” is an intuitive notion, not a mathematical one. It can only be refuted by giving a counter-example: a machine that can solve a problem not computable by a Turing machine.
So far, all models of effectively computable functions have been shown to be equivalent to Turing machines (or the lambda calculus).
Turing Machine
A Turing machine is an abstract representation of a computing device. It consists of a read/write head that scans a (possibly infinite) one-dimensional (bi-directional) tape divided into squares, each of which is inscribed with a 0 or 1.
Computation begins with the machine, in a given "state", scanning a square. It erases what it finds there, prints a 0 or 1, moves to an adjacent square, and goes into a new state.
This behavior is completely determined by three parameters:
  - the state the machine is in,
  - the number on the square it is scanning, and
  - a table of instructions.
A Turing machine is more like a computer program (software) than a computer (hardware).
Example
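Instruction table (rows: current state; columns: scanned symbol):

State   0       1
0       (1,0)   (3,1)
1       (2,0)   (3,1)
2       (3,0)   (3,1)
3       (3,0)   (3,1)

[Figure: state diagram with states 0, 1, 2, 3 and transitions labeled 0/0 and 1/1.]
Both specifications describe the same Turing machine.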
Uncomputability
A problem that cannot be solved by any Turing machine (or any equivalent formalism) in finite time is called uncomputable.
Assuming Church’s thesis is true, an uncomputable problem cannot be solved by any real computer.
The Halting Problem
Given an arbitrary Turing machine and its input tape, will the machine eventually halt?
The Halting Problem is provably uncomputable, which means that it cannot be solved in practice.
Ackermann Function
The Ackermann function is the simplest example of a well-defined total function which is computable but not primitive recursive.
The function f(x) = A(x, x), while Turing computable, grows much faster than polynomials or exponentials. The definition is:
A(0, y) = y + 1
A(m+1, 0) = A(m, 1)
A(m+1, n+1) = A(m, A(m+1, n))
Examples:
A(2, 3) = 9
A(3, 5) = 253
A(4, 1) = 65533
A(4, 3) = 2^(2^65536) - 3
The Lambda Calculus
Lambda calculus is a language with clear operational and denotational semantics capable of expressing algorithms. It also forms a compact language to denote mathematical proofs.
Logic provides a formal language in which mathematical statements can be formulated and provides deductive power to derive these. Type theory is a formal system, based on lambda calculus and logic, in which statements, computable functions, and proofs can all be naturally represented.
The lambda calculus is a good medium to represent mathematics on a computer with the aim to exchange and store reliable mathematical knowledge.
The Definition of the Lambda Calculus
The Lambda Calculus was invented by Alonzo Church [1932] as a mathematical formalism for expressing computation by functions.
Syntax:
e ::= x          a variable
    | λx . e     an abstraction (function)
    | e1 e2      a (function) application
(Operational) Semantics:
α-conversion (renaming):    λx . e ↔ λy . [y/x]e       where y is fresh (in e)
β-reduction (application):  (λx . e1) e2 → [e2/x]e1    avoiding name capture
η-reduction:                λx . (e x) → e             if x is not free in e
The lambda calculus can be viewed as the simplest possible pure functional programming language.
The Scheme Syntax
<expression> ::= <identifier>
               | (lambda (<identifier>) <expression>)
               | (<expression> <expression>)
Examples:
id = (lambda (x) x)
Ω = ((lambda (x) (x x)) (lambda (x) (x x)))
pair(x, y) = (lambda (x) (lambda (y) (lambda (z) ((z x) y))))
Reference or Declaration
In a program, variables can appear in two different ways:
as declarations: (lambda (x) …) or (let ((x …)) …)
The occurrence of x in both the lambda-abstraction and the let-clause introduces the variable as a name for some value. In particular, in the lambda expression, the value of the variable x will be supplied when the procedure is called, whereas in the let expression the value of the variable is determined by the value of the first “…” (init expression).
as references: (f x y)
Here all variables, f, x, y, appear as references, whose meanings are defined by an enclosing declaration.
Binding
A value named by a variable is also called its denotation (meaning). The denotation must come from some declaration; we say the variable is bound by that declaration, or that it refers to that declaration.
Declarations in most programming languages, including Scheme, have limited scope (the area where the variable is applicable). Therefore, the same variable name may occur multiple times in the program text while being used for different purposes. We use binding rules to determine the declaration to which a concrete variable use refers.
Scoping rules:
  - We call a language statically scoped if we can determine the declaration of a variable by analyzing the program text alone.
  - We call a language dynamically scoped if we cannot determine the declaration of a variable until the program is executed.
Binding Rules in Lambda Calculus
In (lambda (<identifier>) <expression>), the occurrence of <identifier> is a declaration that binds all occurrences of that variable in <expression> unless some intervening declaration of the same variable occurs.
Examples:
(lambda (x) (lambda (y) (y x)))
(lambda (x) (lambda (y) ((lambda (x) (lambda (y) (x y))) x) y))
Occurs Free, Occurs Bound
A variable x occurs free in an expression e if and only if there is some use of x in e that is not bound by any declaration of x in e.
A variable x occurs bound in an expression e if and only if there is some use of x in e that is bound by a declaration of x in e.
Examples:
((lambda (x) x) y): x is bound, but y is free
(lambda (f) (lambda (x) (f x))): both f and x are bound
Note: Lambda expressions with no free variables are called combinators. Every procedure, when applied to all its necessary arguments, is a combinator. Therefore, procedure calls are called combinations in Scheme.
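The definition of "occurs free" follows the three-case grammar of lambda expressions and can be sketched as a predicate. This is a sketch in the spirit of the course textbook, assuming expressions are written as quoted Scheme data in the form <identifier>, (lambda (<identifier>) <expression>), or (<expression> <expression>), and assuming no variable is itself named lambda.

```scheme
;; occurs-free? : symbol x expression -> boolean
(define occurs-free?
  (lambda (var exp)
    (cond
      ((symbol? exp)                           ; a variable reference
       (eq? var exp))
      ((eq? (car exp) 'lambda)                 ; (lambda (x) body)
       (and (not (eq? (caadr exp) var))        ; var is not redeclared here ...
            (occurs-free? var (caddr exp))))   ; ... and occurs free in the body
      (else                                    ; application (e1 e2)
       (or (occurs-free? var (car exp))
           (occurs-free? var (cadr exp)))))))
```

Applied to the first example above, (occurs-free? 'y '((lambda (x) x) y)) yields #t and (occurs-free? 'x '((lambda (x) x) y)) yields #f.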
Free and Bound Variables
The variable x is bound by the enclosing λ in the expression λx . e. A variable that is not bound is free:
fv( x ) = { x }                        bv( x ) = ∅
fv( λx . e ) = fv( e ) \ { x }         bv( λx . e ) = bv( e ) ∪ { x }
fv( e1 e2 ) = fv( e1 ) ∪ fv( e2 )      bv( e1 e2 ) = bv( e1 ) ∪ bv( e2 )
An expression with no free variables is closed (otherwise it is open). For example, y is bound and x is free in the (open) expression λy . x y.
Syntactic substitution will not always work:
(λx . λy . x y) y → [y/x](λy . x y)    β-reduction
                  ≠ (λy . y y)         incorrect substitution!
Since y is already bound in (λy . x y), we cannot directly substitute y for x.
The Scope of a Variable
Problem:
For each variable reference, find the corresponding declaration to which it refers.
This problem is easier to solve when we ask:
Given a declaration, which variable references refer to it?
In the definition of programming languages, binding rules for variables typically associate with each declaration of a variable a region of the program within which the declaration is effective.
Examples:
(lambda (x) …): The region of x is the body of the lambda expression.
(define x …): The region of x is the whole program.
Blocks
In the lambda calculus, as in many modern programming languages, regions can be nested within each other. We call these languages block-structured, and regions are also called blocks.
> (define x              ; first declaration of x
    (lambda (x)          ; second declaration of x
      (map
        (lambda (x)      ; third declaration of x
          (+ x 1))       ; refers to third
        x)))             ; refers to second
> (x '(1 2 3))           ; refers to first
(2 3 4)
Visibility
The scope of a variable, say x, can include inner regions that hide the variable x. Within these inner regions the outer declaration of the variable x is hidden, that is, the scope of x has a hole.
We say the declaration of a variable is visible at the point of a variable reference if the reference lies within the scope of this declaration.
Example: (lambda (x) (lambda (y) ((lambda (x) (lambda (y) (x y))) x) y))
Contour Diagrams
We use contour diagrams to picture the borders of a region:
[Figure: contour diagram; the only surviving label is “Environment”.]
The lexical (or static) depth of a variable reference is the number of contours crossed to find the associated declaration.
The lexical depth is used in compilers to tell how many static links to traverse to find a variable.
Lexical Address
The declarations associated with a region may be numbered in the order of their appearance in the text. Each variable reference may then be associated with two numbers: its lexical depth and its position (both start with 0).
To illustrate lexical addresses, we replace every variable reference x with an expression (x : d p), where d is the lexical depth and p is the declaration position of x.
(lambda (x y)
  ((lambda (a)
     ((x : 1 0) ((a : 0 0) (y : 1 1))))
   (x : 0 0)))
Note: The lexical address can be used by a compiler: lexical depth = number of static links, and lexical position = offset within the activation frame.
Beta Reduction
Beta reduction is the computational engine of the lambda calculus:
Define: I ≡ λx . x
Now consider:
I I = (λx . x) (λx . x)
    → [(λx . x)/x]x       β-reduction
    = (λx . x)            substitution
    = I
We can implement most lambda expressions directly in Scheme:
> (define i (lambda (x) x))
> (i 5)
5
> (i (i 5))
5
Substitution
We must define substitution carefully to avoid name capture:
[e/x]x = e
[e/x]y = y                                  if x ≠ y
[e/x](e1 e2) = ([e/x]e1 [e/x]e2)
[e/x](λx . e1) = (λx . e1)
[e/x](λy . e1) = (λy . [e/x]e1)             if x ≠ y and y ∉ fv( e )
[e/x](λy . e1) = (λz . [e/x][z/y]e1)        if x ≠ y and z ∉ (fv( e ) ∪ fv( e1 ))
Consider:
(λx . ((λy . x) (λx . x)) x) y → [y/x](((λy . x) (λx . x)) x)
                               = ((λz . y) (λx . x)) y
Alpha Conversion
Alpha conversion allows one to rename bound variables.
A bound name x in the lambda abstraction (λx . e) may be replaced by any other name y, as long as there are no free occurrences of y in e:
Consider:
(λx . λy . x y) y → (λx . λz . x z) y     α-conversion
                  → [y/x](λz . x z)       β-reduction
                  = (λz . y z)
                  → y                     η-reduction
Eta Reduction
η-reduction allows one to remove “redundant lambdas”.
Suppose that f is a closed expression (so, in particular, x does not occur free in f). Then:
(λx . f x) y → ([y/x]f) ([y/x]x) = f y     β-reduction
More generally, this will hold whenever x does not occur free in f. In such cases, we can always rewrite (λx . f x) as f.
Currying
Since a lambda abstraction only binds a single variable, functions with multiple parameters must be modeled as curried higher-order functions. This method is named after the logician H. B. Curry, who popularized the approach.
To improve readability, multiple lambdas can be suppressed, so:
λx y . x = λx . λy . x
λb x y . b x y = λb . λx . λy . (b x) y
Scheme:
(lambda (x y) x) = (lambda (x) (lambda (y) x))
(lambda (b x y) (b x y)) = (lambda (b) (lambda (x) (lambda (y) ((b x) y))))
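In actual Scheme the two sides of these equations are called differently: the multi-parameter form takes all arguments at once, while the curried form takes them one at a time. The following small sketch illustrates the difference; the names add, curried-add, and add-5 are made up for this example.

```scheme
;; Two-parameter procedure: both arguments are supplied in one call.
(define add
  (lambda (x y) (+ x y)))

;; Curried version: each call supplies exactly one argument.
(define curried-add
  (lambda (x)
    (lambda (y) (+ x y))))

;; Supplying only the first argument yields a new one-argument procedure.
(define add-5 (curried-add 5))
```

Here (add 5 3) and ((curried-add 5) 3) both give 8, and (add-5 10) gives 15.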
Normal Forms
A lambda expression is in normal form if it can no longer be reduced by the β- or η-reduction rules.
But not all lambda expressions have normal forms!
Ω = (λx . x x) (λx . x x)
  → [(λx . x x)/x](x x) = (λx . x x) (λx . x x)     β-reduction
  → (λx . x x) (λx . x x)                            β-reduction
  → (λx . x x) (λx . x x)                            β-reduction
  → (λx . x x) (λx . x x)                            β-reduction
  ...
Reduction of a lambda expression to a normal form is analogous to a Turing machine halting or a program terminating.
Evaluation Order
Most programming languages are strict, that is, all expressions passed to a function call are evaluated before control is passed to the function (e.g. Scheme).
Several modern functional languages (e.g. Haskell), on the other hand, use lazy evaluation, that is, expressions are only evaluated when they are needed.
Consider: square n = n * n
Applicative-order reduction:
square (2 + 5) → square 7 → 7 * 7 → 49
Normal-order reduction:
square (2 + 5) → (2 + 5) * (2 + 5) → 7 * (2 + 5) → 7 * 7 → 49
Applicative-Order Reduction
Motivation:
  - Modeling call-by-value in programming languages
  - In function calls, evaluate arguments, then invoke the function
In the lambda calculus, this means:
  - In (e1 e2), reduce e2 to normal form using applicative-order reduction
  - Then reduce e1 to normal form using applicative-order reduction
  - If e1 is a lambda abstraction, do beta reduction, and reduce the result to normal form using applicative-order reduction
Syntax makes it easy:
  - Write the expression using fully parenthesized notation
  - Always perform the rightmost beta reduction by repeatedly scanning for the rightmost (left parenthesis) occurrence of ((λx . e1) e2)
  - Note, this includes reduction of primitives, e.g. ((add 1) 2)
Applicative-Order Example
Consider:
((λx . ((λy . add y y) (mul x x))) (sub 3 1))
Applicative-order reduction gives
  ((λx . ((λy . add y y) (mul x x))) (sub 3 1))
→ ((λx . ((λy . add y y) (mul x x))) 2)
→ ((λx . (add (mul x x) (mul x x))) 2)
→ (add (mul 2 2) (mul 2 2))
→ (add (mul 2 2) 4)
→ (add 4 4)
→ 8
Applicative-Order Example - Scheme
Consider:
((lambda (x) ((lambda (y) (+ y y)) (* x x))) (- 3 1))
Applicative-order reduction gives
  ((lambda (x) ((lambda (y) (+ y y)) (* x x))) (- 3 1))
→ ((lambda (x) ((lambda (y) (+ y y)) (* x x))) 2)
→ ((lambda (y) (+ y y)) (* 2 2))
→ ((lambda (y) (+ y y)) 4)
→ (+ 4 4)
→ 8
Normal-Order Example
Consider:
((λx . ((λy . add y y) (mul x x))) (sub 3 1))
Normal-order reduction gives
  ((λx . ((λy . add y y) (mul x x))) (sub 3 1))
→ ((λy . add y y) (mul (sub 3 1) (sub 3 1)))
→ (add (mul (sub 3 1) (sub 3 1)) (mul (sub 3 1) (sub 3 1)))
→ (add (mul 2 2) (mul 2 2))
→ (add 4 4)
→ 8
The Church-Rosser Property
“If an expression can be evaluated at all, it can be evaluated by consistently using normal-order evaluation. If an expression can be evaluated in several different orders (mixing normal-order and applicative-order reduction), then all of these evaluation orders yield the same result”.
So, evaluation order “does not matter” in the lambda calculus. However, applicative-order reduction may not terminate, even if a normal form exists!
(λx . y) ((λx . x x) (λx . x x))
Applicative-order reduction:
  (λx . y) ((λx . x x) (λx . x x))
→ (λx . y) ((λx . x x) (λx . x x))
→ ...
Normal-order reduction:
  (λx . y) ((λx . x x) (λx . x x))
→ y
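Because Scheme is strict, the same effect can be observed directly. The sketch below is an illustration under stated assumptions, not part of the original slides: the free variable y is modeled by the symbol 'y, and normal order is simulated by wrapping the argument in a zero-argument procedure (a thunk) that the callee never calls. The names const-y, omega-thunk, and result are made up for this example.

```scheme
;; Corresponds to (λx . y): ignores its argument and returns the symbol y.
(define const-y
  (lambda (x) 'y))

;; Omega wrapped in a thunk, so it is only evaluated if the thunk is called.
(define omega-thunk
  (lambda ()
    ((lambda (x) (x x)) (lambda (x) (x x)))))

;; Applicative order (Scheme's default) evaluates the argument first, so the
;; following call would never return. It is intentionally left commented out.
;; (const-y ((lambda (x) (x x)) (lambda (x) (x x))))

;; Simulated normal order: the unevaluated thunk is passed and never forced.
(define result (const-y omega-thunk))
```

After these definitions, result is the symbol y, mirroring the one-step normal-order reduction above.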
SKI Combinator Reduction
SKI combinator reduction is an implementation technique that yields normal-order (lazy) evaluation in the most natural way.
A lambda calculus expression (that denotes a program) can be transformed into an equivalent combinator expression that contains only constants and applications. Moreover, this combinator expression will contain neither lambda abstractions nor variables.
The reduction of combinator expressions is based on a combinator calculus that does not have a beta reduction, hence term rewriting does not need to manipulate variables and environments explicitly.
Combinators & Combinator Reduction
Combinator                      Name                     Reduction
I = λx . x                      Identity                 I x = x
K = λx . λy . x                 Constant function        K x y = x
S = λx . λy . λz . x z (y z)    Distribution function    S x y z = x z (y z)
B = λx . λy . λz . x (y z)      Composition function     B x y z = x (y z)
C = λx . λy . λz . x z y        Swap function            C x y z = x z y
The first three combinators I, K, and S are sufficient to transform every lambda expression into an equivalent combinator expression.
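The combinators can be written as curried Scheme procedures, which makes the reduction rules directly testable. This is a minimal sketch; each application supplies one argument, so S x y z is written (((S x) y) z).

```scheme
(define I (lambda (x) x))                                          ; I x = x
(define K (lambda (x) (lambda (y) x)))                             ; K x y = x
(define S (lambda (x) (lambda (y) (lambda (z) ((x z) (y z))))))    ; S x y z = x z (y z)
(define B (lambda (x) (lambda (y) (lambda (z) (x (y z))))))        ; B x y z = x (y z)
(define C (lambda (x) (lambda (y) (lambda (z) ((x z) y)))))        ; C x y z = x z y
```

For instance, (((S K) K) 42) reduces to ((K 42) (K 42)) and therefore to 42, so S K K behaves like the identity I.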
A Combinator Language
Syntax:
<expression> ::= k        ; constant
               | S | K | I
               | ( <expression> <expression> )
Let e be a lambda calculus expression. Then the function U( e ) translates e into an equivalent combinator expression:
U( e ) = e                      ; e does not contain any λ
U( λx . e ) = [x]( U( e ) )
U( e1 e2 ) = U( e1 ) U( e2 )
[x]( e )
The function [x]( e ) is defined as follows:
[x]( k ) = K k
[x]( x ) = I
[x]( y ) = K y          (y ≠ x)
[x]( e1 e2 ) = S ( [x]( e1 ) ) ( [x]( e2 ) )
Building a Combinator Expression
U( λx . λy . x y )
= [x]( U( λy . x y ) )
= [x]( [y]( U( x y ) ) )
= [x]( [y]( x y ) )
= [x]( S ( [y]( x ) ) ( [y]( y ) ) )
= [x]( S ( K x ) I )
= S ( [x]( S ( K x ) ) ) ( [x]( I ) )
= S ( S ( [x]( S ) ) ( [x]( K x ) ) ) ( K I )
= S ( S ( K S ) ( S ( [x]( K ) ) ( [x]( x ) ) ) ) ( K I )
= S ( S ( K S ) ( S ( K K ) I ) ) ( K I )
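The translation U and the abstraction function [x]( e ) can be sketched as two Scheme procedures over quoted expressions. The names translate and abstract, and the data representation, are assumptions made for this sketch: expressions are <identifier>, (lambda (<identifier>) <expression>), or (<expression> <expression>); combinator applications are nested two-element lists, so S a b is written ((S a) b); and no source variable may be named S, K, I, or lambda.

```scheme
;; abstract : symbol x combinator-expression -> combinator-expression
;; Computes [x]( e ) for an expression e that contains no lambda.
(define abstract
  (lambda (x e)
    (cond
      ((pair? e)                               ; [x]( e1 e2 ) = S [x]( e1 ) [x]( e2 )
       (list (list 'S (abstract x (car e)))
             (abstract x (cadr e))))
      ((eq? e x) 'I)                           ; [x]( x ) = I
      (else (list 'K e)))))                    ; [x]( y ) = K y and [x]( k ) = K k

;; translate : lambda-expression -> combinator-expression
;; Computes U( e ).
(define translate
  (lambda (e)
    (cond
      ((not (pair? e)) e)                      ; variable or constant
      ((eq? (car e) 'lambda)                   ; U( λx . e ) = [x]( U( e ) )
       (abstract (caadr e) (translate (caddr e))))
      (else                                    ; U( e1 e2 ) = U( e1 ) U( e2 )
       (list (translate (car e))
             (translate (cadr e)))))))
```

Applied to the example, (translate '(lambda (x) (lambda (y) (x y)))) produces ((S ((S (K S)) ((S (K K)) I))) (K I)), which is the fully parenthesized form of S (S (K S) (S (K K) I)) (K I).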
Reducing a Combinator Expression
  (S (S (K S) (S (K K) I)) (K I)) A B        ; S x y z = x z (y z)
= S (K S) (S (K K) I) A (K I A) B            ; S x y z = x z (y z)
= K S A (S (K K) I A) (K I A) B              ; K x y = x
= S (S (K K) I A) (K I A) B                  ; S x y z = x z (y z)
= S (K K) I A B ((K I A) B)                  ; S x y z = x z (y z)
= K K A (I A) B ((K I A) B)                  ; K x y = x
= K (I A) B ((K I A) B)                      ; K x y = x
= I A ((K I A) B)                            ; I x = x
= A ((K I A) B)                              ; K x y = x
= A (I B)                                    ; I x = x
= A B
( λx . λy . x y ) A B → A B
Data Abstraction
Overview:
  - Abstract data types
  - The procedure define-datatype
  - Abstract syntax
  - Representation strategies for data types
References:
  - Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
  - David A. Schmidt, “Denotational Semantics”, MIT Press, 1986
  - Carl A. Gunter, “Semantics of Programming Languages”, MIT Press, 1992
New Sets of Values
The definition of a new data type (i.e. a new set of values) consists of two ingredients:
  - Some set of values, called the interface, that serves as the representation of the newly defined data type, and
  - Some set of procedures, called the implementation, that provides the operations that can be used to manipulate entities of the newly defined data type.
Example:
  - <s-list> ::= ( {<symbol> . <s-list>}* )
  - (define up …), (define swapper …), (define flatten …)
Representation Independence
The representation of new data types can often be very complex.
When working with new data types, we usually do not want to be concerned with their actual representation. In fact, programs become more reliable and robust if they do not depend on the actual representation of a data type. Data types that do not expose their actual representation are called representation transparent.
Data types in C/C++ and Scheme are in general not representation transparent (e.g. the size of integers in C/C++ is platform dependent, boolean values in Scheme are represented by #t and #f).
Data types in Java are basically representation transparent (arrays are an exception, since they are represented by objects).
Opaque vs. Transparent Implementations
A data type is opaque if there is no way to find out its representation, even by printing.
Example:
;; initialize a location with
;; some value x
(define make-cell
  (lambda (x)
    (vector x)))
;; extract value from location
(define cell-ref
  (lambda (cell)
    (vector-ref cell 0)))
> (define my-cell (make-cell 342))
> (vector? my-cell)
#t
> my-cell
#(342)
> (cell-ref my-cell)
342
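The vector-based cell above is transparent: vector? and printing both reveal the representation. As a contrast, the following sketch hides the value inside a procedure. It is an illustration only, not the course's implementation; the names make-opaque-cell, opaque-cell-ref, and my-opaque-cell are made up here, and the result is only closer to opaque, since procedure? still reveals that the cell is a procedure.

```scheme
;; Store the value in the environment of a zero-argument procedure.
(define make-opaque-cell
  (lambda (x)
    (lambda () x)))

;; The only way to get the value back is to go through this access procedure.
(define opaque-cell-ref
  (lambda (cell)
    (cell)))

(define my-opaque-cell (make-opaque-cell 342))
```

With these definitions, (opaque-cell-ref my-opaque-cell) yields 342, while (vector? my-opaque-cell) yields #f.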
Pros & Cons
Opaque data types enforce the use of the defining procedures.
Opaque data types are more secure. Access to values of opaque data types is only possible by means of access procedures defined in an interface.
Transparent data types are easier to debug and to extend.
The fact that transparent data types expose their internal representation is also a disadvantage (limited security).
Abstract Data Type
The technique used to define new data types independently of their actual representation is called data abstraction.
Data abstraction divides data types into interfaces and implementations.
  - Interfaces are used to specify the set of values the data type represents, the operations that are available for that data type, and the properties these operations may be guaranteed to have.
  - Implementations provide a specific representation of the data and code for the operations.
A data type that has been defined in this way is called an abstract data type. A client (program) can use values of an abstract data type by means of the interface without knowing their actual representation (which can change over time). Data abstraction enforces representation independence.
Examples of Abstract Data Types
Files
Lists, hash tables, vectors, bags
Strings, records, arrays
Objects with private instance variables and public methods
Standardized integers (e.g. in Java the type int is a 32-bit two's-complement integer on every platform)
An Abstraction for Inductive Data Types
Data types can be defined inductively using a BNF-grammar.
Problem: What is a suitable representation of an inductively specified set of values?
Example: <bintree> ::= <number> | (<symbol> <bintree> <bintree>)
What should the interface for this data type look like?
Constructors and Access Procedures
In order to create, to manipulate, and to verify that a given value is of the desired data type, we need the following ingredients:
  Constructors that allow us to build values of a given data type,
  A predicate that tests whether a given value is a representation of a particular data type, and
  Some access procedures that allow us to extract particular information from a given representation of a data type.
Solution: a tool that provides a standard representation for inductively specified data types: define-datatype
Bintree
(define-datatype bintree bintree?
  (leaf-node (datum number?))
  (interior-node (key symbol?) (left bintree?) (right bintree?)))
This says that a bintree is either
  a leaf-node consisting of a number called datum, or
  an interior-node consisting of a key that is a symbol and two bintrees called left and right.
The Elements of define-datatype
The abstraction (define-datatype bintree bintree? ... ) defines:
  A representation for the data type bintree.
  A 1-argument constructor, leaf-node, to build a leaf-node. This procedure tests whether the argument is a number; if this test fails, an error is reported.
  A 3-argument constructor, interior-node, to build an interior-node. This procedure tests the first argument with symbol? and its second and third arguments with bintree? to ensure that the values are of an appropriate type.
  A 1-argument predicate, bintree?, that returns true (#t) if the passed argument is either a leaf-node or an interior-node. For all other arguments bintree? returns false (#f).
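To make these elements concrete, the following is a hand-written approximation of the interface in plain Scheme. It is only a sketch: the tagged-list layout, the shallow bintree? check, and the helper leaf-sum are assumptions for illustration and do not show how define-datatype is actually implemented.

```scheme
;; Hand-written approximation of the bintree interface using tagged lists.
;; bintree? only inspects the tag (a shallow check).
(define bintree?
  (lambda (v)
    (and (pair? v)
         (or (eq? (car v) 'leaf-node)
             (eq? (car v) 'interior-node)))))

(define leaf-node
  (lambda (datum)
    (if (number? datum)
        (list 'leaf-node datum)
        (error "leaf-node: bad datum" datum))))

(define interior-node
  (lambda (key left right)
    (if (and (symbol? key) (bintree? left) (bintree? right))
        (list 'interior-node key left right)
        (error "interior-node: bad field" key))))

;; Hypothetical client: sum of all leaves, dispatching on the variant tag.
(define leaf-sum
  (lambda (tree)
    (if (eq? (car tree) 'leaf-node)
        (cadr tree)
        (+ (leaf-sum (caddr tree))
           (leaf-sum (cadddr tree))))))

(leaf-sum (interior-node 'a
                         (leaf-node 1)
                         (interior-node 'b (leaf-node 2) (leaf-node 3)))) ; 6
```

With define-datatype the dispatch in leaf-sum would be written with the cases form instead of inspecting the tag by hand.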
Arrays, Records, and Unions
A data type that contains values of other types is called a composite or aggregate type.
Arrays and records are composite types:
C/C++:
struct { int f1; struct { int *a; int *b; } f2; char f3; } arecord;
int (* afp[100])(int, char);
A union type is one whose values are one or the other of multiple given types:
C/C++:
union { int f1; char f2; struct { int *a; int *b; } f3; } aunion;
Disjoint Union
A disjoint union (sum) type is a union type, with the exception that every value is annotated with the type the value comes from (see discriminated union type, EOPL2 page 44).
Scheme values belong to a disjoint (discriminated) union of all the primitive types provided by the Scheme implementation.
Inductively specified data types can be represented as a disjoint union of record types, called a variant record:
  C/C++: union { struct { char a; int b[7]; } f1; struct { int *a; int *b; } f2; }
  VB/Delphi/COM+: The OleVariant type represents variants that contain only COM-compatible types.
define-datatype
A define-datatype declaration, which can only appear at the top level of a program, has the following general form:
(define-datatype <type-name> <type-predicate-name>
  { ( <variant-name> { (<field-name> <predicate>) }* ) }* )
This abstraction creates a variant-record data type, named type-name. Each variant has a variant-name and zero or more fields, each with its own field-name and associated predicate.
Note: No two types may have the same name, and no two variants may have the same name.
Abstract Syntax
BNF-specifications are used to describe the concrete syntax, or external representation, of values.
Abstract syntax specifications are used to describe the internal representation of values.
In abstract syntax specifications terminal symbols disappear entirely.
The building blocks of abstract syntax specifications are tokens rather than terminals.
Unlike BNF-specifications, abstract syntax specifications are allowed to generate ambiguous syntax trees.
Simple Expressions
BNF-specification:
<expression> ::= <expression> (+|-) <term> | <term>
<term> ::= <term> (*|/) <number> | <number>
[Figure: parse tree for 4 * 2 + 1, built from <expression>, <term>, and <number> nodes]
Abstract Syntax of Simple Expressions
<expression> ::= <operator> <expression> <expression> | <number>
<operator> ::= + | - | * | /
Example: 4 * 2 + 1 is represented as (+ 1 (* 4 2))
Note: There is no syntactic sugar in the abstract syntax specification. A parser uses the concrete syntax (i.e., it generates a unique syntax tree for every input) and produces a syntax tree whose structure is given by the abstract syntax specification.
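Because the abstract syntax is just operator, operand, operand, a tree in this form can be evaluated by a short recursive procedure. The sketch below assumes the tree is written as a plain Scheme list; eval-simple is a hypothetical name.

```scheme
;; Evaluate an abstract syntax tree written as a Scheme list:
;; either a number or (operator expression expression).
(define eval-simple
  (lambda (exp)
    (if (number? exp)
        exp
        (let ((op (car exp))
              (left (eval-simple (cadr exp)))
              (right (eval-simple (caddr exp))))
          (cond ((eq? op '+) (+ left right))
                ((eq? op '-) (- left right))
                ((eq? op '*) (* left right))
                ((eq? op '/) (/ left right))
                (else (error "eval-simple: unknown operator" op)))))))

(eval-simple '(+ 1 (* 4 2))) ; 9
```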
Lambda Calculus Expressions
<expression> ::= <identifier> | (lambda (<identifier>) <expression>)
| (<expression> <expression>)
(define-datatype expression expression?
  (variable
    (id symbol?))
  (abstraction
    (id symbol?)
    (body expression?))
  (application
    (function expression?)
    (argument expression?)))
> (expression? (variable `x))
#t
> (abstraction `x (variable `x))
(abstraction x (variable x))
BNF vs. Abstract Syntax
<expression> ::= <identifier>
(variable (id symbol?))
::= (lambda (<identifier>) <expression>)
(abstraction (id symbol?) (body expression?))
::= (<expression> <expression>)
(application (function expression?) (argument expression?))
The Syntactic Form Cases
The form cases is used to determine the variant to which a given object of a data type belongs, and to extract its components.
The general syntax of cases is:
(cases <type-name> <expression> { (<variant-name> ( {<field-name>}* ) <consequent> ) }*
(else <default> ) )
is-abstraction?
(cases expression e
  …
  (abstraction (id body) <consequent>)
  …)
(define is-abstraction?
  (lambda (e)
    (and (list? e)
         (= (length e) 3)
         (eqv? `lambda (car e))
         (let ((arg (cadr e)))
           (and (list? arg)
                (= (length arg) 1)
                (is-variable? (car arg))))
         (is-expression? (caddr e)))))
free-variables
(define free-variables
  (lambda (expr)
    (cases expression expr
      (variable (id)
        (list id))
      (abstraction (id body)
        (difference (free-variables body) (list id)))
      (application (e1 e2)
        (union (free-variables e1) (free-variables e2))))))
> (free-variables (abstraction `x (application (variable `x) (variable `y))))
(y)
;; the corresponding clause when working on the concrete syntax
(define free-variables
  …
  ((is-abstraction? e)
   (difference (free-variables (caddr e)) (list (caadr e))))
  …)
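free-variables relies on the set operations union and difference. If they are not already available from earlier material, they could be sketched as follows, assuming that sets are represented as lists of symbols without duplicates.

```scheme
;; Assumed helpers: sets are lists of symbols without duplicates.
(define difference
  (lambda (s1 s2)
    (cond ((null? s1) '())
          ((memv (car s1) s2) (difference (cdr s1) s2))
          (else (cons (car s1) (difference (cdr s1) s2))))))

(define union
  (lambda (s1 s2)
    (cond ((null? s1) s2)
          ((memv (car s1) s2) (union (cdr s1) s2))
          (else (cons (car s1) (union (cdr s1) s2))))))

(difference '(x y) '(x)) ; (y)
(union '(x y) '(y z))    ; (x y z)
```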
Parse Expression
(define parse-expression
  (lambda (datum)
    (cond
      ((symbol? datum) (variable datum))
      ((pair? datum)
       (if (eqv? (car datum) 'lambda)
           (abstraction (caadr datum) (parse-expression (caddr datum)))
           (application (parse-expression (car datum))
                        (parse-expression (cadr datum)))))
      (else (eopl:error 'parse-expression "Invalid concrete syntax ~s" datum)))))
> (parse-expression `(lambda (x) (x y)))
(abstraction x (application (variable x) (variable y)))
WARNING: Accepts ill-formed expressions!
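One way to address the warning is to check the concrete syntax against the grammar before parsing. The predicate below is a sketch of such a check; well-formed? is a hypothetical name and not part of the course code.

```scheme
;; #t exactly when datum follows the lambda-calculus grammar:
;; <identifier> | (lambda (<identifier>) <expression>) | (<expression> <expression>)
(define well-formed?
  (lambda (datum)
    (cond ((symbol? datum) #t)
          ((and (list? datum) (= (length datum) 3) (eqv? (car datum) 'lambda))
           (let ((formals (cadr datum)))
             (and (list? formals)
                  (= (length formals) 1)
                  (symbol? (car formals))
                  (well-formed? (caddr datum)))))
          ((and (list? datum) (= (length datum) 2))
           (and (well-formed? (car datum))
                (well-formed? (cadr datum))))
          (else #f))))

(well-formed? '(lambda (x) (x y))) ; #t
(well-formed? '(lambda (x y) x))   ; #f, two formal parameters
(well-formed? '(f x y))            ; #f, an application has exactly two parts
```

A stricter parser could call such a check first and report an error for anything it rejects.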
Unparse Expression
(define unparse-expression
  (lambda (expr)
    (cases expression expr
      (variable (id) id)
      (abstraction (id body)
        (list 'lambda (list id) (unparse-expression body)))
      (application (function argument)
        (list (unparse-expression function)
              (unparse-expression argument))))))
> (unparse-expression (abstraction `x (application (variable `x) (variable `y))))
(lambda (x) (x y))
Representation Strategies for Data Types
[Figure: an abstract data type with two representation strategies, procedural representation and record-based representation]
Given an interface for a data type we can change the underlying representation if needed using different strategies.
Booleans
We can represent boolean values and operations that manipulate boolean values as functions:
TRUE ≡ (define my_true (lambda (x y) x))
FALSE ≡ (define my_false (lambda (x y) y))
not b ≡ (define my_not (lambda (b) (b my_false my_true)))
if b then x else y ≡ (define my_if (lambda (b x y) (b x y)))
Example: if TRUE then x else y ≡ (my_if my_true `x `y) = (my_true x y) = x
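The definitions can be tried directly at the Scheme prompt. The sketch below repeats them and adds my_and and my_or, which are not on the slide and are only included to show that other connectives follow the same pattern.

```scheme
(define my_true  (lambda (x y) x))
(define my_false (lambda (x y) y))
(define my_not   (lambda (b) (b my_false my_true)))
(define my_if    (lambda (b x y) (b x y)))

;; Additions in the same style (not from the slide).
(define my_and   (lambda (a b) (a b my_false)))
(define my_or    (lambda (a b) (a my_true b)))

(my_if (my_not my_false) 'yes 'no)          ; yes
(my_if (my_and my_true my_false) 'yes 'no)  ; no
(my_if (my_or my_false my_true) 'yes 'no)   ; yes
```

Note that my_if is an ordinary procedure, so Scheme evaluates both branches before one of them is selected, unlike the built-in if form.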
Pairs
Although tuples are not supported by the lambda calculus, they can easily be modeled as higher-order functions that “wrap” pairs of values. n-tuples can be modeled by composing pairs ...
pair ≡ (define PAIR (lambda (x y) (lambda (z) (z x y))))
first ≡ (define FIRST (lambda (p) (p TRUE)))
second ≡ (define SECOND (lambda (p) (p FALSE)))
> (define a-pair (PAIR 1 2))
> (FIRST a-pair)
1
> (SECOND a-pair)
2
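The remark about n-tuples can be illustrated with a 3-tuple built from nested pairs. In this sketch TRUE and FALSE are assumed to be the two-argument selector functions from the Booleans slide; TRIPLE and THIRD are hypothetical names.

```scheme
;; TRUE and FALSE are assumed to be the selector functions for booleans.
(define TRUE  (lambda (x y) x))
(define FALSE (lambda (x y) y))

(define PAIR   (lambda (x y) (lambda (z) (z x y))))
(define FIRST  (lambda (p) (p TRUE)))
(define SECOND (lambda (p) (p FALSE)))

;; A 3-tuple as a pair whose second component is again a pair.
(define TRIPLE (lambda (a b c) (PAIR a (PAIR b c))))
(define THIRD  (lambda (t) (SECOND (SECOND t))))

(define a-triple (TRIPLE 1 2 3))
(FIRST a-triple)           ; 1
(FIRST (SECOND a-triple))  ; 2
(THIRD a-triple)           ; 3
```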
Church Numbers
A number n is represented by a functional, which applies an argument function n times to another argument. The number zero (0) is represented by a functional that yields the identity function for its argument.
Define:
  n ≡ λs . λz . s^n z   (s applied n times to z)
  0 ≡ λs . λz . z
  succ ≡ λn . λs . λz . s (n s z)
  iszero ≡ λn . n (λx . FALSE) TRUE
  add ≡ λm . λn . m succ n
Then:
  1 = succ 0
    = (λn . λs . λz . s (n s z)) (λs . λz . z)
    → λs . λz . s ((λf . λx . x) s z)
    → λs . λz . s ((λx . x) z)
    → λs . λz . s z
Church Numbers in Scheme
0 ≡ (define NULL (lambda (s z) z))
succ ≡ (define SUCC (lambda (n) (lambda (s z) (s (n s z)))))
iszero ≡ (define ISZERO (lambda (n) (n (lambda (x) FALSE) TRUE)))
add ≡ (define ADD (lambda (m n) (m SUCC n)))
> (IF (ISZERO NULL) "is zero" "not zero")
"is zero"
> (IF (ISZERO (SUCC NULL)) "is zero" "not zero")
"not zero"
> (IF (ISZERO (ADD NULL NULL)) "is zero" "not zero")
"is zero"
> (IF (ISZERO (ADD NULL (SUCC NULL))) "is zero" "not zero")
"not zero"
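To inspect a Church numeral it helps to convert it to an ordinary Scheme number. The conversion procedures church->number and number->church below are hypothetical helpers, not part of the slides; the other definitions repeat the ones above.

```scheme
(define NULL (lambda (s z) z))
(define SUCC (lambda (n) (lambda (s z) (s (n s z)))))
(define ADD  (lambda (m n) (m SUCC n)))

;; Hypothetical helpers for inspecting numerals.
(define church->number
  (lambda (n) (n (lambda (k) (+ k 1)) 0)))

(define number->church
  (lambda (k)
    (if (= k 0)
        NULL
        (SUCC (number->church (- k 1))))))

(church->number (SUCC (SUCC NULL)))                           ; 2
(church->number (ADD (number->church 2) (number->church 3)))  ; 5
```

church->number works by instantiating s with "add one" and z with 0, which is exactly the reading of n as "apply s n times to z".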
Functional Sets
In order to build a set of elements of some data type we can use a property function f? where f? returns #t (true) if and only if the given argument satisfies the property defined by f?.
We write { x | f(x) }, called set builder, to define a set of some data type where all elements x satisfy property f?, that is
∀ x, f(x) = true.
In fact, the “characterizing” function f? yields true for elements of the set and it yields false for all other arguments.
Example: f? = isPrime, (f? 3) = #t, (f? 6) = #f
Note, a set builder is a function that uses a predicate to build a concrete set.
Representation of Functional Sets
(define make-set (lambda (pred) pred))
(define fs-union
  (lambda (fs1 fs2)
    (lambda (elem) (or (fs1 elem) (fs2 elem)))))
(define fs-intersection
  (lambda (fs1 fs2)
    (lambda (elem) (and (fs1 elem) (fs2 elem)))))
(define fs-difference
  (lambda (fs1 fs2)
    (lambda (elem) (and (fs1 elem) (not (fs2 elem))))))
;; the symmetric difference is itself a set, i.e. a predicate on elements
(define fs-symdiff
  (lambda (fs1 fs2)
    (fs-union (fs-difference fs1 fs2) (fs-difference fs2 fs1))))
(define is-fs-member? (lambda (elem fs) (fs elem)))
Application of Functional Sets
> (make-set number?)
#<procedure>
> (is-fs-member? 2 (make-set number?))
#t
> (is-fs-member? 2 (fs-intersection (make-set number?) (make-set (lambda (x) (= x 2)))))
#t
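A further sketch exercises the symmetric difference, assuming fs-symdiff is written so that it returns a set (a predicate). The sets evens and small are made-up examples.

```scheme
(define make-set (lambda (pred) pred))
(define is-fs-member? (lambda (elem fs) (fs elem)))
(define fs-union
  (lambda (fs1 fs2) (lambda (elem) (or (fs1 elem) (fs2 elem)))))
(define fs-difference
  (lambda (fs1 fs2) (lambda (elem) (and (fs1 elem) (not (fs2 elem))))))
;; symmetric difference: elements that are in exactly one of the two sets
(define fs-symdiff
  (lambda (fs1 fs2)
    (fs-union (fs-difference fs1 fs2) (fs-difference fs2 fs1))))

(define evens (make-set even?))
(define small (make-set (lambda (x) (< x 5))))

(is-fs-member? 2 (fs-symdiff evens small)) ; #f, 2 is in both sets
(is-fs-member? 3 (fs-symdiff evens small)) ; #t, 3 is only in small
(is-fs-member? 8 (fs-symdiff evens small)) ; #t, 8 is only in evens
```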
A Real Set
(define fs-filter
  (lambda (fs lst)
    (if (null? lst)
        `()
        (append (if (fs (car lst)) (list (car lst)) `())   ; merge lists
                (fs-filter fs (cdr lst))))))
> (fs-filter (make-set number?) `(1 a 3))
(1 3)
Record Representation of Booleans
(define-datatype my_bool my_bool?
  (my_true)
  (my_false))
(define my_not
  (lambda (b)
    (cases my_bool b
      (my_true () (my_false))
      (my_false () (my_true)))))
(define my_if
  (lambda (b x y)
    (cases my_bool b
      (my_true () x)
      (my_false () y))))
> (my_if (my_true) `x `y)
x
> (my_not (my_true))
(my_false)
A Data Type for Environments
An environment maps (free) symbols (of an expression) to values.
An environment is a function whose domain is the set of symbols, and whose codomain (range) is the set of all values.
In general, the environment function is a total function, since the domain of the function is restricted to free symbols of the corresponding expression, that is dom(f) = free-vars(e).
If we adopt the usual mathematical convention that a function is a set of ordered pairs, then we need to represent all sets of the form
{(s1,v1), …, (sn,vn)}
where all si are pairwise distinct symbols and vi are any values.
The Environment Interface
The interface for environments has three procedures:
(empty-env) = ∅
(apply-env f s) = f(s)
(extend-env `(s1 … sn) `(v1 … vn) f) = g, where
$$ g(s') = \begin{cases} v_i & \text{if } s' = s_i \text{ for some } i,\ 1 \le i \le n \\ f(s') & \text{otherwise} \end{cases} $$
Procedural Representation
(define empty-env
  (lambda ()
    (lambda (sym)
      (eopl:error 'apply-env "No binding for ~s" sym))))
(define extend-env
  (lambda (syms vals env)
    (lambda (sym)
      (let ((pos (list-find-position sym syms)))
        (if (number? pos)
            (list-ref vals pos)
            (apply-env env sym))))))
(define apply-env (lambda (env sym) (env sym)))
> (apply-env (extend-env `(a b c) `(1 2 3) (empty-env)) `c)
3
> (apply-env (extend-env `(a b c) `(1 2 3) (empty-env)) `d)
Error reported by apply-env:
No binding for d
Call Trace
> (apply-env (extend-env `(a b c) `(1 2 3) (empty-env)) `c)
| (empty-env)
| (extend-env (a b c) (1 2 3) (lambda (sym) (eopl:error …)))
| (apply-env (lambda (sym) (let ((pos (list-find-position sym (a b c)))) (if (number? pos) (list-ref (1 2 3) pos) (apply-env (lambda (sym) (eopl:error …)) sym)))) c)
| (let ((pos (list-find-position c (a b c))))         ; pos = 2
    (if (number? pos)                                  ; #t
        (list-ref (1 2 3) pos)                         ; (1 2 3)[2] = 3
        (apply-env (lambda (sym) (eopl:error …)) c)))
3
Helper
(define list-find-position
  (lambda (sym los)
    (list-index (lambda (sym1) (eqv? sym1 sym)) los)))
(define list-index
  (lambda (pred ls)
    (cond ((null? ls) #f)
          ((pred (car ls)) 0)
          (else (let ((list-index-r (list-index pred (cdr ls))))
                  (if (number? list-index-r)
                      (+ list-index-r 1)
                      #f))))))
BNF & Abstract Syntax Specification
<env-rep> ::= (empty-env)
(empty-env-record)
::= (extend-env ({<symbol>}*) ({<value>}*) <env-rep> )
(extended-env-record (syms (list-of symbol?)) (vals (list-of scheme-value?)) (env environment?))
The Environment Data Type
(define-datatype environment environment?
  (empty-env-record)
  (extended-env-record
    (syms (list-of symbol?))
    (vals (list-of scheme-value?))
    (env environment?)))
(define scheme-value? (lambda (v) #t))
> (extended-env-record `(a b c) `(1 2 3) (empty-env-record))
(extended-env-record (a b c) (1 2 3) (empty-env-record))
List-Of
(define list-of
  (lambda (pred)
    (lambda (val)
      (or (null? val)
          (and (pair? val)
               (pred (car val))
               ((list-of pred) (cdr val)))))))
> ((list-of number?) `(1 2 3 4))
#t
list-of is a procedure that, when applied to a predicate, yields a procedure as its value.
The Environment Operations
(define empty-env (lambda () (empty-env-record)))
(define extend-env
  (lambda (syms vals env)
    (extended-env-record syms vals env)))
(define apply-env
  (lambda (env sym)
    (cases environment env
      (empty-env-record ()
        (eopl:error 'apply-env "No binding for ~s" sym))
      (extended-env-record (syms vals env)
        (let ((pos (list-find-position sym syms)))
          (if (number? pos)
              (list-ref vals pos)
              (apply-env env sym)))))))
The procedures empty-env and extend-env are data type constructors.
The procedure apply-env is a data type observer.
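Because clients only use empty-env, extend-env, and apply-env, the representation can be swapped again without changing client code. As an illustration, here is a sketch of a third, hypothetical representation based on association lists; it is not part of the course material and uses error in place of eopl:error.

```scheme
;; Hypothetical third representation: an environment is an association list.
(define empty-env (lambda () '()))

(define extend-env
  (lambda (syms vals env)
    (if (null? syms)
        env
        (cons (cons (car syms) (car vals))
              (extend-env (cdr syms) (cdr vals) env)))))

(define apply-env
  (lambda (env sym)
    (let ((binding (assv sym env)))
      (if binding
          (cdr binding)
          (error "apply-env: no binding for" sym)))))

(define env1 (extend-env '(a b c) '(1 2 3) (empty-env)))
(define env2 (extend-env '(a) '(10) env1))
(apply-env env1 'c) ; 3
(apply-env env2 'a) ; 10, the newer binding shadows the older one
(apply-env env2 'b) ; 2
```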
Environment-Passing Interpreters
Overview:
  Semantics of fundamental programming language features
  The construction of an interpreter
  Scanning and parsing
  Local binding, closures, recursion, parameter-passing
References:
  Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
  Andrew W. Appel, “Modern Compiler Implementation in [C,Java,ML]”, Cambridge University Press, 1998
  Carl A. Gunter, “Semantics of Programming Languages”, MIT Press, 1992
A Language Interpreter
An interpreter consists of two parts:
  A front end that converts program text (a program in the source language) to an abstract syntax tree (the internal representation of the program text), and
  An evaluator (the actual interpreter) that looks at a data structure and performs some associated actions, which depend on the actual data structure. In the case of a language-processing system, the interpreter takes the abstract syntax tree and converts it, possibly using external inputs, to an answer.
Examples:
  A calculator
  Basic
  Perl, Python, sh, awk, Tcl
  JVM
Execution via Interpreter
[Figure: execution via interpreter. Labels: Program Text, Front End, Abstract Syntax Tree, Interpreter, Input, Output, read-eval-loop]
A Language Compiler
A compiler translates program text into some other language (the target language).
The building blocks of a compiler are:
  A front end that converts program text (a program in the source language) to an abstract syntax tree (the internal representation of the program text), and
  A set of independent compiler phases, each of which is assigned a particular task in the compilation process (e.g. semantic analysis, optimization, register allocation, code emission).
The evaluator of a compiled language may be an interpreter (e.g. JVM) or simply a hardware machine (e.g. von Neumann computer).
Examples of compiled languages:
  C/C++/C#
  Pascal, Java
  Haskell
Execution via Compiler
[Figure: execution via compiler. Labels: Program Text, Front End, Abstract Syntax Tree, Semantic Analysis, Optimization, Code Emission, Symbol Table, Machine Code, Abstract Machine or Hardware Machine, Input, Output; the phases are grouped into Analyzer Phases and Translator Phases]
Simple Interpreters
We have already developed interpreters for small languages:
  <list-of-numbers>
  <expression>
Each interpreter is a data-driven procedure that assigns an interpretation (meaning) to every element of the abstract syntax.
Examples of interpreters:
  occurs-free?, occurs-bound?
  parse-expression, unparse-expression
  alpha, substitution
Programming Language Values
In the specification of programming languages we always have at least two sets of values:
  Expressed values: values that can be specified (by means of expressions) in the given programming language. Examples: numbers, pairs, characters, strings
  Denoted values: values that are bound to variables. Examples: locations containing expressed values (Scheme)
Note: In general, a denotation assigns to a term (a symbol, string, or expression) in a language a precise mathematical meaning.
  The symbol “1” is assigned the meaning 1, the number 1.
  The expression “1 * v” is assigned the meaning (times 1 (loc v)) with “times” being the usual operation for multiplication, and “loc” being an environment function that maps “v” to a value defined in the environment.
Source, Host, and Target Language
The source language (or defined language) is the language in which we write programs that should be evaluated by an interpreter.
The host language (or defining language) is the language in which we specify the interpreter.
The target language is the language a source language is translated to by a compiler. A target language may be a higher-level programming language (e.g. C), assembly language, or machine language.
A First Interpreter
In a first language the set of expressed values is equal to the set of integers and the set of denoted values is the same as the set of expressed values:
Expressed Value = Number
Denoted Value = Number
Note: We always use an equational specification to define both the set of expressed values and the set of denoted values.
XML
What is XML:
  XML stands for EXtensible Markup Language.
  XML was designed to describe data.
  XML tags are not predefined. You must define your own tags.
XML is not a programming language:
  XML does not do anything.
  XML was not designed to do anything.
  XML was designed to structure data.
When should you use XML?
  When you need a buzzword in your resume.
An XML-based Programming Language
We use XML to define data:
  Numbers, strings, records, lists, and even expressions or whole programs can be considered as data.
Has XML been used to define a programming language before?
  Yes: eXtensible Stylesheet Language Transformations (XSLT)
Our goal: XMLScheme
  XMLScheme is an XML-based programming language whose semantics is given in Scheme.
  XMLScheme uses a strict order of tags to facilitate parsing.
A Small Language
<program> ::= <expression>
  a-program (exp)
<expression> ::= "<integer" "value" "=" <number> "/>"
  lit-exp (num)
::= "<reference" "value" "=" <identifier> "/>"
  var-exp (id)
::= "<" <prim-op> "<arguments" {<expression>}* "/>" "/>"
  primapp-exp (prim rands)
<prim-op> ::= "add" | "sub" | "mul" | "inc" | "dec"
Language Characteristics
A program is just an expression.
An expression is either a number, an identifier, or a primitive application consisting of a primitive operator and an arguments element that encloses a list of expressions.
Example:
<inc <arguments <add <arguments <integer value = 3 /> <reference value = x /> /> /> />/>
corresponds to (inc (add 3 x))
The Abstract Syntax
We use variant records to specify the abstract syntax:
(define-datatype program program?
  (a-program (exp expression?)))
(define-datatype expression expression?
  (lit-exp (num number?))
  (var-exp (id symbol?))
  (primapp-exp (prim prim-op?) (rands (list-of expression?))))
(define-datatype prim-op prim-op?
  (add-prim)
  (sub-prim)
  (mult-prim)
  (inc-prim)
  (dec-prim))
eval-program
(define eval-program
  (lambda (pgm)
    (cases program pgm
      (a-program (body)
        (eval-expression body (init-env))))))
The main procedure, eval-program, is passed an abstract syntax tree of a program and returns its value.
We use the rule “follow the grammar” to define all evaluation procedures.
We need to use “cases”, even though there is only one case.
The procedure eval-expression is passed a “suitable” environment that maps all free variables in the abstract syntax tree to denoted values.
eval-expression
(define eval-expression
  (lambda (exp env)
    (cases expression exp
      (lit-exp (datum) datum)
      (var-exp (id) (apply-env env id))
      (primapp-exp (prim rands)
        (let ((args (eval-rands rands env)))
          (apply-primitive prim args))))))
The procedure eval-expression takes an expression and an environment, and returns the denoted value of the expression using the environment to map all free variables to denoted values.
There are three cases: lit-exp, var-exp, and primapp-exp.
eval-rands
(define eval-rands
  (lambda (rands env)
    (map (lambda (x) (eval-rand x env)) rands)))
(define eval-rand
  (lambda (rand env)
    (eval-expression rand env)))
The procedure eval-rands applies the procedure (lambda (x) (eval-rand x env)) to each element of rands (a list of expressions), and returns a list of denoted values.
apply-primitive
(define apply-primitive
  (lambda (prim args)
    (cases prim-op prim
      (add-prim () (+ (car args) (cadr args)))
      (sub-prim () (- (car args) (cadr args)))
      (mult-prim () (* (car args) (cadr args)))
      (inc-prim () (+ (car args) 1))
      (dec-prim () (- (car args) 1)))))
The procedure apply-primitive takes a primitive operation and a list of denoted values and returns the value of applying the primitive operator to the given arguments.
The procedure apply-primitive does not need an environment, because all variable references have already been replaced with denoted values.
Comments
Our interpreter needs an initial (predefined) environment that maps all free variables of a program to denoted values:
(define init-env (lambda () (extend-env '(i v x) '(1 5 10) (empty-env))))
The procedure apply-primitive assigns a meaning to all operators. Moreover, this procedure maps all operators to their usual mathematical interpretation (unary inc and dec, binary +, -, and *). If we want to change the arity of the operators, we need to change apply-primitive.
The interpreter assigns an operational semantics to our language. The meaning of both the expressions and the operators is defined using Z, the integers.
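The pieces of the interpreter can be tried without the EOPL extensions by encoding the abstract syntax as tagged lists. The following is only a sketch under that assumption: the list encoding, the use of symbols for primitive operators, the association-list environment, and error in place of eopl:error are all choices made for this example.

```scheme
;; AST as tagged lists (an assumption; the slides use define-datatype):
;;   (lit-exp n) | (var-exp id) | (primapp-exp prim (rand ...))
(define init-env (lambda () '((i . 1) (v . 5) (x . 10))))

(define apply-env
  (lambda (env sym)
    (let ((binding (assv sym env)))
      (if binding
          (cdr binding)
          (error "No binding for" sym)))))

(define apply-primitive
  (lambda (prim args)
    (cond ((eq? prim 'add) (+ (car args) (cadr args)))
          ((eq? prim 'sub) (- (car args) (cadr args)))
          ((eq? prim 'mul) (* (car args) (cadr args)))
          ((eq? prim 'inc) (+ (car args) 1))
          ((eq? prim 'dec) (- (car args) 1))
          (else (error "Unknown primitive" prim)))))

(define eval-expression
  (lambda (exp env)
    (let ((tag (car exp)))
      (cond ((eq? tag 'lit-exp) (cadr exp))
            ((eq? tag 'var-exp) (apply-env env (cadr exp)))
            ((eq? tag 'primapp-exp)
             (apply-primitive
              (cadr exp)
              (map (lambda (rand) (eval-expression rand env))
                   (caddr exp))))
            (else (error "Unknown expression" exp))))))

;; (inc (add 3 x)) with x bound to 10
(eval-expression
 '(primapp-exp inc ((primapp-exp add ((lit-exp 3) (var-exp x)))))
 (init-env)) ; 14
```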
The Front End
The front end of an interpreter/compiler translates the program text into an abstract syntax tree.
As far as common programming languages are concerned, programs are just strings of characters.
The front end groups the characters of the program into meaningful units, which are called tokens.
The front end is usually divided into two stages:
  Scanning: The process of dividing a sequence of characters into words, numbers, punctuation, operators, comments, and the like. These units are called tokens.
  Parsing: The process of organizing the sequence of tokens into hierarchical syntactic structures such as expressions, statements, and blocks. The parser takes a sequence of tokens and produces an abstract syntax tree.
Lexical Analysis
Lexical analysis is in general not very complicated.
A programming language classifies lexical tokens into a finite set of token types: identifiers, numbers, punctuation, comments.
A language is a set of strings; a string is a finite sequence of symbols. The symbols themselves are taken from a finite alphabet (e.g. the ASCII character set).
We use regular expressions to specify the set of strings of a language:
  A symbol “a” in the alphabet is a regular expression and denotes just the string a.
  Epsilon (ε) is a regular expression; alternation (|), concatenation (.), not (¬), and repetition (*) applied to regular expressions yield regular expressions.
  There are no other forms of regular expressions.
Parsing
The definition of a parser can be a very complicated and tedious task.
Several different techniques exist to construct a parser:
  Table-based parsing
    LL(k)-parsing (top-down parsing)
    LR(k)-parsing (bottom-up parsing)
  Recursive-descent parsing
When defining a language, we use a context-free grammar (type 2 or BNF) to specify the building blocks of the language.
The grammar must not be ambiguous in order to define a parser (exceptions possible).
The standard approach to building a front end (which is also the easiest approach available) is to use scanner and parser generators (e.g. LEX, YACC, SLLGEN).
SLLGEN
SLLGEN stands for Scheme LL(1) parser GENerator.
This parser generator takes as input a lexical specification and a grammar, and produces as output a scanner and a parser for them.
SLLGEN operations:
  (sllgen:make-string-parser scanner-spec grammar) generates a parser.
  (sllgen:make-string-scanner scanner-spec grammar) generates a scanner (mainly used for debugging).
  (sllgen:make-define-datatypes scanner-spec grammar) generates each of the define-datatype expressions from the grammar for use by cases.
Scanner Specification in SLLGEN
<scanner-spec> ::= ( {<regexp-and-action>}* )
<regexp-and-action> ::= ( <name> ( {<regexp>}* ) <outcome> )
<name> ::= <symbol>
<regexp> ::= <string> | letter | digit | whitespace | any
         ::= (not <character> ) | (or {<regexp>}* )
         ::= (arbno <regexp> ) | (concat {<regexp>}* )
<outcome> ::= skip | symbol | number | string
Outcome:
  skip: This means this is the end of the token, but no token is emitted.
  symbol: The characters in the buffer are converted into a Scheme symbol.
  number: The characters in the buffer are converted into a Scheme number.
  string: The characters in the buffer are converted into a Scheme string.
Note: If there is a tie for the longest match between two regular expressions, string takes precedence over symbol.
Grammar Specification in SLLGEN
<grammar> ::= ( {<production>}* )
<production> ::= ( <lhs> ( {<rhs-item>}* ) <prod-name> )
<lhs> ::= <symbol>
<rhs-item> ::= <symbol> | <string>
           ::= (arbno {<rhs-item>}* )
           ::= (separated-list {<rhs-item>}* <string> )
<prod-name> ::= <symbol>
A grammar specification in SLLGEN must allow the parser to determine which production to use knowing only:
  What nonterminal it is looking for, and
  The first symbol (token) of the string being parsed.
LL(1)-Grammar
LL(1) means:
  We use only one lookahead symbol to determine which action is to be performed next.
  The leftmost symbols in the {<rhs-item>}* of all productions for the same nonterminal must be pairwise disjoint (FIRST(rule1) ∩ FIRST(rule2) = ∅).
  No production may apply itself left-recursively, either directly or indirectly. Example of a rule that violates the LL(1) restrictions:
    (term (term "+" number) sum-term)
SLLGEN produces a warning if the input grammar fails to meet any of these restrictions.
FIRST
FIRST(A) = {A}, if A is a terminal
FIRST(A) = {ε}, if A = ε
FIRST(A) = { a | A is a nonterminal with a production A ::= B1 B2 … Bn, and either
  a ∈ FIRST(Bi), a ≠ ε, for some i with 1 <= i <= n and ε ∈ FIRST(B1), …, FIRST(Bi-1), or
  a = ε and ε ∈ FIRST(B1), …, FIRST(Bn) }
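As a worked example, consider the three expression productions of the small XML-based language. Assuming that every quoted string in the grammar is scanned as a single token, each right-hand side starts with a terminal, so its FIRST set contains just that token:

```latex
\begin{align*}
\mathrm{FIRST}(\texttt{"<integer"} \ldots)   &= \{\texttt{"<integer"}\} \\
\mathrm{FIRST}(\texttt{"<reference"} \ldots) &= \{\texttt{"<reference"}\} \\
\mathrm{FIRST}(\texttt{"<"}\ \textit{prim-op} \ldots) &= \{\texttt{"<"}\}
\end{align*}
```

The three sets are pairwise disjoint, so one lookahead token is enough to choose among the productions for expression, as the LL(1) condition requires.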
The Scanner Specification
(define scanner-spec
  '((white-sp (whitespace) skip)
    (comment ("%" (arbno (not #\newline))) skip)
    (identifier (letter (arbno (or letter digit "?"))) symbol)
    (number (digit (arbno digit)) number)))
The Grammar Specification
(define grammar
  '((program (expression) a-program)
    (expression ("<integer" "value" "=" number "/>") lit-exp)
    (expression ("<reference" "value" "=" identifier "/>") var-exp)
    (expression ("<" prim-op "<arguments" (arbno expression) "/>" "/>") primapp-exp)
    (prim-op ("add") add-prim)
    (prim-op ("sub") sub-prim)
    (prim-op ("mul") mult-prim)
    (prim-op ("inc") inc-prim)
    (prim-op ("dec") dec-prim)))
The Interpreter
;; define datatypes here
;; build the scanner and parser
(define front-end (sllgen:make-string-parser scanner-spec grammar))
;; load the functional environment definition
(load "environment.scm")
;; define initial environment here
;; define interpreter
(define interpreter
  (lambda (string)
    (eval-program (front-end string))))
A-Read-Eval-Loop
(define read-eval-loop
  (sllgen:make-rep-loop
    "$ "
    eval-program
    (sllgen:make-stream-parser scanner-spec grammar)))
(sllgen:make-rep-loop prompt eval-fn stream-parser) takes a prompt string, a 1-argument procedure, and a stream parser, and produces a read-eval-print loop.
(sllgen:make-stream-parser scanner-spec grammar) generates a stream parser.
Example:
> (read-eval-loop)
$ <reference value = x />
10
$
Run-From-File
(define read-file
  (lambda (fname)
    (let* ((fp (open-input-file fname))
           (contents (read-source fp)))
      (close-input-port fp)
      contents)))
(define read-source
  (lambda (in-port)
    (let ((char (read-char in-port)))
      (if (eof-object? char)
          ""
          (string-append (string char) (read-source in-port))))))
(define run-from-file
  (lambda (fname)
    (interpreter (read-file fname))))
Language Extensions
To study the semantics and implementation of a wide range of programming language features, we add these features to our already defined language step by step.
For each feature, we
  add a production to the grammar,
  specify an abstract syntax for that production, and
  add an appropriate evaluation function (new procedure or cases clause) to handle the new language feature.
Conditional Evaluation
<expression> ::= "<if" "<condition" <expression> "/>" "<then" <expression> "/>" "<else" <expression> "/>" "/>"
  if-exp (test-exp then-exp else-exp)
We use the C/C++ style, that is, the number 0 means false, any other number means true.
(define is-true? (lambda (x) (not (zero? x))))
Evaluation of If-Then-Else
(define eval-expression
  (lambda (exp env)
    (cases expression exp
      (if-exp (test-exp then-exp else-exp)
        (if (is-true? (eval-expression test-exp env))
            (eval-expression then-exp env)
            (eval-expression else-exp env)))
      …)))
We use the Scheme if-form to define the meaning of if-then-else in our language. Therefore, our understanding of the defined language depends on our understanding of the defining language.
If-Then-Else Examples
<if <condition <sub <arguments <integer value = 3 /> <integer value = 3 /> /> /> /> <then <mul <arguments <integer value = 2 /> <integer value = 3 /> /> /> /> <else <add <arguments <integer value = 3 /> <integer value = 4 /> /> /> />/>
Result: 7 (the condition 3 - 3 evaluates to 0, which means false, so the else branch 3 + 4 is evaluated)
<if <condition <integer value = 3 /> /> <then <mul <arguments <integer value = 2 /> <integer value = 3 /> /> /> /> <else <add <arguments <integer value = 3 /> <integer value = 4 /> /> /> />/>
Result: 6 (the condition 3 is non-zero, which means true, so the then branch 2 * 3 is evaluated)
The Dangling If-Then-Else Conflict
The “dangling” if-then-else conflict is a problem in language design, where the grammar for the language contains alternative rules of the form:
A ::= B A D A
A ::= B A
C/C++, Java, and Pascal are languages whose grammars contain rules that generate the dangling if-then-else conflict.
Grammars with this conflict are not LL(1), that is, we cannot define a parser for them with the SLLGEN parser generator.
The Conflict Illustrated
<expression> ::= if <expression> then <expression> else <expression>
             ::= if <expression> then <expression>
How do we parse the following construct:
if 3 then if 5 then 8 else 9 ?
Multiple Parsing Strategies
Two parse trees are possible for the expression: if 3 then if 5 then 8 else 9
Parse 1: if 3 then <expression>, where <expression> is: if 5 then 8 else 9
Parse 2: if 3 then <expression> else 9, where <expression> is: if 5 then 8
Solution: The innermost else is associated with the innermost then.
Local Binding
We can create new variable bindings with a let-form.
<expression> ::= "<let" <declarations> <expression> "/>"
let-exp (decls body)
<declarations> ::= "<declarations" {"<declaration" "<variable" "value" "=" <identifier> "/>" <expression> "/>"}* "/>"
let-decls (ids rands)
Facts about Let-Bindings
The let-form introduces “named value abstractions”.
The scope of the variable bindings is the body of the let-form.
The entire let-form is an expression. Therefore, let-forms may be nested.
Abstract Syntax of the Let-Form
A variant-record for the let-form:
(let-exp (decls declarations?) (body expression?))
(define-datatype declarations declarations? (let-decls (ids (list-of symbol?)) (rands (list-of expression?))) )
Evaluation of the Let-Form
A new case in eval-expression:
(let-exp (decls body) (let ((args (eval-rands (get-rands decls) env)))
(eval-expression body (extend-env (get-ids decls) args env))))
We use the Scheme let-form to define the meaning of local binding. First, we evaluate all expressions that are to be bound to the newly introduced variables. Then, we extend the original environment with the new bindings and evaluate the body of the let-form in the extended environment.
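The two steps above (evaluate the operands, then extend the environment) can be sketched in plain Scheme. This is only an illustration: the names sketch-empty-env, sketch-extend-env, and sketch-apply-env are hypothetical stand-ins for the course's environment operations, and the environment is assumed to be a simple list of ribs.

```scheme
;; Hypothetical sketch: an environment is a list of ribs; each rib is (ids vals).
(define sketch-empty-env (lambda () '()))

(define sketch-extend-env
  (lambda (ids vals env)
    (cons (list ids vals) env)))

;; Search the innermost rib first, then the enclosing ribs.
;; Returns #f when no binding exists (the course interpreter reports an error).
(define sketch-apply-env
  (lambda (env id)
    (if (null? env)
        #f
        (let loop ((ids (car (car env)))
                   (vals (cadr (car env))))
          (cond ((null? ids) (sketch-apply-env (cdr env) id))
                ((eq? (car ids) id) (car vals))
                (else (loop (cdr ids) (cdr vals))))))))

;; let f = 4, t = 3 + 4 in f * t, evaluated on top of an outer binding v = 5
(define outer-env (sketch-extend-env '(v) '(5) (sketch-empty-env)))
(define let-env (sketch-extend-env '(f t) (list 4 (+ 3 4)) outer-env))
(define let-result
  (* (sketch-apply-env let-env 'f) (sketch-apply-env let-env 't)))
```

Here let-result is 28, and a lookup of v in let-env still finds the outer value 5, because the new rib only extends the original environment.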
Auxiliaries
(define get-rands (lambda (decls) (cases declarations decls (let-decls (ids rands) rands) ) ) )
(define get-ids (lambda (decls) (cases declarations decls (let-decls (ids rands) ids) ) ) )
A Let Example
<let <declarations <declaration <variable value = f /> <integer value = 4 /> /> <declaration <variable value = t /> <add <arguments <integer value = 3 /> <integer value = 4 /> /> /> /> /> <mul <arguments <reference value = f /> <reference value = t /> /> />/>
Result: 28
<let <declarations <declaration <variable value = f /> <integer value = 4 /> /> <declaration <variable value = t /> <add <arguments <integer value = 3 /> <integer value = 4 /> /> /> /> /> <mul <arguments <reference value = f /> <reference value = t /> <reference value = v /> /> />/>
Result: 140, given that (apply-env env 'v) = 5
Procedures
Procedures introduce “named expression abstractions”.
We represent procedures as first-class values in our language.
Expressed Value = Denoted Value = Number + ProcVal
ProcVal is the set of values representing procedures.
We need two new constructs: declaration of procedures and procedure calls.
Syntax for Procedures
<expression> ::= "<proc" <formals> <expression> "/>"
proc-exp (ids body)
::= "<invoke" <expression> <arguments> "/>"
app-exp (rator rands)
<arguments> ::= "<arguments" {<expression>}* "/>"
<formals> ::= "<params" {<param>}* "/>"
param-decls (ids)
Grammar Specification
(expression ("<proc" formals expression "/>") proc-exp)
(expression ("<invoke" expression
"<arguments" (arbno expression) "/>" "/>") app-exp)
(formals ("<params"
(arbno "<param" "value" "=" identifier "/>") "/>") param-decls)
Requirements for Procedure Application
When a procedure is applied, its body is evaluated in an environment that binds the formal parameters of the procedure to the arguments of the application.
Variables that occur free (references without declarations within the scope of the procedure) must obey the lexical binding rule, that is, they need to be defined in the enclosing region.
The mechanism that resolves all variable references at the time the procedure is created is called static scoping.
Example of Static Scoping
If f is called, its body should be evaluated in the following environment: (((y z) (2 28)) . (((x) (5)) . ()))
y = 2, z = 28, x = 5
<let <declarations <declaration <variable value = x /> <integer value = 5 /> /> /> <let <declarations <declaration <variable value = f /> <proc <params <param value = y /> <param value = z /> /> <add <arguments <reference value = y /> <sub <arguments <reference value = z /> <reference value = x /> /> /> /> /> /> /> <declaration <variable value = x /> <integer value = 28 /> /> /> <invoke <reference value = f /> <arguments <integer value = 2 /> <reference value = x /> /> /> />/>
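Since Scheme itself is statically scoped, the same program can be tried directly in Scheme. The sketch below is a hand translation of the XMLScheme example (the name static-result is only for illustration).

```scheme
;; The x inside f refers to the outer binding x = 5, because that is the
;; binding visible where f is created. The inner x = 28 is only used as
;; the second argument of the call.
(define static-result
  (let ((x 5))
    (let ((f (lambda (y z) (+ y (- z x))))
          (x 28))
      (f 2 x))))
```

With y = 2, z = 28, and x = 5 the body computes 2 + (28 - 5), so static-result is 25.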
Nested Procedures
(let ((cadd (lambda (n) (let ((h (lambda (m) (+ n m)))) h))) (twice (lambda (f) (let ((g (lambda (x) (f (f x)))))
g)))) (let
((seventeen ((twice (cadd 5)) 7)) (addTwentyFour (twice (twice (cadd 6))))) (addTwentyFour seventeen)))
Both cadd and twice return a procedure!
(cadd 4) ⇒ (lambda (m) (+ 4 m))
(twice cadd) ⇒ (lambda (x) (cadd (cadd x)))
The program evaluates to 41.
Nested Procedures
In languages without nested functions (such as C), the runtime representation of a function value can be the address of the machine code for that function. This address can be passed as an argument, stored in a variable, and so on.
But this does not work for nested procedures; if we represent the procedure h by an address, in what outer frame can it access the variable n? Similarly, how does the procedure g access the variable f?
Closures
A closure is a package that contains the procedure body, the list of all formal parameters, and the bindings of its free variables.
(define closure (lambda (ids body env) (lambda (args) (eval-expression body (extend-env ids args env)))))
In general, it is convenient to store the entire creation environment of a procedure, rather than just the bindings of the free variables.
The Representation of ProcVal
We define an abstract data type for ProcVal:
(define-datatype procval procval? (closure (ids (list-of symbol?)) (body expression?) (env environment?)))
Procedure Call Evaluation
(define apply-procval
  (lambda (proc args)
    (cases procval proc                ; Is proc a procedure?
      (closure (ids body env)
        ;; Evaluate the procedure body using the creation environment.
        (eval-expression body (extend-env ids args env))))))
Extensions of eval-expression
(define eval-expression (lambda (exp env)
(cases expression exp … (proc-exp (params body)
(closure (get-ids params) body env)) … )))
The body of the procedure is not yet evaluated. We memorize only the creation environment!
Get-Ids
(define-datatype pdeclarations pdeclarations? (param-decls (ids (list-of symbol?))) )
(define get-ids (lambda (decls) (cond ((pdeclarations? decls) (cases pdeclarations decls (param-decls (ids) ids))) ) ) )
Extract parameter name list:
Evaluation of Procedure Call
(define eval-expression (lambda (exp env)
(cases expression exp
  …
  (app-exp (rator rands)
    ;; call-by-value: operator and operands are evaluated before the call
    (let ((proc (eval-expression rator env))
          (args (eval-rands rands env)))
      (if (procval? proc)
          (apply-procval proc args)
          ;; like Scheme's "x is not a procedure" error
          (eopl:error 'eval-expression
                      "Attempt to apply non-procedure ~s" proc))))
  … )))
Dynamic Scoping
Dynamic scoping means that the procedure body is evaluated in an environment obtained by extending the environment at the point of the procedure call.
(let ((a 3)) (let ((p (lambda (x) (+ x a)))
(a 5)) (* a (p 2))))
Under dynamic scoping: (* 5 (+ 2 5)) = (* 5 7) = 35
How can we implement dynamic scoping?
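For contrast, the same expression can be run in Scheme itself, which uses static scoping. The sketch below only wraps the expression in a definition (lexical-result is an illustrative name).

```scheme
;; Under static scoping the a inside p is the outer a = 3,
;; while the a in the body of the inner let is 5.
(define lexical-result
  (let ((a 3))
    (let ((p (lambda (x) (+ x a)))
          (a 5))
      (* a (p 2)))))
```

Here (p 2) is (+ 2 3) = 5, so lexical-result is (* 5 5) = 25, not the 35 obtained under dynamic scoping.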
Dynamic Scoping
(define apply-procval (lambda (proc args calling-env) (cases procval proc (closure (ids body env) (eval-expression body (extend-env ids args (extend-env2 calling-env env)))))))
(app-exp (rator rands)
  (let ((proc (eval-expression rator env))
        (args (eval-rands rands env)))
    (if (procval? proc)
        ;; pass the call environment, so that apply-procval can extend
        ;; the closure environment with it
        (apply-procval proc args env)
        (eopl:error 'eval-expression
                    "Attempt to apply non-procedure ~s" proc))))
extend-env2
(define extend-env2 (lambda (calling-env creation-env) (cases environment calling-env (empty-env-rec () creation-env) (extend-env-rec (syms vals old-env) (extend-env syms vals (extend-env2 old-env creation-env))) ) ) )
Dynamic Scoping Example
<let <declarations <declaration <variable value = a /> <integer value = 3 /> /> /> <let <declarations <declaration <variable value = p /> <proc <params <param value = x /> /> <add <arguments <reference value = x /> <reference value = a /> /> /> /> /> <declaration <variable value = a /> <integer value = 5 /> /> /> <mul <arguments <reference value = a /> <invoke <reference value = p /> <arguments <integer value = 2 /> /> /> /> /> /> />
Split Programs
XMLScheme is a very verbose language.
To facilitate the writing of programs, XMLScheme should also support a mechanism that allows programmers to divide a given program into different compilation units.
Languages like C/C++, Scheme, and HTML support source code inclusions. In XMLScheme we will adopt this approach and introduce a new form of expression: hyper-references.
Hyper-References
<expression> ::= "<href" "value" "=" <identifier> "/>"
extern-exp (unit)
A hyper-reference is an expression that is defined in a separate compilation unit.
Hyper-references may also be nested.
The compilation unit is loaded when a hyper-reference is evaluated.
Hyper-Reference Evaluation
(extern-exp (unit-name) (let ((contents (read-file (string-append (symbol->string unit-name) ".xml")))) (eval-expression (expression-parser contents) env)))
Build filename
Load expression definition
Call expression parser
Expression Parser
We define a new grammar for expressions.
The root symbol in the new grammar is <expression>.
The new grammar will contain only rules that are reachable from the root symbol <expression>.
(define expression-parser (sllgen:make-string-parser scanner-spec expression-spec))
Hyper-Reference Example
h.xml:
<let <declarations <declaration <variable value = h /> <proc <params <param value = m /> /> <add <arguments <reference value = n /> <reference value = m /> /> /> /> /> /> <reference value = h />/>
Use of h.xml through a hyper-reference:
… <declaration <variable value = cadd /> <proc <params <param value = n /> /> <href value = h /> /> />…
Recursion
Most programming languages support the definition of recursive abstractions: records (to construct linked lists), procedures (to implement inductively specified data types), and mutually dependent data structures like classes.
Recursion is a challenging mechanism and may often lead to complications in program understanding.
We will study only one form of recursive definition here: letrec, the recursive procedure definition à la Scheme.
A Recursive Problem
Suppose we want to define the operation plus using only the operators increment and decrement.
We may write:
$ let plus = proc (n, m) if n then (plus dec(n) inc(m)) else m in (plus 2 3)
Error reported by apply-env: No binding for plus
Unfortunately this is not a definition, since we are trying to use “plus” before it is defined.
Task: Although recursion is fundamental to programming, it is not yet primitive in our language, so we must find a way to “program” it!
Recursive Functions As Fixed Points
However, we can obtain a closed expression by abstracting over plus:
rplus = proc (plus, n, m) if n then (plus dec(n) inc(m)) else m
Now, let “fplus” be the actual addition function we want. We must pass it to “rplus” as a parameter before we can perform any additions.
(rplus fplus) is the function we want. In other words, we are looking for an fplus such that:
rplus fplus ↔ fplus
That is, we are searching for a fixed point of “rplus”.
Fixed Points
In general, a fixed point of a function is a value in the function’s domain that is mapped to itself by the function. Therefore, a fixed point of a function f is a value p such that (f p) = p.
Examples:
(factorial 1) = 1
(factorial 2) = 2
(fibonacci 0) = 0
(fibonacci 1) = 1
However, not all functions have exactly one fixed point: “inc(n) = n + 1” has none.
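The definition (f p) = p can be checked mechanically for numeric functions. The following is a small sketch; fixed-point? is a hypothetical helper name, and factorial is the usual recursive definition.

```scheme
(define factorial
  (lambda (n)
    (if (zero? n) 1 (* n (factorial (- n 1))))))

;; p is a fixed point of f when applying f to p gives p back.
(define fixed-point?
  (lambda (f p)
    (= (f p) p)))

(define fact-1-fixed (fixed-point? factorial 1))   ; (factorial 1) = 1
(define fact-2-fixed (fixed-point? factorial 2))   ; (factorial 2) = 2
(define fact-3-fixed (fixed-point? factorial 3))   ; (factorial 3) = 6
(define inc-7-fixed  (fixed-point? (lambda (n) (+ n 1)) 7))
```

The first two checks yield #t and the last two yield #f, which matches the examples: factorial has the fixed points 1 and 2, while inc has no fixed point at all.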
We need to represent the fixed-point operation in our language.
Fixed-Point Theorem
Fixed-point Theorem: For every F there exists a fixed-point X such that F X ↔ X.
Proof: Let
Y ≡ λf . (λx . f (x x)) (λx . f (x x))
Now consider:
X ≡ Y F
→ (λx . F (x x)) (λx . F (x x))
→ F ((λx . F (x x)) (λx . F (x x)))
↔ F X
Therefore, the “Y combinator” can always be used to find a fixed-point of an arbitrary lambda expression, if such a fixed-point exists.
Unfolding Recursive Lambda Expressions
rplus = proc (plus, n, m) if n then (plus dec(n) inc(m)) else m
plus is a fixed point of rplus. By the fixed point theorem, we can take:
plus ≡ Y rplus
plus 1 1
= (Y rplus) 1 1
→ rplus plus 1 1
→ if 1 then (plus (pred 1) (succ 1)) else 1
→ (plus (pred 1) (succ 1))
→ (rplus plus (pred 1) (succ 1))
→ if (pred 1) then (plus (pred (pred 1)) (succ (succ 1))) else (succ 1)
→ if 0 then (plus (pred (pred 1)) (succ (succ 1))) else (succ 1)
→ (succ 1)
→ 2
Strict Fixed-Point Operator
The fixed-point operator Y is useless in a call-by-value setting, since the expression Y g diverges for any g. In call-by-value settings we therefore use the operator fix:
fix ≡ λf . (λx . f (λy . x x y)) (λx . f (λy . x x y))
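Because Scheme is a call-by-value language, the operator fix can be written down and tried directly. The sketch below is one possible transcription, under two assumptions: the inner λy is generalized to a variadic lambda so that the two-argument plus can be used unchanged, and rplus takes plus as a separate (curried) first argument.

```scheme
;; fix ≡ λf . (λx . f (λy . x x y)) (λx . f (λy . x x y))
;; The (lambda args ...) wrapper delays the self-application (x x)
;; until the recursive procedure is actually called.
(define fix
  (lambda (f)
    ((lambda (x) (f (lambda args (apply (x x) args))))
     (lambda (x) (f (lambda args (apply (x x) args)))))))

;; rplus abstracts over the recursive reference to plus.
(define rplus
  (lambda (plus)
    (lambda (n m)
      (if (zero? n)
          m
          (plus (- n 1) (+ m 1))))))

(define plus (fix rplus))
(define plus-2-3 (plus 2 3))
```

Here plus-2-3 is 5. Replacing the wrapper by a direct (x x) gives the Y combinator, whose application would not terminate in Scheme.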
Unfolding Recursive Lambda Expressions II
rplus = proc (plus, n, m) if n then (plus dec(n) inc(m)) else m
We can take: plus ≡ fix rplus
plus 1 1
= (fix rplus) 1 1
→ (h h) 1 1, where h = (λx . rplus (λy . x x y))
→ rplus fct 1 1, where fct = λy . h h y
→ if 1 then (fct 0 2) else 1
→ fct 0 2
→ h h 0 2
→ rplus fct 0 2
→ if 0 then (fct (pred 0) (succ 2)) else 2
→ 2
(call-by-value)
Syntax for Mutually Recursive Definitions
<expression> ::= "<reclet" <declarations> <expression> "/>"
reclet-exp ( decls body)
The syntax for mutually recursive definitions is like the let-syntax.
However, the evaluation of mutually recursive definitions requires the application of a corresponding fixed-point operator.
Applications of a Recursive Definition
<reclet <declarations <declaration <variable value = even /> <href value = even /> /> <declaration <variable value = odd /> <href value = odd /> /> /> <invoke <reference value = odd /> <arguments <integer value = 13 />
/> />/>
odd ∈ fv(even) and even ∈ fv(odd)
Even & Odd
even:
<proc <params <param value = x/> /> <if <condition <reference value = x /> /> <then <invoke <reference value = odd /> <arguments <dec <arguments <reference value = x /> /> /> /> /> /> <else <integer value = 1 /> /> />/>
odd:
<proc <params <param value = x/> /> <if <condition <reference value = x /> /> <then <invoke <reference value = even /> <arguments <dec <arguments <reference value = x /> /> /> /> /> /> <else <integer value = 0 /> /> />/>
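For comparison, the same pair of mutually recursive procedures can be written with Scheme's letrec. This is a hand translation that keeps the 0/1 encoding of truth values (the name odd-13 is only for illustration).

```scheme
;; even returns 1 for even numbers, odd returns 1 for odd numbers.
;; Each procedure refers to the other, so a plain let would not work here.
(define odd-13
  (letrec ((even (lambda (x) (if (zero? x) 1 (odd (- x 1)))))
           (odd  (lambda (x) (if (zero? x) 0 (even (- x 1))))))
    (odd 13)))
```

Since 13 is odd, odd-13 is 1.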
Building the Fixed-Point
Fixed-point semantics:
Let e’ be (extend-env-recursively proc-names bodies e). Then:
If name is one of the names in proc-names, then
(apply-env e’ name) = (closure ids body e’),
where ids and body are the formal arguments and the body of the recursive procedure, respectively.
If not, then (apply-env e’ name) = (apply-env e name).
Evaluation of reclet
…(reclet-exp (decls body)
(let* ((args (eval-rands (get-rands decls) env)) ;; filter recursive procedure ids (rec-proc-ids (map car (filter (lambda (p) (procval? (cadr p))) (zip (get-ids decls) args)))) ;; now change closure to closure-rec (new-args (map (lambda (v) (if (procval? v) (build-rec-proc v rec-proc-ids) v)) args)))
(eval-expression body (extend-env (get-ids decls) new-args env))))…
Filter
(define filter (lambda (p lst) (if (null? lst) '() (if (p (car lst)) (cons (car lst) (filter p (cdr lst))) (filter p (cdr lst)) ) ) ) )
> (filter odd? '(1 2 3 4 5 6 7 8))
'(1 3 5 7)
(define odd? (lambda (n) (= (modulo n 2) 1)))
build-rec-proc
(define build-rec-proc (lambda (v rec-ids) (cases procval v (closure (ids body env) (closure-rec ids body rec-ids env)) (else v) ;; we should never reach this case ) ) )
(define-datatype procval procval? (closure (ids (list-of symbol?)) (body expression?) (env environment?)) (closure-rec (ids (list-of symbol?)) (body expression?) (rec-proc-ids (list-of symbol?)) (env environment?)) )
New Approach to Call Procedures
(app-exp (rator rands) (let ((proc (eval-expression rator env))
(args (eval-rands rands env))) (if (procval? proc) ;; add calling-env to resolve occurring recursive procedures
(apply-procval proc args env) (eopl:error 'eval-expression
"Attempt to apply non-procedure ~s" proc))))
New apply-procval
(define apply-procval (lambda (proc args calling-env) (cases procval proc (closure (ids body env) (eval-expression body (extend-env ids args env))) (closure-rec (ids body rec-proc-ids env) (eval-expression body (extend-env ids args (extend-rec-env rec-proc-ids calling-env env)))) ) ) )
extend-rec-env
(define extend-rec-env (lambda (rec-proc-ids calling-env creation-env) (if (not (null? rec-proc-ids)) (extend-env rec-proc-ids (map (lambda (id) (apply-env calling-env id)) rec-proc-ids) creation-env) creation-env ) ) )
Build fixed-point for all recursive procedures
Expression Sequences
Command expression:
<expression> ::= "<sequence" {<expression>}+ "/>"
Kleene plus: A+ = A A*
<expression> ::= "<sequence" <expression> {<expression>}* "/>"
seq-exp (exp exps)
Implementation of Sequencing
(define eval-expression (lambda (exp env) (cases expression exp … (seq-exp (exp exps) (eval-exp-sequence exp exps env)) …)))
Sequence Evaluation
(define eval-exp-sequence (lambda (exp exps env) ; while loop (let continue ((res (eval-expression exp env)) (tail exps)) (if (null? tail) res (continue (eval-expression (car tail) env) (cdr tail)) ) ) ) )
The named let is a Scheme loop; it returns the value of the last expression.
Control Context
The standard recursive implementation of factorial uses a call to procedure factorial as an operand, which requires the creation of a control context:
(define factorial (lambda (n) (if (zero? n) 1 (* n (factorial (- n 1))) ) ))
(factorial 6)
| (factorial 5)
| |(factorial 4)
| | (factorial 3)
| | |(factorial 2)
| | | (factorial 1)
| | | |(factorial 0)
| | | |1
| | | 1
| | |2
| | 6
| |24
| 120
720
New context required
Tail Form
A procedure call that does not grow control context is the same as a jump. Such a procedure call is said to be a tail call.
Iterative factorial:
(define factorial (lambda (n) (let fact-iter ((rest n) (res 1)) (if (zero? rest)
res (fact-iter (- rest 1) (* res rest))) ) ) )
(factorial 6)
| (fact-iter 6 1)
| |(fact-iter 5 6)
| | (fact-iter 4 30)
| | |(fact-iter 3 120)
| | | (fact-iter 2 360)
| | | |(fact-iter 1 720)
| | | | (fact-iter 0 720)
| | | | 720
| | | |720
| | | 720
| | |720
| | 720
| |720
| 720
720
Sequencing Example
<sequence <add <arguments <integer value = 2 /> <integer value = 3 /> /> /> <mul <arguments <integer value = 2 /> <integer value = 3 /> /> />/>
6
New Operators
<prim-op> ::= "add" | "sub" | "mul" | "inc" | "dec" |"div" | "equal" | "less" | "greater" |"not" | "and" | "or"
(define-datatype primitive primitive? (add-prim) (sub-prim) (mult-prim) (inc-prim) (dec-prim) (div-prim) (equal-prim) (less-prim) (greater-prim)
(not-prim) (and-prim) (or-prim))
New Operator Evaluation Approach
(add-prim () (eval (append '(+) args)))
(sub-prim () (eval (append '(-) args)))
(mult-prim () (eval (append '(*) args)))
(inc-prim () (car (reverse (map (lambda (n) (+ n 1)) args)))) ; return last expression
(dec-prim () (car (reverse (map (lambda (n) (- n 1)) args)))) ; return last expression
(div-prim () (eval (append '(/) args)))
Evaluation of Relational Operators
(equal-prim () (let cont ((tail (cdr args))) (if (null? tail) 1 ; identity ==> true (if (eqv? (car args) (car tail)) (cont (cdr tail)) 0 ; not all elements are equal ) ) ))
(less-prim () (b->n (eval (append '(<) args))))
(greater-prim () (b->n (eval (append '(>) args))))
(define b->n (lambda (b) (if b 1 0)))
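For operators that are ordinary Scheme procedures, such as +, -, and <, a possible alternative to building an expression for eval is the standard procedure apply. The sketch below only illustrates this option; the helper names are hypothetical and it is not the approach used in the slides. Note that and and or are special forms in Scheme, so apply cannot be used for them.

```scheme
;; Apply a variadic Scheme procedure to an argument list directly.
(define add-all  (lambda (args) (apply + args)))
(define sub-all  (lambda (args) (apply - args)))

;; Relational operators yield Scheme booleans, which are mapped to 1 or 0.
(define less-all (lambda (args) (if (apply < args) 1 0)))

(define sum-example  (add-all '(1 2 3)))
(define diff-example (sub-all '(10 3 2)))
(define less-example (less-all '(1 2 3)))
```

Here sum-example is 6, diff-example is 5, and less-example is 1.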
Evaluation of Boolean Operators
(not-prim () (b->n (car (reverse (map (lambda (v) (if (number? v) (not (is-true? v)) (not v))) args)))))
(and-prim () (b->n (eval (append '(and) (map is-true? args)))))
(or-prim () (b->n (eval (append '(or) (map is-true? args)))))
(define b->n (lambda (b) (if b 1 0)))
Equal & Or
<equal <arguments <integer value = 3 /> <dec <arguments <integer value = 4 /> /> /> <inc <arguments <integer value = 2 /> /> /> <add <arguments <integer value = 2 /> <integer value = 1 /> /> /> /> />
<or <arguments <integer value = 3 /> <dec <arguments <integer value = 4 /> /> /> <inc <arguments <integer value = 2 /> /> /> <add <arguments <integer value = 2 /> <integer value = 1 /> /> /> /> />
Loop Expression
A loop expression provides a general looping construct similar to the for statement in C/C++ or Java.
<expression> ::= "<loop" [ <declarations> ] [ "<conditions" {<expression>}+ "/>" ] [ "<increments" {<expression>}+ "/>" ] <expression> "/>"
loop-exp (decls conds incrs body)
Loop Syntax
(opt-declarations () empty-decl-list)
(opt-declarations (declarations) decl-list)
(opt-conditions () empty-exp-list)
(opt-conditions ("<conditions" expression (arbno expression) "/>") exp-list)
(opt-increments () empty-exp-list)
(opt-increments ("<increments" expression (arbno expression) "/>") exp-list)
(expression ("<loop" opt-declarations opt-conditions opt-increments expression "/>") loop-exp)
Loop Evaluation
(loop-exp (decls conds incrs body) (let ((new-decls (eval-declaration-list decls env)) (conditions (make-exp-list conds)) (increments (make-exp-list incrs))) ;; check for correct increments arity (if (= (length new-decls) (length increments)) (eval-loop new-decls conditions increments body env) ;; arity error (eopl:error 'eval-expression "Arity mismatch in loop increments" ) ) ))
make-exp-list
(define make-exp-list (lambda (exps) (cases expression-list exps (empty-exp-list () '()) (exp-list (exp tail) (cons exp tail)) ) ) )
Evaluate Loop Body
(define eval-loop (lambda (loop-decls conditions increments body env) (let ((p (unzip loop-decls)) ; map conditions to or-prim (loop-test (primapp-exp (or-prim) (car conditions) (cdr conditions)))) (let continue ((new-env (extend-env (car p) (cadr p) env)) ; loop env (res 0)) ; res #f ;; do loop (if (is-true? (eval-expression loop-test new-env)) (let ((new-res (eval-expression body new-env)) ; eval body => res (step-res (map (lambda (e) (eval-expression e new-env)) increments))) ;; next iteration (continue (extend-env (car p) step-res env) new-res) ) res ) ) ) ) )
A Loop Example
<loop <declarations <declaration <variable value = i /> <integer value = 5 /> /> <declaration <variable value = j /> <integer value = 6 /> /> <declaration <variable value = k /> <integer value = 7 /> /> /> <conditions <greater <arguments <reference value = i /> <integer value = 0 /> /> /> <greater <arguments <reference value = j /> <integer value = 0 /> /> /> <greater <arguments <reference value = k /> <integer value = 0 /> /> /> /> <increments <dec <arguments <reference value = i /> /> /> <dec <arguments <reference value = j /> /> /> <dec <arguments <reference value = k /> /> /> /> <add <arguments <reference value = i /> <reference value = j /> <reference value = k /> /> />/>
[i j k]: (5 6 7) (4 5 6) (3 4 5) (2 3 4) (1 2 3) (0 1 2) (-1 0 1) (-2 -1 0)
Result: 0
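The behaviour of this loop can be mirrored with Scheme's do form. This is only a sketch of the same computation, assuming (as in eval-loop) that the conditions are combined with or and that all increments are computed from the values of the current iteration.

```scheme
;; i, j, k are stepped in parallel; res holds the value of the body
;; (+ i j k) from the most recent iteration. The loop stops as soon as
;; none of the three conditions holds.
(define loop-result
  (do ((i 5 (- i 1))
       (j 6 (- j 1))
       (k 7 (- k 1))
       (res 0 (+ i j k)))
      ((not (or (> i 0) (> j 0) (> k 0))) res)))
```

The last iteration runs with (i j k) = (-1 0 1), whose sum is 0, so loop-result is 0.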
Variable Assignment
In a language that supports variable assignment, every identifier denotes the address of a mutable location in memory.
This address is called a reference, and it is the contents of the reference that is modified by a variable assignment.
References or locations are called L-values, which reflects their association with variables appearing on the left-hand side of assignment statements.
Analogously, expressed values, such as the values of the right-hand side expressions of assignment statements, are called R-values.
Interpreter Values
Denoted Values = Ref(Expressed Values)
Expressed Values = Number + ProcVal
Syntax for Variable Assignment
<expression> ::= "<set" "<variable" "value" "=" <identifier> "/>" <expression> "/>"
assign-exp (id exp)
variant-record type: (assign-exp (id symbol?) (exp expression?))
Semantics of Assignment
What is the difference between assignment and binding? A binding creates an immutable association of a name with a value. An assignment changes the value of an existing binding.
Variable assignment enables the sharing of values between different parts of a program (e.g., procedures).
Variable assignment is not transparent, i.e., a change of the value of a variable by an assignment is seen by all parts of the program that refer to the variable.
If a language supports variable assignment, then procedures do not, in general, have referential transparency.
Sharing
<let <declarations <declaration <variable value = x /> <integer value = 0 /> /> <declaration <variable value = zero? /> <href value = zero /> /> /> <reclet <declarations <declaration <variable value = even /> <href value = evensharing /> /> <declaration <variable value = odd /> <href value = oddsharing /> /> /> <sequence <set <variable value = x /> <integer value = 13 /> /> <invoke <reference value = odd /> <arguments /> /> /> />/>
oddsharing
<proc <params /> <if <condition <invoke <reference value = zero? /> <arguments <reference value = x /> /> /> /> <then <integer value = 0 /> /> <else <sequence <set <variable value = x /> <dec <arguments <reference value = x /> /> /> /> <invoke <reference value = even /> <arguments /> /> /> /> />/>
evensharing
<proc <params /> <if <condition <invoke <reference value = zero? /> <arguments <reference value = x /> /> /> /> <then <integer value = 1 /> /> <else <sequence <set <variable value = x /> <dec <arguments <reference value = x /> /> /> /> <invoke <reference value = odd /> <arguments /> /> /> /> />/>
Private State
(let ((g (let ((count 0)) (lambda () (begin (set! count (+ count 1)) count))))) (+ (g) (g)))
The procedure g maintains a private variable count that stores the number of times g has been called, so this program evaluates to 3.
Call-By-Value
Every time a procedure is called, we can create a new reference for each formal parameter, a policy called call-by-value.
(let ((x 100)) (let ((p (lambda (x) (begin (set! x (+ x 1)) x)))) (+ (p x) (p x))))
This program evaluates to 202, because a new reference is created for x at each of the procedure calls.
At each procedure call, the assignment affects only the inner binding.
The Reference Data Type
References can be represented by indices of a vector:
(define-datatype reference reference? (a-ref (position number?) (vec vector?)))
We need two operations: deref, to access the value stored in a location, and setref!, to set the value in a location.
deref & setref!
(define deref
  (lambda (ref)
    (cases reference ref
      (a-ref (pos vec) (vector-ref vec pos)))))
(define setref!
  (lambda (ref value)
    (cases reference ref
      (a-ref (pos vec) (vector-set! vec pos value)))))
We use the Scheme vector procedures!
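The idea can be tried without define-datatype. In the sketch below a reference is assumed to be a pair of a position and a vector; make-ref, ref-deref, and ref-setref! are hypothetical stand-ins for a-ref, deref, and setref!.

```scheme
;; A reference is modelled as the pair (position . vector).
(define make-ref
  (lambda (pos vec) (cons pos vec)))

;; Read the value stored at the referenced location.
(define ref-deref
  (lambda (ref) (vector-ref (cdr ref) (car ref))))

;; Overwrite the value stored at the referenced location.
(define ref-setref!
  (lambda (ref value) (vector-set! (cdr ref) (car ref) value)))

(define store (vector 10 20 30))
(define r (make-ref 1 store))
(define before (ref-deref r))
(ref-setref! r 99)
(define after (ref-deref r))
```

Here before is 20 and after is 99. The assignment changes the vector itself, so every reference to position 1 of store sees the new value.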
New Environment Data Type
(define-datatype environment environment?
  (empty-env-rec)
  (extended-env-rec
    (syms (list-of symbol?))
    (vec vector?)
    (env environment?)))
(define empty-env
  (lambda ()
    (empty-env-rec)))
A New Environment Representation
(define extend-env (lambda (syms vals env) (extended-env-rec syms (list->vector vals) env)))
(define apply-env (lambda (env sym) (deref (apply-env-ref env sym))))
(define apply-env-ref (lambda (env sym) (cases environment env (empty-env-rec () (eopl:error 'apply-env-ref "No binding for ~s" sym)) (extended-env-rec (syms vals old-env) (let ((pos (list-find-position sym syms))) (if (number? pos) (a-ref pos vals) (apply-env-ref old-env sym)))))))
Notes: extend-env converts a list to a vector, apply-env returns a value, and apply-env-ref returns a reference.
Implementation of Variable Assignment
(define eval-expression (lambda (exp env) (cases expression exp …
(assign-exp (id r-value) (let ((val (eval-expression r-value env))) (setref! (apply-env-ref env id) val) val)) …)))
We need to return a value, because the return value of setref! is unspecified.
Analysis of Solution
The new environment definition immediately provides a mechanism for call-by-value (parameters are elements of a vector).
<let <declarations <declaration <variable value = x /> <integer value = 100 /> /> /> <let <declarations <declaration <variable value = p /> <proc <params <param value = x /> /> <sequence <set <variable value = x /> <inc <arguments <reference value = x /> /> /> /> <reference value = x /> /> /> /> /> <add <arguments <invoke <reference value = p /> <arguments <reference value = x /> /> /> <invoke <reference value = p /> <arguments <reference value = x /> /> /> /> /> />/> 202
Parameter-Passing Variations
Call-by-value is the most commonly used form of parameter passing, and is the standard against which other parameter-passing mechanisms are usually compared.
(let ((a 3) (p (lambda (x) (set! x 4)))) (begin (p a) a))
Under call-by-value semantics, the denoted value associated with “x” is a reference that initially contains the same value as the reference associated with “a”, but these references are distinct. Therefore, any assignment to “x” has no effect on the contents of “a”.
Result: 3
Call-By-Reference
The isolation between the caller and the callee, as in call-by-value, is generally desirable.
But it is also valuable to allow a procedure to be passed variables with the expectation that they will be assigned by the procedure. In particular, we may want to use this approach when the procedure returns multiple values.
This parameter-passing mechanism is called “call-by-reference”.
Semantics of Call-By-Reference
If an operand is a variable reference, then a reference to the variable’s location is passed. The formal parameter of the procedure is then bound to this location.
If the operand is some other kind of expression, then the formal parameter is bound to a new location containing the value of the operand, just as in call-by-value.
The Procedure swap
(let ((a 3)
      (b 4)
      (swap (lambda (x y)
              (let ((temp x))
                (begin
                  (set! x y)
                  (set! y temp))))))
  (begin
    (swap a b)
    (- a b)))
Under call-by-reference, this swaps the values of “a” and “b”, so it returns 1.
Under call-by-value, this program returns -1, because the assignments inside the swap procedure have no effect on the variables “a” and “b”.
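Scheme itself passes parameters by value, so the call-by-reference result can only be imitated there by passing explicit locations. The sketch below uses one-element vectors as mutable cells; make-cell, cell-ref, cell-set!, and swap! are hypothetical helper names, not part of the course interpreter.

```scheme
;; A cell is a one-element vector, i.e. an explicit mutable location.
(define make-cell (lambda (v) (vector v)))
(define cell-ref  (lambda (c) (vector-ref c 0)))
(define cell-set! (lambda (c v) (vector-set! c 0 v)))

;; swap! receives the locations themselves, so its assignments are
;; visible to the caller, as under call-by-reference.
(define swap!
  (lambda (x y)
    (let ((temp (cell-ref x)))
      (cell-set! x (cell-ref y))
      (cell-set! y temp))))

(define a (make-cell 3))
(define b (make-cell 4))
(swap! a b)
(define swap-result (- (cell-ref a) (cell-ref b)))
```

After the call, a holds 4 and b holds 3, so swap-result is 1, the value the slide gives for call-by-reference.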
Expressed Values and Denoted Values
Under call-by-reference, identifiers still denote references to expressed values, just as they did under call-by-value:
Denoted Value = Ref(Expressed Value)
Expressed Value = Number + ProcVal
The only change occurs when new references are created. Under call-by-value, a new reference is created for every evaluation of an operand. Under call-by-reference, a new reference is created for every evaluation of an operand other than a variable.
A Problem
In our approach, call-by-value creates a new location for every operand in a procedure application. We have put the values of all the operands in a vector, and have “apply-env-ref” create a reference to the location at variable-lookup time.
Under call-by-reference, we will need a new location for some operands and not for others. So, we need a different representation for references.
A reference will be, as before, a reference to a location within a vector. But the vector, instead of containing only expressed values, will contain either expressed values or references to expressed values. We call these two kinds of targets direct targets and indirect targets, respectively.
Indirect targets are ref-2 pointers.
Targets
(define-datatype target target?
  (direct-target
    (expval expval?))
  (indirect-target
    (ref ref-to-direct-target?)))
expval? & ref-to-direct-target?
(define expval?
  (lambda (val)
    (or (number? val) (procval? val))))

(define ref-to-direct-target?
  (lambda (val)
    (and (reference? val)
         (cases reference val
           (a-ref (pos vec)
             (cases target (vector-ref vec pos)
               (direct-target (v) #t)
               (indirect-target (v) #f)))))))
deref
(define primitive-deref deref)

(define deref
  (lambda (ref)
    (cases target (primitive-deref ref)
      (direct-target (expval) expval)
      (indirect-target (ref-ref)          ; ref-2 pointer
        (cases target (primitive-deref ref-ref)
          (direct-target (expval) expval)
          (indirect-target (p)
            (eopl:error 'deref "Illegal reference: ~s" ref-ref)))))))
setref!
(define primitive-setref! setref!)

(define setref!
  (lambda (ref expval)
    (let ((target-ref
            (cases target (primitive-deref ref)
              (direct-target (aval) ref)          ; use new reference
              (indirect-target (aref) aref))))    ; use old reference
      (primitive-setref! target-ref (direct-target expval)))))
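The interplay of direct and indirect targets can be tried out without the EOPL datatypes. A minimal sketch, assuming a reference is a (vector . position) pair and a target is a tagged list; these representations and the names make-ref, direct?, target-payload, caller and callee are illustrative only, and the sketch omits the "Illegal reference" check for doubly indirect targets:

```scheme
;; A reference is a (vector . position) pair; a target is a tagged list.
;; Both representations are assumptions standing in for the EOPL datatypes.
(define (make-ref vec pos) (cons vec pos))
(define (direct-target v) (list 'direct v))
(define (indirect-target ref) (list 'indirect ref))
(define (direct? t) (eq? (car t) 'direct))
(define (target-payload t) (cadr t))

(define (primitive-deref ref) (vector-ref (car ref) (cdr ref)))
(define (primitive-setref! ref t) (vector-set! (car ref) (cdr ref) t))

;; Follow at most one level of indirection to reach an expressed value.
(define (deref ref)
  (let ((t (primitive-deref ref)))
    (if (direct? t)
        (target-payload t)
        (target-payload (primitive-deref (target-payload t))))))

;; Write through an indirect target into the location it points to.
(define (setref! ref v)
  (let* ((t (primitive-deref ref))
         (target-ref (if (direct? t) ref (target-payload t))))
    (primitive-setref! target-ref (direct-target v))))

;; The caller frame holds v = 7; the callee frame binds y by reference to v.
(define caller (vector (direct-target 7)))
(define callee (vector (indirect-target (make-ref caller 0))))

(setref! (make-ref callee 0) 13)   ; y := 13
(deref (make-ref caller 0))        ; => 13, the caller's v sees the assignment
(deref (make-ref callee 0))        ; => 13
```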
Environments in call-by-reference
Pseudo code:
(proc (&t, &u, &v, &w)
   (proc (&a, &b)
      (proc (&x, &y, &z) y := 13; a b 6)
    3 v)
 5 6 7 8)
Environments (innermost first; "->" marks an indirect target):
(x y z)    #[ -> a, -> v, 6 ]
(a b)      #[ 3, -> v ]
(t u v w)  #[ 5, 6, 7, 8 ]
Both b and y point to the location denoted by v.
Specification of call-by-reference
We add different parameter modes:
<formals> ::= "<params" {"<param" <formal-mode> "=" <identifier> "/>"}* "/>"
param-decls (modes ids)
<formal-mode> ::= "value" | "byref"
mode-value | mode-byref
Swap
<let
  <declarations
    <declaration <variable value = a /> <integer value = 3 /> />
    <declaration <variable value = b /> <integer value = 4 /> />
    <declaration <variable value = swap />
      <proc
        <params <param byref = x /> <param byref = y /> />
        <let
          <declarations
            <declaration <variable value = temp /> <reference value = x /> />
          />
          <sequence
            <set <variable value = x /> <reference value = y /> />
            <set <variable value = y /> <reference value = temp /> />
          />
        />
      />
    />
  />
  <sequence
    <invoke <reference value = swap />
      <arguments <reference value = a /> <reference value = b /> />
    />
    <sub <arguments <reference value = a /> <reference value = b /> /> />
  />
/>
Result: 1
Implementation of call-by-reference
In order to implement call-by-reference we have to analyze each place where sub-expressions are evaluated:
- Primitive application expression
- Let expression
- Reclet expression
- For expression
- Application expression
Primitive Application Expression
We do not need to change the behavior of primitive application expressions, since we require them to yield a pure expressed value:
eval-expression:
(primapp-exp (prim rands)
  (let ((args (eval-prim-rands rands env)))
    (apply-primitive prim args)))

(define eval-prim-rands
  (lambda (rands env)
    (map (lambda (x) (eval-prim-rand x env)) rands)))

(define eval-prim-rand
  (lambda (rand env)
    (eval-expression rand env)))
Let Expression
For let-bound variables, we choose to retain the call-by-value semantics:
eval-expression:
(let-exp (ids rands body)
  (let ((args (eval-let-rands rands env)))
    (eval-expression body (extend-env ids args env))))

(define eval-let-rands
  (lambda (rands env)
    (map (lambda (x) (eval-let-rand x env)) rands)))

(define eval-let-rand
  (lambda (rand env)
    (direct-target (eval-expression rand env))))
Reclet Expression
For reclet-bound variables, we use the same approach:
…
(reclet-exp (decls body)
  (let* (;; build expressed values first
         (args (eval-rands (get-rands decls) env))
         ;; filter recursive procedure ids
         (rec-proc-ids
           (map car
                (filter (lambda (p) (procval? (cadr p)))
                        (zip (get-ids decls) args))))
         ;; now change closure to closure-rec
         (new-args
           (map (lambda (v)
                  (if (procval? v) (build-rec-proc v rec-proc-ids) v))
                args))
         ;; reclet: binder is direct-target
         (new-d-args (map (lambda (v) (direct-target v)) new-args)))
    (eval-expression body
                     (extend-env (get-ids decls) new-d-args env))))
…
For Expression
(define eval-loop
  (lambda (loop-decls conditions increments body env)
    (let ((p (unzip loop-decls))
          (loop-test (primapp-exp (or-prim) (car conditions) (cdr conditions))))
      (let continue
          ;; loop variables denote direct targets
          ((new-env (extend-env (car p) (map direct-target (cadr p)) env))
           (res 0))
        (if (is-true? (eval-expression loop-test new-env))
            (let ((new-res (eval-expression body new-env))
                  (step-res (map (lambda (e) (eval-expression e new-env))
                                 increments)))
              ;; next iteration, direct targets
              (continue (extend-env (car p) (map direct-target step-res) env)
                        new-res))
            res)))))
Application Expression
For procedure applications we need to analyze the parameter modes:
…
(app-exp (rator rands)
  ;; check procval first in order to extract parameter modes
  (let ((proc (eval-expression rator env)))
    (if (procval? proc)
        ;; we need parameter modes here
        (let ((args (eval-proc-rands (get-param-modes (closure-params proc))
                                     rands env)))
          ;; add calling-env to resolve occurring recursive procedures
          (apply-procval proc args env))
        (eopl:error 'eval-expression
                    "Attempt to apply non-procedure ~s" proc))))
…
eval-proc-rands
(define eval-proc-rands
  (lambda (modes rands env)
    (if (= (length rands) (length modes))
        (map (lambda (p) (eval-proc-rand p env))
             (zip modes rands))
        (eopl:error 'eval-proc-rands "Parameter mismatch"))))
eval-proc-rand
(define eval-proc-rand
  (lambda (argument env)
    (cases pmode (car argument)
      (mode-value ()
        (direct-target (eval-expression (cadr argument) env)))
      (mode-byref ()
        (cases expression (cadr argument)
          (var-exp (id)
            ;; build new reference
            (indirect-target
              (let ((ref (apply-env-ref env id)))
                (cases target (primitive-deref ref)
                  (direct-target (val) ref)            ; return new reference
                  (indirect-target (a-ref) a-ref)))))  ; return old reference
          (else
            (direct-target (eval-expression (cadr argument) env))))))))
Mixing Parameter Modes
<let
  <declarations
    <declaration <variable value = a /> <integer value = 3 /> />
    <declaration <variable value = b /> <integer value = 4 /> />
    <declaration <variable value = swap />
      <proc
        <params <param value = x /> <param byref = y /> />
        <let
          <declarations
            <declaration <variable value = temp /> <reference value = x /> />
          />
          <sequence
            <set <variable value = x /> <reference value = y /> />
            <set <variable value = y /> <reference value = temp /> />
          />
        />
      />
    />
  />
  <sequence
    <invoke <reference value = swap />
      <arguments <reference value = a /> <reference value = b /> />
    />
    <sub <arguments <reference value = a /> <reference value = b /> /> />
  />
/>
Result: 0 (only y is passed by reference, so b receives the old value of a while a itself is unchanged)
Object-Oriented Extensions
The object-oriented programming paradigm enables us to associate functions with data more directly.
To add objects to XMLScheme, we need to add a new form of value to the set of expressed values:
Expressed Value = Number + ProcVal + ObjectVal
Object Values
(define-datatype objectval objectval?
  (object
    (ivars environment?)
    (methods environment?)
    (image environment?)))
Object-Oriented Language Extensions
<expression> ::= "<class" {<i-vars>}* {<methods>}* "/>"
<expression> ::= "<send" "<message" "name" "=" <identifier> "/>" <expression> <arguments> "/>"
<i-vars> ::= "<variables" {<i-var>}* "/>"
<i-var> ::= "<instance" "<variable" "value" "=" <identifier> "/>" <expression> "/>"
<methods> ::= "<methods" {<method>}* "/>"
<method> ::= "<method" "<name" "value" "=" <identifier> "/>" <formals> <expression> "/>"
Example
<let
  <declarations
    <declaration <variable value = listobj />
      <invoke <reference value = List />
        <arguments <integer value = 1 /> <integer value = 0 /> />
      />
    />
  />
  <sequence
    <send <message name = consI /> <reference value = listobj />
      <arguments <integer value = 3 /> <integer value = 0 /> />
    />
    <send <message name = hd /> <reference value = listobj /> <arguments /> />
  />
/>
Evaluation of Classes
(class-exp (params i-decls m-decls)
  (let ((vtable (evaluate-methods m-decls env)))
    (closure-class params i-decls (car vtable) (cadr vtable) env)))
A class is encoded as a function.
Classes As Procedures
(define-datatype procval procval?
  …
  (closure-class
    (params pdeclarations?)
    (i-decls i-vars?)
    (mids (list-of symbol?))
    (m-decls (list-of target?))
    (env environment?)))
closured.scm
(closure-class (params i-decls mids mprocs env) …
Class List
<class
  <params <param value = car /> <param value = cdr /> />
  <instances
    <instance <variable value = head /> <reference value = car /> />
    <instance <variable value = tail /> <reference value = cdr /> />
  />
  <methods
    <method <name value = cons />
      <params <param value = car /> <param value = cdr /> />
      <sequence
        <set <variable value = head /> <reference value = car /> />
        <set <variable value = tail /> <reference value = cdr /> />
      />
    />
    <method <name value = hd /> <params /> <reference value = head /> />
    <method <name value = tl /> <params /> <reference value = tail /> />
    <method <name value = consI />
      <params <param value = car /> <param value = cdr /> />
      <send <message name = cons /> <reference value = self />
        <arguments <reference value = car /> <reference value = cdr /> />
      />
    />
  />
/>
Evaluation of Methods
(send-exp (mid obj rands)
  (let (;; evaluate object
        (receiver (eval-expression obj env)))
    (if (objectval? receiver)
        (let* (;; find method
               (mproc (lookup-method receiver mid))
               ;; evaluate arguments
               (args (eval-proc-rands (get-param-modes (closure-params mproc))
                                      rands env)))
          ;; call method using 'self' as calling env
          (apply-method mproc args (self receiver)))
        (eopl:error 'eval-expression "Receiver is not an object"))))
Method Lookup & Self
(define lookup-method
  (lambda (obj mid)
    (cases objectval obj
      (object (ivars methods image)
        (apply-env methods mid)))))

(define self
  (lambda (obj)
    (cases objectval obj
      (object (ivars methods image)
        image))))
apply-method
(define apply-method
  (lambda (mproc args object-env)
    (cases procval mproc
      (closure (params body env)
        (eval-expression body
          (extend-env (get-param-ids params) args
                      (link-env object-env env))))
      (else (eopl:error 'apply-method "Illegal method call")))))
Add receiver object to calling environment of method.
Type Systems
Overview
- What is a Type?
- Static vs. Dynamic Typing
- Kinds of Typing
- Polymorphic types
- Overloading
References
- Daniel P. Friedman et al., “Essentials of Programming Languages”, Second Edition, MIT Press, 2001
- David Watt, “Programming Language Concepts and Paradigms”, Prentice Hall, 1990
- Luca Cardelli and Peter Wegner, “On Understanding Types, Data Abstraction, and Polymorphism”, ACM Computing Surveys, 17/4, Dec. 1985, pp. 471-522
What Is a Type?
Type errors:
> (+ 5 '())
Error in +: () is not a number.
Type (debug) to enter the debugger.
A type is a set of values:
- Integer = {…, -2, -1, 0, 1, 2, …}
- Boolean = {True, False}
- Point = { (x y) | x, y ∈ Integer }
A type is a partial specification of behavior:
- n, m ∈ Integer ⇒ (+ n m) is valid, but (not n) is an error
Static Typing
Values have static types defined by the programming language.
A language is statically typed if it is always possible to determine the (static) type of an expression based on the program text alone.
Dynamic Typing
Variables and expressions have dynamic types determined by the values they assume at run-time.
A language is dynamically typed if only values have a fixed type. Variables and parameters may take on different types at run-time, and must be checked immediately before they are used.
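Scheme itself illustrates this. A small sketch (the names x, first-type, second-type and safe-add1 are illustrative; error is assumed to be available as in most Scheme implementations and stands in for eopl:error):

```scheme
;; The same variable may hold values of different types over time.
(define x 42)
(define first-type (if (number? x) 'number 'other))
(set! x "forty-two")
(define second-type (if (string? x) 'string 'other))

;; A run-time check performed immediately before the value is used.
(define (safe-add1 v)
  (if (number? v)
      (+ v 1)
      (error 'safe-add1 "~s is not a number" v)))

(list first-type second-type (safe-add1 41))  ; => (number string 42)
```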
Type Consistency
A language is strongly typed if it is possible to ensure that every expression is type consistent based on the program text alone.
Type consistency may be assured by compile-time type-checking, type inference, or dynamic type-checking.
Kinds of Types
All programming languages provide some set of built-in types.
Most strongly-typed modern languages provide for additional user-defined types:
- Primitive types: Booleans, Integers, Floats, Chars, ...
- Composite types: Functions, Lists, Tuples, ...
- User-defined types: Enumerations, Recursive Types, Generic Types, ...
The Type Completeness Principle (Watt):
No operation should be arbitrarily restricted in the types of values involved.
Types in Scheme
Scheme is a dynamically typed language. However, no object satisfies more than one of the following predicates:
boolean?   pair?   symbol?
number?    char?   string?
vector?    port?   procedure?
These predicates define the types Boolean*, Pair, Symbol, Number, Character, String, Vector, Port, and Procedure.
* All values in Scheme count as true except ‘#f’.
The empty list is a special object of its own type; it satisfies none of the type predicates.
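Because the predicates are disjoint, a value can be mapped to a single type name. A minimal sketch (type-of is a hypothetical helper, not a standard procedure; ports are left to the else branch to keep the sketch portable):

```scheme
;; Map a value to the name of the unique type predicate it satisfies.
(define (type-of v)
  (cond ((boolean? v)   'boolean)
        ((pair? v)      'pair)
        ((symbol? v)    'symbol)
        ((number? v)    'number)
        ((char? v)      'char)
        ((string? v)    'string)
        ((vector? v)    'vector)
        ((procedure? v) 'procedure)
        ((null? v)      'empty-list)   ; satisfies none of the predicates above
        (else           'other)))      ; e.g. ports

(type-of 42)       ; => number
(type-of '(1 2))   ; => pair
(type-of '())      ; => empty-list
(type-of car)      ; => procedure
```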
A Language for Scheme Types
<Type> ::= bool | int | symbol | char | string
         | <Identifier>
         | ( <Type> )                    lists
         | ( {<Type>}*(+) )              tuples
         | ( {<Type>}*(*) -> <Type> )    functions
<Typed Expression> ::= ( <Expression> <Type> )
This type language is used solely to illustrate a possible approach to adding type assignments to Scheme.
Function Types
Function types allow one to deduce the types of expressions without the need to evaluate them:
(define increment (lambda ((x int)) (+ x 1)))
(+ ((int * int) -> int))    (1 int)
(increment (int -> int))    (42 int)    ⇒    ((increment 42) int)
type binding
List and Tuple Types
List types:
- A list of values of type “a” has the type “(a)”: ((1 2 3) (int))
- Note: All elements in a list must have the same type! ("Hello world!" 2 #f) is illegal: it cannot be typed.
Tuple types:
- If the expressions x1, x2, …, xn have types a1, a2, …, an respectively, then the tuple (x1 x2 … xn) has type (a1 + a2 + … + an):
((1 (2) 3) (int + (int) + int))
(("Hello world!" #f) (string + bool))
(((1 2) (3 4)) ((int + int) + (int + int)))
Polymorphism
Languages like Pascal have monomorphic type systems: every constant, variable, parameter and function result has a unique type. Such languages hinder the definition of generic abstractions, if they allow it at all.
Modern languages also incorporate (universally quantified) polymorphic types. Polymorphic type expressions describe families of types. For example, “(∀ a) (a)” is the family of types consisting of, for every type “a”, the type of lists of “a”.
Scheme also allows the definition of expressions that can be assigned universally quantified polymorphic types. In a type language, polymorphic types are represented by type variables.
Polymorphic Types
We can deduce the types of expressions using polymorphic functions by simply binding type variables to concrete types:
Consider:
(length ((a) -> int))
(string-length (string -> int))
(map ((a -> b) -> (a) -> (b)))
Then:
((map string-length) ((string) -> (int)))
(("Hello" "World") (string))
((map string-length '("Hello" "World" "!")) (int))
Note: The Scheme version of map does not allow the curried form (map string-length)!
Kinds of Polymorphism
Universal:
- Parametric: polymorphic map function in Scheme, “void *” in C, “Object” in Java
- Inclusion: subtyping, as with graphic objects
Ad Hoc:
- Overloading: The operator + applies to both integers and floating point numbers.
- Coercion: Integer values can be used where floating point numbers are expected and vice versa.
Coercion or Overloading
How does one distinguish?
3 + 4
3.0 + 4
3 + 4.0
3.0 + 4.0
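In Scheme the four sums can be inspected directly. The sketch below only shows how the results differ in exactness; it does not by itself decide whether an implementation should be described as coercing or overloading (the name results is illustrative):

```scheme
;; Exactness reveals which numeric representation each sum produced.
(define results
  (list (+ 3 4) (+ 3.0 4) (+ 3 4.0) (+ 3.0 4.0)))

(map exact? results)   ; => (#t #f #f #f)
```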
The Typed Lambda Calculus
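There are many variants of the lambda calculus. The typed lambda calculus decorates terms with type annotations:
Syntax:
\[ e ::= x^{\tau} \mid e_1^{(\sigma \to \tau)}\, e_2^{\sigma} \mid (\lambda x^{\sigma} .\, e^{\tau})^{(\sigma \to \tau)} \]
Operational Semantics:
α-conversion: \( \lambda x^{\sigma} .\, e^{\tau} \leftrightarrow \lambda y^{\sigma} .\, [y^{\sigma}/x^{\sigma}]e^{\tau} \), where \( y^{\sigma} \) is fresh (in \( e^{\tau} \))
β-reduction: \( (\lambda x^{\sigma} .\, e_1^{\tau})\, e_2^{\sigma} \to [e_2^{\sigma}/x^{\sigma}]e_1^{\tau} \), avoiding name capture
η-reduction: \( \lambda x^{\sigma} .\, (e_1^{\tau}\, x^{\sigma}) \to e_1^{\tau} \), if \( x^{\sigma} \) is not free in \( e_1^{\tau} \)
Examples:
\[ T \equiv (\lambda x^{\sigma} .\, (\lambda y^{\tau} .\, x^{\sigma})^{(\tau \to \sigma)})^{(\sigma \to (\tau \to \sigma))} \]
\[ F \equiv (\lambda x^{\sigma} .\, (\lambda y^{\tau} .\, y^{\tau})^{(\tau \to \tau)})^{(\sigma \to (\tau \to \tau))} \]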
A Type System
A type assumption is a partial function Γ : V → T with a finite domain, where V is the set of variables.
A type assertion or type judgement is a triple (Γ, e, τ), where Γ is a type assumption, e is a lambda-term (either typed or untyped, depending on the context), and τ is a type. The domain of Γ is exactly the set of free variables of e (fv(e)).
A type system for the lambda-calculus:
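\[
\frac{\Gamma \vdash e_1 : (\tau_1 \to \tau_2) \qquad \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash (e_1\, e_2) : \tau_2}
\]
\[
\frac{\Gamma;\, x : \tau_1 \vdash e : \tau_2}{\Gamma \vdash (\lambda x : \tau_1 .\, e) : (\tau_1 \to \tau_2)}
\]
\[
\Gamma;\, x : \tau \vdash x : \tau
\]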
Find Type of an Expression
(define type-check
  (lambda (gamma e)
    (if (is-expression? e)
        (cond
          ((is-constant? e) 'int)
          ((is-variable? e) (type-of-variable gamma e))
          ((is-abstraction? e) … )
          ((is-application? e) … ))
        (eopl:error 'type-check "Argument (~s) is not an expression" e))))
Variable rule: \( \Gamma;\, x : \tau \vdash x : \tau \)
We add Integer constants.
Type of a Variable: t = Γ(x)
(define type-of-variable
  (lambda (gamma id)
    (if (null? gamma)
        (eopl:error 'type-of-variable "Undefined identifier ~s" id)
        (if (equal? (caar gamma) id)
            (cadar gamma)
            (type-of-variable (cdr gamma) id)))))
Built-in Functions
(define gamma-zero
  '((add (int -> (int -> int)))
    (sub (int -> (int -> int)))
    (mul (int -> (int -> int)))
    (div (int -> (int -> int)))))
Abstraction
…
((is-abstraction? e)
 (let* ((formal-type (get-formal-type e))
        (type-of-e1 (type-check (cons (list (caadr e) formal-type) gamma)
                                (caddr e))))
   (list formal-type '-> type-of-e1)))
…
\[
\frac{\Gamma;\, x : \tau_1 \vdash e : \tau_2}{\Gamma \vdash (\lambda x : \tau_1 .\, e) : (\tau_1 \to \tau_2)}
\]
Application
…
((is-application? e)
 (let ((type-of-e1 (type-check gamma (car e)))
       (type-of-e2 (type-check gamma (cadr e))))
   (if (is-function-type? type-of-e1)
       (if (equal? (get-argument-type type-of-e1) type-of-e2)
           (get-result-type type-of-e1)
           (eopl:error 'type-check "Wrong argument type ~s" type-of-e2))
       (eopl:error 'type-check "~s is not a function" (car e)))))
…
\[
\frac{\Gamma \vdash e_1 : (\tau_1 \to \tau_2) \qquad \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash (e_1\, e_2) : \tau_2}
\]
Examples
Success:
> (type-check gamma-zero '((add 1) 2))
int
> (type-check gamma-zero '(add 1))
(int -> int)
> (define t '(lambda (x (int -> (int -> int)))
               (lambda (y (int -> int))
                 (lambda (z int) ((x z) (y z))))))
> (type-check '() t)
((int -> (int -> int)) -> ((int -> int) -> (int -> int)))

gamma-zero:
(add (int -> (int -> int)))
(sub (int -> (int -> int)))
(mul (int -> (int -> int)))
(div (int -> (int -> int)))
Examples
Type errors:
> (type-check gamma-zero '(1 2))
Error reported by type-check:
1 is not a function
> (type-check gamma-zero '(add div))
Error reported by type-check:
Wrong argument type (int -> (int -> int))

gamma-zero:
(add (int -> (int -> int)))
(sub (int -> (int -> int)))
(mul (int -> (int -> int)))
(div (int -> (int -> int)))
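The pieces of the checker are spread over several slides with parts elided. A self-contained sketch that can be run in plain Scheme is given below. It is one possible completion, not the course's code: the helper predicates are written out by hand, assq replaces the explicit lookup loop, the environment holds only add and sub, and error (assumed to be provided by the implementation) stands in for eopl:error.

```scheme
;; Terms: integer constants, symbols, (lambda (x type) body), (e1 e2).
;; Types: int or (arg -> result).
(define gamma-zero
  '((add (int -> (int -> int)))
    (sub (int -> (int -> int)))))

(define (is-abstraction? e)
  (and (pair? e) (eq? (car e) 'lambda)))

(define (is-function-type? t)
  (and (pair? t) (= (length t) 3) (eq? (cadr t) '->)))

;; Look up the type bound to id; the first (innermost) binding wins.
(define (type-of-variable gamma id)
  (let ((entry (assq id gamma)))
    (if entry
        (cadr entry)
        (error 'type-of-variable "Undefined identifier ~s" id))))

(define (type-check gamma e)
  (cond ((integer? e) 'int)
        ((symbol? e) (type-of-variable gamma e))
        ((is-abstraction? e)
         (let* ((formal (cadr e))              ; (x type)
                (formal-type (cadr formal))
                (body-type (type-check (cons formal gamma) (caddr e))))
           (list formal-type '-> body-type)))
        ((and (pair? e) (= (length e) 2))      ; application (e1 e2)
         (let ((t1 (type-check gamma (car e)))
               (t2 (type-check gamma (cadr e))))
           (cond ((not (is-function-type? t1))
                  (error 'type-check "~s is not a function" (car e)))
                 ((equal? (car t1) t2) (caddr t1))
                 (else (error 'type-check "Wrong argument type ~s" t2)))))
        (else (error 'type-check "Argument (~s) is not an expression" e))))

(type-check gamma-zero '((add 1) 2))   ; => int
(type-check gamma-zero '(add 1))       ; => (int -> int)
(type-check '() '(lambda (x int) (lambda (y int) x)))
                                       ; => (int -> (int -> int))
```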
The Polymorphic Lambda Calculus
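Polymorphic functions like “map” cannot be typed in the typed lambda calculus!
We need type variables to capture polymorphism:
β-reduction (ii): \( (\lambda x^{A} .\, e_1^{\tau})\, e_2^{\sigma} \to [\sigma/A]([e_2^{\sigma}/x^{A}]e_1^{\tau}) \)
Example:
\[ T \equiv (\lambda x^{A} .\, (\lambda y^{B} .\, x^{A})^{(B \to A)})^{(A \to (B \to A))} \]
\[ T^{(A \to (B \to A))}\, a^{\tau}\, b^{\sigma} \to (\lambda y^{B} .\, a^{\tau})^{(B \to \tau)}\, b^{\sigma} \to a^{\tau} \]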
Polymorphism and Self Application
Even the polymorphic lambda calculus is not powerful enough to express certain lambda terms.
Recall Ω and the Y combinator, both of which make use of “self application”:
Ω ≡ (λx . x x) (λx . x x)
Y ≡ λf . (λx . f (x x)) (λx . f (x x))
What type annotation would you assign to the expression x x? Are these terms typable at all?
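One way to see the difficulty, sketched here only for types built from type variables and → (the symbols σ and τ are chosen for illustration): for the application x x to be well typed, the operator occurrence of x needs a function type whose argument type is the type of the operand occurrence of x.

\[
x : \sigma \to \tau \quad\text{and}\quad x : \sigma \;\Longrightarrow\; \sigma = \sigma \to \tau
\]

No finite type satisfies this equation, because σ would have to contain itself as a proper part. A unification-based checker rejects such an equation with the occurs check.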
Type Inference
Overview:
- The type inference problem
- Typed lambda terms, type assertions, and typing rules
- Wand’s algorithm
- Unification of type equations
References:
- Mitchell Wand, “A Simple Algorithm and Proof for Type Inference”, Fundamenta Informaticae, 10:115-122, 1987
- J. Roger Hindley, “Basic Simple Type Theory”, Cambridge University Press, 1997
- John C. Mitchell, “Foundations for Programming Languages”, MIT Press, 1996
The Type Inference Problem
The type inference problem can be stated as follows:
“Given a term of the untyped lambda calculus, find all terms of the typed lambda calculus, which yield the given term when the type information on bound variables is deleted.”
Since such terms can differ only in their types, this problem is sometimes referred to as finding the “possible typings” of a term.
This problem was first formulated and solved by Curry (in the 1930’s) and Hindley (1969).
Milner (1978) was the first to make the connection with the unification problem formulated by Robinson (1965).
Typed Lambda-Terms
The set e of untyped lambda-terms is defined as follows:
e ::= x | (λx . e) | (e1 e2)
The set T of types is defined as follows:
t ::= K            - basic types
    | (t1 → t2)    - function types
The set eT of typed lambda-terms is obtained by modifying the second clause in the definition of the untyped lambda-terms:
(λx : t . e)
Type Inference for Closed Terms
The type inference problem for closed terms can be stated as follows:
Given a closed lambda-term e, find all types t such that (∅, e, t) holds.
The set of type expressions is defined by adding type variables, written τ, to the set t of types.
Theorem:
- Given a closed lambda-term e, it is decidable whether there exists a type t such that (∅, e, t) holds.
- If there is any such t, then there is a type expression u such that the typings of e are precisely the types of the form σu for all substitutions σ.
Wand’s Algorithm - Skeleton
Input:A lambda-term e0.
Initialization:
Set E = ∅ and G = {(Γ0, e0, t0)}, where t0 is a type variable and Γ0 maps the free variables of e0 to other distinct type variables.
Loop Step:
If G = ∅, then halt and return E. Otherwise, choose a subgoal (Γ, e, t) from G, delete it from G, and add to E and G new verification conditions and subgoals, as specified in an action table.
End of Skeleton
The Action Table
Case (Γ, x, t):
Generate the equation t = Γ(x).
Case (Γ, (λx . e), t):
Let τ1 and τ2 be fresh type variables. Generate the equation t = (τ1 → τ2) and the subgoal (Γ; x : τ1, e, τ2).
Case (Γ, (e1 e2), t):
Let τ1 be a fresh type variable. Then generate the subgoals (Γ, e1, τ1 → t) and (Γ, e2, τ1).
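The skeleton and the action table can be sketched in plain Scheme. The representation is an assumption made for illustration: terms are symbols, (lambda (x) body) or (e1 e2); type variables are symbols t0, t1, …; a function type is a list (a -> b); an equation is a two-element list (lhs rhs); and only closed terms are handled. The names fresh, arrow, gen and equations are hypothetical.

```scheme
;; Untyped terms: symbols, (lambda (x) body), (e1 e2).
;; Type variables are symbols t0, t1, ...; function types are (a -> b).
(define counter 0)
(define (fresh)
  (let ((name (string->symbol
               (string-append "t" (number->string counter)))))
    (set! counter (+ counter 1))
    name))

(define (arrow a b) (list a '-> b))

;; Process subgoals (gamma e t) until none remain; return the equations.
(define (gen goals eqs)
  (if (null? goals)
      (reverse eqs)
      (let* ((goal (car goals)) (rest (cdr goals))
             (gamma (car goal)) (e (cadr goal)) (t (caddr goal)))
        (cond ((symbol? e)                     ; case (G, x, t)
               (gen rest (cons (list t (cadr (assq e gamma))) eqs)))
              ((eq? (car e) 'lambda)           ; case (G, (lambda (x) e1), t)
               (let* ((t1 (fresh)) (t2 (fresh))
                      (x (car (cadr e))))
                 (gen (cons (list (cons (list x t1) gamma) (caddr e) t2) rest)
                      (cons (list t (arrow t1 t2)) eqs))))
              (else                            ; case (G, (e1 e2), t)
               (let ((t1 (fresh)))
                 (gen (cons (list gamma (car e) (arrow t1 t))
                            (cons (list gamma (cadr e) t1) rest))
                      eqs)))))))

;; Start with the single goal (empty-gamma, e, t0).
(define (equations e)
  (set! counter 0)
  (let ((t0 (fresh)))
    (gen (list (list '() e t0)) '())))

(equations '(lambda (x) x))
;; => ((t0 (t1 -> t2)) (t2 t1))
```

The result reads as the equations t0 = (t1 → t2) and t2 = t1, so the identity function receives a type of the form (t1 → t1).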
Example
Consider (λx . (λy . (λz . ((x z) (y z)))))
Each line shows the goal set G, followed by the equation generated in that step (if any):
{(∅, (λx . (λy . (λz . ((x z) (y z))))), τ0)}
{((x : τ1), (λy . (λz . ((x z) (y z)))), τ2)}; τ0 = (τ1 → τ2)
{((x : τ1, y : τ3), (λz . ((x z) (y z))), τ4)}; τ2 = (τ3 → τ4)
{((x : τ1, y : τ3, z : τ5), ((x z) (y z)), τ6)}; τ4 = (τ5 → τ6)
{((x : τ1, y : τ3, z : τ5), (x z), (τ7 → τ6)), ((x : τ1, y : τ3, z : τ5), (y z), τ7)}
{((x : τ1, y : τ3, z : τ5), x, (τ8 → (τ7 → τ6))), ((x : τ1, y : τ3, z : τ5), z, τ8), ((x : τ1, y : τ3, z : τ5), (y z), τ7)}
{((x : τ1, y : τ3, z : τ5), z, τ8), ((x : τ1, y : τ3, z : τ5), (y z), τ7)}; (τ8 → (τ7 → τ6)) = τ1
{((x : τ1, y : τ3, z : τ5), (y z), τ7)}; τ8 = τ5
{((x : τ1, y : τ3, z : τ5), y, (τ9 → τ7)), ((x : τ1, y : τ3, z : τ5), z, τ9)}
{((x : τ1, y : τ3, z : τ5), z, τ9)}; (τ9 → τ7) = τ3
∅; τ9 = τ5
The Equation Set
The generated equations are:
τ0 = (τ1 → τ2)
τ2 = (τ3 → τ4)
τ4 = (τ5 → τ6)
(τ8 → (τ7 → τ6)) = τ1
τ8 = τ5
(τ9 → τ7) = τ3
τ9 = τ5
Solving these equations by the unification algorithm gives the solution:
τ0 = ((τ5 → (τ7 → τ6)) → ((τ5 → τ7) → (τ5 → τ6)))
which is the so-called principal type of the term λx . λy . λz . x z (y z).
The Algorithm unify
unify(∅) = ∅

unify(E ∪ {K1 = K2}) = if K1 ≠ K2 then fail
                       else unify(E)

unify(E ∪ {τ = t}) = if τ ≡ t then unify(E)
                     else if τ occurs in t then fail
                     else unify([t/τ]E) ∘ [t/τ]

unify(E ∪ {t = τ}) = unify(E ∪ {τ = t})

unify(E ∪ {(t1 → t2) = (t3 → t4)}) = unify(E ∪ {t1 = t3} ∪ {t2 = t4})
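A minimal Scheme sketch of these clauses follows. The representation is an assumption for illustration: int and bool are the basic types, every other symbol is a type variable, a function type is a list (a -> b), and an equation is a two-element list (lhs rhs). The substitution is kept as a list of bindings that is updated on every step, which plays the role of the composition with [t/τ]. The names basic?, tvar?, subst, occurs? and slide-equations are hypothetical, and error is assumed to be provided by the Scheme implementation.

```scheme
;; Types: basic types int and bool, type variables (any other symbol),
;; and function types (a -> b). An equation is a two-element list (lhs rhs).
(define (basic? t) (memq t '(int bool)))
(define (tvar? t) (and (symbol? t) (not (basic? t))))
(define (arrow? t) (pair? t))

;; Replace type variable v by type s everywhere in t.
(define (subst s v t)
  (cond ((arrow? t) (list (subst s v (car t)) '-> (subst s v (caddr t))))
        ((eq? t v) s)
        (else t)))

(define (occurs? v t)
  (if (arrow? t)
      (or (occurs? v (car t)) (occurs? v (caddr t)))
      (eq? v t)))

;; Returns a list of bindings (var type), most recent first.
(define (unify eqs bindings)
  (if (null? eqs)
      bindings
      (let ((l (car (car eqs))) (r (cadr (car eqs))) (rest (cdr eqs)))
        (cond ((equal? l r) (unify rest bindings))
              ((tvar? l)
               (if (occurs? l r)
                   (error 'unify "occurs check failed for ~s" l)
                   (unify (map (lambda (e)
                                 (list (subst r l (car e)) (subst r l (cadr e))))
                               rest)
                          (cons (list l r)
                                (map (lambda (b)
                                       (list (car b) (subst r l (cadr b))))
                                     bindings)))))
              ((tvar? r) (unify (cons (list r l) rest) bindings))
              ((and (arrow? l) (arrow? r))
               (unify (cons (list (car l) (car r))
                            (cons (list (caddr l) (caddr r)) rest))
                      bindings))
              (else (error 'unify "cannot unify ~s" (list l r)))))))

;; The equation set generated for (lambda x. lambda y. lambda z. (x z) (y z)).
(define slide-equations
  '((t0 (t1 -> t2))
    (t2 (t3 -> t4))
    (t4 (t5 -> t6))
    ((t8 -> (t7 -> t6)) t1)
    (t8 t5)
    ((t9 -> t7) t3)
    (t9 t5)))

(cadr (assq 't0 (unify slide-equations '())))
;; => ((t5 -> (t7 -> t6)) -> ((t5 -> t7) -> (t5 -> t6)))
```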
The Algorithm PT
PT(x) = {x : τ} ∴ x : τ

PT(e1 e2) = let Γ ∴ e1’ : τ = PT(e1)
                Γ’ ∴ e2’ : σ = PT(e2)
                S = unify({α = β | x : α ∈ Γ and x : β ∈ Γ’} ∪ {τ = (σ → ρ)})
                    where ρ is a fresh type variable
            in SΓ ∪ SΓ’ ∴ S(e1’ e2’) : Sρ

PT(λx . e) = let Γ ∴ e’ : ρ = PT(e)
             in if x : τ ∈ Γ
                then Γ – {x : τ} ∴ λx : τ. e’ : (τ → ρ)
                else Γ ∴ λx : σ. e’ : (σ → ρ)
                     where σ is a fresh type variable

S: set of substitutions