View
218
Download
0
Embed Size (px)
Citation preview
Programming Languages: The Essence of Computer Science
Robert HarperCarnegie Mellon University
October, 2002
CS Is About Programming
• How to build systems.– Better, faster, cheaper.– Reliable, maintainable, extensible.
• Evaluating and comparing systems.– Performance, behavior, security.– Compliance, usability.
• What can and can’t be done.– Algorithms and complexity.
Programming Is Linguistic
• Programming is an explanatory activity.– To yourself, now and in the future.– To whoever has to read your code.– To the compiler, which has to make it run.
• Explanations require language.– To state clearly what is going on.– To communicate ideas over space and
time.
Programming Languages Are Essential
• Therefore languages are of the essence in computer science.– Programming languages, in the familiar
sense.– Description languages, design patterns,
architecture languages, specification languages.
Some Conventional Wisdom
• PL research is over. – The language of the future is X … for various
values of X.– It’s not about the one, true language!
• Anyone can design a PL.– Clearly, anyone does.– But look what you get! TCL, Perl, TeX, C++.
• PL research is irrelevant to practice.– Seen as a purely academic pursuit.– But the tide is turning as successes accumulate.
Some Accomplishments
• High-level languages.
• Static type disciplines.
• Automatic storage management.
• Objects, classes, ADT’s.
• Sophisticated compiler technology.
• Specification and verification.
Why People Don’t Notice
• It takes decades to go from research to practice.– Similar to other areas, such as algorithms, AI.– Large installed base of programmers.
• Ideas become “common sense” long before they are widely used.– Even very simple things, such as lexical scope,
were once controversial!– Not to mention data abstraction, OOP, ….
Some Important Trends
• Safety with performance.– High-level languages with good compilers.– Low-level languages with good type
systems.
• Languages, not tools.– The design should live with the code.– Conformance should be checkable
throughout the evolution of the code.
Some Important Trends
• Interplay between theory and practice.– “Prove theorems and report numbers.”– Build systems to assess practicality and to
discover new problems.
• Full-spectrum coverage.– Functional and imperative.– High and low-level languages.– Design and implementation.
Some Important Trends
• Type theory as the GUT of PL’s.– Provides a precise criterion for safety and
sanity of a design.– “Features” correspond to types.– Close connections with logics and
semantics.
What Is A Type System?
• Static semantics: the well-formed programs.
• Dynamic semantics: the execution model.
What is a Type System?
• Safety theorem: types predict behavior.– Types describe the states of an abstract
machine model.– Execution behavior must cohere with these
descriptions.
• Thus a type is a specification and a type checker is a theorem prover.
Types Are Specifications
• Examples:– e : float means that e evaluates to a value
in, say, floating point register 0.– e : float ! int means that e is a procedure
that is called with arg in FR0 and returns with result in GR0.
– e : queue means that e behaves like a queue.
Types Are Formal Methods
• Type checking is the most successful formal method!– In principal there are no limits.– In practice there is no end in sight.
• Examples:– Using types for low-level languages, say inside a
compiler.– Extending the expressiveness of type systems for
high-level languages.
Types in Compilation
• Conventional compilers:Source = L1 L2 … Ln = Target
:
T1
• Types apply only to the source code.– Type check, then discard types.– If compiler is correct, target code is safe.
Typed Intermediate Languages
• Generalize syntax-directed translation to type-directed translation.– Intermediate languages come equipped
with a type system.– Compiler transformations translate both a
program and its type.– Translation preserves typing: if e:T then
e*:T* after translation
Typed Intermediate Languages
• Type-directed translation:Source = L1 L2 … Ln = Target : : :
T1 T2 … Tn
• Transfers typing properties from source code to object code.– Check integrity of compiler.– Exploit types during code generation.
Certifying Compilers
• Types on object code certify its safety.– Type checking ensures safety.– Type information ensures verifiability.
• Examples of certified object code:– TAL = typed assembly language.– PCC = bare code + proof of safety.– Many variations are being explored.
TAL Example
fact: 8 .{r1:int, sp:{r1:int, }::} jgz r1, positive mov r1,1 retpositive: push r1 ; sp:int::{t1:int,sp:}:: sub r1,r1,1 call fact[int::{r1:int,sp:}::] imul r1,r1,r2 pop r2 ; sp:{r1:int,sp:}::ret
Types for Low-Level Languages
• What is a good type system for a low-level language?– Should expose data representations.– Should allow for low-level “hacks”.– Should be verifiably safe.– Should not compromise efficiency.
• Current systems make serious compromises.– Very weak safety properties.– Force atomicity of complex operations.
Example: Memory Allocation
• Most type systems take an atomic view of constructors.– Allocate and initialize in “one step”.– Even HLL’s like Java impose restrictions.
• We’d like to expose the “fine structure”.– Support code motion such as coalescing.– Allow incremental initialization.– But nevertheless ensure safety!
Example: Memory Allocation
• An allocation protocol (used in TILT):– Reserve: obtain raw, un-initialized space.– Initialize: assign values to the parts.– Allocate: baptize as a valid object.
• Current type systems cannot handle this.– Partially initialized objects.– In-place modification of parts.– Interaction with collected heap.
A Low-Level Type System
• Borrow two ideas from linear logic.– Restricted and unrestricted variables.– A modality to mediate between them.
• Restricted variables are resources.– Think: pointer into middle of an object.
• Unrestricted variables are standard.– Think: heap pointer.
A Low-Level Type System
• Variables are bound to valid objects.– Can be used freely.– Garbage collected when inaccessible.
• Resources are bound to parts of objects-in-creation.– Cannot be passed around freely.– Explicitly allocated and disposed.
Restrictions on Resources
• Linearity: use resources exactly once.– Admits re-typing after initialization.– Ensure allocation before general usage.
• Ordering: resource adjacency matters.– Admit “pointers into the middle” of objects.– Supports in-place, piecemeal initialization.
Variables and Resources
• Typing judgments:
• Ordering of x’s does not matter.– Abstract “mobile” locations.
• Ordering and usage of a’s does matter.– Abstract “pinned” locations, with (a form of)
pointer arithmetic.
Low-Level Type Constructors
• Contiguous data: 1 ² 2.
– Two contiguous values.– Two adjacent words: int ² int.
• Mobile data object: !.– A fully initialized object of type .
• Example: 1 £ 2 := ! (1 ² 2).
– A pointer to an adjacent pair of values.
Allocating a Pair
• Allocate (1,2): Initialize a1, using it up.Re-introduce a1.
Must be empty on return.
Reserve space at a.Create names for parts.
Fuse parts and allocate.
Resource a is used up!
What Have We Gained?
• The ordered type system ensures:– All reserved data is eventually allocated.– Initialization can happen in any order.– Cannot read un-initialized memory.
• May also be used for inter-operability.– Adjacency requirements.– Marshalling and un-marshalling.– Precise control over representations.
Types for High-Level Languages
• Recall: types express invariants.– Calling sequence, data representation.– Abstraction boundaries (polymorphism).
• Theme: capture more invariants.– Representation invariants.– Protocol compliance.
• Trade-off expressiveness for simplicity and convenience.
Data Structure Invariants
• Example: bit strings.– nat: no leading 0’s.– pos : a non-zero Nat.– bits : any old thing.
• Goal: check such properties of code.– Incr takes nat to pos, preserves pos.– Left shift preserves nat, pos.
Data Structure Invariants
• Properties of interest:– pos ` nat ` bits
• Operations:– : bits– add0 : bits ! bits Æ nat ! pos– add1 : bits ! bits Æ nat ! pos
Data Structure Invariants
• Logical consequences:– add0 : nat ! pos Æ nat ! nat– add1 : pos ! pos– : bits
• Type check code to check invariants!– Simple bi-directional algorithm suffices.– Example: increment.
Data Structure Invariants
• Increment:
inc: bits ! bits Æ nat ! pos
fun inc () ) 1 | inc (b0)) b1 | inc (b1)) (inc b)0
Data Structure Invariants
• Fully mechanically checkable!– Type check inc twice, once for each conjunct.– First pass: assume argument is just bits, derive
that result is therefore bits.– Second pass: assume argument is nat, derive that
result is pos.
• Requires checking entailment of properties.– Decidable for subtype-like behavior.
Value Range Invariants
• Array bounds checking (a la Pascal):– [0..size(A)]
• Null and non-null objects:– null, really(C)
• Aliasing:– its(c): the object c itself
Watch Out!
• Such types involve dynamic entities.– [0..size(A)] : A is an array.– its(o) : o is a run-time object.
• But types are static!– What is an expression of type
[0..size(if … then A else B)]???
• How to get static checking?
Dependent Types
• Solution: compile-time proxies.– Track values insofar as possible.– Existentially quantify when its not.
• Examples:– 0 : its(0)– + : its(1) £ its(2) ! its(4)– if … then 1 else 2 : 9 n its(n)
Types for the World
• What about the state of the world?– The lock l is held.– The file f is open.– The contents of cell c is positive.
• But such properties change as execution proceeds!– Here I’m holding the lock, there I’m not.– The file is open now, not later.
Types for the World
• Want a simple, checkable type system for the “world”.– Types characterize the current state.– Types of the state change over time.
• But what kind of type system?– Properties of the world are ephemeral.– Facts are no longer persistent.
• Need: a logic of ephemera.
Types for the World
• Ephemera behave strangely:– Cannot be replicated: holding twice is not
the same as holding once.– Cannot be ignored: must not lose track of
whether you hold a lock.
• Once again linear logic does the trick!– Model parts of the world as resources.– Type changes track state changes.
Types for the World
• A very simple locking protocol:– acquire : 8 l its(l) / free(l) ! unit / held(l)
– release : 8 l its(l) / held(l) ! free(l)
– new : unit ! 9 l unit / free(l)A “linear” function type. Gets “used up”.“Replaces” premise.
Value / World
Types for the World
• What does this buy you?– Stating and enforcing simple protocol
properties.– Locality of reasoning: focus only on what
changes, not what stays the same.
• It’s much harder than it sounds!– Separation logic: Reynolds and O’Hearn.– But it can be done: Vault Project at MSR.
Summary
• PL research is providing creative, useful solutions to practical problems.– Building on decades of fundamental
research, esp. in logic and type theory.
• There is much, much more coming!– Many ripe fruit, lots of new developments.– Many good problems yet to be addressed.