46
Regular Functions Rajeev Alur University of Pennsylvania

Regular Functions Rajeev Alur University of Pennsylvania

Embed Size (px)

Citation preview

Page 1: Regular Functions Rajeev Alur University of Pennsylvania

Regular Functions

Rajeev Alur University of Pennsylvania

Page 2: Regular Functions Rajeev Alur University of Pennsylvania

Languages vs Functions

A language L is a subset of S* A numerical function maps strings in S* to N (or integers Z) A string-to-string transformation maps S* to G*

For Turing-complete models of computation, choice is not critical

Page 3: Regular Functions Rajeev Alur University of Pennsylvania

“Finite-state” Computation

For language-based view, the definable class is regular languages

Many alternative characterizations Appealing theoretical properties Finite Automata: Intuitive operational model with efficient

analysis algorithms Many applications

What is the analog of regularity for defining functions?

Page 4: Regular Functions Rajeev Alur University of Pennsylvania

Finite Automata with Cost Labels

C: Buy CoffeeS: Fill out a surveyM: End-of-month

C / 2 C / 1S

MM

Maps a string over {C,S,M} to a cost value:Cost of a coffee is 2, but reduces to 1 after filling out

a survey until the end of the monthOutput is computed by implicitly adding up transition costs

Intuitive, analyzable, and many applicationsBut expressiveness not theoretically robust

S

Page 5: Regular Functions Rajeev Alur University of Pennsylvania

Finite Automata with Cost Registers

C / x:=x+2 C / x:=x+1S

MM

Cost Register Automata:Finite control + Finite number of registersRegisters updated explicitly on transitionsRegisters are write-only (no tests allowed)Each (final) state associated with output register

xx:=0

x

S

Page 6: Regular Functions Rajeev Alur University of Pennsylvania

CRA Example

C / x:=x+2 C / x:=x+1S

M / x:=0M / x:=0

At any time, x = costs of coffees during the current month

Cost register x reset to 0 at each end-of-month

xx:=0

x

S

Page 7: Regular Functions Rajeev Alur University of Pennsylvania

CRA Example

C / x:=x+2

C / x:=x+1S / x:=y

M / y:=x M / y:=x

Filling out a survey gives discounted cost for all the coffees during that month

xx,y:=0

x

y:=y+1

S

Page 8: Regular Functions Rajeev Alur University of Pennsylvania

CRA Example

C / y:=y+1

M / x:=min(x,y); y:=0

Output equals the minimum number of coffees consumed during a month

Updates use two operations: increment and min

min(x,y)

y:=0

x:=Infty

Page 9: Regular Functions Rajeev Alur University of Pennsylvania

Talk Outline

Definition of Regular Functions

Additive Regular Functions

String Transducers

Regular Functions over a Semiring

Conclusions + Open Problems

Page 10: Regular Functions Rajeev Alur University of Pennsylvania

Cost Model

Cost Grammar G: Defines a set of termsInc: t := c | (t+c)Plus: t := c | (t+t)Min-Inc: t := c | (t+c) | min(t,t)Inc-Scale: t := c | (t+c) | (t*d)

Interpretation []:Set D of cost valuesMapping operators to functions over D

Example interpretations for the Plus grammar:Set N of natural numbers with additionSet G* of strings with concatenation

Page 11: Regular Functions Rajeev Alur University of Pennsylvania

Regular Cost Function

Definition parameterized by the cost model C=(D,G,[])

A (partial) function f:S*->D is regular w.r.t. the cost model C if there exists a string-to-tree transformation g such that

(1) for all strings w, f(w)=[g(w)](2) g is a regular string-to-tree transformation

Page 12: Regular Functions Rajeev Alur University of Pennsylvania

Example Regular Cost Function

Cost grammar Min-Inc: t := c | (t+c) | min(t,t)Interpretation: Natural numbers with usual meaning of + and minS={C,M}f(w) = Minimum number of C symbols between successive M’s

Infty 0 1 1 0 1 1 1

+ + + + +

min min

Input w= C C M C C C M

Tree:

Value = 2

Page 13: Regular Functions Rajeev Alur University of Pennsylvania

Regular String-to-tree Transformations

Definition based on MSO (Monadic Second Order Logic) –definable graph-to-graph transformations (Courcelle)

Studied in context of syntax-directed program transformations, attribute grammars, and XML transformations

Operational model: Macro Tree Transducers (Engelfriet et al)

Recent proposals:Streaming String Transducers (POPL 2011)Streaming Tree Transducers (ICALP 2012)

Page 14: Regular Functions Rajeev Alur University of Pennsylvania

Properties of Regular Cost Functions

Known properties of regular string-to-tree transformations imply:

If f and g are regular w.r.t. a cost model C, and L is a regular language, then “if L then f else g” is regular w.r.t. C

Reversal: define Rev(f)(w) = f(reverse(w)).If f is regular w.r.t. a cost model C, then so is Rev(f)

Costs grow linearly with the size of the input string:Term corresponding to a string w is O(|w|)

Page 15: Regular Functions Rajeev Alur University of Pennsylvania

Talk Outline

Additive Regular Functions

String Transducers

Regular Functions over a Semiring

Conclusions + Open Problems

Page 16: Regular Functions Rajeev Alur University of Pennsylvania

Regular Cost Functions over Commutative Monoid

Cost model: D with binary function +Interpretation for + is commutative, associative, with identity 0

Cost grammar G(+): t := c | (t+t)

Cost grammar G(+c): t := c | (t+c)

Thm: Regularity w.r.t. G(+) coincides with regularity w.r.t. G(+c)

Proof intuition: Show that rewriting terms such as (2+3)+(1+5) to (((2+3)+1)+5) is a regular tree-to-tree transformation, and use closure properties of tree transducers

Page 17: Regular Functions Rajeev Alur University of Pennsylvania

Additive Cost Register Automata

Additive Cost Register Automata:DFA + Finite number of registersEach register is initially 0Registers updated using assignments x := y + cEach final state labeled with output term x + c

Given commutative monoid (D,+,0), an ACRA defines a partial function from S* to D

C / x:=x+2, y:=y+1 C / x:=x+1S / x:=y

M / y:=x M / y:=x

xx,y:=0

x

S

Page 18: Regular Functions Rajeev Alur University of Pennsylvania

Regular Cost Functions and ACRAs

Thm: Given a commutative monoid (D,+,0), a function f:S*->D is definable using an ACRA iff it is regular w.r.t. grammar G(+).

Establishes ACRA as an intuitive, deterministic operational model to define this class of regular functions

Proof relies on the model of SSTT (Streaming string-to-tree transducers) that can define all regular string-to-tree transformations

Page 19: Regular Functions Rajeev Alur University of Pennsylvania

Single-Valued Weighted Automata

Weighted Automata:Nondeterministic automata with edges labeled with

costs Single-valued:

Each string has at most one accepting path Cost of a string:

Sum of costs of transitions along the accepting path Example: When you fill out a survey, each coffee during

that month gets the discounted cost.Locally nondeterministic, but globally single-valued

Thm: ACRAs and single-valued weighted automata define the same class of functions

Page 20: Regular Functions Rajeev Alur University of Pennsylvania

Decision Problems for ACRAs

Min-Cost: Given an ACRA M, find min {M(w) | w in S*}Solvable in Polynomial-timeShortest path in a graph with vertices (state, register)

Equivalence: Do two ACRAs define the same functionSolvable in Polynomial-timeBased on propagation of linear equalities in program

graphs

Register Minimization: Given an ACRA M with k registers, is there an equivalent ACRA with < k registers?

Algorithm polynomial in states, and exponential in k

Page 21: Regular Functions Rajeev Alur University of Pennsylvania

Towards a Theory of Additive Regular Functions

Goal: Machine-independent characterization of regularitySimilar to Myhill-Nerode theorem for regular languagesRegisters should compute necessary auxiliary functions

Example: S = {C,S}f(w)= if w contains S then |w| else 2|w|f1(Ci)=i and f2(Ci)=2i are necessary and sufficient

Thm: Register complexity of a function is at least k iff there exist strings s0, … sm, loop-strings t1,…tm, and suffixes w1,…wm, and k distinct vectors c1,…ck such that for all numbers x1,…xm, f(s0 t1

x1 s1 t2x2 … sm wi) = Sj cij xj + di

Page 22: Regular Functions Rajeev Alur University of Pennsylvania

Talk Outline

String Transducers

Regular Functions over a Semiring

Conclusions + Open Problems

Page 23: Regular Functions Rajeev Alur University of Pennsylvania

Regular Functions for Non-Commutative Monoid

Cost model: G* with binary function concatenation .Interpretation for . is non-commutative, associative, identity e

Cost grammar G(.): t := s | (t . t) s is a string

Cost grammar G(.s): t := s | (t . s) | (s . t)

Thm: Regular functions w.r.t G(.) is a strict superset of regular functions w.r.t. G(.s)

Classical model of Sequential Transducers captures only a subset of regular functions w.r.t. G(.s)

Page 24: Regular Functions Rajeev Alur University of Pennsylvania

Streaming String Transducer: Delete

Finite state control + register x ranging over output strings

String variables explicitly updated at each step

Delete all a symbols

output x

a / x := x

x := e

b / x := xb

Page 25: Regular Functions Rajeev Alur University of Pennsylvania

Streaming String Transducer: Reverse

Symbols may be added to string variables at both ends

output x

a / x := ax

x := e

b / x := bx

Page 26: Regular Functions Rajeev Alur University of Pennsylvania

Streaming String Transducer: Regular Look Ahead

If input ends with b, then delete all a symbols, else reverse

output x

a / x:=ax

x,y := eoutput y

b / x:=bx; y:=yb

b / x:=bx; y:=yb

a / x:=ax

Register x equals reverse of the input so farRegister y equals input so far with all a’s

deleted

Page 27: Regular Functions Rajeev Alur University of Pennsylvania

Streaming String Transducer: Concatenation

Registers can be concatenated

Example: Swap substring before first a with substring following last a

a a

a a

Key restriction: a variable can appear at most once on RHS [x,y] := [xy, e] allowed [x,y] := [xy, y] not allowed

Page 28: Regular Functions Rajeev Alur University of Pennsylvania

SST Properties

At each step, one input symbol is processed, and at most a constant number of output symbols are newly created

Output is bounded: Length of output = O(length of input)

SST transduction can be computed in linear time

Finite-state control: Registers not examined

SST cannot implement merge f(u1u2….uk#v1v2…vk) = u1v1u2v2….ukvk

Multiple registers are essential For f(w)=wk, k variables are necessary and sufficient

Page 29: Regular Functions Rajeev Alur University of Pennsylvania

Decision Problem: Type Checking

Pre/Post condition assertion: { L } S { L’ }Given a regular language L of input strings (pre-condition), an SST S, and a regular language L’ of output strings (post-condition), verify that for every w in L, S(w) is in L’

Thm: Type checking is solvable in polynomial-timeKey construction: Summarization

Page 30: Regular Functions Rajeev Alur University of Pennsylvania

Decision Problem: Equivalence

Functional Equivalence;Given SSTs S and S’ over same input/output alphabets, check whether they define the same transductions.

Thm: Equivalence is solvable in PSPACE(polynomial in states, but exponential in # of string variables)

No lower bound known

Page 31: Regular Functions Rajeev Alur University of Pennsylvania

Expressiveness

Thm: A string transduction is definable by an SST iff it is regular

1. SST definable transduction is MSO definable 2. MSO definable transduction can be captured by a two-

way transducer (Engelfriet/Hoogeboom 2001) 3. SST can simulate a two-way transducer

Evidence of robustness of class of regular transductions

Closure properties1. Sequential composition: f1(f2(w))

2. Regular conditional choice: if w in L then f1(w) else f2(w)

Page 32: Regular Functions Rajeev Alur University of Pennsylvania

SST Applications

Equivalent class of single pass list processing programs with solvable program analysis problems (POPL 2011)

Algorithmic verification of retransmission protocols (network components as regular transformers over bit sequences; FORTE 2013)

OpportunitiesBEK: Transducer-based tool for analyzing string sanitizersFlashFill: Learning string transformations from examples

Page 33: Regular Functions Rajeev Alur University of Pennsylvania

function delete input ref curr; input data v; output ref result; output bool flag := 0; local ref prev; while (curr != nil) & (curr.data = v) { curr := curr.next; flag := 1; } result := curr; prev:= curr; if (curr != nil) then { curr := curr.next; prev.next := nil; while (curr != nil) { if (curr.data = v) then { curr := curr.next; flag := 1; } else { prev.next := curr; prev := curr; curr := curr.next; prev.next := nil; } }

Decidable Analysis: 1. Assertion checks 2. Pre/post condition 3. Full functional correctness

Algorithmic Verification of List-processing Programs

head tail

3 8 2

Page 34: Regular Functions Rajeev Alur University of Pennsylvania

Talk Outline

Regular Functions over a Semiring

Conclusions + Open Problems

Page 35: Regular Functions Rajeev Alur University of Pennsylvania

Regular Cost Functions over Semiring

Cost Domain: Natural numbers + Infty

Operation Min: Commutative monoid with identity Infty

Operation +: Monoid with identity 0

Rules: a + Infty = Infty + a = Inftya+min(b,c) = min (a+b, a+c); min(b,c)+a =

min(b+a,c+a)

Cost grammar MinInc: t := c | min(t,t) | (t+c)

Goal: Understand class of regular functions w.r.t. MinInc

Page 36: Regular Functions Rajeev Alur University of Pennsylvania

Weighted Automata

Weighted Automata:Nondeterministic automata with edges labeled with

costs

Interpreted over the semiring cost model: cost of string w = min of costs of all accepting paths

over wcost of a path = sum of costs of all edges in a path

Widely studied (Weighted Automata, Droste et al)Minimum cost problem solvableEquivalence undecidable over (N, min, +)Not determinizableNatural model in many applicationsRecent interest in CAV community for quantitative

analysis

Page 37: Regular Functions Rajeev Alur University of Pennsylvania

CRA over Min-Inc Semiring

C / y:=y+1

M / x:=min(x,y); y:=0

Output equals the minimum number of coffees consumed during a month

min(x,y)y:=0

x:=Infty

Page 38: Regular Functions Rajeev Alur University of Pennsylvania

CRA(min,+c) = Weighted Automata

From WA to CRA(min,+c):Generalizes subset construction for determinizationFor every state q of WA, CRA maintains a register xq

xq = min of costs of all paths to q on input read so far

Update on a: xq := min { xp + c | p –(a,c)-> q is edge in WA}

From CRA(min,+c) to WA:State of WA = (state q of CRA, register x)min simulated by nondeterminismTo simulate p – (a, x:=min(y,z)) -> q in CRA,

add a-labeled edges from (p,y) and (p,z) to (q,x)Distributivity of + over min critical

Page 39: Regular Functions Rajeev Alur University of Pennsylvania

CRA(min,+c) > Min-Plus Regular Functions

Thm: The class of regular functions w.r.t. Min-Inc semiring is a strict subset of weighted automata

Above function is not regular: cost term is quadratic in input

a/1 b/1

#

b,# a,#

Input w: w1 # w2 # … # wn

Each wi in {a,b}*

ai = Number of a’s in wi

bi = Number of b’s in wi

Cost(w) = minj { a1+…+aj+bj+1+…+bn}

Page 40: Regular Functions Rajeev Alur University of Pennsylvania

Machine Model for Semiring Regular Functions

Updates to registers must be copylessEach register appears at most once in a right-hand-

sideUpdate [x,y] := [min(x,y),y] not allowedNecessary to maintain “linear” growth

Need ability to simulate substitutionRegister x carries two values c and dStands for the parameterized expression min(c, ?)+dBesides min and inc, can substitute ? with a value

Resulting model coincides with regular functions over semiring

Open: Decidability of equivalence over (N, min , +c)

Page 41: Regular Functions Rajeev Alur University of Pennsylvania

Talk Outline

Conclusions + Open Problems

Page 42: Regular Functions Rajeev Alur University of Pennsylvania

Discounted Cost Regular Functions

Basic element: (cost c, discount d) Discounted sum: (c1,d1)*(c2,d2) = (c1+d1c2, d1d2) Example of non-commutative monoid Classical Model: Future discounting

Cost of a path: (c1,d1) * (c2,d2) * … * (cn,dn)

Polynomial-time algorithm for “generalized” shortest path Past discounting

Cost of a path: (cn,dn) * (cn-1,dn-1) * … * (c1,d1)

Same PTIME algorithm works for shortest paths Prioritized double discounting

Cost = (c1,d1) * … * (cn, dn) * (c’1,d’1) * … * (c’n,d’n)

Shortest path: NExpTime algorithm Open: Shortest path for Discounted Cost Register Automata

Page 43: Regular Functions Rajeev Alur University of Pennsylvania

Open Problems and Challenges

Complexity of equivalence of SSTs and STTs

Large gap between lower and upper bounds

Machine-independent characterization of regularitySupport functions needed to compute a function

Decidability of min-cost for discounted cost automata

Decidability of equivalence for Copyless CRAs over (N,min,+c)

Simpler/cleaner proofs of equivalence of machine models and MSO-definable transformations

Page 44: Regular Functions Rajeev Alur University of Pennsylvania

Unexplored Directions

Probabilistic modelsMarkov chains / MDPs with regular rewards

Regular costs for infinite executionsInfinitary operators: Lim-average, Discounted-sumStarting point: Infinite-String-to-Tree Transducers

Regular costs for trees

Combinations of other operationsRegular functions over G(+,min): t := c | (t+t) | min(t,t)

Page 45: Regular Functions Rajeev Alur University of Pennsylvania

Conclusions

Cost Register AutomataWrite-only machines with multiple registers to store outputs

Regular FunctionsDefinition parameterized by allowed operationsBased on MSO-definable graph transformations / transducers

Emerging theory Some results, new connectionsMany open problems and unexplored directions

Page 46: Regular Functions Rajeev Alur University of Pennsylvania

Acknowledgements and References

Streaming String Transducers(with P. Cerny; POPL’11, FSTTCS’10)

Transducers over Infinite Strings(with E. Filiot, A. Trivedi; LICS’12)

Streaming Tree Transducers(with L. D’Antoni; ICALP’12)

Regular Functions and Cost Register Automata(with L. D’Antoni, J. Deshmukh, M. Raghothaman, Y. Yuan; LICS’13)

Decision problems for Additive Cost Regular Functions(with M. Raghothaman; ICALP’13)

Infinite-String to Infinite-Term Regular Transformations(with A. Durand, A. Trivedi; LICS’13)

Min-cost problems for Discounted Sum Regular Functions(with S. Kannan, K. Tian, Y. Yuan; LATA’13)