Approximating Context-Free Grammar Ambiguity

November 2, 2004

Claus Brabrand [email protected]

BRICS, Department of Computer Science

University of Aarhus, Denmark


“Approximating Context-Free Grammar Ambiguity”

// Abstract

Context-free grammar ambiguity is undecidable.

However, just because it’s undecidable doesn’t mean there aren’t (good) approximations! Indeed, the whole area of static analysis works on “side-stepping undecidability”.

We exhibit a characterization of context-free ambiguity which induces a whole framework for approximating the problem.

In particular, we give an approximation, $A_{MN}$, based on the [Mohri-Nederhof, 2000] regular approximation of context-free grammars and show how to boost the precision even further.

// Outline

Introduction
Vertical / Horizontal Ambiguity
Characterization of Ambiguity
(Over-)Approximation Framework
Approximation ($A_{MN}$)
Assessment
Related Work
Conclusion

// Context-Free Grammar

$G = \langle N, \Sigma, s, \pi \rangle$

  $N$: finite set of nonterminals
  $\Sigma$: finite set of terminals
  $s \in N$: start nonterminal
  $\pi : N \to \mathcal{P}(E^*)$: production function, $E = \Sigma \cup N$

Assume:
  All $n \in N$ reachable (from $s$)
  All $n \in N$ derive some (finite) string

$L : G \to \mathcal{P}(\Sigma^*)$: language of $G$, $L(G)$
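As a minimal sketch (not taken from the talk), the grammar tuple and the two assumptions on this slide could be written down in Python as follows. The names `Grammar`, `reachable`, `productive` and `well_formed` are illustrative assumptions; the sample grammar is the one from the vertical ambiguity example.

```python
from dataclasses import dataclass


@dataclass
class Grammar:
    nonterminals: set   # N
    terminals: set      # Sigma
    start: str          # s
    productions: dict   # pi: nonterminal -> set of tuples over E


def reachable(g):
    """Nonterminals reachable from the start nonterminal."""
    seen, todo = {g.start}, [g.start]
    while todo:
        n = todo.pop()
        for alpha in g.productions[n]:
            for sym in alpha:
                if sym in g.nonterminals and sym not in seen:
                    seen.add(sym)
                    todo.append(sym)
    return seen


def productive(g):
    """Nonterminals that derive at least one finite terminal string."""
    done = set()
    changed = True
    while changed:
        changed = False
        for n in g.nonterminals - done:
            if any(all(s in g.terminals or s in done for s in alpha)
                   for alpha in g.productions[n]):
                done.add(n)
                changed = True
    return done


def well_formed(g):
    """Both slide assumptions hold for every nonterminal."""
    return reachable(g) == g.nonterminals and productive(g) == g.nonterminals


# Hypothetical encoding of the grammar Z : x A y | x B y, A : a, B : a.
EXAMPLE = Grammar(
    nonterminals={"Z", "A", "B"},
    terminals={"x", "a", "y"},
    start="Z",
    productions={
        "Z": {("x", "A", "y"), ("x", "B", "y")},
        "A": {("a",)},
        "B": {("a",)},
    },
)
```

Calling `well_formed(EXAMPLE)` checks both assumptions at once; a grammar with an unreachable or non-terminating nonterminal fails it.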

// Relevant CFG Decision Problems

Decidable:
  Membership: $\omega \in L(G_{CFG})$
  Emptiness: $L(G_{CFG}) = \emptyset$
  Intersection (w/ REG): $L(G_{CFG}) \cap L(R_{REG}) = L(C_{CFG})$ … constructively

Undecidable:
  Intersection (w/ CFG): $L(G_{CFG}) \cap L(G'_{CFG}) = \emptyset$ ?
  …
  Ambiguity: $\exists \omega \in \Sigma^*$: 2 derivation trees ?

// Ambiguity: Undecidable!

Ambiguity: $\exists \omega \in \Sigma^*$: 2 derivation trees ?

Algorithms: Undecidable!

However…

[Figure: two derivation trees, $T_s$ and $T'_s$; the grammars partitioned into “unambiguous” and “ambiguous”, with the question “?”]

// “Side-Stepping Undecidability”

However, just because it’s undecidable doesn’t mean there aren’t (good) approximations! Indeed, the whole area of static analysis works on “side-stepping undecidability”.

Safe approximation:
  [Figure: safe (over-)approximation of the unambiguous / ambiguous boundary]
  [Figure: safe (under-)approximation of the unambiguous / ambiguous boundary]

Unsafe approximation:
  [Figure: unsafe approximation of the unambiguous / ambiguous boundary]

// Motivation

Use safe (over-)approximation:
  “Yes!” $\Rightarrow$ “G guaranteed unambiguous”!!!
  Safely use any GLR parser on G
    Because: never two parses at runtime!

Hence: dynamic parse ambiguity $\to$ static parse ambiguity

[Figure: unambiguous / ambiguous grammars, with the “Yes!” answers]

// Motivation (cont’d)

Undecidability means: “there’ll always be a slack”:

[Figure: unambiguous / ambiguous grammars, with the “No?” answers]

However, still useful!

Possible interpretations of “No?”:
  Treat as error (reject grammar):
    “Please redesign your grammar” (as in [LA]LR(k))
  Treat as warning:
    “Here are some potential problems”

// Vertical Ambiguity

“Vertical ambiguity”: $G$ is vertically unambiguous iff

$\forall n \in N : \forall \alpha, \alpha' \in \pi(n) : \alpha \neq \alpha' \Rightarrow L(\alpha) \cap L(\alpha') = \emptyset$

Example:

  Z : x A y
    : x B y
  A : a
  B : a

Ambiguous string: xay

~ “reduce/reduce conflict” in [Yacc]
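The vertical conflict in this example can be exhibited with a small bounded search. This is only a sketch under stated assumptions: it enumerates strings up to a length bound, so it can find witnesses of a vertical conflict but can never prove their absence, and it is not the approximation developed in the talk. The names `bounded_languages` and `vertical_conflicts` are hypothetical.

```python
from itertools import combinations


def concat(xs, ys, max_len):
    """Pairwise concatenation, dropping strings longer than max_len."""
    return {x + y for x in xs for y in ys if len(x + y) <= max_len}


def seq_language(alpha, lang, max_len):
    """Strings (up to max_len) derivable from the symbol sequence alpha."""
    out = {""}
    for sym in alpha:
        # Symbols without an entry in `lang` are treated as terminals.
        out = concat(out, lang.get(sym, {sym}), max_len)
    return out


def bounded_languages(productions, max_len):
    """Least fixed point of the length-bounded language of each nonterminal."""
    lang = {n: set() for n in productions}
    changed = True
    while changed:
        changed = False
        for n, alts in productions.items():
            new = set().union(*(seq_language(a, lang, max_len) for a in alts))
            if not new <= lang[n]:
                lang[n] |= new
                changed = True
    return lang


def vertical_conflicts(productions, max_len):
    """Pairs of alternatives of one nonterminal sharing a string."""
    lang = bounded_languages(productions, max_len)
    found = []
    for n, alts in productions.items():
        for a, b in combinations(alts, 2):
            common = seq_language(a, lang, max_len) & seq_language(b, lang, max_len)
            if common:
                found.append((n, a, b, sorted(common)))
    return found


EXAMPLE = {
    "Z": [("x", "A", "y"), ("x", "B", "y")],
    "A": [("a",)],
    "B": [("a",)],
}
```

For `EXAMPLE` with a length bound of 3, the two alternatives of `Z` are reported with the shared string `xay`.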

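// Horizontal Ambiguity

“Horizontal ambiguity”: $G$ is horizontally unambiguous iff

$\forall n \in N : \forall \alpha \in \pi(n) : \forall i \in [1..|\alpha|-1] : L(\alpha_0 .. \alpha_{i-1}) \Cap L(\alpha_i .. \alpha_{|\alpha|-1}) = \emptyset$

where the overlap operator $\Cap$ is:

$\Cap : \mathcal{P}(\Sigma^*) \times \mathcal{P}(\Sigma^*) \to \mathcal{P}(\Sigma^*)$

$X \Cap Y = \{\, xay \mid x, y \in \Sigma^* \wedge a \in \Sigma^+ \wedge x, xa \in X \wedge y, ay \in Y \,\}$

Example:

  Z : A B
  A : x a
    : x
  B : a y
    : y

Ambiguous string: xay

~ “shift/reduce conflict” in [Yacc]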
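For finite languages, the overlap operator from this slide can be computed directly from its definition. The following is a small illustrative sketch; the function name `overlap` is an assumption, and the sets in the usage note are the languages of `A` and `B` from the example grammar.

```python
def overlap(X, Y):
    """All strings xay with a non-empty, x and xa in X, y and ay in Y."""
    result = set()
    for xa in X:
        for i in range(len(xa)):          # split xa = x + a with a non-empty
            x, a = xa[:i], xa[i:]
            if x not in X:
                continue
            for ay in Y:
                # ay must start with a, and the remainder y must be in Y too
                if ay.startswith(a) and ay[len(a):] in Y:
                    result.add(x + ay)
    return result
```

With `X = {"xa", "x"}` and `Y = {"ay", "y"}`, the only overlapping string is `xay`, which is the ambiguous string of the example.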

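// Characterization of Ambiguity

Theorem 1:
  $G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Leftrightarrow$ $G$ unambiguous

Lemma 1a (“$\Rightarrow$”):
  $G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Rightarrow$ $G$ unambiguous

Lemma 1b (“$\Leftarrow$”):
  $G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Leftarrow$ $G$ unambiguous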

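// Proof (Lemma 1a): “$\Rightarrow$”

$G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Rightarrow$ $G$ unambiguous

…or contrapositively:

$G$ ambiguous $\Rightarrow$ $G$ vertically ambiguous $\vee$ $G$ horizontally ambiguous

Proof:
  Assume $G$ ambiguous (i.e. $\exists$ 2 der. trees for $\omega$)
  Show: $G$ vertically ambiguous $\vee$ $G$ horizontally ambiguous, by induction in max height of the 2 derivation trees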

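// Proof (Lemma 1a): “$\Rightarrow$” (Base)

Base case (height 1):

The ambiguity means that (for $p \neq p'$):

[Figure: two derivation trees of height 1 for $\omega$, rooted at the same nonterminal; one uses production $p$ with right-hand side $\alpha$, the other uses production $p'$ with right-hand side $\alpha'$]

Which means:

$L(\alpha) \cap L(\alpha') \supseteq \{\omega\}$

i.e., we have a vertical ambiguity.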

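// Proof (Lemma 1a): “$\Rightarrow$” (I.H.)

Induction step (height $n$):

Assume induction hypothesis (for height $n-1$)

The ambiguity means:

[Figure: two derivation trees for $\omega$, rooted at the same nonterminal. One uses production $p$ with children $\alpha_0 .. \alpha_{|\alpha|-1}$, each $\alpha_i$ deriving $\omega_i$ by a subtree of height at most $n-1$. The other uses production $p'$ with children $\alpha'_0 .. \alpha'_{|\alpha'|-1}$, each $\alpha'_{i'}$ deriving $\omega'_{i'}$ by a subtree of height at most $n-1$. Both yield $\omega_0 .. \omega_{|\alpha|-1} = \omega = \omega'_0 .. \omega'_{|\alpha'|-1}$.]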

// Proof (Lemma 1a): “$\Rightarrow$” ($p \neq p'$)

Case $p \neq p'$ (different productions):

[Figure: the two derivation trees for $\omega$, with different top productions $p$ (right-hand side $\alpha$) and $p'$ (right-hand side $\alpha'$)]

…but then

$L(\alpha) \cap L(\alpha') \supseteq \{\omega\}$

i.e., we have a vertical ambiguity.

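// Proof (Lemma 1a): “$\Rightarrow$” ($p = p'$, 1)

Case $p = p'$ (same production):
  i.e. “the tops of the trees are the same”

Case $\forall i : \omega_i = \omega'_i$:
  Ambiguity in subtree $i$ ($\alpha_i$ deriving same $\omega_i$)
  Induction hypothesis (this subtree) $\Rightarrow$ $G$ vertically ambiguous $\vee$ $G$ horizontally ambiguous

[Figure: the two derivation trees for $\omega$ with the same top production $p = p'$ and children $\alpha_0 .. \alpha_{|\alpha|-1}$; the subtrees below $\alpha_i$ differ but derive the same $\omega_i$, each with height at most $n-1$]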

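// Proof (Lemma 1a): “$\Rightarrow$” ($p = p'$, 2)

Case $p = p'$ (same production):

Case $\exists i : \omega_i \neq \omega'_i$:
  …but then $\exists j \neq i : \omega_j \neq \omega'_j$ (take the least such $i$ and the 2nd least such $j$, and assume WLOG $|\omega_i| < |\omega'_i|$)

Now pick any $k$ with $i \leq k < j$; then (with $\Cap$ the overlap operator):

$L(\alpha_0 .. \alpha_k) \Cap L(\alpha_{k+1} .. \alpha_{|\alpha|-1}) \neq \emptyset$

i.e., we have a horizontal ambiguity.

[Figure: the two derivation trees for $\omega$ with the same top production $p$; the substrings derived from $\alpha_i$ and from $\alpha_j$ differ between the two trees, and the split after $\alpha_k$ falls at different positions of $\omega$]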

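// Proof (Lemma 1b): “$\Leftarrow$”

$G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Leftarrow$ $G$ unambiguous

Contrapositively:

$G$ ambiguous $\Leftarrow$ $G$ vertically ambiguous $\vee$ $G$ horizontally ambiguous

Assume “$G$ vertically ambiguous” (vertical conflict):

Then for some $n \in N$:

$n \Rightarrow \alpha \Rightarrow^* a$, $n \Rightarrow \alpha' \Rightarrow^* a$, $L(\alpha) \cap L(\alpha') \supseteq \{a\}$

But then derive (using reachability + derivability of $n$):

$s \Rightarrow^* x\,n\,\gamma \Rightarrow x\,\alpha\,\gamma \Rightarrow^* x\,a\,\gamma \Rightarrow^* x\,a\,y$

$s \Rightarrow^* x\,n\,\gamma \Rightarrow x\,\alpha'\,\gamma \Rightarrow^* x\,a\,\gamma \Rightarrow^* x\,a\,y$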

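// Proof (Lemma 1b): “$\Leftarrow$” (cont’d)

Assume “$G$ horizontally ambiguous” (horizontal conflict):

Then for some $n \in N$ (with $\Cap$ the overlap operator):

$n \Rightarrow \alpha\beta$, $L(\alpha) \Cap L(\beta) \neq \emptyset$

i.e.

$\exists x, y \in \Sigma^* : \exists a \in \Sigma^+ : x, xa \in L(\alpha) \wedge y, ay \in L(\beta)$

But then derive (using reachability + derivability of $n$):

$s \Rightarrow^* v\,n\,\gamma \Rightarrow v\,\alpha\beta\,\gamma \Rightarrow^* v\,x\,\beta\,\gamma \Rightarrow^* v\,x\,ay\,\gamma \Rightarrow^* v\,x\,a\,y\,w$

$s \Rightarrow^* v\,n\,\gamma \Rightarrow v\,\alpha\beta\,\gamma \Rightarrow^* v\,xa\,\beta\,\gamma \Rightarrow^* v\,xa\,y\,\gamma \Rightarrow^* v\,x\,a\,y\,w$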

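// (Over-)Approximation (A)

(Over-)Approximation $A : E^* \to \mathcal{P}(\Sigma^*)$:

$\forall \alpha \in E^* : L(\alpha) \subseteq A(\alpha)$

$A$ decidable iff “$\cap$” and “$\Cap$” (overlap) are decidable on co-dom($A$)

Approximated vertical ambiguity ($G$ is vertically unambiguous w.r.t. $A$ iff):

$\forall n \in N : \forall \alpha, \alpha' \in \pi(n) : \alpha \neq \alpha' \Rightarrow A(\alpha) \cap A(\alpha') = \emptyset$

Approximated horizontal ambiguity ($G$ is horizontally unambiguous w.r.t. $A$ iff):

$\forall n \in N : \forall \alpha \in \pi(n) : \forall i \in [1..|\alpha|-1] : A(\alpha_0 .. \alpha_{i-1}) \Cap A(\alpha_i .. \alpha_{|\alpha|-1}) = \emptyset$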

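// Ambiguity Approximation

Theorem 2:

$G$ vertically unambiguous w.r.t. $A$ $\wedge$ $G$ horizontally unambiguous w.r.t. $A$ $\Rightarrow$ $G$ unambiguous

Proof:

“Conflicts w/ smaller sets $\Rightarrow$ conflicts w/ larger sets”:

$A(\alpha) \cap A(\alpha') = \emptyset \Rightarrow L(\alpha) \cap L(\alpha') = \emptyset$

$A(\alpha) \Cap A(\beta) = \emptyset \Rightarrow L(\alpha) \Cap L(\beta) = \emptyset$ (with $\Cap$ the overlap operator)

Hence: unambiguous w.r.t. $A$ (vertically and horizontally) $\Rightarrow$ $G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Leftrightarrow$ $G$ unambiguous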

// Compositionality (of A’s)

Corollary 3:

$A$, $A'$ decidable (over-)approximations $\Rightarrow$ $A \cap A'$ decidable (over-)approximation

Proof: Follows from definition [omitted…]

i.e. “Approximations are compositional”!:

[Figure: three unambiguous / ambiguous diagrams, showing $A$, $A'$ and $A \cap A'$]

// Choice(s) of A?

$A_{\Sigma^*}(\alpha) = \Sigma^*$ (constant)

Worst approximation
…but safe approximation!

Useless: “Cannot determine that any grammars are unambiguous”

[Figure: worst approximation of the unambiguous / ambiguous boundary]

// Choice(s) of A? (cont’d)

$A_{MN}(\alpha)$ = [Mohri-Nederhof]($\alpha$)

CFG $\to$ DFA (NFA) Approximation

“Regular Approximation of Context-Free Grammars through Transformation” [Mohri-Nederhof, 2000]

Properties of this “Black-box”:
  Good (over-)approximation!
  Works on language, L(G); not on grammatical structure, G
  Approximation parameterizable:
    E.g. unfold nonterminals “n” times

// Decidability (of $A_{MN}$)

“$\cap$” decidable (using DFAs): $X \cap Y = \emptyset$, in $O(|X_{NFA}||Y_{NFA}|)$
“$\Cap$” (overlap) decidable (using DFAs): $X \Cap Y = \emptyset$, in $O(|X_{NFA}||Y_{NFA}|)$

$\Rightarrow$ $A_{MN}$ decidable
  With potential counterexamples (using DFAs)

$G$ vertically unambiguous w.r.t. $A_{MN}$ $\wedge$ $G$ horizontally unambiguous w.r.t. $A_{MN}$ $\Rightarrow$ $G$ unambiguous

// Decision Algorithm for ($X \Cap Y$)

For X,Y regular languages (with $\Cap$ the overlap operator):

All overlappings, “xay”, as DFAs; variant of “$\cap$” construction!

[Figure: $X_{NFA}$ and $Y_{NFA}$ combined into $[X;Y]_{NFA}$; a path in $[X;Y]_{NFA}$ reads $x$, then $a$ (followed in both $X_{NFA}$ and $Y_{NFA}$), then $y$, giving an overlapping string “xay” in $X \Cap Y$]
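One way to decide whether the overlap of two regular languages is empty is a lockstep (product-style) search over their automata. The sketch below is an assumption-laden illustration, not the construction from the slide: it takes partial DFAs encoded as `(start, accepting, delta)` triples, and the names `reach` and `overlap_nonempty` are hypothetical. No complexity claim is intended.

```python
from collections import deque


def reach(starts, successors):
    """All states reachable from `starts` in zero or more steps."""
    seen, todo = set(starts), deque(starts)
    while todo:
        state = todo.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def overlap_nonempty(X, Y, alphabet):
    """True iff some x, y and non-empty a satisfy x, xa in X and y, ay in Y.

    X and Y are partial DFAs (start, accepting_set, delta), where
    delta maps (state, symbol) to the next state.
    """
    (x0, x_acc, x_delta), (y0, y_acc, y_delta) = X, Y

    def x_succ(p):
        return [x_delta[p, c] for c in alphabet if (p, c) in x_delta]

    def pair_succ(d1, d2):
        def succ(pair):
            s, t = pair
            return [(d1[s, c], d2[t, c]) for c in alphabet
                    if (s, c) in d1 and (t, c) in d2]
        return succ

    # 1. x in X: accepting states of X reachable from its start state.
    after_x = reach([x0], x_succ) & x_acc
    # 2. Non-empty a: run X (after x) and Y (from its start) in lockstep
    #    for at least one step, keeping Y-states where xa is accepted by X.
    xy = pair_succ(x_delta, y_delta)
    first = [t for p in after_x for t in xy((p, y0))]
    after_a = {q for (p, q) in reach(first, xy) if p in x_acc}
    # 3. y with y in Y and ay in Y: run Y from its start and from q in lockstep.
    yy = pair_succ(y_delta, y_delta)
    return any(s in y_acc and t in y_acc
               for q in after_a for (s, t) in reach([(y0, q)], yy))


# Hypothetical DFAs for X = {x, xa} and Y = {y, ay}.
X_DFA = (0, {1, 2}, {(0, "x"): 1, (1, "a"): 2})
Y_DFA = (0, {1}, {(0, "y"): 1, (0, "a"): 2, (2, "y"): 1})
```

For `X_DFA` and `Y_DFA` the search succeeds with the witness `x = "x"`, `a = "a"`, `y = "y"`, matching the overlapping string `xay` of the horizontal ambiguity example.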

// Three Approximation Answers

Y!:
  “G definitely not ambiguous”!

“?/D?”:
  “?”: “Don’t know”
    …could not find any potential counterexamples.
  “D?”: “Don’t know” – look at over-approx, D
    …and here are all potential counterexamples
    Note: some strings do not even parse!
    Improve: parse $S \subseteq_{FIN} D$ $\Rightarrow$ subset of real counterexamples

[Figure: the answers “Y!”, “?” and “D?” relative to the true answer]

// Regaining Lost Precision!

Now parse all counterexamples!
  i.e. parse DFA, $D_{DFA}$:

1) i.e. construct:
  $L(C_{CFG}) = L(D_{DFA}) \cap L(G_{CFG})$
  Decidable in $O(|D||G|)$

2) Decide emptiness on C:
  $L(C_{CFG}) = \emptyset$
  Decidable in $O(|C| = |D||G|)$

Only potential counterexamples that parse!

// Three Approximation Answers

Y!:
  “G definitely not ambiguous”!

“?/C?”:
  “?”: “Don’t know”
    …could not find any counterexamples.
  “C?”: “Don’t know” – look at over-approx, C
    …and here are all potential counterexamples
    Note: all strings actually parse (maybe not ambiguously)!
    Improve: extract finite under-approximation...?

[Figure: the answers “Y!”, “?” and “C?” relative to the true answer]

// Asymptotic (Time) Complexity

[Mohri-Nederhof]: $O(n^2 v h)$
Vertical Amb: $O(n^3 v^4 h^4)$
Horizontal Amb: $O(n^3 v^3 h^5)$
Total: $O(n^3 v^3 h^4 (v+h)) \subseteq O(g^5)$

Grammar layout ($n$ nonterminals, at most $v$ alternatives each, each of length at most $h$):

  $N_1$ : $e_{1,1}$ … $e_{a,1}$
        : …
        : $e_{1,p}$ … $e_{a,p}$

$n = |N|$
$v = \max\{\, |\pi(M)| : M \in N \,\}$
$h = \max\{\, |\alpha| : \alpha \in \pi(M), M \in N \,\}$
$g = n v h = |G|$
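The size parameters defined on this slide are straightforward to compute. As a small illustration, assuming a grammar is given as a dictionary from nonterminals to lists of alternatives (the function name `size_parameters` and this encoding are assumptions), applied to the expression grammar from the later example slide:

```python
def size_parameters(productions):
    """Return (n, v, h, g) for a grammar given as nonterminal -> alternatives."""
    n = len(productions)                                   # number of nonterminals
    v = max(len(alts) for alts in productions.values())    # most alternatives
    h = max(len(alpha)                                     # longest alternative
            for alts in productions.values() for alpha in alts)
    return n, v, h, n * v * h


# Hypothetical encoding of E -> E + T | T, T -> T * F | F, F -> ( E ) | x.
EXPR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("x",)],
}
```

Here `size_parameters(EXPR)` gives `n = 3`, `v = 2`, `h = 3` and therefore `g = 18`.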

// Related Work (Dynamic)

Dynamic disambiguation:
  “Disambiguation-by-convention”:
    Longest match, most specific match, …
  Customizable:
    [Bison v. 1.5+]: %dprec, %merge
    [ASF+SDF]: “disambiguation filters”

Dynamic ambiguity interception:
  GLR ([Tomita], [Earley], [Bison], [ASF+SDF], …)

// Related Work (Static)

Static disambiguation:
  “Disambiguation-by-convention”:
    First match, most specific match, …
  Customizable:
    [Yacc]: %left, %right, %nonassoc, %prec

Static ambiguity interception:
  LL(k), [LA-]LR(k), …
  Our work goes here (but for GLR)!

// Implementation

disamb (Java)

In progress…!

// Assessment

Quality of approximation ~ ~ Quantity of false-positives

Precision:
  Our \ LR(k) ?
  LR(k) \ Our ?
  False-positives ?
  Characterize “?” / “N?”
    In terms of grammatical structure ?

Efficiency (in practice…)

In progress…!

// Example: Expression chains

  E -> E + T
    -> T
  T -> T * F
    -> F
  F -> ( E )
    -> x

…!?
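To get a feel for this grammar, one can count derivation trees for all short strings by brute force. This is only a bounded sanity check under stated assumptions (no empty productions, no cycles of unit productions) and is not the static analysis described in the talk. The names `tree_counter` and `ambiguous_strings` are hypothetical; the second grammar, `E -> E E | x`, is added purely as an ambiguous contrast.

```python
from functools import lru_cache
from itertools import product


def tree_counter(productions, text):
    """Return count(symbol, i, j): number of derivation trees of text[i:j].

    Assumes no empty productions and no cycles of unit productions.
    """
    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        if symbol not in productions:            # terminal symbol
            return int(text[i:j] == symbol)
        return sum(count_seq(alpha, i, j) for alpha in productions[symbol])

    @lru_cache(maxsize=None)
    def count_seq(alpha, i, j):
        if len(alpha) == 1:
            return count(alpha[0], i, j)
        # Each remaining symbol must cover at least one character.
        return sum(count(alpha[0], i, k) * count_seq(alpha[1:], k, j)
                   for k in range(i + 1, j - len(alpha) + 2))

    return count


def ambiguous_strings(productions, start, alphabet, max_len):
    """Strings up to max_len with more than one derivation tree."""
    found = []
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            if tree_counter(productions, text)(start, 0, n) > 1:
                found.append(text)
    return found


EXPR = {
    "E": (("E", "+", "T"), ("T",)),
    "T": (("T", "*", "F"), ("F",)),
    "F": (("(", "E", ")"), ("x",)),
}
JUXTAPOSE = {"E": (("E", "E"), ("x",))}
```

A call such as `ambiguous_strings(EXPR, "E", "x+*()", 4)` finds no ambiguous string within the bound, whereas `JUXTAPOSE` already yields two trees for `xxx`. A bounded search of this kind can only refute unambiguity; it cannot establish it.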

// Example: Balancing Structures

Nasty:

  S -> A A
  A -> x A x
    -> y

Example string: xxyxxxyx

Requires:
  Unbounded memory (# x’es)
    i.e. CFG structure
  Unbounded lookahead
    i.e. any finite k is insufficient

False-positives!

// Future Work

Permit

  E -> E E

With disambiguating conventions for:
  Associativity
  Precedence

Parsing optimization:
  Exploit compile-time analysis information at runtime


// Conclusion

But wait, there’s more…

“Approximating Context-Free Grammar Ambiguity”

Context-free grammar ambiguity is undecidable.

However, just because it’s undecidable doesn’t mean there aren’t (good) approximations! Indeed, the whole area of static analysis works on “side-stepping undecidability”.

We exhibit a characterization of context-free ambiguity which induces a whole framework for (over-)approximation.

In particular, we give an approximation based on the [Mohri-Nederhof, 2000] regular approximation of context-free grammars and show how to boost the precision even further.

// Lessons Learned

Framework:
  Plug in your favorite (over-)approximation of $L(\alpha)$
  Even take intersection of them: $A = \bigcap_i A_i$
    Approximation closed under intersection

Methodology:
  Just because it’s undecidable doesn’t mean there aren’t (good) approximations
    Quantity of false-positives (practically motivated)
    What to do with false-positives (practically motivated)

Don’t be scared of undecidability


[bonus slides]

// Membership: Decidable!

Membership (aka. “parsing”):
  Given $\omega \in \Sigma^*$:
  “Is the string, $\omega$, in the language of G”: $\omega \in L(G)$

Algorithms:
  LL(k): $O(|\omega|)$
  [LA-]LR(k): $O(|\omega|)$
  GLR: $O(|\omega|^3)$
  …

// Parsing Greedily Left-to-Right

The ambiguity problem for [X;Y]...

[Figure: the same string split in two ways, as x | y and as x’ | y’]

... may occur in 2 cases:
  - (“too little”): Not possible (due to greediness)
  - (“too much”): Only this is a problem!

In fact, already a problem if x’ “goes too far”:

[Figure: x’ extending past x into y]

Thus, we only have a problem if (“X eats into Y”):

$X \cap X;(\mathrm{prefix}(Y) \setminus \{\epsilon\})$ (compare with the overlap $X \Cap Y$)

Essentially disambiguation by picking longest match