10. Lexicalized and Probabilistic Parsing (Speech and Language Processing)
Presenter: 정영임 / Date: October 6, 2007
Table of Contents
12.1 Probabilistic Context-Free Grammars
12.2 Problems with PCFGs
12.3 Probabilistic Lexicalized CFGs
Introduction
Goal
- To build probabilistic models of sophisticated syntactic information
- To use this probabilistic information in an efficient probabilistic parser

Uses of a probabilistic parser
- Disambiguation
  - The Earley algorithm can represent the ambiguities of sentences, but it cannot resolve them
  - A probabilistic grammar can choose the most probable interpretation
- Language modeling
  - For speech recognizers, N-gram models have been used to predict upcoming words and to help constrain the search for words
  - A probabilistic version of a more sophisticated grammar can provide additional predictive power for speech recognition
12.1 Probabilistic Context-Free Grammars
Probabilistic Context-Free Grammars (PCFGs)
- A PCFG is also known as a Stochastic Context-Free Grammar (SCFG)
- A PCFG has five parameters, the 5-tuple G = (N, Σ, P, S, D):
  1. A set of non-terminal symbols (or "variables") N
  2. A set of terminal symbols Σ (disjoint from N)
  3. A set of productions P, each of the form A → β, where A is a non-terminal and β is a string of symbols from the infinite set of strings (Σ ∪ N)*
  4. A designated start symbol S
  5. A function D assigning a probability to each rule in P, written P(A → β) or P(A → β | A)
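As a concrete illustration of the 5-tuple, a minimal Python sketch of a rule table is shown below. The grammar itself is a made-up toy, not the book's miniature grammar; the key property checked is that D gives each non-terminal's expansions probabilities summing to 1.

```python
# Toy PCFG as a rule table: LHS -> list of (RHS, probability).
# Illustrative grammar, not taken from the book.
pcfg = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("Det", "N"), 0.7), (("N",), 0.3)],
    "VP": [(("V", "NP"), 0.6), (("V",), 0.4)],
}

def is_properly_normalized(grammar, tol=1e-9):
    """Check that the expansions of every non-terminal sum to 1."""
    return all(abs(sum(p for _, p in rhs_list) - 1.0) < tol
               for rhs_list in grammar.values())
```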
Sample PCFG for a miniature grammar
Probability of a particular parse T
- The product of the probabilities of all the rules r used to expand each node n in the parse tree:
  P(T) = ∏_{n ∈ T} p(r(n))
- By the definition of conditional probability, P(T,S) = P(T) P(S|T)
- Since a parse tree includes all the words of the sentence, P(S|T) = 1, and so P(T,S) = P(T)
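The product above can be computed by a simple recursion over the tree. The tree encoding and the rule probabilities below are illustrative choices of mine, not the book's example.

```python
# A parse tree is a tuple (label, child, ...); leaf children are strings.
# rule_prob maps (LHS, RHS-tuple) to the PCFG probability of that rule.
# Values are illustrative.
rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.5,
    ("NP", ("fish",)): 0.5,
    ("VP", ("V", "NP")): 0.6,
    ("V", ("eats",)): 1.0,
}

def parse_probability(tree, rule_prob):
    """P(T): product of p(r(n)) over every node n in the parse tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_prob[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= parse_probability(child, rule_prob)
    return p

tree = ("S", ("NP", "she"), ("VP", ("V", "eats"), ("NP", "fish")))
```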
[Figure: probability computations for two candidate parse trees; the parse with the higher probability is preferred]
Formalization of selecting the parse with the highest probability
- The best tree for a sentence S, out of the set of parse trees for S (which we'll call τ(S)):
  T̂(S) = argmax_{T ∈ τ(S)} P(T|S) = argmax_{T ∈ τ(S)} P(T,S) / P(S)
- Since P(S) is constant for each tree, we can eliminate it:
  T̂(S) = argmax_{T ∈ τ(S)} P(T,S)
- Since P(T,S) = P(T):
  T̂(S) = argmax_{T ∈ τ(S)} P(T)
Probability of an ambiguous sentence
- The sum of the probabilities of all the parse trees for the sentence:
  P(S) = Σ_{T ∈ τ(S)} P(T,S) = Σ_{T ∈ τ(S)} P(T)
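Both operations, picking the argmax for disambiguation and summing for the sentence probability, reduce to one line each once P(T) is known for every parse. The parse labels and probability values below are made up for illustration.

```python
# Illustrative P(T) values for the parses of one ambiguous sentence.
parse_probs = {"attach-PP-to-verb": 1.5e-6, "attach-PP-to-noun": 1.7e-6}

# Disambiguation: T-hat = argmax_T P(T).
best_parse = max(parse_probs, key=parse_probs.get)

# Language modeling: P(S) = sum over all parses T of P(T).
sentence_prob = sum(parse_probs.values())
```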
Other issues with PCFGs

Prefix probabilities
- Jelinek and Lafferty (1991) give an algorithm for efficiently computing the probability of a prefix of a sentence
- Stolcke (1995) describes how the standard Earley parser can be augmented to compute these prefix probabilities
- Jurafsky et al. (1995) describe an application of a version of this algorithm as the language model for a speech recognizer

Consistency
- A PCFG is said to be consistent if the sum of the probabilities of all sentences in the language equals 1
- Certain kinds of recursive rules cause a grammar to be inconsistent by causing infinitely looping derivations for some sentences
- Booth and Thompson (1973) give more details on consistent and inconsistent grammars
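As an illustration of inconsistency (my own example, not from the slides): for the grammar S → S S with probability p and S → a with probability 1-p, the total probability x of all finite derivations satisfies x = (1-p) + p·x², and iterating from 0 converges to the smallest fixed point, which falls below 1 when p > 0.5.

```python
def total_derivation_mass(p, iters=1000):
    """Smallest fixed point of x = (1 - p) + p * x**2, found by iteration.
    This is the total probability of all finite derivations of the
    grammar S -> S S (prob p) | a (prob 1 - p)."""
    x = 0.0
    for _ in range(iters):
        x = (1 - p) + p * x * x
    return x
```

With p = 0.3 the mass is 1 (consistent); with p = 0.6 it is 2/3, so one third of the probability mass is lost to infinite derivations.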
Probabilistic CYK Parsing of PCFGs
Parsing problem for PCFGs
- Can be interpreted as computing the most likely parse for a given sentence

Algorithms for computing the most likely parse
- Augmented Earley algorithm (Stolcke, 1995)
  - The probabilistic Earley algorithm is somewhat complex to present
- Probabilistic CYK (Cocke-Younger-Kasami) algorithm
  - The CYK algorithm is worth understanding
Probabilistic CYK (Cocke-Younger-Kasami) algorithm
- The CYK algorithm is essentially a bottom-up parser using a dynamic programming table
- Bottom-up parsing makes it more efficient when processing lexicalized grammars
- Probabilistic CYK parsing was first described by Ney (1991)
- The CYK parsing algorithm presented here follows Collins (1999) and Aho and Ullman (1972)
Input, output, and data structure of Probabilistic CYK
Pseudocode for Probabilistic CYK algorithm
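The pseudocode figure is not reproduced here, so the following is a minimal Python sketch of probabilistic CYK for a grammar in Chomsky Normal Form. The function name, data layout, and the toy grammar used below are my own choices, not the book's.

```python
from collections import defaultdict

def prob_cyk(words, lexicon, rules):
    """Probabilistic CYK for a grammar in Chomsky Normal Form.
    lexicon: {(A, word): prob} for lexical rules A -> word
    rules:   {(A, B, C): prob} for binary rules A -> B C
    Returns (table, back): table[(i, j, A)] is the best probability of
    non-terminal A spanning words[i:j]; back holds backpointers."""
    n = len(words)
    table = defaultdict(float)
    back = {}
    # Base case: lexical rules fill in the length-1 spans.
    for i, w in enumerate(words):
        for (A, word), p in lexicon.items():
            if word == w:
                table[(i, i + 1, A)] = p
    # Recursive case: build longer spans from pairs of shorter ones.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for (A, B, C), p in rules.items():
                    cand = p * table[(i, k, B)] * table[(k, j, C)]
                    if cand > table[(i, j, A)]:
                        table[(i, j, A)] = cand
                        back[(i, j, A)] = (k, B, C)
    return table, back
```

The best parse probability for the whole sentence is `table[(0, n, S)]`, and the tree can be recovered by following the backpointers in `back`.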
Learning PCFG Probabilities
Where do PCFG probabilities come from?
- Obtaining the PCFG probabilities from a treebank
  - Treebank: a corpus of already-parsed sentences
  - E.g., the Penn Treebank (Marcus et al., 1993): the Brown Corpus, the Wall Street Journal, and parts of the Switchboard corpus
- Probability of each expansion of a non-terminal
  - By counting the number of times that expansion occurs, then normalizing
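The count-and-normalize estimate P(A → β) = Count(A → β) / Count(A) can be sketched as below. The tree encoding and the two-tree "treebank" are illustrative, not Penn Treebank data.

```python
from collections import Counter

def count_rules(tree, counts):
    """Increment counts[(LHS, RHS)] for every rule used in the tree.
    Trees are tuples (label, child, ...); leaf children are strings."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child, counts)

def estimate_pcfg(treebank):
    """Relative-frequency estimate: Count(A -> beta) / Count(A)."""
    counts = Counter()
    for tree in treebank:
        count_rules(tree, counts)
    lhs_totals = Counter()
    for (lhs, _), c in counts.items():
        lhs_totals[lhs] += c
    return {rule: c / lhs_totals[rule[0]] for rule, c in counts.items()}

toy_treebank = [
    ("S", ("NP", "she"), ("VP", ("V", "eats"), ("NP", "fish"))),
    ("S", ("NP", "fish"), ("VP", ("V", "eats"), ("NP", "fish"))),
]
```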
Where do PCFG probabilities come from? (continued)
- Learning the PCFG probabilities by first parsing a (raw) corpus
- Unambiguous sentences
  - Parse the corpus
  - Increment a counter for every rule in the parse
  - Then normalize to get probabilities
- Ambiguous sentences
  - We need to keep a separate count for each parse of a sentence and weight each partial count by the probability of the parse it appears in
  - The standard algorithm for computing this is the Inside-Outside algorithm, proposed by Baker (1979)
  - Cf. Manning and Schütze (1999) for a complete description
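The weighted partial counts for ambiguous sentences can be sketched as follows; this shows only the fractional-count idea behind the Inside-Outside algorithm, not the full algorithm, and the rule names and probabilities are made up.

```python
from collections import Counter

def weighted_rule_counts(parses):
    """parses: list of (parse_prob, rule_list) for ONE sentence.
    Each parse contributes fractional counts weighted by its
    probability, normalized by the sentence probability P(S)."""
    z = sum(p for p, _ in parses)      # P(S): sum over all parses
    counts = Counter()
    for p, rules in parses:
        w = p / z                      # posterior weight of this parse
        for rule in rules:
            counts[rule] += w
    return counts
```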
12.2 Problems with PCFGs