09.04.2003 CSA2050: DCG I 1
CSA2050 Introduction to Computational Linguistics
Lecture 8
Definite Clause Grammars
Logic Rules and Grammar Rules
Basic Question: what is the connection between logic rules and grammar rules?
∀x ∀y male(x) & parent(x,y) → father(x,y)
S → NP VP
They are both concerned with the definition of predicates.
Logic Rules and Grammar Rules
Logic: arbitrary n-ary predicates, e.g. raining; clever(x); father(x,y); between(x,y,z)
Grammar Rules: predicates over text segments, e.g. np(x); vp(y); s(z).
Text Segments
A text segment is a sequence of consecutive words.
A text segment can be identified by two pointers, if we assign names to the spaces between words:
0 the 1 cat 2 sat 3 on 4 the 5 mat 6
(0,6) is the whole sentence.
(0,2) is the first noun phrase.
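The pointer idea can be tried out directly in Prolog. The following is only a minimal sketch: segment/4 is a hypothetical helper, not part of the lecture, and SWI-Prolog with library(lists) is assumed.

```prolog
:- use_module(library(lists)).

% segment(+Words, +From, +To, -Segment)   (hypothetical helper)
% Segment is the list of words between positions From and To,
% where position 0 lies before the first word.
segment(Words, From, To, Segment) :-
    length(Prefix, From),          % Prefix: the first From words
    append(Prefix, Rest, Words),   % Rest: everything after position From
    Len is To - From,
    Len >= 0,
    length(Segment, Len),          % Segment: the next To-From words
    append(Segment, _, Rest).
```

With the sentence above, the query ?- segment([the,cat,sat,on,the,mat], 0, 2, S). binds S to the first noun phrase, and the pair (0,6) yields the whole sentence.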
From Grammar Rules to Logic
The general statement made by the CF rule S → NP, VP
can be summarised using predicates over segments with the following logic statement
NP(p1,p) & VP(p,p2) => S(p1,p2)
From Grammar Rules to Logic
0 the 1 cat 2 sat 3 on 4 the 5 mat 6
[Diagram: NP spans (0,2), VP spans (2,6), S spans (0,6)]
From Logic to Prolog
Each logic statement of the form
NP(p1,p) & VP(p,p2) => S(p1,p2)
corresponds to the "definite clause"
s(P1,P2) :- np(P1,P), vp(P,P2).
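To see the clause at work before any grammar for NP and VP exists, the sub-analyses can be stated by hand. This is a sketch under that assumption: the np/2 and vp/2 facts below are hypothetical stand-ins for the sentence "0 the 1 cat 2 sat 3 on 4 the 5 mat 6", not rules from the lecture.

```prolog
% Hand-stated segments (assumptions for illustration only).
np(0, 2).   % "the cat"
np(4, 6).   % "the mat"
vp(2, 6).   % "sat on the mat"

% The definite clause for S -> NP VP: an s spans P1..P2 if an np
% spans P1..P and a vp spans P..P2 for some middle position P.
s(P1, P2) :- np(P1, P), vp(P, P2).
```

The query ?- s(0,6). succeeds, and ?- s(X,Y). finds only that one sentence, because no vp starts at position 6 where the second np ends.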
Converting a Grammar
S → NP, VP
NP → N
NP → Det N
VP → V NP
s(P1,P2) :- np(P1,P), vp(P,P2).
np(P1,P2) :- n(P1,P2).
np(P1,P2) :- det(P1,P), n(P,P2).
vp(P1,P2) :- v(P1,P), np(P,P2).
Lexical Categories and Rules
Lexical categories are those which are not defined in the grammar itself (e.g. N and V in our grammar).
Instead, they are defined by the words that they rewrite:
V → run, sleep, talk, etc.
Lexical categories always derive exactly one input token.
Lexical Rules
A rule defining lexical category C must express the following information: there is a C between positions p1 and p2 if some word of syntactic category C spans those positions.
There are many different ways to translate such a rule into a Prolog clause.
Each way needs to make reference to how the input sentence is represented.
Defining Lexical Categories
Each category is defined in terms of the words it can rewrite.
d(P1,P2) :- input(P1,P2,[the]).
n(P1,P2) :- input(P1,P2,[cat]).
n(P1,P2) :- input(P1,P2,['John']).
v(P1,P2) :- input(P1,P2,[ate]).
How is the input sentence represented?
Representing the Input
Define the predicate input(P1,P2,L) such that P1 and P2 are positions and L is a list containing the words spanning those positions
Checkpoint: show how to represent the input sentence "John ate the cat"
John ate the cat
input(0,1,['John']).
input(1,2,[ate]).
input(2,3,[the]).
input(3,4,[cat]).
Checkpoints
Why is John in quotes?
Why use a list of one element rather than an atom?
Is this the only way to do it?
Complete Program
1. Grammar
s(P1,P2) :- np(P1,P), vp(P,P2).
np(P1,P2) :- n(P1,P2).
np(P1,P2) :- d(P1,P), n(P,P2).
vp(P1,P2) :- v(P1,P2).
vp(P1,P2) :- v(P1,P), np(P,P2).
2. Lexicon
d(P1,P2) :- input(P1,P2,[the]).
n(P1,P2) :- input(P1,P2,[cat]).
n(P1,P2) :- input(P1,P2,['John']).
v(P1,P2) :- input(P1,P2,[ate]).
3. Input
input(0,1,['John']).
input(1,2,[ate]).
input(2,3,[the]).
input(3,4,[cat]).
4. Query
?- s(0,4).
Trace of query ?- vp(1,4)
1 1 Call: vp(1,4) ?
2 2 Call: v(1,4) ?
3 3 Call: input(1,4,[ate]) ?
3 3 Fail: input(1,4,[ate]) ?
2 2 Fail: v(1,4) ?
2 2 Call: v(1,_349) ?
3 3 Call: input(1,_349,[ate]) ?
3 3 Exit: input(1,2,[ate]) ?
2 2 Exit: v(1,2) ?
4 2 Call: np(2,4) ?
5 3 Call: n(2,4) ?
6 4 Call: input(2,4,[cat]) ?
6 4 Fail: input(2,4,[cat]) ?
6 4 Call: input(2,4,[John]) ?
6 4 Fail: input(2,4,[John]) ?
5 3 Fail: n(2,4) ?
5 3 Call: d(2,_1338) ?
6 4 Call: input(2,_1338,[the]) ?
6 4 Exit: input(2,3,[the]) ?
5 3 Exit: d(2,3) ?
7 3 Call: n(3,4) ?
8 4 Call: input(3,4,[cat]) ?
8 4 Exit: input(3,4,[cat]) ?
7 3 Exit: n(3,4) ?
4 2 Exit: np(2,4) ?
1 1 Exit: vp(1,4) ?
Representing the Sentence Using Difference Lists
We can represent the input as a pair of pointers.
The first pointer points to the entire list.
The second pointer points to a suffix of the list.
The represented list is the difference between the two lists.
input(['John',ate,the,cat],['John',ate,the,cat]).
input(['John',ate,the,cat],[ate,the,cat]).
input(['John',ate,the,cat],[the,cat]).
input(['John',ate,the,cat],[]).
input([X|Y],Y,X).
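A minimal sketch of how the position-based grammar might run over difference lists. It rests on one assumption that departs from the clause above: the third argument of input/3 is wrapped as a one-element list, input([X|Y],Y,[X]), so that lexicon clauses of the form d(P1,P2) :- input(P1,P2,[the]) can be reused unchanged.

```prolog
% A word spans the "positions" [Word|Rest] and Rest.
% Assumption: the word is returned as a one-element list.
input([Word|Rest], Rest, [Word]).

% Grammar: identical to the position-based version; P1, P, P2 are
% now lists (suffixes of the sentence) rather than integers.
s(P1,P2)  :- np(P1,P), vp(P,P2).
np(P1,P2) :- n(P1,P2).
np(P1,P2) :- d(P1,P), n(P,P2).
vp(P1,P2) :- v(P1,P2).
vp(P1,P2) :- v(P1,P), np(P,P2).

% Lexicon
d(P1,P2) :- input(P1,P2,[the]).
n(P1,P2) :- input(P1,P2,[cat]).
n(P1,P2) :- input(P1,P2,['John']).
v(P1,P2) :- input(P1,P2,[ate]).
```

The query ?- s(['John',ate,the,cat], []). succeeds: the first argument is the whole sentence, and the empty second argument says that nothing may be left over. No separate input facts are needed, since the sentence travels with the query.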
DCG Notation
The conversion of CF rules into Prolog is so simple that it can be done automatically.
Clauses in DCG notation:
s --> np, vp.
np --> d, n.
n --> [cat].
are automatically translated when read in to
s(P1,P2) :- np(P1,P), vp(P,P2).
np(P1,P2) :- d(P1,P), n(P,P2).
n([cat|L],L).
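One way to check the translation is to put DCG rules and hand-written clauses side by side and compare their answers. This is a sketch only: the _h suffix is a hypothetical naming choice to avoid clashing with the generated predicates, the rule d --> [the] is added so that np can run, and the exact clauses a given Prolog system generates may differ in variable names or internal form.

```prolog
% DCG notation: translated automatically when the file is loaded.
np --> d, n.
d  --> [the].
n  --> [cat].

% Hand-written equivalents with two extra list arguments each.
np_h(P1, P2) :- d_h(P1, P), n_h(P, P2).
d_h([the|L], L).
n_h([cat|L], L).
```

Both ?- np([the,cat,sat], Rest). and ?- np_h([the,cat,sat], Rest). bind Rest to [sat], the part of the input left over after the noun phrase.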
DCG Notation
Every DCG rule takes the form
nonterminal --> expansion
where expansion is any of:
A nonterminal symbol, e.g. np
A list of terminal symbols, e.g. [each,other]
A null constituent, written [ ]
A plain Prolog goal enclosed in braces, e.g. {write('Found')}
A series of any of these expansions joined by commas.
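The forms of expansion can be combined in one small grammar. The rules below are a hypothetical illustration, not part of the lecture's grammar; the names opt_adv and recip and the words like and really are assumptions chosen for the example.

```prolog
% A series of expansions joined by commas, ending in a Prolog goal.
vp --> v, opt_adv, recip, { write('Found'), nl }.

v --> [like].              % a list holding one terminal symbol

opt_adv --> [].            % null constituent: the adverb may be absent
opt_adv --> [really].

recip --> [each, other].   % a list of two terminal symbols
```

The goal in braces runs when the rest of the rule has matched and consumes no input, so ?- vp([like,each,other], []). and ?- vp([like,really,each,other], []). both succeed and print Found.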
Complete DCG
1. Grammar
s --> np, vp.
np --> n.
np --> d, n.
vp --> v.
vp --> v, np.
2. Lexicon
d --> [the].
n --> [cat].
n --> ['John'].
v --> [ate].
3. Input (given as a list in the query)
4. Query
?- s(['John', ate, the, cat], []).
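The complete DCG can be loaded as it stands and queried either by calling s/2 directly, as above, or through the built-in phrase/2, which supplies the empty remainder list itself. A sketch, assuming SWI-Prolog or another system that provides phrase/2:

```prolog
% Grammar
s  --> np, vp.
np --> n.
np --> d, n.
vp --> v.
vp --> v, np.

% Lexicon
d --> [the].
n --> [cat].
n --> ['John'].
v --> [ate].
```

Here ?- phrase(s, ['John', ate, the, cat]). succeeds, while the same query with john in lower case fails, because the lexicon only lists the quoted atom 'John'. Since this grammar has no recursion, ?- phrase(s, S). can also be used to enumerate on backtracking every sentence it generates.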