45
Artificial Intelligence Resolution Readings: Chapter 9 of Russell & Norvig.

Norvig. - University of Iowahomepage.cs.uiowa.edu/~hzhang/c145/notes/9-resolution.pdf · Example Knowledge Base contd.::: it is a crime for an American to sell weapons to hostile

  • Upload
    dangdat

  • View
    216

  • Download
    0

Embed Size (px)

Citation preview

Artificial Intelligence

Resolution

Readings: Chapter 9 of Russell &Norvig.

ResolutionFull first-order version:

`1 ∨ · · · ∨ `k, m1 ∨ · · · ∨ mn

(`1 ∨ · · · ∨ `i−1 ∨ `i+1 ∨ · · · ∨ `k ∨ m1 ∨ · · · ∨ mj−1 ∨ mj+1 ∨ · · · ∨ mn)θ

where UNIFY(`i,¬mj) = θ.

Or equivalently,L ∨ `i, M ∨ mj

(L ∨ M)θ

where

L = (`1 ∨ · · · ∨ `i−1 ∨ `i+1 ∨ · · · ∨ `k),

M = (∨m1 ∨ · · · ∨ mj−1 ∨ mj+1 ∨ · · · ∨ mn), and

UNIFY(`i,¬mj) = θ.

Resolution: Examples

¬Pet(Joe) ∨ Cat(Joe)∨↓

Bird(Joe) Parrot(x) ∨ ¬↓

Bird(x)

¬Pet(Joe) ∨ Cat(Joe) ∨ Parrot(Joe)(+)

(+) Unify(Bird(x), Bird(Joe)) = {x/Joe}

¬↓

On(x, y) ∨Above(x, y) On(B, A)∨↓

On(A, B)

Above(A, B) ∨ On(B, A)(*)

(*) Unify(On(x, y), On(A, B)) = {x/A, y/B}

¬Bird(x)∨↓

Feathers(x) ¬↓

Feathers(y) ∨Flies(y)

¬Bird(x) ∨ Flies(x)(†)

(†) Unify(Feathers(x), F eathers(y)) = {y/x}

Resolution Proofs by RefutationLet Γ = {¬P (w) ∨ Q(w), ¬Q(y) ∨ S(y), P (x) ∨ R(x), ¬R(z) ∨ S(z)}.

To prove S(A) from Γ, try to prove False from Γ ∪ {¬S(A)}

~P(w) \/ Q(w) ~Q(y) \/ S(y)

~P(w) \/ S(w)

{y/w}

S(x) \/ R(x)

{w/x}P(x) \/ R(x)

~R(z) \/ S(z){z/x}

S(x) ~S(A){x/A}

False

Unification

A variable unifies with any term in which it does notoccur.

A constant symbol unifies only with itself

Two compound terms (or two atomic sentences) unifyonly if

their top symbol is the same,that symbol is applied to the same number ofargumentsthe arguments unify pairwise.

Examples of Unification

1. x and A unify but A and B do not

2. x and F (A,H(x)) do not unify (x occurs in F (A,H(x)))

3. F (H(x)) and G(H(x)) do not unify (different topsymbols)

4. F (x) and F (x, y) do not unify (different number ofarguments)

5. F (t1, t2, . . . , tn) and F (s1, s2, . . . , sn) unify if t1 and s1

unify, t2 and s2 unify, . . . , tn and sn unify.

Unifiersθ is a unifier of t1 and t2 if θ(t1) is identical to θ(t2).

Some terms may have more than one unifier:

G(x, H(z)) and G(A, y)

unify with θ1 := {x/A, y/H(A), z/A}

or with θ2 := {x/A, y/H(z)}

But some unifiers are more general than others. Here,θ2 above is more general than θ1.

Formally, unifier σ2 is more general than unifier σ1 ifthere exists a substituion σ3 such that σ1 = σ2σ3.

When unifying two terms we are interested in their mostgeneral unifier (mgu).

The Unification ProcedureInput: a set V of the form {t =? t′}

Output: either FAIL or a set of the form {v1 =? t1, . . . , vm =? tm}

Procedure: Repeatedly apply (in any order) the transformations rules

Delete, Eliminate, Decompose, Orient, Occurs, Mismatch

to V until no more transformations apply.

If the procedure returns {v1 =? t1, . . . , vm =? tm} on input{t =? t′}, then {v1/t1, . . . , vm/tm} is a mgu of t and t′.

Notation:

v denotes a variable,

t denotes a term (possibly a variable).

The Transformation Rules

Decompose. U ∪ {f(t1, . . . , tn) =? f(s1, . . . , sn)} −→ U ∪ {ti =? si : 1 ≤ i ≤ n}

Orient. U ∪ {t =? v} −→ U ∪ {v =? t}

Delete. U ∪ {v =? v} −→ U

Eliminate. U ∪ {v =? t}, v ∈ Vars(U) \ Vars(t) −→ U [v/t] ∪ {v =? t}

Mismatch. U ∪ {f(t1, . . . , tm) =? g(s1, . . . , sn)},

f, g distinct or m 6= n −→ FAIL

Occurs. U ∪ {v =? t}, v 6= t but v ∈ Vars(t) −→ FAIL

Notation: Vars(U),Vars(t) denote the set of variables in U, t , respectively;v denotes a variable, s, t denote terms (possibly variables);f, g denote function/constant symbols.

The Unification Procedure: ExamplesUnify: F (G(H(y)), H(A)) and F (G(x), x)

{F (G(H(y)), H(A)) =? F (G(x), x)}Decompose

−−−−−−→

{G(H(y)) =? G(x), H(A) =? x}Decompose

−−−−−−→

{H(y) =? x, H(A) =? x}Orient

−−−−→

{x =? H(y), H(A) =? x}Eliminate x

−−−−−−→

{x =? H(y), H(A) =? H(y)}Decompose

−−−−−−→

{x =? H(y), A =? y}Orient

−−−−→

{x =? H(y), y =? A}Eliminate y

−−−−−−→

{x =? H(A), y =? A}

mgu = {x/H(A), y/A}

The Unification Procedure: ExamplesUnify: F (A,H(y), z) and F (A, x, z)

{F (A,H(y), z) =? F (A, x, z)}Decompose−−−−−−→

{A =? A, H(y) =? x, z =? z}Decompose−−−−−−→

{H(y) =? x, z =? z}Orient−−−→

{x =? H(y), z =? z}Delete−−−−→

{x =? H(y)}

mgu = {x/H(y)}

The Unification Procedure: ExamplesUnify: F (H(y), z) and F (G(x), z)

{F (H(y), z) =? F (G(x), z)}Decompose−−−−−−→

{H(y) =? G(x), z =? z}Mismatch−−−−−→

FAIL

do not unify

Unify: F (y) and F (H(x, y))

{F (y) =? F (H(x, y))}Decompose−−−−−−→

{y =? H(x, y)}Occurs−−−−→

FAIL

do not unify

Example Knowledge Base

The law says that it is a crime for an American to sellweapons to hostile nations. The country Nono, an enemy ofAmerica, has some missiles, and all of its missiles weresold to it by Colonel West, who is American.

Prove that Col. West is a criminal

Example Knowledge Base contd.. . . it is a crime for an American to sell weapons to hostilenations:

Example Knowledge Base contd.. . . it is a crime for an American to sell weapons to hostilenations:

American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) =⇒Criminal(x)Nono . . . has some missiles

Example Knowledge Base contd.. . . it is a crime for an American to sell weapons to hostilenations:

American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) =⇒Criminal(x)Nono . . . has some missiles, i.e.,∃x Owns(Nono, x) ∧ Missile(x):

Owns(Nono,M1) and Missile(M1). . . all of its missiles were sold to it by Colonel West

Example Knowledge Base contd.. . . it is a crime for an American to sell weapons to hostilenations:

American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) =⇒Criminal(x)Nono . . . has some missiles, i.e.,∃x Owns(Nono, x) ∧ Missile(x):

Owns(Nono,M1) and Missile(M1). . . all of its missiles were sold to it by Colonel West

∀x Missile(x) ∧ Owns(Nono, x) =⇒ Sells(West, x,Nono)Missiles are weapons:

Example Knowledge Base contd.. . . it is a crime for an American to sell weapons to hostilenations:

American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) =⇒Criminal(x)Nono . . . has some missiles, i.e.,∃x Owns(Nono, x) ∧ Missile(x):

Owns(Nono,M1) and Missile(M1). . . all of its missiles were sold to it by Colonel West

∀x Missile(x) ∧ Owns(Nono, x) =⇒ Sells(West, x,Nono)Missiles are weapons:

Missile(x) ⇒ Weapon(x)An enemy of America counts as “hostile”:

Example Knowledge Base contd.. . . it is a crime for an American to sell weapons to hostilenations:

American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) =⇒Criminal(x)Nono . . . has some missiles, i.e.,∃x Owns(Nono, x) ∧ Missile(x):

Owns(Nono,M1) and Missile(M1). . . all of its missiles were sold to it by Colonel West

∀x Missile(x) ∧ Owns(Nono, x) =⇒ Sells(West, x,Nono)Missiles are weapons:

Missile(x) ⇒ Weapon(x)An enemy of America counts as “hostile”:

Enemy(x,America) =⇒ Hostile(x)West, who is American . . .

American(West)The country Nono, an enemy of America . . .

Enemy(Nono,America)

Resolution Proof: Definite Clauses

American(West)

Missile(M1)

Missile(M1)

Owns(Nono,M1)

Enemy(Nono,America) Enemy(Nono,America)

Criminal(x)Hostile(z)LSells(x,y,z)LWeapon(y)LAmerican(x)L > > > >

Weapon(x)Missile(x)L >

Sells(West,x,Nono)Missile(x)L Owns(Nono,x)L> >

Hostile(x)Enemy(x,America)L >

Sells(West,y,z)LWeapon(y)LAmerican(West)L > > Hostile(z)L>

Sells(West,y,z)LWeapon(y)L > Hostile(z)L>

Sells(West,y,z)L> Hostile(z)L>L Missile(y)

Hostile(z)L>L Sells(West,M1,z)

> > L Hostile(Nono)L Owns(Nono,M1)L Missile(M1)

> L Hostile(Nono)L Owns(Nono,M1)

L Hostile(Nono)

Criminal(West)L

Why Resolution by Refutation WorksLet Γ be a set of clauses and α an atomic sentence.

To show that α follows from Γ, prove False fromΓ ∪ {¬α}.

This proof method is sound because:

False provable from Γ ∪ {¬α} by resolution impliesΓ ∪ {¬α} |= False impliesΓ ∪ {¬α} is unsatisfiable implies¬(Γ ∧ ¬α) is valid implies¬Γ ∨ α is valid impliesΓ ⇒ α is valid impliesΓ |= α

Resolution by Refutation

1. Obtain CNF (KB ∧ ¬α)

2. Apply resolution steps to the CNF

3. If the empty clause is generated, then KB |= α.

A proof by contradiction is also called a refutation.

Theorem: The resolution rule with factoring is arefutation-complete inference system:

if a set of clauses is unsatisfiable,then False is provable from it by resolution.

A TechnicalityThe resolution rule alone is not complete.

For example, the set {

α︷ ︸︸ ︷

P (x) ∨ P (y),

β︷ ︸︸ ︷

¬P (x) ∨ ¬P (y)} isunsatisfiable.

But all we can do is start with

P (x) ∨ P (y) ¬P (x) ∨ ¬P (y)

P (y) ∨ ¬P (y)︸ ︷︷ ︸

γ

But then, resolving γ with α, or γ with β leads nowhere.(try it!)

FactorsLet ϕ be a CNF sentence of the form below whereα1 ∨ α2 are both positive (or negative) literals

α1 ∨ α2 ∨ β

If ϕ1, ϕ2 unify with mgu θ then

θ(α1 ∨ β)

is a factor of ϕ.

Example: P (x) is a factor of P (x) ∨ P (y) becauseP (x), P (y) unify with mgu = {y/x}

Too Many Choices!A resolution-based inference system has just one ruleto apply to build a proof.

However, at any step there may be several possibleways to apply the resolution rule.

P

Q

R R

~R

~Q~P \/ R

False ~P

~Q \/ R~P \/ Q

~P

Several resolution strategies have been developed inorder to reduce the search space.

Some Resolution StrategiesUnit resolution: Unit resolution only.

Input resolution: One of the two clauses must be an inputclause.

Set of support: One of the two clauses must be from adesigned set called set of support. New resolvent areadded into the set of support.

Linear resolution The latest resolvent must be used in thecurrent resolution.

Note: The first 3 strategies above are incomplete.The Unit resolution strategy is equivalent to the Inputresolution strategy: a proof in one strategy can beconverted into a proof in the other strategy.

Logic programmingSound bite: computation as inference on logical KBs

Logic programming Ordinary programming

1. Identify problem Identify problem

2. Assemble information Assemble information

3. Tea break Figure out solution

4. Encode information in KB Program solution

5. Encode problem instance as facts Encode problem instance as data

6. Ask queries Apply program to data

7. Find false facts Debug procedural errors

Should be easier to debug Capital(NewY ork, US) thanx := x + 2 !

Prolog: A Logic Programming LanguageA pure Prolog program consists of a set of definiteclauses (only one positive literal). Instead of writing aclause as A ∨ ¬B1 ∨ ¬B2 ∨ · · · ∨ ¬Bn, we write it as

A :- B1, B2, ..., Bn

A query is a negative clause (all literals are negative):

?- C1, C2, ..., Cm

which is the clause ¬C1 ∨ ¬C2 ∨ · · · ∨ ¬Cm.

Horn Clauses = Definite Clauses + Negative Clauses

Prolog SystemsA resolution inference system on Horn clauses + bells & whistles

Widely used in Europe, Japan (basis of 5th Generation project)Compilation techniques ⇒ 60 million LIPS

Program = set of definite clauses, which are of form:head :- literal1, . . . literaln.

Variables are identifiers starting with a capital; others are constants.

Built-in predicates for arithmetic etc., e.g., X is Y*Z+3

Efficient unification by open coding

Efficient retrieval of matching clauses by direct linking

Depth-first, left-to-right resolution

Closed-world assumption (“negation as failure”) e.g., givenalive(X) :- not dead(X). alive(joe) succeeds ifdead(joe) fails.

Prolog: A Small Example1) father(bob, ken).

2) father(ken, joe).

3) father(joe, don).

4) mother(jan, don).

5) parent(X, Y) :- father(X, Y).

6) parent(X, Y) :- mother(X, Y).

7) ancestor(X, Y) :- parent(X, Z), parent(Z, Y).

8) ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).

9) ancestor(X, Y) :- ancestor(X, Z), parent(Z, Y).

?- ancestor(A, don).

A = ken;A = bob.

Resolution Proof in Prolog% it is a crime to sell weapons to hostile nations:

criminal(X):-american(X),weapon(Y),sells(X,Y,Z),hostile(Z).

% Nono ... has some missiles,

owns(nono,m1).

missile(m1).

% all of its missiles were sold to it by Colonel West

sells(west,X,nono) :- missile(X), owns(nono,X).

% Missiles are weapons

weapon(X) :- missile(X).

% An enemy of America counts as ‘‘hostile’’:

hostile(X) :- enemy(X,america).

% The country Nono, an enemy of America ...\al

enemy(nono,america).

% West, who is American ...\al

american(west).

?- criminal(Who).

Who = west.

Prolog examplesDepth-first search from a start state X:

goal(e).

successor(a,b).

successor(a,c).

successor(b,c).

successor(b,d).

successor(c,d).

successor(c,e).

dfs(X) :- goal(X).

dfs(X) :- write(X), successor(X,S), dfs(S).

?- dfs(a).

a b c d e.

No need to loop over S: successor succeeds for each

Prolog examplesAppending two lists to produce a third:

append([],Y,Y).

append([X|L],Y,[X|Z]) :- append(L,Y,Z).

?- append([1,2], [3,4], C).

C=[1,2,3,4].

?- append([1,2], B, [1,2,3,4]).

C=[3,4].

?- append(A,B,[1,2]).

A=[] B=[1,2]

A=[1] B=[2]

A=[1,2] B=[]

An Abstract InterpreterInput: Goal G, a list of literals, and program P , a list of definiteclauses.

Output: A substitution S which makes G a logical consequence of P ,or no if no such instances exist.

Algorithm:

Goals := G; S := [];

while Goals != [] do

A := delete_first(Goals);

Choose a (renamed) clause A’ :- B1, ..., Bn from P

such that A and A’ unify, with mgu S’.

(If no such clauses, go back to the last choice;

if no more choices, exit the while loop.)

Goals := append([B1, ..., Bn], Goals);

Goals := apply(S’, Goals);

S := combine(S’, S).

end

return S if Goals == [], no otherwise.

Search Trees in PrologA search tree is representation of all possible computation paths. Allpossible proofs of a query are represented in one search tree.

Consider, for example, the following program.

member(X,[X|Ys]).

member(X,[Y|Ys]) :- member(X,Ys).

The following is a search tree for the goal member(a,[a,b,a]).

member(a,[a,b,a])

/ \

succ member(a,[b,a])

/ \

fail member(a, [a])

/ \

succ member(a, [])

/ \

fail fail

Traversing Search TreeProlog traverses a search tree in a depth-first manner.

We may not find a proof even if one exists, because depth-firstsearch is an incomplete search procedure.

The order of clauses and the order of literals in a clause decide thestructure of the search tree.

Different goal orders result in search trees with different structures.Consider the following example

q(a).

p(s(Y)) :- p(Y).

The results for query p(X), q(X) and query q(X), p(X) are quitedifferent.

Changes in rule order permutes the branches of the search tree.

When writing Prolog program, make sure a successful path in thesearch tree appears before any infinite search tree path.

Unification in PrologA query ?- test. will succeed for the followingprogram:

less(X,succ(X)).test :- less(Y,Y).

Why? This because Prolog uses an unsound unificationalgorithm: The occur check is omitted for the purpose ofefficiency.

occur check: If v =? t and v occurs in t, then failure.

Another Example: An infinite loop for the query test.

test :- leq(Y,Y).leq(X,s(X)) :- leq(X,X).

Resolution Strategies in PrologSet of support: The query (negative clauses) is the onlyclause in the set of support.

Input strategy: One of the two clauses must be an inputclause.

Linear strategy: Only one resolvent is saved at any time.

The literals in the resolvent is organized as a stack:First In, Last Out.

The clauses are treated in the order they are listed.

The literals in one clause are treated in the order fromleft to right.

Note: The first three strategies are equivalent for Hornclauses.

Using GNU PrologThe download and documentation can be found at

http://pauillac.inria.fr/˜diaz/gnu-prolog/It’s available on every CS linux machine. Just type gprolog.GNU Prolog 1.2.16

By Daniel Diaz

Copyright (C) 1999-2002 Daniel Diaz

| ?- [’c:/c145/crime.pl’].

compiling c:\c145\crime.pl for byte code...

c:\c145\crime.pl compiled, 15 lines read - 1646 bytes written, 20 ms

(40 ms) yes

| ?- listing.

hostile(A) :-

enemy(A, america).

weapon(A) :-

missile(A).

Using GNU Prologsells(west, A, nono) :-

missile(A),

owns(nono, A).

enemy(nono, america).

missile(m1).

owns(nono, m1).

american(west).

criminal(A) :-

american(A),

weapon(B),

sells(A, B, C),

hostile(C).

(30 ms) yes

| ?- criminal(X).

X = west

yes

Negative Knowledge in PrologPure Prolog can never deduce negative information. Forexample, given the program

s(a).s(b).t(c).

we cannot deduce not s(c), nor not t(a).If we had the following program:

s(a).s(b).not s(c).t(c).

then we could deduce not s(c) as a logicalconsequence. But this is not a definite program (the headliteral must be positive).

The Closed World Assumption in PrologConsider again the following program P1:

s(a).

s(b).

t(c).

Suppose we knew that all true instances of s(X) andt(Y) were logical consequences of P1. That is to say,we assume that P1 expresses completely what weknow about the true instances of the relations s(X) andt(Y).

We are assured that s(c) is not a true instance of therelation s(X).

Hence, we conclude that s(c) is false, or equivalently,that not s(d) is true.

Negation as Failure in PrologWith the Closed World Assumption (CWA), we canimplement negation as failure. In this case we will allowthe inference that not G is true if we fail to prove G.

We cannot tell, in general, whether a goal G willsucceed or fail in a finite amount of time. If G has afinitely failed search tree (one which is finiteand has no success branches), we can define anegation as failure inference rule. This says that we mayconclude not A if A has a finitely failed search tree.

A typical implemention:

not(X) :- X, !, false.

Negation as Failure in PrologProlog’s negation doesn’t quite implement this. Why? just because agoal G has a finitely failed search tree doesn’t mean that the searchtree which Prolog traverses is finitely failed (remember the literalorder decides the search tree structure).

An example:

unmarried_student(X) :- not married(X), student(X).

student(bill).

married(joe).

The query unmarried student(X) fails, even though X=bill is asolution. Changing the order of the goals in the body ofunmarried student/1 avoids this difficulty.

Negated goals must be ground for negation as failure in Prolog towork correctly.

SummaryProlog is based on the Resolution method.

To make it efficient, it accepts only definite clauses.

To make it efficient, it uses an unsound unificationalgorithm.

To make it efficient, it doesn’t use factoring.

To make it efficient, it uses depth-first search.

To allow it to handle negative knowledge, the ClosedWorld Assumption is taken and the Negation As Failureis implemented.

Many other techniques are also used: cut, compilation,built-in data types, ...