5 Adversarial

Adversarial Search

Chapter 5

Outline

Optimal decisions

- pruning

Imperfect, real-time decisions

Stochastic Games

2

Games vs. search problems

"Unpredictable" opponent specifying a move for every possible opponent reply

Time limits since it is unlikely to find the goal, agent must approximate

3

Exercise

1. Tough question . Do you recognize this?

2. Name the game.

3. Is this an interesting or a dull

game? Why?

4. Characterize the game:

Performance Measure,

Environment, Actuators and

Sensors (PEAS)

4

Game tree (2-player,

deterministic, turns)

5

Exercise in pairs

Can you think of a heuristic function for the Tic-Tac-Toe game?

Using the heuristic function, devise a strategy to play tic-tac-toe.

If both players play their best, what is the depth of the tree (how many moves)?

(One move in this game corresponds to

two plies, where each ply is one players turn)

6

Tic-Tac-Toe heuristic function

7

Exercise in pairs

Compute the average branching factor.

Could the branching factor be reduced? How?

What would the reduced branching factor be?

You just prune the search tree.

8

Minimax

Perfect play for deterministic games

Idea: choose move to position with highest minimax value

= best achievable payoff against best play

E.g., 2-ply game:

9

Minimax

10

Minimax algorithm

11

Exercise in pairs

Analyze the minimax algorithm

Compare your strategy for Tic-Tac-Toe with the minimax algorithm

12

Properties of minimax

Complete? Yes (if tree is finite)

Optimal? Yes (against an optimal opponent)

Time complexity? O(bm)

Space complexity? O(bm) (depth-first exploration)

For chess, b 35, m 100 for "reasonable" games exact solution completely infeasible

13

Optimal decisions in multiplayer

games

14

- pruning example

15

- pruning example

16

- pruning example

17

- pruning example

18

- pruning example

19

Properties of -

Pruning does not affect final result

Good move ordering improves effectiveness of pruning

With "perfect ordering," time complexity = O(bm/2) doubles depth of search

A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)

20

- Prunning

is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max

If v is worse than , max will avoid it prune that branch

Define similarly for min

21

The - algorithm

22

The - algorithm

23

Imperfect Real-Time Decisions

Suppose we have 100 secs, explore 104

nodes/sec

106 nodes per move

Standard approach:

cutoff test:

e.g., depth limit (perhaps add quiescence search)

evaluation function

= estimated desirability of position

24

Evaluation functions

For chess, typically linear weighted sum of features

Where wi = the values of pieces, e.g. 9 for queen, 3 for bishops and 1 for pawns, with fi(s) = feature of position. fi could be the numbers of each kind of piece.

It assumes independence between features

Could be modified to include nonlinear combination of features

wi could be estimated using machine learning

EVAL = 11 + 22 + + =

=1

25

Cutting off search

MinimaxCutoff is identical to MinimaxValue except 1. Terminal? is replaced by Cutoff?

2. Utility is replaced by Eval

Does it work in practice?

bm = 106, b=35 m=4

4-ply lookahead is a hopeless chess player! 4-ply human novice 8-ply typical PC, human master 12-ply Deep Blue, Kasparov

Kasparov m=12. If Minimax m=100

26

Stochastic Games

White movement

Black movement

Element of

chance

27

Stochastic Games

Characterize the environment for Backgammon

Deterministic/Stochastic

Continuous/Discrete

Dynamic/Semidynamic/Static

Fully observable/Partially observable

Episodic/Sequential

28

Stochastic Games

29

Stochastic Games

Positions do not have definite minimax values; compute expected value of a

position

Generalized minimax value to expectiminimax value

EXPECT IM IN IM AX =

UT IL ITY if TERM INALTEST

maxEXPECT IM IN IM AX RESULT , if PLAYER = MAX

minEXPECT IM IN IM AX RESULT , if PLAYER = MIN

EXPECT IM IN IM AX RESULT ,

if PLAYER = CHANCE

r represents a possible dice roll 30

Stochastic Games

EXPECTIMINIMAX, in addition to MIN and MAX, must also consider the possible dice rolls

O(bmnm) where b=branching factor, m=depth of search tree, and n is the number of distinct rolls.

Alternative: Montecarlo Simulation

From start position simulate thousands of games against itself using random dice rolls

31

Partially Observable Games

Battleship

Kriegspiel: Variant of Chess but each player only sees his/her pieces on the board. The referee does see all the

pieces

Each player in his/her turn announces move to the referee; the opponent does not hear the move

The referee announces if the move is legal or illegal; if illegal player may keep proposing moves until a legal one is found

Referee announces e.g. Capture on square X, or Check by D, where D is the direction of the check.

Referee also announces checkmate or stalemate.

32

Card games

Question:

Why are most card games different to dice

games?

Is Domino similar to a card game?

33

Domino

Possible algorithm:

Consider all possible deals of the invisible dominos

Solve each one as if it were a fully observable game

Choose the move that has the best outcome averaged over all the deals, i.e., if each deal s occurs with probability P(s), then the move to chose is:

Run MINIMAX if computationally feasible; or H-MINIMAX otherwise

argmax MIN IM AX RESULT ,

35

Domino

Is it computationally feasible to consider all possible deals of the invisible dominos?

Alternative: Montecarlo simulation

Take a random sample of N deals where the probability of deal s appearing in the sample is proportional to P(s):

This method is called Averaging over clairvoyance

argmax

1

MIN IM AX RESULT ,

=1

37

Games in practice

Checkers: Chinook uses alpha-beta search with a database of 39 x 1012 precomputed endgame positions.

Chess: Deep Blue uses 30 IBM RS/6000 processors doing alpha-beta search. Uses 480 custom VLSI chess processors, Searched up to 30 x 109 positions per move with an evaluation function with over 8000 features.

Othello: human champions refuse to compete against computers, which are too good.

Go: Computer programs in 19x19 board play at advanced amateur level. In go, b > 300, so most programs use Montecarlo simulation and pattern knowledge bases to suggest plausible moves.

Bridge: Bridge Baron program won the 1997 computer bridge championship. Uses complex hierarchical plans involving high level ideas, but is not optimal.

Scrabble: using a dictionary chooses highest-scoring move. Good but not expert player since game is partially observable and stochastic. Quackle defeated world champion David Boys 3-2 in 2006.

38

Summary

Games are fun to work on!

They illustrate several important points about AI

perfection is unattainable must approximate

good idea to think about what to think about (metareasoning)

39

Documents

5 Adversarial