5 Adversarial

Embed Size (px)

DESCRIPTION

Inteligencia artificial

Citation preview

  • Adversarial Search

    Chapter 5

  • Outline

    Optimal decisions

    - pruning

    Imperfect, real-time decisions

    Stochastic Games

    2

  • Games vs. search problems

    "Unpredictable" opponent specifying a move for every possible opponent reply

    Time limits since it is unlikely to find the goal, agent must approximate

    3

  • Exercise

    1. Tough question . Do you recognize this?

    2. Name the game.

    3. Is this an interesting or a dull

    game? Why?

    4. Characterize the game:

    Performance Measure,

    Environment, Actuators and

    Sensors (PEAS)

    4

  • Game tree (2-player,

    deterministic, turns)

    5

  • Exercise in pairs

    Can you think of a heuristic function for the Tic-Tac-Toe game?

    Using the heuristic function, devise a strategy to play tic-tac-toe.

    If both players play their best, what is the depth of the tree (how many moves)?

    (One move in this game corresponds to

    two plies, where each ply is one players turn)

    6

  • Tic-Tac-Toe heuristic function

    7

  • Exercise in pairs

    Compute the average branching factor.

    Could the branching factor be reduced? How?

    What would the reduced branching factor be?

    You just prune the search tree.

    8

  • Minimax

    Perfect play for deterministic games

    Idea: choose move to position with highest minimax value

    = best achievable payoff against best play

    E.g., 2-ply game:

    9

  • Minimax

    10

  • Minimax algorithm

    11

  • Exercise in pairs

    Analyze the minimax algorithm

    Compare your strategy for Tic-Tac-Toe with the minimax algorithm

    12

  • Properties of minimax

    Complete? Yes (if tree is finite)

    Optimal? Yes (against an optimal opponent)

    Time complexity? O(bm)

    Space complexity? O(bm) (depth-first exploration)

    For chess, b 35, m 100 for "reasonable" games exact solution completely infeasible

    13

  • Optimal decisions in multiplayer

    games

    14

  • - pruning example

    15

  • - pruning example

    16

  • - pruning example

    17

  • - pruning example

    18

  • - pruning example

    19

  • Properties of -

    Pruning does not affect final result

    Good move ordering improves effectiveness of pruning

    With "perfect ordering," time complexity = O(bm/2) doubles depth of search

    A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)

    20

  • - Prunning

    is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max

    If v is worse than , max will avoid it prune that branch

    Define similarly for min

    21

  • The - algorithm

    22

  • The - algorithm

    23

  • Imperfect Real-Time Decisions

    Suppose we have 100 secs, explore 104

    nodes/sec

    106 nodes per move

    Standard approach:

    cutoff test:

    e.g., depth limit (perhaps add quiescence search)

    evaluation function

    = estimated desirability of position

    24

  • Evaluation functions

    For chess, typically linear weighted sum of features

    Where wi = the values of pieces, e.g. 9 for queen, 3 for bishops and 1 for pawns, with fi(s) = feature of position. fi could be the numbers of each kind of piece.

    It assumes independence between features

    Could be modified to include nonlinear combination of features

    wi could be estimated using machine learning

    EVAL = 11 + 22 + + =

    =1

    25

  • Cutting off search

    MinimaxCutoff is identical to MinimaxValue except 1. Terminal? is replaced by Cutoff?

    2. Utility is replaced by Eval

    Does it work in practice?

    bm = 106, b=35 m=4

    4-ply lookahead is a hopeless chess player! 4-ply human novice 8-ply typical PC, human master 12-ply Deep Blue, Kasparov

    Kasparov m=12. If Minimax m=100

    26

  • Stochastic Games

    White movement

    Black movement

    Element of

    chance

    27

  • Stochastic Games

    Characterize the environment for Backgammon

    Deterministic/Stochastic

    Continuous/Discrete

    Dynamic/Semidynamic/Static

    Fully observable/Partially observable

    Episodic/Sequential

    28

  • Stochastic Games

    29

  • Stochastic Games

    Positions do not have definite minimax values; compute expected value of a

    position

    Generalized minimax value to expectiminimax value

    EXPECT IM IN IM AX =

    UT IL ITY if TERM INALTEST

    maxEXPECT IM IN IM AX RESULT , if PLAYER = MAX

    minEXPECT IM IN IM AX RESULT , if PLAYER = MIN

    EXPECT IM IN IM AX RESULT ,

    if PLAYER = CHANCE

    r represents a possible dice roll 30

  • Stochastic Games

    EXPECTIMINIMAX, in addition to MIN and MAX, must also consider the possible dice rolls

    O(bmnm) where b=branching factor, m=depth of search tree, and n is the number of distinct rolls.

    Alternative: Montecarlo Simulation

    From start position simulate thousands of games against itself using random dice rolls

    31

  • Partially Observable Games

    Battleship

    Kriegspiel: Variant of Chess but each player only sees his/her pieces on the board. The referee does see all the

    pieces

    Each player in his/her turn announces move to the referee; the opponent does not hear the move

    The referee announces if the move is legal or illegal; if illegal player may keep proposing moves until a legal one is found

    Referee announces e.g. Capture on square X, or Check by D, where D is the direction of the check.

    Referee also announces checkmate or stalemate.

    32

  • Card games

    Question:

    Why are most card games different to dice

    games?

    Is Domino similar to a card game?

    33

  • Domino

    Possible algorithm:

    Consider all possible deals of the invisible dominos

    Solve each one as if it were a fully observable game

    Choose the move that has the best outcome averaged over all the deals, i.e., if each deal s occurs with probability P(s), then the move to chose is:

    Run MINIMAX if computationally feasible; or H-MINIMAX otherwise

    argmax MIN IM AX RESULT ,

    35

  • Domino

    Is it computationally feasible to consider all possible deals of the invisible dominos?

    Alternative: Montecarlo simulation

    Take a random sample of N deals where the probability of deal s appearing in the sample is proportional to P(s):

    This method is called Averaging over clairvoyance

    argmax

    1

    MIN IM AX RESULT ,

    =1

    37

  • Games in practice

    Checkers: Chinook uses alpha-beta search with a database of 39 x 1012 precomputed endgame positions.

    Chess: Deep Blue uses 30 IBM RS/6000 processors doing alpha-beta search. Uses 480 custom VLSI chess processors, Searched up to 30 x 109 positions per move with an evaluation function with over 8000 features.

    Othello: human champions refuse to compete against computers, which are too good.

    Go: Computer programs in 19x19 board play at advanced amateur level. In go, b > 300, so most programs use Montecarlo simulation and pattern knowledge bases to suggest plausible moves.

    Bridge: Bridge Baron program won the 1997 computer bridge championship. Uses complex hierarchical plans involving high level ideas, but is not optimal.

    Scrabble: using a dictionary chooses highest-scoring move. Good but not expert player since game is partially observable and stochastic. Quackle defeated world champion David Boys 3-2 in 2006.

    38

  • Summary

    Games are fun to work on!

    They illustrate several important points about AI

    perfection is unattainable must approximate

    good idea to think about what to think about (metareasoning)

    39