Lab: Adversarial Search
CSC261 - Artificial Intelligence - Weinman
- Summary:
- We explore MINIMAX search in preparation for
implementing a cutoff a-b search and experiment with state
evaluation functions.
Preparation
- Make a directory for this lab.
-
mkdir somewhere/adversarial
cd somewhere/adversarial
- Copy the starter files to your new direcdtory
-
cp ~weinman/courses/CSC261/code/adversarial/*.scm .
- Open your Scheme environment and create a new Scheme file called
prep.scm in your new directory.
Exercises
A: Getting started with Barranca and MINIMAX search
When we play the game, anyone using MINIMAX-DECISION needs
a UTILITY function that takes a state as a parameter and produces
a number. The state of the game is represented the same way for both
players. Therefore, the utility function will need to distinguish
whether a state is advantageous for the player or its opponent. Thus,
in creating a utility we will specify who we are advocating for (that
is, whose definition of utility, player 1's or player 2's).
In barranca, the state contains the player getting the next turn,
the list of numbers each player has, and the list of remaining numbers.
Thus, when evaluating utility, the procedure needs to know which list
of player-possessing numbers belongs to it. In all the games provided
(Barranca, Tic-Tac-Toe, Mancala), player 1 is represented by #t
and player 2 represented by #f.
- Add commands to load some relevant starter files to your environment.
-
(load "minimax.scm")
(load "barranca.scm")
- Create a barranca game with n=4 number possibilities and a target
k=7.
-
(define barranca (make-barranca-game 4 7))
- Use barranca-utility-fun (provided in barranca.scm)
to create a utility for player 1 with a target of 7:
-
(define barranca-player1-utility (barranca-utility-fun #t 7))
This procedure gives 1 for wins, 0 for draws, and -1 for losses.
- Ignoring complete game play for a moment, ask the search routine to
find us the best opening move (and its value) for player 1.
-
> (max-value barranca
(game-start-state barranca)
barranca-player1-utility)
This produces the pair (action . utility).
- Based on this information, what number should player 1 choose? Which
player should win this game?
- Usually we'll just want the optimal move, so ask for that more directly.
-
> (minimax-search barranca
(game-start-state barranca)
barranca-player1-utility)
Does this give you what you expected?
- What's actually going on inside minimax-search?
- Open the file minimax.scm.
- Set DO-DISPLAY to #t,
This will display a complete trace of the search tree.
- A barranca state is represented as a list of the form
-
(next-player player-1-numbers player-2-numbers available-numbers)
Re-run minimax search and trace the first ply of output.
-
> (minimax-search barranca
(game-start-state barranca)
barranca-player1-utility)
If you are unable to make sense of the trace, you may want to read
the end note.1
- Restore the value of DO-DISPLAY in minimax.scm to
#f.
B: Moving up to tic-tac-toe
- Add the tic-tac-toe routines to the top of your Scheme file.
-
(load "tictactoe.scm")
- Create a tic-tac-toe game. There are no free parameters in this game,
so the call is straightforward.
-
(define ttt (make-tictactoe-game))
- Just like barranca, a tic-tac-toe utility function has to know on
whose behalf it is evaluating a state's utility The procedure
tictactoe-utility-fun simply takes an argument indicating
whether it needs to produce a procedure for player 1 (#t)
or player 2 (#f). Going one step further than before, we
use the utility function to create a player for the game playing engine
that will be used to select moves.
-
(define ttt-smart-player1 (make-minimax-player ttt (tictactoe-utility-fun #t)))
(define ttt-smart-player2 (make-minimax-player ttt (tictactoe-utility-fun #f)))
- Of course, optimal play in tic-tac-toe should always lead to a draw.
How can we pit these fearless competitors against each other? We simply
use the game-play engine, giving it the game (so it knows the rules)
and the two decision-making procedures.
-
> (game-play ttt ttt-smart-player1 ttt-smart-player2)
After showing the initial board, it may take several seconds for a
move to appear. Does the game play out the way you expect?
- Of course, the actions being reported may not be meaningful to you,
but they correspond to the rules of the game, as internally represented.
The slots on the 3×3 grid are simply numbered from top-to-bottom,
left-to-right, starting at 0.
Verify the states displayed correspond to these action numbers.
- What if we don't want to have an optimal player? Create a remarkably
simple agent decision function that always chooses the first possible
action.
-
(define ttt-lazy-player
(lambda (state)
(caar ((game-successors-fun ttt) state))))
The successor function produces a list of pairs. Because the first
item in the list is a pair (first-action . resulting-state),
taking the car of the first item gives us the actual action
choice. (Combining the two leads to the caar shown above.)
The astute reader will notice that we haven't checked for the case
when there are no successors. We need not, since the game
play engine will not ask us for a play when there is none to make.
- What do you expect to happen when our lazy agent squares off against
the optimal player? Verify your prediction:
-
> (game-play ttt ttt-smart-player1 ttt-lazy-player)
- How important is this utility function anyhow? What if we gave player
1 the utility function used by an optimal player 2? Even if our latter
player is extremely lazy, player 1 will in fact be choosing moves
that benefit player 2. What do you expect to happen?
-
> (game-play ttt ttt-smart-player2 ttt-lazy-player)
Don't be misled by the variable names. Player 2 is actually ttt-lazy-player.
C: Mancala
In mancala, the state is represented as a pair (next-player
. board). The board is represented as a list of 14 holes, as follows.
Indices 0-5 are the holes for player 1 with index 6 as player 1's
mancala. Indices 7-12 are player 2's holes and index 13 is player
2's mancala. The figure on page one illustrates the indices along
with the contents of each slot. This corresponds to the list
-
(4 4 4 4 4 4 0 4 4 4 4 4 4 0)
Because mancala has a moderately large branching factor (six early
in the game), we may not always be able to search a game tree all
the way to terminal states. As mentioned in the lab assignment, we
can instruct the minimax search to cutoff at a particular depth (number
of plies) and then apply an evaluation function for estimating
the utility of a given game board. Let's see how to do all of that.
- Add the mancala routines and a simple evaluation function to the top
of your Scheme file.
-
(load "mancala.scm")
(load "mancala-player.scm")
- Add the minimax with cutoff routines to the top of your Scheme file.
-
(load "cutoff-minimax.scm")
- Create a mancala game.
-
(define mancala (make-mancala-game))
- Use the naive evaluation function provided in mancala-player.scm,
which simply counts the stones in our player's mancala.
-
(define mancala-player1-eval (simple-mancala-eval #t))
- Sanity check this function by asking it to evaluate the initial state
of the game. What result do you expect? Verify this result.
-
> (mancala-player1-eval (game-start-state mancala))
- Use this evaluation function to create a player that only searches,
3 plies deep in the tree.
-
(define mancala-player1 (make-cutoff-minimax-player mancala 3 mancala-player1-eval))
- Ask the player what opening move it chooses.
-
> (mancala-player1 (game-start-state mancala))
- For comparison, create an even simpler (lazier) player that simply
chooses the first action available
-
(define mancala-lazy--player
(lambda (state)
...)
- Use game-play to pit these two players, mancala-player1
and mancala-lazy--player against one another. Who do you
expect to win?
- Use simple-mancala-eval to create an evaluation function
for player 2, and make-cutoff-minimax-player to create a
player that is allowed to search four plies deep, rather than the
three plies player 1 was allowed.
-
(define mancala-player2 ...)
- What will happens when these play each other? (Aside from the fact
that player 2 will take exponentially longer). Use game-play
to verify your prediction. (Keep reading the rest of the lab while
you wait for the game to finish.)
You should note that going any deeper will make the search even more
painfully slow. Hence, we will need to use alpha-beta pruning to ignore
irrelevant sub-trees.
Lab Assignment
Between this lab and snippets in the assignment document, you now
have nearly all the pieces necessary to implement your own evaluation
functions and alpha beta pruning for these games.
Open minimax.scm and study the implementation of max-value.
When you understand how this implements MAX-VALUE of AIMA Figure
5.3 (p. 166), proceed to begin the lab's assignment.
Copyright © 2011 Jerod
Weinman.
This work is licensed under a Creative
Commons Attribution-Noncommercial-Share Alike 3.0 United States License.
Footnotes:
1We start by following the alternating MAX and MIN nodes
down the left side of the tree, as 1 is chosen by player 1, then 2
is chosen by player 2, 3 is then chosen by player 1, and finally 4
is chosen by player 2. Here the two products are 3 and 8. Because
the move belongs to MAX (player 1) and there are no remaining
possibilities, the utility is reported as -1 (3 is much farther from
the goal of 7 than player 2's value of 8).
The c node above this then reports that its only option is to choose
4, and the utility is -1. The process is repeated for the case when
the MAX node has chosen 4 as its second value, instead of three.
This leaves player 2 ( MIN) with only one possibility, and the
final MAX node with no remaining moves. The utility is again
calculated, turning out again to be in favor of MIN (player
2). The MIN node reports that its only action is to choose
3, with a utility of -1. Just above this in the tree, the MAX
node now has a choice between 3 and 4, but both of these have a utility
of -1.