MAT109 · Erich Prisner · Franklin College · 2007-2009

# Theory Chapter 3: Sequential Games I: Perfect Information and no Randomness

Note to the Teacher: This is now a fairly long chapter. Most important are Sections 1 and 2---Displaying a sequential game using Extensive Form and analyzing it using Backwards Induction. The eight topics in Section 3 are not all important. Most of them do not occur in introductory Game Theory books,
"Life can only be understood backwards, but it must be lived forwards"
Søren Kierkegaard

NIM(6): Six stones lie on the desk. Beginning with White, Black and White alternate to remove either one or two stones from the desk. Whoever first faces an empty desk when having to move loses. The winner gets \$1, the loser loses \$1. What are the best strategies for the two players?
Student Activity: Try your luck in this applet against a friend (hit the "Start new with 6" button before you start). Or play the game against the computer. Play ten rounds where you start with 7 stones. Then play ten rounds where you start with 9 stones. Then play ten rounds where you start with 8 stones. Discuss your observations.

In the chapter we look at another class of rather simple games, namely sequential games. These are games where the players move one after another, no two players move at the same time. Among sequential games we concentrate on games of perfect information, where every player knows all previous decisions when he or she moves. Randomness will also only be included after the next chapter. We will learn a little terminology, see how to display these games either as game tree or game digraph, and how to analyze them using the tool of "backward induction" provided the game is finite. We conclude the chapter by discussing whether this solution found by backward induction would really always be what real players would play, by discussing another approach for sequential games and discussing the special roles two-person zero-sum games play here, and also by discussing briefly several existing sequential games like chess, checkers, tic-tac-toe. We also mention a rather simple approach how one would play if not knowing how to play.

## 1. Extensive Form: Game Tree and Game Digraph

A sequential game, is a game where the players move one after another; never are two players supposed to move at the same time. Remember that a position is usually any situation where a player has to make a move, a decision, out of a number of possible moves. Therefore in a sequential game such a position is linked to just one of the players. Later, when we allow randomness, positions may also be situations where a random experiment is performed, so later positions may also belong to the random player. In addition to these positions belonging to some player, there are also positions where the game ends, these are called end positions. They are the outcomes, and obviously nobody is supposed to move then anymore. But at these end positions, the payoff for each player must be given. Every game starts with one and only one start position.

Note that the relevant information known to the player about to move is also part of the position. Usually in a sequential game of perfect information this information is just the sequence of all previous decisions of the other players in all previous moves. If this differs, we usually also have different positions, except in cases where this difference is not relevant for the future. We will explain this more thoroughly below when we explain the difference between Game Trees and general Game Digraphs.

In this chapter we assume perfect information and don't allow random moves. Perfect information means in the context of sequential games that every player will immediately recognize in which position the game presently is.

The game changes its position by moves of players. At each non-end position, the player linked to that position chooses one of his or her possible moves. These possible moves are also known to everybody. Each of these possible moves of the player to move leads to another position.

SENATE RACE II: An incumbent senator (from a rightist party) runs against a challenger (from a leftist party). They are first choosing a political platform, leftist or rightist, where the senator has to move first. If both choose the same platform, the incumbent wins, otherwise the challenger wins. Assume that the value of winning is 10, the value of compromising their political views (by choosing a platform not consistent with it) is -5, and the payoff is the sum of these values if occurring.
There are four outcomes: If both choose "leftist", the incumbent senator wins 10 but loses 5 for compromising his or her view, so the total payoff is 5. The challenger doesn't win anything and doesn't lose anything in this case, so the payoff is 0. In the same way, if both choose "rightist", then the senator gets a payoff of 10 and the challenger of -5. If the senator chooses "leftist" and the challenger "rightist", then the senator "wins" -5 (loses 5) and the challenger gets a payoff of 5. Finally if the senator chooses "rightist" and the challenger "leftist", then the (now former) senator gets a payoff of 0 and the challenger of 10. In addition to these four end positions, there is also the start position where the incumbent senator has to decide between "leftist" and "rightist", and two more possible positions where the challenger has to chose. One position is where the senator has chosen "rightist", and the other where the senator has chosen "leftist".

Sequential games are usually visualized or even described graphically: For each position a small circle is drawn on the paper, and the possible transitions between the positions are expressed by an arrow between the corresponding circles. Very often, instead of arrows one uses straight lines or even curves, from left to right or from up to down, with the convention that the direction is leftwards (in the first case) or downwards (in the second). We usually use lines from left to right. The small circles are usually denoted as vertices or nodes, and the arrows are denoted as arcs. These vertices and arcs are usually labeled. At each non-end vertex, there is a label indicating to whom the vertex belongs, The outgoing arcs of a non-end position correspond to the possible moves the player can make at this position. They are labeled by the move names. The end positions are those vertices with now outgoing arcs---at them the payoffs for each player have to be noted. The whole (directed) graph with all labels is called the Game Digraph or Extensive Form of the sequential game. But note that we need to extend the definition further later, when we take randomness and non-perfect information into account.

In the SENATE RACE II example above, the Extensive Form looks as follows.

Let us look at another example:

The Extensive Form for the NIM(6) game described above, looks as follows. If, following chess terminology, the first mover is called "White" and the other one "Black", then White's positions, moves, and payoffs are colored red, and Black's positions, moves, and payoffs blue. Every position has a label indicating how many stones are still on the desk at that position, and who's move it is. For instance, "4w" is the position with four stones on the desk and White to move.

The Extensive Forms in the previous two examples are so-called Game Trees. Let's avoid the formal definition and just say that a Game Tree looks like a tree, rotated by 90 or 180 degrees. Trees arise if one considers a position to be the whole sequence of previous decisions made by all the players who have moved so far.

But in the previous example, it is also obvious that there is some redundancy. Why do we have two "3w" positions? Granted, both have a different history, one resulting from position "5b" with Black taking two stones, and the other from position "4b" with Black taking one stone. But that is not relevant for the future, as can be seen in the Game Tree by the fact that both subtrees hanging at the corresponding vertices are identical. So if White decides how to play in these two positions, he or she should come to the same conclusion.

If we identify such corresponding positions in the Game Tree of the previous Nim(6) example, we get the following Extensive Form:

It should have become clear from the previous discussion that games may have different descriptions by Extensive Forms. We will see more examples later. Note also that in the literature mostly game trees are used to describe extensive forms. Our approach of game digraphs has the advantage of reducing the number of positions.

## 2. Analyzing the Game: Backward Induction

### 2.1 Finite Games

Next we will discuss a simple and very powerful method how to analyze sequential games. However, this method works only for so-called "finite games". A sequential game is finite if it has a game tree with finitely many vertices. Having a Game Digraph> with finitely many vertices is not sufficient, as can be seen in the next example:

2 FINGERS: Two players move alternatingly. A player moves by raising one or two fingers. A player loses when raising the same number of fingers than the other player in the previous move. Then the payoffs are -1 for the loser and 1 for the winner.
How do you play this simple zero-sum game? Who will win? Obviously nobody will lose, since losing can very easily be avoided. So if nobody loses, nobody wins. The two players will continue playing forever. The Game Tree goes on and on, indicated by the dashed line, and is therefore infinite. Thus the game is not finite.

We can also use a Game Digraph. Then a non-end position is uniquely determined by who is about to move, White or Black, and how many fingers were last shown. So we have four of these positions, labeled as W1, W2, B1, and B2. Of course there is one more position, the start position with White to move, where no fingers have been shown yet, and two end positions, one where White wins and one where Black wins.

Why does the previous example have a finite game digraph although the game is not finite? The reason is that the game digraph is cyclic. That means that it is possible to follow arcs from some vertex x and reach the vertex x again after some time, like in W1, B2, W1 in the above example. Games with a cyclic game digraph have always an infinite game tree, but a sequential game is finite if it has some acyclic (not cyclic) finite game digraph. Why does the previous example have a finite Game Digraph although the game is not finite? The reason is that the Game Digraph has cycles. It is possible to follow arcs from some vertex x and reach the vertex x again after some time, like in W1, B2, W1 in the above example. Thus a sequential game is finite if it has some acyclic finite Game Digraph.

There are theoretical reasons why we exclude these infinite games, but of course there are more practical reasons: We need to have a guarantee that a game eventually ends. Note that even chess would not be finite, were it not for the "50-move rule" that says that a chess play ends as draw if no pawn has been moved within 50 moves.

### 2.2 The Procedure

What players may want from Game Theory is advice how to play. In a sequential game that means that for every position that is not an end position the player who has to move there would need advice about which of his or her possible options to choose. Such a list of recommendations for all the player's positions---even for those positions for which we are rather certain that they will never occur in a play--- is called a pure strategy for that player. In this section we will present a procedure generating pure strategies for all players. Moreover, at each vertex some numbers will be attached by the procedure: the likely payoffs for each one of the players. The word "likely" has nothing to do with probabilities, which will only be discussed in the next chapter. Its meaning here will be discussed later.

Let us explain the method first at another example, this time a non-zero sum game with three players:

SEQUENTIAL LEGISLATORS VOTE: Three legislators vote whether they allow themselves a salary rise of \$2000 per year. Since voters are observing, a legislator would estimate the loss of face by having to vote for a rise as \$1000 per year. In the sequential variant considered, A has to vote first, then B, then C, and all votes are open. (This is a variant of a game described in [Kaminski].)

The Game Tree of this game is shown to the right. The behavior of legislator C is easiest to explain first, since at the time when C has to move, A and B have already moved. So C doesn't need to anticipate their strategies. C may face four different situations, corresponding to the four vertices to the right of the tree. In the first situation, when A and B have already voted for a raise, C can keep face by voting against the raise but still get the benefit of it. The same is true if A and B both rejected the idea of a raise and it is decided already. It is different if one voted for and one against a raise. Then C's vote counts. Since the money is more important than face (for our fictitious legislators only!) C would vote for a raise in that case.

Now everybody can do this analysis, therefore if B has to decide, she can anticipate C's reactions. If A voted for a raise, B knows that C will vote for a raise if she votes against, and conversely. So why not give C the burden of having to vote for the raise? Obviously B will vote against a raise in this situation. If on the other hand A voted for a raise, then B has to vote for a raise to keep things open.

Since A knows all this too, A will vote against a raise, B will vote for it, and C will vote for it as well. A has the best role in this game. Going first is sometimes useful, after all!

This method of attaching a recommendation and "likely" payoffs for all players to all vertices of the Extensive Form, starting at the vertices "late" in the game and moving backwards is called "backward induction". We will explain it more formally below.

Note that when a player accepts these decision found by backward induction, he or she must assume that the other players will also stick to these backward induction recommendations. And why wouldn't they? Well, one reason could be that they are not able to analyze the game, since the game is maybe too complicated. Or a player can analyze the game, but doubts that all others can do so.

In NIM(6) we can immediately assign "likely" payoffs to the vertices "1w" and "1b". In both cases, the player to move has just one option. In the "1w" case, White will move, take the remaining stone, and arrive at "0b", which is loss for Black, therefore a win for White. Therefore White will win when facing "1w", and the expected payoffs are 1 for White and -1 for Black. In the same way, the likely payoffs at "1b" are -1 for White and 1 for Black.
Having now values assigned to four vertices, we consider the vertices "2w" and "2b" whose successors all have values already. In the "2w" position, White can proceed to "0b" and win, and will do so, therefore the values there are 1 and -1 for White and Black. Similar, at the position "2b" the likely payoffs are -1 and 1 for White and Black.
Next we look at the positions "3w" and "3b", since again all their successors have been treated already. From "3w", White can move to positions "2b" or "1b" by taking one or two stones. But both these positions are unfavorable for White having both a likely payoff of -1 for White attached. Therefore it doesn't matter what White chooses in this situation, and the likely payoffs at vertex "3w" are -1 and 1 for White and Black. The position "3b" is analyzed in the same way.
Since from position "4w" positions "3b" and 2b" can be reached, but the first one is favorable for White and the second one is favorable for Black, White will choose the first option. This means that the likely payoffs of position "3b" are copied to position "4w". In the same way, Black will move to position "3w" from position "4b", and the likely payoffs of position "3w" are copied to position "4w".
We proceed in this way, and eventually arrive to assign likely payoffs even for the start vertex. This means that at the very beginning, White expects to lose and Black expects to win.

Procedure for Backward Induction : Your goal is to have likely payoff values for each player assigned at every vertex. As long as not all vertices have such values attached, do the following:

1. Find a vertex with no values attached yet, but with the property that all its successors in the digraph have values attached. Such a vertex can always been found (why?). Call this vertex "V".
2. At vertex V, one player, say "X" has to move. Identify the successor vertex of V for which the value for player X is highest. This is the vertex where player X want to move to (since the values will turn into payoffs eventually). Let's call this other vertex "W".
3. Copy the values of vertex W to vertex V. Backward induction strategy recommends that in position V player "X" will move such that position W is obtained.

There is still one problem in this procedure. What do we have if there are ties, if two or more successor vertices of a vertex have the same likely payoff for the player to move at that vertex. This case will be discussed in Section 3.4

### 2.3 Zermelo's Theorem

In 1913, Ernst Zermelo showed that finite perfect-information sequential games can always be analyzed using backward induction. Actually he referred to chess, but this result can easily be generalized to the following:

Theorem [E. Zermelo 1913]: Every finite perfect-information sequential game without random moves can be analyzed using backward induction. The likely payoff values of the start position are what the players should expect when playing the game provided all play rationally.
This is one of the few proofs that will be covered. Students should understand the notion of a proof, and how the proof explains the construction.
Proof: Induction over the number of states.

By construction, the outcome generated by backward induction obeys the following property, which of course has some conncetion to the Nash equilibrium definition:

Fact:If all players play their backward induction strategy, and if one single players deviates from his or her backward induction strategy, then this player will not get more as payoff.

Historical Remark 1: Ernst Zermelo (1871-1953) was a German mathematician. He was Professor in Göttingen from 1905 to 1910 and in Zürich until 1916, when he had to retire because of health problems. He is nowadays mostly known for his work in the foundation of set theory, which itself has been invented by Georg Cantor about 30 years earlier. However, his paper on chess and the procedure that is nowadays called "backward induction" was his only contribution to Game Theory. In fact, Game Theory did not exist yet at that time.
The idea underlying backward induction seems rather natural. It is not known to me whether Zermelo was the first to express it. However, the idea has been reinvented several time.

## 3. Additional Topics

### 3.1 Reality Check and Utilities

Maybe let the students play these centipede games first, before analyzing and discussing them.

Although the backward induction solution seems to be the best the players can do in many situations, there are certain sequential games where players tend to play differently:

CENTIPEDE GAME: In this game, two players alternately face two stacks of money. A player to move has the choice either to pass, in which case both stacks grow slightly and the other player faces them, or to take the larger stack, in which case the other player gets the smaller one and the game ends. The game ends anyway after a fixed and known number of rounds, in this example after six rounds.

This game was introduced by Rosenthal [Rosenthal 1981] in the 100 rounds version. It has a unique backward induction solution: Player 1 would take the money in the first round and end the game. On the other hand, players can both increase their payoff if they both wait much longer. When McKelvey and T. Palfrey made experiments with this game [McKelvey, T. Palfrey 1992], they observed that players tend to play some rounds before somebody cannot resist any longer and takes the higher stack.

Here is another classical example:

ULTIMATUM GAME: There is a fixed number, say 5, of dollar bills for both players. Ann makes an offer how to share it, which Beth can either accept or reject. If she accepts, the bills are divided as agreed upon, if she rejects nobody gets anything.

The graph to the left shows the extensive form of the game when payoffs mean the money obtained. In the backward induction solution Ann would offer only one bill, or maybe even none---we will discuss this problem with payoff ties in a later subsection---to Beth. However, experiments show that people tend to be less mean than they could. People like to be fair.

Let us try to model dissatisfaction of unfairness in the following way: Let the payoff of each player not be just the money obtained, but rather express satisfaction about the outcome. Assume this satisfaction is the money obtained, but reduced by an "unfairness penalty" for the player obtaining more. Assume this unfairness penalty is two thirds of the difference between both values. That means that the money exceeding the amount the other gets is still valuable, but only one third of this surplus money satisfies. Then the payoffs of this modified Ultimatum Game can be seen in the figure to the right. The backward induction solution is Ann offering two bills to Beth, which she accepts.

### Modeling Note 3: Modeling Altruism using Utilities

Working through the material so far, you may be a little fed up by these "rational players". It may appear to you rather cold, inhuman even. Granted that companies, aiming to maximize profits, can be modeled by rational players, and granted that in parlor games, in particular in those which are zero-sum, player may only look at their own payoff. But don't we humans, in real-world situations, (hopefully) care about each other? Doesn't that mean that we are not only looking at our own payoff, that we are not indifferent about the other's payoffs?

Does it? Things may become a little clearer if we separate the money paid and the satisfaction received. Let us denote by "payoff" this satisfaction for each player, rather than the money received. This payoff depends on the different features of the outcome obtained, including the money given to the different players. Moreover this dependency on these features, very often called "utility function", is a very individual one, different for every player. Only in the case of a very selfish player the payoff or satisfaction received may be identical to his or her own money won. The payoff of a totally altruistic player may be the sum of the money won by all the other players. And of course, there are possibilities between. In this way, altruism, as well as emphasis towards others, sympathy or antipathy, can be modeled. And in most cases the utility function is not linear---Winning twice the money doesn't make you twice as happy, an idea that will be further discussed in a later chapter.

### 3.2 Playing it Safe---Guaranteed Payoff

A player will only get the payoff predicted by backward induction analysis if all players play rationally, i.e. stick to the backward induction analysis moves themselves. Thus this "likely" payoff is not guaranteed: if the other players deviate from backward induction moves, a player may get much less. Therefore using these backward induction moves may be risky. If you want to play it safe, maybe assuming that all others do intend to just hurt you, like in the Maximin paradigm discussed in the previous chapter, you may want to play different moves. See the next example:

In this game Ann owns one position, the start position, and there are two positions where Beth is supposed to move, and there are four end positions.
The backward induction analysis goes as follows: In Beth's first position, Beth will choose move B1, and in her second position Beth will choose move B3. Therefore the likely payoffs at these two positions are 1, 10 respectively 2,9. Therefore Ann will choose move A2, since this will give her a payoff of 2.
However, if Ann doubts Beth's rationality, if maybe Ann fears that with Beth's payoffs of 9 and 8 in the end positions following Beth's second position Beth may chose any one of B3 and B4, then maybe Ann will move A1, since this can guarantee her at least a payoff of 1.

Sequential games with perfect information have guaranteed payoffs, also called security levels, for all players. These can be found for every player separately, doing a variant of backward induction, which will be called Security Level Analysis. Let's call the player in question Ann. During this analysis we also find what moves Ann would choose in each of her positions if she would fear that everybody is just playing to hurt her. The resulting strategy is called her Security Strategy. Since this is a completely egocentric point of view, taking only the payoffs for Ann into account, we can eliminate all other payoffs for this analysis. Then we attach a value, the guaranteed payoff for Ann, to every position, starting again at the end positions. For moves with all successors having already guaranteed payoffs for Ann attached, we choose the lowest of these values as the value for the position provided Ann does not move at that position, and the largest of these values of the successors provided Ann does move at that position. In that case Ann will choose the option leading to a position with maximal guaranteed payoff for Ann attached.

In the previous example, Ann's guaranteed payoff at Beth's first (upper) position is 1, and Ann's guaranteed payoff at Beth's second position is 0. Therefore Ann's guaranteed payoff at her start position is 1, achieved by move A1. This strategy differs from the backward induction strategy.

Consider the following game. There are 11 non-end positions, named as Pos1, ... Pos11.
• indicates a likely payoff of 10 for Ann and of 6 for Beth.
• shows that Ann can only guarantee a payoff of 5. To guarantee that payoff, Ann would choose move A2 in position Pos1, instead of A1 in the backward induction strategy. In position Pos6 she will also act differently. Since only Ann's payoffs are relevant for this analysis, only they are highlighted in yellow.
• Finally shows that Beth has a security level of 3. Again, Beth would sometimes, as in position Pos3, move differently than in the backward induction strategy. Here only Beth's payoffs are highlighted.

### 3.3 Two-person Zero-sum Games

Fact:If the game has only two players and is zero-sum, then the backward induction analysis equals each of the security level analyses, backward induction strategy equals each of the security level strategies of the two players, and the values obtained by backward induction are just the security levels of the players. That means that playing safe is the same as playing rational and assuming rational play of the other player in that case.
The reason is very simple. When we apply these procedures, we look at a position, say of Beth, where all successor positions have values assigned. Now a move minimizes Ann's values if it maximizes Beth's values, provided the sum of these two values equals zero.

This fact gives tremendous significance to backward induction in the two-player zero-sum case. Therefore it seems justified to call the two backward induction strategies of the two players "optimal". If both players, Ann and Beth, stick to these strategies, they will obtain the values attached at the start position as payoffs. If one of them, let's say Beth, deviates, we already know that Beth will get not more. This is even true without the assumption of two-players zero-sum games. But with this additional assumption, if Beth deviates we also know that Ann will not get less, provided Ann sticks to her backward induction strategy.

Historical Remark 2: Most people consider John von Neumann (1903-1957) to be the "inventor" of Game Theory. His first paper on the topic appeared in 1928. Although Émile Borel had published some papers on simultaneous games in the 1920s, these and Ernst Zermelo's paper mentioned above were not as influential as von Neumann's paper. In it he proved the fundamental Minimax Theorem, the existence of a Nash equilibrium in mixed strategies for every finite simultaneous two-person zero-sum game, the fact that these strategies are the Maximin mixed strategies, and that the payoffs for both players in the Nash equilibrium equal the security levels if mixed strategies are allowed. These Theorems are discussed in a later chapter. So interestingly von Neumann concentrated from the very beginning on this concept, two-person zero-sum, that allows the most impressive theory. In 1944 von Neumann and the economist Oscar Morgenstern published their fundamental book "The Theory of Games and Economic Behavior".
Born in Budapest, von Neumann studied and worked in Zürich and Berlin. In the thirties he emigrated to the USA, where he had been offered a prestigious Professorship at the newly-founded Institute of Advanced Study in Princeton, with colleagues as Albert Einstein, Gödel. He also played a leading role in the Manhattan project during World War II, and was an influential US government consultant during the Cold War. He was also very influential in the development of the first computers in the 1940s.

### 3.4. Breaking Ties

What happens in the backward induction procedure above if in step (P3) there are two or more successor vertices, let's say "W" and "U", giving both the maximum value for player X to move? Player X would be totally indifferent whether to move to W or to U. The value for X at V is obviously the value for X at W (which is the same as the value for X at U), but still the decision where to move would influence the payoffs of the other players. The most natural solution to these problems is to assume that X will move to W or U with equal probability. This is already a so-called "behavior strategy"---at certain vertices a player would not be determined what to play but rather alternate two or more moves in a random way. Then we call X a "neutral" player. Obviously the value for any other players at V would be computed as the average (expected value with 50-50 probabilities) of the values for that player at W and U.

In a two-person game, a slightly different approach could also be taken. If player X is slightly "hostile" against her opponent, she would choose that vertex of W and U that has the lower value for her opponent. If X is slightly "friendly" she would choose that vertex of W and U that has the higher value for her opponent. Note that she still cares only about her payoffs, only in a tie situation of her own payoff she would consider the other's payoff. Hostile players can be modeled by subtracting a small fraction (let's say 0.1%) of her opponent's payoffs from her payoffs, a fraction so small that the ordering of the payoffs is not changed, only in case of ties suddenly the situations are ordered. Similarly, a friendly player would add a small fraction of the opponent's payoffs to her own payoffs.

Friendly players are far from being altruistic. However, altruistic behavior can also be modeled using the same trick, but for real altruism one would add a considerable amount of the opponent's payoff to the player's own payoff.

There are even more ways of breaking ties, also for more than one player. A player may be friendly to some players and hostile to others. Backward induction analysis must be changed in each of these cases.

### 3.5 Sequentialization

In real life, where the rules are not always carved in stone, two players playing a simultaneous game may have the possibilities of either moving a little earlier than the other, or delaying their move slightly until they see what the other player played. Even if moving earlier is not possible, sometimes it is possible to publicly announce what you will play, to make a (hopefully convincing) commitment. If the other player believes you, it is as if you would have moved first. In any case, the former simultaneous game is transformed into a sequential game.

Obviously there are two roles in the sequentialization of a two-player game, moving first or moving second, and which one is preferred is probably often decided by the temperament of the players. Some like to move first and dictate the action, and others prefer to wait and see what the other has played, leaving the other in the dark about their intentions. But actually this question of whether it is better to move first or last in a sequentialization should not be left to temperament but to an analysis of the game.

Backwards analysis of the sequential version carries over to bimatrix notation quite easily: If Ann moves first and Beth second, then Ann determines the row with her move. Each row corresponds to a situation Beth may face, and in each row, Beth obviously would choose the cell maximizing her payoff. This is her best response to Ann's move. We color all these cells. Note that the colored cells include all pure Nash equilibria, since the condition for coloring is just one of two (both moves being best response to the other) required for pure Nash equilibria. Note also that every row contains at least one colored cell.

The simplest case is where in each row only one cell is colored. We call this property unique response for Beth. That means that every one of Ann's moves has exactly one best response. In this case, Ann chooses the row whose colored cell has maximum first (Ann's) entry. Similarly, unique response for Ann would mean that in every column there is exactly one entry maximizing Ann's payoff.

If there are more than one colored cell in a row, Ann chooses the row with

• maximum average first entry in the colored cells in that row, if Beth is neutral against Ann,
• maximum first entry in the colored cells, if Beth is friendly to Ann, or
• maximum minimum first entry in the colored cells, if Beth is hostile against Ann,

using the three variants to break ties discussed in the previous section.

Theorem: Under the assumption of
1. "unique response for Beth",
the first-moving Ann can select any pure Nash equilibrium of the simultaneous game as outcome of the sequential version. Under the additional condition that
1. there are some pure Nash equilibria in the simultaneous game, and they will be played,
the first moving Ann is not worse off than in the simultaneous game. Under the additional assumption that
1. the simultaneous game is symmetric
the first moving player will achieve at least as much as the second-moving player. Then there is a "first move advantage".

Let's apply the Theorem to the game CHICKEN. The classical formulation is of two cars driving at each other. We assume that it is a simultaneous game; both have at some point to decide simultaneously whether to proceed or to swerve. The one swerving loses face. If none swerves, both lose their lifes.
 swerve not swerve -1, -1 -2, 2 not 2, -2 -100, -100

This game has two pure Nash equilibria, where one swerves and the other doesn't. Furthermore it is symmetric and has unique best response for Beth. Therefore the Theorem applies and there is first move advantage. If your opponent publicly removes his steering wheel before you both start, leaving him no possibility to avoid a collision himself, then you will certainly chicken and look bad to the ladies! Then the other player did change the game into a sequential game with him moving first, and communicating this move to you.

Very often the first condition fails. People are often indifferent between options. But if we assume that Beth decides in these cases looking at Ann's payoff, as discussed in the previous subsection, this first condition is valid. The second condition is necessary since presently we don't have a solution concept yet if there are no Nash equilibria in pure strategies. Later, when we allow and discuss mixed strategies, we will see whether this condition remains necessary. The third condition makes it possible to compare Ann's and Beth's payoffs.

Look at the following symmetric game, where both player's moves are called "1", "2", and "3". The colored cells are the outcomes with best response for Beth:
 1 2 3 1 1, 1 3, 2 3, 1 2 2, 3 4, 4 2, 4 3 1, 3 4, 2 2, 2

Conditions 2 and 3 of the Theorem above are fulfilled, but not condition 1. There is one pure Nash equilibrium, where both player play move 2 and get a payoff of 4 each. However, in the sequential version with neutral Beth, Ann only achieves an average payoff of 3, since Beth will react with either playing "2" or "3" to Ann's playing "2". But since this value of 3 is higher than the values of 2 or 1 that Ann would get playing the other two moves, Ann would still start by playing move 2. Therefore in this sequentialization, the player moving first is worse off than in the simultaneous version. The reason is the missing of condition 1.

What happens in the same game with hostile or friendly Beth?
hostile Bethfriendly Beth
 1 2 3 1 1, 0.99 3, 1.97 3, 0.97 2 2, 2.98 4, 3.96 2, 3.98 3 1, 2.99 4, 1.96 2, 1.98
 1 2 3 1 1, 1.01 3, 2.03 3, 1.03 2 2, 3.02 4, 4.04 2, 4.02 3 1, 3.01 4, 2.04 2, 2.02
As can be seen in the tables, if Beth is hostile, Ann will start with move 1 and Ann gets a payoff of 3 and Beth of about 2. If Beth is friendly, Ann will choose move 2 and both get about 4 as payoff. So in this example it pays especially for Beth to be friendly to Ann, and to signal it to Ann before the game starts.

Another example where the Theorem cannot be applied since the assumptions are not there is ROCK-SCISSORS-PAPER. It has unique best response for Beth and is symmetric, but it doesn't have any Nash equilibria in pure strategies. In fact, there is a second move advantage there---you would rather want to move slightly later than your opponent there.

Also, SEBATE RACE II has second mover advantage if x (the value of winning) > y (the value of compromising your political views).

### 3.5 Existing Games

Board games are, almost by definition, games of perfect information, since everything is in the open provided no cards are part of the game. They are also very often sequential games, where the players take turns to move. Well-known example of board games are Chess, Go, Nine Men's Morris, Checkers, Pachisi, Backgammon with a very long history, or the more recent games of Monopoly, Hex, Tic-tac-toe or others. Some, like Pachisi, Backgammon, and Monopoly, are played with dice or even with cards that are revealed, and contain therefore randomness. But Chess, Go, Checkers, Tic-tac-toe, Hex are all two-players zero-sum finite sequential games without randomness and with perfect information. So shouldn't people play the backward-induction strategy? The simple and quick answer is: Yes, they should, but in some cases backward induction analysis has just not yet been performed. Thus the backward induction strategy, the optimal strategy, is unknown in many cases.

Tic-tac-toe is the simplest of these games, with the smallest game tree. The game tree has just 5478 positions. The game has been analyzed, and the expected outcome is a draw. You can play tic-tac-toe here against the computer.

Checkers will always result in a draw when played correctly ([von Nievergelt and Gasser 1994].

On the other hand, for Chess and Go the number of possible positions is so large that even with the help of the most advanced computers a full analysis is not expected for soon. Chess has about 1043 positions, a number with about 43 digits, and Go even about 2·10170. Checkers has about 1018 positions and Nine Men's Morris about 1010 positions.

### 3.6 Greedy Strategies

How do chess computers play? We have already seen that the complete extensive form is not available---it is simply too large. Therefore backward induction is also not an option. These existing chess programs usually have two ingredients: Starting at a position, they look a few moves ahead, instead of looking back. Secondly a chess program, the same as an experienced chess player, has values attached, or can attach values, to every possible position. They are supposed to be approximations of the likely payoffs found by backward induction, but they are not obtained by looking any further to successors of the position considered. Rather features of the position are considered and expressed into a value using a more or less complicated formula. In chess, material advantage are part of these features to consider, but there are also finer features, like the number of possible moves of a position, or pawn structure, and so on.

Very often such evaluations of positions come very naturally. For instance, for many games the evaluations might be the payoff if the game just ended there.

Now a Greedy strategy decides what to do in a position by simply looking at all possible successor positions, and choosing the one where the evaluation for the player to move is highest. More generally, a k-Greedy strategy looks just k moves ahead, and chooses the Maximin or backward induction choice of this small part of the Game Digraph.

Here a discussion of Nash's strategy stealing argument for Hex and nonconstructive prrofes could be included.

Nash showed in 1948 that in Hex White always must win, even though it is not known how. The proof is called a strategy-stealing argument.

Actually deciding whether a given position is a winning position for White or Black is fairly difficult (PSPACE-complete, TIME??? Nachgucken! ) [Reisch 1979]. That means that it is fairly unlikely to have an algorithm that solves this decision problem in reasonable time. But maybe it's not important enough.

### What's next?

Next we will learn what probability really is, then we will let probability enter our games.

### References

• Uri Zwick, Micahel S. Paterson, The Memory Game, Theoretical Computer Science 110 (1993) 169-196.
• E. Zermelo, Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels, Proceedings of the Fifth International Congress of Mathematicians 2 (1913) 501-504.
• Reisch, Diploma Thesis
• R. Rosenthal, Games of Perfect Information, Predatory Pricing, and the Chain Store, Journal of Economic Theory 25 (1981) 92-100.
• R. McKelvey, T. Palfrey, An experimental study of the centipede game, Econometrica 60 (1992) 803-836.
• Steven M. LaValle, Planning Algorithms, Cambridge University Press 2006.