4Non-cooperative games

IB Optimisation

4.1 Games and Solutions

Definition

(Bimatrix game)

.

A two-player game, or bimatrix game, is given

by two matrices

P, Q ∈ R

m×n

. Player 1, or the row player, chooses a row

i ∈

{

1

, · · · , m}

, while player 2, the column player, chooses a column

j ∈ {

1

, · · · , n}

.

These are selected without knowledge of the other player’s decisions. The two

players then get payoffs P

ij

and Q

ij

respectively.

Example. A game of rock-paper-scissors can have payoff matrices

P

ij

=

0 −1 1

1 0 −1

−1 1 0

, Q

ij

=

0 1 −1

−1 0 1

1 −1 0

.

Here a victory gives you a payoff of 1, a loss gives a payoff of

−

1, and a draw

gives a payoff of 1. Also the first row/column corresponds to playing rock, second

corresponds to paper and third corresponds to sciss ors.

Usually, this is not the best way to display the payoff matrices. First of all,

we need two write out two matrices, and there isn’t an easy way to indicate what

row corresponds to what decision. Instead, we usually write this as a table.

R P S

R (0, 0) (−1, 1) (1, −1)

P (1, −1) (0, 0) (−1, 1)

S (−1, 1) (1, −1) (0, 0)

By convention, the first item in the tuple (

−

1

,

1) indicates the payoff of the row

player, and the second item indicates the payoff of the column player.

Definition

(Strategy)

.

Players are allowed to play randomly. The set of strate-

gies the row player can have is

X = {x ∈ R

m

: x ≥ 0,

X

x

i

= 1}

and the column player has strategies

Y = {y ∈ R

n

: y ≥ 0,

X

y

i

= 1}

Each vector corresponds to the probabilities of selecting each row or column.

A strategy profile (

x, y

)

∈ X × Y

induces a lottery, and we write

p

(

x, y

) =

x

T

P y for the expected payoff of the row player.

If x

i

= 1 for some i, i.e. we always pick i, we call x a pure strategy.

Example

(Prisoner’s dilemma)

.

Suppose Alice and Bob commit a crime together,

and are caught by the police. They can choose to remain silent (

S

) or testify

(T ). Different options will lead to different outcomes:

–

Both keep silent: the police has little evidence and they go to jail for 2

years.

–

One testifies and one remains silent: the one who testifies gets awarded

and is freed, while the other gets stuck in jail for 10 years.

– Both testify: they both go to jail for 5 years.

We can represent this by a payoff table:

S T

S (2, 2) (0, 3)

T (3, 0) (1, 1)

Note that higher payoff is desired, so a longer serving time corresponds to a

lower payoff. Also, payoffs are interpreted relatively, so replacing (0

,

3) with

(0, 100) (and (3, 0) with (100, 0)) in the payoff table would make no difference.

Here we see that regardless of what the other person does, it is always strictly

better to testify than not (unless you want to be nice). We say

T

is a dominant

strategy, and (1, 1) is Pareto dominated by (2, 2).

Example

(Chicken)

.

The game of Chicken is as follows: two people drive their

cars towards each other at high speed. If they collide, they will di.e. Hence they

can decide to chicken out (

C

) or continue driving (

D

). If both don’t chicken,

they die, which is bad. If one chickens and the other doesn’t the person who

chicken looks silly (but doesn’t die). If both chicken out, they both look slightly

silly. This can be represented by the following table:

C D

C (2, 2) (1, 3)

D (3, 1) (0, 0)

Here there is no dominating strategy, so we need a different way of deciding

what to do.

Instead, we define the security level of the row player to be

max

x∈X

min

y∈Y

p(x, y) = max

x∈X

min

j∈{1,...,n}

m

X

i=1

x

i

p

ij

.

Such an

x

is the strategy the row player can employ that minimizes the worst

possible loss. This is called the maximin strategy.

We can formulate this as a linear program:

maximize v such that

m

X

i=1

x

i

p

ij

≥ v for all j = 1, · · · , n

m

X

i=1

x

i

= 1

x ≥ 0

Here the maximin strategy is to chicken. However, this isn’t really what we

are looking for, since if both players employ this maximin strategy, it would be

better for you to not chicken out.

Definition

(Best response and equilibrium)

.

A strategy

x ∈ X

is a best response

to y ∈ Y if for all x

0

∈ X

p(x, y) ≥ p(x

0

, y)

A pair (

x, y

) is an equilibrium if

x

is the best response against

y

and

y

is a best

response against x.

Example.

In the chicken game, there are two pure equilibrium, (3

,

1) and (1

,

3),

and there is a mixed equilibrium in which the players pick the options with equal

probability.

Theorem (Nash, 1961). Every bimatrix game has an equilibrium.

We are not proving this since it is too hard.