2 Classification of chains and states (IB Markov Chains)

2.3 Hitting probabilities
Recurrence and transience tell us whether we are going to return to the original state with (almost) certainty. Often, we would like to know something more quantitative. What is the actual probability of returning to the state $i$? If we do return, what is the expected time taken to return?
We can formulate this in a more general setting. Let $S$ be our state space, and let $A \subseteq S$. We want to know how likely it is that we reach $A$, and how long it takes. For example, if we are in a casino, we want to win, say, a million, and don't want to go bankrupt. So we would like to know the probability of reaching $A = \{1 \text{ million}\}$ and $A = \{0\}$.
Definition (Hitting time). The hitting time of $A \subseteq S$ is the random variable $H_A = \min\{n \geq 0 : X_n \in A\}$. In particular, if we start in $A$, then $H_A = 0$. We also have the hitting probability
\[
  h_i^A = \mathbb{P}_i(H_A < \infty) = \mathbb{P}_i(\text{ever reach } A).
\]
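To make the definition concrete, here is a small simulation sketch: we run the chain until it enters $A$ and record the number of steps. The 3-state chain and its transition probabilities are made-up assumptions, not from the text.

```python
import random

# A small illustration of the hitting time H_A: run the chain until
# it enters A and record the number of steps.  The 3-state chain and
# its transition probabilities are made-up assumptions.
random.seed(1)
P = {0: [(0, 0.5), (1, 0.5)],
     1: [(0, 0.3), (1, 0.3), (2, 0.4)],
     2: [(2, 1.0)]}
A = {2}

def hitting_time(start, max_steps=100_000):
    """Return min{n >= 0 : X_n in A} (capped at max_steps)."""
    x = start
    for n in range(max_steps):
        if x in A:
            return n
        r, acc = random.random(), 0.0
        for y, prob in P[x]:
            acc += prob
            if r < acc:
                x = y
                break
    return max_steps

mean = sum(hitting_time(0) for _ in range(10_000)) / 10_000
print(mean)  # close to the exact value E_0(H_A) = 6 for this chain
```

For this particular chain the expected hitting time from state 0 can be computed by hand (it is 6), so the empirical mean gives a quick check on the simulation.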
To determine hitting probabilities, we mostly rely on the following result:
Theorem. The vector $(h_i^A : i \in S)$ satisfies
\[
  h_i^A =
  \begin{cases}
    1 & i \in A\\
    \sum_{j \in S} p_{i,j} h_j^A & i \not\in A,
  \end{cases}
\]
and is minimal in that for any non-negative solution $(x_i : i \in S)$ to these equations, we have $h_i^A \leq x_i$ for all $i$.
It is easy to show that $h_i^A$ satisfies the formula given, but it takes some more work to show that $h_i^A$ is the minimal solution. Recall, however, that we proved a similar result for random walks in IA Probability, and the proof is more-or-less the same.
Proof. By definition, $h_i^A = 1$ if $i \in A$. Otherwise, we have
\[
  h_i^A = \mathbb{P}_i(H_A < \infty) = \sum_{j \in S} \mathbb{P}_i(H_A < \infty \mid X_1 = j)\, p_{i,j} = \sum_{j \in S} h_j^A p_{i,j}.
\]
So $h_i^A$ is indeed a solution to the equations.
To show that $h_i^A$ is the minimal solution, suppose $x = (x_i : i \in S)$ is a non-negative solution, i.e.
\[
  x_i =
  \begin{cases}
    1 & i \in A\\
    \sum_{j \in S} p_{i,j} x_j & i \not\in A.
  \end{cases}
\]
If $i \in A$, we have $h_i^A = x_i = 1$. Otherwise, we can write
\[
  x_i = \sum_j p_{i,j} x_j
      = \sum_{j \in A} p_{i,j} x_j + \sum_{j \not\in A} p_{i,j} x_j
      = \sum_{j \in A} p_{i,j} + \sum_{j \not\in A} p_{i,j} x_j
      \geq \sum_{j \in A} p_{i,j}
      = \mathbb{P}_i(H_A = 1).
\]
By iterating this process, we can write
\begin{align*}
  x_i &= \sum_{j \in A} p_{i,j} + \sum_{j \not\in A} p_{i,j} \left( \sum_k p_{j,k} x_k \right)\\
      &= \sum_{j \in A} p_{i,j} + \sum_{j \not\in A} p_{i,j} \left( \sum_{k \in A} p_{j,k} + \sum_{k \not\in A} p_{j,k} x_k \right)\\
      &\geq \mathbb{P}_i(H_A = 1) + \sum_{j \not\in A,\, k \in A} p_{i,j} p_{j,k}\\
      &= \mathbb{P}_i(H_A = 1) + \mathbb{P}_i(H_A = 2)\\
      &= \mathbb{P}_i(H_A \leq 2).
\end{align*}
By induction, we obtain
\[
  x_i \geq \mathbb{P}_i(H_A \leq n)
\]
for all $n$. Taking the limit as $n \to \infty$, we get
\[
  x_i \geq \mathbb{P}_i(H_A < \infty) = h_i^A.
\]
So $h_i^A$ is minimal.
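The minimality in the theorem also suggests a numerical method: start from $h = 0$ off $A$ and repeatedly apply the equations, so the iterates increase to the minimal solution. A minimal sketch, assuming a made-up 4-state chain (a symmetric walk absorbed at both ends) with $A = \{3\}$:

```python
import numpy as np

# Iterating the equations from h = 0 off A converges upwards to the
# minimal solution.  The 4-state chain (a symmetric walk absorbed at
# both ends) and the target set A = {3} are made-up assumptions.
P = np.array([
    [1.0, 0.0, 0.0, 0.0],   # 0 is absorbing
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],   # 3 is absorbing
])
A = {3}

h = np.array([1.0 if i in A else 0.0 for i in range(4)])
for _ in range(200):
    new = h.copy()
    for i in range(4):
        if i not in A:
            new[i] = P[i] @ h  # h_i <- sum_j p_{i,j} h_j off A
    h = new

print(h)  # (0, 1/3, 2/3, 1): from 0 we can never reach 3, so h_0 = 0
```

Note that state 0 is absorbing but not in $A$: the equation $h_0 = h_0$ is satisfied by any value, and the iteration correctly picks out the minimal choice $h_0 = 0$, which is the true hitting probability.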
The next question we want to ask is how long it will take for us to hit $A$. We want to find $\mathbb{E}_i(H_A) = k_i^A$. Note that we have to be careful: if there is a chance that we never hit $A$, then $H_A$ could be infinite, and $\mathbb{E}_i(H_A) = \infty$. This occurs if $h_i^A < 1$. So often we are only interested in the case where $h_i^A = 1$ (note that $h_i^A = 1$ does not imply that $k_i^A < \infty$; it is merely a necessary condition).
Theorem. $(k_i^A : i \in S)$ is the minimal non-negative solution to
\[
  k_i^A =
  \begin{cases}
    0 & i \in A\\
    1 + \sum_j p_{i,j} k_j^A & i \not\in A.
  \end{cases}
\]
Note that we have this "$1 +$" since when we move from $i$ to $j$, one step has already passed.

The proof is almost the same as the proof we had above.
Proof. The proof that $(k_i^A)$ satisfies the equations is the same as before.

Now let $(y_i : i \in S)$ be a non-negative solution. We show that $y_i \geq k_i^A$.
If $i \in A$, we get $y_i = k_i^A = 0$. Otherwise, suppose $i \not\in A$. Then we have
\begin{align*}
  y_i &= 1 + \sum_j p_{i,j} y_j\\
      &= 1 + \sum_{j \in A} p_{i,j} y_j + \sum_{j \not\in A} p_{i,j} y_j\\
      &= 1 + \sum_{j \not\in A} p_{i,j} y_j\\
      &= 1 + \sum_{j \not\in A} p_{i,j} \left( 1 + \sum_{k \not\in A} p_{j,k} y_k \right)\\
      &\geq 1 + \sum_{j \not\in A} p_{i,j}\\
      &= \mathbb{P}_i(H_A \geq 1) + \mathbb{P}_i(H_A \geq 2).
\end{align*}
By induction, we know that
\[
  y_i \geq \mathbb{P}_i(H_A \geq 1) + \cdots + \mathbb{P}_i(H_A \geq n)
\]
for all $n$. Let $n \to \infty$. Then we get
\[
  y_i \geq \sum_{m \geq 1} \mathbb{P}_i(H_A \geq m) = \sum_{m \geq 1} m\, \mathbb{P}_i(H_A = m) = k_i^A.
\]
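As with hitting probabilities, the minimality gives a numerical method: iterate the equations starting from $k = 0$, and the iterates increase to the minimal solution. A sketch on a made-up symmetric walk on $\{0, 1, 2, 3\}$, now with $A = \{0, 3\}$ so that every state hits $A$ with probability 1 and the $k_i^A$ are finite:

```python
import numpy as np

# Iterating k <- 1 + sum_j p_{i,j} k_j off A (and k = 0 on A),
# starting from k = 0, increases to the minimal non-negative
# solution.  The chain (symmetric walk on {0,1,2,3} absorbed at both
# ends) and A = {0, 3} are made-up assumptions.
P = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])
A = {0, 3}

k = np.zeros(4)
for _ in range(200):
    new = np.zeros(4)
    for i in range(4):
        if i not in A:
            new[i] = 1.0 + P[i] @ k
    k = new

print(k)  # (0, 2, 2, 0): this symmetric walk takes i(3 - i) steps on average
```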
Example (Gambler's ruin). This time, we will consider a random walk on $\mathbb{N}$. In each step, we either move to the right with probability $p$, or to the left with probability $q = 1 - p$. What is the probability of ever hitting $0$ from a given initial point? In other words, we want to find $h_i = h_i^{\{0\}}$.

We know $h_i$ is the minimal solution to
\[
  h_i =
  \begin{cases}
    1 & i = 0\\
    q h_{i-1} + p h_{i+1} & i \neq 0.
  \end{cases}
\]
What are the solutions to these equations? We can view this as a difference equation
\[
  p h_{i+1} - h_i + q h_{i-1} = 0, \quad i \geq 1,
\]
with the boundary condition $h_0 = 1$. We all know how to solve difference equations, so let's just jump to the solution.
If $p \neq q$, i.e. $p \neq \frac{1}{2}$, then the solution has the form
\[
  h_i = A + B \left( \frac{q}{p} \right)^i
\]
for $i \geq 0$. If $p < q$, then for large $i$, $\left(\frac{q}{p}\right)^i$ is very large and blows up. However, since $h_i$ is a probability, it can never blow up. So we must have $B = 0$, and $h_i$ is constant. Since $h_0 = 1$, we have $h_i = 1$ for all $i$. So we always get to $0$.
If $p > q$, since $h_0 = 1$, we have $A + B = 1$. So
\[
  h_i = \left( \frac{q}{p} \right)^i + A \left( 1 - \left( \frac{q}{p} \right)^i \right).
\]
This is in fact a solution for all $A$. So we want to find the smallest solution. As $i \to \infty$, we get $h_i \to A$. Since $h_i \geq 0$, we know that $A \geq 0$. Subject to this constraint, the minimum is attained when $A = 0$ (since $(q/p)^i$ and $(1 - (q/p)^i)$ are both positive). So we have
\[
  h_i = \left( \frac{q}{p} \right)^i.
\]
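We can sanity-check $h_i = (q/p)^i$ by simulation. A rough Monte Carlo sketch, where the choices $p = 0.6$, starting point $3$, trial count, and step cap are illustrative assumptions; walks that survive the cap are treated as never returning, which is justified by the upward drift:

```python
import random

# Monte Carlo sanity check of h_i = (q/p)^i for the walk on N with
# p > q.  The parameters (p = 0.6, start i0 = 3, 50,000 trials,
# 300-step cap) are illustrative assumptions.
random.seed(0)
p, i0 = 0.6, 3
q = 1.0 - p

def hits_zero(start, max_steps=300):
    """Run the walk from `start` and report whether it reaches 0."""
    x = start
    for _ in range(max_steps):
        if x == 0:
            return True
        x += 1 if random.random() < p else -1
    return False  # with upward drift, a long survivor almost surely escapes

trials = 50_000
estimate = sum(hits_zero(i0) for _ in range(trials)) / trials
exact = (q / p) ** i0  # (2/3)^3 = 8/27
print(estimate, exact)
```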
There is another way to solve this. We can give ourselves a ceiling, i.e. we also stop when we hit some $k > 0$, with $h_k = 0$. We now have two boundary conditions and can find a unique solution. Then we take the limit as $k \to \infty$. This is the approach taken in IA Probability.

Here if $p = q$, then by the same arguments, we get $h_i = 1$ for all $i$.
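The ceiling approach is also easy to carry out numerically: for each finite $k$ we solve the tridiagonal system with $h_0 = 1$, $h_k = 0$, and watch the answers increase towards $(q/p)^i$ as $k$ grows. A sketch, with $p = 0.6$ and starting point $3$ as illustrative assumptions:

```python
import numpy as np

# The ceiling approach: absorb at 0 (success) and at k (failure),
# solve the resulting tridiagonal system, and let k grow.  p = 0.6
# and the starting point 3 are illustrative assumptions.
p = 0.6
q = 1.0 - p

def hit_zero_before(k, i):
    """Probability of reaching 0 before k, starting from 0 < i < k."""
    # Unknowns h_1, ..., h_{k-1}, satisfying
    #   q h_{i-1} - h_i + p h_{i+1} = 0,  h_0 = 1,  h_k = 0.
    n = k - 1
    M = np.zeros((n, n))
    b = np.zeros(n)
    for r in range(n):          # row r is the equation for state r + 1
        M[r, r] = -1.0
        if r > 0:
            M[r, r - 1] = q
        else:
            b[r] = -q           # the boundary h_0 = 1 moves to the RHS
        if r < n - 1:
            M[r, r + 1] = p     # h_k = 0 contributes nothing in the last row
    return np.linalg.solve(M, b)[i - 1]

for k in (5, 10, 20, 40):
    print(k, hit_zero_before(k, 3))
# The values increase towards (q/p)^3 = (2/3)^3 ~ 0.296 as k grows.
```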
Example (Birth-death chain). Let $(p_i : i \geq 1)$ be an arbitrary sequence such that $p_i \in (0, 1)$. We let $q_i = 1 - p_i$. We let $\mathbb{N}$ be our state space and define the transition probabilities to be
\[
  p_{i,i+1} = p_i, \quad p_{i,i-1} = q_i.
\]
This is a more general case of the random walk: in the random walk we have a constant $p_i$ sequence.
This is a general model for population growth, where the change in population depends on what the current population is. Here each "step" does not correspond to some unit of time, since births and deaths occur rather randomly. Instead, we just make a "step" whenever some birth or death occurs, regardless of when it occurs.

Here, if we have no people left, then it is impossible for us to reproduce and get more population. So we have $p_{0,0} = 1$.
We say $0$ is absorbing in that $\{0\}$ is closed. We let $h_i = h_i^{\{0\}}$. We know that
\[
  h_0 = 1, \quad p_i h_{i+1} - h_i + q_i h_{i-1} = 0, \quad i \geq 1.
\]
This is no longer a constant-coefficient difference equation, since the coefficients depend on the index $i$. To solve this, we need magic. We rewrite this as
\[
  p_i h_{i+1} - h_i + q_i h_{i-1} = p_i h_{i+1} - (p_i + q_i) h_i + q_i h_{i-1} = p_i (h_{i+1} - h_i) - q_i (h_i - h_{i-1}).
\]
We let $u_i = h_{i-1} - h_i$ (picking $h_i - h_{i-1}$ might seem more natural, but this definition makes $u_i$ positive). Then our equation becomes
\[
  u_{i+1} = \frac{q_i}{p_i} u_i.
\]
We can iterate this to obtain
\[
  u_{i+1} = \frac{q_i}{p_i} \cdot \frac{q_{i-1}}{p_{i-1}} \cdots \frac{q_1}{p_1}\, u_1.
\]
We let
\[
  \gamma_i = \frac{q_1 q_2 \cdots q_i}{p_1 p_2 \cdots p_i}.
\]
Then we get $u_{i+1} = \gamma_i u_1$. For convenience, we let $\gamma_0 = 1$. Now we want to retrieve our $h_i$. We can do this by summing the equation $u_i = h_{i-1} - h_i$. So we get
\[
  h_0 - h_i = u_1 + u_2 + \cdots + u_i.
\]
Using the fact that $h_0 = 1$, we get
\[
  h_i = 1 - u_1 (\gamma_0 + \gamma_1 + \cdots + \gamma_{i-1}).
\]
Here we have a parameter $u_1$, and we need to find out what it is. Our theorem tells us that the correct value of $u_1$ is the one that minimizes $h_i$. This all depends on the value of
\[
  S = \sum_{i=0}^{\infty} \gamma_i.
\]
By the law of excluded middle, $S$ either diverges or converges. If $S = \infty$, then we must have $u_1 = 0$: otherwise, $h_i$ blows up for large $i$, but we know that $0 \leq h_i \leq 1$. If $S$ is finite, then $u_1$ can be non-zero. We know that the $\gamma_i$ are all positive. So to minimize $h_i$, we need to maximize $u_1$. We cannot make $u_1$ arbitrarily large, since this will make $h_i$ negative. To find the maximum possible value of $u_1$, we take the limit as $i \to \infty$. Then we know that the maximum value of $u_1$ satisfies
\[
  0 = 1 - u_1 S.
\]
In other words, $u_1 = 1/S$. So we have
\[
  h_i = \frac{\sum_{k=i}^{\infty} \gamma_k}{\sum_{k=0}^{\infty} \gamma_k}.
\]
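As a sanity check of this formula, we can take the illustrative sequence $p_i = (i+1)/(i+2)$ (an assumption for demonstration, not from the text), so that $q_i/p_i = 1/(i+1)$, $\gamma_i = 1/(i+1)!$, and $S = e - 1$ is finite:

```python
import math

# Sanity check of h_i = (sum_{k>=i} gamma_k) / (sum_{k>=0} gamma_k)
# on an illustrative birth-death chain: take p_i = (i+1)/(i+2), so
# q_i/p_i = 1/(i+1), gamma_i = 1/(i+1)!, and S = e - 1 is finite.
def gamma(i):
    return 1.0 / math.factorial(i + 1)

N = 60  # truncation point; gamma_i decays factorially, so this is plenty
S = sum(gamma(i) for i in range(N))  # ~ e - 1
h = [sum(gamma(k) for k in range(i, N)) / S for i in range(6)]
print(h)  # h_0 = 1 and the h_i decrease towards 0

# Cross-check against the recurrence p_1 h_2 - h_1 + q_1 h_0 = 0:
p1, q1 = 2 / 3, 1 / 3
residual = p1 * h[2] - h[1] + q1 * h[0]
print(residual)  # ~ 0
```

Since $S < \infty$ here, the population dies out with probability $h_i < 1$ for every $i \geq 1$, matching the discussion above.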