7 Big theorems

II Probability and Measure



7.1 The strong law of large numbers
Before we start proving the strong law of large numbers, we first spend some
time discussing the difference between the strong law and the weak law. In both
cases, we have a sequence $(X_n)$ of iid random variables with $\mathbb{E}[X_i] = \mu$. We let
\[ S_n = X_1 + \cdots + X_n. \]
The weak law of large numbers says $S_n/n \to \mu$ in probability as $n \to \infty$, provided $\mathbb{E}[X_1^2] < \infty$.
The strong law of large numbers says $S_n/n \to \mu$ a.s. provided $\mathbb{E}|X_1| < \infty$.
So we see that the strong law is indeed stronger: it asks only for a finite first moment, and almost sure convergence implies convergence in probability.
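To make the almost sure statement concrete, the following is a minimal sketch that follows the running averages $S_k/k$ along a single simulated path. The function name `running_means`, the fixed seed, and the choice of Uniform(0, 1) samples (mean 1/2) are assumptions made for illustration only; a simulation can suggest the behaviour but does not prove anything.

```python
import random

def running_means(sample, n, seed=0):
    """Return [S_1/1, S_2/2, ..., S_n/n] along one simulated path.

    `sample` is a hypothetical callback taking a random.Random instance
    and returning one draw of X_k.
    """
    rng = random.Random(seed)
    total = 0.0
    means = []
    for k in range(1, n + 1):
        total += sample(rng)
        means.append(total / k)
    return means

# Assumption: X_k ~ Uniform(0, 1), which has mean 1/2 and all moments finite.
means = running_means(lambda rng: rng.random(), 100_000)
for k in (10, 1_000, 100_000):
    print(k, means[k - 1])
```

The strong law is a statement about the whole path `means` at once, while the weak law only controls each fixed index `k` separately.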
We are actually going to prove two versions of the strong law, under different
hypotheses.
Theorem (Strong law of large numbers assuming finite fourth moments). Let $(X_n)$ be a sequence of independent random variables such that there exist $\mu \in \mathbb{R}$ and $M > 0$ such that
\[ \mathbb{E}[X_n] = \mu, \quad \mathbb{E}[X_n^4] \leq M \]
for all $n$. With $S_n = X_1 + \cdots + X_n$, we have that
\[ \frac{S_n}{n} \to \mu \text{ a.s. as } n \to \infty. \]
Note that in this version, we do not require that the $X_n$ are iid. We simply
need that they are independent, have the same mean, and have uniformly bounded fourth moments.
The proof is completely elementary.
Proof. We reduce to the case that $\mu = 0$ by setting
\[ Y_n = X_n - \mu. \]
We then have
\[ \mathbb{E}[Y_n] = 0, \quad \mathbb{E}[Y_n^4] \leq 2^4 \left(\mathbb{E}[\mu^4 + X_n^4]\right) \leq 2^4 (\mu^4 + M). \]
So it suffices to show that the theorem holds with $Y_n$ in place of $X_n$. So we can
assume that $\mu = 0$.
By independence, we know that for $i \neq j$, we have
\[ \mathbb{E}[X_i X_j^3] = \mathbb{E}[X_i]\mathbb{E}[X_j^3] = 0. \]
Similarly, for all $i, j, k, \ell$ distinct, we have
\[ \mathbb{E}[X_i X_j X_k^2] = \mathbb{E}[X_i X_j X_k X_\ell] = 0. \]
Hence we know that
\[ \mathbb{E}[S_n^4] = \mathbb{E}\left[ \sum_{k=1}^n X_k^4 + 6 \sum_{1 \leq i < j \leq n} X_i^2 X_j^2 \right]. \]
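The coefficient 6 comes from counting how the index 4-tuples in the expansion of $(X_1 + \cdots + X_n)^4$ split by multiplicity. As a sanity check, here is a small brute-force sketch (the helper name `count_index_patterns` and the choice $n = 5$ are illustrative assumptions). Any pattern containing an index of multiplicity 1 has zero expectation by independence and $\mu = 0$, so only the patterns `(4,)` and `(2, 2)` survive.

```python
from collections import Counter
from itertools import product
from math import comb

def count_index_patterns(n):
    """Count 4-tuples of indices in {0, ..., n-1} by their multiplicity pattern.

    For example (i, i, j, j) with i != j has pattern (2, 2),
    and (i, j, j, j) has pattern (3, 1).
    """
    counts = Counter()
    for idx in product(range(n), repeat=4):
        pattern = tuple(sorted(Counter(idx).values(), reverse=True))
        counts[pattern] += 1
    return counts

n = 5
counts = count_index_patterns(n)
# Terms X_k^4: one tuple per index k.
assert counts[(4,)] == n
# Terms X_i^2 X_j^2: 6 arrangements for each unordered pair {i, j}.
assert counts[(2, 2)] == 6 * comb(n, 2)
# Every tuple is accounted for.
assert sum(counts.values()) == n ** 4
```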
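We know the first term is bounded by $nM$, and we also know that for $i \neq j$, we
have
\[ \mathbb{E}[X_i^2 X_j^2] = \mathbb{E}[X_i^2]\mathbb{E}[X_j^2] \leq \sqrt{\mathbb{E}[X_i^4]\mathbb{E}[X_j^4]} \leq M \]
by Jensen’s inequality. So we know
\[ \mathbb{E}\left[ 6 \sum_{1 \leq i < j \leq n} X_i^2 X_j^2 \right] \leq 3n(n - 1)M. \]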
Putting everything together, we have
\[ \mathbb{E}[S_n^4] \leq nM + 3n(n - 1)M \leq 3n^2 M. \]
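As a check on the bound, consider the special case (an assumption chosen for illustration, not part of the proof) where the $X_k$ are independent and take the values $\pm 1$ with probability $1/2$ each, so that $\mu = 0$ and $M = 1$. Then every inequality in the estimate except the last is an equality, and $\mathbb{E}[S_n^4]$ can be computed exactly from the binomial distribution. The function name `fourth_moment_rademacher` is hypothetical.

```python
from fractions import Fraction
from math import comb

def fourth_moment_rademacher(n):
    """Exact E[S_n^4] for S_n a sum of n independent +-1 coin flips."""
    total = Fraction(0)
    for k in range(n + 1):      # k = number of +1 values among the n flips
        s = 2 * k - n           # value of S_n on that event
        total += Fraction(comb(n, k), 2 ** n) * s ** 4
    return total

M = 1  # E[X^4] = 1 for +-1-valued variables
for n in (1, 2, 10, 50):
    exact = fourth_moment_rademacher(n)
    # Matches nM + 3n(n-1)M exactly in this special case ...
    assert exact == n * M + 3 * n * (n - 1) * M
    # ... and hence sits below the cruder bound 3 n^2 M.
    assert exact <= 3 * n ** 2 * M
```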
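So we know
\[ \mathbb{E}\left[(S_n/n)^4\right] \leq \frac{3M}{n^2}. \]
Exchanging the sum and the expectation by monotone convergence, we get
\[ \mathbb{E}\left[\sum_{n=1}^\infty \left(\frac{S_n}{n}\right)^4\right] \leq \sum_{n=1}^\infty \frac{3M}{n^2} < \infty. \]
So we know that
\[ \sum_{n=1}^\infty \left(\frac{S_n}{n}\right)^4 < \infty \text{ a.s.} \]
Since the terms of a convergent series tend to zero, $(S_n/n)^4 \to 0$ a.s., i.e. $S_n/n \to 0$ a.s.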
We are now going to get rid of the assumption that we have finite fourth
moments, but we’ll need to work with iid random variables.
Theorem (Strong law of large numbers). Let $(Y_n)$ be an iid sequence of integrable random variables with mean $\nu$. With $S_n = Y_1 + \cdots + Y_n$, we have
\[ \frac{S_n}{n} \to \nu \text{ a.s.} \]
We will use the ergodic theorem to prove this. This is not the “usual” proof
of the strong law, but since we’ve done all that work on ergodic theory, we might
as well use it to get a clean proof. Most of the remaining work is setting up the right
framework for the proof.
Proof. Let $m$ be the law of $Y_1$ and let $Y = (Y_1, Y_2, Y_3, \cdots)$. We can view $Y$ as
a function
\[ Y: \Omega \to \mathbb{R}^{\mathbb{N}} = E. \]
We let $(E, \mathcal{E}, \mu)$ be the canonical space associated with the distribution $m$ so
that
\[ \mu = \mathbb{P} \circ Y^{-1}. \]
We let $f: E \to \mathbb{R}$ be the first coordinate map, given by
\[ f(x_1, x_2, \cdots) = X_1(x_1, x_2, \cdots) = x_1. \]
Then $X_1$ has law given by $m$, and in particular is integrable. Also, the shift map
$\Theta: E \to E$ given by
\[ \Theta(x_1, x_2, \cdots) = (x_2, x_3, \cdots) \]
is measure-preserving and ergodic. Thus, with
\[ S_n(f) = f + f \circ \Theta + \cdots + f \circ \Theta^{n-1} = X_1 + \cdots + X_n, \]
we have that
\[ \frac{S_n(f)}{n} \to \bar{f} \text{ a.e.} \]
by Birkhoff’s ergodic theorem. We also have convergence in $L^1$ by von Neumann’s
ergodic theorem.
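The identity between the ergodic sum of $f$ along the shift and the partial sum of the coordinates can be seen on a finite truncation of a sequence. The sketch below is only an illustration under that assumption: a point of $E$ is replaced by a finite tuple, and the helper names `shift`, `f` and `ergodic_sum` are hypothetical.

```python
def shift(x):
    """Theta: drop the first coordinate (x is a finite truncation of a sequence)."""
    return x[1:]

def f(x):
    """First coordinate map, f(x_1, x_2, ...) = x_1."""
    return x[0]

def ergodic_sum(x, n):
    """S_n(f)(x) = f(x) + f(Theta x) + ... + f(Theta^{n-1} x).

    Requires n <= len(x), since x is only a finite truncation.
    """
    total = 0
    for _ in range(n):
        total += f(x)
        x = shift(x)
    return total

x = (3, 1, 4, 1, 5, 9, 2, 6)
# Applying f after k shifts reads off coordinate k + 1,
# so the ergodic sum is just the partial sum x_1 + ... + x_n.
assert ergodic_sum(x, 5) == sum(x[:5])
```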
Here $\bar{f}$ is $\mathcal{E}_\Theta$-measurable, and $\Theta$ is ergodic, so we know that $\bar{f} = c$ a.e. for
some constant $c$. Moreover, we have
\[ c = \mu(\bar{f}) = \lim_{n \to \infty} \mu(S_n(f)/n) = \nu. \]
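So $S_n(f)/n \to \nu$ $\mu$-a.e. Since $\mu$ is the law of $Y$ under $\mathbb{P}$ and $S_n = S_n(f) \circ Y$, this says exactly that $S_n/n \to \nu$ $\mathbb{P}$-a.s. So done.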