7 Big theorems

II Probability and Measure

7.1 The strong law of large numbers

Before we start proving the strong law of large numbers, we first spend some time discussing the difference between the strong law and the weak law. In both cases, we have a sequence $(X_n)$ of iid random variables with $\mathbb{E}[X_i] = \mu$. We let $S_n = X_1 + \cdots + X_n$.

– The weak law of large numbers says $S_n/n \to \mu$ in probability as $n \to \infty$, provided $\mathbb{E}[X_1^2] < \infty$.

– The strong law of large numbers says $S_n/n \to \mu$ a.s. as $n \to \infty$, provided $\mathbb{E}|X_1| < \infty$.

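So we see that the strong law is indeed stronger: it assumes less (a finite first moment rather than a finite second moment) and concludes more, because on a probability space convergence almost everywhere implies convergence in measure.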
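The implication between the two modes of convergence can be checked in one line. The following is a sketch using the dominated convergence theorem; the names $Z_n = S_n/n - \mu$ and $\varepsilon > 0$ are introduced only for this illustration.

```latex
% Assume Z_n -> 0 almost surely and fix eps > 0.
% Then the indicators 1_{|Z_n| > eps} -> 0 a.s. and are dominated by 1.
\[
  \mathbb{P}(|Z_n| > \varepsilon)
    = \mathbb{E}\bigl[\mathbf{1}_{\{|Z_n| > \varepsilon\}}\bigr]
    \longrightarrow 0
  \quad \text{as } n \to \infty,
\]
% which is exactly convergence in probability.
```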

We are actually going to do two versions of the strong law with different hypotheses.

Theorem (Strong law of large numbers assuming finite fourth moments). Let $(X_n)$ be a sequence of independent random variables such that there exist $\mu \in \mathbb{R}$ and $M > 0$ such that
\[
  \mathbb{E}[X_n] = \mu, \quad \mathbb{E}[X_n^4] \le M
\]
for all $n$. With $S_n = X_1 + \cdots + X_n$, we have that
\[
  \frac{S_n}{n} \to \mu \quad \text{a.s. as } n \to \infty.
\]

Note that in this version, we do not require that the $X_n$ are iid. We simply need that they are independent and have the same mean.

The proof is completely elementary.

Proof. We reduce to the case that $\mu = 0$ by setting
\[
  Y_n = X_n - \mu.
\]
We then have
\[
  \mathbb{E}[Y_n] = 0, \quad \mathbb{E}[Y_n^4] \le 2^4\,\mathbb{E}[\mu^4 + X_n^4] \le 2^4(\mu^4 + M).
\]
So it suffices to show that the theorem holds with $Y_n$ in place of $X_n$. So we can assume that $\mu = 0$.
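The fourth-moment bound for $Y_n$ rests on an elementary inequality, which can be verified as follows. This is a sketch; the real numbers $a, b$ are placeholders for $X_n$ and $\mu$.

```latex
\[
  |a - b| \le |a| + |b| \le 2\max(|a|, |b|)
  \quad\Longrightarrow\quad
  (a - b)^4 \le 2^4 \max(a^4, b^4) \le 2^4 (a^4 + b^4).
\]
% Taking a = X_n, b = \mu and then expectations:
\[
  \mathbb{E}[Y_n^4] \le 2^4\bigl(\mathbb{E}[X_n^4] + \mu^4\bigr) \le 2^4 (M + \mu^4).
\]
```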

By independence, we know that for $i \ne j$, we have
\[
  \mathbb{E}[X_i X_j^3] = \mathbb{E}[X_i]\,\mathbb{E}[X_j^3] = 0.
\]
Similarly, for all $i, j, k, \ell$ distinct, we have
\[
  \mathbb{E}[X_i X_j X_k^2] = \mathbb{E}[X_i X_j X_k X_\ell] = 0.
\]

Hence we know that
\[
  \mathbb{E}[S_n^4] = \mathbb{E}\Bigl[\sum_{k=1}^{n} X_k^4 + 6 \sum_{1 \le i < j \le n} X_i^2 X_j^2\Bigr].
\]
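To see where the two sums and the coefficient 6 come from, one can expand the fourth power and sort the terms by index pattern. The bookkeeping below is a sketch of that count, written out here for illustration.

```latex
\[
  S_n^4 = \sum_{i, j, k, \ell = 1}^{n} X_i X_j X_k X_\ell .
\]
% Index patterns, after taking expectations (mean zero and independence):
%   all four indices equal          -> X_k^4,            n terms, survive
%   three equal, one different      -> X_i X_j^3,        expectation 0
%   one pair, two further distinct  -> X_i X_j X_k^2,    expectation 0
%   all four distinct               -> X_i X_j X_k X_l,  expectation 0
%   two distinct pairs              -> X_i^2 X_j^2,      survive
% For fixed i < j, the number of ordered 4-tuples using i twice and j twice is
\[
  \binom{4}{2} = 6 ,
\]
% which is the coefficient in front of the second sum.
```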

We know the first term is bounded by $nM$, and we also know that for $i \ne j$, we have
\[
  \mathbb{E}[X_i^2 X_j^2] = \mathbb{E}[X_i^2]\,\mathbb{E}[X_j^2] \le \sqrt{\mathbb{E}[X_i^4]\,\mathbb{E}[X_j^4]} \le M
\]
by Jensen's inequality. So we know
\[
  \mathbb{E}\Bigl[6 \sum_{1 \le i < j \le n} X_i^2 X_j^2\Bigr] \le 3n(n - 1)M.
\]
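The two ingredients of this bound can be written out explicitly. This is a sketch of the standard argument, with Jensen's inequality applied to the convex function $x \mapsto x^2$.

```latex
% Jensen with \varphi(x) = x^2, applied to the random variable X_i^2:
\[
  \bigl(\mathbb{E}[X_i^2]\bigr)^2 \le \mathbb{E}[X_i^4] \le M
  \quad\Longrightarrow\quad
  \mathbb{E}[X_i^2] \le \sqrt{\mathbb{E}[X_i^4]} \le \sqrt{M}.
\]
% Hence E[X_i^2] E[X_j^2] <= sqrt(M) * sqrt(M) = M.
% There are \binom{n}{2} pairs i < j, so
\[
  6 \sum_{1 \le i < j \le n} \mathbb{E}[X_i^2 X_j^2]
    \le 6 \binom{n}{2} M = 3n(n-1)M .
\]
```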

Putting everything together, we have
\[
  \mathbb{E}[S_n^4] \le nM + 3n(n - 1)M \le 3n^2 M.
\]
So we know
\[
  \mathbb{E}\bigl[(S_n/n)^4\bigr] \le \frac{3M}{n^2}.
\]

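Summing over $n$, and exchanging expectation and sum (the terms are non-negative), we get
\[
  \mathbb{E}\Bigl[\sum_{n=1}^{\infty} \Bigl(\frac{S_n}{n}\Bigr)^4\Bigr] \le \sum_{n=1}^{\infty} \frac{3M}{n^2} < \infty.
\]
So we know that
\[
  \sum_{n=1}^{\infty} \Bigl(\frac{S_n}{n}\Bigr)^4 < \infty \quad \text{a.s.}
\]
Since the terms of a convergent series tend to zero, $(S_n/n)^4 \to 0$ a.s., i.e. $S_n/n \to 0$ a.s.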
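Two measure-theoretic facts are used in these last steps. The following is a sketch of how they fit together, writing $Z = \sum_n (S_n/n)^4$ (a name introduced here for convenience).

```latex
% (1) Non-negative terms: monotone convergence (or Tonelli) swaps E and the sum.
\[
  \mathbb{E}[Z]
    = \mathbb{E}\Bigl[\sum_{n=1}^{\infty} \Bigl(\frac{S_n}{n}\Bigr)^4\Bigr]
    = \sum_{n=1}^{\infty} \mathbb{E}\Bigl[\Bigl(\frac{S_n}{n}\Bigr)^4\Bigr]
    \le \sum_{n=1}^{\infty} \frac{3M}{n^2} = \frac{\pi^2 M}{2} < \infty .
\]
% (2) A non-negative random variable with finite expectation is finite a.s.:
\[
  \mathbb{E}[Z] < \infty \implies \mathbb{P}(Z = \infty) = 0 .
\]
% (3) The terms of a convergent series tend to zero,
%     so (S_n/n)^4 -> 0 on the event {Z < \infty}, which has probability 1.
```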

We are now going to get rid of the assumption that we have finite fourth moments, but we'll need to work with iid random variables.

Theorem (Strong law of large numbers). Let $(Y_n)$ be an iid sequence of integrable random variables with mean $\nu$. With $S_n = Y_1 + \cdots + Y_n$, we have
\[
  \frac{S_n}{n} \to \nu \quad \text{a.s.}
\]

We will use the ergodic theorem to prove this. This is not the "usual" proof of the strong law, but since we've done all that work on ergodic theory, we might as well use it to get a clean proof. Most of the remaining work is setting up the right framework for the proof.

Proof. Let $m$ be the law of $Y_1$ and let $Y = (Y_1, Y_2, Y_3, \cdots)$. We can view $Y$ as a function
\[
  Y : \Omega \to \mathbb{R}^{\mathbb{N}} = E.
\]
We let $(E, \mathcal{E}, \mu)$ be the canonical space associated with the distribution $m$ so that
\[
  \mu = \mathbb{P} \circ Y^{-1}.
\]

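We let $f : E \to \mathbb{R}$ be given by
\[
  f(x_1, x_2, \cdots) = X_1(x_1, x_2, \cdots) = x_1,
\]
where $X_n : E \to \mathbb{R}$ denotes the $n$th coordinate map on the canonical space.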

Then $X_1$ has law given by $m$, and in particular is integrable. Also, the shift map $\Theta : E \to E$ given by
\[
  \Theta(x_1, x_2, \cdots) = (x_2, x_3, \cdots)
\]
is measure-preserving and ergodic. Thus, with
\[
  S_n(f) = f + f \circ \Theta + \cdots + f \circ \Theta^{n-1} = X_1 + \cdots + X_n,
\]
we have that
\[
  \frac{S_n(f)}{n} \to \bar{f} \quad \text{a.e.}
\]
by Birkhoff's ergodic theorem. We also have convergence in $L^1$ by von Neumann's ergodic theorem.
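The identification of $S_n(f)$ with $X_1 + \cdots + X_n$ follows directly from the definitions of $f$ and $\Theta$. As a quick check (a sketch, with $X_k$ denoting the $k$th coordinate map on $E$):

```latex
\[
  (f \circ \Theta^{k})(x_1, x_2, \ldots)
    = f(x_{k+1}, x_{k+2}, \ldots)
    = x_{k+1}
    = X_{k+1}(x_1, x_2, \ldots),
  \qquad k \ge 0,
\]
\[
  S_n(f) = \sum_{k=0}^{n-1} f \circ \Theta^{k} = X_1 + \cdots + X_n .
\]
```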

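Here $\bar{f}$ is $\mathcal{E}_\Theta$-measurable, and $\Theta$ is ergodic, so we know that $\bar{f} = c$ a.e. for some constant $c$. Moreover, we have
\[
  c = \mu(\bar{f}) = \lim_{n \to \infty} \mu(S_n(f)/n) = \nu.
\]
Finally, since $\mu = \mathbb{P} \circ Y^{-1}$ and $Y_1 + \cdots + Y_n = S_n(f) \circ Y$, the event $\{(Y_1 + \cdots + Y_n)/n \to \nu\}$ has the same probability as the $\mu$-measure of $\{S_n(f)/n \to \nu\}$, namely 1. So done.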
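The middle equality in the computation of $c$ uses the $L^1$ convergence, and the last one uses that $\Theta$ preserves $\mu$. Spelled out as a sketch:

```latex
% L^1 convergence gives convergence of the integrals:
\[
  \Bigl| \mu\Bigl(\frac{S_n(f)}{n}\Bigr) - \mu(\bar f) \Bigr|
    \le \mu\Bigl( \Bigl| \frac{S_n(f)}{n} - \bar f \Bigr| \Bigr)
    \longrightarrow 0 .
\]
% Measure preservation gives \mu(f \circ \Theta^k) = \mu(f) for every k, so
\[
  \mu\Bigl(\frac{S_n(f)}{n}\Bigr)
    = \frac{1}{n} \sum_{k=0}^{n-1} \mu(f \circ \Theta^{k})
    = \mu(f)
    = \int_{\mathbb{R}} x \, m(\mathrm{d}x)
    = \nu .
\]
```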