4 Inequalities and $L^p$ spaces
II Probability and Measure
4.4 Convergence in $L^1(\mathbb{P})$ and uniform integrability
The question here is the following: suppose $(X_n)$ and $X$ are random variables with $X_n \to X$ in probability. Under what extra assumptions does $X_n$ also converge to $X$ in $L^1$, i.e. $\mathbb{E}[|X_n - X|] \to 0$ as $n \to \infty$?
This is not always true.
Example. Take $(\Omega, \mathcal{F}, \mathbb{P}) = ((0, 1), \mathcal{B}((0, 1)), \text{Lebesgue})$, and
\[ X_n = n 1_{(0, 1/n)}. \]
Then $X_n \to 0$ in probability, and in fact $X_n \to 0$ almost surely. However,
\[ \mathbb{E}[|X_n - 0|] = \mathbb{E}[X_n] = n \cdot \frac{1}{n} = 1, \]
which does not converge to 0.
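The example can be checked with exact arithmetic. The following is a minimal sketch, not part of the original notes; the function names `expectation_abs` and `prob_exceeds` are illustrative, and the formulas are hard-coded from the fact that $X_n$ equals $n$ on an interval of length $1/n$ and 0 elsewhere.

```python
from fractions import Fraction

def expectation_abs(n: int) -> Fraction:
    """E[|X_n - 0|] for X_n = n * 1_{(0, 1/n)} under Lebesgue measure on (0, 1)."""
    # X_n takes the value n on a set of measure 1/n and is 0 elsewhere.
    return Fraction(n) * Fraction(1, n)

def prob_exceeds(n: int, eps: Fraction) -> Fraction:
    """P[|X_n - 0| > eps] for eps > 0."""
    # The event {X_n > eps} is (0, 1/n) when eps < n, and empty otherwise.
    return Fraction(1, n) if eps < n else Fraction(0)

# The expectation stays at 1 while the probability of a deviation shrinks like 1/n.
for n in (1, 10, 100, 1000):
    print(n, expectation_abs(n), prob_exceeds(n, Fraction(1, 2)))
```

The probability column tends to 0, which is convergence in probability, while the expectation column is constant, which rules out convergence in $L^1$.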
We see that the problem with this sequence is that there is a lot of “stuff” concentrated near 0, and indeed the functions become unbounded near 0. We can easily curb this problem by requiring our functions to be bounded:
Theorem (Bounded convergence theorem). Suppose $X, (X_n)$ are random variables. Assume that there exists a (non-random) constant $C > 0$ such that $|X_n| \leq C$ for all $n$. If $X_n \to X$ in probability, then $X_n \to X$ in $L^1$.
The proof is a rather standard manipulation.
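Proof. We first show that $|X| \leq C$ a.s. Let $\varepsilon > 0$. We then have
\[ \mathbb{P}[|X| > C + \varepsilon] \leq \mathbb{P}[|X - X_n| + |X_n| > C + \varepsilon] \leq \mathbb{P}[|X - X_n| > \varepsilon] + \mathbb{P}[|X_n| > C]. \]
The second term vanishes, while the first term $\to 0$ as $n \to \infty$. So
\[ \mathbb{P}[|X| > C + \varepsilon] = 0 \]
for all $\varepsilon > 0$. Since $\varepsilon$ was arbitrary, we know $|X| \leq C$ a.s.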
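Now fix an $\varepsilon > 0$. Since $|X_n - X| \leq |X_n| + |X| \leq 2C$ a.s., we have
\[ \mathbb{E}[|X_n - X|] = \mathbb{E}\left[|X_n - X| \left(1_{|X_n - X| \leq \varepsilon} + 1_{|X_n - X| > \varepsilon}\right)\right] \leq \varepsilon + 2C \, \mathbb{P}[|X_n - X| > \varepsilon]. \]
Since $X_n \to X$ in probability, for $n$ sufficiently large, the second term is $\leq \varepsilon$. So $\mathbb{E}[|X_n - X|] \leq 2\varepsilon$, and we have convergence in $L^1$.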
But we can do better than that. We don’t need the functions to be actually
bounded. We just need that the functions aren’t concentrated in arbitrarily
small subsets of Ω. Thus, we make the following definition:
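Definition (Uniformly integrable). Let $\mathcal{X}$ be a family of random variables. Define
\[ I_{\mathcal{X}}(\delta) = \sup\{\mathbb{E}[|X| 1_A] : X \in \mathcal{X},\ A \in \mathcal{F} \text{ with } \mathbb{P}[A] < \delta\}. \]
Then we say $\mathcal{X}$ is uniformly integrable if $\mathcal{X}$ is $L^1$ bounded (see below), and $I_{\mathcal{X}}(\delta) \to 0$ as $\delta \to 0$.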
Definition ($L^p$ bounded). Let $\mathcal{X}$ be a family of random variables. Then we say $\mathcal{X}$ is $L^p$ bounded if
\[ \sup\{\|X\|_p : X \in \mathcal{X}\} < \infty. \]
In some sense, uniform integrability is “uniform continuity for integration”. It is immediate that
Proposition. Finite unions of uniformly integrable families are uniformly integrable.
How can we find uniformly integrable families? The following proposition
gives us a large class of such families.
Proposition. Let $\mathcal{X}$ be an $L^p$ bounded family for some $p > 1$. Then $\mathcal{X}$ is uniformly integrable.
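Proof. We let
\[ C = \sup\{\|X\|_p : X \in \mathcal{X}\} < \infty. \]
Suppose that $X \in \mathcal{X}$ and $A \in \mathcal{F}$. We then have
\[ \mathbb{E}[|X| 1_A] \leq \mathbb{E}[|X|^p]^{1/p} \, \mathbb{P}[A]^{1/q} \leq C \, \mathbb{P}[A]^{1/q} \]
by Hölder’s inequality, where $p, q$ are conjugates. This is a uniform bound depending only on $\mathbb{P}[A]$, and it tends to 0 as $\mathbb{P}[A] \to 0$ since $q < \infty$. Taking $A = \Omega$ also shows that $\mathcal{X}$ is $L^1$ bounded. So done.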
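The Hölder bound can be sanity-checked on a concrete family. The sketch below is an illustration only: the family $Y_n = \sqrt{n}\, 1_{(0, 1/n)}$ on $((0,1), \text{Lebesgue})$, the sets $A = (0, a)$ and the name `holder_gap_squared` are assumptions made here, not taken from the notes. This family has $\|Y_n\|_2 = 1$, so $p = q = 2$ and $C = 1$, and the bound reads $\mathbb{E}[Y_n 1_A] \leq \mathbb{P}[A]^{1/2}$. Both sides are squared so that the comparison is exact in rational arithmetic.

```python
from fractions import Fraction

def holder_gap_squared(n: int, a: Fraction) -> Fraction:
    """Return P[A] - E[Y_n 1_A]^2 for Y_n = sqrt(n) * 1_{(0, 1/n)} and A = (0, a)."""
    overlap = min(a, Fraction(1, n))   # Lebesgue measure of A intersected with (0, 1/n)
    lhs_squared = n * overlap ** 2     # (sqrt(n) * overlap)^2 = E[Y_n 1_A]^2
    rhs_squared = a                    # (C * P[A]^(1/2))^2 with C = 1
    return rhs_squared - lhs_squared

# Check the bound on a grid of indices n and set sizes a = 1/m.
gaps = [holder_gap_squared(n, Fraction(1, m))
        for n in range(1, 200) for m in range(1, 200)]
print(min(gaps) >= 0)
```

The gap is zero exactly when $a = 1/n$, i.e. when $A$ is the set on which $Y_n$ is concentrated, so the bound cannot be improved for this family.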
This is the best we can get: $L^1$ boundedness is not enough. Indeed, our earlier example
\[ X_n = n 1_{(0, 1/n)} \]
is $L^1$ bounded but not uniformly integrable, since $\mathbb{E}[X_n 1_{(0, 1/n)}] = 1$ for all $n$ while $\mathbb{P}[(0, 1/n)] = 1/n \to 0$.
For many practical purposes, it is convenient to rephrase the definition of
uniform integrability as follows:
Lemma. Let $\mathcal{X}$ be a family of random variables. Then $\mathcal{X}$ is uniformly integrable if and only if
\[ \sup\{\mathbb{E}[|X| 1_{|X| > k}] : X \in \mathcal{X}\} \to 0 \]
as $k \to \infty$.
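Proof. (⇒) Suppose that $\mathcal{X}$ is uniformly integrable. For any $k > 0$ and $X \in \mathcal{X}$, by Chebyshev’s inequality, we have
\[ \mathbb{P}[|X| \geq k] \leq \frac{\mathbb{E}[|X|]}{k}. \]
Given $\varepsilon > 0$, we pick $\delta$ such that $\mathbb{E}[|X| 1_A] < \varepsilon$ for all $X \in \mathcal{X}$ and all $A$ with $\mathbb{P}[A] < \delta$. Then pick $k$ sufficiently large such that $k\delta > \sup\{\mathbb{E}[|X|] : X \in \mathcal{X}\}$, which is possible since $\mathcal{X}$ is $L^1$ bounded. Then $\mathbb{P}[|X| \geq k] < \delta$, and hence $\mathbb{E}[|X| 1_{|X| > k}] < \varepsilon$ for all $X \in \mathcal{X}$.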
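(⇐) Suppose that the condition in the lemma holds. We first show that $\mathcal{X}$ is $L^1$ bounded. For any $X \in \mathcal{X}$, we have
\[ \mathbb{E}[|X|] = \mathbb{E}[|X| (1_{|X| \leq k} + 1_{|X| > k})] \leq k + \mathbb{E}[|X| 1_{|X| > k}], \]
and by picking a large enough $k$, the right-hand side is finite and bounded uniformly in $X \in \mathcal{X}$.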
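Next note that for any measurable $A$ and $X \in \mathcal{X}$, we have
\[ \mathbb{E}[|X| 1_A] = \mathbb{E}[|X| 1_A (1_{|X| > k} + 1_{|X| \leq k})] \leq \mathbb{E}[|X| 1_{|X| > k}] + k \, \mathbb{P}[A]. \]
Thus, for any $\varepsilon > 0$, we can pick $k$ sufficiently large such that the first term is $< \frac{\varepsilon}{2}$ for all $X \in \mathcal{X}$ by assumption. Then when $\mathbb{P}[A] < \frac{\varepsilon}{2k}$, we have $\mathbb{E}[|X| 1_A] \leq \varepsilon$.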
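The tail criterion of the lemma can be compared on two families. This is a rough sketch under stated assumptions: $X_n = n 1_{(0,1/n)}$ is the earlier example, while $Y_n = \sqrt{n}\, 1_{(0,1/n)}$ is a hypothetical $L^2$-bounded family introduced here for contrast. The function names and the cutoff `n_max` are illustrative, and the tail expectations are hard-coded from the closed forms in the comments.

```python
def tail_x(n: int, k: int) -> float:
    """E[X_n 1_{X_n > k}] for X_n = n * 1_{(0, 1/n)}."""
    # X_n equals n on a set of measure 1/n, so the tail term is n * (1/n) = 1 once n > k.
    return 1.0 if n > k else 0.0

def tail_y(n: int, k: int) -> float:
    """E[Y_n 1_{Y_n > k}] for the hypothetical family Y_n = sqrt(n) * 1_{(0, 1/n)}."""
    # sqrt(n) > k iff n > k^2 for an integer k >= 0; the tail term is then sqrt(n) / n.
    return n ** -0.5 if n > k * k else 0.0

def sup_tail(tail, k: int, n_max: int = 10_000) -> float:
    """Maximum of the tail term over n = 1, ..., n_max (a finite stand-in for the sup)."""
    return max(tail(n, k) for n in range(1, n_max + 1))

# For X_n the supremum stays at 1; for Y_n it is below 1/k and so tends to 0.
for k in (1, 5, 25, 50):
    print(k, sup_tail(tail_x, k), sup_tail(tail_y, k))
```

For $Y_n$ the tail term is decreasing in $n$ once $n > k^2$, so the finite maximum agrees with the true supremum whenever `n_max` exceeds $k^2$.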
As a corollary, we find that
Corollary. Let $\mathcal{X} = \{X\}$, where $X \in L^1(\mathbb{P})$. Then $\mathcal{X}$ is uniformly integrable. Hence, a finite collection of $L^1$ functions is uniformly integrable.
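Proof. Note that
\[ \mathbb{E}[|X|] = \sum_{k=0}^{\infty} \mathbb{E}[|X| 1_{|X| \in [k, k+1)}]. \]
Since the sum is finite, we must have
\[ \mathbb{E}[|X| 1_{|X| \geq K}] = \sum_{k=K}^{\infty} \mathbb{E}[|X| 1_{|X| \in [k, k+1)}] \to 0 \]
as $K \to \infty$. So $\{X\}$ is uniformly integrable by the lemma, and the claim about finite collections follows since finite unions of uniformly integrable families are uniformly integrable.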
With all that preparation, we now come to the main theorem on uniform
integrability.
Theorem. Let $X, (X_n)$ be random variables. Then the following are equivalent:
(i) $X_n, X \in L^1$ for all $n$ and $X_n \to X$ in $L^1$.
(ii) $\{X_n\}$ is uniformly integrable and $X_n \to X$ in probability.
The (i) ⇒ (ii) direction is just a standard manipulation. The idea of the (ii) ⇒ (i) direction is that we use uniform integrability to cut off $X_n$ and $X$ at some large value $K$, which gives us a small error, then apply bounded convergence.
Proof. We first assume that $X_n, X$ are in $L^1$ and $X_n \to X$ in $L^1$. We want to show that $\{X_n\}$ is uniformly integrable and $X_n \to X$ in probability.
We first show that $X_n \to X$ in probability. This follows from the Chebyshev inequality. Let $\varepsilon > 0$. Then we have
\[ \mathbb{P}[|X - X_n| > \varepsilon] \leq \frac{\mathbb{E}[|X - X_n|]}{\varepsilon} \to 0 \]
as $n \to \infty$.
Next we show that $\{X_n\}$ is uniformly integrable. Fix $\varepsilon > 0$. Take $N$ such that $n \geq N$ implies $\mathbb{E}[|X - X_n|] \leq \frac{\varepsilon}{2}$. Since finite families of $L^1$ random variables are uniformly integrable, we can pick $\delta > 0$ such that $A \in \mathcal{F}$ and $\mathbb{P}[A] < \delta$ implies
\[ \mathbb{E}[|X| 1_A],\ \mathbb{E}[|X_n| 1_A] \leq \frac{\varepsilon}{2} \]
for $n = 1, \cdots, N$.
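Now when $n > N$ and $A \in \mathcal{F}$ with $\mathbb{P}[A] < \delta$, we have
\[ \mathbb{E}[|X_n| 1_A] \leq \mathbb{E}[|X - X_n| 1_A] + \mathbb{E}[|X| 1_A] \leq \mathbb{E}[|X - X_n|] + \frac{\varepsilon}{2} \leq \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Since also $\mathbb{E}[|X_n|] \leq \mathbb{E}[|X_n - X|] + \mathbb{E}[|X|]$ is bounded in $n$, the family $\{X_n\}$ is uniformly integrable.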
Conversely, assume that $\{X_n\}$ is uniformly integrable and $X_n \to X$ in probability.
The first step is to show that $X \in L^1$. We want to use Fatou’s lemma, but to do so, we want almost sure convergence, not just convergence in probability. Recall that we have previously shown that there is a subsequence $(X_{n_k})$ of $(X_n)$ such that $X_{n_k} \to X$ a.s. Then we have
\[ \mathbb{E}[|X|] = \mathbb{E}\left[\liminf_{k \to \infty} |X_{n_k}|\right] \leq \liminf_{k \to \infty} \mathbb{E}[|X_{n_k}|] < \infty \]
since uniformly integrable families are $L^1$ bounded. So $\mathbb{E}[|X|] < \infty$, hence $X \in L^1$.
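Next we want to show that $X_n \to X$ in $L^1$. Take $\varepsilon > 0$. Since $X \in L^1$, the family $\{X_n\} \cup \{X\}$ is uniformly integrable, so by the lemma there exists $K \in (0, \infty)$ such that
\[ \mathbb{E}\left[|X| 1_{\{|X| > K\}}\right],\ \mathbb{E}\left[|X_n| 1_{\{|X_n| > K\}}\right] \leq \frac{\varepsilon}{3} \]
for all $n$.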
To set things up so that we can use the bounded convergence theorem, we have to invent new random variables
\[ X_n^K = (X_n \vee -K) \wedge K, \quad X^K = (X \vee -K) \wedge K. \]
Since $X_n \to X$ in probability and $|X_n^K - X^K| \leq |X_n - X|$, it follows that $X_n^K \to X^K$ in probability. Now bounded convergence tells us that there is some $N$ such that $n \geq N$ implies
\[ \mathbb{E}[|X_n^K - X^K|] \leq \frac{\varepsilon}{3}. \]
Combining, we have for $n \geq N$ that
\[ \mathbb{E}[|X_n - X|] \leq \mathbb{E}[|X_n^K - X^K|] + \mathbb{E}[|X| 1_{\{|X| > K\}}] + \mathbb{E}[|X_n| 1_{\{|X_n| > K\}}] \leq \varepsilon. \]
So we know that $X_n \to X$ in $L^1$.
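The combining step rests on a pointwise inequality: for real numbers $x, y$ and $K > 0$, writing $x^K = (x \vee -K) \wedge K$, we have $|x - y| \leq |x^K - y^K| + |x| 1_{|x| > K} + |y| 1_{|y| > K}$, because $|x - x^K| = (|x| - K)^+ \leq |x| 1_{|x| > K}$. The sketch below checks this on an integer grid; the helper names `clip` and `slack` are illustrative and not from the notes, and a grid check is of course no substitute for the one-line proof.

```python
def clip(x: int, K: int) -> int:
    """Truncation (x v -K) ^ K, i.e. x clipped to the interval [-K, K]."""
    return min(max(x, -K), K)

def slack(x: int, y: int, K: int) -> int:
    """Right-hand side minus left-hand side of the truncation inequality."""
    tail_x = abs(x) if abs(x) > K else 0   # |x| 1_{|x| > K}
    tail_y = abs(y) if abs(y) > K else 0   # |y| 1_{|y| > K}
    return abs(clip(x, K) - clip(y, K)) + tail_x + tail_y - abs(x - y)

# The slack should be non-negative at every grid point.
slacks = [slack(x, y, K)
          for K in range(1, 10)
          for x in range(-20, 21)
          for y in range(-20, 21)]
print(min(slacks) >= 0)
```

Taking expectations of this inequality with $x = X_n$ and $y = X$ gives the three-term bound used in the proof.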
The main application is when $(X_n)$ is a type of stochastic process known as a martingale. This will be done in III Advanced Probability and III Stochastic Calculus.