II Probability and Measure

1 Measures

1.1 Measures
The starting point of all this is to come up with a function that determines the “size” of a given set, known as a measure. It turns out we cannot sensibly define a size for all subsets of $[0, 1]$. Thus, we need to restrict our attention to a collection of “nice” subsets. Specifying which subsets are “nice” amounts to specifying a σ-algebra.
This section is mostly technical.
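Definition (σ-algebra). Let $E$ be a set. A σ-algebra $\mathcal{E}$ on $E$ is a collection of subsets of $E$ such that
(i) $\emptyset \in \mathcal{E}$.
(ii) $A \in \mathcal{E}$ implies that $A^C = E \setminus A \in \mathcal{E}$.
(iii) For any sequence $(A_n)$ in $\mathcal{E}$, we have that $\bigcup_n A_n \in \mathcal{E}$.
The pair $(E, \mathcal{E})$ is called a measurable space.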
Note that the axioms imply that σ-algebras are also closed under countable intersections, as we have $A \cap B = (A^C \cup B^C)^C$.
Definition (Measure). A measure on a measurable space $(E, \mathcal{E})$ is a function $\mu : \mathcal{E} \to [0, \infty]$ such that
(i) $\mu(\emptyset) = 0$.
(ii) Countable additivity: for any disjoint sequence $(A_n)$ in $\mathcal{E}$, we have
\[ \mu\left(\bigcup_n A_n\right) = \sum_{n=1}^\infty \mu(A_n). \]
Example. Let $E$ be any countable set, and $\mathcal{E} = P(E)$ be the set of all subsets of $E$. A mass function is any function $m : E \to [0, \infty]$. We can then define a measure by setting
\[ \mu(A) = \sum_{x \in A} m(x). \]
In particular, if we put $m(x) = 1$ for all $x \in E$, then we obtain the counting measure.
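As a rough illustration of the example above, here is a minimal Python sketch of a measure built from a mass function on a finite set. The ground set, the mass values and the helper name `measure_from_mass` are all hypothetical choices made for this sketch, and a finite set only stands in for the countable case.

```python
from fractions import Fraction

def measure_from_mass(mass):
    """Return mu with mu(A) = sum of mass[x] over x in A."""
    def mu(subset):
        return sum((mass[x] for x in subset), Fraction(0))
    return mu

# Hypothetical finite ground set and mass function, chosen for illustration.
E = {"a", "b", "c"}
mass = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
mu = measure_from_mass(mass)

# Counting measure: every point has mass 1.
counting = measure_from_mass({x: 1 for x in E})

A, B = {"a"}, {"b", "c"}            # two disjoint subsets of E
assert mu(set()) == 0               # mu of the empty set is 0
assert mu(A | B) == mu(A) + mu(B)   # additivity on disjoint sets
assert counting(E) == len(E)        # counting measure counts elements
```

Exact fractions are used so that the additivity check is not affected by floating-point rounding.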
Countable spaces are nice, because we can always take $\mathcal{E} = P(E)$, and the measure can be defined on all possible subsets. However, for “bigger” spaces, we have to be more careful. The set of all subsets is often “too large”. We will see a concrete and also important example of this later.
In general, σ-algebras on large spaces are often described in terms of a smaller collection of sets, known as the generating sets.
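Definition (Generator of σ-algebra). Let $E$ be a set, and let $\mathcal{A} \subseteq P(E)$ be a collection of subsets of $E$. We define
\[ \sigma(\mathcal{A}) = \{A \subseteq E : A \in \mathcal{E} \text{ for all } \sigma\text{-algebras } \mathcal{E} \text{ that contain } \mathcal{A}\}. \]
In other words, $\sigma(\mathcal{A})$ is the smallest σ-algebra that contains $\mathcal{A}$. This is known as the σ-algebra generated by $\mathcal{A}$.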
Example. Take $E = \mathbb{Z}$, and $\mathcal{A} = \{\{x\} : x \in \mathbb{Z}\}$. Then $\sigma(\mathcal{A})$ is just $P(E)$, since every subset of $E$ can be written as a countable union of singletons.
Example. Take $E = \mathbb{Z}$, and let $\mathcal{A} = \{\{x, x + 1, x + 2, x + 3, \cdots\} : x \in E\}$. Then again $\sigma(\mathcal{A})$ is the set of all subsets of $E$.
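On a finite set, the generated σ-algebra can be computed by brute force, which gives a quick sanity check of the two examples above. The sketch below is only a finite analogue: the ground set `{0, 1, 2, 3}` replaces the integers, the "tails" are truncated to it, and the function name `generated_sigma_algebra` is a hypothetical helper, not anything from the notes.

```python
from itertools import combinations

def generated_sigma_algebra(E, generators):
    """Smallest family containing the generators that is closed under
    complement and pairwise union. On a finite set this is sigma(generators)."""
    E = frozenset(E)
    family = {frozenset(), E} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for A in current:
            if E - A not in family:          # close under complement
                family.add(E - A)
                changed = True
        for A, B in combinations(current, 2):
            if A | B not in family:          # close under union
                family.add(A | B)
                changed = True
    return family

E = {0, 1, 2, 3}
singletons = [{x} for x in E]
tails = [{y for y in E if y >= x} for x in E]   # finite analogue of {x, x+1, ...}

sigma_singletons = generated_sigma_algebra(E, singletons)
sigma_tails = generated_sigma_algebra(E, tails)
sigma_coarse = generated_sigma_algebra(E, [{0, 1}])

assert len(sigma_singletons) == 2 ** len(E)   # all subsets
assert len(sigma_tails) == 2 ** len(E)        # all subsets again
assert len(sigma_coarse) == 4                 # a genuinely smaller sigma-algebra
```

The last line shows that a poor choice of generators, here the single set `{0, 1}`, yields a σ-algebra much smaller than the power set.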
The following is the most important σ-algebra in the course:
Definition (Borel σ-algebra). Let $E = \mathbb{R}$, and $\mathcal{A} = \{U \subseteq \mathbb{R} : U \text{ is open}\}$. Then $\sigma(\mathcal{A})$ is known as the Borel σ-algebra, which is not the set of all subsets of $\mathbb{R}$.
We can equivalently define this by $\tilde{\mathcal{A}} = \{(a, b) : a < b, \ a, b \in \mathbb{Q}\}$. Then $\sigma(\tilde{\mathcal{A}})$ is also the Borel σ-algebra.
Often, we would like to prove results that allow us to deduce properties of a σ-algebra just by checking them on a generating set. However, usually, we cannot just check them on an arbitrary generating set. Instead, the generating set has to satisfy some nice closure properties. We are now going to introduce several definitions that you need not aim to remember (except when exams are near).
Definition (π-system). Let $\mathcal{A}$ be a collection of subsets of $E$. Then $\mathcal{A}$ is called a π-system if
(i) $\emptyset \in \mathcal{A}$.
(ii) If $A, B \in \mathcal{A}$, then $A \cap B \in \mathcal{A}$.
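Definition (d-system). Let $\mathcal{A}$ be a collection of subsets of $E$. Then $\mathcal{A}$ is called a d-system if
(i) $E \in \mathcal{A}$.
(ii) If $A, B \in \mathcal{A}$ and $A \subseteq B$, then $B \setminus A \in \mathcal{A}$.
(iii) For all increasing sequences $(A_n)$ in $\mathcal{A}$, we have that $\bigcup_n A_n \in \mathcal{A}$.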
The point of d-systems and π-systems is that they separate the axioms of a σ-algebra into two parts. More precisely, we have
Proposition. A collection $\mathcal{A}$ is a σ-algebra if and only if it is both a π-system and a d-system.
This follows rather straightforwardly from the definitions.
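The proposition can be spot-checked on small finite families. The following Python sketch tests the three sets of axioms directly; the three example families and all function names are hypothetical, and on a finite set the increasing-union axiom of a d-system is automatic, so it is not checked.

```python
from itertools import combinations

def is_pi_system(family):
    """Contains the empty set and is closed under pairwise intersection."""
    fam = set(family)
    return frozenset() in fam and all(A & B in fam for A in fam for B in fam)

def is_d_system(E, family):
    """Contains E and is closed under differences B minus A when A is a subset of B.

    On a finite set an increasing sequence is eventually constant, so the
    increasing-union axiom holds automatically and is not checked here.
    """
    fam = set(family)
    return frozenset(E) in fam and all(
        B - A in fam for A in fam for B in fam if A <= B)

def is_sigma_algebra(E, family):
    """Contains the empty set, closed under complement and pairwise union."""
    fam, E = set(family), frozenset(E)
    return (frozenset() in fam
            and all(E - A in fam for A in fam)
            and all(A | B in fam for A in fam for B in fam))

E = frozenset({0, 1, 2, 3})
# A sigma-algebra with two blocks.
blocks = {frozenset(), frozenset({0, 1}), frozenset({2, 3}), E}
# Subsets of even size: a d-system that is not a pi-system.
even_sized = {frozenset(c) for k in (0, 2, 4) for c in combinations(sorted(E), k)}
# A pi-system that is not a d-system (it does not contain E).
nested = {frozenset(), frozenset({0})}

for family in (blocks, even_sized, nested):
    both = is_pi_system(family) and is_d_system(E, family)
    assert both == is_sigma_algebra(E, family)
```

The family of even-sized subsets is a useful example to keep in mind: it satisfies every d-system axiom but fails to be closed under intersection.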
The following definitions are also useful:
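Definition (Ring). A collection of subsets $\mathcal{A}$ is a ring on $E$ if $\emptyset \in \mathcal{A}$ and for all $A, B \in \mathcal{A}$, we have $B \setminus A \in \mathcal{A}$ and $A \cup B \in \mathcal{A}$.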
Definition (Algebra). A collection of subsets $\mathcal{A}$ is an algebra on $E$ if $\emptyset \in \mathcal{A}$, and for all $A, B \in \mathcal{A}$, we have $A^C \in \mathcal{A}$ and $A \cup B \in \mathcal{A}$.
So an algebra is like a σ-algebra, but it is only required to be closed under finite unions, rather than countable unions.
While the names π-system and d-system are rather arbitrary, we can make some sense of the names “ring” and “algebra”. Indeed, a ring forms a ring (without unity) in the algebraic sense with symmetric difference as “addition” and intersection as “multiplication”. Then the empty set acts as the additive identity, and $E$, if present, acts as the multiplicative identity. Similarly, an algebra is a Boolean subalgebra of the Boolean algebra $P(E)$.
A very important lemma about these things is Dynkin’s lemma:
Lemma (Dynkin’s π-system lemma). Let $\mathcal{A}$ be a π-system. Then any d-system which contains $\mathcal{A}$ contains $\sigma(\mathcal{A})$.
This will be very useful in the future. If we want to show that all elements of $\sigma(\mathcal{A})$ satisfy a particular property for some generating π-system $\mathcal{A}$, we just have to show that the elements of $\mathcal{A}$ satisfy that property, and that the collection of things that satisfy the property forms a d-system.
While this use case might seem rather contrived, it is surprisingly common when we have to prove things.
Proof. Let $\mathcal{D}$ be the intersection of all d-systems containing $\mathcal{A}$, i.e. the smallest d-system containing $\mathcal{A}$. We show that $\mathcal{D}$ contains $\sigma(\mathcal{A})$. To do so, we will show that $\mathcal{D}$ is a π-system, hence a σ-algebra.
There are two steps to the proof, both of which are straightforward verifications:
(i) We first show that if $B \in \mathcal{D}$ and $A \in \mathcal{A}$, then $B \cap A \in \mathcal{D}$.
(ii) We then show that if $A, B \in \mathcal{D}$, then $A \cap B \in \mathcal{D}$.
Then the result immediately follows from the second part.
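We let
\[ \mathcal{D}' = \{B \in \mathcal{D} : B \cap A \in \mathcal{D} \text{ for all } A \in \mathcal{A}\}. \]
We note that $\mathcal{D}' \supseteq \mathcal{A}$ because $\mathcal{A}$ is a π-system, and is hence closed under intersections. We check that $\mathcal{D}'$ is a d-system. It is clear that $E \in \mathcal{D}'$. If we have $B_1, B_2 \in \mathcal{D}'$, where $B_1 \subseteq B_2$, then for any $A \in \mathcal{A}$, we have
\[ (B_2 \setminus B_1) \cap A = (B_2 \cap A) \setminus (B_1 \cap A). \]
By definition of $\mathcal{D}'$, we know $B_2 \cap A$ and $B_1 \cap A$ are elements of $\mathcal{D}$, with $B_1 \cap A \subseteq B_2 \cap A$. Since $\mathcal{D}$ is a d-system, we know this difference is in $\mathcal{D}$. So $B_2 \setminus B_1 \in \mathcal{D}'$.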
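Finally, suppose that $(B_n)$ is an increasing sequence in $\mathcal{D}'$, with $B = \bigcup B_n$. Then for every $A \in \mathcal{A}$, the sets $B_n \cap A$ form an increasing sequence in $\mathcal{D}$, so we have that
\[ \left(\bigcup B_n\right) \cap A = \bigcup (B_n \cap A) = B \cap A \in \mathcal{D}. \]
Therefore $B \in \mathcal{D}'$.
Therefore $\mathcal{D}'$ is a d-system contained in $\mathcal{D}$, which also contains $\mathcal{A}$. By our choice of $\mathcal{D}$, we know $\mathcal{D}' = \mathcal{D}$.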
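We now let
\[ \mathcal{D}'' = \{B \in \mathcal{D} : B \cap A \in \mathcal{D} \text{ for all } A \in \mathcal{D}\}. \]
Since $\mathcal{D}' = \mathcal{D}$, we again have $\mathcal{A} \subseteq \mathcal{D}''$, and the same argument as above implies that $\mathcal{D}''$ is a d-system which is between $\mathcal{A}$ and $\mathcal{D}$. But the only way that can happen is if $\mathcal{D}'' = \mathcal{D}$, and this implies that $\mathcal{D}$ is a π-system.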
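To see the lemma at work, and to see why the π-system hypothesis matters, one can compute the smallest d-system and the smallest σ-algebra containing a family on a small finite set. The sketch below does this by brute-force closure; the helper names and the two example families are hypothetical, and only the difference axiom needs closing because increasing unions add nothing on a finite set.

```python
def close(family, step):
    """Repeatedly add step(family) until nothing new appears."""
    family = set(family)
    while True:
        new = step(family) - family
        if not new:
            return family
        family |= new

def generated_d_system(E, generators):
    """Smallest d-system containing the generators (finite ground set)."""
    E = frozenset(E)
    start = {E} | {frozenset(g) for g in generators}
    # Increasing unions add nothing on a finite set, so close under
    # differences B minus A with A a subset of B only.
    return close(start, lambda F: {B - A for A in F for B in F if A <= B})

def generated_sigma_algebra(E, generators):
    """Smallest sigma-algebra containing the generators (finite ground set)."""
    E = frozenset(E)
    start = {frozenset(), E} | {frozenset(g) for g in generators}
    return close(start, lambda F: {E - A for A in F} | {A | B for A in F for B in F})

E = {0, 1, 2, 3}
pi_system = [set(), {0}, {0, 1}]      # closed under intersection
not_pi = [{0, 1}, {1, 2}]             # {0, 1} & {1, 2} = {1} is missing

# For a pi-system the two closures coincide, as Dynkin's lemma predicts.
assert generated_d_system(E, pi_system) == generated_sigma_algebra(E, pi_system)

# Without the pi-system hypothesis the d-system can be strictly smaller.
assert len(generated_d_system(E, not_pi)) == 6
assert len(generated_sigma_algebra(E, not_pi)) == 16
```

In the second example the generated d-system consists of the empty set, the whole set and four two-element sets, so it never produces the singleton `{1}` that the σ-algebra contains.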
After defining all sorts of things that are “weaker versions” of σ-algebras, we now define a bunch of measure-like objects that satisfy fewer properties. Again, no one really remembers these definitions:
Definition (Set function). Let $\mathcal{A}$ be a collection of subsets of $E$ with $\emptyset \in \mathcal{A}$. A set function is a function $\mu : \mathcal{A} \to [0, \infty]$ such that $\mu(\emptyset) = 0$.
Definition (Increasing set function). A set function is increasing if it has the property that for all $A, B \in \mathcal{A}$ with $A \subseteq B$, we have $\mu(A) \leq \mu(B)$.
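Definition (Additive set function). A set function is additive if whenever $A, B \in \mathcal{A}$ and $A \cup B \in \mathcal{A}$, $A \cap B = \emptyset$, then $\mu(A \cup B) = \mu(A) + \mu(B)$.
Definition (Countably additive set function). A set function is countably additive if whenever $(A_n)$ is a sequence of disjoint sets in $\mathcal{A}$ with $\bigcup_n A_n \in \mathcal{A}$, then
\[ \mu\left(\bigcup_n A_n\right) = \sum_n \mu(A_n). \]
Under these definitions, a measure is just a countably additive set function defined on a σ-algebra.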
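Definition (Countably subadditive set function). A set function is countably subadditive if whenever $(A_n)$ is a sequence of sets in $\mathcal{A}$ with $\bigcup_n A_n \in \mathcal{A}$, then
\[ \mu\left(\bigcup_n A_n\right) \leq \sum_n \mu(A_n). \]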
The big theorem that allows us to construct measures is the Caratheodory
extension theorem. In particular, this will help us construct the Lebesgue measure
on R.
Theorem (Caratheodory extension theorem). Let $\mathcal{A}$ be a ring on $E$, and $\mu$ a countably additive set function on $\mathcal{A}$. Then $\mu$ extends to a measure on the σ-algebra generated by $\mathcal{A}$.
Proof. (non-examinable) We start by defining what we want our measure to be. For $B \subseteq E$, we set
\[ \mu^*(B) = \inf\left\{\sum_n \mu(A_n) : (A_n) \subseteq \mathcal{A} \text{ and } B \subseteq \bigcup A_n\right\}. \]
If it happens that there is no such sequence, we set this to be $\infty$. This is known as the outer measure. It is clear that $\mu^*(\emptyset) = 0$, and that $\mu^*$ is increasing.
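We say a set $A \subseteq E$ is $\mu^*$-measurable if
\[ \mu^*(B) = \mu^*(B \cap A) + \mu^*(B \cap A^C) \]
for all $B \subseteq E$. We let
\[ \mathcal{M} = \{\mu^*\text{-measurable sets}\}. \]
We will show the following:
(i) $\mathcal{M}$ is a σ-algebra containing $\mathcal{A}$.
(ii) $\mu^*$ is a measure on $\mathcal{M}$ with $\mu^*|_{\mathcal{A}} = \mu$.
Note that it is not true in general that $\mathcal{M} = \sigma(\mathcal{A})$. However, we will always have $\mathcal{M} \supseteq \sigma(\mathcal{A})$.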
We are going to break this up into five nice bite-size chunks.
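Claim. $\mu^*$ is countably subadditive.
Suppose $B \subseteq \bigcup_n B_n$. We need to show that $\mu^*(B) \leq \sum_n \mu^*(B_n)$. We can wlog assume that $\mu^*(B_n)$ is finite for all $n$, or else the inequality is trivial. Let $\varepsilon > 0$. Then by definition of the outer measure, for each $n$, we can find a sequence $(B_{n,m})_{m=1}^\infty$ in $\mathcal{A}$ with the property that
\[ B_n \subseteq \bigcup_m B_{n,m} \]
and
\[ \mu^*(B_n) + \frac{\varepsilon}{2^n} \geq \sum_m \mu(B_{n,m}). \]
Then we have
\[ B \subseteq \bigcup_n B_n \subseteq \bigcup_{n,m} B_{n,m}. \]
Thus, by definition, we have
\[ \mu^*(B) \leq \sum_{n,m} \mu(B_{n,m}) \leq \sum_n \left(\mu^*(B_n) + \frac{\varepsilon}{2^n}\right) = \varepsilon + \sum_n \mu^*(B_n). \]
Since $\varepsilon$ was arbitrary, we are done.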
Claim. $\mu^*$ agrees with $\mu$ on $\mathcal{A}$.
In the first example sheet, we will show that if $\mathcal{A}$ is a ring and $\mu$ is a countably additive set function on $\mathcal{A}$, then $\mu$ is in fact countably subadditive and increasing.
Assuming this, suppose that $A, (A_n)$ are in $\mathcal{A}$ and $A \subseteq \bigcup_n A_n$. Then we have
\[ \mu(A) \leq \sum_n \mu(A \cap A_n) \leq \sum_n \mu(A_n), \]
using that $\mu$ is countably subadditive and increasing. Note that we have to do this in two steps, rather than just applying countable subadditivity, since we did not assume that $\bigcup_n A_n \in \mathcal{A}$. Taking the infimum over all sequences, we have
\[ \mu(A) \leq \mu^*(A). \]
Also, we see by definition that $\mu(A) \geq \mu^*(A)$, since $A$ covers $A$. So we get that $\mu(A) = \mu^*(A)$ for all $A \in \mathcal{A}$.
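Claim. $\mathcal{M}$ contains $\mathcal{A}$.
Suppose that $A \in \mathcal{A}$ and $B \subseteq E$. We need to show that
\[ \mu^*(B) = \mu^*(B \cap A) + \mu^*(B \cap A^C). \]
Since $\mu^*$ is countably subadditive, we immediately have $\mu^*(B) \leq \mu^*(B \cap A) + \mu^*(B \cap A^C)$. For the other inequality, we first observe that it is trivial if $\mu^*(B)$ is infinite. If it is finite, then by definition, given $\varepsilon > 0$, we can find some $(B_n)$ in $\mathcal{A}$ such that $B \subseteq \bigcup_n B_n$ and
\[ \mu^*(B) + \varepsilon \geq \sum_n \mu(B_n). \]
Then we have
\begin{align*}
B \cap A &\subseteq \bigcup_n (B_n \cap A) \\
B \cap A^C &\subseteq \bigcup_n (B_n \cap A^C).
\end{align*}
We notice that $B_n \cap A^C = B_n \setminus A \in \mathcal{A}$, and likewise $B_n \cap A = B_n \setminus (B_n \setminus A) \in \mathcal{A}$. Thus, by definition of $\mu^*$, we have
\begin{align*}
\mu^*(B \cap A) + \mu^*(B \cap A^C) &\leq \sum_n \mu(B_n \cap A) + \sum_n \mu(B_n \cap A^C) \\
&= \sum_n \left(\mu(B_n \cap A) + \mu(B_n \cap A^C)\right) \\
&= \sum_n \mu(B_n) \\
&\leq \mu^*(B) + \varepsilon.
\end{align*}
Since $\varepsilon$ was arbitrary, the result follows.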
Claim. $\mathcal{M}$ is an algebra.
We first show that $E \in \mathcal{M}$. This is true since we obviously have
\[ \mu^*(B) = \mu^*(B \cap E) + \mu^*(B \cap E^C) \]
for all $B \subseteq E$.
Next, note that if $A \in \mathcal{M}$, then by definition we have, for all $B$,
\[ \mu^*(B) = \mu^*(B \cap A) + \mu^*(B \cap A^C). \]
Now note that this definition is symmetric in $A$ and $A^C$. So we also have $A^C \in \mathcal{M}$.
Finally, we have to show that $\mathcal{M}$ is closed under intersection (which is equivalent to being closed under union when we have complements). Suppose $A_1, A_2 \in \mathcal{M}$ and $B \subseteq E$. Then we have
\begin{align*}
\mu^*(B) &= \mu^*(B \cap A_1) + \mu^*(B \cap A_1^C) \\
&= \mu^*(B \cap A_1 \cap A_2) + \mu^*(B \cap A_1 \cap A_2^C) + \mu^*(B \cap A_1^C) \\
&= \mu^*(B \cap (A_1 \cap A_2)) + \mu^*(B \cap (A_1 \cap A_2)^C \cap A_1) + \mu^*(B \cap (A_1 \cap A_2)^C \cap A_1^C) \\
&= \mu^*(B \cap (A_1 \cap A_2)) + \mu^*(B \cap (A_1 \cap A_2)^C).
\end{align*}
So we have $A_1 \cap A_2 \in \mathcal{M}$. So $\mathcal{M}$ is an algebra.
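Claim. $\mathcal{M}$ is a σ-algebra, and $\mu^*$ is a measure on $\mathcal{M}$.
To show that $\mathcal{M}$ is a σ-algebra, we need to show that it is closed under countable unions. Since $\mathcal{M}$ is an algebra, any countable union of sets in $\mathcal{M}$ can be rewritten as a countable disjoint union of sets in $\mathcal{M}$, so it suffices to consider disjoint sequences. Let $(A_n)$ be a disjoint sequence of sets in $\mathcal{M}$. We want to show that $A = \bigcup_n A_n \in \mathcal{M}$ and $\mu^*(A) = \sum_n \mu^*(A_n)$.
Suppose that $B \subseteq E$. Then we have
\begin{align*}
\mu^*(B) &= \mu^*(B \cap A_1) + \mu^*(B \cap A_1^C) \\
&= \mu^*(B \cap A_1) + \mu^*(B \cap A_2) + \mu^*(B \cap A_1^C \cap A_2^C) \\
&= \cdots \\
&= \sum_{i=1}^n \mu^*(B \cap A_i) + \mu^*(B \cap A_1^C \cap \cdots \cap A_n^C) \\
&\geq \sum_{i=1}^n \mu^*(B \cap A_i) + \mu^*(B \cap A^C),
\end{align*}
where the second equality uses the fact that $A_2 \in \mathcal{M}$ and $A_1 \cap A_2 = \emptyset$, and the final inequality holds because $\mu^*$ is increasing.
Taking the limit as $n \to \infty$, we have
\[ \mu^*(B) \geq \sum_{i=1}^\infty \mu^*(B \cap A_i) + \mu^*(B \cap A^C). \]
By the countable subadditivity of $\mu^*$, we have
\[ \mu^*(B \cap A) \leq \sum_{i=1}^\infty \mu^*(B \cap A_i). \]
Thus we obtain
\[ \mu^*(B) \geq \mu^*(B \cap A) + \mu^*(B \cap A^C). \]
By countable subadditivity, we also have inequality in the other direction. So equality holds. So $A \in \mathcal{M}$. So $\mathcal{M}$ is a σ-algebra.
To see that $\mu^*$ is a measure on $\mathcal{M}$, note that the above implies that
\[ \mu^*(B) = \sum_{i=1}^\infty \mu^*(B \cap A_i) + \mu^*(B \cap A^C). \]
Taking $B = A$, this gives
\[ \mu^*(A) = \sum_{i=1}^\infty \mu^*(A \cap A_i) + \mu^*(A \cap A^C) = \sum_{i=1}^\infty \mu^*(A_i). \]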
Note that when $\mathcal{A}$ itself is actually a σ-algebra, the outer measure can be simply written as
\[ \mu^*(B) = \inf\{\mu(A) : A \in \mathcal{A}, B \subseteq A\}. \]
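The construction in the proof can be carried out by hand on a tiny example. The sketch below takes a four-point set with a hypothetical two-block family (which is already a σ-algebra, so the simplified formula for the outer measure just stated applies), computes $\mu^*$ on every subset, and then finds the $\mu^*$-measurable sets by checking the Caratheodory condition against every test set $B$. All names and numerical values are assumptions made for illustration.

```python
from itertools import combinations

def powerset(E):
    """All subsets of E as frozensets."""
    items = sorted(E)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

E = frozenset({0, 1, 2, 3})
# Hypothetical family (a sigma-algebra on E) and additive set function on it.
mu = {frozenset(): 0, frozenset({0, 1}): 1, frozenset({2, 3}): 2, E: 3}

def outer(B):
    """mu*(B) = min of mu(A) over sets A in the family that contain B."""
    return min(value for A, value in mu.items() if B <= A)

def is_measurable(A):
    """Caratheodory condition: mu*(B) = mu*(B & A) + mu*(B - A) for every B."""
    return all(outer(B) == outer(B & A) + outer(B - A) for B in powerset(E))

measurable = {A for A in powerset(E) if is_measurable(A)}

assert outer(frozenset({0})) == 1        # {0} can only be covered by {0, 1} or E
assert measurable == set(mu)             # exactly the original four sets
assert not is_measurable(frozenset({0})) # splitting B = {0, 1} along {0} costs 1 + 1 > 1
```

Here the measurable sets are exactly the sets we started with: the outer measure cannot tell the points 0 and 1 apart, so no set separating them passes the test.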
Caratheodory gives us the existence of some measure extending the set function on $\mathcal{A}$. Could there be many? In general, there could. However, in the special case where the measure is finite, we do get uniqueness.
Theorem. Suppose that $\mu_1, \mu_2$ are measures on $(E, \mathcal{E})$ with $\mu_1(E) = \mu_2(E) < \infty$. If $\mathcal{A}$ is a π-system with $\sigma(\mathcal{A}) = \mathcal{E}$, and $\mu_1$ agrees with $\mu_2$ on $\mathcal{A}$, then $\mu_1 = \mu_2$.
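Proof. Let
\[ \mathcal{D} = \{A \in \mathcal{E} : \mu_1(A) = \mu_2(A)\}. \]
We know that $\mathcal{D} \supseteq \mathcal{A}$. By Dynkin’s lemma, it suffices to show that $\mathcal{D}$ is a d-system. The things to check are:
(i) $E \in \mathcal{D}$: this follows by assumption.
(ii) If $A, B \in \mathcal{D}$ with $A \subseteq B$, then $B \setminus A \in \mathcal{D}$. Indeed, we have the equations
\begin{align*}
\mu_1(B) &= \mu_1(A) + \mu_1(B \setminus A) < \infty \\
\mu_2(B) &= \mu_2(A) + \mu_2(B \setminus A) < \infty.
\end{align*}
Since $\mu_1(B) = \mu_2(B)$ and $\mu_1(A) = \mu_2(A)$, and all of these values are finite, we must have $\mu_1(B \setminus A) = \mu_2(B \setminus A)$.
(iii) Let $(A_n) \subseteq \mathcal{D}$ be an increasing sequence with $\bigcup A_n = A$. Then
\[ \mu_1(A) = \lim_{n \to \infty} \mu_1(A_n) = \lim_{n \to \infty} \mu_2(A_n) = \mu_2(A). \]
So $A \in \mathcal{D}$.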
The assumption that $\mu_1(E) = \mu_2(E) < \infty$ is necessary. The theorem does not necessarily hold without it. We can see this from a simple counterexample:
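Example. Let $E = \mathbb{Z}$, and let $\mathcal{E} = P(E)$. We let
\[ \mathcal{A} = \{\{x, x + 1, x + 2, \cdots\} : x \in E\} \cup \{\emptyset\}. \]
This is a π-system with $\sigma(\mathcal{A}) = \mathcal{E}$. We let $\mu_1(A)$ be the number of elements in $A$, and $\mu_2(A) = 2\mu_1(A)$. Then obviously $\mu_1 \neq \mu_2$, but $\mu_1(A) = \infty = \mu_2(A)$ for every non-empty $A \in \mathcal{A}$, and both measures vanish on $\emptyset$.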
Definition (Borel σ-algebra). Let $E$ be a topological space. We define the Borel σ-algebra as
\[ \mathcal{B}(E) = \sigma(\{U \subseteq E : U \text{ is open}\}). \]
We write $\mathcal{B}$ for $\mathcal{B}(\mathbb{R})$.
Definition (Borel measure and Radon measure). A measure $\mu$ on $(E, \mathcal{B}(E))$ is called a Borel measure. If $\mu(K) < \infty$ for all $K \subseteq E$ compact, then $\mu$ is called a Radon measure.
The most important example of a Borel measure we will consider is the
Lebesgue measure.
Theorem. There exists a unique Borel measure $\mu$ on $\mathbb{R}$ with $\mu([a, b]) = b - a$.
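Proof. We first show uniqueness. Suppose $\tilde{\mu}$ is another measure on $\mathcal{B}$ satisfying the above property. We want to apply the previous uniqueness theorem, but our measure is not finite. So we need to carefully get around that problem.
For each $n \in \mathbb{Z}$, we set
\begin{align*}
\mu_n(A) &= \mu(A \cap (n, n + 1]) \\
\tilde{\mu}_n(A) &= \tilde{\mu}(A \cap (n, n + 1]).
\end{align*}
Then $\mu_n$ and $\tilde{\mu}_n$ are finite measures on $\mathcal{B}$ which agree on the π-system of intervals of the form $(a, b]$ with $a, b \in \mathbb{R}$, $a < b$ (together with $\emptyset$), which generates $\mathcal{B}$. Therefore we have $\mu_n = \tilde{\mu}_n$ for all $n \in \mathbb{Z}$. Now we have
\[ \mu(A) = \sum_{n \in \mathbb{Z}} \mu(A \cap (n, n + 1]) = \sum_{n \in \mathbb{Z}} \mu_n(A) = \sum_{n \in \mathbb{Z}} \tilde{\mu}_n(A) = \tilde{\mu}(A) \]
for all Borel sets $A$.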
To show existence, we want to use the Caratheodory extension theorem. We let $\mathcal{A}$ be the collection of finite, disjoint unions of the form
\[ A = (a_1, b_1] \cup (a_2, b_2] \cup \cdots \cup (a_n, b_n]. \]
Then $\mathcal{A}$ is a ring of subsets of $\mathbb{R}$, and $\sigma(\mathcal{A}) = \mathcal{B}$ (details are to be checked on the first example sheet).
We set
\[ \mu(A) = \sum_{i=1}^n (b_i - a_i). \]
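We note that $\mu$ is well-defined, since if
\[ A = (a_1, b_1] \cup \cdots \cup (a_n, b_n] = (\tilde{a}_1, \tilde{b}_1] \cup \cdots \cup (\tilde{a}_m, \tilde{b}_m] \]
are two decompositions of $A$ into disjoint intervals, then
\[ \sum_{i=1}^n (b_i - a_i) = \sum_{i=1}^m (\tilde{b}_i - \tilde{a}_i). \]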
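The well-definedness claim can be illustrated numerically: two different disjoint decompositions of the same set give the same total length, because both reduce to the same canonical list of maximal intervals. This is a small sketch with hypothetical helper names, where a pair `(a, b)` stands for the half-open interval $(a, b]$.

```python
def normalise(intervals):
    """Merge half-open intervals (a, b] into a sorted list of maximal
    disjoint ones. Touching intervals such as (0, 1] and (1, 2] are merged."""
    merged = []
    for a, b in sorted((a, b) for a, b in intervals if a < b):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def total_length(disjoint_intervals):
    """Sum of (b - a) over a decomposition into disjoint intervals."""
    return sum(b - a for a, b in disjoint_intervals)

# Two different disjoint decompositions of the set (0, 2] union (3, 5].
first = [(0, 1), (1, 2), (3, 5)]
second = [(0, 2), (3, 4), (4, 5)]

assert normalise(first) == normalise(second) == [(0, 2), (3, 5)]
assert total_length(first) == total_length(second) == 4
```

The merge step is the computational counterpart of the observation that adjacent pieces $(a, c]$ and $(c, b]$ contribute $(c - a) + (b - c) = b - a$, the length of $(a, b]$.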
Also, if $A, B \in \mathcal{A}$, $A \cap B = \emptyset$ and $A \cup B \in \mathcal{A}$, we obviously have $\mu(A \cup B) = \mu(A) + \mu(B)$. So $\mu$ is additive.
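Finally, we have to show that $\mu$ is in fact countably additive. Let $(A_n)$ be a disjoint sequence in $\mathcal{A}$, and let $A = \bigcup_{n=1}^\infty A_n \in \mathcal{A}$. Then we need to show that $\mu(A) = \sum_{n=1}^\infty \mu(A_n)$.
Since $\mu$ is additive, we have
\begin{align*}
\mu(A) &= \mu(A_1) + \mu(A \setminus A_1) \\
&= \mu(A_1) + \mu(A_2) + \mu(A \setminus (A_1 \cup A_2)) \\
&= \sum_{i=1}^n \mu(A_i) + \mu\left(A \setminus \bigcup_{i=1}^n A_i\right).
\end{align*}
To finish the proof, we show that
\[ \mu\left(A \setminus \bigcup_{i=1}^n A_i\right) \to 0 \text{ as } n \to \infty. \]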
We are going to reduce this to the finite intersection property of compact sets in $\mathbb{R}$: if $(K_n)$ is a sequence of compact sets in $\mathbb{R}$ with the property that $\bigcap_{m=1}^n K_m \neq \emptyset$ for all $n$, then $\bigcap_{m=1}^\infty K_m \neq \emptyset$.
We first introduce some new notation. We let
\[ B_n = A \setminus \bigcup_{m=1}^n A_m. \]
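We now suppose, for contradiction, that $\mu(B_n) \not\to 0$ as $n \to \infty$. Since the $B_n$’s are decreasing, there must exist $\varepsilon > 0$ such that $\mu(B_n) \geq 2\varepsilon$ for every $n$.
For each $n$, we take $C_n \in \mathcal{A}$ with the property that the closure $\overline{C_n} \subseteq B_n$ and $\mu(B_n \setminus C_n) \leq \frac{\varepsilon}{2^n}$. This is possible since each $B_n$ is just a finite union of intervals. Thus we have
\begin{align*}
\mu(B_n) - \mu\left(\bigcap_{m=1}^n C_m\right) &= \mu\left(B_n \setminus \bigcap_{m=1}^n C_m\right) \\
&\leq \mu\left(\bigcup_{m=1}^n (B_m \setminus C_m)\right) \\
&\leq \sum_{m=1}^n \mu(B_m \setminus C_m) \\
&\leq \sum_{m=1}^n \frac{\varepsilon}{2^m} \leq \varepsilon.
\end{align*}
On the other hand, we also know that $\mu(B_n) \geq 2\varepsilon$. Thus
\[ \mu\left(\bigcap_{m=1}^n C_m\right) \geq \varepsilon \]
for all $n$. We now let $K_n = \bigcap_{m=1}^n \overline{C_m}$. Then each $K_n$ is compact and contains $\bigcap_{m=1}^n C_m$, which has $\mu$-value at least $\varepsilon$, and in particular $K_n \neq \emptyset$ for all $n$.
Thus, the finite intersection property says
\[ \emptyset \neq \bigcap_{n=1}^\infty K_n \subseteq \bigcap_{n=1}^\infty B_n = \emptyset. \]
This is a contradiction. So we have $\mu(B_n) \to 0$ as $n \to \infty$. So done.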
Definition (Lebesgue measure). The Lebesgue measure is the unique Borel measure $\mu$ on $\mathbb{R}$ with $\mu([a, b]) = b - a$.
Note that the Lebesgue measure is not a finite measure, since $\mu(\mathbb{R}) = \infty$. However, it is a σ-finite measure.
Definition (σ-finite measure). Let $(E, \mathcal{E})$ be a measurable space, and $\mu$ a measure. We say $\mu$ is σ-finite if there exists a sequence $(E_n)$ in $\mathcal{E}$ such that $\bigcup_n E_n = E$ and $\mu(E_n) < \infty$ for all $n$.
This is the next best thing we can hope for after finiteness, and often proofs that involve finiteness carry over to σ-finite measures.
Proposition. The Lebesgue measure is translation invariant, i.e.
\[ \mu(A + x) = \mu(A) \]
for all $A \in \mathcal{B}$ and $x \in \mathbb{R}$, where
\[ A + x = \{y + x : y \in A\}. \]
Proof. We use the uniqueness of the Lebesgue measure. We let
\[ \mu_x(A) = \mu(A + x) \]
for $A \in \mathcal{B}$. Then this is a measure on $\mathcal{B}$ satisfying $\mu_x([a, b]) = b - a$. So the uniqueness of the Lebesgue measure shows that $\mu_x = \mu$.
It turns out that translation invariance actually characterizes the Lebesgue
measure.
Proposition. Let $\tilde{\mu}$ be a Borel measure on $\mathbb{R}$ that is translation invariant and satisfies $\tilde{\mu}([0, 1]) = 1$. Then $\tilde{\mu}$ is the Lebesgue measure.
Proof. We show that any such measure must satisfy
\[ \tilde{\mu}([a, b]) = b - a. \]
By additivity and translation invariance, we can show that $\tilde{\mu}([p, q]) = q - p$ for all rational $p < q$. By considering $\tilde{\mu}([p, p + 1/n])$ for all $n$ and using the increasing property, we know $\tilde{\mu}(\{p\}) = 0$. So $\tilde{\mu}([p, q)) = \tilde{\mu}((p, q]) = \tilde{\mu}((p, q)) = q - p$ for all rational $p < q$.
Finally, by countable additivity, we can extend this to all real intervals. Then the result follows from the uniqueness of the Lebesgue measure.
In the proof of the Caratheodory extension theorem, we constructed a measure $\mu^*$ on the σ-algebra $\mathcal{M}$ of $\mu^*$-measurable sets which contains $\mathcal{A}$. This contains $\mathcal{B} = \sigma(\mathcal{A})$, but could in fact be bigger than it. We call $\mathcal{M}$ the Lebesgue σ-algebra.
Indeed, it can be given by
\[ \mathcal{M} = \{A \cup N : A \in \mathcal{B}, N \subseteq B \in \mathcal{B} \text{ with } \mu(B) = 0\}. \]
If $A \cup N \in \mathcal{M}$, then $\mu(A \cup N) = \mu(A)$. The proof is left for the example sheet.
It is also true that $\mathcal{M}$ is strictly larger than $\mathcal{B}$, so there exists $A \in \mathcal{M}$ with $A \notin \mathcal{B}$. Construction of such a set was on last year’s exam (2016).
On the other hand, it is also true that not all sets are Lebesgue measurable.
This is a rather funny construction.
Example. For $x, y \in [0, 1)$, we say $x \sim y$ if $x - y$ is rational. This defines an equivalence relation on $[0, 1)$. By the axiom of choice, we pick a representative of each equivalence class, and put them into a set $S \subseteq [0, 1)$. We will show that $S$ is not Lebesgue measurable.
Suppose that $S$ were Lebesgue measurable. We are going to get a contradiction to the countable additivity of the Lebesgue measure. For each rational $r \in [0, 1) \cap \mathbb{Q}$, we define
\[ S_r = \{s + r \bmod 1 : s \in S\}. \]
By translation invariance, we know $S_r$ is also Lebesgue measurable, and $\mu(S_r) = \mu(S)$.
Also, by construction of $S$, we know the sets $S_r$ for $r \in [0, 1) \cap \mathbb{Q}$ are disjoint, and their union is $[0, 1)$. Now by countable additivity, we have
\[ 1 = \mu([0, 1)) = \mu\left(\bigcup_{r \in [0, 1) \cap \mathbb{Q}} S_r\right) = \sum_{r \in [0, 1) \cap \mathbb{Q}} \mu(S_r) = \sum_{r \in [0, 1) \cap \mathbb{Q}} \mu(S), \]
which is clearly not possible. Indeed, if $\mu(S) = 0$, then this says $1 = 0$; if $\mu(S) > 0$, then this says $1 = \infty$. Both are absurd.