II Probability and Measure

1 Measures

1.1 Measures

The starting point of all this is to come up with a function that determines the “size” of a given set, known as a measure. It turns out we cannot sensibly define a size for all subsets of $[0, 1]$. Thus, we need to restrict our attention to a collection of “nice” subsets. Specifying which subsets are “nice” would involve specifying a σ-algebra.

This section is mostly technical.

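Definition (σ-algebra). Let $E$ be a set. A σ-algebra $\mathcal{E}$ on $E$ is a collection of subsets of $E$ such that

(i) $\emptyset \in \mathcal{E}$.

(ii) $A \in \mathcal{E}$ implies that $A^C = E \setminus A \in \mathcal{E}$.

(iii) For any sequence $(A_n)$ in $\mathcal{E}$, we have that $\bigcup_n A_n \in \mathcal{E}$.

The pair $(E, \mathcal{E})$ is called a measurable space.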

Note that the axioms imply that σ-algebras are also closed under countable intersections, as we have $A \cap B = (A^C \cup B^C)^C$.

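Definition (Measure). A measure on a measurable space $(E, \mathcal{E})$ is a function $\mu : \mathcal{E} \to [0, \infty]$ such that

(i) $\mu(\emptyset) = 0$.

(ii) Countable additivity: For any disjoint sequence $(A_n)$ in $\mathcal{E}$, we have
\[ \mu\left(\bigcup_n A_n\right) = \sum_{n=1}^\infty \mu(A_n). \]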

Example. Let $E$ be any countable set, and $\mathcal{E} = P(E)$ be the set of all subsets of $E$. A mass function is any function $m : E \to [0, \infty]$. We can then define a measure by setting
\[ \mu(A) = \sum_{x \in A} m(x). \]
In particular, if we put $m(x) = 1$ for all $x \in E$, then we obtain the counting measure.
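The construction of a measure from a mass function can be tried out on a small finite set. The following is a minimal sketch, not part of the original notes: the helper name `measure_from_mass` and the three-point set are assumptions made purely for illustration, and exact fractions are used so that additivity can be checked without rounding.

```python
from fractions import Fraction


def measure_from_mass(mass):
    """Return the set function mu(A) = sum of mass[x] over x in A.

    `mass` is a dict from points of E to non-negative weights
    (a hypothetical encoding of a mass function on a finite set).
    """
    def mu(subset):
        return sum((mass[x] for x in subset), Fraction(0))
    return mu


# Illustrative three-point space; the weights are arbitrary.
E = ["a", "b", "c"]
mass = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
mu = measure_from_mass(mass)

# Putting m(x) = 1 for every x gives the counting measure.
counting = measure_from_mass({x: Fraction(1) for x in E})
```

On disjoint sets the values add, for instance `mu({"a", "b"}) + mu({"c"})` equals `mu({"a", "b", "c"})`, and `counting` simply returns the number of elements.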

Countable spaces are nice, because we can always take $\mathcal{E} = P(E)$, and the measure can be defined on all possible subsets. However, for “bigger” spaces, we have to be more careful. The set of all subsets is often “too large”. We will see a concrete and also important example of this later.

In general, σ-algebras on large spaces are often described in terms of a smaller collection of sets, known as the generating sets.

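Definition (Generator of σ-algebra). Let $E$ be a set, and let $\mathcal{A} \subseteq P(E)$ be a collection of subsets of $E$. We define
\[ \sigma(\mathcal{A}) = \{A \subseteq E : A \in \mathcal{E} \text{ for all σ-algebras } \mathcal{E} \text{ that contain } \mathcal{A}\}. \]
In other words, $\sigma(\mathcal{A})$ is the smallest σ-algebra that contains $\mathcal{A}$. This is known as the σ-algebra generated by $\mathcal{A}$.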
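On a finite set, the generated σ-algebra can be computed directly, since countable unions reduce to finite ones. The sketch below is an illustration under that finiteness assumption only; the function name `generated_sigma_algebra` is hypothetical. It repeatedly closes the generators under complements and pairwise unions until nothing new appears.

```python
def generated_sigma_algebra(E, generators):
    """Smallest collection containing the generators that is closed
    under complement and union (finite E only, so this is sigma(A))."""
    E = frozenset(E)
    sets = {frozenset(), E} | {frozenset(g) for g in generators}
    while True:
        # One closure step: all complements and all pairwise unions.
        new = {E - a for a in sets} | {a | b for a in sets for b in sets}
        if new <= sets:
            return sets
        sets |= new


# Generators {1} and {1, 2} split {1, 2, 3, 4} into atoms {1}, {2}, {3, 4}.
sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])

# Singletons generate the full power set.
full = generated_sigma_algebra({1, 2, 3, 4}, [{x} for x in (1, 2, 3, 4)])
```

Here `sigma` consists of all unions of the three atoms, and `full` is the whole power set, matching the singleton example in the text.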

Example. Take $E = \mathbb{Z}$, and $\mathcal{A} = \{\{x\} : x \in \mathbb{Z}\}$. Then $\sigma(\mathcal{A})$ is just $P(E)$, since every subset of $E$ can be written as a countable union of singletons.

Example. Take $E = \mathbb{Z}$, and let $\mathcal{A} = \{\{x, x + 1, x + 2, x + 3, \cdots\} : x \in E\}$. Then again $\sigma(\mathcal{A})$ is the set of all subsets of $E$.

The following is the most important σ-algebra in the course:

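Definition (Borel σ-algebra). Let $E = \mathbb{R}$, and $\mathcal{A} = \{U \subseteq \mathbb{R} : U \text{ is open}\}$. Then $\sigma(\mathcal{A})$ is known as the Borel σ-algebra, which is not the set of all subsets of $\mathbb{R}$.

We can equivalently define this by $\tilde{\mathcal{A}} = \{(a, b) : a < b, \; a, b \in \mathbb{Q}\}$. Then $\sigma(\tilde{\mathcal{A}})$ is also the Borel σ-algebra.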

Often, we would like to prove results that allow us to deduce properties of the σ-algebra just by checking them on a generating set. However, usually, we cannot just check them on an arbitrary generating set. Instead, the generating set has to satisfy some nice closure properties. We are now going to introduce a number of definitions that you need not aim to remember (except when exams are near).

Definition (π-system). Let $\mathcal{A}$ be a collection of subsets of $E$. Then $\mathcal{A}$ is called a π-system if

(i) $\emptyset \in \mathcal{A}$.

(ii) If $A, B \in \mathcal{A}$, then $A \cap B \in \mathcal{A}$.

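Definition (d-system). Let $\mathcal{A}$ be a collection of subsets of $E$. Then $\mathcal{A}$ is called a d-system if

(i) $E \in \mathcal{A}$.

(ii) If $A, B \in \mathcal{A}$ and $A \subseteq B$, then $B \setminus A \in \mathcal{A}$.

(iii) For all increasing sequences $(A_n)$ in $\mathcal{A}$, we have that $\bigcup_n A_n \in \mathcal{A}$.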

The point of d-systems and π-systems is that they separate the axioms of a σ-algebra into two parts. More precisely, we have

Proposition. A collection $\mathcal{A}$ is a σ-algebra if and only if it is both a π-system and a d-system.

This follows rather straightforwardly from the definitions.
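For reference, the less obvious direction can be sketched as follows. This is a reconstruction of the routine verification, not taken from the notes; it assumes the `amsmath` package and writes $\mathcal{A}$ for a collection that is both a π-system and a d-system on $E$.

```latex
% Sketch: a pi-system that is also a d-system is a sigma-algebra.
\begin{align*}
  A \in \mathcal{A}
    &\implies A^C = E \setminus A \in \mathcal{A}
    && \text{(d-system axioms (i), (ii), since } A \subseteq E\text{)} \\
  A, B \in \mathcal{A}
    &\implies A \cup B = (A^C \cap B^C)^C \in \mathcal{A}
    && \text{(complements and closure under } \cap\text{)} \\
  (A_n) \text{ in } \mathcal{A}
    &\implies \bigcup_n A_n = \bigcup_n \,(A_1 \cup \cdots \cup A_n) \in \mathcal{A}
    && \text{(increasing union, d-system axiom (iii))}
\end{align*}
% Conversely, a sigma-algebra is closed under countable intersections
% and complements, and B \setminus A = B \cap A^C, so it is both a
% pi-system and a d-system.
```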

The following definitions are also useful:

Definition (Ring). A collection of subsets $\mathcal{A}$ is a ring on $E$ if $\emptyset \in \mathcal{A}$ and for all $A, B \in \mathcal{A}$, we have $B \setminus A \in \mathcal{A}$ and $A \cup B \in \mathcal{A}$.

Definition (Algebra). A collection of subsets $\mathcal{A}$ is an algebra on $E$ if $\emptyset \in \mathcal{A}$ and for all $A, B \in \mathcal{A}$, we have $A^C \in \mathcal{A}$ and $A \cup B \in \mathcal{A}$.

So an algebra is like a σ-algebra, but it is only required to be closed under finite unions, rather than countable unions.

While the names π-system and d-system are rather arbitrary, we can make some sense of the names “ring” and “algebra”. Indeed, a ring forms a ring (without unity) in the algebraic sense, with symmetric difference as “addition” and intersection as “multiplication”. Then the empty set acts as the additive identity, and $E$, if present, acts as the multiplicative identity. Similarly, an algebra is a boolean subalgebra of the boolean algebra $P(E)$.
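The algebraic ring structure can be checked by brute force on a small power set. The sketch below is only an illustration on the three-point set {0, 1, 2}; the names `powerset`, `add`, `mul` and `is_ring_with_unity` are hypothetical helpers, with `add` being symmetric difference and `mul` being intersection.

```python
from itertools import combinations


def powerset(elements):
    """All subsets of a finite collection, as frozensets."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def add(a, b):
    return a ^ b  # symmetric difference plays the role of addition


def mul(a, b):
    return a & b  # intersection plays the role of multiplication


def is_ring_with_unity(sets, zero, one):
    """Brute-force check of the commutative ring axioms on `sets`."""
    for a in sets:
        if add(a, zero) != a or add(a, a) != zero or mul(a, one) != a:
            return False
        for b in sets:
            if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
                return False
            for c in sets:
                if add(add(a, b), c) != add(a, add(b, c)):
                    return False
                if mul(mul(a, b), c) != mul(a, mul(b, c)):
                    return False
                if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
                    return False
    return True


E = frozenset({0, 1, 2})
P = powerset(E)
```

Calling `is_ring_with_unity(P, frozenset(), E)` confirms that the empty set is the additive identity, each set is its own additive inverse, and `E` is the multiplicative identity.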

A very important lemma about these things is Dynkin’s lemma:

Lemma (Dynkin’s π-system lemma). Let $\mathcal{A}$ be a π-system. Then any d-system which contains $\mathcal{A}$ contains $\sigma(\mathcal{A})$.

This will be very useful in the future. If we want to show that all elements of $\sigma(\mathcal{A})$ satisfy a particular property for some generating π-system $\mathcal{A}$, we just have to show that the elements of $\mathcal{A}$ satisfy that property, and that the collection of things that satisfy the property forms a d-system.

While this use case might seem rather contrived, it is surprisingly common

when we have to prove things.

Proof. Let $\mathcal{D}$ be the intersection of all d-systems containing $\mathcal{A}$, i.e. the smallest d-system containing $\mathcal{A}$. We show that $\mathcal{D}$ contains $\sigma(\mathcal{A})$. To do so, we will show that $\mathcal{D}$ is a π-system, hence a σ-algebra.

There are two steps to the proof, both of which are straightforward verifications:

(i) We first show that if $B \in \mathcal{D}$ and $A \in \mathcal{A}$, then $B \cap A \in \mathcal{D}$.

(ii) We then show that if $A, B \in \mathcal{D}$, then $A \cap B \in \mathcal{D}$.

Then the result immediately follows from the second part.

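We let
\[ \mathcal{D}' = \{B \in \mathcal{D} : B \cap A \in \mathcal{D} \text{ for all } A \in \mathcal{A}\}. \]
We note that $\mathcal{D}' \supseteq \mathcal{A}$ because $\mathcal{A}$ is a π-system, and is hence closed under intersections. We check that $\mathcal{D}'$ is a d-system. It is clear that $E \in \mathcal{D}'$. If we have $B_1, B_2 \in \mathcal{D}'$, where $B_1 \subseteq B_2$, then for any $A \in \mathcal{A}$, we have
\[ (B_2 \setminus B_1) \cap A = (B_2 \cap A) \setminus (B_1 \cap A). \]
By definition of $\mathcal{D}'$, we know $B_2 \cap A$ and $B_1 \cap A$ are elements of $\mathcal{D}$. Since $\mathcal{D}$ is a d-system, we know this set is in $\mathcal{D}$. So $B_2 \setminus B_1 \in \mathcal{D}'$.

Finally, suppose that $(B_n)$ is an increasing sequence in $\mathcal{D}'$, with $B = \bigcup_n B_n$. Then for every $A \in \mathcal{A}$, we have that
\[ \left(\bigcup_n B_n\right) \cap A = \bigcup_n (B_n \cap A) = B \cap A \in \mathcal{D}. \]
Therefore $B \in \mathcal{D}'$.

Therefore $\mathcal{D}'$ is a d-system contained in $\mathcal{D}$, which also contains $\mathcal{A}$. By our choice of $\mathcal{D}$, we know $\mathcal{D}' = \mathcal{D}$.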

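We now let
\[ \mathcal{D}'' = \{B \in \mathcal{D} : B \cap A \in \mathcal{D} \text{ for all } A \in \mathcal{D}\}. \]
Since $\mathcal{D}' = \mathcal{D}$, we again have $\mathcal{A} \subseteq \mathcal{D}''$, and the same argument as above implies that $\mathcal{D}''$ is a d-system which is between $\mathcal{A}$ and $\mathcal{D}$. But the only way that can happen is if $\mathcal{D}'' = \mathcal{D}$, and this implies that $\mathcal{D}$ is a π-system.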

After defining all sorts of things that are “weaker versions” of σ-algebras, we now define a bunch of measure-like objects that satisfy fewer properties. Again, no one really remembers these definitions:

Definition (Set function). Let $\mathcal{A}$ be a collection of subsets of $E$ with $\emptyset \in \mathcal{A}$. A set function is a function $\mu : \mathcal{A} \to [0, \infty]$ such that $\mu(\emptyset) = 0$.

Definition

(Increasing set function)

A set function is increasing if it has the

property that for all A, B ∈ A with A ⊆ B, we have µ(A) ≤ µ(B).

Definition

(Additive set function)

A set function is additive if whenever

A, B ∈ A and A ∪ B ∈ A, A ∩ B = ∅, then µ(A ∪ B) = µ(A) + µ(B).

Definition (Countably additive set function). A set function is countably additive if whenever $(A_n)$ is a sequence of disjoint sets in $\mathcal{A}$ with $\bigcup_n A_n \in \mathcal{A}$, then
\[ \mu\left(\bigcup_n A_n\right) = \sum_n \mu(A_n). \]
Under these definitions, a measure is just a countably additive set function defined on a σ-algebra.

Definition (Countably subadditive set function). A set function is countably subadditive if whenever $(A_n)$ is a sequence of sets in $\mathcal{A}$ with $\bigcup_n A_n \in \mathcal{A}$, then
\[ \mu\left(\bigcup_n A_n\right) \leq \sum_n \mu(A_n). \]

The big theorem that allows us to construct measures is the Caratheodory

extension theorem. In particular, this will help us construct the Lebesgue measure

on R.

Theorem (Caratheodory extension theorem). Let $\mathcal{A}$ be a ring on $E$, and $\mu$ a countably additive set function on $\mathcal{A}$. Then $\mu$ extends to a measure on the σ-algebra generated by $\mathcal{A}$.

Proof. (non-examinable) We start by defining what we want our measure to be. For $B \subseteq E$, we set
\[ \mu^*(B) = \inf\left\{\sum_n \mu(A_n) : (A_n) \text{ in } \mathcal{A} \text{ and } B \subseteq \bigcup_n A_n\right\}. \]
If it happens that there is no such sequence, we set this to be $\infty$. This set function is known as the outer measure. It is clear that $\mu^*(\emptyset) = 0$, and that $\mu^*$ is increasing.

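We say a set $A \subseteq E$ is $\mu^*$-measurable if
\[ \mu^*(B) = \mu^*(B \cap A) + \mu^*(B \cap A^C) \]
for all $B \subseteq E$. We let
\[ \mathcal{M} = \{\mu^*\text{-measurable sets}\}. \]
We will show the following:

(i) $\mathcal{M}$ is a σ-algebra containing $\mathcal{A}$.

(ii) $\mu^*$ is a measure on $\mathcal{M}$ with $\mu^*|_{\mathcal{A}} = \mu$.

Note that it is not true in general that $\mathcal{M} = \sigma(\mathcal{A})$. However, we will always have $\mathcal{M} \supseteq \sigma(\mathcal{A})$.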

We are going to break this up into five nice bite-size chunks.

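Claim. $\mu^*$ is countably subadditive.

Suppose $B \subseteq \bigcup_n B_n$. We need to show that $\mu^*(B) \leq \sum_n \mu^*(B_n)$. We can wlog assume that $\mu^*(B_n)$ is finite for all $n$, or else the inequality is trivial. Let $\varepsilon > 0$. Then by definition of the outer measure, for each $n$, we can find a sequence $(B_{n,m})_{m=1}^\infty$ in $\mathcal{A}$ with the property that
\[ B_n \subseteq \bigcup_m B_{n,m} \]
and
\[ \mu^*(B_n) + \frac{\varepsilon}{2^n} \geq \sum_m \mu(B_{n,m}). \]
Then we have
\[ B \subseteq \bigcup_n B_n \subseteq \bigcup_{n,m} B_{n,m}. \]
Thus, by definition, we have
\[ \mu^*(B) \leq \sum_{n,m} \mu(B_{n,m}) \leq \sum_n \left(\mu^*(B_n) + \frac{\varepsilon}{2^n}\right) = \varepsilon + \sum_n \mu^*(B_n). \]
Since $\varepsilon$ was arbitrary, we are done.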

Claim. $\mu^*$ agrees with $\mu$ on $\mathcal{A}$.

In the first example sheet, we will show that if $\mathcal{A}$ is a ring and $\mu$ is a countably additive set function on $\mathcal{A}$, then $\mu$ is in fact countably subadditive and increasing.

Assuming this, suppose that $A, (A_n)$ are in $\mathcal{A}$ and $A \subseteq \bigcup_n A_n$. Then by subadditivity, we have
\[ \mu(A) \leq \sum_n \mu(A \cap A_n) \leq \sum_n \mu(A_n), \]
using that $\mu$ is countably subadditive and increasing. Note that we have to do this in two steps, rather than just applying countable subadditivity, since we did not assume that $\bigcup_n A_n \in \mathcal{A}$. Taking the infimum over all sequences, we have
\[ \mu(A) \leq \mu^*(A). \]
Also, we see by definition that $\mu(A) \geq \mu^*(A)$, since $A$ covers $A$. So we get that $\mu(A) = \mu^*(A)$ for all $A \in \mathcal{A}$.

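Claim. $\mathcal{M}$ contains $\mathcal{A}$.

Suppose that $A \in \mathcal{A}$ and $B \subseteq E$. We need to show that
\[ \mu^*(B) = \mu^*(B \cap A) + \mu^*(B \cap A^C). \]
Since $\mu^*$ is countably subadditive, we immediately have $\mu^*(B) \leq \mu^*(B \cap A) + \mu^*(B \cap A^C)$. For the other inequality, we first observe that it is trivial if $\mu^*(B)$ is infinite. If it is finite, then by definition, given $\varepsilon > 0$, we can find some $(B_n)$ in $\mathcal{A}$ such that $B \subseteq \bigcup_n B_n$ and
\[ \mu^*(B) + \varepsilon \geq \sum_n \mu(B_n). \]
Then we have
\[ B \cap A \subseteq \bigcup_n (B_n \cap A), \qquad B \cap A^C \subseteq \bigcup_n (B_n \cap A^C). \]
We notice that $B_n \cap A^C = B_n \setminus A \in \mathcal{A}$. Thus, by definition of $\mu^*$, we have
\[
\begin{aligned}
\mu^*(B \cap A) + \mu^*(B \cap A^C) &\leq \sum_n \mu(B_n \cap A) + \sum_n \mu(B_n \cap A^C) \\
&= \sum_n \left(\mu(B_n \cap A) + \mu(B_n \cap A^C)\right) \\
&= \sum_n \mu(B_n) \\
&\leq \mu^*(B) + \varepsilon.
\end{aligned}
\]
Since $\varepsilon$ was arbitrary, the result follows.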

Claim. $\mathcal{M}$ is an algebra.

We first show that $E \in \mathcal{M}$. This is true since we obviously have
\[ \mu^*(B) = \mu^*(B \cap E) + \mu^*(B \cap E^C) \]
for all $B \subseteq E$.

Next, note that if $A \in \mathcal{M}$, then by definition we have, for all $B$,
\[ \mu^*(B) = \mu^*(B \cap A) + \mu^*(B \cap A^C). \]
Now note that this definition is symmetric in $A$ and $A^C$. So we also have $A^C \in \mathcal{M}$.

Finally, we have to show that $\mathcal{M}$ is closed under intersection (which is equivalent to being closed under union when we have complements). Suppose $A_1, A_2 \in \mathcal{M}$ and $B \subseteq E$. Then we have
\[
\begin{aligned}
\mu^*(B) &= \mu^*(B \cap A_1) + \mu^*(B \cap A_1^C) \\
&= \mu^*(B \cap A_1 \cap A_2) + \mu^*(B \cap A_1 \cap A_2^C) + \mu^*(B \cap A_1^C) \\
&= \mu^*(B \cap (A_1 \cap A_2)) + \mu^*(B \cap (A_1 \cap A_2)^C \cap A_1) + \mu^*(B \cap (A_1 \cap A_2)^C \cap A_1^C) \\
&= \mu^*(B \cap (A_1 \cap A_2)) + \mu^*(B \cap (A_1 \cap A_2)^C).
\end{aligned}
\]
So we have $A_1 \cap A_2 \in \mathcal{M}$. So $\mathcal{M}$ is an algebra.

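Claim. $\mathcal{M}$ is a σ-algebra, and $\mu^*$ is a measure on $\mathcal{M}$.

To show that $\mathcal{M}$ is a σ-algebra, we need to show that it is closed under countable unions. Since $\mathcal{M}$ is an algebra, any countable union of sets in $\mathcal{M}$ can be rewritten as a countable union of disjoint sets in $\mathcal{M}$, so it suffices to consider disjoint sequences. We let $(A_n)$ be a disjoint collection of sets in $\mathcal{M}$. Then we want to show that $A = \bigcup_n A_n \in \mathcal{M}$ and $\mu^*(A) = \sum_n \mu^*(A_n)$.

Suppose that $B \subseteq E$. Then we have
\[ \mu^*(B) = \mu^*(B \cap A_1) + \mu^*(B \cap A_1^C). \]
Using the fact that $A_2 \in \mathcal{M}$ and $A_1 \cap A_2 = \emptyset$, we have
\[
\begin{aligned}
\mu^*(B) &= \mu^*(B \cap A_1) + \mu^*(B \cap A_2) + \mu^*(B \cap A_1^C \cap A_2^C) \\
&= \cdots \\
&= \sum_{i=1}^n \mu^*(B \cap A_i) + \mu^*(B \cap A_1^C \cap \cdots \cap A_n^C) \\
&\geq \sum_{i=1}^n \mu^*(B \cap A_i) + \mu^*(B \cap A^C).
\end{aligned}
\]
Taking the limit as $n \to \infty$, we have
\[ \mu^*(B) \geq \sum_{i=1}^\infty \mu^*(B \cap A_i) + \mu^*(B \cap A^C). \]
By the countable subadditivity of $\mu^*$, we have
\[ \mu^*(B \cap A) \leq \sum_{i=1}^\infty \mu^*(B \cap A_i). \]
Thus we obtain
\[ \mu^*(B) \geq \mu^*(B \cap A) + \mu^*(B \cap A^C). \]
By countable subadditivity, we also have the inequality in the other direction. So equality holds. So $A \in \mathcal{M}$. So $\mathcal{M}$ is a σ-algebra.

To see that $\mu^*$ is a measure on $\mathcal{M}$, note that the above implies that
\[ \mu^*(B) = \sum_{i=1}^\infty \mu^*(B \cap A_i) + \mu^*(B \cap A^C). \]
Taking $B = A$, this gives
\[ \mu^*(A) = \sum_{i=1}^\infty \mu^*(A \cap A_i) + \mu^*(A \cap A^C) = \sum_{i=1}^\infty \mu^*(A_i). \]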

Note that when $\mathcal{A}$ itself is actually a σ-algebra, the outer measure can be simply written as
\[ \mu^*(B) = \inf\{\mu(A) : A \in \mathcal{A}, \; B \subseteq A\}. \]
Caratheodory gives us the existence of some measure extending the set function $\mu$. Could there be many? In general, there could. However, in the special case where the measure is finite, we do get uniqueness.

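Theorem. Suppose that $\mu_1, \mu_2$ are measures on $(E, \mathcal{E})$ with $\mu_1(E) = \mu_2(E) < \infty$. If $\mathcal{A}$ is a π-system with $\sigma(\mathcal{A}) = \mathcal{E}$, and $\mu_1$ agrees with $\mu_2$ on $\mathcal{A}$, then $\mu_1 = \mu_2$.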

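Proof. Let
\[ \mathcal{D} = \{A \in \mathcal{E} : \mu_1(A) = \mu_2(A)\}. \]
We know that $\mathcal{D} \supseteq \mathcal{A}$. By Dynkin’s lemma, it suffices to show that $\mathcal{D}$ is a d-system. The things to check are:

(i) $E \in \mathcal{D}$: this follows by assumption.

(ii) If $A, B \in \mathcal{D}$ with $A \subseteq B$, then $B \setminus A \in \mathcal{D}$. Indeed, we have the equations
\[
\begin{aligned}
\mu_1(B) &= \mu_1(A) + \mu_1(B \setminus A) < \infty, \\
\mu_2(B) &= \mu_2(A) + \mu_2(B \setminus A) < \infty.
\end{aligned}
\]
Since $\mu_1(B) = \mu_2(B)$ and $\mu_1(A) = \mu_2(A)$, we must have $\mu_1(B \setminus A) = \mu_2(B \setminus A)$.

(iii) Let $(A_n)$ be an increasing sequence in $\mathcal{D}$ with $\bigcup_n A_n = A$. Then
\[ \mu_1(A) = \lim_{n \to \infty} \mu_1(A_n) = \lim_{n \to \infty} \mu_2(A_n) = \mu_2(A). \]
So $A \in \mathcal{D}$.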

The assumption that $\mu_1(E) = \mu_2(E) < \infty$ is necessary. The theorem does not necessarily hold without it. We can see this from a simple counterexample:

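Example. Let $E = \mathbb{Z}$, and let $\mathcal{E} = P(E)$. We let
\[ \mathcal{A} = \{\{x, x + 1, x + 2, \cdots\} : x \in E\} \cup \{\emptyset\}. \]
This is a π-system with $\sigma(\mathcal{A}) = \mathcal{E}$. We let $\mu_1(A)$ be the number of elements in $A$, and $\mu_2 = 2\mu_1$. Then obviously $\mu_1 \neq \mu_2$, but $\mu_1(A) = \infty = \mu_2(A)$ for all non-empty $A \in \mathcal{A}$, and both measures vanish on $\emptyset$.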
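The claims in this counterexample can be verified in a few lines. The following is a sketch of the routine checks, filled in here rather than taken from the notes; it assumes `amsmath` and writes $T_x = \{x, x+1, x+2, \cdots\}$, a notation introduced only for this check.

```latex
% T_x := {x, x+1, x+2, ...} is shorthand used only in this sketch.
\begin{align*}
  T_x \cap T_y &= T_{\max(x, y)}
    && \text{(so the collection is closed under intersection)} \\
  \{x\} &= T_x \setminus T_{x+1}
    && \text{(so every singleton lies in the generated $\sigma$-algebra)} \\
  \mu_1(T_x) &= \infty = 2 \cdot \infty = \mu_2(T_x)
    && \text{(agreement on the generating sets)} \\
  \mu_1(\{0\}) &= 1 \neq 2 = \mu_2(\{0\})
    && \text{(but the measures differ)}
\end{align*}
% Since every subset of Z is a countable union of singletons,
% the second line gives that the generated sigma-algebra is P(Z).
```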

Definition (Borel σ-algebra). Let $E$ be a topological space. We define the Borel σ-algebra as
\[ \mathcal{B}(E) = \sigma(\{U \subseteq E : U \text{ is open}\}). \]
We write $\mathcal{B}$ for $\mathcal{B}(\mathbb{R})$.

Definition (Borel measure and Radon measure). A measure $\mu$ on $(E, \mathcal{B}(E))$ is called a Borel measure. If $\mu(K) < \infty$ for all compact $K \subseteq E$, then $\mu$ is a Radon measure.

The most important example of a Borel measure we will consider is the

Lebesgue measure.

Theorem. There exists a unique Borel measure µ on R with µ([a, b]) = b − a.

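Proof. We first show uniqueness. Suppose $\tilde{\mu}$ is another measure on $\mathcal{B}$ satisfying the above property. We want to apply the previous uniqueness theorem, but our measure is not finite. So we need to carefully get around that problem.

For each $n \in \mathbb{Z}$, we set
\[
\begin{aligned}
\mu_n(A) &= \mu(A \cap (n, n + 1]), \\
\tilde{\mu}_n(A) &= \tilde{\mu}(A \cap (n, n + 1]).
\end{aligned}
\]
Then $\mu_n$ and $\tilde{\mu}_n$ are finite measures on $\mathcal{B}$ which agree on the π-system of intervals of the form $(a, b]$ with $a, b \in \mathbb{R}$, $a < b$ (together with $\emptyset$). Therefore we have $\mu_n = \tilde{\mu}_n$ for all $n \in \mathbb{Z}$. Now we have
\[ \mu(A) = \sum_{n \in \mathbb{Z}} \mu(A \cap (n, n + 1]) = \sum_{n \in \mathbb{Z}} \mu_n(A) = \sum_{n \in \mathbb{Z}} \tilde{\mu}_n(A) = \tilde{\mu}(A) \]
for all Borel sets $A$.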

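To show existence, we want to use the Caratheodory extension theorem. We let $\mathcal{A}$ be the collection of finite, disjoint unions of the form
\[ A = (a_1, b_1] \cup (a_2, b_2] \cup \cdots \cup (a_n, b_n]. \]
Then $\mathcal{A}$ is a ring of subsets of $\mathbb{R}$, and $\sigma(\mathcal{A}) = \mathcal{B}$ (details are to be checked on the first example sheet).

We set
\[ \mu(A) = \sum_{i=1}^n (b_i - a_i). \]
We note that $\mu$ is well-defined, since if
\[ A = (a_1, b_1] \cup \cdots \cup (a_n, b_n] = (\tilde{a}_1, \tilde{b}_1] \cup \cdots \cup (\tilde{a}_m, \tilde{b}_m], \]
then
\[ \sum_{i=1}^n (b_i - a_i) = \sum_{i=1}^m (\tilde{b}_i - \tilde{a}_i). \]
Also, if $A, B \in \mathcal{A}$, $A \cap B = \emptyset$ and $A \cup B \in \mathcal{A}$, we obviously have $\mu(A \cup B) = \mu(A) + \mu(B)$. So $\mu$ is additive.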
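The well-definedness and additivity of this length function can be explored numerically. The sketch below is an illustration only: a set in the ring is encoded as a list of pairs `(a, b)` standing for half-open intervals (a, b], and the helper names `normalise` and `length` are hypothetical. Merging touching or overlapping pieces first gives a canonical form, so different decompositions of the same set receive the same length.

```python
from fractions import Fraction


def normalise(intervals):
    """Merge half-open intervals (a, b] into a sorted disjoint list.

    (0, 1] and (1, 2] touch at 1, so they merge into (0, 2].
    """
    merged = []
    for a, b in sorted((a, b) for a, b in intervals if a < b):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def length(intervals):
    """Total length: the sum of b - a over the canonical decomposition."""
    return sum((b - a for a, b in normalise(intervals)), Fraction(0))


# Two decompositions of the same set (0, 2].
first = [(0, 1), (1, 2)]
second = [(0, 2)]

# Two disjoint members of the ring.
A = [(0, 1)]
B = [(3, 5)]
```

Both `length(first)` and `length(second)` give 2, and `length(A + B)` equals `length(A) + length(B)`, in line with the additivity claim.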

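Finally, we have to show that $\mu$ is in fact countably additive. Let $(A_n)$ be a disjoint sequence in $\mathcal{A}$, and let $A = \bigcup_{i=1}^\infty A_i \in \mathcal{A}$. Then we need to show that $\mu(A) = \sum_{n=1}^\infty \mu(A_n)$.

Since $\mu$ is additive, we have
\[
\begin{aligned}
\mu(A) &= \mu(A_1) + \mu(A \setminus A_1) \\
&= \mu(A_1) + \mu(A_2) + \mu(A \setminus (A_1 \cup A_2)) \\
&= \sum_{i=1}^n \mu(A_i) + \mu\left(A \setminus \bigcup_{i=1}^n A_i\right).
\end{aligned}
\]
To finish the proof, we show that
\[ \mu\left(A \setminus \bigcup_{i=1}^n A_i\right) \to 0 \text{ as } n \to \infty. \]
We are going to reduce this to the finite intersection property of compact sets in $\mathbb{R}$: if $(K_n)$ is a sequence of compact sets in $\mathbb{R}$ with the property that $\bigcap_{m=1}^n K_m \neq \emptyset$ for all $n$, then $\bigcap_{m=1}^\infty K_m \neq \emptyset$.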

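We first introduce some new notation. We let
\[ B_n = A \setminus \bigcup_{m=1}^n A_m. \]
We now suppose, for contradiction, that $\mu(B_n) \not\to 0$ as $n \to \infty$. Since the $B_n$’s are decreasing, there must exist $\varepsilon > 0$ such that $\mu(B_n) \geq 2\varepsilon$ for every $n$.

For each $n$, we take $C_n \in \mathcal{A}$ with the property that $\overline{C_n} \subseteq B_n$ and $\mu(B_n \setminus C_n) \leq \frac{\varepsilon}{2^n}$. This is possible since each $B_n$ is just a finite union of intervals. Thus we have
\[
\begin{aligned}
\mu(B_n) - \mu\left(\bigcap_{m=1}^n C_m\right) &= \mu\left(B_n \setminus \bigcap_{m=1}^n C_m\right) \\
&\leq \mu\left(\bigcup_{m=1}^n (B_m \setminus C_m)\right) \\
&\leq \sum_{m=1}^n \mu(B_m \setminus C_m) \\
&\leq \sum_{m=1}^n \frac{\varepsilon}{2^m} \\
&\leq \varepsilon.
\end{aligned}
\]
On the other hand, we also know that $\mu(B_n) \geq 2\varepsilon$. This implies that
\[ \mu\left(\bigcap_{m=1}^n C_m\right) \geq \varepsilon \]
for all $n$. We now let $K_n = \bigcap_{m=1}^n \overline{C_m}$. Then $K_n$ contains $\bigcap_{m=1}^n C_m$, which has measure at least $\varepsilon$, and in particular $K_n \neq \emptyset$ for all $n$.

Thus, the finite intersection property says
\[ \emptyset \neq \bigcap_{n=1}^\infty K_n \subseteq \bigcap_{n=1}^\infty B_n = \emptyset. \]
This is a contradiction. So we have $\mu(B_n) \to 0$ as $n \to \infty$. So done.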

Definition

(Lebesgue measure)

The Lebesgue measure is the unique Borel

measure µ on R with µ([a, b]) = b − a.

Note that the Lebesgue measure is not a finite measure, since $\mu(\mathbb{R}) = \infty$. However, it is a σ-finite measure.

Definition (σ-finite measure). Let $(E, \mathcal{E})$ be a measurable space, and $\mu$ a measure. We say $\mu$ is σ-finite if there exists a sequence $(E_n)$ in $\mathcal{E}$ such that $\bigcup_n E_n = E$ and $\mu(E_n) < \infty$ for all $n$.

This is the next best thing we can hope for after finiteness, and often proofs that involve finiteness carry over to σ-finite measures.
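As a quick worked check that the Lebesgue measure on $\mathbb{R}$ is σ-finite, one possible choice of exhausting sequence is the following (the particular sets $E_n = [-n, n]$ are an illustrative choice; any countable cover by bounded intervals would do):

```latex
\[
  E_n = [-n, n], \qquad
  \mu(E_n) = n - (-n) = 2n < \infty, \qquad
  \bigcup_{n=1}^{\infty} E_n = \mathbb{R}.
\]
```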

Proposition. The Lebesgue measure is translation invariant, i.e.

µ(A + x) = µ(A)

for all A ∈ B and x ∈ R, where

A + x = {y + x, y ∈ A}.

Proof. We use the uniqueness of the Lebesgue measure. We let
\[ \mu_x(A) = \mu(A + x) \]
for $A \in \mathcal{B}$. Then this is a measure on $\mathcal{B}$ satisfying $\mu_x([a, b]) = b - a$. So the uniqueness of the Lebesgue measure shows that $\mu_x = \mu$.
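The two facts used in this proof can be written out explicitly. The following is a sketch of the omitted computation, writing $\mu_x(A) = \mu(A + x)$ for the translated set function and assuming `amsmath`:

```latex
\begin{align*}
  \mu_x([a, b]) &= \mu([a, b] + x) = \mu([a + x, b + x])
                 = (b + x) - (a + x) = b - a, \\
  \mu_x\Bigl(\bigcup_n A_n\Bigr)
    &= \mu\Bigl(\bigcup_n (A_n + x)\Bigr)
     = \sum_n \mu(A_n + x)
     = \sum_n \mu_x(A_n)
    && \text{for disjoint } (A_n).
\end{align*}
% The second line uses that translates of disjoint sets are disjoint,
% and that the translate of a Borel set is again a Borel set.
```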

It turns out that translation invariance actually characterizes the Lebesgue

measure.

Proposition. Let $\tilde{\mu}$ be a Borel measure on $\mathbb{R}$ that is translation invariant and satisfies $\tilde{\mu}([0, 1]) = 1$. Then $\tilde{\mu}$ is the Lebesgue measure.

Proof. We show that any such measure must satisfy
\[ \tilde{\mu}([a, b]) = b - a. \]
By additivity and translation invariance, we can show that $\tilde{\mu}([p, q]) = q - p$ for all rational $p < q$. By considering $\tilde{\mu}([p, p + 1/n])$ for all $n$ and using the increasing property, we know $\tilde{\mu}(\{p\}) = 0$. So $\tilde{\mu}([p, q)) = \tilde{\mu}((p, q]) = \tilde{\mu}((p, q)) = q - p$ for all rational $p < q$.

Finally, by countable additivity, we can extend this to all real intervals. Then the result follows from the uniqueness of the Lebesgue measure.

In the proof of the Caratheodory extension theorem, we constructed a measure $\mu^*$ on the σ-algebra $\mathcal{M}$ of $\mu^*$-measurable sets, which contains $\mathcal{A}$. This contains $\mathcal{B} = \sigma(\mathcal{A})$, but could in fact be bigger than it. We call $\mathcal{M}$ the Lebesgue σ-algebra.

Indeed, it can be given by

M = {A ∪ N : A ∈ B, N ⊆ B ∈ B with µ(B) = 0}.

If A ∪ N ∈ M, then µ(A ∪ N) = µ(A). The proof is left for the example sheet.

It is also true that $\mathcal{M}$ is strictly larger than $\mathcal{B}$, so there exists $A \in \mathcal{M}$ with $A \notin \mathcal{B}$. Construction of such a set was on last year’s exam (2016).

On the other hand, it is also true that not all sets are Lebesgue measurable.

This is a rather funny construction.

Example. For $x, y \in [0, 1)$, we say $x \sim y$ if $x - y$ is rational. This defines an equivalence relation on $[0, 1)$. By the axiom of choice, we pick a representative of each equivalence class, and put them into a set $S \subseteq [0, 1)$. We will show that $S$ is not Lebesgue measurable.

Suppose that $S$ were Lebesgue measurable. We are going to get a contradiction to the countable additivity of the Lebesgue measure. For each rational $r \in [0, 1) \cap \mathbb{Q}$, we define
\[ S_r = \{s + r \bmod 1 : s \in S\}. \]
By translation invariance, we know $S_r$ is also Lebesgue measurable, and $\mu(S_r) = \mu(S)$.

Also, by construction of $S$, we know $(S_r)_{r \in \mathbb{Q}}$ is disjoint, and $\bigcup_{r \in \mathbb{Q}} S_r = [0, 1)$. Now by countable additivity, we have
\[ 1 = \mu([0, 1)) = \mu\left(\bigcup_{r \in \mathbb{Q}} S_r\right) = \sum_{r \in \mathbb{Q}} \mu(S_r) = \sum_{r \in \mathbb{Q}} \mu(S), \]
which is clearly not possible. Indeed, if $\mu(S) = 0$, then this says $1 = 0$; if $\mu(S) > 0$, then this says $1 = \infty$. Both are absurd.
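The disjointness and covering claims for the translates rest on a short argument, sketched here for completeness. This is a reconstruction of the standard check rather than part of the notes; it writes $S_r = \{s + r \bmod 1 : s \in S\}$ with $r$ ranging over $[0, 1) \cap \mathbb{Q}$ and assumes `amsmath`.

```latex
% Disjointness: suppose x lies in both S_r and S_{r'}.
\begin{align*}
  x \equiv s + r \equiv s' + r' \pmod 1,\ s, s' \in S
    &\implies s - s' \in \mathbb{Q}
     \implies s \sim s'
     \implies s = s' \\
    &\implies r \equiv r' \pmod 1
     \implies r = r' \quad (\text{as } r, r' \in [0, 1)).
\end{align*}
% Covering: given x in [0, 1), let s in S be the representative of
% the class of x. Then x - s is rational, r := (x - s) mod 1 lies in
% [0, 1) \cap Q, and x = s + r mod 1, so x is in S_r.
```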