3 Integration

II Probability and Measure

3.5 Product measures and Fubini’s theorem

Recall the following definition of the product σ-algebra.

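Definition (Product σ-algebra). Let $(E_1, \mathcal{E}_1, \mu_1)$ and $(E_2, \mathcal{E}_2, \mu_2)$ be finite measure spaces. We let
\[
  \mathcal{A} = \{A_1 \times A_2 : A_1 \in \mathcal{E}_1,\ A_2 \in \mathcal{E}_2\}.
\]
Then $\mathcal{A}$ is a π-system on $E_1 \times E_2$. The product σ-algebra is
\[
  \mathcal{E} = \mathcal{E}_1 \otimes \mathcal{E}_2 = \sigma(\mathcal{A}).
\]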
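As a sanity check on the claim that the rectangles form a π-system, the following minimal sketch enumerates all rectangles on two small finite sets and tests closure under intersection. The sets `E1`, `E2` and the choice of the full power sets as σ-algebras are assumptions made purely for illustration.

```python
from itertools import combinations, product

def powerset(s):
    """All subsets of the finite set s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def rectangle(a1, a2):
    """The product set A1 x A2, stored as a frozenset of pairs."""
    return frozenset(product(a1, a2))

# Hypothetical finite spaces; each sigma-algebra is taken to be the power set.
E1 = {0, 1}
E2 = {"a", "b", "c"}
rectangles = {rectangle(a1, a2) for a1 in powerset(E1) for a2 in powerset(E2)}

# pi-system property: the intersection of two rectangles is again a rectangle,
# because (A1 x A2) & (B1 x B2) = (A1 & B1) x (A2 & B2).
is_pi_system = all((r & s) in rectangles for r in rectangles for s in rectangles)

# Rectangles are not closed under unions, so they do not form a sigma-algebra.
closed_under_union = all((r | s) in rectangles for r in rectangles for s in rectangles)

print(is_pi_system, closed_under_union)
```

The second check fails because, for example, the union of $\{(0, a)\}$ and $\{(1, b)\}$ is not a rectangle. This is why we have to pass to the generated σ-algebra.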

We now want to construct a measure on the product σ-algebra. We can, of course, just apply the Caratheodory extension theorem, but we would want a more explicit description of the integral. The idea is to define, for $A \in \mathcal{E}_1 \otimes \mathcal{E}_2$,
\[
  \mu(A) = \int_{E_1} \left( \int_{E_2} 1_A(x_1, x_2)\, \mu_2(dx_2) \right) \mu_1(dx_1).
\]

Doing this has the advantage that it would help us in a step of proving Fubini’s

theorem.

However, before we can make this definition, we need to do some preparation

to make sure the above statement actually makes sense:

Lemma. Let $E = E_1 \times E_2$, equipped with the product σ-algebra $\mathcal{E} = \mathcal{E}_1 \otimes \mathcal{E}_2$. Suppose $f: E \to \mathbb{R}$ is an $\mathcal{E}$-measurable function. Then

(i) For each $x_2 \in E_2$, the function $x_1 \mapsto f(x_1, x_2)$ is $\mathcal{E}_1$-measurable.

(ii) If $f$ is bounded or non-negative measurable, then
\[
  f_2(x_2) = \int_{E_1} f(x_1, x_2)\, \mu_1(dx_1)
\]
is $\mathcal{E}_2$-measurable.

Proof. The first part follows immediately from the fact that for a fixed $x_2$, the map $\iota_1: E_1 \to E$ given by $\iota_1(x_1) = (x_1, x_2)$ is measurable, and that the composition of measurable functions is measurable.

For the second part, we use the monotone class theorem. We let $V$ be the set of all measurable functions $f$ such that $x_2 \mapsto \int_{E_1} f(x_1, x_2)\, \mu_1(dx_1)$ is $\mathcal{E}_2$-measurable.

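(i) It is clear that $1_E, 1_A \in V$ for all $A \in \mathcal{A}$ (where $\mathcal{A}$ is as in the definition of the product σ-algebra): for $A = A_1 \times A_2$, the integral is $\mu_1(A_1)\, 1_{A_2}(x_2)$.

(ii) $V$ is a vector space by linearity of the integral.

(iii) Suppose $(f_n)$ is a non-negative sequence in $V$ and $f_n \nearrow f$. Then
\[
  \left( x_2 \mapsto \int_{E_1} f_n(x_1, x_2)\, \mu_1(dx_1) \right) \nearrow \left( x_2 \mapsto \int_{E_1} f(x_1, x_2)\, \mu_1(dx_1) \right)
\]
by the monotone convergence theorem. So $f \in V$, since a pointwise limit of measurable functions is measurable.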

So the monotone class theorem tells us $V$ contains all bounded measurable functions.

Now if $f$ is a general non-negative measurable function, then $f \wedge n$ is bounded and measurable, hence $f \wedge n \in V$. Therefore $f \in V$ by the monotone convergence theorem.

Theorem. There exists a unique measure $\mu = \mu_1 \otimes \mu_2$ on $\mathcal{E}$ such that
\[
  \mu(A_1 \times A_2) = \mu_1(A_1)\, \mu_2(A_2)
\]
for all $A_1 \times A_2 \in \mathcal{A}$.

Here it is crucial that the measure spaces are finite. Actually, everything still works for σ-finite measure spaces, as we can just reduce to the finite case. However, things start to go wrong if we don’t have σ-finite measure spaces.

Proof.

One might be tempted to just apply the Caratheodory extension theorem,

but we have a more direct way of doing it here, by using integrals. We define

\[
  \mu(A) = \int_{E_1} \left( \int_{E_2} 1_A(x_1, x_2)\, \mu_2(dx_2) \right) \mu_1(dx_1).
\]

Here the previous lemma is very important. It tells us that these integrals

actually make sense!

We first check that this is a measure:

(i) $\mu(\emptyset) = 0$ is immediate since $1_\emptyset = 0$.

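(ii) Suppose $(A_n)$ is a disjoint sequence in $\mathcal{E}$ and $A = \bigcup_n A_n$. Then $1_A = \sum_n 1_{A_n}$, so we have
\begin{align*}
  \mu(A) &= \int_{E_1} \left( \int_{E_2} 1_A(x_1, x_2)\, \mu_2(dx_2) \right) \mu_1(dx_1) \\
  &= \int_{E_1} \left( \int_{E_2} \sum_n 1_{A_n}(x_1, x_2)\, \mu_2(dx_2) \right) \mu_1(dx_1).
\end{align*}
We now use the fact that integration commutes with the sum of non-negative measurable functions, first for the inner integral and then for the outer one, to get
\begin{align*}
  \mu(A) &= \int_{E_1} \left( \sum_n \int_{E_2} 1_{A_n}(x_1, x_2)\, \mu_2(dx_2) \right) \mu_1(dx_1) \\
  &= \sum_n \int_{E_1} \left( \int_{E_2} 1_{A_n}(x_1, x_2)\, \mu_2(dx_2) \right) \mu_1(dx_1) \\
  &= \sum_n \mu(A_n).
\end{align*}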

So we have a working measure, and it clearly satisfies
\[
  \mu(A_1 \times A_2) = \mu_1(A_1)\, \mu_2(A_2).
\]

Uniqueness follows because $\mu$ is finite, and is thus characterized by its values on the π-system $\mathcal{A}$ that generates $\mathcal{E}$.
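On finite spaces the iterated integral in the construction is just an iterated sum, so the defining property can be checked by hand or by machine. The sketch below does this with exact rational arithmetic. The point masses `mu1`, `mu2` and the test sets are hypothetical values chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite measure spaces, given by the mass of each point.
mu1 = {0: Fraction(1, 2), 1: Fraction(3, 2)}
mu2 = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(1, 3)}

def measure(mu, subset):
    """Measure of a subset of a finite space with point masses mu."""
    return sum((mu[x] for x in subset), Fraction(0))

def product_measure(A):
    """mu(A) as the iterated sum: inner over x2 (against mu2), outer over x1 (against mu1)."""
    total = Fraction(0)
    for x1, m1 in mu1.items():
        inner = sum((m2 for x2, m2 in mu2.items() if (x1, x2) in A), Fraction(0))
        total += inner * m1
    return total

# Defining property on a rectangle A1 x A2.
A1, A2 = {1}, {"a", "c"}
lhs = product_measure(set(product(A1, A2)))
rhs = measure(mu1, A1) * measure(mu2, A2)

# Additivity on two disjoint sets that are not rectangles.
B = {(0, "a"), (1, "b")}
C = {(0, "b")}
additive = product_measure(B | C) == product_measure(B) + product_measure(C)

print(lhs, rhs, additive)
```

Here both `lhs` and `rhs` equal $\frac{3}{2} \cdot \frac{4}{3} = 2$.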

Exercise. Show the non-uniqueness of the product of the Lebesgue measure on $[0, 1]$ and the counting measure on $[0, 1]$.

Note that we could as well have defined the measure as
\[
  \mu(A) = \int_{E_2} \left( \int_{E_1} 1_A(x_1, x_2)\, \mu_1(dx_1) \right) \mu_2(dx_2).
\]

The same proof would go through, so we have another measure on the space.

However, by uniqueness, we know they must be the same! Fubini’s theorem

generalizes this to arbitrary functions.

Theorem (Fubini’s theorem).

(i) If $f$ is non-negative measurable, then
\[
  \mu(f) = \int_{E_1} \left( \int_{E_2} f(x_1, x_2)\, \mu_2(dx_2) \right) \mu_1(dx_1). \tag{$*$}
\]
In particular, we have
\[
  \int_{E_1} \left( \int_{E_2} f(x_1, x_2)\, \mu_2(dx_2) \right) \mu_1(dx_1) = \int_{E_2} \left( \int_{E_1} f(x_1, x_2)\, \mu_1(dx_1) \right) \mu_2(dx_2).
\]
This is sometimes known as Tonelli’s theorem.

(ii) If $f$ is integrable, and
\[
  A = \left\{ x_1 \in E_1 : \int_{E_2} |f(x_1, x_2)|\, \mu_2(dx_2) < \infty \right\},
\]
then
\[
  \mu_1(E_1 \setminus A) = 0.
\]
If we set
\[
  f_1(x_1) =
  \begin{cases}
    \int_{E_2} f(x_1, x_2)\, \mu_2(dx_2) & x_1 \in A \\
    0 & x_1 \notin A
  \end{cases},
\]
then $f_1$ is a $\mu_1$-integrable function and
\[
  \mu_1(f_1) = \mu(f).
\]

Proof.

(i) Let $V$ be the set of all measurable functions such that $(*)$ holds. Then $V$ is a vector space since integration is linear.

(a) By definition of $\mu$, we know $1_E$ and $1_A$ are in $V$ for all $A \in \mathcal{A}$.

(b) The monotone convergence theorem on both sides tells us that $V$ is closed under monotone limits of the form $f_n \nearrow f$, $f_n \geq 0$.

By the monotone class theorem, we know $V$ contains all bounded measurable functions. If $f$ is non-negative measurable, then $(f \wedge n) \in V$, and monotone convergence for $f \wedge n \nearrow f$ gives that $f \in V$.

(ii) Assume that $f$ is $\mu$-integrable. Then
\[
  x_1 \mapsto \int_{E_2} |f(x_1, x_2)|\, \mu_2(dx_2)
\]
is $\mathcal{E}_1$-measurable, and, by (i), is $\mu_1$-integrable. So $A$, being the inverse image of $[0, \infty)$ under that map, lies in $\mathcal{E}_1$. Moreover, $\mu_1(E_1 \setminus A) = 0$ because integrable functions can only be infinite on sets of measure 0.

We set
\begin{align*}
  f_1^+(x_1) &= \int_{E_2} f^+(x_1, x_2)\, \mu_2(dx_2) \\
  f_1^-(x_1) &= \int_{E_2} f^-(x_1, x_2)\, \mu_2(dx_2).
\end{align*}
Then we have
\[
  f_1 = (f_1^+ - f_1^-)\, 1_A.
\]
So the result follows since
\[
  \mu(f) = \mu(f^+) - \mu(f^-) = \mu_1(f_1^+) - \mu_1(f_1^-) = \mu_1(f_1)
\]
by (i), where the last equality uses $\mu_1(E_1 \setminus A) = 0$.
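To see that the integrability hypothesis in (ii) cannot simply be dropped, here is a minimal sketch of a standard counterexample. It is not taken from the notes above. It assumes both factors are $\mathbb{N}$ with the counting measure (which is σ-finite), so integrals are sums. The function `f` and the cutoff `N` are hypothetical names for illustration. Every row and column of `f` has finite support, so each inner sum is computed exactly, and all outer terms beyond the first vanish.

```python
def f(m, n):
    """+1 on the diagonal, -1 immediately to the right of it, 0 elsewhere."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0

N = 50  # number of outer terms examined; later terms are all zero

def row_integral(m):
    """Sum of f(m, n) over n; the support in n is {m, m + 1}."""
    return sum(f(m, n) for n in (m, m + 1))

def col_integral(n):
    """Sum of f(m, n) over m; the support in m is {n - 1, n} intersected with N."""
    return sum(f(m, n) for m in (n - 1, n) if m >= 0)

# Integrate over n first, then m: every row sums to 1 - 1 = 0.
rows_first = sum(row_integral(m) for m in range(N))

# Integrate over m first, then n: column 0 contributes 1, every other column 0.
cols_first = sum(col_integral(n) for n in range(N))

# |f| is not integrable: its partial sums grow linearly in N.
abs_partial = sum(abs(f(m, n)) for m in range(N) for n in range(N + 1))

print(rows_first, cols_first, abs_partial)
```

The two iterated sums are 0 and 1, while the sum of $|f|$ diverges. This is consistent with the theorem: part (i) does not apply because $f$ takes negative values, and part (ii) does not apply because $f$ is not integrable.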

Since the Lebesgue measure on $\mathbb{R}$ is σ-finite, we know that we can sensibly talk about the $d$-fold product of the Lebesgue measure on $\mathbb{R}$ to obtain the Lebesgue measure on $\mathbb{R}^d$.

What σ-algebra is the Lebesgue measure on $\mathbb{R}^d$ defined on? We know the Lebesgue measure on $\mathbb{R}$ is defined on $\mathcal{B}$. So the Lebesgue measure is defined on
\[
  \mathcal{B} \otimes \cdots \otimes \mathcal{B} = \sigma(B_1 \times \cdots \times B_d : B_i \in \mathcal{B}).
\]
By looking at the definition of the product topology, we see that this is just the Borel σ-algebra on $\mathbb{R}^d$!

Recall that when we constructed the Lebesgue measure, the Caratheodory extension theorem yields a measure on the “Lebesgue σ-algebra” $\mathcal{M}$, which was strictly bigger than the Borel σ-algebra. It was shown in the first example sheet that $\mathcal{M}$ is complete, i.e. if we have $A \subseteq B \subseteq \mathbb{R}$ with $B \in \mathcal{M}$, $\mu(B) = 0$, then $A \in \mathcal{M}$. We can also take the Lebesgue measure on $\mathbb{R}^d$ to be defined on $\mathcal{M} \otimes \cdots \otimes \mathcal{M}$. However, it happens that $\mathcal{M} \otimes \mathcal{M}$ together with the Lebesgue measure on $\mathbb{R}^2$ is no longer complete (proof is left as an exercise for the reader).

We now turn to probability. Recall that random variables $X_1, \cdots, X_n$ are independent iff the σ-algebras $\sigma(X_1), \cdots, \sigma(X_n)$ are independent. We will show that random variables are independent iff their laws are given by the product measure.

Proposition. Let $X_1, \cdots, X_n$ be random variables on $(\Omega, \mathcal{F}, \mathbb{P})$ with values in $(E_1, \mathcal{E}_1), \cdots, (E_n, \mathcal{E}_n)$ respectively. We define
\[
  E = E_1 \times \cdots \times E_n, \quad \mathcal{E} = \mathcal{E}_1 \otimes \cdots \otimes \mathcal{E}_n.
\]
Then $X = (X_1, \cdots, X_n)$ is $\mathcal{E}$-measurable and the following are equivalent:

(i) $X_1, \cdots, X_n$ are independent.

(ii) $\mu_X = \mu_{X_1} \otimes \cdots \otimes \mu_{X_n}$.

(iii) For any $f_1, \cdots, f_n$ bounded and measurable, we have
\[
  \mathbb{E}\left[ \prod_{k=1}^n f_k(X_k) \right] = \prod_{k=1}^n \mathbb{E}[f_k(X_k)].
\]

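Proof.

– (i) ⇒ (ii): Let $\nu = \mu_{X_1} \otimes \cdots \otimes \mu_{X_n}$. We want to show that $\nu = \mu_X$. To do so, we just have to check that they agree on a π-system generating the entire σ-algebra. We let
\[
  \mathcal{A} = \{A_1 \times \cdots \times A_n : A_1 \in \mathcal{E}_1, \cdots, A_n \in \mathcal{E}_n\}.
\]
Then $\mathcal{A}$ is a generating π-system of $\mathcal{E}$. Moreover, if $A = A_1 \times \cdots \times A_n \in \mathcal{A}$, then we have
\begin{align*}
  \mu_X(A) &= \mathbb{P}[X \in A] \\
  &= \mathbb{P}[X_1 \in A_1, \cdots, X_n \in A_n].
\end{align*}
By independence, this equals
\[
  \prod_{k=1}^n \mathbb{P}[X_k \in A_k] = \nu(A).
\]
So we know that $\mu_X = \nu = \mu_{X_1} \otimes \cdots \otimes \mu_{X_n}$ on $\mathcal{E}$.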

– (ii) ⇒ (iii): By assumption, we can evaluate the expectation
\begin{align*}
  \mathbb{E}\left[ \prod_{k=1}^n f_k(X_k) \right] &= \int_E \prod_{k=1}^n f_k(x_k)\, \mu_X(dx) \\
  &= \prod_{k=1}^n \int_{E_k} f_k(x_k)\, \mu_{X_k}(dx_k) \\
  &= \prod_{k=1}^n \mathbb{E}[f_k(X_k)].
\end{align*}
Here in the middle we have used Fubini’s theorem.

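– (iii) ⇒ (i): Take $f_k = 1_{A_k}$ for $A_k \in \mathcal{E}_k$. Then we have
\begin{align*}
  \mathbb{P}[X_1 \in A_1, \cdots, X_n \in A_n] &= \mathbb{E}\left[ \prod_{k=1}^n 1_{A_k}(X_k) \right] \\
  &= \prod_{k=1}^n \mathbb{E}[1_{A_k}(X_k)] \\
  &= \prod_{k=1}^n \mathbb{P}[X_k \in A_k].
\end{align*}
So $X_1, \cdots, X_n$ are independent.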
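The equivalences can be illustrated on a small finite probability space, where laws are finite tables and everything is computed exactly. The sketch below assumes a hypothetical setup of two fair dice. The helper names `law`, `is_product_law` and `expect` are made up for this example.

```python
from fractions import Fraction
from itertools import product

# Hypothetical probability space: outcomes of two fair dice, uniform P.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}

def X1(w):
    return w[0]          # first die

def X2(w):
    return w[1]          # second die

def S(w):
    return w[0] + w[1]   # total, which depends on X1

def law(*rvs):
    """Joint law of the given random variables: tuple of values -> probability."""
    out = {}
    for w, p in P.items():
        key = tuple(rv(w) for rv in rvs)
        out[key] = out.get(key, Fraction(0)) + p
    return out

def is_product_law(rv_a, rv_b):
    """Check (ii): the joint law equals the product of the marginal laws."""
    joint, la, lb = law(rv_a, rv_b), law(rv_a), law(rv_b)
    return all(joint.get((a, b), Fraction(0)) == pa * pb
               for (a,), pa in la.items() for (b,), pb in lb.items())

def expect(g):
    """Expectation of w -> g(w) under P."""
    return sum(g(w) * p for w, p in P.items())

indep_dice = is_product_law(X1, X2)   # the two dice are independent
indep_sum = is_product_law(X1, S)     # X1 and the total are not

# Check (iii) for one pair of bounded functions.
def f1(x):
    return x * x

def f2(x):
    return 1 if x % 2 == 0 else 0

factorises = (expect(lambda w: f1(X1(w)) * f2(X2(w)))
              == expect(lambda w: f1(X1(w))) * expect(lambda w: f2(X2(w))))

print(indep_dice, indep_sum, factorises)
```

For the dependent pair, the product law assigns positive mass to $(X_1, S) = (1, 12)$, which has probability zero under the joint law, so (ii) fails.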