II Probability and Measure

3 Integration

3.1 Definition and basic properties
We are now going to work towards defining the integral of a measurable function on a measure space $(E, \mathcal{E}, \mu)$. Different sources use different notations for the integral. The following notations are all commonly used:
\[
  \mu(f) = \int_E f \,d\mu = \int_E f(x) \,d\mu(x) = \int_E f(x) \,\mu(dx).
\]
In the case where $(E, \mathcal{E}, \mu) = (\mathbb{R}, \mathcal{B}, \text{Lebesgue})$, people often just write this as
\[
  \mu(f) = \int_{\mathbb{R}} f(x) \,dx.
\]
On the other hand, if $(E, \mathcal{E}, \mu) = (\Omega, \mathcal{F}, \mathbb{P})$ is a probability space, and $X$ is a random variable, then people write the integral as $\mathbb{E}[X]$, the expectation of $X$.
So how are we going to define the integral? There are two steps to defining
the integral. The idea is that we first define the integral on simple functions,
and then extend the definition to more general measurable functions by taking
the limit. When we do the definition for simple functions, it will be obvious that
the definition satisfies the nice properties, and we will have to check that they
are preserved when we take the limit.
Definition (Simple function). A simple function is a measurable function that can be written as a finite non-negative linear combination of indicator functions of measurable sets, i.e.
\[
  f = \sum_{k=1}^n a_k 1_{A_k}
\]
for some $A_k \in \mathcal{E}$ and $a_k \geq 0$.
Note that some sources do not assume that $a_k \geq 0$, but assuming this makes our life easier.
The following characterization is immediate from the definition.
Proposition. A function is simple iff it is measurable, non-negative, and takes on only finitely many values.
Definition (Integral of simple function). The integral of a simple function
\[
  f = \sum_{k=1}^n a_k 1_{A_k}
\]
is given by
\[
  \mu(f) = \sum_{k=1}^n a_k \mu(A_k).
\]
Note that it can be that $\mu(A_k) = \infty$, but $a_k = 0$. When this happens, we are just going to declare that $0 \cdot \infty = 0$ (this makes sense because this means we are ignoring all $0 \cdot 1_A$ terms for any $A$). After we do this, we can check that the integral is well-defined, i.e. that it does not depend on the chosen representation of $f$.
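As a minimal sketch of this definition (an illustration only, not part of the course material), the following Python snippet evaluates $\mu(f) = \sum_k a_k \mu(A_k)$ for a simple function given as a list of pairs $(a_k, A_k)$. The names `integrate_simple` and `counting_measure`, and the use of the string `"N"` to stand in for an infinite set, are assumptions made purely for this example.

```python
import math

def integrate_simple(terms, measure):
    """Integrate f = sum_k a_k * 1_{A_k}, given as a list of (a_k, A_k) pairs.

    `measure` maps a set A_k to mu(A_k), which may be math.inf.
    """
    total = 0.0
    for a, A in terms:
        if a < 0:
            raise ValueError("coefficients of a simple function must be non-negative")
        if a == 0:
            # Convention 0 * infinity = 0: terms with a_k = 0 are ignored.
            continue
        total += a * measure(A)
    return total

# Hypothetical measure for illustration: counting measure, with the
# string "N" standing in for an infinite set.
def counting_measure(A):
    return math.inf if A == "N" else len(A)

# f = 2 * 1_{1,2,3} + 0 * 1_N + 5 * 1_{4}
f = [(2.0, frozenset({1, 2, 3})), (0.0, "N"), (5.0, frozenset({4}))]
print(integrate_simple(f, counting_measure))      # 2*3 + 5*1 = 11.0

# The same function written with a different decomposition.
f_alt = [(2.0, frozenset({1, 2})), (2.0, frozenset({3})), (5.0, frozenset({4}))]
print(integrate_simple(f_alt, counting_measure))  # 2*2 + 2*1 + 5*1 = 11.0
```

The explicit skip of zero coefficients mirrors the convention $0 \cdot \infty = 0$; without it, floating-point arithmetic would evaluate `0.0 * math.inf` to `nan`. The two decompositions giving the same value is one instance of well-definedness, not a proof of it.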
We are now going to extend this definition to non-negative measurable
functions by a limiting procedure. Once we’ve done this, we are going to extend
the definition to measurable functions by linearity of the integral. Then we
would have a definition of the integral, and we are going to deduce properties of
the integral using approximation.
Definition (Integral). Let $f$ be a non-negative measurable function. We set
\[
  \mu(f) = \sup\{\mu(g) : g \leq f,\ g \text{ is simple}\}.
\]
For arbitrary $f$, we write
\[
  f = f^+ - f^- = (f \vee 0) + (f \wedge 0).
\]
We put $|f| = f^+ + f^-$. We say $f$ is integrable if $\mu(|f|) < \infty$. In this case, set
\[
  \mu(f) = \mu(f^+) - \mu(f^-).
\]
If exactly one of $\mu(f^+)$, $\mu(f^-)$ is finite, then we can still make the above definition, and the result will be infinite.
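The decomposition into positive and negative parts can be sketched on a finite measure space, where every non-negative function is simple and its integral is a finite sum. The function name `integrate_finite` and the particular point masses below are hypothetical choices for illustration.

```python
def integrate_finite(f, weights):
    """Return (mu(f+), mu(f-), mu(f)) on a finite set.

    `weights` maps each point x to its mass mu({x}) >= 0. The function f is
    split as f = f+ - f-, with f+ = max(f, 0) and f- = max(-f, 0), and each
    non-negative part is integrated separately.
    """
    mu_plus = sum(max(f(x), 0.0) * w for x, w in weights.items())
    mu_minus = sum(max(-f(x), 0.0) * w for x, w in weights.items())
    return mu_plus, mu_minus, mu_plus - mu_minus

# Hypothetical measure space: three points with the given masses.
weights = {-1: 0.5, 0: 1.0, 2: 1.0}
print(integrate_finite(lambda x: x, weights))  # (2.0, 0.5, 1.5)
```

Here $\mu(|f|) = \mu(f^+) + \mu(f^-) = 2.5 < \infty$, so $f$ is integrable and $\mu(f) = 2.0 - 0.5 = 1.5$.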
In the case where we are integrating over (a subset of) the reals, we call it
the Lebesgue integral.
Proposition. Let $f : [0, 1] \to \mathbb{R}$ be Riemann integrable. Then it is also Lebesgue integrable, and the two integrals agree.
We will not prove this, but this immediately gives us results like the fundamental theorem of calculus, and also helps us to actually compute the integral. However, note that this does not hold for infinite domains, as you will see in the second example sheet.
But the Lebesgue integrable functions are better. A lot of functions are
Lebesgue integrable but not Riemann integrable.
Example. Take the standard non-Riemann integrable function
\[
  f = 1_{[0, 1] \setminus \mathbb{Q}}.
\]
Then $f$ is not Riemann integrable, but it is Lebesgue integrable, since
\[
  \mu(f) = \mu([0, 1] \setminus \mathbb{Q}) = 1.
\]
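The value of this measure can be justified in one line, assuming the standard fact (not proved in this section) that a countable set such as $[0, 1] \cap \mathbb{Q}$ has Lebesgue measure zero:

```latex
\[
  \mu([0, 1] \setminus \mathbb{Q})
  = \mu([0, 1]) - \mu([0, 1] \cap \mathbb{Q})
  = 1 - 0
  = 1.
\]
```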
We are now going to study some basic properties of the integral. We will first
look at the properties of integrals of simple functions, and then extend them to
general integrable functions.
For $f, g$ simple, and $\alpha, \beta \geq 0$, we have that
\[
  \mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g).
\]
So the integral is linear.
Another important property is monotonicity: if $f \leq g$, then $\mu(f) \leq \mu(g)$.
Finally, we have $f = 0$ a.e. iff $\mu(f) = 0$. It is absolutely crucial here that we are talking about non-negative functions.
Our goal is to show that these three properties are also satisfied for arbitrary non-negative measurable functions, and that the first two hold for integrable functions.
In order to achieve this, we prove a very important tool: the monotone convergence theorem. Later, we will also learn about the dominated convergence theorem and Fatou’s lemma. These are the main and very important results.
Theorem (Monotone convergence theorem). Suppose that $(f_n), f$ are non-negative measurable with $f_n \nearrow f$. Then $\mu(f_n) \nearrow \mu(f)$.
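As a quick worked illustration (our own example, taking Lebesgue measure on $\mathbb{R}$ as the measure space), the first display below is an instance of the theorem, and the second shows that pointwise convergence without monotonicity is not enough:

```latex
% Increasing sequence: the integrals increase to the integral of the limit.
\[
  f_n = 1_{[0, n]} \nearrow f = 1_{[0, \infty)},
  \qquad
  \mu(f_n) = n \nearrow \infty = \mu(f).
\]
% Non-monotone sequence: g_n -> 0 pointwise, but the integrals do not tend to 0.
\[
  g_n = 1_{[n, n+1]} \to 0 \text{ pointwise},
  \qquad
  \mu(g_n) = 1 \not\to 0 = \mu(0).
\]
```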
In the proof we will use the fact that the integral is monotonic, which we
shall prove later.
Proof. We will split the proof into four steps. We will prove each of the following in turn:
(i) If $f_n$ and $f$ are indicator functions, then the theorem holds.
(ii) If $f$ is an indicator function, then the theorem holds.
(iii) If $f$ is simple, then the theorem holds.
(iv) If $f$ is non-negative measurable, then the theorem holds.
Each part follows rather straightforwardly from the previous one, and the reader is encouraged to try to prove it themself.
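We first consider the case where $f_n = 1_{A_n}$ and $f = 1_A$. Then $f_n \nearrow f$ is true iff $A_n \nearrow A$. On the other hand, $\mu(f_n) \nearrow \mu(f)$ iff $\mu(A_n) \nearrow \mu(A)$.
For convenience, we let $A_0 = \emptyset$. We can write
\begin{align*}
  \mu(A) &= \mu\left(\bigcup_n (A_n \setminus A_{n-1})\right) \\
  &= \sum_{n=1}^\infty \mu(A_n \setminus A_{n-1}) \\
  &= \lim_{N \to \infty} \sum_{n=1}^N \mu(A_n \setminus A_{n-1}) \\
  &= \lim_{N \to \infty} \mu(A_N).
\end{align*}
So done.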
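We next consider the case where $f = 1_A$ for some $A$. Fix $\varepsilon > 0$, and set
\[
  A_n = \{f_n > 1 - \varepsilon\} \in \mathcal{E}.
\]
Then we know that $A_n \nearrow A$, as $f_n \nearrow f$. Moreover, by definition, we have
\[
  (1 - \varepsilon) 1_{A_n} \leq f_n \leq f = 1_A.
\]
As $A_n \nearrow A$, we have that
\[
  (1 - \varepsilon) \mu(f) = (1 - \varepsilon) \lim_{n \to \infty} \mu(A_n) \leq \lim_{n \to \infty} \mu(f_n) \leq \mu(f)
\]
since $f_n \leq f$. Since $\varepsilon$ is arbitrary, we know that
\[
  \lim_{n \to \infty} \mu(f_n) = \mu(f).
\]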
Next, we consider the case where $f$ is simple. We write
\[
  f = \sum_{k=1}^m a_k 1_{A_k},
\]
where $a_k > 0$ and $A_k$ are pairwise disjoint. Since $f_n \nearrow f$, we know
\[
  a_k^{-1} f_n 1_{A_k} \nearrow 1_{A_k}.
\]
So we have
\[
  \mu(f_n) = \sum_{k=1}^m \mu(f_n 1_{A_k}) = \sum_{k=1}^m a_k \mu(a_k^{-1} f_n 1_{A_k}) \to \sum_{k=1}^m a_k \mu(A_k) = \mu(f).
\]
Suppose $f$ is non-negative measurable. Suppose $g \leq f$ is a simple function. As $f_n \nearrow f$, we know $f_n \wedge g \nearrow f \wedge g = g$. So by the previous case, we know that
\[
  \mu(f_n \wedge g) \to \mu(g).
\]
We also know that
\[
  \mu(f_n) \geq \mu(f_n \wedge g).
\]
So we have
\[
  \lim_{n \to \infty} \mu(f_n) \geq \mu(g)
\]
for all simple $g \leq f$. This is possible only if
\[
  \lim_{n \to \infty} \mu(f_n) \geq \mu(f)
\]
by definition of the integral. However, we also know that $\mu(f_n) \leq \mu(f)$ for all $n$, again by definition of the integral. So we must have equality. So we have
\[
  \mu(f) = \lim_{n \to \infty} \mu(f_n).
\]
Theorem. Let $f, g$ be non-negative measurable, and $\alpha, \beta \geq 0$. We have that
(i) $\mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g)$.
(ii) $f \leq g$ implies $\mu(f) \leq \mu(g)$.
(iii) $f = 0$ a.e. iff $\mu(f) = 0$.
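Proof.
(i) Let
\begin{align*}
  f_n &= 2^{-n} \lfloor 2^n f \rfloor \wedge n, \\
  g_n &= 2^{-n} \lfloor 2^n g \rfloor \wedge n.
\end{align*}
Then $f_n, g_n$ are simple with $f_n \nearrow f$ and $g_n \nearrow g$. Hence $\mu(f_n) \nearrow \mu(f)$ and $\mu(g_n) \nearrow \mu(g)$ and $\mu(\alpha f_n + \beta g_n) \nearrow \mu(\alpha f + \beta g)$, by the monotone convergence theorem. As $f_n, g_n$ are simple, we have that
\[
  \mu(\alpha f_n + \beta g_n) = \alpha \mu(f_n) + \beta \mu(g_n).
\]
Taking the limit as $n \to \infty$, we get
\[
  \mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g).
\]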
(ii) We shall be careful not to use the monotone convergence theorem. We have
\begin{align*}
  \mu(g) &= \sup\{\mu(h) : h \leq g \text{ simple}\} \\
  &\geq \sup\{\mu(h) : h \leq f \text{ simple}\} \\
  &= \mu(f).
\end{align*}
(iii) Suppose $f$ is not $0$ a.e. Let
\[
  A_n = \left\{x : f(x) > \frac{1}{n}\right\}.
\]
Then
\[
  \{x : f(x) \neq 0\} = \bigcup_n A_n.
\]
Since the left hand set has positive measure, it follows that there is some $A_n$ with positive measure. For that $n$, we define
\[
  h = \frac{1}{n} 1_{A_n}.
\]
Then $\mu(f) \geq \mu(h) > 0$. So $\mu(f) \neq 0$.
Conversely, suppose $f = 0$ a.e. We let
\[
  f_n = 2^{-n} \lfloor 2^n f \rfloor \wedge n
\]
be a simple function. Then $f_n \nearrow f$ and $f_n = 0$ a.e. So
\[
  \mu(f) = \lim_{n \to \infty} \mu(f_n) = 0.
\]
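The approximating sequence $f_n = 2^{-n} \lfloor 2^n f \rfloor \wedge n$ used in this proof can be explored numerically. The following is a minimal sketch; the helper name `dyadic_approx` and the test function $f(x) = x^2$ are assumptions chosen for illustration.

```python
import math

def dyadic_approx(f, n):
    """Return f_n = min(2^{-n} * floor(2^n * f), n) as a function."""
    def f_n(x):
        # Round f(x) down to a multiple of 2^{-n}, then cap the value at n.
        return min(math.floor(2**n * f(x)) / 2**n, float(n))
    return f_n

f = lambda x: x * x  # hypothetical non-negative test function
xs = [0.0, 0.3, 1.0, 1.7, 2.5]
for n in (1, 2, 5, 10):
    print(n, [dyadic_approx(f, n)(x) for x in xs])
```

Each $f_n$ takes only finitely many values (multiples of $2^{-n}$ in $[0, n]$), and at each fixed point the printed values increase with $n$ towards $f(x)$: once $n \geq f(x)$, the gap $f(x) - f_n(x)$ is less than $2^{-n}$.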
We now prove the analogous statement for general integrable functions.
Theorem. Let $f, g$ be integrable, and $\alpha, \beta \geq 0$. We have that
(i) $\mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g)$.
(ii) $f \leq g$ implies $\mu(f) \leq \mu(g)$.
(iii) $f = 0$ a.e. implies $\mu(f) = 0$.
Note that in the last case, the converse is no longer true, as one can easily see from the sign function $\operatorname{sgn} : [-1, 1] \to \mathbb{R}$.
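To spell out the sign function counterexample, here is a short computation, assuming Lebesgue measure $\mu$ on $[-1, 1]$ (the measure is not specified in the remark above):

```latex
\[
  \operatorname{sgn}^+ = 1_{(0, 1]}, \qquad \operatorname{sgn}^- = 1_{[-1, 0)},
\]
\[
  \mu(\operatorname{sgn}) = \mu((0, 1]) - \mu([-1, 0)) = 1 - 1 = 0,
  \qquad \text{but} \qquad
  \mu(\{x : \operatorname{sgn}(x) \neq 0\}) = \mu([-1, 1] \setminus \{0\}) = 2 > 0.
\]
```

So $\mu(\operatorname{sgn}) = 0$ even though $\operatorname{sgn}$ is not $0$ a.e.: the positive and negative parts cancel.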
Proof.
(i) We are going to prove these by applying the previous theorem.
By definition of the integral, we have $\mu(-f) = -\mu(f)$. Also, if $\alpha \geq 0$, then
\[
  \mu(\alpha f) = \mu(\alpha f^+) - \mu(\alpha f^-) = \alpha \mu(f^+) - \alpha \mu(f^-) = \alpha \mu(f).
\]
Combining these two properties, it then follows that if $\alpha$ is a real number, then
\[
  \mu(\alpha f) = \alpha \mu(f).
\]
To finish the proof of (i), we have to show that $\mu(f + g) = \mu(f) + \mu(g)$. We know that this is true for non-negative functions, so we need to employ a little trick to make this a statement about the non-negative version. If we let $h = f + g$, then we can write this as
\[
  h^+ - h^- = (f^+ - f^-) + (g^+ - g^-).
\]
We now rearrange this as
\[
  h^+ + f^- + g^- = f^+ + g^+ + h^-.
\]
Now everything is non-negative measurable. So applying $\mu$ gives
\[
  \mu(h^+) + \mu(f^-) + \mu(g^-) = \mu(f^+) + \mu(g^+) + \mu(h^-).
\]
Rearranging (all terms are finite, since $f$, $g$ and hence $h$ are integrable), we obtain
\[
  \mu(h^+) - \mu(h^-) = \mu(f^+) - \mu(f^-) + \mu(g^+) - \mu(g^-).
\]
This is exactly the same thing as saying
\[
  \mu(f + g) = \mu(h) = \mu(f) + \mu(g).
\]
(ii) If $f \leq g$, then $g - f \geq 0$. So $\mu(g - f) \geq 0$. By (i), we know $\mu(g) - \mu(f) \geq 0$. So $\mu(g) \geq \mu(f)$.
(iii) If $f = 0$ a.e., then $f^+, f^- = 0$ a.e. So $\mu(f^+) = \mu(f^-) = 0$. So $\mu(f) = \mu(f^+) - \mu(f^-) = 0$.
As mentioned, the converse to (iii) is no longer true. However, we do have
the following partial converse:
Proposition. If $\mathcal{A}$ is a $\pi$-system with $E \in \mathcal{A}$ and $\sigma(\mathcal{A}) = \mathcal{E}$, and $f$ is an integrable function such that
\[
  \mu(f 1_A) = 0
\]
for all $A \in \mathcal{A}$, then $f = 0$ a.e.
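Proof. Let
\[
  \mathcal{D} = \{A \in \mathcal{E} : \mu(f 1_A) = 0\}.
\]
It follows immediately from the properties of the integral that $\mathcal{D}$ is a d-system, and it contains $\mathcal{A}$ by assumption. So $\mathcal{D} = \mathcal{E}$ by Dynkin’s lemma. Let
\begin{align*}
  A^+ &= \{x \in E : f(x) > 0\}, \\
  A^- &= \{x \in E : f(x) < 0\}.
\end{align*}
Then $A^\pm \in \mathcal{E}$, and
\[
  \mu(f 1_{A^+}) = \mu(f 1_{A^-}) = 0.
\]
Since $f 1_{A^+}$ and $-f 1_{A^-}$ are non-negative with zero integral, $f 1_{A^+}$ and $f 1_{A^-}$ vanish a.e. So $f$ vanishes a.e.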
Proposition. Suppose that $(g_n)$ is a sequence of non-negative measurable functions. Then we have
\[
  \mu\left(\sum_{n=1}^\infty g_n\right) = \sum_{n=1}^\infty \mu(g_n).
\]
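Proof. We know
\[
  \sum_{n=1}^N g_n \nearrow \sum_{n=1}^\infty g_n
\]
as $N \to \infty$. So by the monotone convergence theorem, we have
\[
  \sum_{n=1}^N \mu(g_n) = \mu\left(\sum_{n=1}^N g_n\right) \nearrow \mu\left(\sum_{n=1}^\infty g_n\right).
\]
But we also know that
\[
  \sum_{n=1}^N \mu(g_n) \nearrow \sum_{n=1}^\infty \mu(g_n)
\]
by definition. So we are done.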
So for non-negative measurable functions, we can always switch the order of
integration and summation.
Note that we can consider summation as integration. We let $E = \mathbb{N}$ and $\mathcal{E} = \{\text{all subsets of } \mathbb{N}\}$. We let $\mu$ be the counting measure, so that $\mu(A)$ is the size of $A$. Then integrability (and having a finite integral) is the same as absolute convergence. In this case, we have
\[
  \int f \,d\mu = \sum_{n=1}^\infty f(n).
\]
So we can just view our proposition as proving that we can swap the order of
two integrals. The general statement is known as Fubini’s theorem.
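To see why the non-negativity assumption matters when swapping sums, here is a small sketch with a signed doubly-indexed array. The array `a(n, k)` is a hypothetical example of our own: viewing $g_n(k) = a(n, k)$ as functions on $\mathbb{N}$ with the counting measure, each $g_n$ is integrable but not non-negative, and the two orders of summation disagree.

```python
def a(n, k):
    """Hypothetical signed array: +1 on the diagonal, -1 just right of it."""
    if k == n:
        return 1
    if k == n + 1:
        return -1
    return 0

def row_sum(n):
    # Sum over k for fixed n; only k = n and k = n + 1 contribute.
    return sum(a(n, k) for k in range(1, n + 2))

def col_sum(k):
    # Sum over n for fixed k; only n = k - 1 and n = k contribute.
    return sum(a(n, k) for n in range(1, k + 1))

N = 50
rows_first = sum(row_sum(n) for n in range(1, N + 1))  # sum_n (sum_k a(n, k))
cols_first = sum(col_sum(k) for k in range(1, N + 1))  # sum_k (sum_n a(n, k))
print(rows_first, cols_first)  # 0 1
```

Every row sums to 0, while the first column sums to 1 and all other columns to 0, so the iterated sums are 0 and 1 for every truncation level `N`. The inner sums are computed exactly because each row and column has only finitely many non-zero entries.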