5 Metric spaces
IB Analysis II
5.6 The contraction mapping theorem
If you have already taken IB Metric and Topological Spaces, then you were probably bored by the above sections, since you’ve already met them all. Finally, we get to something new. This section consists of just two theorems. The first is the contraction mapping theorem, and we will use it to prove the Picard-Lindelöf existence theorem. Later, we will prove the inverse function theorem using the contraction mapping theorem. All of these are really powerful and important theorems in analysis. They have many more applications and useful corollaries, but we do not have time to get into those.
Definition (Contraction mapping). Let (X, d) be a metric space. A mapping f : X → X is a contraction if there exists some λ with 0 ≤ λ < 1 such that

d(f(x), f(y)) ≤ λd(x, y)

for all x, y ∈ X.
Note that a contraction mapping is by definition Lipschitz and hence (uniformly) continuous.
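As a quick concrete check of the definition, consider the map f(x) = (x + 2/x)/2 on X = [1, 2] (an illustrative choice, the Babylonian iteration for √2): since |f′(x)| = |1/2 − 1/x²| ≤ 1/2 on [1, 2], the mean value theorem makes it a contraction with λ = 1/2. A minimal Python sketch testing the inequality on random pairs:

    import numpy as np

    # Check d(f(x), f(y)) <= lambda * d(x, y) for the illustrative map
    # f(x) = (x + 2/x)/2 on [1, 2], which should satisfy it with lambda = 1/2.
    rng = np.random.default_rng(0)
    f = lambda x: (x + 2 / x) / 2
    x, y = rng.uniform(1, 2, 10_000), rng.uniform(1, 2, 10_000)
    print(np.all(np.abs(f(x) - f(y)) <= 0.5 * np.abs(x - y)))  # True

Note also that f maps [1, 2] into itself (its image is [√2, 3/2]), so it is a genuine map X → X.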
Theorem (Contraction mapping theorem). Let X be a (non-empty) complete metric space. If f : X → X is a contraction, then f has a unique fixed point, i.e. there is a unique x such that f(x) = x.

Moreover, if f : X → X is a function such that f^(m) : X → X (i.e. f composed with itself m times) is a contraction for some m, then f has a unique fixed point.
We can see finding fixed points as the process of solving equations. One
important application we will have is to use this to solve differential equations.
Note that the theorem is false if we drop the completeness assumption. For example, f : (0, 1) → (0, 1) defined by f(x) = x/2 is clearly a contraction with no fixed point. The theorem is also false if we drop the assumption λ < 1. In fact, it is not enough to assume d(f(x), f(y)) < d(x, y) for all x ≠ y. A counterexample is to be found on example sheet 3.
Proof. We first focus on the case where f itself is a contraction.
Uniqueness is straightforward. By assumption, there is some 0 ≤ λ < 1 such that
d(f(x), f(y)) ≤ λd(x, y)
for all x, y ∈ X. If x and y are both fixed points, then this says
d(x, y) = d(f(x), f(y)) ≤ λd(x, y).
This is possible only if d(x, y) = 0, i.e. x = y.
To prove existence, the idea is to pick a point x_0 and keep applying f. Let x_0 ∈ X. We define the sequence (x_n) inductively by

x_{n+1} = f(x_n).
We first show that this is Cauchy. For any n ≥ 1, we can compute
d(x_{n+1}, x_n) = d(f(x_n), f(x_{n−1})) ≤ λd(x_n, x_{n−1}) ≤ λ^n d(x_1, x_0).
Since this is true for any n, for m > n, we have
d(x_m, x_n) ≤ d(x_m, x_{m−1}) + d(x_{m−1}, x_{m−2}) + ··· + d(x_{n+1}, x_n)
           = Σ_{j=n}^{m−1} d(x_{j+1}, x_j)
           ≤ Σ_{j=n}^{m−1} λ^j d(x_1, x_0)
           ≤ d(x_1, x_0) Σ_{j=n}^{∞} λ^j
           = (λ^n / (1 − λ)) d(x_1, x_0).
Note that we have again used the property that λ < 1.
This implies d(x_m, x_n) → 0 as m, n → ∞. So this sequence is Cauchy. By the completeness of X, there exists some x ∈ X such that x_n → x. Since f is a contraction, it is continuous. So f(x_n) → f(x). However, by definition f(x_n) = x_{n+1}. So taking the limit on both sides, we get f(x) = x. So x is a fixed point.
Now suppose that f^(m) is a contraction for some m. Hence by the first part, there is a unique x ∈ X such that f^(m)(x) = x. But then

f^(m)(f(x)) = f^(m+1)(x) = f(f^(m)(x)) = f(x).
So f(x) is also a fixed point of f^(m). By uniqueness of fixed points, we must have f(x) = x. Since any fixed point of f is clearly a fixed point of f^(m) as well, it follows that x is the unique fixed point of f.
Based on the proof of the theorem, we have the following error estimate in the contraction mapping theorem: for x_0 ∈ X and x_n = f(x_{n−1}), we showed that for m > n, we have

d(x_m, x_n) ≤ (λ^n / (1 − λ)) d(x_1, x_0).

If x_n → x, taking the limit of the above bound as m → ∞ gives

d(x, x_n) ≤ (λ^n / (1 − λ)) d(x_1, x_0).
This is valid for all n.
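This a priori estimate can be checked numerically. A minimal sketch, again with the illustrative contraction f(x) = (x + 2/x)/2 on [1, 2] (λ = 1/2, fixed point √2):

    import math

    # Iterate x_{n+1} = f(x_n) from x_0 = 2 and compare the true error
    # with the bound lambda^n / (1 - lambda) * d(x_1, x_0).
    f, lam = lambda x: (x + 2 / x) / 2, 0.5
    x0 = 2.0
    x, d10 = x0, abs(f(x0) - x0)              # d(x_1, x_0) = 0.5
    for n in range(6):
        bound = lam ** n / (1 - lam) * d10    # here: 2^(-n)
        print(n, abs(x - math.sqrt(2)), bound)
        x = f(x)

The printed error is always below the bound; in fact it shrinks much faster, since this particular iteration happens to be Newton's method for √2, while the estimate only uses the contraction constant.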
We are now going to use this to obtain the Picard-Lindelöf existence theorem for ordinary differential equations. The objective is as follows. Suppose we are given a function

F = (F_1, F_2, ··· , F_n) : ℝ × ℝ^n → ℝ^n.

We interpret the ℝ as time and the ℝ^n as space.
Given t_0 ∈ ℝ and x_0 ∈ ℝ^n, we want to know when we can find a solution to the ODE

df/dt = F(t, f(t))

subject to f(t_0) = x_0. We would like this solution to be valid (at least) for all t in some interval I containing t_0.
More explicitly, we want to understand when there will be some ε > 0 and a differentiable function f = (f_1, ··· , f_n) : (t_0 − ε, t_0 + ε) → ℝ^n (i.e. f_j : (t_0 − ε, t_0 + ε) → ℝ is differentiable for all j) satisfying

df_j/dt = F_j(t, f_1(t), ··· , f_n(t))

such that f_j(t_0) = x_0^(j) for all j = 1, . . . , n and t ∈ (t_0 − ε, t_0 + ε).
We can imagine this scenario as a particle moving in ℝ^n, passing through x_0 at time t_0. We then ask if there is a trajectory f(t) such that the velocity of the particle at any time t is given by F(t, f(t)).
This is a complicated system, since it is a coupled system of many variables. Explicit solutions are usually impossible, but in certain cases, we can prove the existence of a solution. Of course, solutions need not exist for arbitrary F. For example, there will be no solution if F is everywhere discontinuous, since any derivative is continuous in a dense set of points. The Picard-Lindelöf existence theorem gives us sufficient conditions for a unique solution to exist.
We will need the following notation.

Notation. For x_0 ∈ ℝ^n and R > 0, we let

B_R(x_0) = {x ∈ ℝ^n : ∥x − x_0∥_2 ≤ R}.
Then the theorem says
Theorem (Picard-Lindelöf existence theorem). Let x_0 ∈ ℝ^n, R > 0, a < b, t_0 ∈ [a, b]. Let F : [a, b] × B_R(x_0) → ℝ^n be a continuous function satisfying

∥F(t, x) − F(t, y)∥_2 ≤ κ∥x − y∥_2

for some fixed κ > 0 and all t ∈ [a, b], x, y ∈ B_R(x_0). In other words, F(t, ·) : ℝ^n → ℝ^n is Lipschitz on B_R(x_0) with the same Lipschitz constant for every t.
Then

(i) There exists an ε > 0 and a unique differentiable function f : [t_0 − ε, t_0 + ε] ∩ [a, b] → ℝ^n such that

    df/dt = F(t, f(t))    (∗)

    and f(t_0) = x_0.
(ii) If

    sup_{[a,b]×B_R(x_0)} ∥F∥_2 ≤ R / (b − a),

    then there exists a unique differentiable function f : [a, b] → ℝ^n that satisfies the differential equation and boundary conditions above.
Even n = 1 is an important, special, non-trivial case. Even if we have only one dimension, explicit solutions may be very difficult to find, if not impossible. For example,

df/dt = f² + sin f + e^f
would be almost impossible to solve. However, the theorem tells us there will be
a solution, at least locally.
Note that any differentiable f satisfying the differential equation is automatically continuously differentiable, since the derivative is F(t, f(t)), which is continuous.
Before we prove the theorem, we first show the requirements are indeed necessary. We first look at the ε in (i). Without the additional requirement in (ii), there might not exist a solution globally on [a, b]. For example, we can consider the n = 1 case, where we want to solve

df/dt = f²,

with boundary condition f(0) = 1. Our F(t, f) = f² is a nice, uniformly Lipschitz function on any [0, b] × B_R(1) = [0, b] × [1 − R, 1 + R]. However, we will shortly see that there is no global solution.
If we assume f ≠ 0, then for all t ∈ [0, b], the equation is equivalent to

d/dt (t + f^{−1}) = 0.

So we need t + f^{−1} to be constant. The initial condition tells us this constant is 1. So we have

f(t) = 1/(1 − t).

Hence the solution on [0, 1) is 1/(1 − t). Any solution on [0, b] must agree with this on [0, 1). So if b ≥ 1, then there is no solution on [0, b].
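This blow-up is easy to see numerically. A minimal sketch using forward Euler (the step size and cutoff are illustrative choices):

    # Integrate f' = f^2, f(0) = 1 with forward Euler; the exact solution
    # 1/(1 - t) blows up at t = 1, and the numerical solution explodes
    # at essentially the same place.
    h, t, f = 1e-4, 0.0, 1.0
    while f < 1e6 and t < 1.05:
        f += h * f * f
        t += h
    print(t, f)   # exits with f >= 1e6 at some t very close to 1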
The Lipschitz condition is also necessary to guarantee uniqueness. Without this condition, existence of a solution is still guaranteed (though by another theorem, the Cauchy-Peano theorem), but we could have many different solutions. For example, we can consider the differential equation

df/dt = √|f|

with f(0) = 0. Here F(t, x) = √|x| is not Lipschitz near x = 0. It is easy to see that both f = 0 and f(t) = t²/4 are solutions. In fact, for any α ∈ [0, b], the function

f_α(t) = { 0              0 ≤ t ≤ α
         { (t − α)²/4     α ≤ t ≤ b
is also a solution. So we have an infinite number of solutions.
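Both explicit solutions are easy to verify on a grid. A minimal sketch (the grid is an illustrative choice):

    import numpy as np

    # Check that f = 0 and f(t) = t^2/4 both satisfy f' = sqrt(|f|) with
    # f(0) = 0, by comparing the known derivatives with sqrt(|f|).
    t = np.linspace(0.0, 1.0, 11)
    for f, df in [(np.zeros_like(t), np.zeros_like(t)),
                  (t ** 2 / 4, t / 2)]:
        print(np.allclose(df, np.sqrt(np.abs(f))))   # True, True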
We are now going to use the contraction mapping theorem to prove this. In
general, this is a very useful idea. It is in fact possible to use other fixed point
theorems to show the existence of solutions to partial differential equations. This
is much more difficult, but has many far-reaching and important applications to
theoretical physics and geometry, say. For these, see Part III courses.
Proof. First, note that (ii) implies (i). We know that

sup_{[a,b]×B_R(x_0)} ∥F∥

is finite, since ∥F∥ is a continuous function on a compact domain. So we can pick an ε > 0 such that

2ε ≤ R / sup_{[a,b]×B_R(x_0)} ∥F∥.
Then writing [t_0 − ε, t_0 + ε] ∩ [a, b] = [a_1, b_1], we have

sup_{[a_1,b_1]×B_R(x_0)} ∥F∥ ≤ sup_{[a,b]×B_R(x_0)} ∥F∥ ≤ R/(2ε) ≤ R/(b_1 − a_1).
So (ii) implies there is a solution on [t_0 − ε, t_0 + ε] ∩ [a, b]. Hence it suffices to prove (ii).
To apply the contraction mapping theorem, we need to convert this into a fixed point problem. The key is to reformulate the problem as an integral equation. We know that a differentiable f : [a, b] → ℝ^n satisfies the differential equation (∗) if and only if f : [a, b] → B_R(x_0) is continuous and satisfies

f(t) = x_0 + ∫_{t_0}^{t} F(s, f(s)) ds

by the fundamental theorem of calculus. Note that we don't require that f is differentiable, since if a continuous f satisfies this equation, it is automatically differentiable by the fundamental theorem of calculus. This is very helpful, since we can work in the much larger space of continuous functions, where it is easier to find a solution.
We let X = C([a, b], B_R(x_0)). We equip X with the supremum metric

∥g − h∥ = sup_{t∈[a,b]} ∥g(t) − h(t)∥_2.
We see that X is a closed subset of the complete metric space C([a, b], ℝ^n) (again taken with the supremum metric). So X is complete. For every g ∈ X, we define a function T g : [a, b] → ℝ^n by

(T g)(t) = x_0 + ∫_{t_0}^{t} F(s, g(s)) ds.
Our differential equation is thus

f = T f.

So we first want to show that T actually maps X → X, i.e. T g ∈ X whenever g ∈ X, and then prove it is a contraction map.
We have

∥T g(t) − x_0∥_2 = ∥∫_{t_0}^{t} F(s, g(s)) ds∥_2
                 ≤ ∫_{t_0}^{t} ∥F(s, g(s))∥_2 ds
                 ≤ sup_{[a,b]×B_R(x_0)} ∥F∥ · |b − a|
                 ≤ R.

Hence we know that T g(t) ∈ B_R(x_0). So T g ∈ X.
Next, we need to show this is a contraction. However, it turns out T need not be a contraction. Instead, what we have is that for g_1, g_2 ∈ X, we have

∥T g_1(t) − T g_2(t)∥_2 = ∥∫_{t_0}^{t} (F(s, g_1(s)) − F(s, g_2(s))) ds∥_2
                        ≤ ∫_{t_0}^{t} ∥F(s, g_1(s)) − F(s, g_2(s))∥_2 ds
                        ≤ κ(b − a)∥g_1 − g_2∥_∞
by the Lipschitz condition on F. If we indeed have

κ(b − a) < 1,    (†)

then the contraction mapping theorem gives an f ∈ X such that

T f = f,

i.e.

f(t) = x_0 + ∫_{t_0}^{t} F(s, f(s)) ds.
However, we do not necessarily have (†). There are many ways we can solve this problem. Here, we can solve it by finding an m such that T^(m) = T ◦ T ◦ ··· ◦ T : X → X is a contraction map. We will in fact show that this map satisfies the bound
sup_{t∈[a,b]} ∥T^(m) g_1(t) − T^(m) g_2(t)∥ ≤ ((b − a)^m κ^m / m!) sup_{t∈[a,b]} ∥g_1(t) − g_2(t)∥.    (‡)
The key is the m!, since this grows much faster than any exponential. Given this bound, we know that for sufficiently large m, we have

(b − a)^m κ^m / m! < 1,
i.e. T^(m) is a contraction. So by the contraction mapping theorem, the result holds.
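For concreteness, take the illustrative value κ(b − a) = 10 and watch the factorial win:

    import math

    # (kappa * (b - a))^m / m! for kappa * (b - a) = 10 (an assumed value).
    for m in (5, 10, 20, 25, 30):
        print(m, 10.0 ** m / math.factorial(m))
    # 20 -> ~41.1, 25 -> ~0.64, 30 -> ~0.0038

So in this case T^(25) would already be a contraction.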
So it only remains to prove the bound. To prove this, we prove instead the
pointwise bound: for any t ∈ [a, b], we have
∥T^(m) g_1(t) − T^(m) g_2(t)∥_2 ≤ (|t − t_0|^m κ^m / m!) sup_{s∈[t_0,t]} ∥g_1(s) − g_2(s)∥.
From this, taking the supremum on the left, we obtain the bound (‡).
To prove this pointwise bound, we induct on m. We wlog assume t > t_0. We know that for every m, the difference is given by
∥T^(m) g_1(t) − T^(m) g_2(t)∥_2 = ∥∫_{t_0}^{t} (F(s, T^(m−1) g_1(s)) − F(s, T^(m−1) g_2(s))) ds∥_2
                                ≤ κ ∫_{t_0}^{t} ∥T^(m−1) g_1(s) − T^(m−1) g_2(s)∥_2 ds.
This is true for all m. If m = 1, then this gives
∥T g_1(t) − T g_2(t)∥ ≤ κ(t − t_0) sup_{[t_0,t]} ∥g_1 − g_2∥_2.
So the base case is done.
For m ≥ 2, assume by induction the bound holds with m − 1 in place of m. Then the bounds give
∥T^(m) g_1(t) − T^(m) g_2(t)∥ ≤ κ ∫_{t_0}^{t} (κ^{m−1} (s − t_0)^{m−1} / (m − 1)!) sup_{[t_0,s]} ∥g_1 − g_2∥_2 ds
                              ≤ (κ^m / (m − 1)!) sup_{[t_0,t]} ∥g_1 − g_2∥_2 ∫_{t_0}^{t} (s − t_0)^{m−1} ds
                              = (κ^m (t − t_0)^m / m!) sup_{[t_0,t]} ∥g_1 − g_2∥_2.
So done.
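The proof is constructive: the Picard iterates g, T g, T^(2) g, . . . converge to the solution in the supremum metric. A minimal numerical sketch, assuming the concrete example F(t, x) = x with t_0 = 0, x_0 = 1 on [a, b] = [0, 0.5] (exact solution e^t); the grid size and tolerance are illustrative choices:

    import numpy as np

    t0, x0 = 0.0, 1.0
    t = np.linspace(t0, 0.5, 501)             # [a, b] = [0, 0.5]

    def F(s, x):
        return x                               # Lipschitz with kappa = 1

    g = np.full_like(t, x0)                    # initial guess g ≡ x0
    for k in range(50):
        integrand = F(t, g)
        # (T g)(t) = x0 + integral from t0 to t of F(s, g(s)) ds,
        # here a cumulative trapezoidal sum on the grid.
        Tg = x0 + np.concatenate(([0.0], np.cumsum(
            (integrand[1:] + integrand[:-1]) / 2 * np.diff(t))))
        if np.max(np.abs(Tg - g)) < 1e-12:     # sup-metric distance
            break
        g = Tg

    print(np.max(np.abs(g - np.exp(t))))       # small: limited by the grid

Here κ(b − a) = 0.5 < 1, so T itself is already a contraction and the iterates converge geometrically; in general one would rely on T^(m) as above.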
Note that to get the factor of m!, we had to actually perform the integral, instead of just bounding (s − t_0)^{m−1} by (t − t_0)^{m−1}. In general, this is a good strategy if we want tight bounds. Instead of bounding
|∫_a^b f(x) dx| ≤ (b − a) sup |f(x)|,
we write f(x) = g(x)h(x), where h(x) is something easily integrable. Then we can have a bound
|∫_a^b f(x) dx| ≤ sup |g(x)| ∫_a^b |h(x)| dx.