5.6 The contraction mapping theorem
If you have already taken IB Metric and Topological Spaces, then you were probably bored by the above sections, since you've already met them all. Finally, we get to something new. This section consists of just two theorems. The first is the contraction mapping theorem, and we will use it to prove the Picard-Lindelöf existence theorem. Later, we will prove the inverse function theorem using the contraction mapping theorem. All of these are really powerful and important theorems in analysis. They have many more applications and useful corollaries, but we do not have time to get into those.
Definition (Contraction mapping). Let $(X, d)$ be a metric space. A mapping $f: X \to X$ is a contraction if there exists some $\lambda$ with $0 \le \lambda < 1$ such that
\[
  d(f(x), f(y)) \le \lambda d(x, y)
\]
for all $x, y \in X$.
Note that a contraction mapping is by definition Lipschitz and hence (uniformly) continuous.
Theorem (Contraction mapping theorem). Let $X$ be a (non-empty) complete metric space. If $f: X \to X$ is a contraction, then $f$ has a unique fixed point, i.e. there is a unique $x$ such that $f(x) = x$.

Moreover, if $f: X \to X$ is a function such that $f^{(m)}: X \to X$ (i.e. $f$ composed with itself $m$ times) is a contraction for some $m$, then $f$ has a unique fixed point.
We can see finding fixed points as the process of solving equations. One
important application we will have is to use this to solve differential equations.
Note that the theorem is false if we drop the completeness assumption. For example, $f: (0, 1) \to (0, 1)$ defined by $x \mapsto \frac{x}{2}$ is clearly a contraction with no fixed point. The theorem is also false if we drop the assumption $\lambda < 1$. In fact, it is not enough to assume $d(f(x), f(y)) < d(x, y)$ for all $x \neq y$. A counterexample is to be found on example sheet 3.
Proof. We first focus on the case where f itself is a contraction.
Uniqueness is straightforward. By assumption, there is some $0 \le \lambda < 1$ such that
\[
  d(f(x), f(y)) \le \lambda d(x, y)
\]
for all $x, y \in X$. If $x$ and $y$ are both fixed points, then this says
\[
  d(x, y) = d(f(x), f(y)) \le \lambda d(x, y).
\]
This is possible only if $d(x, y) = 0$, i.e. $x = y$.
To prove existence, the idea is to pick a point $x_0$ and keep applying $f$. Let $x_0 \in X$. We define the sequence $(x_n)$ inductively by
\[
  x_{n + 1} = f(x_n).
\]
We first show that this is Cauchy. For any $n \ge 1$, we can compute
\[
  d(x_{n + 1}, x_n) = d(f(x_n), f(x_{n - 1})) \le \lambda d(x_n, x_{n - 1}) \le \lambda^n d(x_1, x_0).
\]
Since this is true for any $n$, for $m > n$, we have
\begin{align*}
  d(x_m, x_n) &\le d(x_m, x_{m - 1}) + d(x_{m - 1}, x_{m - 2}) + \cdots + d(x_{n + 1}, x_n)\\
  &= \sum_{j = n}^{m - 1} d(x_{j + 1}, x_j)\\
  &\le \sum_{j = n}^{m - 1} \lambda^j d(x_1, x_0)\\
  &\le d(x_1, x_0) \sum_{j = n}^{\infty} \lambda^j\\
  &= \frac{\lambda^n}{1 - \lambda} d(x_1, x_0).
\end{align*}
Note that we have again used the property that λ < 1.
This implies $d(x_m, x_n) \to 0$ as $m, n \to \infty$. So this sequence is Cauchy. By the completeness of $X$, there exists some $x \in X$ such that $x_n \to x$. Since $f$ is a contraction, it is continuous. So $f(x_n) \to f(x)$. However, by definition $f(x_n) = x_{n + 1}$. So taking the limit on both sides, we get $f(x) = x$. So $x$ is a fixed point.
Now suppose that $f^{(m)}$ is a contraction for some $m$. Hence by the first part, there is a unique $x \in X$ such that $f^{(m)}(x) = x$. But then
\[
  f^{(m)}(f(x)) = f^{(m + 1)}(x) = f(f^{(m)}(x)) = f(x).
\]
So $f(x)$ is also a fixed point of $f^{(m)}$. By uniqueness of fixed points, we must have $f(x) = x$. Since any fixed point of $f$ is clearly a fixed point of $f^{(m)}$ as well, it follows that $x$ is the unique fixed point of $f$.
Based on the proof of the theorem, we have the following error estimate in the contraction mapping theorem: for $x_0 \in X$ and $x_n = f(x_{n - 1})$, we showed that for $m > n$, we have
\[
  d(x_m, x_n) \le \frac{\lambda^n}{1 - \lambda} d(x_1, x_0).
\]
If $x_n \to x$, taking the limit of the above bound as $m \to \infty$ gives
\[
  d(x, x_n) \le \frac{\lambda^n}{1 - \lambda} d(x_1, x_0).
\]
This is valid for all $n$.
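Since the proof is entirely constructive, it translates directly into an algorithm: iterate $f$ and stop once the error estimate above falls below a desired tolerance. Here is a minimal sketch in Python; the example map $f(x) = \cos x$ (a contraction on $[0, 1]$ with $\lambda = \sin 1 < 1$) and the tolerance are illustrative choices, not part of the notes.

```python
import math

def fixed_point(f, x0, lam, tol=1e-10, max_iter=1000):
    """Iterate x_{n+1} = f(x_n) until the a priori estimate
    d(x, x_n) <= lam^n / (1 - lam) * d(x_1, x_0) is below tol.
    Assumes f is a contraction with constant lam < 1."""
    x = f(x0)
    d10 = abs(x - x0)            # d(x_1, x_0)
    for n in range(1, max_iter + 1):
        if lam**n / (1 - lam) * d10 <= tol:
            return x, n          # x = x_n is provably within tol of the fixed point
        x = f(x)
    return x, max_iter

# cos maps [0, 1] into itself and |cos'| <= sin(1) < 1 there.
x, n = fixed_point(math.cos, 0.5, math.sin(1))
print(x, n)   # ~0.7390851, the unique solution of cos x = x
```

Note that the stopping rule uses only computable quantities ($\lambda$ and $d(x_1, x_0)$), which is exactly what makes the a priori error estimate useful in practice.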
We are now going to use this to obtain the Picard-Lindelöf existence theorem for ordinary differential equations. The objective is as follows. Suppose we are given a function
\[
  F = (F_1, F_2, \cdots, F_n): \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n.
\]
We interpret the $\mathbb{R}$ as time and the $\mathbb{R}^n$ as space.
Given $t_0 \in \mathbb{R}$ and $x_0 \in \mathbb{R}^n$, we want to know when we can find a solution to the ODE
\[
  \frac{\mathrm{d} f}{\mathrm{d} t} = F(t, f(t))
\]
subject to $f(t_0) = x_0$. We would like this solution to be valid (at least) for all $t$ in some interval $I$ containing $t_0$.
More explicitly, we want to understand when there will be some $\varepsilon > 0$ and a differentiable function $f = (f_1, \cdots, f_n): (t_0 - \varepsilon, t_0 + \varepsilon) \to \mathbb{R}^n$ (i.e. $f_j: (t_0 - \varepsilon, t_0 + \varepsilon) \to \mathbb{R}$ is differentiable for all $j$) satisfying
\[
  \frac{\mathrm{d} f_j}{\mathrm{d} t} = F_j(t, f_1(t), \cdots, f_n(t))
\]
such that $f_j(t_0) = x_0^{(j)}$ for all $j = 1, \ldots, n$ and $t \in (t_0 - \varepsilon, t_0 + \varepsilon)$.
We can imagine this scenario as a particle moving in $\mathbb{R}^n$, passing through $x_0$ at time $t_0$. We then ask if there is a trajectory $f(t)$ such that the velocity of the particle at any time $t$ is given by $F(t, f(t))$.
This is a complicated system, since it is a coupled system of many variables. Explicit solutions are usually impossible, but in certain cases, we can prove the existence of a solution. Of course, solutions need not exist for arbitrary $F$. For example, there will be no solution if $F$ is everywhere discontinuous, since any derivative is continuous on a dense set of points. The Picard-Lindelöf existence theorem gives us sufficient conditions for a unique solution to exist.
We will need the following notation:

Notation. For $x_0 \in \mathbb{R}^n$ and $R > 0$, we let
\[
  B_R(x_0) = \{x \in \mathbb{R}^n: \|x - x_0\|_2 \le R\}.
\]
Then the theorem says:

Theorem (Picard-Lindelöf existence theorem). Let $x_0 \in \mathbb{R}^n$, $R > 0$, $a < b$, $t_0 \in [a, b]$. Let $F: [a, b] \times B_R(x_0) \to \mathbb{R}^n$ be a continuous function satisfying
\[
  \|F(t, x) - F(t, y)\|_2 \le \kappa \|x - y\|_2
\]
for some fixed $\kappa > 0$ and all $t \in [a, b]$, $x, y \in B_R(x_0)$. In other words, $F(t, \cdot): \mathbb{R}^n \to \mathbb{R}^n$ is Lipschitz on $B_R(x_0)$ with the same Lipschitz constant for every $t$.
Then

(i) There exists an $\varepsilon > 0$ and a unique differentiable function $f: [t_0 - \varepsilon, t_0 + \varepsilon] \cap [a, b] \to \mathbb{R}^n$ such that
\[
  \frac{\mathrm{d} f}{\mathrm{d} t} = F(t, f(t)) \tag{$*$}
\]
and $f(t_0) = x_0$.
(ii) If
\[
  \sup_{[a, b] \times B_R(x_0)} \|F\|_2 \le \frac{R}{b - a},
\]
then there exists a unique differentiable function $f: [a, b] \to \mathbb{R}^n$ that satisfies the differential equation and boundary conditions above.
Even $n = 1$ is an important, special, non-trivial case. Even if we have only one dimension, explicit solutions may be very difficult to find, if not impossible. For example,
\[
  \frac{\mathrm{d} f}{\mathrm{d} t} = f^2 + \sin f + e^f
\]
would be almost impossible to solve. However, the theorem tells us there will be a solution, at least locally.
Note that any differentiable $f$ satisfying the differential equation is automatically continuously differentiable, since the derivative is $F(t, f(t))$, which is continuous.
Before we prove the theorem, we first show the requirements are indeed necessary. We first look at the $\varepsilon$ in (i). Without the additional requirement in (ii), there might not exist a solution globally on $[a, b]$. For example, we can consider the $n = 1$ case, where we want to solve
\[
  \frac{\mathrm{d} f}{\mathrm{d} t} = f^2,
\]
with boundary condition $f(0) = 1$. Our $F(t, f) = f^2$ is a nice, uniformly Lipschitz function on any $[0, b] \times B_R(1) = [0, b] \times [1 - R, 1 + R]$. However, we will shortly see that there is no global solution.
If we assume $f \neq 0$, then for all $t \in [0, b]$, the equation is equivalent to
\[
  \frac{\mathrm{d}}{\mathrm{d} t}(t + f^{-1}) = 0.
\]
So we need $t + f^{-1}$ to be constant. The initial condition tells us this constant is $1$. So we have
\[
  f(t) = \frac{1}{1 - t}.
\]
Hence the solution on $[0, 1)$ is $\frac{1}{1 - t}$. Any solution on $[0, b]$ must agree with this on $[0, 1)$. So if $b \ge 1$, then there is no solution on $[0, b]$.
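We can also watch this blow-up numerically. The following sketch uses plain forward Euler steps (the step size and the stopping time just short of $t = 1$ are arbitrary illustrative choices) to integrate $f' = f^2$ from $f(0) = 1$ and compare against the exact solution $\frac{1}{1 - t}$:

```python
def euler(F, t0, x0, h, t_end):
    """Forward Euler for df/dt = F(t, f); purely illustrative."""
    t, x, out = t0, x0, [(t0, x0)]
    while t < t_end:
        x += h * F(t, x)
        t += h
        out.append((t, x))
    return out

# df/dt = f^2, f(0) = 1; the exact solution 1/(1 - t) blows up at t = 1.
traj = euler(lambda t, f: f**2, 0.0, 1.0, 1e-4, 0.999)
for t, x in traj[::2000] + [traj[-1]]:
    print(f"t = {t:.3f}   numeric = {x:10.2f}   exact = {1/(1 - t):10.2f}")
```

The numerical iterates grow without bound as $t \to 1$, consistent with the fact that no solution can exist past $t = 1$.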
The Lipschitz condition is also necessary to guarantee uniqueness. Without this condition, existence of a solution is still guaranteed (but this is another theorem, the Cauchy-Peano theorem), but we could have many different solutions. For example, we can consider the differential equation
\[
  \frac{\mathrm{d} f}{\mathrm{d} t} = \sqrt{|f|}
\]
with $f(0) = 0$. Here $F(t, x) = \sqrt{|x|}$ is not Lipschitz near $x = 0$. It is easy to see that $f = 0$ and $f(t) = \frac{1}{4}t^2$ are both solutions. In fact, for any $\alpha \in [0, b]$, the function
\[
  f_\alpha(t) =
  \begin{cases}
    0 & 0 \le t \le \alpha\\
    \frac{1}{4}(t - \alpha)^2 & \alpha \le t \le b
  \end{cases}
\]
is also a solution. So we have an infinite number of solutions.
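To check that each $f_\alpha$ really solves the equation, it suffices to verify the non-zero piece (the two pieces match to first order at $t = \alpha$). A quick symbolic check of that piece, using sympy purely as an illustration:

```python
import sympy as sp

# On t >= alpha, substitute s = t - alpha >= 0, so f = s^2 / 4.
s = sp.symbols('s', nonnegative=True)
f = s**2 / 4

# f' - sqrt(|f|) should vanish identically for s >= 0.
print(sp.simplify(sp.diff(f, s) - sp.sqrt(sp.Abs(f))))  # prints 0
```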
We are now going to use the contraction mapping theorem to prove this. In
general, this is a very useful idea. It is in fact possible to use other fixed point
theorems to show the existence of solutions to partial differential equations. This
is much more difficult, but has many far-reaching important applications to
theoretical physics and geometry, say. For these, see Part III courses.
Proof. First, note that (ii) implies (i). We know that
\[
  \sup_{[a, b] \times B_R(x_0)} \|F\|_2
\]
is finite, since $F$ is a continuous function on a compact domain. So we can pick an $\varepsilon > 0$ such that
\[
  2\varepsilon \le \frac{R}{\sup_{[a, b] \times B_R(x_0)} \|F\|_2}.
\]
Then, writing $[t_0 - \varepsilon, t_0 + \varepsilon] \cap [a, b] = [a_1, b_1]$, we have
\[
  \sup_{[a_1, b_1] \times B_R(x_0)} \|F\|_2 \le \sup_{[a, b] \times B_R(x_0)} \|F\|_2 \le \frac{R}{2\varepsilon} \le \frac{R}{b_1 - a_1}.
\]
So (ii) implies there is a solution on $[t_0 - \varepsilon, t_0 + \varepsilon] \cap [a, b]$. Hence it suffices to prove (ii).
To apply the contraction mapping theorem, we need to convert this into a fixed point problem. The key is to reformulate the problem as an integral equation. We know that a differentiable $f: [a, b] \to \mathbb{R}^n$ satisfies the differential equation $(*)$ if and only if $f: [a, b] \to B_R(x_0)$ is continuous and satisfies
\[
  f(t) = x_0 + \int_{t_0}^t F(s, f(s))\;\mathrm{d}s
\]
by the fundamental theorem of calculus. Note that we don't require $f$ to be differentiable, since if a continuous $f$ satisfies this equation, it is automatically differentiable by the fundamental theorem of calculus. This is very helpful, since we can work over the much larger vector space of continuous functions, and it would be easier to find a solution.
We let $X = C([a, b], B_R(x_0))$. We equip $X$ with the supremum metric
\[
  \|g - h\| = \sup_{t \in [a, b]} \|g(t) - h(t)\|_2.
\]
We see that $X$ is a closed subset of the complete metric space $C([a, b], \mathbb{R}^n)$ (again taken with the supremum metric). So $X$ is complete. For every $g \in X$, we define a function $Tg: [a, b] \to \mathbb{R}^n$ by
\[
  (Tg)(t) = x_0 + \int_{t_0}^t F(s, g(s))\;\mathrm{d}s.
\]
Our differential equation is thus the fixed point equation
\[
  f = Tf.
\]
So we first want to show that $T$ actually maps $X$ to $X$, i.e. $Tg \in X$ whenever $g \in X$, and then prove it is a contraction map.
We have
\begin{align*}
  \|Tg(t) - x_0\|_2 &= \left\|\int_{t_0}^t F(s, g(s))\;\mathrm{d}s\right\|_2\\
  &\le \left|\int_{t_0}^t \|F(s, g(s))\|_2\;\mathrm{d}s\right|\\
  &\le \sup_{[a, b] \times B_R(x_0)} \|F\|_2 \cdot |b - a|\\
  &\le R.
\end{align*}
Hence we know that $Tg(t) \in B_R(x_0)$. So $Tg \in X$.
Next, we need to show this is a contraction. However, it turns out $T$ need not be a contraction. Instead, what we have is that for $g_1, g_2 \in X$,
\begin{align*}
  \|Tg_1(t) - Tg_2(t)\|_2 &= \left\|\int_{t_0}^t F(s, g_1(s)) - F(s, g_2(s))\;\mathrm{d}s\right\|_2\\
  &\le \left|\int_{t_0}^t \|F(s, g_1(s)) - F(s, g_2(s))\|_2\;\mathrm{d}s\right|\\
  &\le \kappa(b - a)\|g_1 - g_2\|
\end{align*}
by the Lipschitz condition on $F$. If we indeed have
\[
  \kappa(b - a) < 1, \tag{$\dagger$}
\]
then the contraction mapping theorem gives an $f \in X$ such that
\[
  Tf = f,
\]
i.e.
\[
  f(t) = x_0 + \int_{t_0}^t F(s, f(s))\;\mathrm{d}s.
\]
However, we do not necessarily have $(\dagger)$. There are many ways we can solve this problem. Here, we can solve it by finding an $m$ such that $T^{(m)} = T \circ T \circ \cdots \circ T: X \to X$ is a contraction map. We will in fact show that this map satisfies the bound
\[
  \sup_{t \in [a, b]} \|T^{(m)} g_1(t) - T^{(m)} g_2(t)\|_2 \le \frac{(b - a)^m \kappa^m}{m!} \sup_{t \in [a, b]} \|g_1(t) - g_2(t)\|_2. \tag{$\ddagger$}
\]
The key is the $m!$, since this grows much faster than any exponential. Given this bound, we know that for sufficiently large $m$, we have
\[
  \frac{(b - a)^m \kappa^m}{m!} < 1,
\]
i.e. $T^{(m)}$ is a contraction. So by the contraction mapping theorem, the result holds.
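To get a feel for how quickly the factorial wins, we can tabulate the constant $\frac{((b - a)\kappa)^m}{m!}$; the value $(b - a)\kappa = 10$ below is a hypothetical choice, purely for illustration:

```python
from math import factorial

c = 10.0   # hypothetical value of (b - a) * kappa
for m in (1, 10, 20, 30, 40):
    print(m, c**m / factorial(m))
```

Even though the constant first grows to around $2756$ at $m = 10$, it drops below $1$ by $m = 30$ and is of order $10^{-8}$ at $m = 40$.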
So it only remains to prove the bound. To prove this, we prove instead the pointwise bound: for any $t \in [a, b]$, we have
\[
  \|T^{(m)} g_1(t) - T^{(m)} g_2(t)\|_2 \le \frac{|t - t_0|^m \kappa^m}{m!} \sup_{s \in [t_0, t]} \|g_1(s) - g_2(s)\|_2.
\]
From this, taking the supremum on the left, we obtain the bound $(\ddagger)$.
To prove this pointwise bound, we induct on $m$. We wlog assume $t > t_0$. We know that for every $m$, the difference is given by
\begin{align*}
  \|T^{(m)} g_1(t) - T^{(m)} g_2(t)\|_2 &= \left\|\int_{t_0}^t F(s, T^{(m - 1)} g_1(s)) - F(s, T^{(m - 1)} g_2(s))\;\mathrm{d}s\right\|_2\\
  &\le \kappa \int_{t_0}^t \|T^{(m - 1)} g_1(s) - T^{(m - 1)} g_2(s)\|_2\;\mathrm{d}s.
\end{align*}
This is true for all $m$. If $m = 1$, then this gives
\[
  \|Tg_1(t) - Tg_2(t)\|_2 \le \kappa(t - t_0) \sup_{[t_0, t]} \|g_1 - g_2\|_2.
\]
So the base case is done.
For $m \ge 2$, assume by induction the bound holds with $m - 1$ in place of $m$. Then the bounds give
\begin{align*}
  \|T^{(m)} g_1(t) - T^{(m)} g_2(t)\|_2 &\le \kappa \int_{t_0}^t \frac{\kappa^{m - 1}(s - t_0)^{m - 1}}{(m - 1)!} \sup_{[t_0, s]} \|g_1 - g_2\|_2\;\mathrm{d}s\\
  &\le \frac{\kappa^m}{(m - 1)!} \sup_{[t_0, t]} \|g_1 - g_2\|_2 \int_{t_0}^t (s - t_0)^{m - 1}\;\mathrm{d}s\\
  &= \frac{\kappa^m (t - t_0)^m}{m!} \sup_{[t_0, t]} \|g_1 - g_2\|_2.
\end{align*}
So done.
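The integral operator $T$ is itself an algorithm: starting from the constant function $g_0 \equiv x_0$ and applying $T$ repeatedly (with the integral approximated numerically) is the classical Picard iteration. Below is a minimal sketch, assuming a uniform grid, trapezoidal quadrature, and the hypothetical test problem $f' = f$, $f(0) = 1$ (exact solution $e^t$):

```python
import numpy as np

def picard(F, t0, x0, a, b, n_grid=200, n_iter=30):
    """Iterate (Tg)(t) = x0 + integral_{t0}^{t} F(s, g(s)) ds on a
    uniform grid, approximating the integral by cumulative trapezoids."""
    t = np.linspace(a, b, n_grid)
    g = np.full_like(t, x0)              # g_0: the constant function x0
    for _ in range(n_iter):
        v = F(t, g)
        # cumulative trapezoid rule from a, then shifted to vanish at t0
        integral = np.concatenate(
            ([0.0], np.cumsum((v[1:] + v[:-1]) / 2 * np.diff(t))))
        integral -= np.interp(t0, t, integral)
        g = x0 + integral                # g_{k+1} = T g_k
    return t, g

# Test problem: f' = f, f(0) = 1 on [0, 1]; exact solution exp(t).
t, g = picard(lambda s, x: x, 0.0, 1.0, 0.0, 1.0)
print(np.max(np.abs(g - np.exp(t))))    # small, up to quadrature error
```

In this example, each application of $T$ reproduces one more term of the Taylor series of $e^t$, mirroring the $\frac{1}{m!}$ factor in the bound above.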
Note that to get the factor of $m!$, we had to actually perform the integral, instead of just bounding $(s - t_0)^{m - 1}$ by $(t - t_0)^{m - 1}$. In general, this is a good strategy if we want tight bounds. Instead of bounding
\[
  \left|\int_a^b f(x)\;\mathrm{d}x\right| \le (b - a) \sup |f(x)|,
\]
we write $f(x) = g(x)h(x)$, where $h(x)$ is something easily integrable. Then we can have a bound
\[
  \left|\int_a^b f(x)\;\mathrm{d}x\right| \le \sup |g(x)| \int_a^b |h(x)|\;\mathrm{d}x.
\]