2 The Cauchy–Kovalevskaya theorem

III Analysis of Partial Differential Equations



2.1 The Cauchy–Kovalevskaya theorem
Before we begin talking about PDEs, let's recall what we already know about
ODEs. Fix some open subset $U \subseteq \mathbb{R}^n$, and assume $f: U \to \mathbb{R}^n$ is given.
Consider the ODE
\[
  \dot{u}(t) = f(u(t)).
\]
This is an autonomous ODE because there is no explicit $t$ dependence on the
right-hand side. This assumption is usually harmless, as we can just increment $n$
and use the new variable to keep track of $t$. Here $u: (a, b) \to U$ is the unknown,
where $a < 0 < b$.
The Cauchy problem for this equation is to find a solution to the ODE
satisfying $u(0) = u_0 \in U$ for any $u_0$.
The Picard–Lindelöf theorem says we can always do so under some mild
conditions.
Theorem (Picard–Lindelöf theorem). Suppose that there exist $r, K > 0$ such
that $B_r(u_0) \subseteq U$, and
\[
  \|f(x) - f(y)\| \leq K \|x - y\|
\]
for all $x, y \in B_r(u_0)$. Then there exists an $\varepsilon > 0$ depending on $K, r$ and a unique
$C^1$ function $u: (-\varepsilon, \varepsilon) \to U$ solving the Cauchy problem.
It is instructive to give a quick proof sketch of the result.
Proof sketch. If $u$ is a solution, then by the fundamental theorem of calculus,
we have
\[
  u(t) = u_0 + \int_0^t f(u(s)) \;\mathrm{d}s.
\]
Conversely, if $u$ is a $C^0$ solution to this integral equation, then it solves the
ODE. Crucially, this only requires $u$ to be $C^0$. Indeed, if $u$ is $C^0$ and satisfies
the integral equation, then $u$ is automatically $C^1$. So we can work in a larger
function space when we seek $u$.
Thus, we have reformulated our initial problem into an integral equation. In
particular, we reformulated it in a way that assumes less about the function. In
the case of PDEs, this is what is known as a weak formulation.
Returning to the proof, we have reformulated our problem as looking for a
fixed point of the map
\[
  B: w \mapsto u_0 + \int_0^t f(w(s)) \;\mathrm{d}s
\]
acting on
\[
  \mathcal{C} = \{ w: [-\varepsilon, \varepsilon] \to B_{r/2}(u_0) : w \text{ is continuous} \}.
\]
This is a complete metric space when we equip it with the supremum norm (in
fact, it is a closed ball in a Banach space).
We then show that for $\varepsilon$ small enough, this map $B: \mathcal{C} \to \mathcal{C}$ is a contraction
map. There are two parts: to show that it actually lands in $\mathcal{C}$, and that it is
a contraction. If we manage to show these, then by the contraction mapping
theorem, there is a unique fixed point, and we are done.
The idea of formulating our problem as a fixed point problem is a powerful
technique that allows us to understand many PDEs, especially non-linear ones.
This theorem tells us that a unique $C^1$ solution exists locally. It is not reasonable
to believe it can exist globally, as we might run out of $U$ in finite time. However,
if $f$ is better behaved, we might expect $u$ to be more regular, and indeed this is
the case. We shall not go into the details.
How can we actually use the theorem in practice? Can we actually obtain
a solution from this? Recall that to prove the contraction mapping theorem,
we arbitrarily pick a point in $\mathcal{C}$, keep applying $B$, and by the
contraction, we must approach the fixed point. This gives us a way to construct
an approximation to the solution of the ODE.
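This iteration is easy to carry out numerically. The sketch below (our own illustration, not part of the notes, with function names of our choosing) applies $B$ repeatedly for the ODE $\dot{u} = u$, $u(0) = 1$, approximating the integral by the trapezoidal rule on a grid; the iterates approach the fixed point $u(t) = e^t$.

```python
import math

def picard_iterate(f, u0, eps, steps=1000, iterations=25):
    """Repeatedly apply B(w)(t) = u0 + int_0^t f(w(s)) ds on a grid on [0, eps]."""
    ts = [eps * i / steps for i in range(steps + 1)]
    w = [u0] * (steps + 1)          # initial guess: the constant function u0
    for _ in range(iterations):
        fw = [f(x) for x in w]
        new = [u0]
        for i in range(steps):      # trapezoidal cumulative integral
            new.append(new[-1] + 0.5 * (fw[i] + fw[i + 1]) * (ts[i + 1] - ts[i]))
        w = new
    return ts, w

# For f(u) = u with u(0) = 1 the fixed point is u(t) = e^t.
ts, w = picard_iterate(lambda u: u, 1.0, 0.5)
print(abs(w[-1] - math.exp(0.5)))   # small: the iterates approach e^t
```

For this particular $f$, each application of $B$ to the constant initial guess adds one more correct term of the Taylor series of $e^t$, which makes the contraction-driven convergence visible directly.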
However, if we were physicists, we would have done things differently.
Suppose $f \in C^\infty$. We can then attempt to construct a Taylor series of the
solution near the origin. First we note that for any solution $u$, we must have
\[
  u(0) = u_0, \quad \dot{u}(0) = f(u_0).
\]
Assuming $u$ is in fact a smooth solution, we can differentiate the ODE and obtain
\[
  \ddot{u}(t) = \frac{\mathrm{d}}{\mathrm{d}t} \dot{u}(t) = \frac{\mathrm{d}}{\mathrm{d}t} f(u(t)) = \mathrm{D}f(u(t))\, \dot{u}(t) \equiv f_2(u(t), \dot{u}(t)).
\]
At the origin, we already know what $u$ and $\dot{u}$ are. We can proceed iteratively to
determine
\[
  u^{(k)}(t) = f_k(u, \dot{u}, \ldots, u^{(k-1)}).
\]
So in particular, we can in principle determine $u_k \equiv u^{(k)}(0)$. At least formally,
we can write
\[
  u(t) = \sum_{k=0}^{\infty} u_k \frac{t^k}{k!}.
\]
If we were physicists, we would say we are done. But being honest mathematicians,
in order to claim that we have a genuine solution, we need to at least
show that this converges. Under suitable circumstances, this is given by the
Cauchy–Kovalevskaya theorem.
Theorem (Cauchy–Kovalevskaya for ODEs). The series
\[
  u(t) = \sum_{k=0}^{\infty} u_k \frac{t^k}{k!}
\]
converges to the Picard–Lindelöf solution of the Cauchy problem if $f$ is real
analytic in a neighbourhood of $u_0$.
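For a concrete instance of this recursion, take $f(u) = u^2$, so the ODE is $\dot{u} = u^2$, whose exact solution is $u(t) = u_0/(1 - u_0 t)$. Writing $u(t) = \sum_k a_k t^k$ (so $a_k = u_k/k!$) and matching coefficients in the ODE gives $(k+1) a_{k+1} = \sum_{j=0}^{k} a_j a_{k-j}$. The following sketch (our own illustration, not part of the notes) computes the coefficients and compares the truncated series with the exact solution:

```python
def series_coeffs(u0, K):
    """Coefficients a_k of u(t) = sum a_k t^k for the ODE u' = u^2, u(0) = u0.

    Matching coefficients in u' = u^2 gives (k+1) a_{k+1} = sum_{j<=k} a_j a_{k-j}.
    """
    a = [u0]
    for k in range(K):
        a.append(sum(a[j] * a[k - j] for j in range(k + 1)) / (k + 1))
    return a

def eval_series(a, t):
    return sum(c * t**k for k, c in enumerate(a))

u0, t = 0.5, 0.5
a = series_coeffs(u0, 40)
exact = u0 / (1 - u0 * t)             # exact solution of u' = u^2
print(eval_series(a, t), exact)       # the two agree to high accuracy
```

Note that the series converges only for $|t| < 1/u_0$: the solution exists only locally, as it blows up at $t = 1/u_0$, matching the local nature of the theorem.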
Recall that being real analytic means being equal to its Taylor series:
Definition (Real analytic). Let $U \subseteq \mathbb{R}^n$ be open, and suppose $f: U \to \mathbb{R}$. We
say $f$ is real analytic near $x_0 \in U$ if there exist $r > 0$ and constants $f_\alpha \in \mathbb{R}$ for
each multi-index $\alpha$ such that
\[
  f(x) = \sum_\alpha f_\alpha (x - x_0)^\alpha \quad \text{for } |x - x_0| < r.
\]
Note that if $f$ is real analytic near $x_0$, then it is in fact $C^\infty$ in the corresponding
neighbourhood. Furthermore, the constants $f_\alpha$ are given by
\[
  f_\alpha = \frac{1}{\alpha!} \mathrm{D}^\alpha f(x_0).
\]
In other words, $f$ equals its Taylor expansion. Of course, by translation, we can
usually assume $x_0 = 0$.
Example. If $r > 0$, set
\[
  f(x) = \frac{r}{r - (x_1 + x_2 + \cdots + x_n)}
\]
for $|x| < \frac{r}{\sqrt{n}}$. Then this is real analytic, since we have
\[
  f(x) = \frac{1}{1 - (x_1 + \cdots + x_n)/r} = \sum_{k=0}^{\infty} \left( \frac{x_1 + \cdots + x_n}{r} \right)^k.
\]
We can then expand out each term to see that this is given by a power series.
Explicitly, it is given by
\[
  f(x) = \sum_\alpha \frac{1}{r^{|\alpha|}} \binom{|\alpha|}{\alpha} x^\alpha, \quad \text{where } \binom{|\alpha|}{\alpha} = \frac{|\alpha|!}{\alpha!}.
\]
One sees that this series is absolutely convergent for $|x| < \frac{r}{\sqrt{n}}$.
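One can check this expansion numerically. The sketch below (our own, for $n = 2$, where the multinomial coefficient $\binom{|\alpha|}{\alpha}$ reduces to $\binom{k}{\alpha_1}$ with $k = |\alpha|$) compares the closed form with the truncated power series:

```python
from math import comb

def f_closed(x, r):
    return r / (r - sum(x))

def f_series(x, r, K=60):
    """Truncated sum over multi-indices alpha with |alpha| <= K, in n = 2 variables."""
    x1, x2 = x
    total = 0.0
    for k in range(K + 1):
        for a1 in range(k + 1):
            a2 = k - a1
            # multinomial coefficient |alpha|!/alpha! = comb(k, a1) when n = 2
            total += comb(k, a1) * x1**a1 * x2**a2 / r**k
    return total

x, r = (0.1, 0.2), 2.0
print(f_closed(x, r), f_series(x, r))   # agree closely
```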
Recall that in single-variable analysis, essentially the only way we have to
show that a series converges is by comparison to the geometric series. Here with
multiple variables, our only way to show that a power series converges is by
comparing it to this f.
Definition (Majorant). Let
\[
  f = \sum_\alpha f_\alpha x^\alpha, \quad g = \sum_\alpha g_\alpha x^\alpha
\]
be formal power series. We say $g$ majorizes $f$ (or $g$ is a majorant of $f$), written
$g \succeq f$, if $g_\alpha \geq |f_\alpha|$ for all multi-indices $\alpha$.
If $f$ and $g$ are vector-valued, then this means $g_i \succeq f_i$ for all indices $i$.
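For truncated series, majorization is a finite coefficient-by-coefficient check. A small sketch of this (our own, representing a power series by a dictionary from multi-index tuples to coefficients):

```python
def majorizes(g, f):
    """Return True if g_alpha >= |f_alpha| for every multi-index appearing in f.

    f, g are dicts mapping multi-index tuples to coefficients;
    missing entries of g are treated as 0.
    """
    return all(g.get(alpha, 0.0) >= abs(c) for alpha, c in f.items())

# (1 - x)^{-1} = 1 + x + x^2 + ... majorizes log(1 + x) = x - x^2/2 + x^3/3 - ...
f = {(1,): 1.0, (2,): -0.5, (3,): 1 / 3}
g = {(0,): 1.0, (1,): 1.0, (2,): 1.0, (3,): 1.0}
print(majorizes(g, f))   # True
print(majorizes(f, g))   # False: f's coefficients don't dominate g's
```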
Lemma.
(i) If $g \succeq f$ and $g$ converges for $|x| < r$, then $f$ converges for $|x| < r$.
(ii) If $f(x) = \sum_\alpha f_\alpha x^\alpha$ converges for $|x| < r$ and $0 < s\sqrt{n} < r$, then $f$ has a
majorant which converges on $|x| < s/\sqrt{n}$.
Proof.
(i) Given $x$, define $\tilde{x} = (|x_1|, |x_2|, \ldots, |x_n|)$. We then note that
\[
  \sum_\alpha |f_\alpha x^\alpha| = \sum_\alpha |f_\alpha| \tilde{x}^\alpha \leq \sum_\alpha g_\alpha \tilde{x}^\alpha = g(\tilde{x}).
\]
Since $|\tilde{x}| = |x| < r$, we know $g$ converges at $\tilde{x}$.
(ii) Let $0 < s\sqrt{n} < r$ and set $y = s(1, 1, \ldots, 1)$. Then we have
$|y| = s\sqrt{n} < r$. So by assumption, we know
\[
  \sum_\alpha f_\alpha y^\alpha
\]
converges. A convergent series has bounded terms, so there exists $C$ such
that
\[
  |f_\alpha y^\alpha| \leq C
\]
for all $\alpha$. But $y^\alpha = s^{|\alpha|}$. So we know
\[
  |f_\alpha| \leq \frac{C}{s^{|\alpha|}} \leq \frac{C}{s^{|\alpha|}} \cdot \frac{|\alpha|!}{\alpha!}.
\]
But then if we set
\[
  g(x) = \frac{Cs}{s - (x_1 + \cdots + x_n)} = C \sum_\alpha \frac{|\alpha|!}{s^{|\alpha|} \alpha!} x^\alpha,
\]
we are done, since this converges for $|x| < \frac{s}{\sqrt{n}}$.
With this lemma in mind, we can now prove the Cauchy–Kovalevskaya
theorem for first-order PDEs. This concerns a class of problems similar to the
Cauchy problem for ODEs. We first set up our notation.
We shall consider functions $u: \mathbb{R}^n \to \mathbb{R}^m$. Writing $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$,
we will consider the last variable $x_n$ as being the “time variable”, and the others
as being space. However, for notational convenience, we will not write it as $t$.
We will adopt the shorthand $x' = (x_1, \ldots, x_{n-1})$, so that $x = (x', x_n)$.
Suppose we are given two real analytic functions
\[
  B: \mathbb{R}^m \times \mathbb{R}^{n-1} \to \mathrm{Mat}_{m \times m}(\mathbb{R}), \quad
  c: \mathbb{R}^m \times \mathbb{R}^{n-1} \to \mathbb{R}^m.
\]
We seek a solution to the PDE
\[
  u_{x_n} = \sum_{j=1}^{n-1} B_j(u, x') u_{x_j} + c(u, x')
\]
subject to $u = 0$ when $x_n = 0$. We shall not require a solution on all of $\mathbb{R}^n$,
but only on an open neighbourhood of the origin. Consequently, we will allow
for $B$ and $c$ to not be everywhere defined, but merely convergent on some
neighbourhood of the origin.
Note that we assumed $B$ and $c$ do not depend on $x_n$, but this is not a
restriction, since we can always introduce a new variable $u^{m+1} = x_n$ and enlarge
the target space.
Theorem (Cauchy–Kovalevskaya theorem). Given the above assumptions, there
exists a real analytic function $u = \sum_\alpha u_\alpha x^\alpha$ solving the PDE in a neighbourhood
of the origin. Moreover, it is unique among real analytic functions.
The uniqueness part of the proof is not difficult. If we write out $u$, $B$ and $c$
in power series and plug them into the PDE, we can then simply collect terms
and come up with an expression for what $u$ must be. This is the content of the
following lemma:
Lemma. For $k = 1, \ldots, m$ and $\alpha$ a multi-index in $\mathbb{N}^n$, there exists a polynomial
$q^k_\alpha$ in the power series coefficients of $B$ and $c$ such that any analytic solution to
the PDE must be given by
\[
  u = \sum_\alpha q_\alpha(B, c) x^\alpha,
\]
where $q_\alpha$ is the vector with entries $q^k_\alpha$.
Moreover, all coefficients of $q_\alpha$ are non-negative.
Note that despite our notation, $q_\alpha$ is not a function of $B$ and $c$ (which are
themselves functions of $u$ and $x$). It is a function of the coefficients in the power
series expansions of $B$ and $c$, which are some fixed constants.
This lemma proves uniqueness. To prove existence, we must show that this
series converges in a neighbourhood of the origin, and for this purpose, the fact
that the coefficients of $q_\alpha$ are non-negative is crucial. After we have established
this, we will use the comparison test to reduce the theorem to the case of a
single, particular PDE, which we can solve by hand.
Proof. We construct the polynomials $q^k_\alpha$ by induction on $\alpha_n$. If $\alpha_n = 0$, then
since $u = 0$ on $\{x_n = 0\}$, we conclude that we must have
\[
  u_\alpha = \frac{\mathrm{D}^\alpha u(0)}{\alpha!} = 0.
\]
For $\alpha_n = 1$, we note that whenever $x_n = 0$, we have $u_{x_j} = 0$ for $j = 1, \ldots, n - 1$.
So the PDE reads
\[
  u_{x_n}(x', 0) = c(0, x').
\]
Differentiating this relation in directions tangent to $\{x_n = 0\}$, we find that if
$\alpha = (\alpha', 1)$, then
\[
  \mathrm{D}^\alpha u(0) = \mathrm{D}^{\alpha'} c(0, 0).
\]
So $q^k_\alpha$ is a polynomial in the power series coefficients of $c$, and has non-negative
coefficients.
Now suppose $\alpha_n = 2$, so that $\alpha = (\alpha', 2)$. Then
\[
  \mathrm{D}^\alpha u = \mathrm{D}^{\alpha'} (u_{x_n})_{x_n} = \mathrm{D}^{\alpha'} \left( \sum_j B_j u_{x_j} + c \right)_{x_n}
  = \mathrm{D}^{\alpha'} \left( \sum_j \left( B_j u_{x_j, x_n} + \sum_p \frac{\partial B_j}{\partial u^p} u_{x_j} u^p_{x_n} \right) + \sum_p \frac{\partial c}{\partial u^p} u^p_{x_n} \right).
\]
We don't really care what this looks like. The point is that when we evaluate at 0,
and expand all the terms out, we get a polynomial in the derivatives of $B_j$ and $c$,
and also $\mathrm{D}^\beta u$ with $\beta_n < 2$. The derivatives of $B_j$ and $c$ are just the coefficients
of the power series expansions of $B_j$ and $c$, and by the induction hypothesis, we
can also express the $\mathrm{D}^\beta u$ in terms of these power series coefficients. Thus, we
can use this to construct $q_\alpha$. By inspecting what the formula looks like, we see
that all coefficients in $q_\alpha$ are non-negative.
We see that we can continue doing the same computations to obtain all $q_\alpha$.
An immediate consequence of the non-negativity is that
Lemma. If $\tilde{B}_j \succeq B_j$ and $\tilde{c} \succeq c$, then
\[
  q^k_\alpha(\tilde{B}, \tilde{c}) \geq q^k_\alpha(B, c)
\]
for all $\alpha$. In particular, $\tilde{u} \succeq u$.
So given any $B$ and $c$, if we can find some $\tilde{B}$ and $\tilde{c}$ that majorize $B$ and $c$
respectively, and show that the corresponding series converges for $\tilde{B}$ and $\tilde{c}$, then
we are done.
But we previously saw that every power series is majorized by
\[
  \frac{Cr}{r - (x_1 + \cdots + x_n)}
\]
for $C$ sufficiently large and $r$ sufficiently small. So we have reduced the problem
to the following case:
Lemma. For any $C$ and $r$, define
\[
  h(z, x') = \frac{Cr}{r - (x_1 + \cdots + x_{n-1}) - (z_1 + \cdots + z_m)}.
\]
If $B$ and $c$ are given by
\[
  B_j(z, x') = h(z, x') \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix}, \quad
  c(z, x') = h(z, x') \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix},
\]
then the power series
\[
  u = \sum_\alpha q_\alpha(B, c) x^\alpha
\]
converges in a neighbourhood of the origin.
We’ll provide a rather cheap proof, by just writing down a solution of
the corresponding PDE. The solution itself can be found via the method of
characteristics, which we will learn about soon. However, the proof itself only
requires the existence of the solution, not how we got it.
Proof. We define
\[
  v(x) = \frac{1}{mn} \left( r - (x_1 + \cdots + x_{n-1}) - \sqrt{(r - (x_1 + \cdots + x_{n-1}))^2 - 2mnCr x_n} \right),
\]
which is real analytic around the origin, and vanishes when $x_n = 0$. We then
observe that
\[
  u(x) = v(x) \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}
\]
gives a solution to the corresponding PDE, and is real analytic around the origin.
Hence it must be given by that power series, and in particular, the power series
must converge.
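As a quick sanity check (our own, not part of the notes), one can verify numerically that $v$ solves the PDE in the simplest case $m = 1$, $n = 2$, where the equation reduces to $v_{x_2} = h(v, x_1)(v_{x_1} + 1)$ with $h(z, x_1) = Cr/(r - x_1 - z)$:

```python
from math import sqrt

C, r = 1.0, 1.0   # any C, r > 0 will do

def v(x1, x2):
    # m = 1, n = 2 case of the explicit solution in the proof
    return 0.5 * (r - x1 - sqrt((r - x1)**2 - 4 * C * r * x2))

def pde_residual(x1, x2, h=1e-6):
    # central finite differences for v_{x1} and v_{x2}
    vx1 = (v(x1 + h, x2) - v(x1 - h, x2)) / (2 * h)
    vx2 = (v(x1, x2 + h) - v(x1, x2 - h)) / (2 * h)
    hz = C * r / (r - x1 - v(x1, x2))
    return vx2 - hz * (vx1 + 1)

print(pde_residual(0.05, 0.01))   # ~0: v satisfies v_{x2} = h(v, x1)(v_{x1} + 1)
```

The residual vanishes up to finite-difference error, and `v(x1, 0.0)` is zero, matching the boundary condition $u = 0$ on $\{x_n = 0\}$.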