Part III — Analysis of Partial Differential
Equations
Based on lectures by C. Warnick
Notes taken by Dexter Chua
Michaelmas 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
This course serves as an introduction to the mathematical study of Partial Differential
Equations (PDEs). The theory of PDEs is nowadays a huge area of active research,
and it goes back to the very birth of mathematical analysis in the 18th and 19th
centuries. The subject lies at the crossroads of physics and many areas of pure and
applied mathematics.
The course will mostly focus on four prototype linear equations: Laplace’s equation,
the heat equation, the wave equation and Schrödinger’s equation. Emphasis will be
given to modern functional analytic techniques, relying on a priori estimates, rather
than explicit solutions, although the interaction with classical methods (such as the
fundamental solution and Fourier representation) will be discussed. The following basic
unifying concepts will be studied: well-posedness, energy estimates, elliptic regularity,
characteristics, propagation of singularities, group velocity, and the maximum principle.
Some non-linear equations may also be discussed. The course will end with a discussion
of major open problems in PDEs.
Pre-requisites
There are no specific pre-requisites beyond a standard undergraduate analysis back-
ground, in particular a familiarity with measure theory and integration. The course
will be mostly self-contained and can be used as a first introductory course in PDEs
for students wishing to continue with some specialised PDE Part III courses in the
Lent and Easter terms.
Contents
0 Introduction
1 Basics of PDEs
2 The Cauchy–Kovalevskaya theorem
2.1 The Cauchy–Kovalevskaya theorem
2.2 Reduction to first-order systems
3 Function spaces
3.1 The Hölder spaces
3.2 Sobolev spaces
3.3 Approximation of functions in Sobolev spaces
3.4 Extensions and traces
3.5 Sobolev inequalities
4 Elliptic boundary value problems
4.1 Existence of weak solutions
4.2 The Fredholm alternative
4.3 The spectrum of elliptic operators
4.4 Elliptic regularity
5 Hyperbolic equations
0 Introduction
Partial differential equations are ubiquitous in mathematics, physics, and beyond.
The first equation we have met might be Laplace’s equation, saying
\[ -\Delta u = -\sum_{i=1}^n \frac{\partial^2 u}{\partial x_i^2} = 0. \]
This is the canonical example of an elliptic PDE, and we will spend a lot of
time thinking about elliptic PDEs, since they tend to be very well-behaved.
Instead of trying to explicitly solve equations, as we did in, say, IB Methods, our
focus is mostly on the existence and uniqueness of solutions, without explicitly
constructing them. This will involve the use of machinery from functional
analysis, and indeed a lot of the work will be about showing that we satisfy the
hypotheses required by the functional-analytic results (as well as proving the
functional-analytic results themselves (sometimes)).
We will also consider hyperbolic equations. The canonical example is the
wave equation
\[ \frac{\partial^2 u}{\partial t^2} - \Delta u = 0. \]
The difference is that the time derivative term now has a different sign from the
rest. In Laplace’s equation, all directions were equal. Here time is a “special”
direction, and often our questions are about how the solution evolves in time.
Of course, we don’t “just solve” such equations. Usually, we impose some
data, such as the desired values of $u$ on the boundary of our domain, or the
“starting configuration” in the case of the wave equation. In general, given such
a system, there are several questions we can ask:
– Does a solution exist?
– Is the solution unique?
– Does the solution depend continuously on the data?
– How regular is the solution? Is it continuously differentiable? Or even
smooth?
These questions are closely related. To even make sense of the questions, we
need to specify our “search space”, i.e. the sort of functions we are willing to
consider. For example, we may consider the space of all smooth functions, or
less ambitiously, the space of all twice-differentiable functions. This somewhat
answers the last question, but it doesn’t answer it completely. It could be that
we search for the solution in the space of $C^2$ functions, but it turns
out the solutions are always smooth!
The choice of this function space affects the answers to the other questions as
well. If we have a larger function space, then we are more likely to get a positive
answer to the first question. However, since there are more functions around, we
are more likely to get a negative answer to the second question. So there is
some tension here.
The choice affects the third question in a slightly more subtle way. To speak
of continuity, we must pick a topology, and this usually comes from a norm on
the function space. Thus, to make sense of the third question, we must pick the
appropriate norm on both the space of data and the space of potential solutions.
After choosing the appropriate function spaces, if the answers to the first
three questions are all “yes”, then we say the problem is well-posed.
1 Basics of PDEs
It might be wise to define what a partial differential equation is.
Definition (Partial differential equation). Suppose $U \subseteq \mathbb{R}^n$ is open. A partial
differential equation (PDE) of order $k$ is a relation of the form
\[ F(x, u(x), Du(x), \ldots, D^k u(x)) = 0, \tag{$*$} \]
where
\[ F : U \times \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^{n^2} \times \cdots \times \mathbb{R}^{n^k} \to \mathbb{R} \]
is a given function, and $u : U \to \mathbb{R}$ is the “unknown”.
Definition (Classical solution). We say $u \in C^k(U)$ is a classical solution of a
PDE if the PDE is identically satisfied on $U$ when $u, Du, \ldots, D^k u$ are
substituted in.
More generally, we can allow $u$ and $F$ to take values in a vector space. In
this case, we say it is a system of PDEs.
We can now entertain ourselves by writing out a large list of PDEs that are
naturally found in physics and mathematics.
Example (Transport equation). Suppose $v : \mathbb{R}^4 \times \mathbb{R} \to \mathbb{R}^3$ and $f : \mathbb{R}^4 \to \mathbb{R}$ are
given. The transport equation is
\[ \frac{\partial u}{\partial t}(x, t) + v(x, t, u(x, t)) \cdot D_x u(x, t) = f(x, t), \]
where we think of $x \in \mathbb{R}^3$ and $t \in \mathbb{R}$. This describes the evolution of the density
$u$ of some chemical being advected by a flow $v$ and produced at a rate $f$.
We see that this is a PDE of order 1, and a relatively straightforward solution
method exists, namely the method of characteristics.
Example (Laplace’s and Poisson’s equations). Taking $u : \mathbb{R}^n \to \mathbb{R}$, Laplace’s
equation is
\[ \Delta u(x) = \sum_{i=1}^n \frac{\partial^2 u}{\partial x_i \partial x_i}(x) = 0. \]
This describes, for example, the electrostatic potential in vacuum and the static
distribution of heat inside a uniform solid body. It also has applications to
steady flows in 2d fluids.
There is an inhomogeneous version of this:
\[ \Delta u(x) = f(x), \]
where $f : \mathbb{R}^n \to \mathbb{R}$ is a fixed function. This is known as Poisson’s equation, and
describes, for example, the electrostatic field due to a charge distribution, and
the gravitational field in Newtonian gravity.
Example (Heat/diffusion equation). This is given by
\[ \frac{\partial u}{\partial t} = \Delta u, \]
where $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is now a function of space and time. This describes the
evolution of temperature inside a uniform body, or equivalently the diffusion of
some chemical (where $u$ is the density).
Example (Wave equation). The wave equation is given by
\[ \frac{\partial^2 u}{\partial t^2} = \Delta u, \]
where $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is again a function of space and time. This describes
oscillations of
– strings (n = 1)
– membrane/drum (n = 2)
– air density in a sound wave (n = 3)
Example (Schrödinger’s equation). Let $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{C} \cong \mathbb{R}^2$. Up to choices
of units and convention, Schrödinger’s equation is
\[ i \frac{\partial u}{\partial t} + \Delta u - V u = 0. \]
Here $u$ is the wavefunction of a particle moving in a potential $V : \mathbb{R}^n \to \mathbb{R}$.
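Example (Maxwell’s equations). The unknowns here are $E, B : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}^3$.
They satisfy Maxwell’s equations
\[ \nabla \cdot E = \rho, \qquad \nabla \cdot B = 0, \]
\[ \nabla \times E + \frac{\partial B}{\partial t} = 0, \qquad \nabla \times B - \frac{\partial E}{\partial t} = J, \]
where $\rho$ is the electric charge density, $J$ is the electric current, $E$ is the electric
field and $B$ is the magnetic field.
The two curl equations form a system of 6 equations in 6 unknowns; the two
divergence equations act as additional constraints.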
Example (Einstein’s equations). Einstein’s equations in vacuum are
\[ R_{\mu\nu}[g] = 0, \]
where $g$ is a Lorentzian metric (encoding the gravitational field), and $R_{\mu\nu}[g]$ is
the Ricci curvature of $g$.
Since we haven’t said what $g$ and $R_{\mu\nu}$ are, it is not clear that this is a partial
differential equation, but it is.
Example (Minimal surface equation). The minimal surface equation is
\[ \operatorname{div}\left( \frac{Du}{\sqrt{1 + |Du|^2}} \right) = 0, \]
where $u : \mathbb{R}^n \to \mathbb{R}$ is some function. This is the condition that the graph of $u$,
$\{(x, u(x))\} \subseteq \mathbb{R}^n \times \mathbb{R}$, is locally an extremizer of area.
Example (Ricci flow). Let $g$ be a Riemannian metric on some manifold. The
Ricci flow is a PDE that evolves this metric:
\[ \frac{\partial g_{ij}}{\partial t} = -2 R_{ij}[g], \]
where $R_{ij}$ is again the Ricci curvature.
The most famous application is in proving the Poincaré conjecture, which is
a topological conjecture about 3-manifolds.
These PDEs exhibit a wide variety of behaviours. For example, waves behave
very differently from the evolution of temperature. This means it is unlikely
that we can say anything about PDEs as a whole, since everything we say must
be true for both the heat equation and the wave equation. We must restrict
to some particular classes of PDEs to say something useful. Thus, we seek to
classify our PDEs into different types. We first introduce some notation.
In this course, the natural numbers start at 0.
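Notation (Multi-index/Schwartz notation). We say an element $\alpha \in \mathbb{N}^n$ is a
multi-index. Writing $\alpha = (\alpha_1, \ldots, \alpha_n)$, we write
\[ |\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n. \]
Also, we have
\[ D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}}. \]
If $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, then
\[ x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}. \]
We also write
\[ \alpha! = \alpha_1! \, \alpha_2! \cdots \alpha_n!. \]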
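To make the notation concrete, here is a minimal sketch of the three scalar operations on multi-indices. The function names `order`, `multi_factorial` and `monomial` are our own illustrative choices, and a multi-index is simply represented as a tuple of non-negative integers.

```python
from math import factorial, prod

def order(alpha):
    """|alpha| = alpha_1 + ... + alpha_n."""
    return sum(alpha)

def multi_factorial(alpha):
    """alpha! = alpha_1! * ... * alpha_n!."""
    return prod(factorial(a) for a in alpha)

def monomial(x, alpha):
    """x^alpha = x_1^alpha_1 * ... * x_n^alpha_n."""
    return prod(xi ** ai for xi, ai in zip(x, alpha))

alpha = (2, 0, 1)       # a multi-index in N^3 (recall 0 is a natural number here)
x = (3.0, 5.0, 2.0)
print(order(alpha), multi_factorial(alpha), monomial(x, alpha))  # 3 2 18.0
```

Note that a zero entry contributes a factor of $0! = 1$ and $x_i^0 = 1$, which is why it is convenient for the natural numbers to start at 0.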
We now try to crudely classify the PDEs we have written down. Recall that
our PDEs take the general form
\[ F(x, u(x), Du(x), \ldots, D^k u(x)) = 0. \]
Definition (Linear PDE). We say a PDE is linear if $F$ is a linear function of $u$
and its derivatives. In this case, we can re-write it as
\[ \sum_{|\alpha| \le k} a_\alpha(x) D^\alpha u = 0. \]
Definition (Semi-linear PDE). We say a PDE is semi-linear if it is of the form
\[ \sum_{|\alpha| = k} a_\alpha(x) D^\alpha u(x) + a_0[x, u, Du, \ldots, D^{k-1} u] = 0. \]
In other words, the terms involving the highest order derivatives are linear.
Generalizing further, we have
Definition (Quasi-linear PDE). We say a PDE is quasi-linear if it is of the
form
\[ \sum_{|\alpha| = k} a_\alpha[x, u, Du, \ldots, D^{k-1} u] D^\alpha u(x) + a_0[x, u, \ldots, D^{k-1} u] = 0. \]
So the highest order derivative still appears linearly, but the coefficients can
depend on lower-order derivatives of $u$.
Finally, we have
Definition (Fully non-linear PDE). A PDE is fully non-linear if it is not
quasi-linear.
Example. Laplace’s equation $\Delta u = 0$ is linear.
Example. The equation $u_{xx} + u_{yy} = u_x^2$ is semi-linear.
Example. The equation $u u_{xx} + u_{yy} = u_x^2$ is quasi-linear.
Example. The equation $u_{xx} u_{yy} - u_{xy}^2 = 0$ is fully non-linear.
2 The Cauchy–Kovalevskaya theorem
2.1 The Cauchy–Kovalevskaya theorem
Before we begin talking about PDEs, let’s recall what we already know about
ODEs. Fix an open subset $U \subseteq \mathbb{R}^n$, and assume $f : U \to \mathbb{R}^n$ is given.
Consider the ODE
\[ \dot{u}(t) = f(u(t)). \]
This is an autonomous ODE because there is no explicit $t$ dependence on the
right. This assumption is usually harmless, as we can just increment $n$ and use
the new variable to keep track of $t$. Here $u : (a, b) \to U$ is the unknown, where
$a < 0 < b$.
The Cauchy problem for this equation is to find a solution to the ODE
satisfying $u(0) = u_0$ for a given $u_0 \in U$.
The Picard–Lindelöf theorem says we can always do so under some mild
conditions.
Theorem (Picard–Lindelöf theorem). Suppose that there exist $r, K > 0$ such
that $B_r(u_0) \subseteq U$, and
\[ \|f(x) - f(y)\| \le K \|x - y\| \]
for all $x, y \in B_r(u_0)$. Then there exists an $\varepsilon > 0$ depending on $K, r$ and a unique
$C^1$ function $u : (-\varepsilon, \varepsilon) \to U$ solving the Cauchy problem.
It is instructive to give a quick proof sketch of the result.
Proof sketch. If $u$ is a solution, then by the fundamental theorem of calculus,
we have
\[ u(t) = u_0 + \int_0^t f(u(s)) \, ds. \]
Conversely, if $u$ is a $C^0$ solution to this integral equation, then it solves the
ODE. Crucially, this only requires $u$ to be $C^0$. Indeed, if $u$ is $C^0$ and satisfies
the integral equation, then $u$ is automatically $C^1$. So we can work in a larger
function space when we seek $u$.
Thus, we have reformulated our initial problem into an integral equation. In
particular, we reformulated it in a way that assumes less about the function. In
the case of PDEs, this is what is known as a weak formulation.
Returning to the proof, we have reformulated our problem as looking for a
fixed point of the map
\[ B : w \mapsto u_0 + \int_0^t f(w(s)) \, ds \]
acting on
\[ \mathcal{C} = \{ w : [-\varepsilon, \varepsilon] \to \overline{B_{r/2}(u_0)} : w \text{ is continuous} \}. \]
This is a complete metric space when we equip it with the supremum norm (in
fact, it is a closed ball in a Banach space).
We then show that for $\varepsilon$ small enough, this map $B : \mathcal{C} \to \mathcal{C}$ is a contraction
map. There are two parts: to show that it actually lands in $\mathcal{C}$, and that it is
a contraction. If we manage to show these, then by the contraction mapping
theorem, there is a unique fixed point, and we are done.
The idea of formulating our problem as a fixed point problem is a powerful
technique that allows us to understand many PDEs, especially non-linear ones.
This theorem tells us that a unique $C^1$ solution exists locally. It is not reasonable
to believe it can exist globally, as we might run out of $U$ in finite time. However,
if $f$ is better behaved, we might expect $u$ to be more regular, and indeed this is
the case. We shall not go into the details.
How can we actually use the theorem in practice? Can we actually obtain
a solution from this? Recall that to prove the contraction mapping theorem,
we arbitrarily pick a point in $\mathcal{C}$, keep applying $B$, and by the
contraction property, we must approach the fixed point. This gives us a way to construct
an approximation to the solution of the ODE.
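As a small illustration of this iteration, the sketch below applies the map $B$ repeatedly for the particular ODE $\dot{u} = u$, $u(0) = 1$ (our own choice of example, not one from the lectures). Since $f(w) = w$ here, each iterate is a polynomial, which we store as a list of exact rational coefficients; the helper name `picard_step` is hypothetical.

```python
from fractions import Fraction

def picard_step(w, u0=Fraction(1)):
    """Apply B: w -> u0 + integral_0^t w(s) ds, for the ODE u' = u.

    A polynomial is a list of coefficients, w[k] being the coefficient of t^k.
    Integrating t^k gives t^(k+1) / (k+1); the constant term becomes u0.
    """
    return [u0] + [c / (k + 1) for k, c in enumerate(w)]

w = [Fraction(1)]            # initial guess: the constant function u0 = 1
for _ in range(4):
    w = picard_step(w)
print(w)                     # coefficients 1, 1, 1/2, 1/6, 1/24
```

After $k$ steps the iterate is the degree-$k$ Taylor polynomial of $e^t$, so in this example the Picard iterates visibly approach the true solution.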
However, if we were a physicist, we would have done things differently.
Suppose $f \in C^\infty$. We can then attempt to construct a Taylor series of the
solution near the origin. First we note that for any solution $u$, we must have
\[ u(0) = u_0, \quad \dot{u}(0) = f(u_0). \]
Assuming $u$ is in fact a smooth solution, we can differentiate the ODE and obtain
\[ \ddot{u}(t) = \frac{d}{dt} \dot{u}(t) = \frac{d}{dt} f(u(t)) = Df(u(t)) \, \dot{u}(t) \equiv f_2(u(t), \dot{u}(t)). \]
At the origin, we already know what $u$ and $\dot{u}$ are. We can proceed iteratively to
determine
\[ u^{(k)}(t) = f_k(u, \dot{u}, \ldots, u^{(k-1)}). \]
So in particular, we can in principle determine $u_k \equiv u^{(k)}(0)$. At least formally,
we can write
\[ u(t) = \sum_{k=0}^\infty u_k \frac{t^k}{k!}. \]
If we were physicists, we would say we are done. But being honest mathemati-
cians, in order to claim that we have a genuine solution, we need to at least
show that this converges. Under suitable circumstances, this is given by the
Cauchy–Kovalevskaya theorem.
Theorem (Cauchy–Kovalevskaya for ODEs). The series
\[ u(t) = \sum_{k=0}^\infty u_k \frac{t^k}{k!} \]
converges to the Picard–Lindelöf solution of the Cauchy problem if $f$ is real
analytic in a neighbourhood of $u_0$.
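To see the formal procedure in action, here is a sketch for the specific ODE $\dot{u} = u^2$, $u(0) = 1$, which we pick as an assumed example because its solution $u(t) = 1/(1 - t)$ is known. Writing $u = \sum c_k t^k$ and matching the coefficient of $t^k$ on both sides gives $(k + 1) c_{k+1} = \sum_{i=0}^k c_i c_{k-i}$. The function name `taylor_coefficients` is our own.

```python
from fractions import Fraction
from math import factorial

def taylor_coefficients(u0, order):
    """Coefficients c_k of u(t) = sum_k c_k t^k solving u' = u^2, u(0) = u0."""
    c = [Fraction(u0)]
    for k in range(order):
        # Coefficient of t^k in u^2 is the Cauchy product sum_i c_i c_{k-i}.
        cauchy = sum(c[i] * c[k - i] for i in range(k + 1))
        c.append(cauchy / (k + 1))
    return c

c = taylor_coefficients(1, 6)
u_k = [ck * factorial(k) for k, ck in enumerate(c)]   # u_k = u^(k)(0) = k! c_k
print(c)     # every coefficient equals 1
print(u_k)   # 1, 1, 2, 6, 24, 120, 720
```

All the $c_k$ equal 1, so the series is the geometric series for $1/(1 - t)$. It converges only for $|t| < 1$, which matches the fact that the solution blows up at $t = 1$.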
Recall that being real analytic means being equal to its Taylor series:
Definition (Real analytic). Let $U \subseteq \mathbb{R}^n$ be open, and suppose $f : U \to \mathbb{R}$. We
say $f$ is real analytic near $x_0 \in U$ if there exists $r > 0$ and constants $f_\alpha \in \mathbb{R}$ for
each multi-index $\alpha$ such that
\[ f(x) = \sum_\alpha f_\alpha (x - x_0)^\alpha \]
for $|x - x_0| < r$.
Note that if $f$ is real analytic near $x_0$, then it is in fact $C^\infty$ in the corresponding
neighbourhood. Furthermore, the constants $f_\alpha$ are given by
\[ f_\alpha = \frac{1}{\alpha!} D^\alpha f(x_0). \]
In other words, $f$ equals its Taylor expansion. Of course, by translation, we can
usually assume $x_0 = 0$.
Example. If $r > 0$, set
\[ f(x) = \frac{r}{r - (x_1 + x_2 + \cdots + x_n)} \]
for $|x| < \frac{r}{\sqrt{n}}$. Then this is real analytic, since we have
\[ f(x) = \frac{1}{1 - (x_1 + \cdots + x_n)/r} = \sum_{k=0}^\infty \left( \frac{x_1 + \cdots + x_n}{r} \right)^k. \]
We can then expand out each term to see that this is given by a power series.
Explicitly, it is given by
\[ f(x) = \sum_\alpha \frac{1}{r^{|\alpha|}} \binom{|\alpha|}{\alpha} x^\alpha, \]
where
\[ \binom{|\alpha|}{\alpha} = \frac{|\alpha|!}{\alpha!}. \]
One sees that this series is absolutely convergent for $|x| < \frac{r}{\sqrt{n}}$.
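As a numerical sanity check of this expansion, the sketch below sums the multinomial series over all multi-indices with $|\alpha| \le N$ and compares it with the closed form. The sample point, the truncation order and the function name `majorant_partial_sum` are our own assumptions.

```python
from itertools import product
from math import factorial, prod

def majorant_partial_sum(x, r, max_order):
    """Sum of (|alpha|! / alpha!) x^alpha / r^|alpha| over |alpha| <= max_order."""
    n = len(x)
    total = 0.0
    for alpha in product(range(max_order + 1), repeat=n):
        k = sum(alpha)
        if k > max_order:
            continue
        multinomial = factorial(k) // prod(factorial(a) for a in alpha)
        total += multinomial * prod(xi ** ai for xi, ai in zip(x, alpha)) / r ** k
    return total

x, r = (0.1, 0.2), 1.0                 # |x| is about 0.22 < r / sqrt(2)
approx = majorant_partial_sum(x, r, 40)
exact = r / (r - sum(x))
print(approx, exact)
```

Grouping the terms by $|\alpha| = k$ recovers the geometric series in $(x_1 + \cdots + x_n)/r$, so the two printed values should agree to within rounding error.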
Recall that in single-variable analysis, essentially the only way we have to
show that a series converges is by comparison to the geometric series. Here with
multiple variables, our only way to show that a power series converges is by
comparing it to this f.
Definition (Majorant). Let
\[ f = \sum_\alpha f_\alpha x^\alpha, \quad g = \sum_\alpha g_\alpha x^\alpha \]
be formal power series. We say $g$ majorizes $f$ (or $g$ is a majorant of $f$), written
$g \gg f$, if $g_\alpha \ge |f_\alpha|$ for all multi-indices $\alpha$.
If $f$ and $g$ are vector-valued, then this means $g^i \gg f^i$ for all indices $i$.
Lemma.
(i) If $g \gg f$ and $g$ converges for $|x| < r$, then $f$ converges for $|x| < r$.
(ii) If $f(x) = \sum_\alpha f_\alpha x^\alpha$ converges for $|x| < r$ and $0 < s\sqrt{n} < r$, then $f$ has a
majorant which converges on $|x| < \frac{s}{\sqrt{n}}$.
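Proof.
(i) Given $x$, define $\tilde{x} = (|x_1|, |x_2|, \ldots, |x_n|)$. We then note that
\[ \sum_\alpha |f_\alpha x^\alpha| = \sum_\alpha |f_\alpha| \tilde{x}^\alpha \le \sum_\alpha g_\alpha \tilde{x}^\alpha = g(\tilde{x}). \]
Since $|\tilde{x}| = |x| < r$, we know $g$ converges at $\tilde{x}$.
(ii) Let $0 < s\sqrt{n} < r$ and set $y = s(1, 1, \ldots, 1)$. Then we have
\[ |y| = s\sqrt{n} < r. \]
So by assumption, we know
\[ \sum_\alpha f_\alpha y^\alpha \]
converges. A convergent series has bounded terms, so there exists $C$ such
that
\[ |f_\alpha y^\alpha| \le C \]
for all $\alpha$. But $y^\alpha = s^{|\alpha|}$. So we know
\[ |f_\alpha| \le \frac{C}{s^{|\alpha|}} \le \frac{C}{s^{|\alpha|}} \frac{|\alpha|!}{\alpha!}. \]
But then if we set
\[ g(x) = \frac{Cs}{s - (x_1 + \cdots + x_n)} = C \sum_\alpha \frac{|\alpha|!}{s^{|\alpha|} \alpha!} x^\alpha, \]
we are done, since this converges for $|x| < \frac{s}{\sqrt{n}}$.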
With this lemma in mind, we can now prove the Cauchy–Kovalevskaya
theorem for first-order PDEs. This concerns a class of problems similar to the
Cauchy problem for ODEs. We first set up our notation.
We shall consider functions $u : \mathbb{R}^n \to \mathbb{R}^m$. Writing $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$,
we will consider the last variable $x_n$ as being the “time variable”, and the others
as being space. However, for notational convenience, we will not write it as $t$.
We will adopt the shorthand $x' = (x_1, \ldots, x_{n-1})$, so that $x = (x', x_n)$.
Suppose we are given real analytic functions
\[ B_j : \mathbb{R}^m \times \mathbb{R}^{n-1} \to \mathrm{Mat}_{m \times m}(\mathbb{R}) \quad (j = 1, \ldots, n - 1), \qquad c : \mathbb{R}^m \times \mathbb{R}^{n-1} \to \mathbb{R}^m. \]
We seek a solution to the PDE
\[ u_{x_n} = \sum_{j=1}^{n-1} B_j(u, x') u_{x_j} + c(u, x') \]
subject to $u = 0$ when $x_n = 0$. We shall not require a solution on all of $\mathbb{R}^n$,
but only on an open neighbourhood of the origin. Consequently, we will allow
$B_j$ and $c$ to not be everywhere defined, but merely given by power series convergent on some
neighbourhood of the origin.
Note that we assumed $B_j$ and $c$ do not depend on $x_n$, but this is not a
restriction, since we can always introduce a new variable $u^{m+1} = x_n$, and enlarge
the target space.
Theorem (Cauchy–Kovalevskaya theorem). Given the above assumptions, there
exists a real analytic function $u = \sum_\alpha u_\alpha x^\alpha$ solving the PDE in a neighbourhood
of the origin. Moreover, it is unique among real analytic functions.
The uniqueness part of the proof is not difficult. If we write out $u$, $B$ and $c$
in power series and plug them into the PDE, we can then simply collect terms
and come up with an expression for what $u$ must be. This is the content of the
following lemma:
Lemma. For $k = 1, \ldots, m$ and $\alpha$ a multi-index in $\mathbb{N}^n$, there exists a polynomial
$q_\alpha^k$ in the power series coefficients of $B$ and $c$ such that any analytic solution to
the PDE must be given by
\[ u = \sum_\alpha q_\alpha(B, c) x^\alpha, \]
where $q_\alpha$ is the vector with entries $q_\alpha^k$.
Moreover, all coefficients of $q_\alpha$ are non-negative.
Note that despite our notation, $q$ is not a function of $B$ and $c$ (which are
themselves functions of $u$ and $x$). It is a function of the coefficients in the power
series expansion of $B$ and $c$, which are some fixed constants.
This lemma proves uniqueness. To prove existence, we must show that this
series converges in a neighbourhood of the origin, and for this purpose, the fact that
the coefficients of $q_\alpha$ are non-negative is crucial. After we have established this,
we will use the comparison test to reduce the theorem to the case of a single,
particular PDE, which we can solve by hand.
Proof. We construct the polynomials $q_\alpha^k$ by induction on $\alpha_n$. If $\alpha_n = 0$, then
since $u = 0$ on $\{x_n = 0\}$, we conclude that we must have
\[ u_\alpha = \frac{D^\alpha u(0)}{\alpha!} = 0. \]
For $\alpha_n = 1$, we note that whenever $x_n = 0$, we have $u_{x_j} = 0$ for $j = 1, \ldots, n - 1$.
So the PDE reads
\[ u_{x_n}(x', 0) = c(0, x'). \]
Differentiating this relation in directions tangent to $x_n = 0$, we find that if
$\alpha = (\alpha', 1)$, then
\[ D^\alpha u(0) = D^{\alpha'} c(0, 0). \]
So $q_\alpha^k$ is a polynomial in the power series coefficients of $c$, and has non-negative
coefficients.
Now suppose $\alpha_n = 2$, so that $\alpha = (\alpha', 2)$. Then
\[ \begin{aligned}
D^\alpha u &= D^{\alpha'} (u_{x_n})_{x_n} \\
&= D^{\alpha'} \Big( \sum_j B_j u_{x_j} + c \Big)_{x_n} \\
&= D^{\alpha'} \Big( \sum_j \Big( B_j u_{x_j x_n} + \sum_p (B_j)_{u_p} u_{x_j} u^p_{x_n} \Big) + \sum_p c_{u_p} u^p_{x_n} \Big).
\end{aligned} \]
We don’t really care what this looks like. The point is that when we evaluate at 0,
and expand all the terms out, we get a polynomial in the derivatives of $B_j$ and $c$,
and also $D^\beta u$ with $\beta_n < 2$. The derivatives of $B_j$ and $c$ are just the coefficients
of the power series expansion of $B_j$ and $c$, and by the induction hypothesis, we
can also express the $D^\beta u$ in terms of these power series coefficients. Thus, we
can use this to construct $q_\alpha$. By inspecting what the formula looks like, we see
that all coefficients in $q_\alpha$ are non-negative.
We see that we can continue doing the same computations to obtain all $q_\alpha$.
An immediate consequence of the non-negativity is the following:
Lemma. If $\tilde{B}_j \gg B_j$ and $\tilde{c} \gg c$, then
\[ q_\alpha^k(\tilde{B}, \tilde{c}) \ge |q_\alpha^k(B, c)| \]
for all $\alpha$. In particular, $\tilde{u} \gg u$.
So given any $B_j$ and $c$, if we can find some $\tilde{B}_j$ and $\tilde{c}$ that majorize $B_j$ and $c$
respectively, and show that the corresponding series converges for $\tilde{B}_j$ and $\tilde{c}$, then
we are done.
But we previously saw that every convergent power series is majorized by
\[ \frac{Cr}{r - (x_1 + \cdots + x_n)} \]
for $C$ sufficiently large and $r$ sufficiently small. So we have reduced the problem
to the following case:
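Lemma. For any $C$ and $r$, define
\[ h(z, x') = \frac{Cr}{r - (x_1 + \cdots + x_{n-1}) - (z_1 + \cdots + z_m)}. \]
If $B_j = B_j^*$ and $c = c^*$, where
\[ B_j^*(z, x') = h(z, x') \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix}, \quad c^*(z, x') = h(z, x') \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}, \]
then the power series
\[ u = \sum_\alpha q_\alpha(B, c) x^\alpha \]
converges in a neighbourhood of the origin.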
We’ll provide a rather cheap proof, by just writing down a solution of
the corresponding PDE. The solution itself can be found via the method of
characteristics, which we will learn about soon. However, the proof itself only
requires the existence of the solution, not how we got it.
Proof. We define
\[ v(x) = \frac{1}{mn} \left( r - (x_1 + \cdots + x_{n-1}) - \sqrt{(r - (x_1 + \cdots + x_{n-1}))^2 - 2mnCr x_n} \right), \]
which is real analytic around the origin, and vanishes when $x_n = 0$. We then
observe that
\[ u(x) = v(x) \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \]
gives a solution to the corresponding PDE, and is real analytic around the origin.
Hence it must be given by that power series, and in particular, the power series
must converge.
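The claim that this $u$ solves the PDE can be checked numerically. Since every component of $u$ equals $v$, each row of the system reduces to the scalar equation $v_{x_n} = h \cdot (m \sum_{j<n} v_{x_j} + 1)$ with $z_1 + \cdots + z_m = mv$. The sketch below evaluates both sides with central finite differences; the sample point, the parameter values and the helper names `h_along_solution`, `partial` and `residual` are our own assumptions.

```python
from math import sqrt

def v(x, m, C, r):
    """The explicit function v from the proof; x = (x_1, ..., x_n)."""
    n = len(x)
    s = sum(x[:-1])
    return (r - s - sqrt((r - s) ** 2 - 2 * m * n * C * r * x[-1])) / (m * n)

def h_along_solution(x, m, C, r):
    """h(z, x') evaluated at z = u(x), where z_1 + ... + z_m = m * v(x)."""
    return C * r / (r - sum(x[:-1]) - m * v(x, m, C, r))

def partial(f, x, j, eps=1e-6):
    """Central finite-difference approximation of the j-th partial derivative."""
    xp, xm = list(x), list(x)
    xp[j] += eps
    xm[j] -= eps
    return (f(xp) - f(xm)) / (2 * eps)

def residual(x, m, C, r):
    """Left-hand side minus right-hand side of one row of the PDE."""
    n = len(x)
    f = lambda y: v(y, m, C, r)
    lhs = partial(f, x, n - 1)
    spatial = sum(partial(f, x, j) for j in range(n - 1))
    rhs = h_along_solution(x, m, C, r) * (m * spatial + 1)
    return lhs - rhs

point = (0.01, -0.02, 0.005)            # n = 3, close to the origin
print(residual(point, m=2, C=1.0, r=1.0))
print(v((0.1, 0.2, 0.0), m=2, C=1.0, r=1.0))
```

The first printed value should be zero up to finite-difference error, and the second checks that $v$ vanishes on $\{x_n = 0\}$.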
2.2 Reduction to first-order systems
In nature, very few equations come in the form required by the Cauchy–
Kovalevskaya theorem, but it turns out a lot of PDEs can be cast into this form
after some work. We shall demonstrate this via an example.
Example. Consider the problem
\[ u_{tt} = u u_{xy} - u_{xx} + u_t, \qquad u|_{t=0} = u_0, \qquad u_t|_{t=0} = u_1, \]
where $u_0, u_1$ are some real analytic functions near the origin. We define
\[ f = u_0 + t u_1. \]
This is then real analytic near 0, and $f|_{t=0} = u_0$ and $f_t|_{t=0} = u_1$. Set
\[ w = u - f. \]
Then $w$ satisfies
\[ w_{tt} = w w_{xy} - w_{xx} + w_t + f w_{xy} + f_{xy} w + F, \]
where
\[ F = f f_{xy} - f_{xx} + f_t, \]
and
\[ w|_{t=0} = w_t|_{t=0} = 0. \]
We let $(x, y, t) = (x_1, x_2, x_3)$ and set $\mathbf{u} = (w, w_x, w_y, w_t)$. Then our PDE becomes
\[ \begin{aligned}
u^1_t &= w_t = u^4 \\
u^2_t &= w_{xt} = u^4_x \\
u^3_t &= w_{yt} = u^4_y \\
u^4_t &= w_{tt} = u^1 u^2_{x_2} - u^2_{x_1} + u^4 + f u^2_{x_2} + f_{xy} u^1 + F,
\end{aligned} \]
and the initial condition is $\mathbf{u}(x_1, x_2, 0) = 0$. This is not quite autonomous, but
we can solve that problem simply by introducing a further new variable.
Let’s try to understand this in more generality. In certain cases, it is not
possible to write the equation in Cauchy–Kovalevskaya form. For example, if
the equation has no local solutions, then it certainly cannot be written in that
form, or else Cauchy–Kovalevskaya would give us a solution! It is thus helpful
to understand when this is possible.
Note that in the formulation of Cauchy–Kovalevskaya, the derivative $u_{x_n}$ is
assumed to depend only on $x'$, and not $x_n$. If we want $u_{x_n}$ to depend on $x_n$ as
well, we can introduce a new variable $u^{m+1}$ and set $(u^{m+1})_{x_n} = 1$. So from now
on, we shall ignore the fact that our PDE only has $x'$ on the right-hand side.
Let’s now consider the scalar quasi-linear problem
\[ \sum_{|\alpha| = k} a_\alpha(D^{k-1} u, \ldots, Du, u, x) D^\alpha u + a_0(D^{k-1} u, \ldots, u, x) = 0, \]
where $u : B_r(0) \subseteq \mathbb{R}^n \to \mathbb{R}$, with initial data
\[ u = \frac{\partial u}{\partial x_n} = \cdots = \frac{\partial^{k-1} u}{\partial x_n^{k-1}} = 0 \]
whenever $|x'| < r$, $x_n = 0$.
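We introduce a new vector
\[ \mathbf{u} = \left( u, \frac{\partial u}{\partial x_1}, \ldots, \frac{\partial u}{\partial x_n}, \frac{\partial^2 u}{\partial x_1^2}, \ldots, \frac{\partial^{k-1} u}{\partial x_n^{k-1}} \right) = (u^1, \ldots, u^m). \]
Here $\mathbf{u}$ contains all partial derivatives of $u$ up to order $k - 1$. For $j \in \{1, \ldots, m - 1\}$,
we can compute $\frac{\partial u^j}{\partial x_n}$ in terms of $u^\ell$ or $\frac{\partial u^\ell}{\partial x_p}$ for some $\ell \in \{1, \ldots, m\}$ and $p < n$.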
To express $\frac{\partial u^m}{\partial x_n}$ in terms of the other variables, we need to actually use
the differential equation. To do so, we need to make an assumption about our
equation. We suppose that $a_{(0,\ldots,0,k)}(0, \ldots, 0)$ is non-zero. We can then rewrite the
equation as
\[ \frac{\partial^k u}{\partial x_n^k} = \frac{-1}{a_{(0,\ldots,0,k)}(D^{k-1} u, \ldots, u, x)} \left( \sum_{|\alpha| = k,\ \alpha_n < k} a_\alpha D^\alpha u + a_0 \right), \]
where at least near $x = 0$, the denominator can’t vanish. The RHS can then be
written in terms of $\frac{\partial u^\ell}{\partial x_p}$ and $u^\ell$ for $p < n$.
So we have cast our original equation into the form we previously discussed,
provided that the $a_\alpha$’s and $a_0$ are real analytic about the origin, and that
$a_{(0,\ldots,0,k)}(0, \ldots, 0) \neq 0$. Under these assumptions, we can solve the equation by
Cauchy–Kovalevskaya.
It is convenient to make the following definition: if $a_{(0,\ldots,0,k)}(0, \ldots, 0) \neq 0$, we
say $\{x_n = 0\}$ is non-characteristic. Otherwise, we say it is characteristic.
Often times, we want to specify our initial data on some more exotic surface.
Unfortunately, they cannot be too exotic. They have to be real analytic in some
sense for our theory to have any chance of working.
Definition (Real analytic hypersurface). We say that $\Sigma \subseteq \mathbb{R}^n$ is a real analytic
hypersurface near $x \in \Sigma$ if there exists $\varepsilon > 0$ and a real analytic map
$\Phi : B_\varepsilon(x) \to U \subseteq \mathbb{R}^n$, where $U = \Phi(B_\varepsilon(x))$, such that
– $\Phi$ is bijective and $\Phi^{-1} : U \to B_\varepsilon(x)$ is real analytic.
– $\Phi(\Sigma \cap B_\varepsilon(x)) = \{x_n = 0\} \cap U$ and $\Phi(x) = 0$.
We think of this $\Phi$ as “straightening out the boundary”.
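Let $\gamma$ be the unit normal to $\Sigma$, and suppose we want to solve
\[ \sum_{|\alpha| = k} a_\alpha(D^{k-1} u, \ldots, u, x) D^\alpha u + a_0(D^{k-1} u, \ldots, u, x) = 0 \]
subject to
\[ u = \gamma^i \partial_i u = \cdots = (\gamma^i \partial_i)^{k-1} u = 0 \]
on $\Sigma$.
To do so, we define $w(y) = u(\Phi^{-1}(y))$, so that
\[ u(x) = w(\Phi(x)). \]
Then by the chain rule, we have
\[ \frac{\partial u}{\partial x_i} = \sum_{j=1}^n \frac{\partial w}{\partial y_j} \frac{\partial \Phi_j}{\partial x_i}. \]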
So plugging this into the equation, we see $w$ satisfies an equation of the form
\[ \sum b_\alpha D^\alpha w + b_0 = 0, \]
as well as boundary conditions of
\[ w = \frac{\partial w}{\partial y_n} = \cdots = \frac{\partial^{k-1} w}{\partial y_n^{k-1}} = 0. \]
So we have transformed this to a quasi-linear equation with boundary conditions
on $y_n = 0$, which we can tackle with Cauchy–Kovalevskaya, provided the surface
$y_n = 0$ is non-characteristic. Can we relate this back to the $a$’s?
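We can compute $b_{(0,\ldots,0,k)}$ directly. Note that if $|\alpha| = k$, then
\[ D^\alpha u = \frac{\partial^k w}{\partial y_n^k} (D\Phi_n)^\alpha + \text{terms not involving } \frac{\partial^k w}{\partial y_n^k}. \]
So the coefficient of $\frac{\partial^k w}{\partial y_n^k}$ is
\[ b_{(0,\ldots,0,k)} = \sum_{|\alpha| = k} a_\alpha (D\Phi_n)^\alpha. \]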
Definition ((Non-)characteristic surface). A surface $\Sigma$ is non-characteristic at
$x \in \Sigma$ provided
\[ \sum_{|\alpha| = k} a_\alpha (D\Phi_n)^\alpha \neq 0. \]
Equivalently, if
\[ \sum_{|\alpha| = k} a_\alpha \nu^\alpha \neq 0, \]
where $\nu$ is the normal to the surface. We say a surface is characteristic if it is
not non-characteristic.
We focus on the case where our PDE is second-order. Consider an operator
of the form
\[ Lu = \sum_{i,j=1}^n a_{ij} \frac{\partial^2 u}{\partial x_i \partial x_j}, \]
where $a_{ij} \in \mathbb{R}$. We may wlog assume $a_{ij} = a_{ji}$. For example, the wave equation
and Laplace’s equation are given by operators of this form. Consider the equation
\[ Lu = f, \qquad u = \nu_i \frac{\partial u}{\partial x_i} = 0 \text{ on } \Pi_\nu = \{x \cdot \nu = 0\}. \]
Then $\Pi_\nu$ is non-characteristic if
\[ \sum_{i,j=1}^n a_{ij} \nu_i \nu_j \neq 0. \]
Since $(a_{ij})$ is symmetric, it is diagonalizable, and we see that if all eigenvalues are positive, then
$\sum a_{ij} \nu_i \nu_j$ is non-zero for every $\nu \neq 0$, and so the problem has no characteristic surfaces. In this
case, we say the operator is elliptic. If $(a_{ij})$ has one negative eigenvalue and the
rest positive, then we say $L$ is hyperbolic.
Example. If $L$ is the Laplacian
\[ L = \Delta = \sum_{i=1}^n \frac{\partial^2}{\partial x_i^2}, \]
then $L$ is elliptic.
If $L$ is the wave operator
\[ L = -\partial_t^2 + \Delta, \]
then $L$ is hyperbolic.
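To illustrate the difference, the following sketch evaluates the quadratic form $\sum_{i,j} a_{ij} \nu_i \nu_j$ for the Laplacian in two variables and for the wave operator in one space dimension, with coordinates ordered as $(t, x)$. The function name `principal_symbol` and the sample normals are our own choices.

```python
def principal_symbol(a, nu):
    """Evaluate sum_{i,j} a[i][j] * nu[i] * nu[j] for a coefficient matrix a."""
    n = len(nu)
    return sum(a[i][j] * nu[i] * nu[j] for i in range(n) for j in range(n))

laplacian = [[1, 0], [0, 1]]     # L = d_x^2 + d_y^2
wave = [[-1, 0], [0, 1]]         # L = -d_t^2 + d_x^2, coordinates (t, x)

print(principal_symbol(laplacian, (1, 1)))   # 2: non-zero for every nu != 0
print(principal_symbol(wave, (1, 0)))        # -1: {t = 0} is non-characteristic
print(principal_symbol(wave, (1, 1)))        # 0: a characteristic direction
```

For the wave operator the form vanishes exactly when $\nu_t = \pm \nu_x$, so the characteristic hyperplanes are the lines $t = \pm x + \text{const}$, while the Laplacian has none.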
If we consider the problem
\[ Lu = 0, \]
and forget the Cauchy data, we can look for solutions of the form $e^{ik \cdot x}$, as a
good physicist would do. We can plug this into our operator to compute
\[ L(e^{ik \cdot x}) = -\sum_{i,j=1}^n a_{ij} k_i k_j e^{ik \cdot x}. \]
So if $L$ is elliptic, the only solution of this form is $k = 0$. If $L$ is hyperbolic, we
can have non-trivial plane wave solutions provided $k \propto \nu$ for some $\nu$ with
\[ \sum_{i,j=1}^n a_{ij} \nu_i \nu_j = 0. \]
So we may set $u_\lambda(x) = e^{i\lambda \nu \cdot x}$ for such a $\nu$ (with $|\nu| = 1$, wlog). By taking $\lambda$
very large, we can arrange this solution to have very large derivative in the $\nu$
direction. Vaguely, this says the characteristic directions are the directions where
singularities can propagate. By contrast, we will see that this is not the case for
elliptic operators, and this is known as elliptic regularity. In fact, we will show
that if $L$ is elliptic and $u$ satisfies $Lu = 0$, then $u \in C^\infty$.
While Cauchy–Kovalevskaya is sometimes useful, it has a few issues:
– Not all functions are real analytic.
– We have no control over “how long” a solution exists.
– It doesn’t answer the question of well-posedness.
Indeed, consider the PDE
\[ u_{xx} + u_{yy} = 0. \]
This admits a solution
\[ u(x, y) = \cos kx \cosh ky \]
for some $k \in \mathbb{R}$. We can think of this as coming from the Cauchy problem
\[ u(x, 0) = \cos kx, \quad u_y(x, 0) = 0. \]
By Cauchy–Kovalevskaya, there is a unique real analytic solution, and we’ve
found one. So this is the unique solution.
Let’s think about what happens when $k$ gets large. In this case, it seems
like nothing is very wrong with the initial data. While the initial data oscillates
more and more, it is still bounded by 1. However, we see that the solution at
any $y = \varepsilon > 0$ grows exponentially in $k$. We might object that the derivatives of the
initial condition grow to infinity as well, but if we do a bit more work (as you
will on the example sheet), we can construct a sequence of initial data all of
whose derivatives tend to 0, but the solution still blows up.
This is actually a serious problem. If we want to solve the PDE for a more
general initial condition, we may want to decompose the initial data into Fourier
modes, and then integrate up these solutions we found. But we cannot do this
in general, if these solutions blow up as k → ∞.
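The failure of continuous dependence is easy to see numerically. The sketch below compares the size of the initial data with the size of the solution $\cos kx \cosh ky$ at a fixed small height $y = \varepsilon$, for a few assumed values of $k$; the choice $\varepsilon = 0.1$ is ours.

```python
from math import cos, cosh

def u(x, y, k):
    """The explicit solution u(x, y) = cos(kx) cosh(ky) of u_xx + u_yy = 0."""
    return cos(k * x) * cosh(k * y)

eps = 0.1
for k in (1, 10, 100):
    data_size = abs(u(0.0, 0.0, k))        # sup over x of |cos(kx)|, attained at x = 0
    solution_size = abs(u(0.0, eps, k))    # sup over x of |u(x, eps)| = cosh(k * eps)
    print(k, data_size, solution_size)
```

The data always has supremum 1, but the solution at height $\varepsilon$ has supremum $\cosh(k\varepsilon)$, which already exceeds $10^4$ for $k = 100$. So no estimate of the solution in terms of the supremum of the data alone can hold.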
3 Function spaces
From now on, we shall restrain our desire to be a physicist, and instead tackle
PDEs with functional analytic methods. This requires some technical under-
standing of certain function spaces.
3.1 The Hölder spaces
The most straightforward class of function spaces is the $C^k$ spaces. These are
spaces based on classical continuity and differentiability.
Definition ($C^k$ spaces). Let $U \subseteq \mathbb{R}^n$ be an open set. We define $C^k(U)$ to be the
vector space of all $u : U \to \mathbb{R}$ such that $u$ is $k$-times differentiable and the partial
derivatives $D^\alpha u : U \to \mathbb{R}$ are continuous for $|\alpha| \le k$.
We want to turn this into a Banach space by putting the supremum norm
on the derivatives. However, even $\sup |u|$ is not guaranteed to exist, as $u$ may
be unbounded. So this doesn’t give a genuine norm. This suggests the following
definition.
Definition ($C^k(\bar{U})$ spaces). We define $C^k(\bar{U}) \subseteq C^k(U)$ to be the subspace of
all $u$ such that $D^\alpha u$ are all bounded and uniformly continuous. We define a
norm on $C^k(\bar{U})$ by
\[ \|u\|_{C^k(\bar{U})} = \sum_{|\alpha| \le k} \sup_{x \in U} \|D^\alpha u(x)\|. \]
This makes $C^k(\bar{U})$ a Banach space.
In some cases, we might want a “fractional” amount of differentiability. This
gives rise to the notion of Hölder spaces.
Definition (Hölder continuity). We say a function $u : U \to \mathbb{R}$ is Hölder continuous with index $\gamma$ if there exists $C \ge 0$ such that
\[ |u(x) - u(y)| \le C|x - y|^\gamma \]
for all $x, y \in U$.
We write $C^{0,\gamma}(\bar U) \subseteq C^0(\bar U)$ for the subspace of all Hölder continuous functions with index $\gamma$.
We define the $\gamma$-Hölder semi-norm by
\[ [u]_{C^{0,\gamma}(\bar U)} = \sup_{x \ne y \in U} \frac{|u(x) - u(y)|}{|x - y|^\gamma}. \]
We can then define a norm on $C^{0,\gamma}(\bar U)$ by
\[ \|u\|_{C^{0,\gamma}(\bar U)} = \|u\|_{C^0(\bar U)} + [u]_{C^{0,\gamma}(\bar U)}. \]
We say $u \in C^{k,\gamma}(\bar U)$ if $u \in C^k(\bar U)$ and $D^\alpha u \in C^{0,\gamma}(\bar U)$ for all $|\alpha| = k$, and we define
\[ \|u\|_{C^{k,\gamma}(\bar U)} = \|u\|_{C^k(\bar U)} + \sum_{|\alpha| = k} [D^\alpha u]_{C^{0,\gamma}(\bar U)}. \]
This makes $C^{k,\gamma}(\bar U)$ into a Banach space as well.
Note that $C^{0,1}(\bar U)$ is the set of (uniformly) Lipschitz functions on $U$.
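As a rough numerical illustration of the semi-norm (a sketch only, not part of the lectures), the following estimates the Hölder quotient of the square root on $[0, 1]$ over a finite grid. The helper names `holder_seminorm` and `grid` are made up for this example, and a grid maximum only gives a lower bound for the true supremum.

```python
import math

def holder_seminorm(u, gamma, points):
    """Largest quotient |u(x) - u(y)| / |x - y|**gamma over all pairs of sample points."""
    best = 0.0
    for i, x in enumerate(points):
        for y in points[:i]:
            best = max(best, abs(u(x) - u(y)) / abs(x - y) ** gamma)
    return best

def grid(m):
    """m + 1 equally spaced sample points in [0, 1]."""
    return [i / m for i in range(m + 1)]

# sqrt is 1/2-Holder on [0, 1]: the quotient stays bounded by 1 as the grid is refined.
half = [holder_seminorm(math.sqrt, 0.5, grid(m)) for m in (50, 100, 200)]

# sqrt is not Lipschitz: with gamma = 1 the quotient grows like sqrt(m).
lip = [holder_seminorm(math.sqrt, 1.0, grid(m)) for m in (50, 100, 200)]
```

The first list stays at 1 while the second keeps growing, which is consistent with $\sqrt{x} \in C^{0,1/2}$ but $\sqrt{x} \notin C^{0,1}$ on the unit interval.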
3.2 Sobolev spaces
The properties of Hölder spaces are not difficult to understand, but on the other hand they are not too useful. This is not too surprising, perhaps, because the supremum norm only sees the maximum of the function, and ignores the rest. In contrast, the $L^p$ norm takes into account the values at all points. This gives rise to the notion of Sobolev spaces.
Definition ($L^p$ space). Let $U \subseteq \mathbb{R}^n$ be open, and suppose $1 \le p \le \infty$. We define the space $L^p(U)$ by
\[ L^p(U) = \{u : U \to \mathbb{R} \text{ measurable} \mid \|u\|_{L^p(U)} < \infty\} / \{\text{equality a.e.}\}, \]
where, if $p < \infty$, we define
\[ \|u\|_{L^p(U)} = \left( \int_U |u(x)|^p \, dx \right)^{1/p}, \]
and
\[ \|u\|_{L^\infty(U)} = \inf\{C \ge 0 \mid |u(x)| \le C \text{ almost everywhere}\}. \]
Theorem. $L^p(U)$ is a Banach space with the $L^p$ norm.
We can also define local versions of $L^p$ spaces by saying $u \in L^p_{\mathrm{loc}}(U)$ if $u \in L^p(V)$ for every $V \Subset U$, i.e. $\bar V \subseteq U$ and $\bar V$ is compact. This is read as “$V$ is compactly contained in $U$”. By working with $L^p_{\mathrm{loc}}(U)$, we ignore any possible blowing up at the boundary. Note that $L^p_{\mathrm{loc}}(U)$ is not Banach, but is a Fréchet space.
What we want to do is to define differentiability for these things. If we try to define derivatives via limits, then we run into difficulties since the value of an element in $L^p(U)$ at a point is not well-defined. To proceed, we use the notion of a weak derivative.
Definition (Weak derivative). Suppose $u, v \in L^1_{\mathrm{loc}}(U)$ and $\alpha$ is a multi-index. We say that $v$ is the $\alpha$th weak derivative of $u$ if
\[ \int_U u D^\alpha \phi \, dx = (-1)^{|\alpha|} \int_U v \phi \, dx \]
for all $\phi \in C_c^\infty(U)$, i.e. for all smooth, compactly supported functions on $U$. We write $v = D^\alpha u$.
Note that if $u$ is a genuine smooth function, then $D^\alpha u$ is the $\alpha$th weak derivative of $u$, as integration by parts tells us.
For those who have seen distributions, this is the same as the definition of a distributional derivative, except here we require that the derivative is an $L^1_{\mathrm{loc}}$ function.
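Lemma. Suppose $v, \tilde v \in L^1_{\mathrm{loc}}(U)$ are both $\alpha$th weak derivatives of $u \in L^1_{\mathrm{loc}}(U)$, then $v = \tilde v$ almost everywhere.
Proof. For any $\phi \in C_c^\infty(U)$, we have
\[ \int_U (v - \tilde v) \phi \, dx = (-1)^{|\alpha|} \int_U (u - u) D^\alpha \phi \, dx = 0. \]
Since a locally integrable function that integrates to zero against every test function vanishes almost everywhere, $v - \tilde v = 0$ almost everywhere.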
Now that we have weak derivatives, we can define the Sobolev spaces.
Definition (Sobolev space). We say that $u \in L^1_{\mathrm{loc}}(U)$ belongs to the Sobolev space $W^{k,p}(U)$ if $u \in L^p(U)$ and $D^\alpha u$ exists and is in $L^p(U)$ for all $|\alpha| \le k$.
If $p = 2$, we write $H^k(U) = W^{k,2}(U)$, which will be a Hilbert space.
If $p < \infty$, we define the $W^{k,p}(U)$ norm by
\[ \|u\|_{W^{k,p}(U)} = \left( \sum_{|\alpha| \le k} \int_U |D^\alpha u|^p \, dx \right)^{1/p}. \]
If $p = \infty$, we define
\[ \|u\|_{W^{k,\infty}(U)} = \sum_{|\alpha| \le k} \|D^\alpha u\|_{L^\infty(U)}. \]
We denote by $W_0^{k,p}(U)$ the completion of $C_c^\infty(U)$ in this norm (and again $H_0^k(U) = W_0^{k,2}(U)$).
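To see that these things are somehow interesting, it would be nice to find some functions that belong to these spaces but not the $C^k$ spaces.
Example. Let $U = B_1(0)$ be the unit ball in $\mathbb{R}^n$, and set
\[ u(x) = |x|^{-\alpha} \]
when $x \in U$, $x \ne 0$. Then for $x \ne 0$, we have
\[ D_i u = \frac{-\alpha x_i}{|x|^{\alpha + 2}}. \]
By considering $\phi \in C_c^\infty(B_1(0) \setminus \{0\})$, it is clear that if $u$ is weakly differentiable, then its weak derivative must be given by
\[ D_i u = \frac{-\alpha x_i}{|x|^{\alpha + 2}}. \tag{$*$} \]
We can check that $u \in L^1_{\mathrm{loc}}(U)$ iff $\alpha < n$, and
\[ \frac{x_i}{|x|^{\alpha + 2}} \in L^1_{\mathrm{loc}}(U) \text{ iff } \alpha < n - 1, \]
since this function has size $|x|^{-(\alpha + 1)}$. So if we want $u \in W^{1,p}(U)$, then we must take $\alpha < n - 1$. To check $(*)$ is indeed the weak derivative, suppose $\phi \in C_c^\infty(U)$. Then integrating by parts, we get
\[ -\int_{U - B_\varepsilon(0)} u \phi_{x_i} \, dx = \int_{U - B_\varepsilon(0)} D_i u \, \phi \, dx - \int_{\partial B_\varepsilon(0)} u \phi \nu_i \, dS, \]
where $\nu = (\nu_1, \ldots, \nu_n)$ is the inwards normal of $\partial B_\varepsilon(0)$. We can estimate
\[ \left| \int_{\partial B_\varepsilon(0)} u \phi \nu_i \, dS \right| \le \|\phi\|_{L^\infty} \cdot \varepsilon^{-\alpha} \cdot C \varepsilon^{n-1} \le \tilde C \varepsilon^{n - 1 - \alpha} \to 0 \text{ as } \varepsilon \to 0 \]
for some constants $C$ and $\tilde C$. So the second term vanishes in the limit. So by, say, dominated convergence, it follows that $(*)$ is indeed the weak derivative.
Finally, note that $D_i u \in L^p(U)$ iff $p(\alpha + 1) < n$. Thus, if $\alpha < \frac{n - p}{p}$, then $u \in W^{1,p}(U)$. Note that if $p > n$, then the condition forces $\alpha < 0$, and $u$ is continuous.
Note also that if $\alpha \ge \frac{n - p}{p}$, then $u \notin W^{1,p}(U)$.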
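The exponent bookkeeping in this example can be sketched in a few lines. This is a hedged illustration only: the function names are invented here, and the test simply encodes the polar-coordinate fact that $|x|^{-s}$ is integrable near the origin of $\mathbb{R}^n$ exactly when $s < n$.

```python
from fractions import Fraction

def radial_power_integrable(exponent, n):
    """|x|**(-exponent) is integrable near 0 in R^n iff exponent < n
    (in polar coordinates the integrand is r**(n - 1 - exponent) dr)."""
    return exponent < n

def u_in_W1p(alpha, n, p):
    """Membership test for u(x) = |x|**(-alpha) on the unit ball:
    |u|**p has size |x|**(-alpha p) and |Du|**p has size |x|**(-(alpha + 1) p)."""
    return (radial_power_integrable(alpha * p, n)
            and radial_power_integrable((alpha + 1) * p, n))

def critical_alpha(n, p):
    """The threshold (n - p) / p from the example."""
    return Fraction(n - p, p)
```

For instance, in $n = 3$ with $p = 2$ the threshold is $\alpha = 1/2$, so `u_in_W1p(Fraction(1, 4), 3, 2)` holds while `u_in_W1p(Fraction(1, 2), 3, 2)` does not.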
Theorem. For each $k = 0, 1, \ldots$ and $1 \le p \le \infty$, the space $W^{k,p}(U)$ is a Banach space.
Proof. Homogeneity and positivity for the Sobolev norm are clear. The triangle inequality follows from the Minkowski inequality.
For completeness, note that
\[ \|D^\alpha u\|_{L^p(U)} \le \|u\|_{W^{k,p}(U)} \]
for $|\alpha| \le k$. So if $(u_j)_{j=1}^\infty$ is Cauchy in $W^{k,p}(U)$, then $(D^\alpha u_j)_{j=1}^\infty$ is Cauchy in $L^p(U)$ for $|\alpha| \le k$. So by completeness of $L^p(U)$, we have
\[ D^\alpha u_j \to u^\alpha \in L^p(U) \]
for some $u^\alpha$. It remains to show that $u^\alpha = D^\alpha u$, where $u = u^{(0,0,\ldots,0)}$. Let $\phi \in C_c^\infty(U)$. Then we have
\[ (-1)^{|\alpha|} \int_U u_j D^\alpha \phi \, dx = \int_U D^\alpha u_j \, \phi \, dx \]
for all $j$. We send $j \to \infty$. Then using $u_j \to u$ and $D^\alpha u_j \to u^\alpha$ in $L^p(U)$, we have
\[ (-1)^{|\alpha|} \int_U u D^\alpha \phi \, dx = \int_U u^\alpha \phi \, dx. \]
So $D^\alpha u = u^\alpha \in L^p(U)$ and we are done.
3.3 Approximation of functions in Sobolev spaces
It would be nice if we could approximate functions in $W^{k,p}(U)$ with something more tractable. For example, it would be nice if we could approximate them by smooth functions, so that the weak derivatives are genuine derivatives. A useful trick to improve regularity of a function is to convolve with a smooth mollifier.
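Definition (Standard mollifier). Let
\[ \eta(x) = \begin{cases} C e^{1/(|x|^2 - 1)} & |x| < 1 \\ 0 & |x| \ge 1 \end{cases}, \]
where $C$ is chosen so that $\int_{\mathbb{R}^n} \eta(x) \, dx = 1$.
One checks that this is a smooth function on $\mathbb{R}^n$, peaked at $x = 0$.
For each $\varepsilon > 0$, we set
\[ \eta_\varepsilon(x) = \frac{1}{\varepsilon^n} \eta\left( \frac{x}{\varepsilon} \right). \]
Of course, the pre-factor of $\frac{1}{\varepsilon^n}$ is chosen so that $\eta_\varepsilon$ is appropriately normalized. We call $\eta_\varepsilon$ the standard mollifier, and it satisfies $\operatorname{supp} \eta_\varepsilon \subseteq B_\varepsilon(0)$.
We think of these $\eta_\varepsilon$ as approximations of the $\delta$-function.
Now suppose $U \subseteq \mathbb{R}^n$ is open, and let
\[ U_\varepsilon = \{x \in U : \operatorname{dist}(x, \partial U) > \varepsilon\}. \]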
Definition (Mollification). If $f \in L^1_{\mathrm{loc}}(U)$, we define the mollification $f_\varepsilon : U_\varepsilon \to \mathbb{R}$ by the convolution
\[ f_\varepsilon = \eta_\varepsilon * f. \]
In other words,
\[ f_\varepsilon(x) = \int_U \eta_\varepsilon(x - y) f(y) \, dy = \int_{B_\varepsilon(x)} \eta_\varepsilon(x - y) f(y) \, dy. \]
Thus, $f_\varepsilon$ is the “local average” of $f$ around each point, with the weighting given by $\eta_\varepsilon$. The hope is that $f_\varepsilon$ will have much better regularity properties than $f$.
Theorem. Let $f \in L^1_{\mathrm{loc}}(U)$. Then
(i) $f_\varepsilon \in C^\infty(U_\varepsilon)$.
(ii) $f_\varepsilon \to f$ almost everywhere as $\varepsilon \to 0$.
(iii) If in fact $f \in C(U)$, then $f_\varepsilon \to f$ uniformly on compact subsets.
(iv) If $1 \le p < \infty$ and $f \in L^p_{\mathrm{loc}}(U)$, then $f_\varepsilon \to f$ in $L^p_{\mathrm{loc}}(U)$, i.e. we have convergence in $L^p$ on any $V \Subset U$.
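To get a feel for mollification, here is a minimal one-dimensional sketch using only the standard library. The quadrature rule, the choice of a step function as test case, and all function names are assumptions made for this illustration; none of them come from the lectures.

```python
import math

def eta(x):
    """Unnormalised bump exp(1 / (x^2 - 1)) on (-1, 1), zero elsewhere."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def integrate(g, a, b, m=400):
    """Midpoint rule on [a, b] with m cells."""
    h = (b - a) / m
    return h * sum(g(a + (k + 0.5) * h) for k in range(m))

C = 1.0 / integrate(eta, -1.0, 1.0)   # normalising constant in dimension n = 1

def mollify(f, eps):
    """Return f_eps = eta_eps * f, computed by quadrature over B_eps(x)."""
    def f_eps(x):
        return integrate(lambda y: C / eps * eta((x - y) / eps) * f(y),
                         x - eps, x + eps)
    return f_eps

def step(x):
    """A discontinuous test function (hypothetical choice)."""
    return 1.0 if x >= 0.0 else 0.0

def l1_error(eps, m=400):
    """Approximate L^1 distance between step and its mollification on (-1, 1)."""
    f_eps = mollify(step, eps)
    return integrate(lambda x: abs(f_eps(x) - step(x)), -1.0, 1.0, m)
```

The mollified step takes the value $1/2$ at the jump, agrees with the step at distance more than $\varepsilon$ from it, and its $L^1$ distance to the step shrinks as $\varepsilon$ decreases, in line with part (iv).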
In general, the difficulty of proving these approximation theorems lies in what happens at the boundary.
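Lemma. Assume $u \in W^{k,p}(U)$ for some $1 \le p < \infty$, and set
\[ u_\varepsilon = \eta_\varepsilon * u \text{ on } U_\varepsilon. \]
Then
(i) $u_\varepsilon \in C^\infty(U_\varepsilon)$ for each $\varepsilon > 0$.
(ii) If $V \Subset U$, then $u_\varepsilon \to u$ in $W^{k,p}(V)$.
Proof.
(i) As above.
(ii) We claim that
\[ D^\alpha u_\varepsilon = \eta_\varepsilon * D^\alpha u \]
for $|\alpha| \le k$ in $U_\varepsilon$. To see this, we have
\begin{align*}
D^\alpha u_\varepsilon(x) &= D^\alpha \int_U \eta_\varepsilon(x - y) u(y) \, dy \\
&= \int_U D_x^\alpha \eta_\varepsilon(x - y) u(y) \, dy \\
&= \int_U (-1)^{|\alpha|} D_y^\alpha \eta_\varepsilon(x - y) u(y) \, dy.
\end{align*}
For a fixed $x \in U_\varepsilon$, $\eta_\varepsilon(x - \cdot) \in C_c^\infty(U)$, so by the definition of a weak derivative, this is equal to
\[ \int_U \eta_\varepsilon(x - y) D^\alpha u(y) \, dy = (\eta_\varepsilon * D^\alpha u)(x). \]
It is an exercise to verify that we can indeed move the derivative past the integral.
Now fix $V \Subset U$. By the previous theorem applied to each $D^\alpha u \in L^p(U)$, we see that $D^\alpha u_\varepsilon \to D^\alpha u$ in $L^p(V)$ as $\varepsilon \to 0$ for $|\alpha| \le k$. So
\[ \|u_\varepsilon - u\|^p_{W^{k,p}(V)} = \sum_{|\alpha| \le k} \|D^\alpha u_\varepsilon - D^\alpha u\|^p_{L^p(V)} \to 0 \]
as $\varepsilon \to 0$.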
Theorem (Global approximation). Let $1 \le p < \infty$, and $U \subseteq \mathbb{R}^n$ be open and bounded. Then $C^\infty(U) \cap W^{k,p}(U)$ is dense in $W^{k,p}(U)$.
Our main obstacle to overcome is the fact that the mollifications are only defined on $U_\varepsilon$, and not $U$.
Proof. For $i \ge 1$, define
\begin{align*}
U_i &= \{x \in U \mid \operatorname{dist}(x, \partial U) > 1/i\} \\
V_i &= U_{i+3} - \bar U_{i+1} \\
W_i &= U_{i+4} - \bar U_i.
\end{align*}
We clearly have $U = \bigcup_{i=1}^\infty U_i$, and we can choose $V_0 \Subset U$ such that $U = \bigcup_{i=0}^\infty V_i$.
Let $\{\zeta_i\}_{i=0}^\infty$ be a partition of unity subordinate to $\{V_i\}$. Thus, we have $0 \le \zeta_i \le 1$, $\zeta_i \in C_c^\infty(V_i)$ and $\sum_{i=0}^\infty \zeta_i = 1$ on $U$.
Fix $\delta > 0$. Then for each $i$, we can choose $\varepsilon_i$ sufficiently small such that
\[ u_i = \eta_{\varepsilon_i} * \zeta_i u \]
satisfies $\operatorname{supp} u_i \subseteq W_i$ and
\[ \|u_i - \zeta_i u\|_{W^{k,p}(U)} = \|u_i - \zeta_i u\|_{W^{k,p}(W_i)} \le \frac{\delta}{2^{i+1}}. \]
Now set
\[ v = \sum_{i=0}^\infty u_i \in C^\infty(U). \]
Note that we do not know (yet) that $v \in W^{k,p}(U)$. But it certainly is when we restrict to some $V \Subset U$.
In any such subset, the sum is finite, and since $u = \sum_{i=0}^\infty \zeta_i u$, we have
\[ \|v - u\|_{W^{k,p}(V)} \le \sum_{i=0}^\infty \|u_i - \zeta_i u\|_{W^{k,p}(V)} \le \delta \sum_{i=0}^\infty 2^{-(i+1)} = \delta. \]
Since the bound $\delta$ does not depend on $V$, by taking the supremum over all $V$, we have
\[ \|v - u\|_{W^{k,p}(U)} \le \delta. \]
So we are done.
It would be nice for $C^\infty(\bar U)$ to be dense, instead of just $C^\infty(U)$. It turns out this is possible, as long as we have a sensible boundary.
Definition ($C^{k,\delta}$ boundary). Let $U \subseteq \mathbb{R}^n$ be open and bounded. We say $\partial U$ is $C^{k,\delta}$ if for any point in the boundary $p \in \partial U$, there exist $r > 0$ and a function $\gamma \in C^{k,\delta}(\mathbb{R}^{n-1})$ such that (possibly after relabelling and rotating axes) we have
\[ U \cap B_r(p) = \{(x', x_n) \in B_r(p) : x_n > \gamma(x')\}. \]
Thus, this says our boundary is locally the graph of a $C^{k,\delta}$ function.
Theorem (Smooth approximation up to boundary). Let $1 \le p < \infty$, and $U \subseteq \mathbb{R}^n$ be open and bounded. Suppose $\partial U$ is $C^{0,1}$. Then $C^\infty(\bar U) \cap W^{k,p}(U)$ is dense in $W^{k,p}(U)$.
Proof. Previously, the reason we didn’t get something in $C^\infty(\bar U)$ was that we had to glue together infinitely many mollifications whose domains collectively exhaust $U$, and there is no hope that the resulting function is in $C^\infty(\bar U)$. In the current scenario, we know that $U$ locally looks like
[Figure: the region $U$ above the graph of its boundary near a boundary point $x_0$.]
The idea is that given a $u$ defined on $U$, we can shift it downwards by some $\varepsilon$. It is a known result that translation is continuous, so this only changes $u$ by a tiny bit. We can then mollify with a $\bar\varepsilon < \varepsilon$, which would then give a function defined on $U$ (at least locally near $x_0$).
So fix some $x_0 \in \partial U$. Since $\partial U$ is $C^{0,1}$, there exist $r > 0$ and $\gamma \in C^{0,1}(\mathbb{R}^{n-1})$ such that
\[ U \cap B_r(x_0) = \{(x', x_n) \in B_r(x_0) \mid x_n > \gamma(x')\}. \]
Set
\[ V = U \cap B_{r/2}(x_0). \]
Define the shifted function $u_\varepsilon$ to be
\[ u_\varepsilon(x) = u(x + \varepsilon e_n). \]
Now pick $\bar\varepsilon$ sufficiently small such that
\[ v_{\varepsilon, \bar\varepsilon} = \eta_{\bar\varepsilon} * u_\varepsilon \]
is well-defined. Note that here we need to use the fact that $\partial U$ is $C^{0,1}$. Indeed, we can see that if the slope of $\partial U$ is very steep near a point $x$:
[Figure: a steep piece of $\partial U$ near $x$, shifted by $\varepsilon$.]
then we need to choose a $\bar\varepsilon$ much smaller than $\varepsilon$. By requiring that $\gamma$ is 1-Hölder continuous, we can ensure there is a single choice of $\bar\varepsilon$ that works throughout $V$. As long as $\bar\varepsilon$ is small enough, we know that $v_{\varepsilon, \bar\varepsilon} \in C^\infty(\bar V)$.
Fix $\delta > 0$. We can now estimate
\begin{align*}
\|v_{\varepsilon, \bar\varepsilon} - u\|_{W^{k,p}(V)} &= \|v_{\varepsilon, \bar\varepsilon} - u_\varepsilon + u_\varepsilon - u\|_{W^{k,p}(V)} \\
&\le \|v_{\varepsilon, \bar\varepsilon} - u_\varepsilon\|_{W^{k,p}(V)} + \|u_\varepsilon - u\|_{W^{k,p}(V)}.
\end{align*}
Since translation is continuous in the $L^p$ norm for $p < \infty$, we can pick $\varepsilon > 0$ such that $\|u_\varepsilon - u\|_{W^{k,p}(V)} < \frac{\delta}{2}$. Having fixed such an $\varepsilon$, we can pick $\bar\varepsilon$ so small that we also have $\|v_{\varepsilon, \bar\varepsilon} - u_\varepsilon\|_{W^{k,p}(V)} < \frac{\delta}{2}$.
The conclusion of this is that for any $x_0 \in \partial U$, we can find a neighbourhood $V \subseteq U$ of $x_0$ in $U$ such that for any $u \in W^{k,p}(U)$ and $\delta > 0$, there exists $v \in C^\infty(\bar V)$ such that $\|u - v\|_{W^{k,p}(V)} \le \delta$.
It remains to patch all of these together using a partition of unity. By the compactness of $\partial U$, we can cover $\partial U$ by finitely many of these $V$, say $V_1, \ldots, V_N$. We further pick a $V_0$ such that $V_0 \Subset U$ and
\[ U = \bigcup_{i=0}^N V_i. \]
We can pick approximations $v_i \in C^\infty(\bar V_i)$ for $i = 0, \ldots, N$ (the $i = 0$ case is given by the previous global approximation theorem), satisfying $\|v_i - u\|_{W^{k,p}(V_i)} \le \delta$. Pick a partition of unity $\{\zeta_i\}_{i=0}^N$ of $\bar U$ subordinate to $\{V_i\}$. Define
\[ v = \sum_{i=0}^N \zeta_i v_i. \]
Clearly $v \in C^\infty(\bar U)$, and we can bound
\begin{align*}
\|D^\alpha v - D^\alpha u\|_{L^p(U)} &= \left\| D^\alpha \sum_{i=0}^N \zeta_i v_i - D^\alpha \sum_{i=0}^N \zeta_i u \right\|_{L^p(U)} \\
&\le C_k \sum_{i=0}^N \|v_i - u\|_{W^{k,p}(V_i)} \\
&\le C_k (1 + N) \delta,
\end{align*}
where $C_k$ is a constant that solely depends on the derivatives of the partition of unity, which are fixed. So we are done.
3.4 Extensions and traces
If $U \subseteq \mathbb{R}^n$ is open and bounded, then there is of course a restriction map $W^{1,p}(\mathbb{R}^n) \to W^{1,p}(U)$. It turns out under mild conditions, there is an extension map going in the other direction as well.
Theorem (Extension of $W^{1,p}$ functions). Suppose $U$ is open, bounded and $\partial U$ is $C^1$. Pick a bounded $V$ such that $U \Subset V$. Then there exists a bounded linear operator
\[ E : W^{1,p}(U) \to W^{1,p}(\mathbb{R}^n) \]
for $1 \le p < \infty$ such that for any $u \in W^{1,p}(U)$,
(i) $Eu = u$ almost everywhere in $U$
(ii) $Eu$ has support in $V$
(iii) $\|Eu\|_{W^{1,p}(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(U)}$, where the constant $C$ depends on $U, V, p$ but not $u$.
Proof. First note that $C^1(\bar U)$ is dense in $W^{1,p}(U)$. So it suffices to show that the above theorem holds with $W^{1,p}(U)$ replaced with $C^1(\bar U)$, and then extend by continuity.
We first show that we can do this locally, and then glue them together using partitions of unity.
Suppose $x_0 \in \partial U$ is such that $\partial U$ near $x_0$ lies in the plane $\{x_n = 0\}$. In other words, there exists $r > 0$ such that
\begin{align*}
B_+ &= B_r(x_0) \cap \{x_n \ge 0\} \subseteq \bar U \\
B_- &= B_r(x_0) \cap \{x_n \le 0\} \subseteq \mathbb{R}^n \setminus U.
\end{align*}
The idea is that we want to reflect $u|_{B_+}$ across the $x_n = 0$ boundary to get a function on $B_-$, but the derivative will not be continuous if we do this. So we define a “higher order reflection” by
\[ \bar u(x) = \begin{cases} u(x) & x \in B_+ \\ -3u(x', -x_n) + 4u\left(x', -\frac{x_n}{2}\right) & x \in B_- \end{cases} \]
[Figure: graph of $u$ against $x_n$; the value at $x_n < 0$ is built from the values of $u$ at $-x_n$ and $-x_n/2$.]
We see that this is a continuous function. Moreover, by explicitly computing the partial derivatives, we see that they are continuous across the boundary. So we know $\bar u \in C^1(B_r(x_0))$.
We can then easily check that we have
\[ \|\bar u\|_{W^{1,p}(B_r(x_0))} \le C \|u\|_{W^{1,p}(B_+)} \]
for some constant $C$.
If $\partial U$ is not necessarily flat near $x_0 \in \partial U$, then we can use a $C^1$ diffeomorphism to straighten it out. Indeed, we can pick $r > 0$ and $\gamma \in C^1(\mathbb{R}^{n-1})$ such that
\[ U \cap B_r(x_0) = \{(x', x_n) \in B_r(x_0) \mid x_n > \gamma(x')\}. \]
We can then use the $C^1$-diffeomorphism $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ given by
\begin{align*}
\Phi(x)_i &= x_i, \quad i = 1, \ldots, n - 1 \\
\Phi(x)_n &= x_n - \gamma(x_1, \ldots, x_{n-1}).
\end{align*}
Then since $C^1$ diffeomorphisms induce bounded isomorphisms between $W^{1,p}$ spaces, this gives a local extension.
Since $\partial U$ is compact, we can take a finite number of points $x_0^i \in \partial U$, sets $W_i$ and extensions $\bar u_i \in C^1(W_i)$ extending $u$ such that
\[ \partial U \subseteq \bigcup_{i=1}^N W_i. \]
Further pick $W_0 \Subset U$ so that $U \subseteq \bigcup_{i=0}^N W_i$. Let $\{\zeta_i\}_{i=0}^N$ be a partition of unity subordinate to $\{W_i\}$. Write
\[ \bar u = \sum_{i=0}^N \zeta_i \bar u_i \]
where $\bar u_0 = u$. Then $\bar u \in C^1(\mathbb{R}^n)$, $\bar u = u$ on $U$, and we have
\[ \|\bar u\|_{W^{1,p}(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(U)}. \]
By multiplying $\bar u$ by a cut-off, we may assume $\operatorname{supp} \bar u \subseteq V$ for some $V \Supset U$.
Now notice that the whole construction is linear in $u$. So we have constructed a bounded linear operator from a dense subset of $W^{1,p}(U)$ to $W^{1,p}(V)$, and there is a unique extension to the whole of $W^{1,p}(U)$ by the completeness of $W^{1,p}(V)$. We can see that the desired properties are preserved by this extension.
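The claim that the higher order reflection used in the proof is $C^1$ across $\{x_n = 0\}$ can be sanity-checked numerically in the normal variable alone. The sketch below is an illustration under assumptions: the test function `u` and the helper names are arbitrary choices, and the tangential variables $x'$ are suppressed.

```python
import math

def reflect(u):
    """Extend u from t >= 0 to all of R by the higher order reflection
    u_bar(t) = -3 u(-t) + 4 u(-t / 2) for t < 0."""
    def u_bar(t):
        return u(t) if t >= 0 else -3.0 * u(-t) + 4.0 * u(-t / 2.0)
    return u_bar

def one_sided_slopes(g, h=1e-6):
    """Difference quotients of g at 0 from the right and from the left."""
    return (g(h) - g(0.0)) / h, (g(0.0) - g(-h)) / h

def u(t):
    # An arbitrary smooth test function with u'(0) = 1 (hypothetical choice).
    return math.exp(t) + t * t

u_bar = reflect(u)
right, left = one_sided_slopes(u_bar)

# For comparison: the plain even reflection t -> u(|t|) has a kink at 0.
even_right, even_left = one_sided_slopes(lambda t: u(abs(t)))
```

Here `right` and `left` agree up to discretisation error, whereas the even reflection has slopes of opposite sign. The coefficients $-3$ and $4$ are exactly what makes both the value ($-3 + 4 = 1$) and the normal derivative ($3 - 2 = 1$) match at the boundary.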
Trace theorems
A lot of the PDE problems we are interested in are boundary value problems, namely we want to solve a PDE subject to the function taking some prescribed values on the boundary. However, a function $u \in L^p(U)$ is only defined up to sets of measure zero, and $\partial U$ is typically a set of measure zero. So naively, we can’t define $u|_{\partial U}$. We would hope that if we require $u$ to have more regularity, then perhaps it now makes sense to define the value at the boundary. This is true, and is given by the trace theorem.
Theorem (Trace theorem). Assume $U$ is bounded and has $C^1$ boundary. Then there exists a bounded linear operator $T : W^{1,p}(U) \to L^p(\partial U)$ for $1 \le p < \infty$ such that $Tu = u|_{\partial U}$ if $u \in W^{1,p}(U) \cap C(\bar U)$.
We say $Tu$ is the trace of $u$.
Proof. It suffices to show that the restriction map defined on $C^\infty$ functions is a bounded linear operator, and then we have a unique extension to $W^{1,p}(U)$. The gist of the argument is that Stokes’ theorem allows us to express the integral of a function over the boundary as an integral over the whole of $U$. In fact, the proof is indeed just the proof of Stokes’ theorem.
By a general partition of unity argument, it suffices to show this in the case where $U = \{x_n > 0\}$ and $u \in C^\infty(\bar U)$ with $\operatorname{supp} u \subseteq B_R(0) \cap \bar U$. Then, since $u$ vanishes for large $x_n$,
\begin{align*}
\int_{\mathbb{R}^{n-1}} |u(x', 0)|^p \, dx' &= -\int_{\mathbb{R}^{n-1}} \int_0^\infty \frac{\partial}{\partial x_n} |u(x', x_n)|^p \, dx_n \, dx' \\
&= -\int_U p |u|^{p-1} u_{x_n} \operatorname{sgn} u \, dx_n \, dx'.
\end{align*}
We estimate this using Young’s inequality to get
\[ \int_{\mathbb{R}^{n-1}} |u(x', 0)|^p \, dx' \le C_p \int_U |u|^p + |u_{x_n}|^p \, dx \le C_p \|u\|^p_{W^{1,p}(U)}. \]
So we are done.
We can apply this to each derivative to define trace maps $W^{k,p}(U) \to W^{k-1,p}(\partial U)$.
In general, this trace map is not surjective. So in some sense, we don’t
actually need to use up a whole unit of differentiability. In the example sheet,
we see that in the case p = 2, we only lose “half” a derivative.
Note that $C_c^\infty(U)$ is dense in $W_0^{1,p}(U)$, and the trace vanishes on $C_c^\infty(U)$. So $T$ vanishes on $W_0^{1,p}(U)$. In fact, the converse is true: if $Tu = 0$, then $u \in W_0^{1,p}(U)$.
3.5 Sobolev inequalities
Before we can move on to PDEs, we have to prove some Sobolev inequalities. These are inequalities that compare different norms, and allow us to “trade” different desirable properties. One particularly important thing we can do is to trade differentiability for continuity. So we will know that if $u \in W^{k,p}(U)$ for some large $k$, then in fact $u \in C^m(U)$ for some (small) $m$. The utility of these results is that we would like to construct our solutions in $W^{k,p}$ spaces, since these are easier to work with, but ultimately, we want an actual, smooth solution to our equation. Sobolev inequalities let us do so, since if $u \in W^{k,p}(U)$ for all $k$, then it must be in $C^m$ for every $m$ as well.
To see why we should be expected to be able to do that, consider the space $H_0^1([0, 1])$. A priori, if $u \in H_0^1([0, 1])$, then we only know it exists as some measurable function, and there is no canonical representative of this function. However, we can simply assign
\[ u(x) = \int_0^x u'(t) \, dt, \]
since we know $u'$ is an honest integrable function. This gives a well-defined representative of the function $u$, and even better, we can bound its supremum using $\|u'\|_{L^2([0,1])}$.
Before we start proving our Sobolev inequalities, we first prove the following
lemma:
Lemma. Let $n \ge 2$ and $f_1, \ldots, f_n \in L^{n-1}(\mathbb{R}^{n-1})$. For $1 \le i \le n$, denote
\[ \tilde x_i = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n), \]
and set
\[ f(x) = f_1(\tilde x_1) \cdots f_n(\tilde x_n). \]
Then $f \in L^1(\mathbb{R}^n)$ with
\[ \|f\|_{L^1(\mathbb{R}^n)} \le \prod_{i=1}^n \|f_i\|_{L^{n-1}(\mathbb{R}^{n-1})}. \]
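Proof. We proceed by induction on $n$.
If $n = 2$, then this is easy, since
\[ f(x_1, x_2) = f_1(x_2) f_2(x_1). \]
So
\begin{align*}
\int_{\mathbb{R}^2} |f(x_1, x_2)| \, dx &= \int |f_1(x_2)| \, dx_2 \int |f_2(x_1)| \, dx_1 \\
&= \|f_1\|_{L^1(\mathbb{R}^1)} \|f_2\|_{L^1(\mathbb{R}^1)}.
\end{align*}
Suppose that the result is true for $n \ge 2$, and consider the $n + 1$ case. Write
\[ f(x) = f_{n+1}(\tilde x_{n+1}) F(x), \]
where $F(x) = f_1(\tilde x_1) \cdots f_n(\tilde x_n)$. Then by Hölder’s inequality, we have
\[ \int_{x_1, \ldots, x_n} |f(\,\cdot\,, x_{n+1})| \, dx \le \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \|F(\,\cdot\,, x_{n+1})\|_{L^{n/(n-1)}(\mathbb{R}^n)}. \]
We now apply the induction hypothesis to
\[ |f_1|^{n/(n-1)}(\,\cdot\,, x_{n+1}) \, |f_2|^{n/(n-1)}(\,\cdot\,, x_{n+1}) \cdots |f_n|^{n/(n-1)}(\,\cdot\,, x_{n+1}). \]
So
\begin{align*}
\int_{x_1, \ldots, x_n} |f(\,\cdot\,, x_{n+1})| \, dx &\le \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \left( \prod_{i=1}^n \left\| |f_i|^{\frac{n}{n-1}}(\,\cdot\,, x_{n+1}) \right\|_{L^{n-1}(\mathbb{R}^{n-1})} \right)^{\frac{n-1}{n}} \\
&= \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \prod_{i=1}^n \|f_i(\,\cdot\,, x_{n+1})\|_{L^n(\mathbb{R}^{n-1})}.
\end{align*}
Now integrate over $x_{n+1}$, and apply Hölder’s inequality with $n$ factors, each with exponent $n$. We get
\begin{align*}
\|f\|_{L^1(\mathbb{R}^{n+1})} &\le \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \int_{x_{n+1}} \prod_{i=1}^n \|f_i(\,\cdot\,, x_{n+1})\|_{L^n(\mathbb{R}^{n-1})} \, dx_{n+1} \\
&\le \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \prod_{i=1}^n \left( \int_{x_{n+1}} \|f_i(\,\cdot\,, x_{n+1})\|^n_{L^n(\mathbb{R}^{n-1})} \, dx_{n+1} \right)^{1/n} \\
&= \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \prod_{i=1}^n \|f_i\|_{L^n(\mathbb{R}^n)}.
\end{align*}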
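The proof of the lemma only uses Hölder’s inequality and Fubini, so the same inequality holds with counting measure on a finite grid. The following sketch checks the $n = 3$ case on random data; the grid size, the random seed and the function names are assumptions of this illustration, not part of the notes.

```python
import math
import random

def loomis_whitney_sides(f1, f2, f3):
    """Both sides of the n = 3 inequality for functions on an m x m grid
    (counting measure): f(x) = f1(x2, x3) * f2(x1, x3) * f3(x1, x2)."""
    m = len(f1)
    lhs = sum(abs(f1[b][c] * f2[a][c] * f3[a][b])
              for a in range(m) for b in range(m) for c in range(m))

    def l2(g):
        # The L^{n-1} norm with n - 1 = 2.
        return math.sqrt(sum(v * v for row in g for v in row))

    return lhs, l2(f1) * l2(f2) * l2(f3)

def random_grid(m, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(m)]

rng = random.Random(0)
checks = [loomis_whitney_sides(*(random_grid(6, rng) for _ in range(3)))
          for _ in range(20)]
```

In every sample the left-hand side (the $\ell^1$ norm of the product) is bounded by the product of the three $\ell^2$ norms, as the lemma predicts.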
Theorem (Gagliardo–Nirenberg–Sobolev inequality). Assume $n > p$. Then we have
\[ W^{1,p}(\mathbb{R}^n) \subseteq L^{p^*}(\mathbb{R}^n), \]
where
\[ p^* = \frac{np}{n - p} > p, \]
and there exists $c > 0$ depending on $n, p$ such that
\[ \|u\|_{L^{p^*}(\mathbb{R}^n)} \le c \|u\|_{W^{1,p}(\mathbb{R}^n)}. \]
In other words, $W^{1,p}(\mathbb{R}^n)$ is continuously embedded in $L^{p^*}(\mathbb{R}^n)$.
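Proof. Assume $u \in C_c^\infty(\mathbb{R}^n)$, and consider $p = 1$. Since the support is compact,
\[ u(x) = \int_{-\infty}^{x_i} u_{x_i}(x_1, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_n) \, dy_i. \]
So we know that
\[ |u(x)| \le \int_{-\infty}^\infty |Du(x_1, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_n)| \, dy_i \equiv f_i(\tilde x_i). \]
Thus, applying this once in each direction, we obtain
\[ |u(x)|^{n/(n-1)} \le \prod_{i=1}^n f_i(\tilde x_i)^{1/(n-1)}. \]
If we integrate and then use the lemma, we see that
\[ \|u\|^{n/(n-1)}_{L^{n/(n-1)}(\mathbb{R}^n)} \le \prod_{i=1}^n \left\| f_i^{1/(n-1)} \right\|_{L^{n-1}(\mathbb{R}^{n-1})} = \|Du\|^{n/(n-1)}_{L^1(\mathbb{R}^n)}. \]
So
\[ \|u\|_{L^{n/(n-1)}(\mathbb{R}^n)} \le \|Du\|_{L^1(\mathbb{R}^n)}. \]
Since $C_c^\infty(\mathbb{R}^n)$ is dense in $W^{1,1}(\mathbb{R}^n)$, the result for $p = 1$ follows.
Now suppose $p > 1$. We apply the $p = 1$ case to
\[ v = |u|^\gamma \]
for some $\gamma > 1$, which we choose later. Then we have
\[ Dv = \gamma \operatorname{sgn} u \cdot |u|^{\gamma - 1} Du. \]
So
\begin{align*}
\left( \int_{\mathbb{R}^n} |u|^{\frac{\gamma n}{n-1}} \, dx \right)^{\frac{n-1}{n}} &\le \gamma \int_{\mathbb{R}^n} |u|^{\gamma - 1} |Du| \, dx \\
&\le \gamma \left( \int_{\mathbb{R}^n} |u|^{(\gamma - 1)\frac{p}{p-1}} \, dx \right)^{\frac{p-1}{p}} \left( \int_{\mathbb{R}^n} |Du|^p \, dx \right)^{\frac{1}{p}}.
\end{align*}
We choose $\gamma$ such that
\[ \frac{\gamma n}{n - 1} = \frac{(\gamma - 1) p}{p - 1}. \]
So we should pick
\[ \gamma = \frac{p(n - 1)}{n - p} > 1. \]
Then we have
\[ \frac{\gamma n}{n - 1} = \frac{np}{n - p} = p^*. \]
So
\[ \left( \int_{\mathbb{R}^n} |u|^{p^*} \, dx \right)^{\frac{n-1}{n}} \le \frac{p(n - 1)}{n - p} \left( \int_{\mathbb{R}^n} |u|^{p^*} \, dx \right)^{\frac{p-1}{p}} \|Du\|_{L^p(\mathbb{R}^n)}. \]
Since $\frac{n-1}{n} - \frac{p-1}{p} = \frac{n-p}{np} = \frac{1}{p^*}$, dividing through gives
\[ \left( \int_{\mathbb{R}^n} |u|^{p^*} \, dx \right)^{1/p^*} \le \frac{p(n - 1)}{n - p} \|Du\|_{L^p(\mathbb{R}^n)}. \]
This argument is valid for $u \in C_c^\infty(\mathbb{R}^n)$, and by approximation, we can extend to $W^{1,p}(\mathbb{R}^n)$.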
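The exponent arithmetic in the proof is easy to get wrong, so here is a small exact check using rational arithmetic. It is only a sketch: the function names are invented for this illustration, and `exponents_match` assumes $1 < p < n$.

```python
from fractions import Fraction

def sobolev_conjugate(n, p):
    """p* = n p / (n - p), defined for 1 <= p < n."""
    if not 1 <= p < n:
        raise ValueError("need 1 <= p < n")
    return Fraction(n) * p / (n - p)

def gamma_exponent(n, p):
    """The power gamma = p (n - 1) / (n - p) used for v = |u|**gamma."""
    return Fraction(p) * (n - 1) / (n - p)

def exponents_match(n, p):
    """Check gamma n / (n - 1) = (gamma - 1) p / (p - 1) = p*, and that
    (n - 1)/n - (p - 1)/p = 1/p*. Requires 1 < p < n."""
    g = gamma_exponent(n, p)
    p_star = sobolev_conjugate(n, p)
    left = g * n / (n - 1)
    right = (g - 1) * p / (p - 1)
    gap = Fraction(n - 1, n) - Fraction(p - 1) / p
    return left == right == p_star and gap == 1 / p_star
```

For example, in dimension $n = 3$ with $p = 2$ one gets $\gamma = 4$ and $p^* = 6$, so $H^1(\mathbb{R}^3) \subseteq L^6(\mathbb{R}^3)$.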
We can deduce some corollaries of this result:
Corollary. Suppose $U \subseteq \mathbb{R}^n$ is open and bounded with $C^1$-boundary, and $1 \le p < n$. Then if $p^* = \frac{np}{n - p}$, we have
\[ W^{1,p}(U) \subseteq L^{p^*}(U), \]
and there exists $C = C(U, p, n)$ such that
\[ \|u\|_{L^{p^*}(U)} \le C \|u\|_{W^{1,p}(U)}. \]
Proof. By the extension theorem, we can find $\bar u \in W^{1,p}(\mathbb{R}^n)$ with $\bar u = u$ almost everywhere on $U$ and
\[ \|\bar u\|_{W^{1,p}(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(U)}. \]
Then we have
\[ \|u\|_{L^{p^*}(U)} \le \|\bar u\|_{L^{p^*}(\mathbb{R}^n)} \le c \|\bar u\|_{W^{1,p}(\mathbb{R}^n)} \le \tilde C \|u\|_{W^{1,p}(U)}. \]
Corollary. Suppose $U$ is open and bounded, and suppose $u \in W_0^{1,p}(U)$ for some $1 \le p < n$. Then we have the estimates
\[ \|u\|_{L^q(U)} \le C \|Du\|_{L^p(U)} \]
for any $q \in [1, p^*]$. In particular,
\[ \|u\|_{L^p(U)} \le C \|Du\|_{L^p(U)}. \]
Proof. Since $u \in W_0^{1,p}(U)$, there exist $u_m \in C_c^\infty(U)$ converging to $u$ in $W^{1,p}(U)$. Extending $u_m$ to vanish on $U^c$, we have
\[ u_m \in C_c^\infty(\mathbb{R}^n). \]
Applying Gagliardo–Nirenberg–Sobolev, we find that
\[ \|u_m\|_{L^{p^*}(\mathbb{R}^n)} \le C \|Du_m\|_{L^p(\mathbb{R}^n)}. \]
So we know that
\[ \|u_m\|_{L^{p^*}(U)} \le C \|Du_m\|_{L^p(U)}. \]
Sending $m \to \infty$, we obtain
\[ \|u\|_{L^{p^*}(U)} \le C \|Du\|_{L^p(U)}. \]
Since $U$ is bounded, by Hölder, we have
\[ \left( \int_U |u|^q \, dx \right)^{1/q} \le \left( \int_U 1 \, dx \right)^{1/rq} \left( \int_U |u|^{qs} \, dx \right)^{1/sq} \le C \|u\|_{L^{p^*}(U)} \]
provided $q \le p^*$, where we choose $s$ such that $qs = p^*$, and $r$ such that $\frac{1}{r} + \frac{1}{s} = 1$.
The previous results were about the case $n > p$. If $n < p < \infty$, then we might hope that if $u \in W^{1,p}(\mathbb{R}^n)$, then $u$ is “better than $L^\infty$”.
Theorem (Morrey’s inequality). Suppose $n < p < \infty$. Then there exists a constant $C$ depending only on $p$ and $n$ such that
\[ \|u\|_{C^{0,\gamma}(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(\mathbb{R}^n)} \]
for all $u \in C_c^\infty(\mathbb{R}^n)$, where $\gamma = 1 - \frac{n}{p} < 1$.
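Proof. We first prove the Hölder part of the estimate.
Let $Q$ be an open cube of side length $r > 0$ and containing 0. Define
\[ \bar u = \frac{1}{|Q|} \int_Q u(x) \, dx. \]
Then
\[ |\bar u - u(0)| = \left| \frac{1}{|Q|} \int_Q [u(x) - u(0)] \, dx \right| \le \frac{1}{|Q|} \int_Q |u(x) - u(0)| \, dx. \]
Note that
\[ u(x) - u(0) = \int_0^1 \frac{d}{dt} u(tx) \, dt = \sum_i \int_0^1 x_i \frac{\partial u}{\partial x_i}(tx) \, dt. \]
So, since $|x_i| \le r$ for $x \in Q$,
\[ |u(x) - u(0)| \le r \int_0^1 \sum_i \left| \frac{\partial u}{\partial x_i}(tx) \right| dt. \]
So we have
\begin{align*}
|\bar u - u(0)| &\le \frac{r}{|Q|} \int_Q \int_0^1 \sum_i \left| \frac{\partial u}{\partial x_i}(tx) \right| dt \, dx \\
&= \frac{r}{|Q|} \int_0^1 t^{-n} \left( \int_{tQ} \sum_i \left| \frac{\partial u}{\partial x_i}(y) \right| dy \right) dt \\
&\le \frac{r}{|Q|} \int_0^1 t^{-n} \left( \sum_{i=1}^n \left\| \frac{\partial u}{\partial x_i} \right\|_{L^p(tQ)} |tQ|^{1/p'} \right) dt,
\end{align*}
where $\frac{1}{p} + \frac{1}{p'} = 1$.
Using that $|Q| = r^n$, we obtain
\begin{align*}
|\bar u - u(0)| &\le c r^{1 - n + \frac{n}{p'}} \|Du\|_{L^p(\mathbb{R}^n)} \int_0^1 t^{-n + \frac{n}{p'}} \, dt \\
&\le \frac{c}{1 - n/p} r^{1 - n/p} \|Du\|_{L^p(\mathbb{R}^n)}.
\end{align*}
Note that the right hand side tends to zero as $r \to 0$, since $p > n$. So when we take $r$ to be very small, we see that $u(0)$ is close to the average value of $u$ around 0.
Now suppose $x, y \in \mathbb{R}^n$ with $|x - y| = \frac{r}{2}$. Pick a box containing $x$ and $y$ of side length $r$. Applying the above result, shifted so that $x$, $y$ play the role of 0, we can estimate
\[ |u(x) - u(y)| \le |u(x) - \bar u| + |u(y) - \bar u| \le \tilde C r^{1 - n/p} \|Du\|_{L^p(\mathbb{R}^n)}. \]
Since $r = 2|x - y|$, it follows that
\[ \frac{|u(x) - u(y)|}{|x - y|^{1 - n/p}} \le \tilde C \cdot 2^{1 - n/p} \|Du\|_{L^p(\mathbb{R}^n)}. \]
So we conclude that $[u]_{C^{0,\gamma}(\mathbb{R}^n)} \le C \|Du\|_{L^p(\mathbb{R}^n)}$.
Finally, to see that $u$ is bounded, any $x \in \mathbb{R}^n$ belongs to some cube $Q$ of side length 1. So we have
\[ |u(x)| \le |u(x) - \bar u + \bar u| \le |\bar u| + C \|Du\|_{L^p(\mathbb{R}^n)}. \]
But also
\[ |\bar u| \le \int_Q |u(x)| \, dx \le \|u\|_{L^p(\mathbb{R}^n)} \|1\|_{L^{p'}(Q)} = \|u\|_{L^p(\mathbb{R}^n)}. \]
So we are done.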
Corollary. Suppose $u \in W^{1,p}(U)$ for $U$ open, bounded with $C^1$ boundary, where $n < p < \infty$ and $\gamma = 1 - \frac{n}{p}$. Then there exists $u^* \in C^{0,\gamma}(U)$ such that $u = u^*$ almost everywhere and $\|u^*\|_{C^{0,\gamma}(U)} \le C \|u\|_{W^{1,p}(U)}$.
By applying these results iteratively, we can establish higher order versions
\[ W^{k,p}(U) \subseteq L^q(U) \]
with some appropriate $q$.
4 Elliptic boundary value problems
4.1 Existence of weak solutions
In this chapter, we are going to study second-order elliptic boundary value
problems. The canonical example to keep in mind is the following:
Example. Suppose $U \subseteq \mathbb{R}^n$ is a bounded open set with smooth boundary. Suppose $\partial U$ is a perfect conductor and $\rho : U \to \mathbb{R}$ is the charge density inside $U$. The electrostatic potential $\phi$ satisfies
\begin{align*}
\Delta \phi &= \rho \text{ on } U \\
\phi|_{\partial U} &= 0.
\end{align*}
This is an example of an elliptic boundary value problem. Note that we cannot tackle this with the Cauchy–Kovalevskaya theorem, since we don’t even have enough boundary conditions, and also because we want an everywhere-defined solution.
In general, let $U \subseteq \mathbb{R}^n$ be open and bounded with $C^1$ boundary, and for $u \in C^2(\bar U)$, we define
\[ Lu = -\sum_{i,j=1}^n (a_{ij}(x) u_{x_j})_{x_i} + \sum_{i=1}^n b_i(x) u_{x_i} + c(x) u, \]
where $a_{ij}$, $b_i$ and $c$ are given functions defined on $U$. Typically, we will assume they are at least $L^\infty$, but sometimes we will require more.
If $a_{ij} \in C^1(U)$, then we can rewrite this as
\[ Lu = -\sum_{i,j=1}^n a_{ij}(x) u_{x_i x_j} + \sum_{i=1}^n \tilde b_i(x) u_{x_i} + c(x) u \]
for some $\tilde b_i$, using the product rule. Explicitly, $\tilde b_j = b_j - \sum_i (a_{ij})_{x_i}$.
We will mostly use the first form, called the divergence form, which is suitable for the energy method, while the second (non-divergence form) is suited to the maximum principle. Essentially, what makes the divergence form convenient for us is that it’s easy to integrate by parts.
Of course, given the title of the chapter, we assume that $L$ is elliptic, i.e.
\[ \sum_{i,j} a_{ij}(x) \xi_i \xi_j \ge 0 \]
for all $x \in U$ and $\xi \in \mathbb{R}^n$.
It turns out this is not quite strong enough, because this condition allows the $a_{ij}$’s to be degenerate, or vanish at the boundary.
Definition (Uniform ellipticity). An operator
\[ Lu = -\sum_{i,j=1}^n (a_{ij}(x) u_{x_j})_{x_i} + \sum_{i=1}^n b_i(x) u_{x_i} + c(x) u \]
is uniformly elliptic if
\[ \sum_{i,j=1}^n a_{ij}(x) \xi_i \xi_j \ge \theta |\xi|^2 \]
for some $\theta > 0$ and all $x \in U$, $\xi \in \mathbb{R}^n$.
We shall consider the boundary value problem
\begin{align*}
Lu &= f \text{ on } U \\
u &= 0 \text{ on } \partial U.
\end{align*}
This form of the equation is not very amenable to study by functional analytic methods. Similar to what we did in the proof of Picard–Lindelöf, we want to write this in a weak formulation.
Let’s suppose $u \in C^2(\bar U)$ is a solution, and suppose $v \in C^2(\bar U)$ also satisfies $v|_{\partial U} = 0$. Multiply the equation $Lu = f$ by $v$ and integrate by parts. Then we get
\[ \int_U v f \, dx = \int_U \left( \sum_{i,j} v_{x_i} a_{ij} u_{x_j} + \sum_i b_i u_{x_i} v + c u v \right) dx \equiv B[u, v]. \tag{2} \]
Conversely, suppose $u \in C^2(\bar U)$ and $u|_{\partial U} = 0$. If $\int_U v f \, dx = B[u, v]$ for all $v \in C^2(\bar U)$ such that $v|_{\partial U} = 0$, then we claim $u$ in fact solves the original equation.
Indeed, undoing the integration by parts, we conclude that
\[ \int v Lu \, dx = \int v f \, dx \]
for all $v \in C^2(\bar U)$ with $v|_{\partial U} = 0$. But if this is true for all $v$, then it must be that $Lu = f$.
Thus, the PDE problem we started with is equivalent to finding $u$ that solves $B[u, v] = \int_U v f \, dx$ for all suitable $v$, provided $u$ is regular enough.
But the point is that (2) makes sense for $u, v \in H_0^1(U)$. So our strategy is to first show that we can find $u \in H_0^1(U)$ that solves (2), and then hope that under reasonable assumptions, we can show that any such solution must in fact be $C^2(\bar U)$.
Definition (Weak solution). We say $u \in H_0^1(U)$ is a weak solution of
\begin{align*}
Lu &= f \text{ on } U \\
u &= 0 \text{ on } \partial U
\end{align*}
for $f \in L^2(U)$ if
\[ B[u, v] = (f, v)_{L^2(U)} \]
for all $v \in H_0^1(U)$.
We’ll exploit the Hilbert space structure of $H_0^1(U)$ to find weak solutions.
Theorem (Lax–Milgram theorem). Let $H$ be a real Hilbert space with inner product $(\,\cdot\,, \,\cdot\,)$. Suppose $B : H \times H \to \mathbb{R}$ is a bilinear mapping such that there exist constants $\alpha, \beta > 0$ so that
– $|B[u, v]| \le \alpha \|u\| \|v\|$ for all $u, v \in H$ (boundedness)
– $\beta \|u\|^2 \le B[u, u]$ for all $u \in H$ (coercivity)
Then if $f : H \to \mathbb{R}$ is a bounded linear map, there exists a unique $u \in H$ such that
\[ B[u, v] = \langle f, v \rangle \]
for all $v \in H$.
Note that if $B$ is just the inner product, then this is the Riesz representation theorem.
Proof. By the Riesz representation theorem, there is some $w \in H$ such that
\[ \langle f, v \rangle = (w, v) \]
for all $v \in H$. For each fixed $u \in H$, the map
\[ v \mapsto B[u, v] \]
is a bounded linear functional on $H$. So by the Riesz representation theorem, we can find some $Au$ such that
\[ B[u, v] = (Au, v). \]
It then suffices to show that $A$ is invertible, for then we can take $u = A^{-1} w$.
– Since $B$ is bilinear, it is immediate that $A : H \to H$ is linear.
– $A$ is bounded, since we have
\[ \|Au\|^2 = (Au, Au) = B[u, Au] \le \alpha \|u\| \|Au\|. \]
– $A$ is injective and has closed image. Indeed, by coercivity, we know
\[ \beta \|u\|^2 \le B[u, u] = (Au, u) \le \|Au\| \|u\|. \]
Dividing by $\|u\|$, we see that $A$ is bounded below, hence is injective and has closed image (since $H$ is complete).
(Indeed, injectivity is clear, and if $Au_m \to v$ for some $v$, then $\|u_m - u_n\| \le \frac{1}{\beta} \|Au_m - Au_n\| \to 0$ as $m, n \to \infty$. So $(u_m)$ is Cauchy, and hence has a limit $u$. Then by continuity, $Au = v$, and in particular, $v \in \operatorname{im} A$.)
– Since $\operatorname{im} A$ is closed, we know
\[ H = \operatorname{im} A \oplus (\operatorname{im} A)^\perp. \]
Now let $z \in (\operatorname{im} A)^\perp$. Then we can estimate
\[ \beta \|z\|^2 \le B[z, z] = (Az, z) = 0. \]
So $z = 0$. Thus, in fact $(\operatorname{im} A)^\perp = \{0\}$, and so $A$ is surjective.
We would like to apply this to our elliptic PDE. To do so, we need to prove that our $B$ satisfies boundedness and coercivity. Unfortunately, this is not always true.
Theorem (Energy estimates for $B$). Suppose $a_{ij} = a_{ji}, b_i, c \in L^\infty(U)$, and there exists $\theta > 0$ such that
\[ \sum_{i,j=1}^n a_{ij}(x) \xi_i \xi_j \ge \theta |\xi|^2 \]
for almost every $x \in U$ and all $\xi \in \mathbb{R}^n$. Then if $B$ is defined by
\[ B[u, v] = \int_U \left( \sum_{i,j} v_{x_i} a_{ij} u_{x_j} + \sum_i b_i u_{x_i} v + c u v \right) dx, \]
then there exist $\alpha, \beta > 0$ and $\gamma \ge 0$ such that
(i) $|B[u, v]| \le \alpha \|u\|_{H^1(U)} \|v\|_{H^1(U)}$ for all $u, v \in H_0^1(U)$
(ii) $\beta \|u\|^2_{H^1(U)} \le B[u, u] + \gamma \|u\|^2_{L^2(U)}$ for all $u \in H_0^1(U)$.
Moreover, if $b_i \equiv 0$ and $c \ge 0$, then we can take $\gamma = 0$.
Proof.
(i) We estimate
\begin{align*}
|B[u, v]| &\le \sum_{i,j} \|a_{ij}\|_{L^\infty(U)} \int_U |Du| |Dv| \, dx \\
&\quad + \sum_i \|b_i\|_{L^\infty(U)} \int_U |Du| |v| \, dx \\
&\quad + \|c\|_{L^\infty(U)} \int_U |u| |v| \, dx \\
&\le c_1 \|Du\|_{L^2(U)} \|Dv\|_{L^2(U)} + c_2 \|Du\|_{L^2(U)} \|v\|_{L^2(U)} + c_3 \|u\|_{L^2(U)} \|v\|_{L^2(U)} \\
&\le \alpha \|u\|_{H^1(U)} \|v\|_{H^1(U)}
\end{align*}
for some $\alpha$.
(ii) We start from uniform ellipticity. This implies
\begin{align*}
\theta \int_U |Du|^2 \, dx &\le \int_U \sum_{i,j=1}^n a_{ij}(x) u_{x_i} u_{x_j} \, dx \\
&= B[u, u] - \int_U \left( \sum_{i=1}^n b_i u_{x_i} u + c u^2 \right) dx \\
&\le B[u, u] + \sum_{i=1}^n \|b_i\|_{L^\infty(U)} \int_U |Du| |u| \, dx + \|c\|_{L^\infty(U)} \int_U |u|^2 \, dx.
\end{align*}
Now by Young’s inequality, we have
\[ \int_U |Du| |u| \, dx \le \varepsilon \int_U |Du|^2 \, dx + \frac{1}{4\varepsilon} \int_U |u|^2 \, dx \]
for any $\varepsilon > 0$. We choose $\varepsilon$ small enough so that
\[ \varepsilon \sum_{i=1}^n \|b_i\|_{L^\infty(U)} \le \frac{\theta}{2}. \]
So we have
\[ \theta \int_U |Du|^2 \, dx \le B[u, u] + \frac{\theta}{2} \int_U |Du|^2 \, dx + \gamma \int_U |u|^2 \, dx \]
for some $\gamma$. This implies
\[ \frac{\theta}{2} \|Du\|^2_{L^2(U)} \le B[u, u] + \gamma \|u\|^2_{L^2(U)}. \]
We can add $\frac{\theta}{2} \|u\|^2_{L^2(U)}$ on both sides to get the desired bound on $\|u\|_{H^1(U)}$.
To get the “moreover” statement, we see that under these conditions, we have
\[ \theta \int_U |Du|^2 \, dx \le B[u, u]. \]
Then we apply Poincaré’s inequality, which tells us there is some $C > 0$ such that for all $u \in H_0^1(U)$, we have
\[ \|u\|_{L^2(U)} \le C \|Du\|_{L^2(U)}. \]
The estimate (ii) is sometimes called Gårding’s inequality.
Theorem. Let $U, L$ be as above. There is a $\gamma \ge 0$ such that for any $\mu \ge \gamma$ and any $f \in L^2(U)$, there exists a unique weak solution to
\begin{align*}
Lu + \mu u &= f \text{ on } U \\
u &= 0 \text{ on } \partial U.
\end{align*}
Moreover, we have
\[ \|u\|_{H^1(U)} \le C \|f\|_{L^2(U)} \]
for some $C = C(L, U) \ge 0$.
Again, if $b_i \equiv 0$ and $c \ge 0$, then we may take $\gamma = 0$.
Proof. Take $\gamma$ from the previous theorem when applied to $L$. For $\mu \geq \gamma$, we set
\[ B_\mu[u, v] = B[u, v] + \mu(u, v)_{L^2(U)}. \]
This is the bilinear form corresponding to the operator
\[ L_\mu = L + \mu. \]
Then by the previous theorem, $B_\mu$ satisfies boundedness and coercivity. So if we fix any $f \in L^2(U)$, and think of it as an element of $H^1_0(U)^*$ by
\[ \langle f, v\rangle = (f, v)_{L^2(U)} = \int_U fv\,dx, \]
then we can apply Lax–Milgram to find a unique $u \in H^1_0(U)$ satisfying $B_\mu[u, v] = \langle f, v\rangle = (f, v)_{L^2(U)}$ for all $v \in H^1_0(U)$. This is precisely the condition for $u$ to be a weak solution.
Finally, the Gårding inequality tells us
\[ \beta\|u\|^2_{H^1(U)} \leq B_\mu[u, u] = (f, u)_{L^2(U)} \leq \|f\|_{L^2(U)}\|u\|_{L^2(U)}. \]
Since $\|u\|_{L^2(U)} \leq \|u\|_{H^1(U)}$, we know that
\[ \beta\|u\|_{H^1(U)} \leq \|f\|_{L^2(U)}. \]
In some way, this is a magical result. We managed to solve a PDE without having to actually work with a PDE. There are a few things we might object to. First of all, we only obtained a weak solution, and not a genuine solution. We will show that under some reasonable assumptions on $a, b, c$, if $f$ is better behaved, then $u$ is also better behaved, and in general, if $f \in H^k$, then $u \in H^{k+2}$. This is known as elliptic regularity. Together with the Sobolev inequalities, this tells us $u$ is genuinely a classical solution.
Another problem is the presence of the $\mu$. We noted that if $L$ is, say, the operator in Laplace’s equation, then we can take $\gamma = 0$, and so we don’t have this problem. But in general, this theorem requires it, and this is a bit unsatisfactory. We would like to think a bit more about it.
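To see that the shift $\mu$ cannot simply be dropped, here is a small worked example. It is not from the lectures: the operator and the interval are chosen purely for illustration, and it uses the standard sharp Poincaré inequality $\int_0^1 u^2\,dx \leq \pi^{-2}\int_0^1 (u')^2\,dx$ on $H^1_0(0, 1)$.

```latex
\begin{align*}
  &U = (0, 1), \qquad Lu = -u'' - \pi^2 u, \qquad
    B[u, v] = \int_0^1 (u'v' - \pi^2 uv)\,dx,\\
  &u_0(x) = \sin(\pi x) \in H^1_0(U): \qquad
    B[u_0, v] = \int_0^1 (-u_0'' - \pi^2 u_0)\,v\,dx = 0
    \quad\text{for all } v \in H^1_0(U),\\
  &B_\mu[u, u] = \int_0^1 (u')^2\,dx + (\mu - \pi^2)\int_0^1 u^2\,dx
    \;\geq\; \min\Big(1, \frac{\mu}{\pi^2}\Big)\int_0^1 (u')^2\,dx
    \quad\text{for } \mu > 0.
\end{align*}
```

So with $\mu = 0$ the homogeneous problem has the non-zero weak solution $u_0$ and uniqueness fails, while any $\mu > 0$ restores coercivity for this particular $L$.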
4.2 The Fredholm alternative
To understand the second problem, we shall seek to prove the following theorem:
Theorem (Fredholm alternative). Consider the problem
\[ Lu = f, \quad u|_{\partial U} = 0. \tag{$*$} \]
For $L$ a uniformly elliptic operator on an open bounded set $U$ with $C^1$ boundary, either
(i) For each $f \in L^2(U)$, there is a unique weak solution $u \in H^1_0(U)$ to ($*$); or
(ii) There exists a non-zero weak solution $u \in H^1_0(U)$ to the homogeneous problem, i.e. ($*$) with $f = 0$.
This is similar to what we know about solving matrix equations $Ax = b$: either there is a solution for all $b$, or there are infinitely many solutions to the homogeneous problem.
Similar to the previous theorem, this follows from some general functional analytic result. Recall the definition of a compact operator:
Definition (Compact operator). A bounded operator $K : H \to H'$ is compact if every bounded sequence $(u_m)_{m=1}^\infty$ in $H$ has a subsequence $(u_{m_j})$ such that $(Ku_{m_j})_{j=1}^\infty$ converges strongly in $H'$.
Recall (or prove as an exercise) the following theorem regarding compact operators.
Theorem (Fredholm alternative). Let $H$ be a Hilbert space and $K : H \to H$ be a compact operator. Then
(i) $\ker(I - K)$ is finite-dimensional.
(ii) $\operatorname{im}(I - K)$ is closed.
(iii) $\operatorname{im}(I - K) = \ker(I - K^\dagger)^\perp$.
(iv) $\ker(I - K) = \{0\}$ iff $\operatorname{im}(I - K) = H$.
(v) $\dim\ker(I - K) = \dim\ker(I - K^\dagger) = \dim\operatorname{coker}(I - K)$.
How do we apply this to our situation? Our previous theorem told us that $L + \gamma$ is invertible for large $\gamma$, and we claim that $(L + \gamma)^{-1}$ is compact. We can then deduce the elliptic Fredholm alternative stated above by applying (iv) of the Fredholm alternative with $K$ a (scalar multiple of) $(L + \gamma)^{-1}$ (plus some bookkeeping).
So let us show that $(L + \gamma)^{-1}$ is compact. Note that this map sends $f \in L^2(U)$ to $u \in H^1_0(U)$. To make it an endomorphism, we have to compose this with the inclusion $H^1_0(U) \to L^2(U)$. The proof that $(L + \gamma)^{-1}$ is compact will not involve $(L + \gamma)^{-1}$ in any way: we shall show that the inclusion $H^1_0(U) \to L^2(U)$ is compact!
We shall prove this in two steps. First, we need the notion of weak convergence.
Definition (Weak convergence). Suppose $(u_n)_{n=1}^\infty$ is a sequence in a Hilbert space $H$. We say $u_n$ converges weakly to $u \in H$ if
\[ (u_n, w) \to (u, w) \]
for all $w \in H$. We write $u_n \rightharpoonup u$.
Of course, we have
Lemma. Weak limits are unique.
Lemma. Strong convergence implies weak convergence.
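The converse of the second lemma fails, and a quick numerical experiment makes this concrete. The sketch below is not part of the lectures: it takes $H = L^2(0, 2\pi)$ and $u_m(x) = \sin(mx)$, pairs $u_m$ against one fixed test function $w(x) = x$ (an arbitrary choice), and approximates the integrals by the midpoint rule. The pairings shrink like $-2\pi/m$ while $\|u_m\|^2 = \pi$ for every $m$, so $u_m \rightharpoonup 0$ without converging strongly.

```python
import math

def inner(f, g, n=20000):
    """Midpoint-rule approximation of the L^2(0, 2*pi) inner product (f, g)."""
    h = 2 * math.pi / n
    return sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(n)) * h

def u(m):
    """The m-th element u_m(x) = sin(m x) of the sequence."""
    return lambda x: math.sin(m * x)

def w(x):
    """A fixed test function in L^2(0, 2*pi); the choice w(x) = x is arbitrary."""
    return x

ms = (1, 10, 100)
pairings = [inner(u(m), w) for m in ms]      # (u_m, w) = -2*pi/m, tends to 0
norms_sq = [inner(u(m), u(m)) for m in ms]   # ||u_m||^2 = pi for every m

for m, p, q in zip(ms, pairings, norms_sq):
    print(f"m = {m:3d}   (u_m, w) = {p:+.5f}   ||u_m||^2 = {q:.5f}")
```

Testing against a single $w$ only illustrates the definition; weak convergence requires the pairing to tend to zero for every $w \in H$.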
We shall show that given any bounded sequence in $H^1_0(U)$, we can find a subsequence that is weakly convergent. We then show that every weakly convergent sequence in $H^1_0(U)$ is strongly convergent in $L^2(U)$.
In fact, the first result is completely general:
Theorem (Weak compactness). Let $H$ be a separable Hilbert space, and suppose $(u_m)_{m=1}^\infty$ is a bounded sequence in $H$ with $\|u_m\| \leq K$ for all $m$. Then $(u_m)$ admits a subsequence $(u_{m_j})_{j=1}^\infty$ such that $u_{m_j} \rightharpoonup u$ for some $u \in H$ with $\|u\| \leq K$.
One can prove this theorem without assuming $H$ is separable, but it is slightly messier.
Proof. Let $(e_i)_{i=1}^\infty$ be an orthonormal basis for $H$. Consider $(e_1, u_m)$. By Cauchy–Schwarz, we have
\[ |(e_1, u_m)| \leq \|e_1\|\|u_m\| \leq K. \]
So by Bolzano–Weierstrass, there exists a subsequence $(u_{m_j})$ such that $(e_1, u_{m_j})$ converges.
Doing this iteratively for $e_2, e_3, \ldots$ and taking a diagonal subsequence, we can find a subsequence $(v_\ell)$ such that for each $i$, there is some $c_i$ such that $(e_i, v_\ell) \to c_i$ as $\ell \to \infty$.
We would expect the weak limit to be $\sum c_i e_i$. To prove this, we need to first show it converges. We have
\[ \sum_{j=1}^p |c_j|^2 = \lim_{\ell\to\infty}\sum_{j=1}^p |(e_j, v_\ell)|^2 \leq \sup_\ell \sum_{j=1}^p |(e_j, v_\ell)|^2 \leq \sup_\ell \|v_\ell\|^2 \leq K^2, \]
using Bessel’s inequality. So
\[ u = \sum_{j=1}^\infty c_j e_j \]
converges in $H$, and $\|u\| \leq K$. We already have
\[ (e_j, v_\ell) \to (e_j, u) \]
for all $j$. Since $\|v_\ell - u\|$ is bounded by $2K$, it follows that the set of all $w$ such that
\[ (w, v_\ell) \to (w, u) \tag{$\dagger$} \]
is closed under finite linear combinations and taking limits, hence is all of $H$.
To see that it is closed under limits, suppose $w_k \to w$, and each $w_k$ satisfies ($\dagger$). Then
\[ |(w, v_\ell) - (w, u)| \leq |(w - w_k, v_\ell - u)| + |(w_k, v_\ell - u)| \leq 2K\|w - w_k\| + |(w_k, v_\ell - u)|. \]
So we can first find $k$ large enough such that the first term is small, then pick $\ell$ such that the second is small.
We next want to show that if $u_m \rightharpoonup u$ in $H^1(U)$, then $u_m \to u$ in $L^2(U)$. We may as well assume that $U$ is some large cube of length $L$ by extension. Notice that since $U$ is bounded, the constant function $1$ is in $H^1(U)$. So $u_m \rightharpoonup u$ in particular implies
\[ \int_U (u_m - u)\,dx \to 0. \]
Recall that the Poincaré inequality tells us if $u \in H^1_0(U)$, then we can bound $\|u\|_{L^2(U)}$ by some multiple of $\|Du\|_{L^2(U)}$. If we try to prove this without the assumption that $u$ vanishes on the boundary, then we find that we need a correction term. The resulting lemma is as follows:
Lemma (Poincaré revisited). Suppose $u \in H^1(\mathbb{R}^n)$. Let $Q = [\xi_1, \xi_1 + L] \times \cdots \times [\xi_n, \xi_n + L]$ be a cube of length $L$. Then we have
\[ \|u\|^2_{L^2(Q)} \leq \frac{1}{|Q|}\left(\int_Q u(x)\,dx\right)^2 + \frac{nL^2}{2}\|Du\|^2_{L^2(Q)}. \]
We can improve this to obtain better bounds by subdividing $Q$ into smaller cubes, and then applying this to each of the cubes individually. By subdividing enough, this leads to a proof that $u_m \rightharpoonup u$ in $H^1$ implies $u_m \to u$ in $L^2$.
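As a sanity check of the lemma and of the subdivision remark, the following sketch evaluates both sides in one dimension ($n = 1$, $Q = [\xi, \xi + L]$) by the midpoint rule. The sample function $u(x) = e^x\cos 3x$, the quadrature resolution and the helper names are illustrative assumptions, not part of the notes.

```python
import math

def midpoint(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

def poincare_sides(u, du, xi, L):
    """Return (lhs, rhs) of the lemma on Q = [xi, xi + L] in dimension n = 1."""
    lhs = midpoint(lambda x: u(x) ** 2, xi, xi + L)
    mean_term = midpoint(u, xi, xi + L) ** 2 / L           # (1/|Q|) (int_Q u)^2
    grad_term = (L ** 2 / 2) * midpoint(lambda x: du(x) ** 2, xi, xi + L)
    return lhs, mean_term + grad_term

def refined_rhs(u, du, xi, L, pieces):
    """Apply the lemma on each of `pieces` equal sub-intervals and add up."""
    delta = L / pieces
    return sum(poincare_sides(u, du, xi + a * delta, delta)[1]
               for a in range(pieces))

# Hypothetical sample function and its derivative.
u = lambda x: math.exp(x) * math.cos(3 * x)
du = lambda x: math.exp(x) * (math.cos(3 * x) - 3 * math.sin(3 * x))

lhs, rhs = poincare_sides(u, du, xi=0.0, L=2.0)
rhs_8 = refined_rhs(u, du, xi=0.0, L=2.0, pieces=8)
print(lhs, rhs, rhs_8)
```

After subdividing, the gradient contribution carries the factor $\delta^2/2$ instead of $L^2/2$, which is exactly what the compactness argument exploits.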
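Proof. By approximation, we can assume $u \in C^\infty(\bar{Q})$. For $x, y \in Q$, we write
\begin{align*}
u(x) - u(y) &= \int_{y_1}^{x_1} \frac{d}{dt} u(t, x_2, \ldots, x_n)\,dt\\
&\quad + \int_{y_2}^{x_2} \frac{d}{dt} u(y_1, t, x_3, \ldots, x_n)\,dt\\
&\quad + \cdots\\
&\quad + \int_{y_n}^{x_n} \frac{d}{dt} u(y_1, \ldots, y_{n-1}, t)\,dt.
\end{align*}
Squaring, and using $2ab \leq a^2 + b^2$ (so that the square of a sum of $n$ terms is at most $n$ times the sum of the squares), we have
\begin{align*}
u(x)^2 + u(y)^2 - 2u(x)u(y) &\leq n\left(\int_{y_1}^{x_1} \frac{d}{dt} u(t, x_2, \ldots, x_n)\,dt\right)^2 + \cdots\\
&\quad + n\left(\int_{y_n}^{x_n} \frac{d}{dt} u(y_1, \ldots, y_{n-1}, t)\,dt\right)^2.
\end{align*}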
Now integrate over $x$ and $y$. On the left, we get
\[ \iint_{Q\times Q} dx\,dy\,\big(u(x)^2 + u(y)^2 - 2u(x)u(y)\big) = 2|Q|\|u\|^2_{L^2(Q)} - 2\left(\int_Q u(x)\,dx\right)^2. \]
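On the right we have
\begin{align*}
I_1 &= \left(\int_{y_1}^{x_1} \frac{d}{dt} u(t, x_2, \ldots, x_n)\,dt\right)^2\\
&\leq \left|\int_{y_1}^{x_1} dt\right| \left|\int_{y_1}^{x_1} \left(\frac{d}{dt} u(t, x_2, \ldots, x_n)\right)^2 dt\right| && \text{(Cauchy–Schwarz)}\\
&\leq L\int_{\xi_1}^{\xi_1 + L} \left(\frac{d}{dt} u(t, x_2, \ldots, x_n)\right)^2 dt.
\end{align*}
Integrating over all $x, y \in Q$, we get
\[ \iint_{Q\times Q} dx\,dy\; I_1 \leq L^2|Q|\|D_1 u\|^2_{L^2(Q)}. \]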
Similarly estimating the other terms on the right-hand side, we find that
\[ 2|Q|\|u\|^2_{L^2(Q)} - 2\left(\int_Q u(x)\,dx\right)^2 \leq nL^2|Q|\sum_{i=1}^n \|D_i u\|^2_{L^2(Q)} = n|Q|L^2\|Du\|^2_{L^2(Q)}. \]
Dividing by $2|Q|$ gives the claimed inequality.
It now follows that
Theorem (Rellich–Kondrachov). Let $U \subseteq \mathbb{R}^n$ be open, bounded with $C^1$ boundary. Then if $(u_m)_{m=1}^\infty$ is a sequence in $H^1(U)$ with $u_m \rightharpoonup u$, then $u_m \to u$ in $L^2(U)$.
In particular, by weak compactness any bounded sequence in $H^1(U)$ has a subsequence that is convergent in $L^2(U)$.
Note that to obtain the “in particular” part, we need to know that $H^1(U)$ is separable. This is an exercise on the example sheet. Alternatively, we can appeal to a stronger version of weak compactness that does not assume separability.
Proof. By the extension theorem, we may replace $U$ by some large cube $Q$ with $U \Subset Q$.
We subdivide $Q$ into $N$ many cubes of side length $\delta$, such that the cubes only intersect at their faces. Call these $\{Q_a\}_{a=1}^N$.
We apply Poincaré separately to each of these to obtain
\begin{align*}
\|u_i - u\|^2_{L^2(Q)} &= \sum_{a=1}^N \|u_i - u\|^2_{L^2(Q_a)}\\
&\leq \sum_{a=1}^N \left[\frac{1}{|Q_a|}\left(\int_{Q_a} (u_i - u)\,dx\right)^2 + \frac{n\delta^2}{2}\|Du_i - Du\|^2_{L^2(Q_a)}\right]\\
&= \sum_{a=1}^N \frac{1}{|Q_a|}\left(\int_{Q_a} (u_i - u)\,dx\right)^2 + \frac{n\delta^2}{2}\|Du_i - Du\|^2_{L^2(Q)}.
\end{align*}
Now since $\|Du_i - Du\|^2_{L^2(Q)}$ is bounded uniformly in $i$ (weakly convergent sequences are bounded), for $\delta$ small enough, the second term is $< \frac{\varepsilon}{2}$.
Then since $u_i \rightharpoonup u$, we in particular have
\[ \int_{Q_a} (u_i - u)\,dx \to 0 \text{ as } i \to \infty \]
for all $a$, since $v \mapsto \int_{Q_a} v\,dx$ is a bounded linear functional on $H^1(Q)$. So for $i$ large enough, the first term is also $< \frac{\varepsilon}{2}$.
The same result holds with $H^1(U)$ replaced by $H^1_0(U)$. The proof is in fact simpler, and we wouldn’t need the assumption that the boundary is $C^1$.
Corollary. Suppose $K : L^2(U) \to H^1(U)$ is a bounded linear operator. Then the composition
\[ L^2(U) \xrightarrow{\;K\;} H^1(U) \hookrightarrow L^2(U) \]
is compact.
The slogan is that we get compactness whenever we improve regularity, which is something that happens in much more generality.
Proof. Indeed, if $u_m \in L^2(U)$ is bounded, then $Ku_m$ is also bounded in $H^1(U)$. So by Rellich–Kondrachov, there exists a subsequence such that $Ku_{m_j}$ converges in $L^2(U)$.
We are now ready to prove the Fredholm alternative for elliptic boundary value problems. Recall that in our description of the Fredholm alternative, we had the direct characterization $\operatorname{im}(I - K) = \ker(I - K^\dagger)^\perp$. We can make the analogous statement here. To do so, we need to talk about the adjoint of $L$. Since $L$ is not an operator defined on $L^2(U)$, trying to write down what it means to be an adjoint is slightly messy. Instead, we shall be content with talking about “formal adjoints”.
It’s been a while since we’ve met a PDE, so let’s recall the setting we had. We have a uniformly elliptic operator
\[ Lu = -\sum_{i,j=1}^n (a^{ij}(x)u_{x_j})_{x_i} + \sum_{i=1}^n b^i(x)u_{x_i} + c(x)u \]
on an open bounded set $U$ with $C^1$ boundary. The associated bilinear form is
\[ B[u, v] = \int_U \Big(\sum_{i,j} a^{ij}(x)u_{x_i}v_{x_j} + \sum_{i=1}^n b^i(x)u_{x_i}v + c(x)uv\Big)\,dx. \]
We are interested in solving the boundary value problem
\[ Lu = f, \quad u|_{\partial U} = 0 \]
with $f \in L^2(U)$.
The formal adjoint of $L$ is defined by the relation
\[ (L\varphi, \psi)_{L^2(U)} = (\varphi, L^\dagger\psi)_{L^2(U)} \]
for all $\varphi, \psi \in C_c^\infty(U)$. By integration by parts, we know $L^\dagger$ should be given by
\[ L^\dagger v = -\sum_{i,j=1}^n (a^{ij}v_{x_j})_{x_i} - \sum_{i=1}^n b^i(x)v_{x_i} + \Big(c - \sum_{i=1}^n b^i_{x_i}\Big)v. \]
Note that here we have to assume that $b^i \in C^1(\bar{U})$. However, what really interests us is the adjoint bilinear form, which is simply given by
\[ B^\dagger[v, u] = B[u, v]. \]
We are actually just interested in $B^\dagger$, and not $L^\dagger$, and we can sensibly talk about $B^\dagger$ even if $b^i$ is not differentiable.
As usual, we say $v \in H^1_0(U)$ is a weak solution of the adjoint problem $L^\dagger v = f$, $v|_{\partial U} = 0$ if
\[ B^\dagger[v, u] = (f, u)_{L^2(U)} \]
for all $u \in H^1_0(U)$.
Given this set up, we can now state and prove the Fredholm alternative.
Theorem (Fredholm alternative for elliptic BVP). Let $L$ be a uniformly elliptic operator on an open bounded set $U$ with $C^1$ boundary. Consider the problem
\[ Lu = f, \quad u|_{\partial U} = 0. \tag{$*$} \]
Then exactly one of the following is true:
(i) For each $f \in L^2(U)$, there is a unique weak solution $u \in H^1_0(U)$ to ($*$).
(ii) There exists a non-zero weak solution $u \in H^1_0(U)$ to the homogeneous problem, i.e. ($*$) with $f = 0$.
If (ii) holds, then the dimension of $N = \ker L \subseteq H^1_0(U)$ is equal to the dimension of $N^* = \ker L^\dagger \subseteq H^1_0(U)$.
Finally, ($*$) has a solution if and only if $(f, v)_{L^2(U)} = 0$ for all $v \in N^*$.
Proof. We know that there exists $\gamma > 0$ such that for any $f \in L^2(U)$, there is a unique weak solution $u \in H^1_0(U)$ to
\[ L_\gamma u = Lu + \gamma u = f, \quad u|_{\partial U} = 0. \]
Moreover, we have the bound $\|u\|_{H^1(U)} \leq C\|f\|_{L^2(U)}$ (which gives uniqueness). Thus, we can set $L_\gamma^{-1}f$ to be this $u$, and then $L_\gamma^{-1} : L^2(U) \to H^1_0(U)$ is a bounded linear map. Composing with the inclusion $H^1_0(U) \hookrightarrow L^2(U)$, we get a compact endomorphism of $L^2(U)$.
Now let $u \in H^1_0(U)$. Then the weak form of ($*$),
\[ B[u, v] = (f, v)_{L^2(U)} \text{ for all } v \in H^1_0(U), \]
is true if and only if
\[ B_\gamma[u, v] \equiv B[u, v] + \gamma(u, v) = (f + \gamma u, v) \text{ for all } v \in H^1_0(U). \]
Hence, $u$ is a weak solution of ($*$) if and only if
\[ u = L_\gamma^{-1}(f + \gamma u) = \gamma L_\gamma^{-1}u + L_\gamma^{-1}f. \]
In other words, $u$ solves ($*$) iff
\[ u - Ku = h, \]
for
\[ K = \gamma L_\gamma^{-1}, \quad h = L_\gamma^{-1}f. \]
Since we know that $K : L^2(U) \to L^2(U)$ is compact, by the Fredholm alternative for compact operators, either
(i) $u - Ku = h$ admits a solution $u \in L^2(U)$ for all $h \in L^2(U)$; or
(ii) There exists a non-zero $u \in L^2(U)$ such that $u - Ku = 0$. Moreover, $\operatorname{im}(I - K) = \ker(I - K^\dagger)^\perp$ and $\dim\ker(I - K) = \dim\operatorname{im}(I - K)^\perp$.
There is a bit of bookkeeping to show that this corresponds to the two alternatives in the theorem.
(i) We need to show that $u \in H^1_0(U)$. But this is trivial, since we have
\[ u = \gamma L_\gamma^{-1}u + L_\gamma^{-1}f, \]
and we know that $L_\gamma^{-1}$ maps $L^2(U)$ into $H^1_0(U)$.
(ii) As above, we know that the non-zero solution $u$ lies in $H^1_0(U)$. There are two things to show. First, we have to show that $v - K^\dagger v = 0$ iff $v$ is a weak solution to
\[ L^\dagger v = 0, \quad v|_{\partial U} = 0. \]
Next, we need to show that $h = L_\gamma^{-1}f \in (N^*)^\perp$ iff $f \in (N^*)^\perp$.
For the first part, we want to show that $v \in \ker(I - K^\dagger)$ iff $B^\dagger[v, u] = B[u, v] = 0$ for all $u \in H^1_0(U)$.
We are good at evaluating $B[u, v]$ when $u$ is of the form $L_\gamma^{-1}w$, by definition of a weak solution. Fortunately, $\operatorname{im} L_\gamma^{-1}$ contains $C_c^\infty(U)$, since $L_\gamma^{-1}L_\gamma\varphi = \varphi$ for all $\varphi \in C_c^\infty(U)$. In particular, $\operatorname{im} L_\gamma^{-1}$ is dense in $H^1_0(U)$. So it suffices to show that $v \in \ker(I - K^\dagger)$ iff $B[L_\gamma^{-1}w, v] = 0$ for all $w \in L^2(U)$. This is immediate from the computation
\[ B[L_\gamma^{-1}w, v] = B_\gamma[L_\gamma^{-1}w, v] - \gamma(L_\gamma^{-1}w, v) = (w, v) - (Kw, v) = (w, v - K^\dagger v). \]
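The second is also easy: if $v \in N^* = \ker(I - K^\dagger)$, then
\[ (L_\gamma^{-1}f, v) = \frac{1}{\gamma}(Kf, v) = \frac{1}{\gamma}(f, K^\dagger v) = \frac{1}{\gamma}(f, v), \]
so $h = L_\gamma^{-1}f$ is orthogonal to $v$ iff $f$ is.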
4.3 The spectrum of elliptic operators
Let’s recap what we have obtained so far. Given $L$, we have found some $\gamma$ such that whenever $\mu \geq \gamma$, there is a unique solution to $(L + \mu)u = f$ (plus boundary conditions). In particular, $L + \mu$ has trivial kernel. For $\mu < \gamma$, $(L + \mu)u = 0$ may or may not have a non-trivial solution, but we know this satisfies the Fredholm alternative, since $L + \mu$ is still an elliptic operator.
Rewriting $(L + \mu)u = 0$ as $Lu = -\mu u$, we are essentially considering eigenvalues of $L$. Of course, $L$ is not a bounded linear operator, so our usual spectral theory does not apply to $L$. However, as always, we know that $L_\gamma^{-1}$ is compact for large enough $\gamma$, and so the spectral theory of compact operators can tell us something about what the eigenvalues of $L$ look like.
We first recall some elementary definitions. Note that we are explicitly working with real Hilbert spaces and spectra.
Definition (Resolvent set). Let $A : H \to H$ be a bounded linear operator. Then the resolvent set is
\[ \rho(A) = \{\lambda \in \mathbb{R} : A - \lambda I \text{ is bijective}\}. \]
Definition (Spectrum). The spectrum of a bounded linear $A : H \to H$ is
\[ \sigma(A) = \mathbb{R} \setminus \rho(A). \]
Definition (Point spectrum). We say $\eta \in \sigma(A)$ belongs to the point spectrum $\sigma_p(A)$ of $A$ if
\[ \ker(A - \eta I) \neq \{0\}. \]
If $\eta \in \sigma_p(A)$ and $w \neq 0$ satisfies $Aw = \eta w$, then $w$ is an associated eigenvector.
Our knowledge of the spectrum of $L$ will come from known results about the spectrum of compact operators.
Theorem (Spectral theorem of compact operators). Let $\dim H = \infty$, and $K : H \to H$ a compact operator. Then
– $\sigma(K) = \sigma_p(K) \cup \{0\}$. Note that $0$ may or may not be in $\sigma_p(K)$.
– $\sigma(K) \setminus \{0\}$ is either finite or is a sequence tending to $0$.
– If $\lambda \in \sigma_p(K) \setminus \{0\}$, then $\ker(K - \lambda I)$ is finite-dimensional.
– If $K$ is self-adjoint, i.e. $K = K^\dagger$, and $H$ is separable, then there exists a countable orthonormal basis of eigenvectors.
From this, it follows easily that
Theorem (Spectrum of $L$).
(i) There exists a countable set $\Sigma \subseteq \mathbb{R}$ such that there is a non-trivial solution to $Lu = \lambda u$ iff $\lambda \in \Sigma$.
(ii) If $\Sigma$ is infinite, then $\Sigma = \{\lambda_k\}_{k=1}^\infty$, the values of an increasing sequence with $\lambda_k \to \infty$.
(iii) To each $\lambda \in \Sigma$ there is an associated finite-dimensional space
\[ E(\lambda) = \{u \in H^1_0(U) \mid u \text{ is a weak solution of } Lu = \lambda u,\ u|_{\partial U} = 0\}. \]
We say $\lambda \in \Sigma$ is an eigenvalue and $u \in E(\lambda)$ is the associated eigenfunction.
Proof. Apply the spectral theorem to the compact operator $L_\gamma^{-1} : L^2(U) \to L^2(U)$, and observe that
\[ L_\gamma^{-1}u = \lambda u \iff u = \lambda(L + \gamma)u \iff Lu = \frac{1 - \lambda\gamma}{\lambda}u. \]
Note that $L_\gamma^{-1}$ does not have a zero eigenvalue.
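The correspondence between the two spectra can be made explicit. In the display below, $\kappa$ denotes an eigenvalue of $L_\gamma^{-1}$ (a symbol introduced here only to keep it apart from the eigenvalue $\lambda$ of $L$), and the one-dimensional operator is a hypothetical example, not one treated in the lectures.

```latex
\begin{align*}
  L_\gamma^{-1} w = \kappa w,\ \kappa \neq 0
    \quad&\iff\quad
    Lw = \lambda w, \qquad \lambda = \frac{1 - \kappa\gamma}{\kappa} = \frac{1}{\kappa} - \gamma,\\
  L = -\frac{d^2}{dx^2} \text{ on } U = (0, \pi):
    \quad& w_k(x) = \sin(kx), \qquad \lambda_k = k^2, \qquad
    \kappa_k = \frac{1}{k^2 + \gamma} \to 0.
\end{align*}
```

Eigenvalues $\kappa_k$ of the compact operator accumulating only at $0$ is therefore the same statement as $\lambda_k \to \infty$.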
In certain cases, such as Laplace’s equation, our operator is “self-adjoint”, and more things can be said. As before, we want the “formally” qualifier:
Definition (Formally self-adjoint). An operator $L$ is formally self-adjoint if $L = L^\dagger$. Equivalently, if $b^i \equiv 0$.
Definition (Positive operator). We say $L$ is positive if there exists $C > 0$ such that
\[ \|u\|^2_{H^1_0(U)} \leq CB[u, u] \text{ for all } u \in H^1_0(U). \]
Theorem. Suppose $L$ is a formally self-adjoint, positive, uniformly elliptic operator on $U$, an open bounded set with $C^1$ boundary. Then we can represent the eigenvalues of $L$ as
\[ 0 < \lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \cdots, \]
where each eigenvalue appears according to its multiplicity ($\dim E(\lambda)$), and there exists an orthonormal basis $\{w_k\}_{k=1}^\infty$ of $L^2(U)$ with $w_k \in H^1_0(U)$ an eigenfunction of $L$ with eigenvalue $\lambda_k$.
Proof. Positivity says exactly that $B$ is coercive, so Lax–Milgram applies without any shift. So the inverse $L^{-1} : L^2(U) \to L^2(U)$ exists and is a compact operator. We are done if we can show that $L^{-1}$ is self-adjoint. This is trivial, since for any $f, g \in L^2(U)$, writing $u = L^{-1}f$ and $v = L^{-1}g$, we have
\[ (L^{-1}f, g)_{L^2(U)} = B[v, u] = B[u, v] = (L^{-1}g, f)_{L^2(U)}. \]
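The structure of this theorem is easy to observe in a discretised model. The sketch below is an illustration under assumptions of my own (a uniform grid on $U = (0, 1)$ and the standard three-point finite-difference version of $L = -d^2/dx^2$ with Dirichlet conditions); it checks that the grid samples of $\sin(k\pi x)$ are eigenvectors with positive, increasing eigenvalues and are mutually orthogonal in the discrete $L^2$ inner product.

```python
import math

N = 50                      # number of grid cells on (0, 1); an arbitrary choice
h = 1.0 / N
xs = [j * h for j in range(N + 1)]

def apply_L(w):
    """Three-point finite-difference version of L = -d^2/dx^2 at interior nodes."""
    return [(-w[j - 1] + 2 * w[j] - w[j + 1]) / h ** 2 for j in range(1, N)]

def mode(k):
    """Grid samples of sin(k*pi*x), which vanish at x = 0 and x = 1."""
    return [math.sin(k * math.pi * x) for x in xs]

def discrete_eigenvalue(k):
    """Eigenvalue of the discrete operator belonging to mode(k)."""
    return (4 / h ** 2) * math.sin(k * math.pi * h / 2) ** 2

def l2_inner(a, b):
    """Discrete L^2(0, 1) inner product."""
    return h * sum(x * y for x, y in zip(a, b))

# Residual of the eigenvalue equation L w = lambda w for the first few modes.
residuals = []
for k in (1, 2, 3):
    w = mode(k)
    Lw = apply_L(w)
    lam = discrete_eigenvalue(k)
    residuals.append(max(abs(Lw[j - 1] - lam * w[j]) for j in range(1, N)))

eigs = [discrete_eigenvalue(k) for k in range(1, N)]
print(eigs[:3], [(k * math.pi) ** 2 for k in (1, 2, 3)])
```

The first discrete eigenvalues sit close to the continuum values $(k\pi)^2$, and the whole list is positive and increasing, mirroring $0 < \lambda_1 \leq \lambda_2 \leq \cdots$.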
4.4 Elliptic regularity
We can finally turn to the problem of regularity. We previously saw that when solving $Lu = f$, if $f \in L^2(U)$, then by definition of a weak solution, we have $u \in H^1_0(U)$, so we have gained some regularity when solving the differential equation. However, it is not clear that $u \in H^2(U)$, so we cannot actually say $u$ solves $Lu = f$. Even if $u \in H^2(U)$, it may not be classically differentiable, so $Lu = f$ still isn’t holding in the strongest possible sense. So we might hope that under reasonable circumstances, $u$ is in fact twice continuously differentiable. But human desires are unlimited. If $f$ is smooth, we might hope further that $u$ is also smooth. All of these will be true.
Let’s think about how regularity may fail. It could be that the individual derivatives of $u$ are quite singular, but in $Lu$ all these singularities happen to cancel with each other. Thus, the content of elliptic regularity is that this doesn’t happen.
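To see why we should expect this to be true, suppose for convenience that $u, f \in C_c^\infty(\mathbb{R}^n)$ and
\[ -\Delta u = f. \]
Using integration by parts, we compute
\begin{align*}
\int_{\mathbb{R}^n} f^2\,dx &= \int_{\mathbb{R}^n} (\Delta u)^2\,dx\\
&= \sum_{i,j}\int_{\mathbb{R}^n} (D_iD_iu)(D_jD_ju)\,dx\\
&= \sum_{i,j}\int_{\mathbb{R}^n} (D_iD_ju)(D_iD_ju)\,dx\\
&= \|D^2u\|^2_{L^2(\mathbb{R}^n)}.
\end{align*}
So we have deduced that
\[ \|D^2u\|_{L^2(\mathbb{R}^n)} = \|\Delta u\|_{L^2(\mathbb{R}^n)}. \]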
This is of course not a very useful result, because we have a priori assumed that $u$ and $f$ are $C^\infty$, while what we want to prove is that $u$ is, for example, in $H^2(U)$. However, the fact that we can control the $H^2$ norm if we assume that $u \in H^2(U)$ gives us some strong indication that we should be able to show that $u$ must always be in $H^2(U)$.
The idea is to run essentially the same argument for weak solutions, without mentioning the word “second derivative”. This involves the use of difference quotients.
Definition (Difference quotient). Suppose $U \subseteq \mathbb{R}^n$ is open and $V \Subset U$. For $0 < |h| < \operatorname{dist}(V, \partial U)$, we define
\begin{align*}
\Delta_i^h u(x) &= \frac{u(x + he_i) - u(x)}{h},\\
\Delta^h u(x) &= (\Delta_1^h u, \ldots, \Delta_n^h u).
\end{align*}
Observe that if $u \in L^2(U)$, then $\Delta^h u \in L^2(V)$. If further $u \in H^1(U)$, then $\Delta^h u \in H^1(V)$ and $D\Delta^h u = \Delta^h Du$.
What makes difference quotients useful is the following lemma:
Lemma. If $u \in L^2(U)$, then $u \in H^1(V)$ iff
\[ \|\Delta^h u\|_{L^2(V)} \leq C \]
for some $C$ and all $0 < |h| < \frac{1}{2}\operatorname{dist}(V, \partial U)$. In this case, we have
\[ \frac{1}{\tilde{C}}\|Du\|_{L^2(V)} \leq \|\Delta^h u\|_{L^2(V)} \leq \tilde{C}\|Du\|_{L^2(V)}. \]
Proof. See example sheet.
Thus, if we are able to establish the bounds we had for the Laplacian using difference quotients, then this tells us $u$ is in $H^2_{\mathrm{loc}}(U)$.
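Lemma. If $w, v$ are compactly supported in $U$ and $|h|$ is small enough, then
\begin{align*}
\int_U w\,\Delta_k^{-h}v\,dx &= -\int_U (\Delta_k^h w)\,v\,dx,\\
\Delta_k^h(wv) &= (\tau_k^h w)\Delta_k^h v + (\Delta_k^h w)v,
\end{align*}
where $\tau_k^h w(x) = w(x + he_k)$.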
Theorem (Interior regularity). Suppose $L$ is uniformly elliptic on an open set $U \subseteq \mathbb{R}^n$, and assume $a^{ij} \in C^1(U)$, $b^i, c \in L^\infty(U)$ and $f \in L^2(U)$. Suppose further that $u \in H^1(U)$ is such that
\[ B[u, v] = (f, v)_{L^2(U)} \tag{$\dagger$} \]
for all $v \in H^1_0(U)$. Then $u \in H^2_{\mathrm{loc}}(U)$, and for each $V \Subset U$, we have
\[ \|u\|_{H^2(V)} \leq C(\|f\|_{L^2(U)} + \|u\|_{L^2(U)}), \]
with $C$ depending on $L, V, U$, but not $f$ or $u$.
Note that we don’t require $u \in H^1_0(U)$, so we don’t require $u$ to satisfy the boundary conditions. In this case, there may be multiple solutions, so we need the $u$ on the right. Also, observe that we don’t actually need uniform ellipticity, as the property of being in $H^2_{\mathrm{loc}}(U)$ can be checked locally, and $L$ is always locally uniformly elliptic.
The proof is essentially what we did for the Laplacian just now, except this time it is much messier since we need to use difference quotients instead of derivatives, and there are lots of derivatives of $a^{ij}$’s that have to be kept track of.
When using regularity results, it is often convenient to not think about it in terms of “solving equations”, but as something that (roughly) says “if $u$ is such that $Lu$ happens to be in $L^2$ (say), then $u$ is in $H^2_{\mathrm{loc}}(U)$”.
Proof. We first show that we may in fact assume $b^i = c = 0$. Indeed, if we know the theorem for such $L$, then given a general $L$, we write
\[ L_0u = -\sum (a^{ij}u_{x_j})_{x_i}, \quad Ru = \sum b^i u_{x_i} + cu. \]
Then if $u$ is a weak solution to $Lu = f$, then it is also a weak solution to $L_0u = f - Ru$. Noting that $Ru \in L^2(U)$, this tells us $u \in H^2_{\mathrm{loc}}(U)$. Moreover, on $V \Subset U$,
– We can control $\|u\|_{H^2(V)}$ by $\|f - Ru\|_{L^2(V)}$ and $\|u\|_{L^2(V)}$ (by the theorem).
– We can control $\|f - Ru\|_{L^2(V)}$ by $\|f\|_{L^2(V)}$, $\|u\|_{L^2(V)}$ and $\|Du\|_{L^2(V)}$.
– By Gårding’s inequality, we can control $\|Du\|_{L^2(V)}$ by $\|u\|_{L^2(V)}$ and $B[u, u] = (f, u)_{L^2(V)}$.
– By Hölder, we can control $(f, u)_{L^2(V)}$ by $\|f\|_{L^2(V)}$ and $\|u\|_{L^2(V)}$.
So it suffices to consider the case where $L$ only has second derivatives. Fix $V \Subset U$ and choose $W$ such that $V \Subset W \Subset U$. Take $\zeta \in C_c^\infty(W)$ such that $\zeta \equiv 1$ on $V$.
Recall that in our example of Laplace’s equation, we considered the integral $\int f^2\,dx$ and did some integration by parts. Essentially, what we did was to apply the definition of a weak solution to $\Delta u$. There we were lucky, and we could obtain the result in one go. In general, we should consider the second derivatives one by one.
For $k \in \{1, \ldots, n\}$, we consider the function
\[ v = -\Delta_k^{-h}(\zeta^2\Delta_k^h u). \]
As we shall see, this is the correct way to express $u_{x_kx_k}$ in terms of difference quotients (the $-h$ in the first $\Delta_k^{-h}$ comes from the fact that we want to integrate by parts). We shall put this into the definition of a weak solution to say $B[u, v] = (f, v)$. The plan is to isolate a $\|\Delta_k^h Du\|^2$ term on the left and then bound it.
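We first compute
\begin{align*}
B[u, v] &= -\sum_{i,j}\int_U a^{ij}u_{x_i}\,\Delta_k^{-h}(\zeta^2\Delta_k^h u)_{x_j}\,dx\\
&= \sum_{i,j}\int_U \Delta_k^h(a^{ij}u_{x_i})\,(\zeta^2\Delta_k^h u)_{x_j}\,dx\\
&= \sum_{i,j}\int_U \big(\tau_k^h a^{ij}\,\Delta_k^h u_{x_i} + (\Delta_k^h a^{ij})u_{x_i}\big)\big(\zeta^2\Delta_k^h u_{x_j} + 2\zeta\zeta_{x_j}\Delta_k^h u\big)\,dx\\
&\equiv A_1 + A_2,
\end{align*}
where
\begin{align*}
A_1 &= \sum_{i,j}\int_U \zeta^2(\tau_k^h a^{ij})(\Delta_k^h u_{x_i})(\Delta_k^h u_{x_j})\,dx,\\
A_2 &= \sum_{i,j}\int_U \Big[(\Delta_k^h a^{ij})u_{x_i}\,\zeta^2\Delta_k^h u_{x_j} + 2\zeta\zeta_{x_j}\Delta_k^h u\,\big(\tau_k^h a^{ij}\,\Delta_k^h u_{x_i} + (\Delta_k^h a^{ij})u_{x_i}\big)\Big]\,dx.
\end{align*}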
By uniform ellipticity, we can bound
\[ A_1 \geq \theta\int_U \zeta^2|\Delta_k^h Du|^2\,dx. \]
This is the quantity we want to control.
Note that $A_2$ looks scary, but every term either only involves “first derivatives” of $u$, or a product of a second derivative of $u$ with a first derivative. Thus, applying Young’s inequality, we can bound $|A_2|$ by a linear combination of $|\Delta_k^h Du|^2$ and $|Du|^2$, and we can make the coefficient of $|\Delta_k^h Du|^2$ as small as we like.
In detail, since $a^{ij} \in C^1(U)$ and $\zeta$ is supported in $W$, we can uniformly bound $a^{ij}, \Delta_k^h a^{ij}, \zeta_{x_j}$, and we have
\[ |A_2| \leq C\int_W \Big[\zeta|\Delta_k^h Du||Du| + \zeta|Du||\Delta_k^h u| + \zeta|\Delta_k^h Du||\Delta_k^h u|\Big]\,dx. \]
Now recall that $\|\Delta_k^h u\|$ is bounded by $\|Du\|$. So applying Young’s inequality, we may bound (for a different $C$)
\[ |A_2| \leq \varepsilon\int_W \zeta^2|\Delta_k^h Du|^2\,dx + C\int_W |Du|^2\,dx. \]
Thus, taking $\varepsilon = \frac{\theta}{2}$, it follows that
\[ (f, v) = B[u, v] \geq \frac{\theta}{2}\int_U \zeta^2|\Delta_k^h Du|^2\,dx - C\int_W |Du|^2\,dx. \]
This is promising.
It now suffices to bound $(f, v)$ from above. By Young’s inequality,
\begin{align*}
|(f, v)| &\leq \int |f||\Delta_k^{-h}(\zeta^2\Delta_k^h u)|\,dx\\
&\leq C\int |f||D(\zeta^2\Delta_k^h u)|\,dx\\
&\leq \varepsilon\int |D(\zeta^2\Delta_k^h u)|^2\,dx + C\int |f|^2\,dx\\
&\leq \varepsilon\int |\zeta^2\Delta_k^h Du|^2\,dx + C\big(\|f\|^2_{L^2(U)} + \|Du\|^2_{L^2(U)}\big).
\end{align*}
Setting $\varepsilon = \frac{\theta}{4}$, we get
\[ \int_U \zeta^2|\Delta_k^h Du|^2\,dx \leq C\big(\|f\|^2_{L^2(W)} + \|Du\|^2_{L^2(W)}\big), \]
and so, in particular, we get a uniform bound on $\|\Delta_k^h Du\|_{L^2(V)}$. Now as before, we can use Gårding to get rid of the $\|Du\|_{L^2(W)}$ dependence on the right.
Notice that this is a local result. In order to have $u \in H^2(V)$, it is enough for us to have $f \in L^2(W)$ for some $W$ slightly larger than $V$. Thus, singularities do not propagate in, either from the boundary or from regions where $f$ is not well-behaved.
With elliptic regularity, we can understand weak solutions as genuine solutions to the equation $Lu = f$. Indeed, if $u$ is a weak solution, then for any $v \in C_c^\infty(U)$, we have $B[u, v] = (f, v)$, hence after integrating by parts, we recover $(Lu - f, v) = 0$ for all $v \in C_c^\infty(U)$. So in fact $Lu = f$ almost everywhere.
It is natural to hope that we can get better than $u \in H^2_{\mathrm{loc}}(U)$. This is actually not hard given our current work. If $Lu = f$, and all $a^{ij}, b^i, c, f$ are sufficiently well-behaved, then we can simply differentiate the whole equation with respect to $x_i$, and then observe that $u_{x_i}$ satisfies some second-order elliptic PDE of the form previously understood, and if we do this for all $i$, then we can conclude that $u \in H^3_{\mathrm{loc}}(U)$. Of course, some bookkeeping has to be done if we were to do this properly, since we need to write everything in weak form. However, this is not particularly hard, and the details are left as an exercise.
Theorem (Elliptic regularity). If $a^{ij}, b^i$ and $c$ are $C^{m+1}(U)$ for some $m \in \mathbb{N}$, and $f \in H^m(U)$, then $u \in H^{m+2}_{\mathrm{loc}}(U)$ and for $V \Subset W \Subset U$, we can estimate
\[ \|u\|_{H^{m+2}(V)} \leq C(\|f\|_{H^m(W)} + \|u\|_{L^2(W)}). \]
In particular, if $m$ is large enough, then $u \in C^2_{\mathrm{loc}}(U)$, and if all $a^{ij}, b^i, c, f$ are smooth, then $u$ is also smooth.
We can similarly obtain a Hölder theory of elliptic regularity, which gives (roughly) $f \in C^{k,\alpha}(U)$ implies $u \in C^{k+2,\alpha}(U)$.
The final loose end is to figure out what happens at the boundary.
Theorem (Boundary $H^2$ regularity). Assume $a^{ij} \in C^1(\bar{U})$, $b^i, c \in L^\infty(U)$, and $f \in L^2(U)$. Suppose $u \in H^1_0(U)$ is a weak solution of $Lu = f$, $u|_{\partial U} = 0$. Finally, we assume that $\partial U$ is $C^2$. Then
\[ \|u\|_{H^2(U)} \leq C(\|f\|_{L^2(U)} + \|u\|_{L^2(U)}). \]
If $u$ is the unique weak solution, we can drop the $\|u\|_{L^2(U)}$ from the right hand side.
Proof. Note that we already know that $u$ is in $H^2_{\mathrm{loc}}(U)$. So we only have to show that the second derivatives are well-behaved near the boundary.
By a partition of unity and change of coordinates, we may assume we are in the case
\[ U = B_1(0) \cap \{x_n > 0\}. \]
Let $V = B_{1/2}(0) \cap \{x_n > 0\}$. Choose a $\zeta \in C_c^\infty(B_1(0))$ with $\zeta \equiv 1$ on $V$ and $0 \leq \zeta \leq 1$.
Most of the previous proof goes through, as long as we restrict to
\[ v = -\Delta_k^{-h}(\zeta^2\Delta_k^h u) \]
with $k \neq n$, since all the translations keep us within $U$, and hence are well-defined.
Thus, we control all second derivatives of the form $D_kD_iu$, where $k \in \{1, \ldots, n - 1\}$ and $i \in \{1, \ldots, n\}$. The only remaining second derivative to control is $D_nD_nu$. To understand this, we go back to the PDE itself. Recall that we know it holds pointwise almost everywhere, so
\[ -\sum_{i,j=1}^n (a^{ij}u_{x_i})_{x_j} + \sum_{i=1}^n b^i u_{x_i} + cu = f. \]
So we can write $a^{nn}u_{x_nx_n} = F$ almost everywhere, where $F$ depends on $a, b, c, f$ and all (up to) second derivatives of $u$ that are not $u_{x_nx_n}$. Thus, $F$ is controlled in $L^2$. But uniform ellipticity implies $a^{nn}$ is bounded away from $0$. So we are done.
Similarly, we can iterate this to obtain higher regularity results.
5 Hyperbolic equations
So far, we have been looking at elliptic PDEs. Since the operator is elliptic, there is no preferred “time direction”. For example, Laplace’s equation models static electric fields. Thus, it is natural to consider boundary value problems in these cases.
Hyperbolic equations single out a time direction, and these model quantities that evolve in time. In this case, we are often interested in initial value problems instead. Let’s first define what it means for an equation to be hyperbolic.
Definition (Hyperbolic PDE). A second-order linear hyperbolic PDE is a PDE of the form
\[ \sum_{i,j=1}^{n+1} (a^{ij}(y)u_{y_j})_{y_i} + \sum_{i=1}^{n+1} b^i(y)u_{y_i} + c(y)u = f \]
with $y \in \mathbb{R}^{n+1}$, $a^{ij} = a^{ji}, b^i, c \in C^\infty(\mathbb{R}^{n+1})$, such that the principal symbol
\[ Q(\xi) = \sum_{i,j=1}^{n+1} a^{ij}(y)\xi_i\xi_j \]
has signature $(+, -, -, \ldots)$ for all $y$. That is to say, after perhaps changing basis, at each point we can write
\[ Q(\xi) = \lambda_{n+1}^2\xi_{n+1}^2 - \sum_{i=1}^n \lambda_i^2\xi_i^2 \]
with $\lambda_i > 0$.
It turns out not to be too helpful to treat this equation at this generality. We would like to pick out a direction that corresponds to the positive eigenvalue. By a coordinate transformation, we can locally put our equation in the form
\[ u_{tt} = \sum_{i,j=1}^n (a^{ij}(x, t)u_{x_i})_{x_j} + \sum_{i=1}^n b^i(x, t)u_{x_i} + c(x, t)u. \]
Note that we did not write down a $u_t$ term. It doesn’t make much difference, and it is notationally convenient to leave it out.
In this form, hyperbolicity is equivalent to the statement that the operator on the right is elliptic for each $t$ (or rather, the negative of the right hand side).
We observe that $t = 0$ is a non-characteristic surface. So we can hope to solve the Cauchy problem. In other words, we shall specify $u|_{t=0}$ and $u_t|_{t=0}$.
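Actually, we’ll look at an initial boundary value problem. Consider a region of the form $\mathbb{R} \times U$, where $U \subseteq \mathbb{R}^n$ is open bounded with $C^1$ boundary.
[Figure: the cylinder over $U$ between the time slices $t = 0$ and $t = T$.]
We define
\begin{align*}
U_t &= (0, t) \times U,\\
\Sigma_t &= \{t\} \times U,\\
\partial^* U_t &= [0, t] \times \partial U.
\end{align*}
Then
\[ \partial U_T = \Sigma_0 \sqcup \Sigma_T \sqcup \partial^* U_T. \]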
The general initial boundary value problem (IBVP) is as follows: Let $L$ be a (time-dependent) uniformly elliptic operator. We want to solve
\begin{align*}
u_{tt} + Lu &= f && \text{on } U_T,\\
u &= \psi && \text{on } \Sigma_0,\\
u_t &= \psi' && \text{on } \Sigma_0,\\
u &= 0 && \text{on } \partial^* U_T.
\end{align*}
In the case of elliptic PDEs, we saw that Laplace’s equation was a canonical, motivating example. In this case, if we take $L = -\Delta$, then we obtain the wave equation. Let’s see what we can do with it.
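Example. Start with the equation $u_{tt} - \Delta u = 0$. Multiply by $u_t$ and integrate over $U_t$ to obtain
\begin{align*}
0 &= \int_{U_t} \big(u_{tt}u_t - u_t\Delta u\big)\,dx\,dt\\
&= \int_{U_t} \Big(\frac{1}{2}\frac{\partial}{\partial t}u_t^2 - \nabla\cdot(u_t Du) + Du_t\cdot Du\Big)\,dx\,dt\\
&= \int_{U_t} \Big(\frac{1}{2}\frac{\partial}{\partial t}\big(u_t^2 + |Du|^2\big) - \nabla\cdot(u_t Du)\Big)\,dx\,dt\\
&= \frac{1}{2}\int_{\Sigma_t - \Sigma_0} \big(u_t^2 + |Du|^2\big)\,dx - \int_{\partial^* U_t} u_t\frac{\partial u}{\partial\nu}\,dS.
\end{align*}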
But $u$ vanishing on $\partial^* U_T$ implies $u_t$ vanishes there as well. So the second term vanishes, and we obtain
\[ \int_{\Sigma_t} \big(u_t^2 + |Du|^2\big)\,dx = \int_{\Sigma_0} \big(u_t^2 + |Du|^2\big)\,dx. \]
This is the conservation of energy! Thus, if a solution exists, we control $\|u\|_{H^1(\Sigma_t)}$ in terms of $\|\psi\|_{H^1(\Sigma_0)}$ and $\|\psi'\|_{L^2(\Sigma_0)}$. We also see that the solution is uniquely determined by $\psi$ and $\psi'$, since if $\psi = \psi' = 0$, then $u_t = Du = 0$ and $u$ is zero at the boundary.
Estimates like this that control a solution without needing to construct it are known as a priori estimates. These are often crucial to establish the existence of solutions (cf. Gårding).
We shall first find a weak formulation of this problem that only requires $u \in H^1(U_T)$. Note that when we do so, we have to understand carefully what we mean by $u_t = \psi'$. We shall see how to deal with that in the derivation of the weak formulation.
Assume that $u \in C^2(\bar{U}_T)$ is a classical solution. Multiply the equation by $v \in C^2(\bar{U}_T)$ which satisfies $v = 0$ on $\partial^* U_T \cup \Sigma_T$. Then we have
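\[
\begin{aligned}
\int_{U_T} fv\, dx\, dt &= \int_{U_T} \bigl( u_{tt} v + (Lu) v \bigr)\, dx\, dt \\
&= \int_{U_T} \Bigl( -u_t v_t + \sum_{i,j} a^{ij} u_{x_i} v_{x_j} + \sum_i b^i u_{x_i} v + cuv \Bigr)\, dx\, dt \\
&\qquad + \Bigl[ \int_U u_t v\, dx \Bigr]_0^T - \int_0^T \int_{\partial U} \sum_{i,j} a^{ij} u_{x_i} \nu_j\, v\, dS\, dt,
\end{aligned}
\]
where $\nu$ is the outward unit normal to $\partial U$.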
Using the boundary conditions, we find that
\[ \int_{U_T} fv\, dx\, dt = \int_{U_T} \Bigl( -u_t v_t + \sum_{i,j} a^{ij} u_{x_i} v_{x_j} + \sum_i b^i u_{x_i} v + cuv \Bigr)\, dx\, dt - \int_{\Sigma_0} \psi' v\, dx. \tag{$\dagger$} \]
Conversely, suppose $u \in C^2(\bar{U}_T)$ satisfies ($\dagger$) for all such $v$, and $u|_{\Sigma_0} = \psi$ and $u|_{\partial^* U_T} = 0$. Then by first testing on $v \in C_c^\infty(U_T)$, reversing the integration by parts tells us
\[ 0 = \int_{U_T} (u_{tt} + Lu - f) v\, dx\, dt, \]
since there is no boundary term. Hence we get $u_{tt} + Lu = f$ on $U_T$. To check the boundary conditions, if $v \in C^\infty(\bar{U}_T)$ vanishes on $\partial^* U_T \cup \Sigma_T$, then again reversing the integration by parts shows that
\[ \int_{U_T} (u_{tt} + Lu - f) v\, dx\, dt = \int_{\Sigma_0} (\psi' - u_t) v\, dx. \]
Since we know that the LHS vanishes, it follows that $\psi' = u_t$ on $\Sigma_0$. So we see that our weak formulation can encapsulate the boundary condition on $\Sigma_0$.
Definition (Weak solution). Suppose $f \in L^2(U_T)$, $\psi \in H^1_0(\Sigma_0)$ and $\psi' \in L^2(\Sigma_0)$. We say $u \in H^1(U_T)$ is a weak solution to the hyperbolic PDE if
(i) $u|_{\Sigma_0} = \psi$ in the trace sense;
(ii) $u|_{\partial^* U_T} = 0$ in the trace sense; and
(iii) ($\dagger$) holds for all $v \in H^1(U_T)$ with $v = 0$ on $\partial^* U_T \cup \Sigma_T$ in the trace sense.
Theorem (Uniqueness of weak solution). A weak solution, if it exists, is unique.
Proof. It suffices to consider the case $f = \psi = \psi' = 0$, and show any solution must be zero. Let
\[ v(x, t) = \int_t^T e^{-\lambda s} u(x, s)\, ds, \]
where $\lambda$ is a real number we will pick later. The point of introducing this $e^{-\lambda t}$ is that in general, we do not expect conservation of energy. There could be some exponential growth in the energy, so we want to suppress this.
Then this function belongs to $H^1(U_T)$, $v = 0$ on $\Sigma_T \cup \partial^* U_T$, and
\[ v_t = -e^{-\lambda t} u. \]
Using the fact that u is a weak solution, we have
\[ \int_{U_T} \Bigl( u_t u e^{-\lambda t} - \sum_{i,j} a^{ij} v_{t x_j} v_{x_i} e^{\lambda t} + \sum_i b^i u_{x_i} v + (c - 1) uv - v v_t e^{\lambda t} \Bigr)\, dx\, dt = 0. \]
Integrating by parts, we can write this as A = B, where
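\[
\begin{aligned}
A &= \int_{U_T} \biggl[ \frac{d}{dt} \Bigl( \frac{1}{2} u^2 e^{-\lambda t} - \frac{1}{2} \sum a^{ij} v_{x_i} v_{x_j} e^{\lambda t} - \frac{1}{2} v^2 e^{\lambda t} \Bigr) \\
&\qquad\qquad + \frac{\lambda}{2} \Bigl( u^2 e^{-\lambda t} + \sum a^{ij} v_{x_i} v_{x_j} e^{\lambda t} + v^2 e^{\lambda t} \Bigr) \biggr]\, dx\, dt, \\
B &= -\int_{U_T} \Bigl( \frac{1}{2} e^{\lambda t} \sum \dot{a}^{ij} v_{x_i} v_{x_j} - \sum b^i_{x_i} uv - \sum b^i v_{x_i} u + (c - 1) uv \Bigr)\, dx\, dt.
\end{aligned}
\]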
Here $A$ is the nice bit, which we can control, and $B$ is the junk bit, which we will show that we can absorb elsewhere.
Integrating the time derivative in $A$, using $v = 0$ on $\Sigma_T$ and $u = 0$ on $\Sigma_0$, we have
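\[
\begin{aligned}
A &= e^{-\lambda T} \int_{\Sigma_T} \frac{1}{2} u^2\, dx + \frac{1}{2} \int_{\Sigma_0} \Bigl( \sum a^{ij} v_{x_i} v_{x_j} + v^2 \Bigr)\, dx \\
&\qquad + \frac{\lambda}{2} \int_{U_T} \Bigl( u^2 e^{-\lambda t} + \sum a^{ij} v_{x_i} v_{x_j} e^{\lambda t} + v^2 e^{\lambda t} \Bigr)\, dx\, dt.
\end{aligned}
\]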
Using the uniform ellipticity condition (and the observation that the first line is
always non-negative), we can bound
\[ A \geq \frac{\lambda}{2} \int_{U_T} \bigl( u^2 e^{-\lambda t} + \theta |Dv|^2 e^{\lambda t} + v^2 e^{\lambda t} \bigr)\, dx\, dt. \]
Doing some integration by parts, we can also bound
\[ B \leq \frac{c}{2} \int_{U_T} \bigl( u^2 e^{-\lambda t} + \theta |Dv|^2 e^{\lambda t} + v^2 e^{\lambda t} \bigr)\, dx\, dt, \]
where the constant c does not depend on λ. Taking this together, we have
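\[ \frac{\lambda - c}{2} \int_{U_T} \bigl( u^2 e^{-\lambda t} + \theta |Dv|^2 e^{\lambda t} + v^2 e^{\lambda t} \bigr)\, dx\, dt \leq 0. \]
Taking $\lambda > c$, this tells us the integral must vanish. In particular, the integral of $u^2 e^{-\lambda t}$ vanishes. So $u = 0$.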
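The bound on $B$ used in the proof can be checked term by term. The following is a sketch under the assumption that all coefficients appearing in $B$ are bounded on $U_T$ by some constant $K$ (the name $K$ is introduced here only for illustration). The key point is that the weights $e^{-\lambda t}$ and $e^{\lambda t}$ cancel in every cross term, so the constants do not depend on $\lambda$:

```latex
\begin{aligned}
|u\, v_{x_i}| &= \bigl| u e^{-\lambda t/2} \bigr|\, \bigl| v_{x_i} e^{\lambda t/2} \bigr|
  \le \tfrac{1}{2} \bigl( u^2 e^{-\lambda t} + v_{x_i}^2 e^{\lambda t} \bigr), \\
|u\, v| &= \bigl| u e^{-\lambda t/2} \bigr|\, \bigl| v e^{\lambda t/2} \bigr|
  \le \tfrac{1}{2} \bigl( u^2 e^{-\lambda t} + v^2 e^{\lambda t} \bigr).
\end{aligned}
```

The term that is quadratic in $Dv$ is bounded by a multiple of $K |Dv|^2 e^{\lambda t}$ directly. Summing these bounds and absorbing the factor $1/\theta$ into the constant gives an estimate of the stated form, with a constant depending on $K$, $n$ and $\theta$ but not on $\lambda$.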
We now want to prove the existence of weak solutions. While we didn’t
need to assume much regularity in the uniqueness result, since we are going
to subtract the boundary conditions off anyway, we expect that we need more
regularity to prove existence.
Theorem (Existence of weak solution). Given $\psi \in H^1_0(U)$, $\psi' \in L^2(U)$ and $f \in L^2(U_T)$, there exists a (unique) weak solution with
\[ \|u\|_{H^1(U_T)} \leq C \bigl( \|\psi\|_{H^1(U)} + \|\psi'\|_{L^2(U)} + \|f\|_{L^2(U_T)} \bigr). \tag{$\ddagger$} \]
Proof. We use Galerkin's method. The way we write our equations suggests we should think of our hyperbolic PDE as a second-order ODE taking values in the infinite-dimensional space $H^1_0(U)$. To apply the ODE theorems we know, we project our equation onto a finite-dimensional subspace, and then take the limit.
First note that by density arguments, we may assume $\psi, \psi' \in C_c^\infty(U)$ and $f \in C_c^\infty(U_T)$, as long as we prove the estimate ($\ddagger$). So let us do so.
Let $\{\varphi_k\}_{k=1}^\infty$ be an orthonormal basis for $L^2(U)$, with $\varphi_k \in H^1_0(U)$. For example, we can take $\varphi_k$ to be eigenfunctions of $-\Delta$ with Dirichlet boundary conditions.
We shall consider “solutions” of the form
\[ u^N(x, t) = \sum_{k=1}^N u_k(t) \varphi_k(x). \]
We want this to be a solution after projecting to the subspace spanned by $\varphi_1, \ldots, \varphi_N$. Thus, we want $(u^N_{tt} + Lu^N - f, \varphi_k)_{L^2(\Sigma_t)} = 0$ for all $k = 1, \ldots, N$. After some integration by parts, we see that we want
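\[ (\ddot{u}^N, \varphi_k)_{L^2(U)} + \int_{\Sigma_t} \Bigl( \sum a^{ij} u^N_{x_i} (\varphi_k)_{x_j} + \sum b^i u^N_{x_i} \varphi_k + c u^N \varphi_k \Bigr)\, dx = (f, \varphi_k)_{L^2(U)}. \tag{$*$} \]
We also require
\[ u_k(0) = (\psi, \varphi_k)_{L^2(U)}, \qquad \dot{u}_k(0) = (\psi', \varphi_k)_{L^2(U)}. \]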
Notice that if we have a genuine solution $u$ that can be written as a finite sum of the $\varphi_k(x)$, then these must be satisfied.
This is a system of ODEs for the functions $u_k(t)$, and the RHS is uniformly $C^1$ in $t$ and linear in the $u_k$'s. By Picard–Lindelöf, a solution exists for $t \in [0, T]$.
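To make the Galerkin construction concrete, here is a minimal sketch for the simplest special case, which is an assumption of this example and not the general setting of the theorem: $U = (0, 1)$, $L = -\partial_x^2$, $f = 0$, and $\varphi_k(x) = \sqrt{2} \sin(k \pi x)$. With an eigenfunction basis the ODE system decouples into $\ddot{u}_k + (k\pi)^2 u_k = 0$, which is solved in closed form here. For general coefficients the system is coupled and one would integrate it numerically. The names `phi`, `l2_inner` and `galerkin_wave` are illustrative.

```python
import math


def phi(k, x):
    """L^2(0,1)-orthonormal Dirichlet eigenfunction of -d^2/dx^2."""
    return math.sqrt(2.0) * math.sin(k * math.pi * x)


def l2_inner(g, k, n=2000):
    """Midpoint-rule approximation of the inner product (g, phi_k) on (0, 1)."""
    h = 1.0 / n
    return h * sum(g((j + 0.5) * h) * phi(k, (j + 0.5) * h) for j in range(n))


def galerkin_wave(psi, psi_prime, N):
    """Return the Galerkin approximation u^N(x, t) for u_tt - u_xx = 0."""
    a = [l2_inner(psi, k) for k in range(1, N + 1)]        # u_k(0)
    b = [l2_inner(psi_prime, k) for k in range(1, N + 1)]  # du_k/dt at t = 0

    def uN(x, t):
        total = 0.0
        for k in range(1, N + 1):
            w = k * math.pi  # square root of the k-th eigenvalue
            # Exact solution of u_k'' + w^2 u_k = 0 with the data above.
            uk = a[k - 1] * math.cos(w * t) + b[k - 1] * math.sin(w * t) / w
            total += uk * phi(k, x)
        return total

    return uN
```

For instance, with `psi = lambda x: x * (1.0 - x)` and zero initial velocity, `galerkin_wave(psi, lambda x: 0.0, 15)` returns a function whose value at `t = 0` approximates $\psi$, and which is periodic in $t$ with period $2$, since every mode has a frequency that is a multiple of $\pi$.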
So for each $N$, we have an approximate solution that solves the equation when projected onto $\langle \varphi_1, \ldots, \varphi_N \rangle$. What we need to do is to extract from these approximate solutions a genuine weak solution. To do so, we need some estimates to show that the functions $u^N$ converge.
We multiply ($*$) by $e^{-\lambda t} \dot{u}_k(t)$, sum over $k = 1, \ldots, N$, and integrate from $0$ to $\tau \in (0, T)$, and end up with
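\[
\begin{aligned}
&\int_0^\tau dt \int_U dx\, \Bigl( \ddot{u}^N \dot{u}^N + \sum a^{ij} u^N_{x_i} \dot{u}^N_{x_j} + \sum b^i u^N_{x_i} \dot{u}^N + c u^N \dot{u}^N \Bigr) e^{-\lambda t} \\
&\qquad = \int_0^\tau dt \int_U dx\, f \dot{u}^N e^{-\lambda t}.
\end{aligned}
\]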
As before, we can rearrange this to get A = B, where
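\[
\begin{aligned}
A = \int_{U_\tau} dt\, dx\, \biggl[ &\frac{d}{dt} \Bigl( \Bigl( \frac{1}{2} (\dot{u}^N)^2 + \frac{1}{2} \sum a^{ij} u^N_{x_i} u^N_{x_j} + \frac{1}{2} (u^N)^2 \Bigr) e^{-\lambda t} \Bigr) \\
&+ \frac{\lambda}{2} \Bigl( (\dot{u}^N)^2 + \sum a^{ij} u^N_{x_i} u^N_{x_j} + (u^N)^2 \Bigr) e^{-\lambda t} \biggr]
\end{aligned}
\]
and
\[ B = \int_{U_\tau} dt\, dx\, \Bigl( \frac{1}{2} \sum \dot{a}^{ij} u^N_{x_i} u^N_{x_j} - \sum b^i u^N_{x_i} \dot{u}^N + (1 - c) u^N \dot{u}^N + f \dot{u}^N \Bigr) e^{-\lambda t}. \]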
Integrating in time, and estimating as before, for λ sufficiently large, we get
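\[
\begin{aligned}
&\frac{1}{2} \int_{\Sigma_\tau} \bigl( (\dot{u}^N)^2 + |Du^N|^2 \bigr)\, dx + \int_{U_\tau} \bigl( (\dot{u}^N)^2 + |Du^N|^2 + (u^N)^2 \bigr)\, dx\, dt \\
&\qquad \leq C \bigl( \|\psi\|^2_{H^1(U)} + \|\psi'\|^2_{L^2(U)} + \|f\|^2_{L^2(U_T)} \bigr).
\end{aligned}
\]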
This, in particular, tells us $u^N$ is bounded in $H^1(U_T)$.
Since $u^N(0) = \sum_{k=1}^N (\psi, \varphi_k) \varphi_k$, we know this tends to $\psi$ in $H^1(U)$. So for $N$ large enough, we have
\[ \|u^N\|_{H^1(\Sigma_0)} \leq 2 \|\psi\|_{H^1(U)}. \]
Similarly, $\|\dot{u}^N\|_{L^2(\Sigma_0)} \leq 2 \|\psi'\|_{L^2(U)}$.
Thus, we can extract a weakly convergent subsequence $u^{N_m} \rightharpoonup u$ in $H^1(U_T)$ for some $u \in H^1(U_T)$ such that
\[ \|u\|_{H^1(U_T)} \leq C \bigl( \|\psi\|_{H^1(U)} + \|\psi'\|_{L^2(U)} + \|f\|_{L^2(U_T)} \bigr). \]
For convenience, we may relabel the sequence so that in fact $u^N \rightharpoonup u$.
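To check that $u$ is a solution, suppose $v = \sum_{k=1}^M v_k(t) \varphi_k$ for some $v_k \in H^1((0, T))$ with $v_k(T) = 0$. By definition of $u^N$, for $N \geq M$ we have
\[ (\ddot{u}^N, v)_{L^2(U)} + \int_{\Sigma_t} \Bigl( \sum_{i,j} a^{ij} u^N_{x_i} v_{x_j} + \sum_i b^i u^N_{x_i} v + c u^N v \Bigr)\, dx = (f, v)_{L^2(U)}. \]
Integrating $\int_0^T dt$ using $v(T) = 0$, we have
\[
\begin{aligned}
&\int_{U_T} \Bigl( -u^N_t v_t + \sum a^{ij} u^N_{x_i} v_{x_j} + \sum b^i u^N_{x_i} v + c u^N v \Bigr)\, dx\, dt - \int_{\Sigma_0} u^N_t v\, dx \\
&\qquad = \int_{U_T} fv\, dx\, dt.
\end{aligned}
\]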
Now note that if $N > M$, then $\int_{\Sigma_0} u^N_t v\, dx = \int_{\Sigma_0} \psi' v\, dx$. Now, passing to the weak limit, we have
\[
\begin{aligned}
&\int_{U_T} \Bigl( -u_t v_t + \sum a^{ij} u_{x_i} v_{x_j} + \sum b^i u_{x_i} v + cuv \Bigr)\, dx\, dt - \int_{\Sigma_0} \psi' v\, dx \\
&\qquad = \int_{U_T} fv\, dx\, dt.
\end{aligned}
\]
So $u$ satisfies the identity required for it to be a weak solution.
Now for $k = 1, \ldots, M$, the map $w \in H^1(U_T) \mapsto \int_{\Sigma_0} w \varphi_k\, dx$ is a bounded linear map, since the trace is bounded in $L^2$. So we conclude that
\[ \int_{\Sigma_0} u \varphi_k\, dx = \lim_{N \to \infty} \int_{\Sigma_0} u^N \varphi_k\, dx = (\psi, \varphi_k)_{L^2(U)}. \]
Since this is true for all $\varphi_k$, it follows that $u|_{\Sigma_0} = \psi$. Finally, functions $v$ of the form considered are dense in the space of $v \in H^1(U_T)$ with $v = 0$ on $\partial^* U_T \cup \Sigma_T$. So we are done.
In fact, we have
\[ \operatorname*{ess\,sup}_{t \in (0, T)} \bigl( \|\dot{u}\|_{L^2(\Sigma_t)} + \|u\|_{H^1(\Sigma_t)} \bigr) \leq C \cdot (\text{data}). \]
So we can say $u \in L^\infty((0, T), H^1(U))$ and $\dot{u} \in L^\infty((0, T), L^2(U))$.
We would like to improve the regularity of the solution. To motivate how we
are going to do that, let’s go back to the wave equation for a bit.
Suppose that in fact $u \in C^\infty(U_T)$ is a smooth solution to the wave equation with initial conditions $(\psi, \psi')$. We want a quantitative estimate for $u \in H^2(\Sigma_t)$. The idea is to differentiate the equation with respect to $t$. Writing $w = u_t$, we get
\[
\begin{aligned}
w_{tt} - \Delta w &= 0 \\
w|_{\Sigma_0} &= \psi' \\
w_t|_{\Sigma_0} &= \Delta \psi \\
w|_{\partial^* U_T} &= 0.
\end{aligned}
\]
By the energy estimate we have for the wave equation, we get
\[
\begin{aligned}
\|w_t\|_{L^2(\Sigma_t)} + \|w\|_{H^1(\Sigma_t)} &\leq C \bigl( \|\psi'\|_{H^1(U)} + \|\Delta \psi\|_{L^2(U)} \bigr) \\
&\leq C \bigl( \|\psi'\|_{H^1(U)} + \|\psi\|_{H^2(U)} \bigr).
\end{aligned}
\]
So we now have control of $u_{tt}$ and $u_{t x_i}$ in $L^2(\Sigma_t)$. But once we know that $u_{tt}$ is controlled in $L^2$, then we can use the elliptic estimate to gain control on the second-order spatial derivatives of $u$. So
\[ \|u\|_{H^2(\Sigma_t)} \leq C \|\Delta u\|_{L^2(\Sigma_t)} = C \|u_{tt}\|_{L^2(\Sigma_t)}. \]
So we control all second-derivatives of u in terms of the data.
Theorem. If $a^{ij}, b^i, c \in C^2(U_T)$ and $\partial U \in C^2$, then for $\psi \in H^2(U)$ and $\psi' \in H^1_0(U)$, and $f, f_t \in L^2(U_T)$, we have
\[
\begin{aligned}
u &\in H^2(U_T) \cap L^\infty((0, T); H^2(U)) \\
u_t &\in L^\infty((0, T); H^1_0(U)) \\
u_{tt} &\in L^\infty((0, T); L^2(U)).
\end{aligned}
\]
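Proof. We return to the Galerkin approximation. Now by assumption, we have a linear system with $C^2$ coefficients. So $u_k \in C^3((0, T))$. Differentiating with respect to $t$ (assuming, as we can, $f, f_t \in C^0(\bar{U}_T)$), we have
\[
\begin{aligned}
&(\partial_t^3 u^N, \varphi_k)_{L^2(U)} + \int_{\Sigma_t} \Bigl( \sum a^{ij} \dot{u}^N_{x_i} (\varphi_k)_{x_j} + \sum b^i \dot{u}^N_{x_i} \varphi_k + c \dot{u}^N \varphi_k \Bigr)\, dx \\
&\qquad = (\dot{f}, \varphi_k)_{L^2(U)} - \int_{\Sigma_t} \Bigl( \sum \dot{a}^{ij} u^N_{x_i} (\varphi_k)_{x_j} + \sum \dot{b}^i u^N_{x_i} \varphi_k + \dot{c} u^N \varphi_k \Bigr)\, dx.
\end{aligned}
\]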
Multiplying by $\ddot{u}_k e^{-\lambda t}$, summing over $k = 1, \ldots, N$, integrating $\int_0^\tau dt$, and recalling that we already control $u^N$ in $H^1(U_T)$, we get
\[
\begin{aligned}
&\sup_{t \in (0, T)} \bigl( \|u^N_t\|_{H^1(\Sigma_t)} + \|u^N_{tt}\|_{L^2(\Sigma_t)} + \|u^N_t\|_{H^1(U_T)} \bigr) \\
&\qquad \leq C \bigl( \|u^N_t\|_{H^1(\Sigma_0)} + \|u^N_{tt}\|_{L^2(\Sigma_0)} + \|\psi\|_{H^1(\Sigma_0)} + \|\psi'\|_{L^2(\Sigma_0)} + \|f\|_{L^2(U_T)} + \|f_t\|_{L^2(U_T)} \bigr).
\end{aligned}
\]
We know
\[ u^N_t|_{t=0} = \sum_{k=1}^N (\psi', \varphi_k)_{L^2(U)} \varphi_k. \]
Since the $\varphi_k$ are a basis for $H^1$, we have
\[ \|u^N_t\|_{H^1(\Sigma_0)} \leq \|\psi'\|_{H^1(\Sigma_0)}. \]
To control $u^N_{tt}$, let us assume for convenience that in fact the $\varphi_k$ are the eigenfunctions of $-\Delta$. From the fact that
\[ (\ddot{u}^N, \varphi_k)_{L^2(U)} + \int_{\Sigma_t} \Bigl( \sum_{i,j} a^{ij} u^N_{x_i} (\varphi_k)_{x_j} + \sum_i b^i u^N_{x_i} \varphi_k + c u^N \varphi_k \Bigr)\, dx = (f, \varphi_k)_{L^2(U)}, \]
integrate the first term in the integral by parts, multiply by $\ddot{u}_k$, and sum to get
\[ \|u^N_{tt}\|_{L^2(\Sigma_0)} \leq C \bigl( \|u^N\|_{H^2(\Sigma_0)} + \|f\|_{L^2(U_T)} + \|f_t\|_{L^2(U_T)} \bigr). \]
We need to control $\|u^N\|_{H^2(\Sigma_0)}$ by $\|\psi\|_{H^2(\Sigma_0)}$. Using that $\Delta \varphi_k|_{\partial U} = 0$ and that $u^N$ is a finite sum of these $\varphi_k$'s,
\[ (\Delta u^N, \Delta u^N)_{L^2(\Sigma_0)} = (u^N, \Delta^2 u^N)_{L^2(\Sigma_0)} = (\psi, \Delta^2 u^N)_{L^2(\Sigma_0)} = (\Delta \psi, \Delta u^N)_{L^2(\Sigma_0)}. \]
So
\[ \|u^N\|_{H^2(\Sigma_0)} \leq C \|\Delta u^N\|_{L^2(\Sigma_0)} \leq C \|\psi\|_{H^2(U)}. \]
Passing to the weak limit, we conclude that
\[
\begin{aligned}
u_t &\in H^1(U_T) \\
u_t &\in L^\infty((0, T), H^1_0(U)) \\
u_{tt} &\in L^\infty((0, T), L^2(U)).
\end{aligned}
\]
Since $u_{tt} + Lu = f$, by an elliptic estimate on (almost) every slice of constant $t$, we obtain $u \in L^\infty((0, T), H^2(U))$.
We can now understand the equation as holding pointwise almost everywhere
by undoing the integration by parts that gave us the definition of the weak
solution. The initial conditions can also be understood in a trace sense.
Returning to the case $\psi \in H^1_0(U)$ and $\psi' \in L^2(U)$, by approximating the data by functions in $H^2(U)$ and $H^1_0(U)$ respectively, we can show that a weak solution can be constructed as a strong limit in $H^1(U_T)$. This implies the energy identity, so that in fact weak solutions satisfy
\[
\begin{aligned}
u &\in C^0((0, T); H^1_0(U)) \\
u_t &\in C^0((0, T); L^2(U)).
\end{aligned}
\]
This requires slightly stronger regularity assumptions on $a^{ij}$, $b^i$ and $c$. Such solutions are said to be in the energy class.
Finally, note that we can iterate the argument to get higher regularity.
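Theorem. If $a^{ij}, b^i, c \in C^{k+1}(\bar{U}_T)$ and $\partial U$ is $C^{k+1}$, and
\[
\begin{aligned}
\partial_t^i u|_{\Sigma_0} &\in H^1_0(U) && i = 0, \ldots, k \\
\partial_t^{k+1} u|_{\Sigma_0} &\in L^2(U) \\
\partial_t^i f &\in L^2((0, T); H^{k-i}(U)) && i = 0, \ldots, k
\end{aligned}
\]
then $u \in H^{k+1}(U_T)$ and
\[ \partial_t^i u \in L^\infty((0, T); H^{k+1-i}(U)) \]
for $i = 0, \ldots, k + 1$.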
In particular, if everything is smooth, then we get a smooth solution.
The first two conditions should be understood as conditions on $\psi$ and $\psi'$, using the fact that the equation allows us to express higher time derivatives of $u$ in terms of lower time derivatives and spatial derivatives. One can check that these conditions imply $\psi \in H^{k+1}(U)$ and $\psi' \in H^k(U)$, but the conditions we wrote down also encode some compatibility conditions, since we know $u$ ought to vanish at the boundary, hence all time derivatives should.
Those were the standard existence and regularity theorems for hyperbolic
PDEs. However, there are more things to say about hyperbolic equations. The
“physicist’s version” of the wave equation involves a constant c, and says
\[ \ddot{u} - c^2 \Delta u = 0. \]
This constant $c$ is the speed of propagation. This tells us that in the wave equation, information propagates at a speed of at most $c$. We can see this very concretely in the 1-dimensional wave equation, where d'Alembert wrote down an explicit solution to the wave equation given by
\[ u(x, t) = \frac{1}{2} \bigl( \psi(x - ct) + \psi(x + ct) \bigr) + \frac{1}{2c} \int_{x - ct}^{x + ct} \psi'(y)\, dy. \]
Thus, we see that the value of $u$ at any point $(x, t)$ is completely determined by the values of $\psi$ and $\psi'$ in the interval $[x - ct, x + ct]$.
[Figure: the point $(x, t)$ and its domain of dependence on the line $t = 0$.]
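The dependence on the data only through $[x - ct, x + ct]$ can be seen directly by evaluating d'Alembert's formula. The sketch below is illustrative only: the function names `dalembert` and `bump`, the midpoint quadrature, and the sample data are assumptions of this example and not part of the notes. Here `bump` is a smooth function supported in $(3, 4)$, used to perturb the data away from the interval $[-1, 1]$.

```python
import math


def dalembert(psi, psi_prime, x, t, c=1.0, n=2000):
    """Evaluate d'Alembert's formula at (x, t).

    The integral of psi_prime over [x - ct, x + ct] is approximated by the
    midpoint rule with n subintervals.
    """
    lo, hi = x - c * t, x + c * t
    h = (hi - lo) / n
    integral = h * sum(psi_prime(lo + (j + 0.5) * h) for j in range(n))
    return 0.5 * (psi(lo) + psi(hi)) + integral / (2.0 * c)


def bump(y):
    """Smooth perturbation supported in the interval (3, 4)."""
    if 3.0 < y < 4.0:
        return math.exp(-1.0 / ((y - 3.0) * (4.0 - y)))
    return 0.0
```

Evaluating at $(x, t) = (0, 1)$ with $c = 1$, the data $(\sin, \cos)$ and the perturbed data $(\sin + \texttt{bump}, \cos + \texttt{bump})$ give the same value, because the perturbation vanishes on $[-1, 1]$.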
This is true for a general hyperbolic PDE. In this case, the speed of propagation should be measured by the principal symbol $Q(\xi) = \sum a^{ij}(y) \xi_i \xi_j$. The correct way to formulate this result is as follows:
Let $S_0 \subseteq U$ be an open set with (say) smooth boundary. Let $\tau : S_0 \to [0, T]$ be a smooth function vanishing on $\partial S_0$, and define
\[
\begin{aligned}
D &= \{ (t, x) \in U_T : x \in S_0,\ 0 < t < \tau(x) \} \\
S' &= \{ (\tau(x), x) : x \in S_0 \}.
\end{aligned}
\]
We say $S'$ is spacelike if
\[ \sum_{i,j=1}^n a^{ij} \tau_{x_i} \tau_{x_j} < 1 \]
for all $x \in S_0$.
Theorem. If $u$ is a weak solution of the initial boundary value problem above, and $S'$ is spacelike, then $u|_D$ depends only on $\psi|_{S_0}$, $\psi'|_{S_0}$ and $f|_D$.
The proof is rather similar to the proof of uniqueness of solutions.
Proof. Returning to the definition of a weak solution, we have
\[ \int_{U_T} \Bigl( -u_t v_t + \sum_{i,j=1}^n a^{ij} u_{x_j} v_{x_i} + \sum_{i=1}^n b^i u_{x_i} v + cuv \Bigr)\, dx\, dt - \int_{\Sigma_0} \psi' v\, dx = \int_{U_T} fv\, dx\, dt. \]
By linearity it suffices to show that $u|_D = 0$ if $\psi|_{S_0} = \psi'|_{S_0} = 0$ and $f|_D = 0$.
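We take as test function
\[
v(t, x) =
\begin{cases}
\int_t^{\tau(x)} e^{-\lambda s} u(s, x)\, ds & (t, x) \in D \\
0 & (t, x) \notin D.
\end{cases}
\]
One checks that this is in $H^1(U_T)$, and $v = 0$ on $\Sigma_T \cup \partial^* U_T$, with
\[
\begin{aligned}
v_{x_i} &= \tau_{x_i} e^{-\lambda \tau} u(\tau(x), x) + \int_t^{\tau(x)} e^{-\lambda s} u_{x_i}(s, x)\, ds \\
v_t &= -e^{-\lambda t} u(t, x)
\end{aligned}
\]
in $D$.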
Plugging these into the definition of a weak solution, we argue as in the previous
uniqueness proof. Then
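\[
\begin{aligned}
&\int_D \biggl[ \frac{d}{dt} \Bigl( \frac{1}{2} u^2 e^{-\lambda t} - \frac{1}{2} \sum a^{ij} v_{x_i} v_{x_j} e^{\lambda t} - \frac{1}{2} v^2 e^{\lambda t} \Bigr) + \frac{\lambda}{2} \Bigl( u^2 e^{-\lambda t} + \sum a^{ij} v_{x_i} v_{x_j} e^{\lambda t} + v^2 e^{\lambda t} \Bigr) \biggr]\, dx\, dt \\
&\qquad = -\int_D \Bigl( \frac{1}{2} \sum \dot{a}^{ij} v_{x_i} v_{x_j} e^{\lambda t} + \sum b^i u_{x_i} v + (c - 1) uv \Bigr)\, dx\, dt.
\end{aligned}
\]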
Noting that $\int_D dx\, dt = \int_{S_0} dx \int_0^{\tau(x)} dt$, we can perform the $t$ integral of the $\frac{d}{dt}$ term, and we get a contribution from $S'$ which is given by
\[ I_{S'} = \int_{S_0} \Bigl( \frac{1}{2} u^2(\tau(x), x)\, e^{-\lambda \tau(x)} - \frac{1}{2} \sum_{i,j} a^{ij} \tau_{x_i} \tau_{x_j} u^2 e^{-\lambda \tau} \Bigr)\, dx. \]
We have used $v = 0$ on $S'$ and $v_{x_i} = \tau_{x_i} u e^{-\lambda \tau}$ there. Using the definition of a spacelike surface, we have $I_{S'} \geq 0$. The rest of the argument of the uniqueness of solutions goes through to conclude that $u = 0$ on $D$.
This implies no signal can travel faster than a certain speed. In particular, if
\[ \sum_{i,j} a^{ij} \xi_i \xi_j \leq \mu |\xi|^2 \]
for some $\mu$, then no signal can travel faster than $\sqrt{\mu}$. This allows us to solve hyperbolic equations on unbounded domains by restricting to bounded domains.
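To see where the speed $\sqrt{\mu}$ comes from, one can test the spacelike condition on a cone. The following is a sketch, not taken from the lectures: the centre $x_0$, radius $R$ and slope $s$ are names chosen here for illustration, and we assume $s > \sqrt{\mu}$ and $R/s \leq T$. Take $S_0 = B_R(x_0)$ and $\tau(x) = (R - |x - x_0|)/s$, so that $|D\tau| = 1/s$ away from $x_0$. Then

```latex
\sum_{i,j} a^{ij} \tau_{x_i} \tau_{x_j}
  \;\le\; \mu\, |D\tau|^2
  \;=\; \frac{\mu}{s^2}
  \;<\; 1 .
```

So the cone of slope $1/s$ is spacelike for every $s > \sqrt{\mu}$, and the solution inside it is determined by the data on its base $B_R(x_0)$. Strictly speaking $\tau$ is not smooth at the tip $x_0$, so applying the theorem as stated would need the tip to be smoothed off, a step not carried out here.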