III Analysis of Partial Differential Equations (Full)

Part III — Analysis of Partial Differential

Equations

Based on lectures by C. Warnick

Notes taken by Dexter Chua

Michaelmas 2017

These notes are not endorsed by the lecturers, and I have modified them (often

significantly) after lectures. They are nowhere near accurate representations of what

was actually lectured, and in particular, all errors are almost surely mine.

This course serves as an introduction to the mathematical study of Partial Differential

Equations (PDEs). The theory of PDEs is nowadays a huge area of active research,

and it goes back to the very birth of mathematical analysis in the 18th and 19th

centuries. The subject lies at the crossroads of physics and many areas of pure and

applied mathematics.

The course will mostly focus on four prototype linear equations: Laplace’s equation,

the heat equation, the wave equation and Schrödinger’s equation. Emphasis will be

given to modern functional analytic techniques, relying on a priori estimates, rather

than explicit solutions, although the interaction with classical methods (such as the

fundamental solution and Fourier representation) will be discussed. The following basic

unifying concepts will be studied: well-posedness, energy estimates, elliptic regularity,

characteristics, propagation of singularities, group velocity, and the maximum principle.

Some non-linear equations may also be discussed. The course will end with a discussion

of major open problems in PDEs.

Pre-requisites

There are no specific pre-requisites beyond a standard undergraduate analysis

background, in particular a familiarity with measure theory and integration. The course

will be mostly self-contained and can be used as a first introductory course in PDEs

for students wishing to continue with some specialised PDE Part III courses in the

Lent and Easter terms.

Contents

0 Introduction

1 Basics of PDEs

2 The Cauchy–Kovalevskaya theorem

2.1 The Cauchy–Kovalevskaya theorem

2.2 Reduction to first-order systems

3 Function spaces

3.1 The Hölder spaces

3.2 Sobolev spaces

3.3 Approximation of functions in Sobolev spaces

3.4 Extensions and traces

3.5 Sobolev inequalities

4 Elliptic boundary value problems

4.1 Existence of weak solutions

4.2 The Fredholm alternative

4.3 The spectrum of elliptic operators

4.4 Elliptic regularity

5 Hyperbolic equations

0 Introduction

Partial differential equations are ubiquitous in mathematics, physics, and beyond.

The first equation we have met might be Laplace’s equation, saying

\[ -\Delta u = -\sum_{i=1}^n \frac{\partial^2 u}{\partial x_i^2} = 0. \]

This is the canonical example of an elliptic PDE, and we will spend a lot of

time thinking about elliptic PDEs, since they tend to be very well-behaved.

Instead of trying to explicitly solve equations, as we did in, say, IB Methods, our

focus is mostly on the existence and uniqueness of solutions, without explicitly

constructing them. This will involve the use of machinery from functional

analysis, and indeed a lot of the work will be about showing that we satisfy the

hypotheses required by the functional-analytic results (and, sometimes, proving the

functional-analytic results themselves).

We will also consider hyperbolic equations. The canonical example is the

wave equation

\[ \frac{\partial^2 u}{\partial t^2} - \Delta u = 0. \]

The difference is that the time derivative term now has a different sign from the

rest. In Laplace’s equation, all directions were equal. Here time is a “special”

direction, and often our questions are about how the solution evolves in time.

Of course, we don’t “just solve” such equations. Usually, we impose some

data, such as the desired values of $u$ on the boundary of our domain, or the

“starting configuration” in the case of the wave equation. In general, given such

a system, there are several questions we can ask:

– Does a solution exist?

– Is the solution unique?

– Does the solution depend continuously on the data?

– How regular is the solution? Is it continuously differentiable? Or even smooth?

These questions are closely related. To even make sense of these questions, we

need to specify our “search space”, i.e. the sort of functions we are willing to

consider. For example, we may consider the space of all smooth functions, or

less ambitiously, the space of all twice-differentiable functions. This somewhat

answers the last question, but it doesn’t answer it completely. It could be that

we can try to search for the solution in the space of $C^2$ functions, but it turns

out the solutions are always smooth!

The choice of this function space affects the answers to the other questions as

well. If we have a larger function space, then we are more likely to get a positive

answer to the first question. However, since there are more functions around, we

are more likely to get a negative answer to the second question. So there is

some tension here.

The choice affects the third question in a slightly more subtle way. To speak

of continuity, we must pick a topology, and this usually comes from a norm on

the function space. Thus, to make sense of the third question, we must pick the

appropriate norm on both the space of data and the space of potential solutions.

After choosing the appropriate function spaces, if the answers to the first

three questions are all “yes”, then we say the problem is well-posed.

1 Basics of PDEs

It might be wise to define what a partial differential equation is.

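Definition (Partial differential equation). Suppose $U \subseteq \mathbb{R}^n$ is open. A partial differential equation (PDE) of order $k$ is a relation of the form

\[ F(x, u(x), Du(x), \ldots, D^k u(x)) = 0, \tag{$*$} \]

where $F : U \times \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^{n^2} \times \cdots \times \mathbb{R}^{n^k} \to \mathbb{R}$ is a given function, and $u : U \to \mathbb{R}$ is the “unknown”.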

Definition (Classical solution). We say $u \in C^k(U)$ is a classical solution of a PDE if in fact the PDE is identically satisfied on $U$ when $u, Du, \ldots, D^k u$ are substituted in.

More generally, we can allow $u$ and $F$ to take values in a vector space. In this case, we say it is a system of PDEs.

We can now entertain ourselves by writing out a large list of PDEs that are

naturally found in physics and mathematics.

Example (Transport equation). Suppose $v : \mathbb{R}^4 \times \mathbb{R} \to \mathbb{R}^3$ and $f : \mathbb{R}^4 \to \mathbb{R}$ are given. The transport equation is

\[ \frac{\partial u}{\partial t}(x, t) + v(x, t, u(x, t)) \cdot D_x u(x, t) = f(x, t), \]

where we think of $x \in \mathbb{R}^3$ and $t \in \mathbb{R}$. This describes the evolution of the density $u$ of some chemical being advected by a flow $v$ and produced at a rate $f$.

We see that this is a PDE of order 1, and a relatively straightforward solution

method exists, namely the method of characteristics.

Example (Laplace’s and Poisson’s equations). Taking $u : \mathbb{R}^n \to \mathbb{R}$, Laplace’s equation is

\[ \Delta u(x) = \sum_{i=1}^n \frac{\partial^2 u}{\partial x_i^2}(x) = 0. \]

This describes, for example, the electrostatic potential in vacuum and the static

distribution of heat inside a uniform solid body. It also has applications to

steady flows in 2d fluids.

There is an inhomogeneous version of this:

∆u(x) = f(x),

where $f : \mathbb{R}^n \to \mathbb{R}$ is a fixed function. This is known as Poisson’s equation, and

describes, for example, the electrostatic field due to a charge distribution, and

the gravitational field in Newtonian gravity.

Example (Heat/diffusion equation). This is given by

\[ \frac{\partial u}{\partial t} = \Delta u, \]

where $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is now a function of space and time. This describes the

evolution of temperature inside a uniform body, or equivalently the diffusion of

some chemical (where u is the density).

Example (Wave equation). The wave equation is given by

\[ \frac{\partial^2 u}{\partial t^2} = \Delta u, \]

where $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is again a function of space and time. This describes

oscillations of

– strings (n = 1)

– membrane/drum (n = 2)

– air density in a sound wave (n = 3)

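Example (Schrödinger’s equation). Let $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{C} \cong \mathbb{R}^2$. Up to choices of units and convention, Schrödinger’s equation is

\[ i\frac{\partial u}{\partial t} + \Delta u - Vu = 0. \]

Here $u$ is the wavefunction of a particle moving in a potential $V : \mathbb{R}^n \to \mathbb{R}$.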

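Example (Maxwell’s equations). The unknowns here are $E, B : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}^3$. They satisfy Maxwell’s equations

\[ \nabla \cdot E = \rho, \qquad \nabla \cdot B = 0, \]

\[ \nabla \times E + \frac{\partial B}{\partial t} = 0, \qquad \nabla \times B - \frac{\partial E}{\partial t} = J, \]

where $\rho$ is the electric charge density, $J$ is the electric current, $E$ is the electric field and $B$ is the magnetic field.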

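This is a system for 6 unknown components: the two curl equations give 6 evolution equations, while the two divergence equations act as constraints.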

Example (Einstein’s equations). Einstein’s equations in vacuum are

\[ R_{\mu\nu}[g] = 0, \]

where $g$ is a Lorentzian metric (encoding the gravitational field), and $R_{\mu\nu}[g]$ is the Ricci curvature of $g$.

Since we haven’t said what $g$ and $R_{\mu\nu}$ are, it is not clear that this is a partial differential equation, but it is.

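Example (Minimal surface equation). The minimal surface equation is

\[ \operatorname{Div}\left( \frac{Du}{\sqrt{1 + |Du|^2}} \right) = 0, \]

where $u : \mathbb{R}^n \to \mathbb{R}$ is some function. This is the condition that the graph of $u$, $\{(x, u(x))\} \subseteq \mathbb{R}^n \times \mathbb{R}$, is locally an extremizer of area.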

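Example (Ricci flow). Let $g$ be a Riemannian metric on some manifold. The Ricci flow is a PDE that evolves this metric:

\[ \frac{\partial g_{ij}}{\partial t} = R_{ij}[g], \]

where $R_{ij}$ is again the Ricci curvature. This is written up to sign and normalization conventions; the right-hand side is commonly taken to be $-2R_{ij}[g]$.

The most famous application is in proving the Poincaré conjecture, which is a topological conjecture about 3-manifolds.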

These PDEs exhibit a wide variety of behaviours. For example, waves behave

very differently from the evolution of temperature. This means it is unlikely

that we can say anything about PDEs as a whole, since everything we say must

be true for both the heat equation and the wave equation. We must restrict

to some particular classes of PDEs to say something useful. Thus, we seek to

classify our PDEs into different types. We first introduce some notation.

In this course, the natural numbers start at 0.

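Notation (Multi-index/Schwartz notation). We say an element $\alpha \in \mathbb{N}^n$ is a multi-index. Writing $\alpha = (\alpha_1, \ldots, \alpha_n)$, we write

\[ |\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n. \]

Also, we have

\[ D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}. \]

If $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, then

\[ x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}. \]

We also write

\[ \alpha! = \alpha_1!\, \alpha_2! \cdots \alpha_n!. \]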
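To get a feel for the notation, here is a small sketch in Python that implements $|\alpha|$, $\alpha!$ and $x^\alpha$ for multi-indices stored as tuples. The function names (`order`, `multi_factorial`, `monomial`, `multi_indices`) are our own and purely illustrative.

```python
from itertools import product
from math import factorial, prod


def order(alpha):
    """|alpha| = alpha_1 + ... + alpha_n."""
    return sum(alpha)


def multi_factorial(alpha):
    """alpha! = alpha_1! * ... * alpha_n!."""
    return prod(factorial(a) for a in alpha)


def monomial(x, alpha):
    """x^alpha = x_1^alpha_1 * ... * x_n^alpha_n."""
    return prod(xi ** a for xi, a in zip(x, alpha))


def multi_indices(n, k):
    """All multi-indices in N^n with |alpha| = k (natural numbers start at 0)."""
    return [a for a in product(range(k + 1), repeat=n) if sum(a) == k]


alpha = (2, 0, 1)
print(order(alpha), multi_factorial(alpha), monomial((3, 5, 7), alpha))
print(multi_indices(3, 2))
```

For instance, with $\alpha = (2, 0, 1)$ we get $|\alpha| = 3$, $\alpha! = 2$ and $(3, 5, 7)^\alpha = 63$, and there are 6 multi-indices of order 2 in $\mathbb{N}^3$, one for each second-order partial derivative in three variables.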

We now try to crudely classify the PDEs we have written down. Recall that

our PDEs take the general form

\[ F(x, u(x), Du(x), \ldots, D^k u(x)) = 0. \]

Definition (Linear PDE). We say a PDE is linear if $F$ is a linear function of $u$ and its derivatives. In this case, we can re-write it as

\[ \sum_{|\alpha| \leq k} a_\alpha(x) D^\alpha u = 0. \]

Definition (Semi-linear PDE). We say a PDE is semi-linear if it is of the form

\[ \sum_{|\alpha| = k} a_\alpha(x) D^\alpha u(x) + a_0[x, u, Du, \ldots, D^{k-1}u] = 0. \]

In other words, the terms involving the highest order derivatives are linear.

Generalizing further, we have

Definition (Quasi-linear PDE). We say a PDE is quasi-linear if it is of the form

\[ \sum_{|\alpha| = k} a_\alpha[x, u, Du, \ldots, D^{k-1}u]\, D^\alpha u(x) + a_0[x, u, \ldots, D^{k-1}u] = 0. \]

So the highest order derivative still appears linearly, but the coefficients can

depend on lower-order derivatives of u.

Finally, we have

Definition

(Fully non-linear PDE)

A PDE is fully non-linear if it is not

quasi-linear.

Example. Laplace’s equation ∆u = 0 is linear.

Example. The equation $u_{xx} + u_{yy} = u_x^2$ is semi-linear.

Example. The equation $u u_{xx} + u_{yy} = u_x^2$ is quasi-linear.

Example. The equation $u_{xx} u_{yy} - u_{xy}^2 = 0$ is fully non-linear.

2 The Cauchy–Kovalevskaya theorem

2.1 The Cauchy–Kovalevskaya theorem

Before we begin talking about PDEs, let’s recall what we already know about

ODEs. Fix some open subset $U \subseteq \mathbb{R}^n$, and assume $f : U \to \mathbb{R}^n$ is given. Consider the ODE

\[ \dot{u}(t) = f(u(t)). \]

This is an autonomous ODE because there is no explicit $t$ dependence on the right. This assumption is usually harmless, as we can just increment $n$ and use the new variable to keep track of $t$. Here $u : (a, b) \to U$ is the unknown, where $a < 0 < b$.

The Cauchy problem for this equation is to find a solution to the ODE satisfying $u(0) = u_0 \in U$ for any $u_0$.

The Picard–Lindelöf theorem says we can always do so under some mild conditions.

Theorem (Picard–Lindelöf theorem). Suppose that there exists $r, K > 0$ such that $B_r(u_0) \subseteq U$, and

\[ \|f(x) - f(y)\| \leq K\|x - y\| \]

for all $x, y \in B_r(u_0)$. Then there exists an $\varepsilon > 0$ depending on $K, r$ and a unique $C^1$ function $u : (-\varepsilon, \varepsilon) \to U$ solving the Cauchy problem.

It is instructive to give a quick proof sketch of the result.

Proof sketch. If $u$ is a solution, then by the fundamental theorem of calculus, we have

\[ u(t) = u_0 + \int_0^t f(u(s))\, ds. \]

Conversely, if $u$ is a $C^0$ solution to this integral equation, then it solves the ODE. Crucially, this only requires $u$ to be $C^0$. Indeed, if $u$ is $C^0$ and satisfies the integral equation, then $u$ is automatically $C^1$. So we can work in a larger function space when we seek $u$.

Thus, we have reformulated our initial problem into an integral equation. In

particular, we reformulated it in a way that assumes less about the function. In

the case of PDEs, this is what is known as a weak formulation.

Returning to the proof, we have reformulated our problem as looking for a fixed point of the map

\[ B : w \mapsto u_0 + \int_0^t f(w(s))\, ds \]

acting on

\[ \mathcal{C} = \{w : [-\varepsilon, \varepsilon] \to \overline{B_{r/2}(u_0)} : w \text{ is continuous}\}. \]

This is a complete metric space when we equip it with the supremum norm (in fact, it is a closed ball in a Banach space).

We then show that for $\varepsilon$ small enough, this map $B : \mathcal{C} \to \mathcal{C}$ is a contraction map. There are two parts: to show that it actually lands in $\mathcal{C}$, and that it is a contraction. If we managed to show these, then by the contraction mapping theorem, there is a unique fixed point, and we are done.

The idea of formulating our problem as a fixed point problem is a powerful

technique that allows us to understand many PDEs, especially non-linear ones.

This theorem tells us that a unique $C^1$ solution exists locally. It is not reasonable to believe it can exist globally, as we might run out of $U$ in finite time. However, if $f$ is better behaved, we might expect $u$ to be more regular, and indeed this is the case. We shall not go into the details.

How can we actually use the theorem in practice? Can we actually obtain

a solution from this? Recall that to prove the contraction mapping theorem,

what we do is arbitrarily pick a point in $\mathcal{C}$, keep applying $B$, and by the

contraction, we must approach the fixed point. This gives us a way to construct

an approximation to the ODE.
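The iteration just described can be sketched numerically. The following is only an illustration under our own assumptions: we work on $[0, \varepsilon]$ rather than $[-\varepsilon, \varepsilon]$, take a scalar ODE, and replace the integral in the map $B$ by the trapezoidal rule on a uniform grid. The name `picard_iterate` and its parameters are hypothetical.

```python
import math


def picard_iterate(f, u0, eps, steps=200, iterations=20):
    """Repeatedly apply w -> u0 + int_0^t f(w(s)) ds on a grid of [0, eps].

    The integral is approximated by the trapezoidal rule, so this is a
    discrete stand-in for the map B, not the map itself.
    """
    h = eps / steps
    w = [u0] * (steps + 1)          # start from the constant function u0
    for _ in range(iterations):
        new = [u0]
        integral = 0.0
        for i in range(1, steps + 1):
            integral += 0.5 * h * (f(w[i - 1]) + f(w[i]))
            new.append(u0 + integral)
        w = new
    return w


# Scalar example: u' = u, u(0) = 1, whose exact solution is exp(t).
w = picard_iterate(lambda u: u, 1.0, eps=0.5)
print(abs(w[-1] - math.exp(0.5)))   # small: the iterates approach the solution
```

Starting from the constant function $u_0$, each pass through the outer loop is one application of the (discretized) map, and the printed error at $t = 0.5$ is dominated by the quadrature error rather than by the iteration.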

However, if we were a physicist, we would have done things differently.

Suppose $f \in C^\infty$. We can then attempt to construct a Taylor series of the solution near the origin. First we note that for any solution $u$, we must have

\[ u(0) = u_0, \quad \dot{u}(0) = f(u_0). \]

Assuming $u$ is in fact a smooth solution, we can differentiate the ODE and obtain

\[ \ddot{u}(t) = \frac{d}{dt}\dot{u}(t) = \frac{d}{dt} f(u(t)) = Df(u(t))\,\dot{u}(t) \equiv f_2(u(t), \dot{u}(t)). \]

At the origin, we already know what $u$ and $\dot{u}$ are. We can proceed iteratively to determine

\[ u^{(k)}(t) = f_k(u, \dot{u}, \ldots, u^{(k-1)}). \]

So in particular, we can in principle determine $u_k \equiv u^{(k)}(0)$. At least formally, we can write

\[ u(t) = \sum_{k=0}^\infty u_k \frac{t^k}{k!}. \]

If we were physicists, we would say we are done. But being honest mathematicians,

in order to claim that we have a genuine solution, we need to at least

show that this converges. Under suitable circumstances, this is given by the

Cauchy–Kovalevskaya theorem.

Theorem (Cauchy–Kovalevskaya for ODEs). The series

\[ u(t) = \sum_{k=0}^\infty u_k \frac{t^k}{k!} \]

converges to the Picard–Lindelöf solution of the Cauchy problem if $f$ is real analytic in a neighbourhood of $u_0$.
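As a concrete instance of the formal construction, take the scalar ODE $\dot{u} = u^2$ with $u(0) = 1$ (our own choice of example, not one from the lectures), whose solution is $1/(1 - t)$. Writing $u = \sum_k c_k t^k$ with $c_k = u_k / k!$ and matching powers of $t$ gives a recursion for the coefficients, which the sketch below evaluates exactly with rational arithmetic.

```python
from fractions import Fraction


def taylor_coefficients(num_terms):
    """Coefficients c_k = u_k / k! of the formal solution of u' = u^2, u(0) = 1.

    Matching powers of t in u' = u^2 gives
        (k + 1) c_{k+1} = sum_{i + j = k} c_i c_j.
    """
    c = [Fraction(1)]
    for k in range(num_terms - 1):
        cauchy = sum(c[i] * c[k - i] for i in range(k + 1))
        c.append(cauchy / (k + 1))
    return c


coeffs = taylor_coefficients(8)
print(coeffs)   # every coefficient equals 1, matching 1/(1 - t) = 1 + t + t^2 + ...
partial = sum(float(ck) * 0.5 ** k for k, ck in enumerate(coeffs))
print(partial, 1 / (1 - 0.5))
```

Here the series is geometric and converges only for $|t| < 1$, which matches the fact that the solution blows up at $t = 1$: the theorem guarantees convergence near the origin, not globally.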

Recall that being real analytic means being equal to its Taylor series:

Definition (Real analytic). Let $U \subseteq \mathbb{R}^n$ be open, and suppose $f : U \to \mathbb{R}$. We say $f$ is real analytic near $x_0 \in U$ if there exists $r > 0$ and constants $f_\alpha \in \mathbb{R}$ for each multi-index $\alpha$ such that

\[ f(x) = \sum_\alpha f_\alpha (x - x_0)^\alpha \]

for $|x - x_0| < r$.

Note that if $f$ is real analytic near $x_0$, then it is in fact $C^\infty$ in the corresponding neighbourhood. Furthermore, the constants $f_\alpha$ are given by

\[ f_\alpha = \frac{1}{\alpha!} D^\alpha f(x_0). \]

In other words, $f$ equals its Taylor expansion. Of course, by translation, we can usually assume $x_0 = 0$.

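Example. If $r > 0$, set

\[ f(x) = \frac{r}{r - (x_1 + x_2 + \cdots + x_n)} \]

for $|x| < r/\sqrt{n}$. Then this is real analytic, since we have

\[ f(x) = \frac{1}{1 - (x_1 + \cdots + x_n)/r} = \sum_{k=0}^\infty \left( \frac{x_1 + \cdots + x_n}{r} \right)^k. \]

We can then expand out each term to see that this is given by a power series. Explicitly, it is given by

\[ f(x) = \sum_\alpha \frac{1}{r^{|\alpha|}} \binom{|\alpha|}{\alpha} x^\alpha, \quad \text{where} \quad \binom{|\alpha|}{\alpha} = \frac{|\alpha|!}{\alpha!}. \]

One sees that this series is absolutely convergent for $|x| < r/\sqrt{n}$.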
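The multinomial expansion in this example is easy to check numerically. The sketch below (with the hypothetical helper name `majorant_partial_sum`) sums the power series over all multi-indices up to a given order and compares the result with the closed form $r/(r - (x_1 + \cdots + x_n))$ at a sample point of our choosing.

```python
from itertools import product
from math import factorial, prod


def majorant_partial_sum(x, r, max_order):
    """Sum of (|alpha|!/alpha!) x^alpha / r^|alpha| over all |alpha| <= max_order."""
    n = len(x)
    total = 0.0
    for alpha in product(range(max_order + 1), repeat=n):
        k = sum(alpha)
        if k > max_order:
            continue
        # Multinomial coefficient |alpha|! / alpha!, always an integer.
        multinomial = factorial(k) // prod(factorial(a) for a in alpha)
        total += multinomial * prod(xi ** a for xi, a in zip(x, alpha)) / r ** k
    return total


x, r = (0.1, 0.2), 1.0
closed_form = r / (r - sum(x))
print(majorant_partial_sum(x, r, 30), closed_form)
```

The two printed numbers agree to many digits, since at this point the series is just the geometric series in $(x_1 + x_2)/r = 0.3$ rearranged by multi-index.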

Recall that in single-variable analysis, essentially the only way we have to

show that a series converges is by comparison to the geometric series. Here with

multiple variables, our only way to show that a power series converges is by

comparing it to this f.

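Definition (Majorant). Let

\[ f = \sum_\alpha f_\alpha x^\alpha, \quad g = \sum_\alpha g_\alpha x^\alpha \]

be formal power series. We say $g$ majorizes $f$ (or $g$ is a majorant of $f$), written $g \gg f$, if $g_\alpha \geq |f_\alpha|$ for all multi-indices $\alpha$.

If $f$ and $g$ are vector-valued, then this means $g^i \gg f^i$ for all indices $i$.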

Lemma.

(i) If $g \gg f$ and $g$ converges for $|x| < r$, then $f$ converges for $|x| < r$.

(ii) If $f(x) = \sum_\alpha f_\alpha x^\alpha$ converges for $|x| < r$ and $0 < s\sqrt{n} < r$, then $f$ has a majorant which converges on $|x| < s/\sqrt{n}$.

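Proof.

(i) Given $x$, define $\tilde{x} = (|x_1|, |x_2|, \ldots, |x_n|)$. We then note that

\[ \sum_\alpha |f_\alpha x^\alpha| = \sum_\alpha |f_\alpha|\, \tilde{x}^\alpha \leq \sum_\alpha g_\alpha \tilde{x}^\alpha = g(\tilde{x}). \]

Since $|\tilde{x}| = |x| < r$, we know $g$ converges at $\tilde{x}$.

(ii) Let $0 < s\sqrt{n} < r$ and set $y = s(1, 1, \ldots, 1)$. Then we have

\[ |y| = s\sqrt{n} < r. \]

So by assumption, we know $\sum_\alpha f_\alpha y^\alpha$ converges. A convergent series has bounded terms, so there exists $C$ such that $|f_\alpha y^\alpha| \leq C$ for all $\alpha$. But $y^\alpha = s^{|\alpha|}$. So we know

\[ |f_\alpha| \leq \frac{C}{s^{|\alpha|}} \leq \frac{C}{s^{|\alpha|}} \frac{|\alpha|!}{\alpha!}. \]

But then if we set

\[ g(x) = \frac{Cs}{s - (x_1 + \cdots + x_n)} = C \sum_\alpha \frac{|\alpha|!}{s^{|\alpha|} \alpha!} x^\alpha, \]

we are done, since this converges for $|x| < s/\sqrt{n}$.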

With this lemma in mind, we can now prove the Cauchy–Kovalevskaya

theorem for first-order PDEs. This concerns a class of problems similar to the

Cauchy problem for ODEs. We first set up our notation.

We shall consider functions $u : \mathbb{R}^n \to \mathbb{R}^m$. Writing $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, we will consider the last variable $x_n$ as being the “time variable”, and the others as being space. However, for notational convenience, we will not write it as $t$. We will adopt the shorthand $x' = (x_1, \ldots, x_{n-1})$, so that $x = (x', x_n)$.

Suppose we are given real analytic functions

\[ B_j : \mathbb{R}^m \times \mathbb{R}^{n-1} \to \mathrm{Mat}_{m \times m}(\mathbb{R}) \quad (j = 1, \ldots, n - 1), \qquad c : \mathbb{R}^m \times \mathbb{R}^{n-1} \to \mathbb{R}^m. \]

We seek a solution to the PDE

\[ u_{x_n} = \sum_{j=1}^{n-1} B_j(u, x')\, u_{x_j} + c(u, x') \]

subject to $u = 0$ when $x_n = 0$. We shall not require a solution on all of $\mathbb{R}^n$, but only on an open neighbourhood of the origin. Consequently, we will allow for $B_j$ and $c$ to not be everywhere defined, but merely convergent on some neighbourhood of the origin.

Note that we assumed $B_j$ and $c$ do not depend on $x_n$, but this is not a restriction, since we can always introduce a new variable $u^{m+1} = x_n$, and enlarge the target space.

Theorem (Cauchy–Kovalevskaya theorem). Given the above assumptions, there exists a real analytic function $u = \sum_\alpha u_\alpha x^\alpha$ solving the PDE in a neighbourhood of the origin. Moreover, it is unique among real analytic functions.

The uniqueness part of the proof is not difficult. If we write out $u$, $B_j$ and $c$ in power series and plug them into the PDE, we can then simply collect terms and come up with an expression for what $u_\alpha$ must be. This is the content of the following lemma:

Lemma. For $k = 1, \ldots, m$ and $\alpha$ a multi-index in $\mathbb{N}^n$, there exists a polynomial $q^k_\alpha$ in the power series coefficients of $B_j$ and $c$ such that any analytic solution to the PDE must be given by

\[ u = \sum_\alpha q_\alpha(B, c)\, x^\alpha, \]

where $q_\alpha$ is the vector with entries $q^k_\alpha$.

Moreover, all coefficients of $q_\alpha$ are non-negative.

Note that despite our notation, $q_\alpha$ is not a function of $B_j$ and $c$ (which are themselves functions of $u$ and $x'$). It is a function of the coefficients in the power series expansion of $B_j$ and $c$, which are some fixed constants.

This lemma proves uniqueness. To prove existence, we must show that this

converges in a neighbourhood of the origin, and for this purpose, the fact that

the coefficients of $q_\alpha$ are non-negative is crucial. After we have established this,

we will use the comparison test to reduce the theorem to the case of a single,

particular PDE, which we can solve by hand.

Proof. We construct the polynomials $q^k_\alpha$ by induction on $\alpha_n$. If $\alpha_n = 0$, then since $u = 0$ on $\{x_n = 0\}$, we conclude that we must have

\[ u_\alpha = \frac{D^\alpha u(0)}{\alpha!} = 0. \]

For $\alpha_n = 1$, we note that whenever $x_n = 0$, we have $u_{x_j} = 0$ for $j = 1, \ldots, n - 1$. So the PDE reads

\[ u_{x_n}(x', 0) = c(0, x'). \]

Differentiating this relation in directions tangent to $x_n = 0$, we find that if $\alpha = (\alpha', 1)$, then

\[ D^\alpha u(0) = D^{\alpha'} c(0, 0). \]

So $q^k_\alpha$ is a polynomial in the power series coefficients of $c$, and has non-negative coefficients.

Now suppose $\alpha_n = 2$, so that $\alpha = (\alpha', 2)$. Then

\[ D^\alpha u = D^{\alpha'} (u_{x_n})_{x_n} = D^{\alpha'} \left( \sum_{j=1}^{n-1} B_j u_{x_j} + c \right)_{x_n} = D^{\alpha'} \left( \sum_{j=1}^{n-1} \left( B_j u_{x_j x_n} + \sum_{p=1}^m (B_j)_{u^p}\, u^p_{x_n} u_{x_j} \right) + \sum_{p=1}^m c_{u^p}\, u^p_{x_n} \right). \]

We don’t really care what this looks like. The point is that when we evaluate at 0, and expand all the terms out, we get a polynomial in the derivatives of $B_j$ and $c$, and also $D^\beta u$ with $\beta_n < 2$. The derivatives of $B_j$ and $c$ are just the coefficients of the power series expansion of $B_j$ and $c$, and by the induction hypothesis, we can also express the $D^\beta u$ in terms of these power series coefficients. Thus, we can use this to construct $q_\alpha$. By inspecting what the formula looks like, we see that all coefficients in $q_\alpha$ are non-negative.

We see that we can continue doing the same computations to obtain all $q_\alpha$.

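An immediate consequence of the non-negativity is that

Lemma. If $\tilde{B}_j \gg B_j$ and $\tilde{c} \gg c$, then

\[ q^k_\alpha(\tilde{B}, \tilde{c}) \geq |q^k_\alpha(B, c)| \]

for all $\alpha$. In particular, $\tilde{u} \gg u$.

So given any $B_j$ and $c$, if we can find some $\tilde{B}_j$ and $\tilde{c}$ that majorize $B_j$ and $c$ respectively, and show that the corresponding series converges for $\tilde{B}_j$ and $\tilde{c}$, then we are done.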

But we previously saw that every power series is majorized by

\[ \frac{Cr}{r - (x_1 + \cdots + x_n)} \]

for $C$ sufficiently large and $r$ sufficiently small. So we have reduced the problem to the following case:

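Lemma. For any $C$ and $r$, define

\[ h(z, x') = \frac{Cr}{r - (x_1 + \cdots + x_{n-1}) - (z_1 + \cdots + z_m)}. \]

If $B_j$ and $c$ are given by

\[ B_j^*(z, x') = h(z, x') \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix}, \quad c^*(z, x') = h(z, x') \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}, \]

then the power series

\[ u = \sum_\alpha q_\alpha(B^*, c^*)\, x^\alpha \]

converges in a neighbourhood of the origin.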

We’ll provide a rather cheap proof, by just writing down a solution of

the corresponding PDE. The solution itself can be found via the method of

characteristics, which we will learn about soon. However, the proof itself only

requires the existence of the solution, not how we got it.

Proof. We define

\[ v(x) = \frac{1}{mn} \left( r - (x_1 + \cdots + x_{n-1}) - \sqrt{(r - (x_1 + \cdots + x_{n-1}))^2 - 2mnCr x_n} \right), \]

which is real analytic around the origin, and vanishes when $x_n = 0$. We then observe that

\[ u(x) = v(x) \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \]

gives a solution to the corresponding PDE, and is real analytic around the origin. Hence it must be given by that power series, and in particular, the power series must converge.
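Since the proof simply writes down a solution, it is reassuring to check it numerically. With $u = v\,(1, \ldots, 1)^T$, every component of the PDE reduces to the scalar equation $v_{x_n} = h \cdot (m \sum_{j<n} v_{x_j} + 1)$ with $h$ evaluated at $z = u$, so that $z_1 + \cdots + z_m = mv$. The sketch below evaluates the residual of this scalar equation at a sample point using central differences; the function names and the sample values of $r$, $C$, $m$ are our own choices.

```python
import math


def v(x, r, C, m):
    """Candidate solution; x = (x_1, ..., x_n) with x_n the 'time' variable."""
    n = len(x)
    rho = r - sum(x[:-1])
    return (rho - math.sqrt(rho ** 2 - 2 * m * n * C * r * x[-1])) / (m * n)


def residual(x, r, C, m, step=1e-6):
    """v_{x_n} - h * (m * sum_{j<n} v_{x_j} + 1), derivatives by central differences."""
    n = len(x)

    def partial(j):
        up = list(x)
        up[j] += step
        down = list(x)
        down[j] -= step
        return (v(up, r, C, m) - v(down, r, C, m)) / (2 * step)

    # h(z, x') evaluated at z = u = (v, ..., v), so z_1 + ... + z_m = m * v.
    h = C * r / (r - sum(x[:-1]) - m * v(x, r, C, m))
    return partial(n - 1) - h * (m * sum(partial(j) for j in range(n - 1)) + 1)


print(residual([0.01, -0.02, 0.005], r=1.0, C=1.0, m=2))   # close to zero
```

The residual is zero up to finite-difference error, and `v` vanishes when the last coordinate is zero, in line with the initial condition.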

2.2 Reduction to first-order systems

In nature, very few equations come in the form required by the Cauchy–Kovalevskaya

theorem, but it turns out a lot of PDEs can be cast into this form

after some work. We shall demonstrate this via an example.

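Example. Consider the problem

\[ u_{tt} = u u_{xy} - u_{xx} + u_t, \quad u|_{t=0} = u_0, \quad u_t|_{t=0} = u_1, \]

where $u_0, u_1$ are some real analytic functions near the origin. We define

\[ f = u_0 + t u_1. \]

This is then real analytic near 0, and $f|_{t=0} = u_0$ and $f_t|_{t=0} = u_1$. Set

\[ w = u - f. \]

Then $w$ satisfies

\[ w_{tt} = w w_{xy} - w_{xx} + w_t + f w_{xy} + f_{xy} w + F, \]

where

\[ F = f f_{xy} - f_{xx} + f_t, \]

and

\[ w|_{t=0} = w_t|_{t=0} = 0. \]

We let $(x, y, t) = (x_1, x_2, x_3)$ and set $\mathbf{u} = (w, w_x, w_y, w_t)$. Then our PDE becomes

\[ u^1_t = w_t = u^4 \]

\[ u^2_t = w_{xt} = u^4_{x_1} \]

\[ u^3_t = w_{yt} = u^4_{x_2} \]

\[ u^4_t = w_{tt} = u^1 u^2_{x_2} - u^2_{x_1} + u^4 + f u^2_{x_2} + f_{xy} u^1 + F, \]

and the initial condition is $\mathbf{u}(x_1, x_2, 0) = 0$. This is not quite autonomous, but we can solve that problem simply by introducing a further new variable.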

Let’s try to understand this in more generality. In certain cases, it is not

possible to write the equation in Cauchy–Kovalevskaya form. For example, if

the equation has no local solutions, then it certainly cannot be written in that

form, or else Cauchy–Kovalevskaya would give us a solution! It is thus helpful

to understand when this is possible.

Note that in the formulation of Cauchy–Kovalevskaya, the derivative $u_{x_n}$ is assumed to depend only on $x'$, and not $x_n$. If we want it to depend on $x_n$ as well, we can introduce a new variable $u^{n+1}$ and set $(u^{n+1})_{x_n} = 1$. So from now on, we shall ignore the fact that our PDE only has $x'$ on the right-hand side.

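Let’s now consider the scalar quasi-linear problem

\[ \sum_{|\alpha| = k} a_\alpha(D^{k-1}u, \ldots, Du, u, x)\, D^\alpha u + a_0(D^{k-1}u, \ldots, u, x) = 0, \]

where $u : B_r(0) \subseteq \mathbb{R}^n \to \mathbb{R}$, with initial data

\[ u = \frac{\partial u}{\partial x_n} = \cdots = \frac{\partial^{k-1} u}{\partial x_n^{k-1}} = 0 \]

whenever $|x'| < r$, $x_n = 0$.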

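We introduce a new vector

\[ \mathbf{u} = \left( u, \frac{\partial u}{\partial x_1}, \ldots, \frac{\partial u}{\partial x_n}, \frac{\partial^2 u}{\partial x_1^2}, \ldots, \frac{\partial^{k-1} u}{\partial x_n^{k-1}} \right) = (u^1, \ldots, u^m). \]

Here $\mathbf{u}$ contains all partial derivatives of $u$ up to order $k - 1$. For $j \in \{1, \ldots, m - 1\}$, we can compute $\frac{\partial u^j}{\partial x_n}$ in terms of $u^\ell$ or $\frac{\partial u^\ell}{\partial x_p}$ for some $\ell \in \{1, \ldots, m\}$ and $p < n$.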

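To express $\frac{\partial u^m}{\partial x_n}$ in terms of the other variables, we need to actually use the differential equation. To do so, we need to make an assumption about our equation. We suppose that $a_{(0,\ldots,0,k)}(0, \ldots, 0)$ is non-zero. We can then rewrite the equation as

\[ \frac{\partial^k u}{\partial x_n^k} = \frac{-1}{a_{(0,\ldots,0,k)}(D^{k-1}u, \ldots, u, x)} \left( \sum_{|\alpha| = k,\, \alpha_n < k} a_\alpha D^\alpha u + a_0 \right), \]

where at least near $x = 0$, the denominator can’t vanish. The RHS can then be written in terms of $\frac{\partial u^\ell}{\partial x_p}$ and $\mathbf{u}$ for $p < n$.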

So we have cast our original equation into the form we previously discussed, provided that the $a_\alpha$’s and $a_0$ are real analytic about the origin, and that $a_{(0,\ldots,0,k)}(0, \ldots, 0) \neq 0$. Under these assumptions, we can solve the equation by Cauchy–Kovalevskaya.

It is convenient to make the following definition: if $a_{(0,\ldots,0,k)}(0, \ldots, 0) \neq 0$, we say $\{x_n = 0\}$ is non-characteristic. Otherwise, we say it is characteristic.

Often times, we want to specify our initial data on some more exotic surface.

Unfortunately, they cannot be too exotic. They have to be real analytic in some

sense for our theory to have any chance of working.

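Definition (Real analytic hypersurface). We say that $\Sigma \subseteq \mathbb{R}^n$ is a real analytic hypersurface near $x \in \Sigma$ if there exists $\varepsilon > 0$ and a real analytic map $\Phi : B_\varepsilon(x) \to U \subseteq \mathbb{R}^n$, where $U = \Phi(B_\varepsilon(x))$, such that

– $\Phi$ is bijective and $\Phi^{-1} : U \to B_\varepsilon(x)$ is real analytic.

– $\Phi(\Sigma \cap B_\varepsilon(x)) = \{x_n = 0\} \cap U$ and $\Phi(x) = 0$.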

We think of this Φ as “straightening out the boundary”.

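Let $\gamma$ be the unit normal to $\Sigma$, and suppose $u$ solves

\[ \sum_{|\alpha| = k} a_\alpha(D^{k-1}u, \ldots, u, x)\, D^\alpha u + a_0(D^{k-1}u, \ldots, u, x) = 0 \]

subject to

\[ u = \gamma^i \partial_i u = \cdots = (\gamma^i \partial_i)^{k-1} u = 0 \text{ on } \Sigma. \]

To reduce this to the case of a flat surface, we define $w(y) = u(\Phi^{-1}(y))$, so that

\[ u(x) = w(\Phi(x)). \]

Then by the chain rule, we have

\[ \frac{\partial u}{\partial x_i} = \sum_{j=1}^n \frac{\partial w}{\partial y_j} \frac{\partial \Phi^j}{\partial x_i}. \]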

So plugging this into the equation, we see $w$ satisfies an equation of the form

\[ \sum_{|\alpha| = k} b_\alpha D^\alpha w + b_0 = 0, \]

as well as boundary conditions of

\[ w = \frac{\partial w}{\partial y_n} = \cdots = \frac{\partial^{k-1} w}{\partial y_n^{k-1}} = 0. \]

So we have transformed this to a quasi-linear equation with boundary conditions on $y_n = 0$, which we can tackle with Cauchy–Kovalevskaya, provided the surface $y_n = 0$ is non-characteristic. Can we relate this back to the $a$’s?

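We can compute $b_{(0,\ldots,0,k)}$ directly. Note that if $|\alpha| = k$, then

\[ D^\alpha u = \frac{\partial^k w}{\partial y_n^k} (D\Phi^n)^\alpha + \text{terms not involving } \frac{\partial^k w}{\partial y_n^k}. \]

So the coefficient of $\frac{\partial^k w}{\partial y_n^k}$ is

\[ b_{(0,\ldots,0,k)} = \sum_{|\alpha| = k} a_\alpha (D\Phi^n)^\alpha. \]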

Definition ((Non-)characteristic surface). A surface $\Sigma$ is non-characteristic at $x \in \Sigma$ provided

\[ \sum_{|\alpha| = k} a_\alpha (D\Phi^n)^\alpha \neq 0. \]

Equivalently, if

\[ \sum_{|\alpha| = k} a_\alpha \gamma^\alpha \neq 0, \]

where $\gamma$ is the normal to the surface. We say a surface is characteristic if it is not non-characteristic.

We focus on the case where our PDE is second-order. Consider an operator of the form

\[ Lu = \sum_{i,j=1}^n a_{ij} \frac{\partial^2 u}{\partial x_i \partial x_j}, \]

where $a_{ij} \in \mathbb{R}$. We may wlog assume $a_{ij} = a_{ji}$. For example the wave equation and Laplace’s equation are given by operators of this form. Consider the equation

\[ Lu = f, \qquad u = \nu_i \frac{\partial u}{\partial x_i} = 0 \text{ on } \Pi_\nu = \{x \cdot \nu = 0\}. \]

Then $\Pi_\nu$ is non-characteristic if

\[ \sum_{i,j} a_{ij} \nu_i \nu_j \neq 0. \]

Since $(a_{ij})$ is diagonalizable, we see that if all eigenvalues are positive, then $\sum_{i,j} a_{ij} \nu_i \nu_j$ is non-zero for every $\nu \neq 0$, and so the problem has no characteristic surfaces. In this case, we say the operator is elliptic. If $(a_{ij})$ has one negative eigenvalue and the rest positive, then we say $L$ is hyperbolic.
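The non-characteristic condition is just a statement about the quadratic form $\nu \mapsto \sum_{i,j} a_{ij} \nu_i \nu_j$, which is easy to evaluate by hand or by machine. As a quick sketch (the helper name `quadratic_form` and the two sample matrices are our own), we compare the Laplacian in two variables with the wave operator in one space dimension, with variables ordered as $(t, x)$.

```python
def quadratic_form(a, nu):
    """Evaluate sum_{i,j} a_ij nu_i nu_j for a constant coefficient matrix a."""
    n = len(nu)
    return sum(a[i][j] * nu[i] * nu[j] for i in range(n) for j in range(n))


laplacian = [[1, 0], [0, 1]]    # Delta in two variables
wave = [[-1, 0], [0, 1]]        # -d_t^2 + d_x^2, variables ordered (t, x)

print(quadratic_form(laplacian, (1, 1)))   # 2: non-characteristic
print(quadratic_form(wave, (1, 1)))        # 0: the plane t + x = 0 is characteristic
print(quadratic_form(wave, (1, 0)))        # -1: the plane t = 0 is non-characteristic
```

For the Laplacian the form is $|\nu|^2$, which never vanishes for $\nu \neq 0$. For the wave operator it vanishes exactly on the directions $\nu \propto (1, \pm 1)$.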

Example. If $L$ is the Laplacian

\[ L = \Delta = \sum_{i=1}^n \frac{\partial^2}{\partial x_i^2}, \]

then $L$ is elliptic.

If $L$ is the wave operator

\[ L = -\partial_t^2 + \Delta, \]

then $L$ is hyperbolic.

If we consider the problem

\[ Lu = 0, \]

and forget the Cauchy data, we can look for solutions of the form $e^{ik \cdot x}$, as a good physicist would do. We can plug this into our operator to compute

\[ L(e^{ik \cdot x}) = -\sum_{i,j=1}^n a_{ij} k_i k_j e^{ik \cdot x}. \]

So if $L$ is elliptic, the only solution of this form is $k = 0$. If $L$ is hyperbolic, we can have non-trivial plane wave solutions provided $k \propto \nu$ for some $\nu$ with

\[ \sum_{i,j=1}^n a_{ij} \nu_i \nu_j = 0. \]

So we can set $u_\lambda(x) = e^{i\lambda \nu \cdot x}$ for such a $\nu$ (with $|\nu| = 1$, wlog). By taking $\lambda$ very large, we can arrange this solution to have very large derivative in the $\nu$ direction. Vaguely, this says the characteristic directions are the directions where singularities can propagate. By contrast, we will see that this is not the case for elliptic operators, and this is known as elliptic regularity. In fact, we will show that if $L$ is elliptic and $u$ satisfies $Lu = 0$, then $u \in C^\infty$.

While Cauchy–Kovalevskaya is sometimes useful, it has a few issues:

– Not all functions are real analytic.

– We have no control over “how long” a solution exists.

– It doesn’t answer the question of well-posedness.

Indeed, consider the PDE

\[ u_{xx} + u_{yy} = 0. \]

This admits a solution

\[ u(x, y) = \cos kx \cosh ky \]

for some $k \in \mathbb{R}$. We can think of this as coming from the Cauchy problem

\[ u(x, 0) = \cos kx, \quad u_y(x, 0) = 0. \]

By Cauchy–Kovalevskaya, there is a unique real analytic solution, and we’ve found one. So this is the unique solution.

Let’s think about what happens when $k$ gets large. In this case, it seems like nothing is very wrong with the initial data. While the initial data oscillates more and more, it is still bounded by 1. However, we see that the solution at any $y = \varepsilon > 0$ grows exponentially. We might say that the derivatives of the initial condition grow to infinity as well, but if we do a bit more work (as you will on the example sheet), we can construct a sequence of initial data all of whose derivatives tend to 0, but the solution still blows up.

This is actually a serious problem. If we want to solve the PDE for a more

general initial condition, we may want to decompose the initial data into Fourier

modes, and then integrate up these solutions we found. But we cannot do this

in general, if these solutions blow up as k → ∞.
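To see the blow-up concretely, one possible choice (an assumption on our part, not necessarily the one intended on the example sheet) is to rescale the data to $e^{-\sqrt{k}} \cos kx$. Each fixed derivative of this data has supremum $e^{-\sqrt{k}} k^j \to 0$, while the corresponding solution $e^{-\sqrt{k}} \cos kx \cosh ky$ still grows without bound at any fixed $y > 0$. The sketch below tabulates both quantities.

```python
import math


def data_sup(k, order):
    """Sup over x of the order-th x-derivative of exp(-sqrt(k)) * cos(k x)."""
    return math.exp(-math.sqrt(k)) * k ** order


def solution_sup(k, y):
    """Sup over x of exp(-sqrt(k)) * cos(k x) * cosh(k y) at height y."""
    return math.exp(-math.sqrt(k)) * math.cosh(k * y)


for k in (10, 100, 1000):
    # Second derivative of the data versus the solution at height y = 0.1.
    print(k, data_sup(k, 2), solution_sup(k, 0.1))
```

The middle column eventually decays because $e^{-\sqrt{k}}$ beats any power of $k$, while the last column grows like $e^{ky - \sqrt{k}}$. So the solution does not depend continuously on the data in any $C^j$ norm.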

3 Function spaces

From now on, we shall restrain our desire to be a physicist, and instead tackle

PDEs with functional analytic methods. This requires some technical understanding

of certain function spaces.

3.1 The Hölder spaces

The most straightforward class of function spaces is the $C^k$ spaces. These are spaces based on classical continuity and differentiability.

Definition ($C^k$ spaces). Let $U \subseteq \mathbb{R}^n$ be an open set. We define $C^k(U)$ to be the vector space of all $u : U \to \mathbb{R}$ such that $u$ is $k$-times differentiable and the partial derivatives $D^\alpha u : U \to \mathbb{R}$ are continuous for $|\alpha| \leq k$.

We want to turn this into a Banach space by putting the supremum norm

on the derivatives. However, even $\sup |u|$ is not guaranteed to exist, as $u$ may

be unbounded. So this doesn’t give a genuine norm. This suggests the following

definition.

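Definition ($C^k(\bar{U})$ spaces). We define $C^k(\bar{U}) \subseteq C^k(U)$ to be the subspace of all $u$ such that $D^\alpha u$ are all bounded and uniformly continuous for $|\alpha| \leq k$. We define a norm on $C^k(\bar{U})$ by

\[ \|u\|_{C^k(\bar{U})} = \sum_{|\alpha| \leq k} \sup_{x \in U} |D^\alpha u(x)|. \]

This makes $C^k(\bar{U})$ a Banach space.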

In some cases, we might want a “fractional” amount of differentiability. This

gives rise to the notion of Hölder spaces.

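Definition (Hölder continuity). We say a function $u : U \to \mathbb{R}$ is Hölder continuous with index $\gamma$ if there exists $C \geq 0$ such that

\[ |u(x) - u(y)| \leq C|x - y|^\gamma \]

for all $x, y \in U$.

We write $C^{0,\gamma}(\bar{U}) \subseteq C^0(\bar{U})$ for the subspace of all Hölder continuous functions with index $\gamma$.

We define the $\gamma$-Hölder semi-norm by

\[ [u]_{C^{0,\gamma}(\bar{U})} = \sup_{x \neq y \in U} \frac{|u(x) - u(y)|}{|x - y|^\gamma}. \]

We can then define a norm on $C^{0,\gamma}(\bar{U})$ by

\[ \|u\|_{C^{0,\gamma}(\bar{U})} = \|u\|_{C^0(\bar{U})} + [u]_{C^{0,\gamma}(\bar{U})}. \]

We say $u \in C^{k,\gamma}(\bar{U})$ if $u \in C^k(\bar{U})$ and $D^\alpha u \in C^{0,\gamma}(\bar{U})$ for all $|\alpha| = k$, and we define

\[ \|u\|_{C^{k,\gamma}(\bar{U})} = \|u\|_{C^k(\bar{U})} + \sum_{|\alpha| = k} [D^\alpha u]_{C^{0,\gamma}(\bar{U})}. \]

This makes $C^{k,\gamma}(\bar{U})$ into a Banach space as well.

Note that $C^{0,1}(\bar{U})$ is the set of (uniformly) Lipschitz functions on $U$.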
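A standard example to keep in mind (our own illustration) is $u(x) = \sqrt{x}$ on $U = (0, 1)$, which is Hölder continuous with index $1/2$ but not Lipschitz. The sketch below estimates the Hölder quotient over a finite grid of sample points in $[0, 1]$. Since it only samples finitely many pairs, it gives a lower bound for the semi-norm rather than the semi-norm itself.

```python
import math


def holder_quotient_sup(u, gamma, points):
    """Largest |u(x) - u(y)| / |x - y|^gamma over distinct sample points."""
    best = 0.0
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            best = max(best, abs(u(x) - u(y)) / abs(x - y) ** gamma)
    return best


grid = [i / 200 for i in range(201)]                  # sample points in [0, 1]
print(holder_quotient_sup(math.sqrt, 0.5, grid))      # approximately 1
print(holder_quotient_sup(math.sqrt, 1.0, grid))      # large, and grows as the grid is refined
```

With $\gamma = 1/2$ the quotient is bounded by 1 (attained when one of the points is 0), whereas with $\gamma = 1$ the quotient near 0 behaves like $1/\sqrt{x}$ and is unbounded.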

3.2 Sobolev spaces

The properties of Hölder spaces are not difficult to understand, but on the other hand they are not too useful. This is not too surprising, perhaps, because the supremum norm only sees the maximum of the function, and ignores the rest. In contrast, the $L^p$ norm takes into account the values at all points. This gives rise to the notion of Sobolev spaces.

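Definition ($L^p$ space). Let $U \subseteq \mathbb{R}^n$ be open, and suppose $1 \leq p \leq \infty$. We define the space $L^p(U)$ by

\[ L^p(U) = \{u: U \to \mathbb{R} \text{ measurable} \mid \|u\|_{L^p(U)} < \infty\} / \{\text{equality a.e.}\}, \]

where, if $p < \infty$, we define

\[ \|u\|_{L^p(U)} = \left(\int_U |u(x)|^p \,dx\right)^{1/p}, \]

and

\[ \|u\|_{L^\infty(U)} = \inf\{C \geq 0 \mid |u(x)| \leq C \text{ almost everywhere}\}. \]

Theorem. $L^p(U)$ is a Banach space with the $L^p$ norm.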

We can also define local versions of $L^p$ spaces by saying $u \in L^p_{loc}(U)$ if $u \in L^p(V)$ for every $V \Subset U$, i.e. $\bar{V} \subseteq U$ and $\bar{V}$ is compact. This is read as "$V$ is compactly contained in $U$". By working with $L^p_{loc}(U)$, we ignore any possible blowing up at the boundary. Note that $L^p_{loc}(U)$ is not Banach, but is a Fréchet space.

What we want to do is to define differentiability for these things. If we try to define them via limits, then we run into difficulties since the value of an element of $L^p(U)$ at a point is not well-defined. To proceed, we use the notion of a weak derivative.

Definition (Weak derivative). Suppose $u, v \in L^1_{loc}(U)$ and $\alpha$ is a multi-index. We say that $v$ is the $\alpha$th weak derivative of $u$ if

\[ \int_U u D^\alpha \varphi \,dx = (-1)^{|\alpha|} \int_U v \varphi \,dx \]

for all $\varphi \in C_c^\infty(U)$, i.e. for all smooth, compactly supported functions on $U$. We write $v = D^\alpha u$.

Note that if $u$ is a genuine smooth function, then $D^\alpha u$ is the $\alpha$th weak derivative of $u$, as integration by parts tells us.

For those who have seen distributions, this is the same as the definition of a distributional derivative, except here we require that the derivative is an $L^1_{loc}$ function.

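Lemma. Suppose $v, \tilde{v} \in L^1_{loc}(U)$ are both $\alpha$th weak derivatives of $u \in L^1_{loc}(U)$. Then $v = \tilde{v}$ almost everywhere.

Proof. For any $\varphi \in C_c^\infty(U)$, we have

\[ \int_U (v - \tilde{v})\varphi \,dx = (-1)^{|\alpha|} \int_U (u - u) D^\alpha \varphi \,dx = 0. \]

Therefore $v - \tilde{v} = 0$ almost everywhere.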

Now that we have weak derivatives, we can define the Sobolev spaces.

Definition (Sobolev space). We say that $u \in L^1_{loc}(U)$ belongs to the Sobolev space $W^{k,p}(U)$ if $u \in L^p(U)$ and $D^\alpha u$ exists and is in $L^p(U)$ for all $|\alpha| \leq k$.

If $p = 2$, we write $H^k(U) = W^{k,2}(U)$, which will be a Hilbert space.

If $p < \infty$, we define the $W^{k,p}(U)$ norm by

\[ \|u\|_{W^{k,p}(U)} = \left(\sum_{|\alpha| \leq k} \int_U |D^\alpha u|^p \,dx\right)^{1/p}. \]

If $p = \infty$, we define

\[ \|u\|_{W^{k,\infty}(U)} = \sum_{|\alpha| \leq k} \|D^\alpha u\|_{L^\infty(U)}. \]

We denote by $W_0^{k,p}(U)$ the completion of $C_c^\infty(U)$ in this norm (and again $H_0^k(U) = W_0^{k,2}(U)$).

To see that these things are somehow interesting, it would be nice to find some functions that belong to these spaces but not the $C^k$ spaces.

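Example. Let $U = B_1(0)$ be the unit ball in $\mathbb{R}^n$, and set

\[ u(x) = |x|^{-\alpha} \]

when $x \in U$, $x \neq 0$. Then for $x \neq 0$, we have

\[ D_i u = \frac{-\alpha x_i}{|x|^{\alpha + 2}}. \]

By considering $\varphi \in C_c^\infty(U \setminus \{0\})$, it is clear that if $u$ is weakly differentiable, then its weak derivative must be given by

\[ D_i u = \frac{-\alpha x_i}{|x|^{\alpha + 2}}. \quad (*) \]

We can check that $u \in L^1_{loc}(U)$ iff $\alpha < n$, and $\frac{x_i}{|x|^{\alpha + 2}} \in L^1_{loc}(U)$ iff $\alpha < n - 1$. So if we want $u \in W^{1,p}(U)$, then we must take $\alpha < n - 1$. To check that $(*)$ is indeed the weak derivative, suppose $\varphi \in C_c^\infty(U)$. Then integrating by parts, we get

\[ -\int_{U - B_\varepsilon(0)} u \varphi_{x_i} \,dx = \int_{U - B_\varepsilon(0)} D_i u \, \varphi \,dx - \int_{\partial B_\varepsilon(0)} u \varphi \nu_i \,dS, \]

where $\nu = (\nu_1, \ldots, \nu_n)$ is the inwards normal. We can estimate

\[ \left|\int_{\partial B_\varepsilon(0)} u \varphi \nu_i \,dS\right| \leq \|\varphi\|_{L^\infty} \cdot \varepsilon^{-\alpha} \cdot C\varepsilon^{n-1} \leq \tilde{C}\varepsilon^{n - 1 - \alpha} \to 0 \text{ as } \varepsilon \to 0 \]

for some constants $C$ and $\tilde{C}$. So the second term vanishes. So by, say, dominated convergence, it follows that $(*)$ is indeed the weak derivative.

Finally, note that $D_i u \in L^p(U)$ iff $p(\alpha + 1) < n$. Thus, if $\alpha < \frac{n - p}{p}$, then $u \in W^{1,p}(U)$. Note that if $p > n$, then the condition becomes $\alpha < 0$, and $u$ is continuous.

Note also that if $\alpha > \frac{n - p}{p}$, then $u \notin W^{1,p}(U)$.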
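The integrability thresholds quoted in the example can be checked in polar coordinates. The following is a short worked computation, added as a sketch; $|S^{n-1}|$ denotes the surface area of the unit sphere, a symbol not used elsewhere in these notes.

```latex
% Radial integrability near the origin: for any exponent s,
\int_{B_1(0)} |x|^{-s} \,dx
  = |S^{n-1}| \int_0^1 r^{-s} \, r^{n-1} \,dr
  < \infty
  \iff n - s > 0 .
% With s = \alpha this gives u \in L^1_{loc} iff \alpha < n.
% Since |Du| = \alpha |x|^{-(\alpha+1)}, taking s = p(\alpha+1) gives
% Du \in L^p(U) iff p(\alpha + 1) < n, i.e. \alpha < (n - p)/p.
```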

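Theorem. For each $k = 0, 1, \ldots$ and $1 \leq p \leq \infty$, the space $W^{k,p}(U)$ is a Banach space.

Proof. Homogeneity and positivity for the Sobolev norm are clear. The triangle inequality follows from the Minkowski inequality.

For completeness, note that

\[ \|D^\alpha u\|_{L^p(U)} \leq \|u\|_{W^{k,p}(U)} \]

for $|\alpha| \leq k$.

So if $(u_i)_{i=1}^\infty$ is Cauchy in $W^{k,p}(U)$, then $(D^\alpha u_i)_{i=1}^\infty$ is Cauchy in $L^p(U)$ for $|\alpha| \leq k$. So by completeness of $L^p(U)$, we have

\[ D^\alpha u_i \to u^\alpha \in L^p(U) \]

for some $u^\alpha$. It remains to show that $u^\alpha = D^\alpha u$, where $u = u^{(0, 0, \ldots, 0)}$. Let $\varphi \in C_c^\infty(U)$. Then we have

\[ (-1)^{|\alpha|} \int_U u_j D^\alpha \varphi \,dx = \int_U D^\alpha u_j \, \varphi \,dx \]

for all $j$. We send $j \to \infty$. Then using $D^\alpha u_j \to u^\alpha$ in $L^p(U)$, we have

\[ (-1)^{|\alpha|} \int_U u D^\alpha \varphi \,dx = \int_U u^\alpha \varphi \,dx. \]

So $D^\alpha u = u^\alpha \in L^p(U)$ and we are done.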

3.3 Approximation of functions in Sobolev spaces

It would be nice if we could approximate functions in $W^{k,p}(U)$ with something more tractable. For example, it would be nice if we could approximate them by smooth functions, so that the weak derivatives are genuine derivatives. A useful trick to improve regularity of a function is to convolve with a smooth mollifier.

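Definition (Standard mollifier). Let

\[ \eta(x) = \begin{cases} C e^{1/(|x|^2 - 1)} & |x| < 1 \\ 0 & |x| \geq 1 \end{cases} \]

where $C$ is chosen so that $\int_{\mathbb{R}^n} \eta(x) \,dx = 1$.

One checks that this is a smooth function on $\mathbb{R}^n$, peaked at $x = 0$.

For each $\varepsilon > 0$, we set

\[ \eta_\varepsilon(x) = \frac{1}{\varepsilon^n} \eta\left(\frac{x}{\varepsilon}\right). \]

Of course, the pre-factor of $\varepsilon^{-n}$ is chosen so that $\eta_\varepsilon$ is appropriately normalized.

We call $\eta_\varepsilon$ the standard mollifier, and it satisfies $\operatorname{supp} \eta_\varepsilon \subseteq \overline{B_\varepsilon(0)}$.

We think of these $\eta_\varepsilon$ as approximations of the $\delta$-function.

Now suppose $U \subseteq \mathbb{R}^n$ is open, and let

\[ U_\varepsilon = \{x \in U : \operatorname{dist}(x, \partial U) > \varepsilon\}. \]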

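Definition (Mollification). If $f \in L^1_{loc}(U)$, we define the mollification $f_\varepsilon: U_\varepsilon \to \mathbb{R}$ by the convolution

\[ f_\varepsilon = \eta_\varepsilon * f. \]

In other words,

\[ f_\varepsilon(x) = \int_U \eta_\varepsilon(x - y) f(y) \,dy = \int_{B_\varepsilon(x)} \eta_\varepsilon(x - y) f(y) \,dy. \]

Thus, $f_\varepsilon$ is the "local average" of $f$ around each point, with the weighting given by $\eta_\varepsilon$. The hope is that $f_\varepsilon$ will have much better regularity properties than $f$.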
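As a concrete illustration, the following sketch mollifies a step function in one dimension ($n = 1$). It is not part of the lectures: the normalising constant is computed numerically with a midpoint rule, and the function names (`integrate`, `eta_eps`, `mollify`, `step`) are chosen here for illustration only.

```python
import math

def eta_unnormalised(x):
    """The bump exp(1 / (x^2 - 1)) on (-1, 1), extended by zero."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def integrate(g, a, b, n=2000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Normalising constant C, chosen so that eta integrates to 1 over the real line.
C = 1.0 / integrate(eta_unnormalised, -1.0, 1.0)

def eta_eps(x, eps):
    """Rescaled mollifier eta_eps(x) = eps^{-1} * eta(x / eps) in dimension n = 1."""
    return C * eta_unnormalised(x / eps) / eps

def mollify(f, x, eps):
    """f_eps(x): integral of eta_eps(x - y) f(y) over the ball (x - eps, x + eps)."""
    return integrate(lambda y: eta_eps(x - y, eps) * f(y), x - eps, x + eps)

def step(y):
    """A discontinuous sample function (hypothetical test case)."""
    return 1.0 if y >= 0.0 else 0.0
```

Away from the jump the mollification reproduces the step function, since $\eta_\varepsilon$ has total mass 1, while at the jump it returns the average value of about $1/2$. For example, `mollify(step, 0.5, 0.1)` is close to 1 and `mollify(step, 0.0, 0.1)` is close to 0.5.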

Theorem. Let $f \in L^1_{loc}(U)$. Then

(i) $f_\varepsilon \in C^\infty(U_\varepsilon)$.

(ii) $f_\varepsilon \to f$ almost everywhere as $\varepsilon \to 0$.

(iii) If in fact $f \in C(U)$, then $f_\varepsilon \to f$ uniformly on compact subsets.

(iv) If $1 \leq p < \infty$ and $f \in L^p_{loc}(U)$, then $f_\varepsilon \to f$ in $L^p_{loc}(U)$, i.e. we have convergence in $L^p$ on any $V \Subset U$.

In general, the difficulty of proving these approximation theorems lies in what happens at the boundary.

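Lemma. Assume $u \in W^{k,p}(U)$ for some $1 \leq p < \infty$, and set

\[ u_\varepsilon = \eta_\varepsilon * u \text{ on } U_\varepsilon. \]

Then

(i) $u_\varepsilon \in C^\infty(U_\varepsilon)$ for each $\varepsilon > 0$.

(ii) If $V \Subset U$, then $u_\varepsilon \to u$ in $W^{k,p}(V)$.

Proof.

(i) As above.

(ii) We claim that

\[ D^\alpha u_\varepsilon = \eta_\varepsilon * D^\alpha u \]

for $|\alpha| \leq k$ in $U_\varepsilon$.

To see this, we have

\[ D^\alpha u_\varepsilon(x) = D^\alpha \int_U \eta_\varepsilon(x - y) u(y) \,dy = \int_U D_x^\alpha \eta_\varepsilon(x - y) u(y) \,dy = (-1)^{|\alpha|} \int_U D_y^\alpha \eta_\varepsilon(x - y) u(y) \,dy. \]

For a fixed $x \in U_\varepsilon$, $\eta_\varepsilon(x - \cdot) \in C_c^\infty(U)$, so by the definition of a weak derivative, this is equal to

\[ \int_U \eta_\varepsilon(x - y) D^\alpha u(y) \,dy = \eta_\varepsilon * D^\alpha u. \]

It is an exercise to verify that we can indeed move the derivative past the integral.

Thus, if we fix $V \Subset U$, then by the previous parts, we see that $D^\alpha u_\varepsilon \to D^\alpha u$ in $L^p(V)$ as $\varepsilon \to 0$ for $|\alpha| \leq k$. So

\[ \|u_\varepsilon - u\|_{W^{k,p}(V)}^p = \sum_{|\alpha| \leq k} \|D^\alpha u_\varepsilon - D^\alpha u\|_{L^p(V)}^p \to 0 \]

as $\varepsilon \to 0$.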

Theorem (Global approximation). Let $1 \leq p < \infty$, and $U \subseteq \mathbb{R}^n$ be open and bounded. Then $C^\infty(U) \cap W^{k,p}(U)$ is dense in $W^{k,p}(U)$.

Our main obstacle to overcome is the fact that the mollifications are only defined on $U_\varepsilon$, and not $U$.

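Proof. For $i \geq 1$, define

\[ U_i = \left\{x \in U \mid \operatorname{dist}(x, \partial U) > \frac{1}{i}\right\}, \quad V_i = U_{i+3} - \bar{U}_{i+1}, \quad W_i = U_{i+4} - \bar{U}_i. \]

We clearly have $U = \bigcup_{i=1}^\infty U_i$, and we can choose $V_0 \Subset U$ such that $U = \bigcup_{i=0}^\infty V_i$.

Let $\{\zeta_i\}_{i=0}^\infty$ be a partition of unity subordinate to $\{V_i\}$. Thus, we have $0 \leq \zeta_i \leq 1$, $\zeta_i \in C_c^\infty(V_i)$ and $\sum_{i=0}^\infty \zeta_i = 1$ on $U$.

Fix $\delta > 0$. Then for each $i$, we can choose $\varepsilon_i$ sufficiently small such that

\[ u^i = \eta_{\varepsilon_i} * \zeta_i u \]

satisfies $\operatorname{supp} u^i \subseteq W_i$ and

\[ \|u^i - \zeta_i u\|_{W^{k,p}(U)} = \|u^i - \zeta_i u\|_{W^{k,p}(W_i)} \leq \frac{\delta}{2^{i+1}}. \]

Now set

\[ v = \sum_{i=0}^\infty u^i \in C^\infty(U). \]

Note that we do not know (yet) that $v \in W^{k,p}(U)$. But it certainly is when we restrict to some $V \Subset U$.

In any such subset, the sum is finite, and since $u = \sum_{i=0}^\infty \zeta_i u$, we have

\[ \|v - u\|_{W^{k,p}(V)} \leq \sum_{i=0}^\infty \|u^i - \zeta_i u\|_{W^{k,p}(V)} \leq \delta \sum_{i=0}^\infty 2^{-(i+1)} = \delta. \]

Since the bound $\delta$ does not depend on $V$, by taking the supremum over all $V$, we have

\[ \|v - u\|_{W^{k,p}(U)} \leq \delta. \]

So we are done.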

It would be nice for $C^\infty(\bar{U})$ to be dense, instead of just $C^\infty(U)$. It turns out this is possible, as long as we have a sensible boundary.

Definition ($C^{k,\delta}$ boundary). Let $U \subseteq \mathbb{R}^n$ be open and bounded. We say $\partial U$ is $C^{k,\delta}$ if for any point in the boundary $p \in \partial U$, there exists $r > 0$ and a function $\gamma \in C^{k,\delta}(\mathbb{R}^{n-1})$ such that (possibly after relabelling and rotating axes) we have

\[ U \cap B_r(p) = \{(x', x_n) \in B_r(p) : x_n > \gamma(x')\}. \]

Thus, this says our boundary is locally the graph of a $C^{k,\delta}$ function.

Theorem (Smooth approximation up to boundary). Let $1 \leq p < \infty$, and $U \subseteq \mathbb{R}^n$ be open and bounded. Suppose $\partial U$ is $C^{0,1}$. Then $C^\infty(\bar{U}) \cap W^{k,p}(U)$ is dense in $W^{k,p}(U)$.

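Proof. Previously, the reason we didn't get something in $C^\infty(\bar{U})$ was that we had to glue together infinitely many mollifications whose domains collectively exhaust $U$, and there is no hope that the resulting function is in $C^\infty(\bar{U})$. In the current scenario, we know that near a boundary point $x_0$, $U$ locally looks like the region above the graph of a Lipschitz function.

The idea is that given a $u$ defined on $U$, we can shift it downwards by some $\varepsilon$. It is a known result that translation is continuous, so this only changes $u$ by a tiny bit. We can then mollify with a $\bar{\varepsilon} < \varepsilon$, which would then give a function defined on $U$ (at least locally near $x_0$).

So fix some $x_0 \in \partial U$. Since $\partial U$ is $C^{0,1}$, there exist $r > 0$ and $\gamma \in C^{0,1}(\mathbb{R}^{n-1})$ such that

\[ U \cap B_r(x_0) = \{(x', x_n) \in B_r(x_0) \mid x_n > \gamma(x')\}. \]

Set

\[ V = U \cap B_{r/2}(x_0). \]

Define the shifted function $u_\varepsilon$ to be

\[ u_\varepsilon(x) = u(x + \varepsilon e_n). \]

Now pick $\bar{\varepsilon}$ sufficiently small such that

\[ v^{\varepsilon, \bar{\varepsilon}} = \eta_{\bar{\varepsilon}} * u_\varepsilon \]

is well-defined. Note that here we need to use the fact that $\partial U$ is $C^{0,1}$. Indeed, if the slope of $\partial U$ is very steep near a point $x$, then we need to choose a $\bar{\varepsilon}$ much smaller than $\varepsilon$. By requiring that $\gamma$ is 1-Hölder continuous, we can ensure there is a single choice of $\bar{\varepsilon}$ that works throughout $V$. As long as $\bar{\varepsilon}$ is small enough, we know that $v^{\varepsilon, \bar{\varepsilon}} \in C^\infty(\bar{V})$.

Fix $\delta > 0$. We can now estimate

\[ \|v^{\varepsilon, \bar{\varepsilon}} - u\|_{W^{k,p}(V)} = \|v^{\varepsilon, \bar{\varepsilon}} - u_\varepsilon + u_\varepsilon - u\|_{W^{k,p}(V)} \leq \|v^{\varepsilon, \bar{\varepsilon}} - u_\varepsilon\|_{W^{k,p}(V)} + \|u_\varepsilon - u\|_{W^{k,p}(V)}. \]

Since translation is continuous in the $L^p$ norm for $p < \infty$, we can pick $\varepsilon > 0$ such that $\|u_\varepsilon - u\|_{W^{k,p}(V)} < \frac{\delta}{2}$. Having fixed such an $\varepsilon$, we can pick $\bar{\varepsilon}$ so small that we also have $\|v^{\varepsilon, \bar{\varepsilon}} - u_\varepsilon\|_{W^{k,p}(V)} < \frac{\delta}{2}$.

The conclusion of this is that for any $x_0 \in \partial U$, we can find a neighbourhood $V \subseteq U$ of $x_0$ in $U$ such that for any $u \in W^{k,p}(U)$ and $\delta > 0$, there exists $v \in C^\infty(\bar{V})$ such that $\|u - v\|_{W^{k,p}(V)} \leq \delta$.

It remains to patch all of these together using a partition of unity. By the compactness of $\partial U$, we can cover $\partial U$ by finitely many of these $V$, say $V_1, \ldots, V_N$. We further pick a $V_0$ such that $V_0 \Subset U$ and

\[ U = \bigcup_{i=0}^N V_i. \]

We can pick approximations $v_i \in C^\infty(\bar{V}_i)$ for $i = 0, \ldots, N$ (the $i = 0$ case is given by the previous global approximation theorem), satisfying $\|v_i - u\|_{W^{k,p}(V_i)} \leq \delta$. Pick a partition of unity $\{\zeta_i\}_{i=0}^N$ of $U$ subordinate to $\{V_i\}$. Define

\[ v = \sum_{i=0}^N \zeta_i v_i. \]

Clearly $v \in C^\infty(\bar{U})$, and we can bound

\[ \|D^\alpha v - D^\alpha u\|_{L^p(U)} = \left\|D^\alpha \sum_{i=0}^N \zeta_i v_i - D^\alpha \sum_{i=0}^N \zeta_i u\right\|_{L^p(U)} \leq C_k \sum_{i=0}^N \|v_i - u\|_{W^{k,p}(V_i)} \leq C_k (1 + N)\delta, \]

where $C_k$ is a constant that solely depends on the derivatives of the partition of unity, which are fixed. So we are done.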

3.4 Extensions and traces

If $U \subseteq \mathbb{R}^n$ is open and bounded, then there is of course a restriction map $W^{1,p}(\mathbb{R}^n) \to W^{1,p}(U)$. It turns out under mild conditions, there is an extension map going in the other direction as well.

Theorem (Extension of $W^{1,p}$ functions). Suppose $U$ is open, bounded and $\partial U$ is $C^1$. Pick a bounded $V$ such that $U \Subset V$. Then there exists a bounded linear operator

\[ E: W^{1,p}(U) \to W^{1,p}(\mathbb{R}^n) \]

for $1 \leq p < \infty$ such that for any $u \in W^{1,p}(U)$,

(i) $Eu = u$ almost everywhere in $U$

(ii) $Eu$ has support in $V$

(iii) $\|Eu\|_{W^{1,p}(\mathbb{R}^n)} \leq C\|u\|_{W^{1,p}(U)}$, where the constant $C$ depends on $U, V, p$ but not $u$.

Proof. First note that $C^1(\bar{U})$ is dense in $W^{1,p}(U)$. So it suffices to show that the above theorem holds with $W^{1,p}(U)$ replaced with $C^1(\bar{U})$, and then extend by continuity.

We first show that we can do this locally, and then glue them together using partitions of unity.

Suppose $x^0 \in \partial U$ is such that $\partial U$ near $x^0$ lies in the plane $\{x_n = 0\}$. In other words, there exists $r > 0$ such that

\[ B_+ = B_r(x^0) \cap \{x_n \geq 0\} \subseteq \bar{U}, \qquad B_- = B_r(x^0) \cap \{x_n \leq 0\} \subseteq \mathbb{R}^n \setminus U. \]

The idea is that we want to reflect $u$ across the $x_n = 0$ boundary to get a function on $B_-$, but the derivative will not be continuous if we do this. So we define a "higher order reflection" by

\[ \bar{u}(x) = \begin{cases} u(x) & x \in B_+ \\ -3u(x', -x_n) + 4u\left(x', -\frac{x_n}{2}\right) & x \in B_- \end{cases} \]

We see that this is a continuous function. Moreover, by explicitly computing the partial derivatives, we see that they are continuous across the boundary. So we know $\bar{u} \in C^1(B_r(x^0))$.

We can then easily check that we have

\[ \|\bar{u}\|_{W^{1,p}(B_r(x^0))} \leq C\|u\|_{W^{1,p}(B_+)} \]

for some constant $C$.

If $\partial U$ is not necessarily flat near $x^0 \in \partial U$, then we can use a $C^1$ diffeomorphism to straighten it out. Indeed, we can pick $r > 0$ and $\gamma \in C^1(\mathbb{R}^{n-1})$ such that

\[ U \cap B_r(p) = \{(x', x_n) \in B_r(p) \mid x_n > \gamma(x')\}. \]

We can then use the $C^1$-diffeomorphism $\Phi: \mathbb{R}^n \to \mathbb{R}^n$ given by

\[ \Phi(x)_i = x_i \quad (i = 1, \ldots, n - 1), \qquad \Phi(x)_n = x_n - \gamma(x_1, \ldots, x_{n-1}). \]

Then since $C^1$ diffeomorphisms induce bounded isomorphisms between $W^{1,p}$ spaces, this gives a local extension.

Since $\partial U$ is compact, we can take a finite number of points $x_i^0 \in \partial U$, sets $W_i$ and extensions $\bar{u}_i \in C^1(W_i)$ extending $u$ such that

\[ \partial U \subseteq \bigcup_{i=1}^N W_i. \]

Further pick $W_0 \Subset U$ so that $U \subseteq \bigcup_{i=0}^N W_i$. Let $\{\zeta_i\}_{i=0}^N$ be a partition of unity subordinate to $\{W_i\}$. Write

\[ \bar{u} = \sum_{i=0}^N \zeta_i \bar{u}_i, \]

where $\bar{u}_0 = u$. Then $\bar{u} \in C^1(\mathbb{R}^n)$, $\bar{u} = u$ on $U$, and we have

\[ \|\bar{u}\|_{W^{1,p}(\mathbb{R}^n)} \leq C\|u\|_{W^{1,p}(U)}. \]

By multiplying $\bar{u}$ by a cut-off, we may assume $\operatorname{supp} \bar{u} \subseteq V$ for some $V \Supset U$.

Now notice that the whole construction is linear in $u$. So we have constructed a bounded linear operator from a dense subset of $W^{1,p}(U)$ to $W^{1,p}(V)$, and there is a unique extension to the whole of $W^{1,p}(U)$ by the completeness of $W^{1,p}(V)$. We can see that the desired properties are preserved by this extension.
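The higher order reflection from the proof can be sanity-checked numerically in one variable. In this sketch the sample function `u` and the step size `h` are arbitrary choices made for illustration, and one-sided difference quotients stand in for the one-sided derivatives at the boundary point $t = 0$.

```python
import math

def u(t):
    """A sample C^1 function on t >= 0 (hypothetical test case)."""
    return math.sin(t) + t * t + 1.0

def u_bar(t):
    """Higher order reflection: u for t >= 0, and -3u(-t) + 4u(-t/2) for t < 0."""
    if t >= 0.0:
        return u(t)
    return -3.0 * u(-t) + 4.0 * u(-t / 2.0)

def u_even(t):
    """Naive even reflection u(|t|), for comparison."""
    return u(abs(t))

h = 1e-6
right_slope = (u_bar(h) - u_bar(0.0)) / h          # one-sided quotient from t > 0
left_slope = (u_bar(0.0) - u_bar(-h)) / h          # one-sided quotient from t < 0
naive_left_slope = (u_even(0.0) - u_even(-h)) / h  # sign flips for the even reflection
```

The two one-sided slopes of `u_bar` agree up to the discretisation error (both are close to $u'(0) = 1$ here). The even reflection gives a left slope close to $-1$, i.e. a kink at the boundary, which is why the plain reflection is not used.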

Trace theorems

A lot of the PDE problems we are interested in are boundary value problems, namely we want to solve a PDE subject to the function taking some prescribed values on the boundary. However, a function $u \in L^p(U)$ is only defined up to sets of measure zero, and $\partial U$ is typically a set of measure zero. So naively, we can't define $u|_{\partial U}$. We would hope that if we require $u$ to have more regularity, then perhaps it now makes sense to define the value at the boundary. This is true, and is given by the trace theorem.

Theorem (Trace theorem). Assume $U$ is bounded and has $C^1$ boundary. Then there exists a bounded linear operator $T: W^{1,p}(U) \to L^p(\partial U)$ for $1 \leq p < \infty$ such that $Tu = u|_{\partial U}$ if $u \in W^{1,p}(U) \cap C(\bar{U})$.

We say $Tu$ is the trace of $u$.

Proof. It suffices to show that the restriction map defined on $C^\infty$ functions is a bounded linear operator, and then we have a unique extension to $W^{1,p}(U)$. The gist of the argument is that Stokes' theorem allows us to express the integral of a function over the boundary as an integral over the whole of $U$. In fact, the proof is indeed just the proof of Stokes' theorem.

By a general partition of unity argument, it suffices to show this in the case where $U = \{x_n > 0\}$ and $u \in C^\infty(\bar{U})$ with $\operatorname{supp} u \subseteq B_R(0) \cap \bar{U}$. Then

\[ \int_{\mathbb{R}^{n-1}} |u(x', 0)|^p \,dx' = -\int_{\mathbb{R}^{n-1}} \int_0^\infty \frac{\partial}{\partial x_n} |u(x', x_n)|^p \,dx_n \,dx' = -\int_U p|u|^{p-1} u_{x_n} \operatorname{sgn} u \,dx_n \,dx'. \]

We estimate this using Young's inequality to get

\[ \int_{\mathbb{R}^{n-1}} |u(x', 0)|^p \,dx' \leq C_p \int_U \left(|u|^p + |u_{x_n}|^p\right) dx \leq C_p \|u\|_{W^{1,p}(U)}^p. \]

So we are done.

We can apply this to each derivative to define trace maps $W^{k,p}(U) \to W^{k-1,p}(\partial U)$.

In general, this trace map is not surjective. So in some sense, we don’t

actually need to use up a whole unit of differentiability. In the example sheet,

we see that in the case p = 2, we only lose “half” a derivative.

Note that $C_c^\infty(U)$ is dense in $W_0^{1,p}(U)$, and the trace vanishes on $C_c^\infty(U)$. So $T$ vanishes on $W_0^{1,p}(U)$. In fact, the converse is true: if $Tu = 0$, then $u \in W_0^{1,p}(U)$.

3.5 Sobolev inequalities

Before we can move on to PDEs, we have to prove some Sobolev inequalities. These are inequalities that compare different norms, and allow us to "trade" different desirable properties. One particularly important thing we can do is to trade differentiability for continuity. So we will know that if $u \in W^{k,p}(U)$ for some large $k$, then in fact $u \in C^m(U)$ for some (small) $m$. The utility of these results is that we would like to construct our solutions in $W^{k,p}$ spaces, since these are easier to work with, but ultimately, we want an actual, smooth solution to our equation. Sobolev inequalities let us do so, since if $u \in W^{k,p}(U)$ for all $k$, then it must be in $C^m$ for every $m$ as well.

To see why we should be expected to be able to do that, consider the space $H_0^1([0, 1])$. A priori, if $u \in H_0^1([0, 1])$, then we only know it exists as some measurable function, and there is no canonical representative of this function. However, we can simply assign

\[ u(x) = \int_0^x u'(t) \,dt, \]

since we know $u'$ is an honest integrable function. This gives a well-defined representative of the function $u$, and even better, we can bound its supremum, since $|u(x)| \leq \|u'\|_{L^1([0,1])} \leq \|u'\|_{L^2([0,1])}$.

Before we start proving our Sobolev inequalities, we first prove the following

lemma:

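Lemma. Let $n \geq 2$ and $f_1, \ldots, f_n \in L^{n-1}(\mathbb{R}^{n-1})$. For $1 \leq i \leq n$, denote

\[ \tilde{x}_i = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n), \]

and set

\[ f(x) = f_1(\tilde{x}_1) \cdots f_n(\tilde{x}_n). \]

Then $f \in L^1(\mathbb{R}^n)$ with

\[ \|f\|_{L^1(\mathbb{R}^n)} \leq \prod_{i=1}^n \|f_i\|_{L^{n-1}(\mathbb{R}^{n-1})}. \]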
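To make the indexing concrete, here is the $n = 3$ instance of the lemma written out. This is only a restatement of the lemma in a special case, added for illustration.

```latex
% n = 3: each factor omits exactly one coordinate, and each norm is an L^2 norm on R^2.
\int_{\mathbb{R}^3} |f_1(x_2, x_3)\, f_2(x_1, x_3)\, f_3(x_1, x_2)| \,dx_1\,dx_2\,dx_3
  \leq \|f_1\|_{L^2(\mathbb{R}^2)} \, \|f_2\|_{L^2(\mathbb{R}^2)} \, \|f_3\|_{L^2(\mathbb{R}^2)} .
```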

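Proof. We proceed by induction on $n$.

If $n = 2$, then this is easy, since

\[ f(x_1, x_2) = f_1(x_2) f_2(x_1), \]

and so

\[ \int_{\mathbb{R}^2} |f(x_1, x_2)| \,dx = \int |f_1(x_2)| \,dx_2 \int |f_2(x_1)| \,dx_1 = \|f_1\|_{L^1(\mathbb{R})} \|f_2\|_{L^1(\mathbb{R})}. \]

Suppose that the result is true for $n \geq 2$, and consider the $n + 1$ case. Write

\[ f(x) = f_{n+1}(\tilde{x}_{n+1}) F(x), \]

where $F(x) = f_1(\tilde{x}_1) \cdots f_n(\tilde{x}_n)$. Then by Hölder's inequality, we have

\[ \int_{x_1, \ldots, x_n} |f(\,\cdot\,, x_{n+1})| \,dx \leq \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \|F(\,\cdot\,, x_{n+1})\|_{L^{n/(n-1)}(\mathbb{R}^n)}. \]

We now apply the induction hypothesis to

\[ f_1^{n/(n-1)}(\,\cdot\,, x_{n+1}) \, f_2^{n/(n-1)}(\,\cdot\,, x_{n+1}) \cdots f_n^{n/(n-1)}(\,\cdot\,, x_{n+1}). \]

This gives

\[ \int_{x_1, \ldots, x_n} |f(\,\cdot\,, x_{n+1})| \,dx \leq \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \left(\prod_{i=1}^n \left\|f_i^{n/(n-1)}(\,\cdot\,, x_{n+1})\right\|_{L^{n-1}(\mathbb{R}^{n-1})}\right)^{(n-1)/n} = \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \prod_{i=1}^n \|f_i(\,\cdot\,, x_{n+1})\|_{L^n(\mathbb{R}^{n-1})}. \]

Now integrate over $x_{n+1}$. We get

\[ \|f\|_{L^1(\mathbb{R}^{n+1})} \leq \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \int_{x_{n+1}} \prod_{i=1}^n \|f_i(\,\cdot\,, x_{n+1})\|_{L^n(\mathbb{R}^{n-1})} \,dx_{n+1} \]

\[ \leq \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \prod_{i=1}^n \left(\int_{x_{n+1}} \|f_i(\,\cdot\,, x_{n+1})\|_{L^n(\mathbb{R}^{n-1})}^n \,dx_{n+1}\right)^{1/n} = \|f_{n+1}\|_{L^n(\mathbb{R}^n)} \prod_{i=1}^n \|f_i\|_{L^n(\mathbb{R}^n)}. \]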

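Theorem (Gagliardo–Nirenberg–Sobolev inequality). Assume $n > p$. Then we have

\[ W^{1,p}(\mathbb{R}^n) \subseteq L^{p^*}(\mathbb{R}^n), \]

where

\[ p^* = \frac{np}{n - p} > p, \]

and there exists $c > 0$ depending on $n, p$ such that

\[ \|u\|_{L^{p^*}(\mathbb{R}^n)} \leq c\|u\|_{W^{1,p}(\mathbb{R}^n)}. \]

In other words, $W^{1,p}(\mathbb{R}^n)$ is continuously embedded in $L^{p^*}(\mathbb{R}^n)$.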

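Proof. Assume $u \in C_c^\infty(\mathbb{R}^n)$, and consider $p = 1$. Since the support is compact,

\[ u(x) = \int_{-\infty}^{x_i} u_{x_i}(x_1, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_n) \,dy_i. \]

So we know that

\[ |u(x)| \leq \int_{-\infty}^\infty |Du(x_1, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_n)| \,dy_i \equiv f_i(\tilde{x}_i). \]

Thus, applying this once in each direction, we obtain

\[ |u(x)|^{n/(n-1)} \leq \prod_{i=1}^n f_i(\tilde{x}_i)^{1/(n-1)}. \]

If we integrate and then use the lemma, we see that

\[ \left(\|u\|_{L^{n/(n-1)}(\mathbb{R}^n)}\right)^{n/(n-1)} \leq C \prod_{i=1}^n \left\|f_i^{1/(n-1)}\right\|_{L^{n-1}(\mathbb{R}^{n-1})} = C\|Du\|_{L^1(\mathbb{R}^n)}^{n/(n-1)}. \]

So

\[ \|u\|_{L^{n/(n-1)}(\mathbb{R}^n)} \leq C\|Du\|_{L^1(\mathbb{R}^n)}. \]

Since $C_c^\infty(\mathbb{R}^n)$ is dense in $W^{1,1}(\mathbb{R}^n)$, the result for $p = 1$ follows.

Now suppose $p > 1$. We apply the $p = 1$ case to

\[ v = |u|^\gamma \]

for some $\gamma > 1$, which we choose later. Then we have

\[ Dv = \gamma \operatorname{sgn} u \cdot |u|^{\gamma - 1} Du. \]

So

\[ \left(\int_{\mathbb{R}^n} |u|^{\frac{\gamma n}{n-1}} \,dx\right)^{\frac{n-1}{n}} \leq \gamma \int_{\mathbb{R}^n} |u|^{\gamma - 1} |Du| \,dx \leq \gamma \left(\int_{\mathbb{R}^n} |u|^{(\gamma - 1)\frac{p}{p-1}} \,dx\right)^{\frac{p-1}{p}} \left(\int_{\mathbb{R}^n} |Du|^p \,dx\right)^{\frac{1}{p}}. \]

We choose $\gamma$ such that

\[ \frac{\gamma n}{n - 1} = \frac{(\gamma - 1)p}{p - 1}. \]

So we should pick

\[ \gamma = \frac{p(n - 1)}{n - p} > 1. \]

Then we have

\[ \frac{\gamma n}{n - 1} = \frac{np}{n - p} = p^*. \]

So

\[ \left(\int_{\mathbb{R}^n} |u|^{p^*} \,dx\right)^{\frac{n-1}{n}} \leq \frac{p(n - 1)}{n - p} \left(\int_{\mathbb{R}^n} |u|^{p^*} \,dx\right)^{\frac{p-1}{p}} \|Du\|_{L^p(\mathbb{R}^n)}. \]

Since $\frac{n-1}{n} - \frac{p-1}{p} = \frac{1}{p^*}$, this gives

\[ \left(\int_{\mathbb{R}^n} |u|^{p^*} \,dx\right)^{1/p^*} \leq \frac{p(n - 1)}{n - p} \|Du\|_{L^p(\mathbb{R}^n)}. \]

This argument is valid for $u \in C_c^\infty(\mathbb{R}^n)$, and by approximation, we can extend to $W^{1,p}(\mathbb{R}^n)$.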
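The exponent bookkeeping in the $p > 1$ step is easy to get wrong, so here is a small sketch that checks it in exact rational arithmetic. The function name `gns_exponents` is made up for this illustration, and the sample values of $n$ and $p$ are arbitrary.

```python
from fractions import Fraction

def gns_exponents(n, p):
    """Return (gamma, p_star) for 1 <= p < n, using the formulas from the proof above."""
    n, p = Fraction(n), Fraction(p)
    if not (1 <= p < n):
        raise ValueError("need 1 <= p < n")
    gamma = p * (n - 1) / (n - p)
    p_star = n * p / (n - p)
    return gamma, p_star

# Hypothetical sample values: n = 3, p = 2.
n, p = Fraction(3), Fraction(2)
gamma, p_star = gns_exponents(n, p)

# Both powers of |u| appearing in the Hoelder step should equal p*.
lhs_power = gamma * n / (n - 1)
rhs_power = (gamma - 1) * p / (p - 1)

# The two outer exponents should differ by exactly 1 / p*.
exponent_gap = (n - 1) / n - (p - 1) / p
```

For $n = 3$, $p = 2$ this gives $\gamma = 4$ and $p^* = 6$, so $H^1(\mathbb{R}^3) \subseteq L^6(\mathbb{R}^3)$, and the three consistency checks in the proof hold exactly.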

We can deduce some corollaries of this result:

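Corollary. Suppose $U \subseteq \mathbb{R}^n$ is open and bounded with $C^1$-boundary, and $1 \leq p < n$. Then if $p^* = \frac{np}{n - p}$, we have

\[ W^{1,p}(U) \subseteq L^{p^*}(U), \]

and there exists $C = C(U, p, n)$ such that

\[ \|u\|_{L^{p^*}(U)} \leq C\|u\|_{W^{1,p}(U)}. \]

Proof. By the extension theorem, we can find $\bar{u} \in W^{1,p}(\mathbb{R}^n)$ with $\bar{u} = u$ almost everywhere on $U$ and

\[ \|\bar{u}\|_{W^{1,p}(\mathbb{R}^n)} \leq C\|u\|_{W^{1,p}(U)}. \]

Then we have

\[ \|u\|_{L^{p^*}(U)} \leq \|\bar{u}\|_{L^{p^*}(\mathbb{R}^n)} \leq c\|\bar{u}\|_{W^{1,p}(\mathbb{R}^n)} \leq \tilde{C}\|u\|_{W^{1,p}(U)}. \]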

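Corollary. Suppose $U$ is open and bounded, and suppose $u \in W_0^{1,p}(U)$ for some $1 \leq p < n$. Then we have the estimates

\[ \|u\|_{L^q(U)} \leq C\|Du\|_{L^p(U)} \]

for any $q \in [1, p^*]$. In particular,

\[ \|u\|_{L^p(U)} \leq C\|Du\|_{L^p(U)}. \]

Proof. Since $u \in W_0^{1,p}(U)$, there exist $u_m \in C_c^\infty(U)$ converging to $u$ in $W^{1,p}(U)$. Extending $u_m$ to vanish on $U^c$, we have $u_m \in C_c^\infty(\mathbb{R}^n)$. Applying Gagliardo–Nirenberg–Sobolev, we find that

\[ \|u_m\|_{L^{p^*}(\mathbb{R}^n)} \leq C\|Du_m\|_{L^p(\mathbb{R}^n)}. \]

So we know that

\[ \|u_m\|_{L^{p^*}(U)} \leq C\|Du_m\|_{L^p(U)}. \]

Sending $m \to \infty$, we obtain

\[ \|u\|_{L^{p^*}(U)} \leq C\|Du\|_{L^p(U)}. \]

Since $U$ is bounded, by Hölder, we have

\[ \left(\int_U |u|^q \,dx\right)^{1/q} \leq \left(\int_U 1 \,dx\right)^{1/(rq)} \left(\int_U |u|^{qs} \,dx\right)^{1/(sq)} \leq C\|u\|_{L^{p^*}(U)} \]

provided $q \leq p^*$, where we choose $s$ such that $qs = p^*$, and $r$ such that $\frac{1}{r} + \frac{1}{s} = 1$.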

The previous results were about the case $n > p$. If $n < p < \infty$, then we might hope that if $u \in W^{1,p}(\mathbb{R}^n)$, then $u$ is "better than $L^\infty$".

Theorem (Morrey's inequality). Suppose $n < p < \infty$. Then there exists a constant $C$ depending only on $p$ and $n$ such that

\[ \|u\|_{C^{0,\gamma}(\mathbb{R}^n)} \leq C\|u\|_{W^{1,p}(\mathbb{R}^n)} \]

for all $u \in C_c^\infty(\mathbb{R}^n)$, where $C = C(p, n)$ and $\gamma = 1 - \frac{n}{p} < 1$.
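The simplest instance, $n = 1$ and $p = 2$ (so $\gamma = 1/2$), can be seen directly from the fundamental theorem of calculus and Cauchy–Schwarz. The following worked computation is an added sketch for $u \in C_c^\infty(\mathbb{R})$ and $x < y$; it is not the proof given in the lectures.

```latex
|u(y) - u(x)|
  = \left| \int_x^y u'(t) \,dt \right|
  \leq \left( \int_x^y 1 \,dt \right)^{1/2} \left( \int_x^y |u'(t)|^2 \,dt \right)^{1/2}
  \leq |x - y|^{1/2} \, \|u'\|_{L^2(\mathbb{R})} .
% Hence the Hoelder semi-norm of index 1/2 is bounded by \|u'\|_{L^2(\mathbb{R})}.
```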

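Proof. We first prove the Hölder part of the estimate.

Let $Q$ be an open cube of side length $r > 0$ and containing 0. Define

\[ \bar{u} = \frac{1}{|Q|} \int_Q u(x) \,dx. \]

Then

\[ |\bar{u} - u(0)| = \left|\frac{1}{|Q|} \int_Q [u(x) - u(0)] \,dx\right| \leq \frac{1}{|Q|} \int_Q |u(x) - u(0)| \,dx. \]

Note that

\[ u(x) - u(0) = \int_0^1 \frac{d}{dt} u(tx) \,dt = \sum_i \int_0^1 x_i \frac{\partial u}{\partial x_i}(tx) \,dt. \]

So, since $|x_i| \leq r$ for $x \in Q$,

\[ |u(x) - u(0)| \leq r \int_0^1 \sum_i \left|\frac{\partial u}{\partial x_i}(tx)\right| dt. \]

So we have

\[ |\bar{u} - u(0)| \leq \frac{r}{|Q|} \int_Q \int_0^1 \sum_i \left|\frac{\partial u}{\partial x_i}(tx)\right| dt \,dx = \frac{r}{|Q|} \int_0^1 t^{-n} \left(\int_{tQ} \sum_i \left|\frac{\partial u}{\partial x_i}(y)\right| dy\right) dt \]

\[ \leq \frac{r}{|Q|} \int_0^1 t^{-n} \sum_{i=1}^n \left\|\frac{\partial u}{\partial x_i}\right\|_{L^p(tQ)} |tQ|^{1/p'} \,dt, \]

where $\frac{1}{p} + \frac{1}{p'} = 1$.

Using that $|Q| = r^n$, we obtain

\[ |\bar{u} - u(0)| \leq c r^{1 - n + \frac{n}{p'}} \|Du\|_{L^p(\mathbb{R}^n)} \int_0^1 t^{-n + \frac{n}{p'}} \,dt \leq \frac{c}{1 - n/p} r^{1 - n/p} \|Du\|_{L^p(\mathbb{R}^n)}. \]

Note that the right-hand side tends to 0 as $r \to 0$. So when we take $r$ to be very small, we see that $u(0)$ is close to the average value of $u$ around 0.

Indeed, suppose $x, y \in \mathbb{R}^n$ with $|x - y| = \frac{r}{2}$. Pick a box $Q$ containing $x$ and $y$ of side length $r$. Applying the above result, shifted so that $x, y$ play the role of 0, we can estimate

\[ |u(x) - u(y)| \leq |u(x) - \bar{u}| + |u(y) - \bar{u}| \leq \tilde{C} r^{1 - n/p} \|Du\|_{L^p(\mathbb{R}^n)}. \]

Since $r = 2|x - y|$, it follows that

\[ \frac{|u(x) - u(y)|}{|x - y|^{1 - n/p}} \leq \tilde{C} \cdot 2^{1 - n/p} \|Du\|_{L^p(\mathbb{R}^n)}. \]

So we conclude that $[u]_{C^{0,\gamma}(\mathbb{R}^n)} \leq C\|Du\|_{L^p(\mathbb{R}^n)}$.

Finally, to see that $u$ is bounded, any $x \in \mathbb{R}^n$ belongs to some cube $Q$ of side length 1. So we have

\[ |u(x)| \leq |u(x) - \bar{u} + \bar{u}| \leq |\bar{u}| + C\|Du\|_{L^p(\mathbb{R}^n)}. \]

But also

\[ |\bar{u}| \leq \int_Q |u(x)| \,dx \leq \|u\|_{L^p(\mathbb{R}^n)} \|1\|_{L^{p'}(Q)} = \|u\|_{L^p(\mathbb{R}^n)}. \]

So we are done.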

Corollary. Suppose $u \in W^{1,p}(U)$ for $U$ open, bounded with $C^1$ boundary. Then there exists $u^* \in C^{0,\gamma}(U)$ such that $u = u^*$ almost everywhere and $\|u^*\|_{C^{0,\gamma}(U)} \leq C\|u\|_{W^{1,p}(U)}$.

By applying these results iteratively, we can establish higher order versions

\[ W^{k,p} \subseteq L^q(U) \]

with some appropriate $q$.

4 Elliptic boundary value problems

4.1 Existence of weak solutions

In this chapter, we are going to study second-order elliptic boundary value

problems. The canonical example to keep in mind is the following:

Example. Suppose $U \subseteq \mathbb{R}^n$ is a bounded open set with smooth boundary. Suppose $\partial U$ is a perfect conductor and $\rho: U \to \mathbb{R}$ is the charge density inside $U$. The electrostatic potential $\phi$ satisfies

\[ \Delta \phi = \rho \text{ on } U, \qquad \phi|_{\partial U} = 0. \]

This is an example of an elliptic boundary value problem. Note that we cannot tackle this with the Cauchy–Kovalevskaya theorem, since we don't even have enough boundary conditions, and also because we want an everywhere-defined solution.

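In general, let $U \subseteq \mathbb{R}^n$ be open and bounded with $C^1$ boundary, and for $u \in C^2(\bar{U})$, we define

\[ Lu = -\sum_{i,j=1}^n (a^{ij}(x) u_{x_j})_{x_i} + \sum_{i=1}^n b^i(x) u_{x_i} + c(x)u, \]

where $a^{ij}$, $b^i$ and $c$ are given functions defined on $U$. Typically, we will assume they are at least $L^\infty$, but sometimes we will require more.

If $a^{ij} \in C^1(U)$, then we can rewrite this as

\[ Lu = -\sum_{i,j=1}^n a^{ij}(x) u_{x_i x_j} + \sum_{i=1}^n \tilde{b}^i(x) u_{x_i} + c(x)u \]

for some $\tilde{b}^i$, using the product rule.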

We will mostly use the first form, called the divergence form, which is suitable

for the energy method, while the second (non-divergence form) is suited to the

maximum principle. Essentially, what makes the divergence form convenient for

us is that it’s easy to integrate by parts.

Of course, given the title of the chapter, we assume that $L$ is elliptic, i.e.

\[ \sum_{i,j} a^{ij}(x) \xi_i \xi_j \geq 0 \]

for all $x \in U$ and $\xi \in \mathbb{R}^n$.

It turns out this is not quite strong enough, because this condition allows the $a^{ij}$'s to be degenerate, or vanish at the boundary.

Definition (Uniform ellipticity). An operator

\[ Lu = -\sum_{i,j=1}^n (a^{ij}(x) u_{x_j})_{x_i} + \sum_{i=1}^n b^i(x) u_{x_i} + c(x)u \]

is uniformly elliptic if

\[ \sum_{i,j=1}^n a^{ij}(x) \xi_i \xi_j \geq \theta|\xi|^2 \]

for some $\theta > 0$ and all $x \in U$, $\xi \in \mathbb{R}^n$.

We shall consider the boundary value problem

Lu = f on U

u = 0 on ∂U.

This form of the equation is not very amenable to study by functional analytic methods. Similar to what we did in the proof of Picard–Lindelöf, we want to write this in a weak formulation.

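Let's suppose $u \in C^2(\bar{U})$ is a solution, and suppose $v \in C^2(\bar{U})$ also satisfies $v|_{\partial U} = 0$. Multiply the equation $Lu = f$ by $v$ and integrate by parts. Then we get

\[ \int_U vf \,dx = \int_U \left(\sum_{i,j} v_{x_i} a^{ij} u_{x_j} + \sum_i b^i u_{x_i} v + cuv\right) dx \equiv B[u, v]. \quad (2) \]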

Conversely, suppose $u \in C^2(\bar{U})$ and $u|_{\partial U} = 0$. If $\int_U vf \,dx = B[u, v]$ for all $v \in C^2(\bar{U})$ such that $v|_{\partial U} = 0$, then we claim $u$ in fact solves the original equation.

Indeed, undoing the integration by parts, we conclude that

\[ \int_U v Lu \,dx = \int_U vf \,dx \]

for all $v \in C^2(\bar{U})$ with $v|_{\partial U} = 0$. But if this is true for all $v$, then it must be that $Lu = f$.

Thus, the PDE problem we started with is equivalent to finding $u$ that solves $B[u, v] = \int_U vf \,dx$ for all suitable $v$, provided $u$ is regular enough.

But the point is that (2) makes sense for $u, v \in H_0^1(U)$. So our strategy is to first show that we can find $u \in H_0^1(U)$ that solves (2), and then hope that under reasonable assumptions, we can show that any such solution must in fact be $C^2(\bar{U})$.

Definition (Weak solution). We say $u \in H_0^1(U)$ is a weak solution of

\[ Lu = f \text{ on } U, \qquad u = 0 \text{ on } \partial U \]

for $f \in L^2(U)$ if

\[ B[u, v] = (f, v)_{L^2(U)} \]

for all $v \in H_0^1(U)$.

We'll exploit the Hilbert space structure of $H_0^1(U)$ to find weak solutions.

Theorem (Lax–Milgram theorem). Let $H$ be a real Hilbert space with inner product $(\,\cdot\,, \,\cdot\,)$. Suppose $B: H \times H \to \mathbb{R}$ is a bilinear mapping such that there exist constants $\alpha, \beta > 0$ so that

– $|B[u, v]| \leq \alpha\|u\|\|v\|$ for all $u, v \in H$ (boundedness)

– $\beta\|u\|^2 \leq B[u, u]$ (coercivity)

Then if $f: H \to \mathbb{R}$ is a bounded linear map, then there exists a unique $u \in H$ such that

\[ B[u, v] = \langle f, v \rangle \]

for all $v \in H$.

Note that if $B$ is just the inner product, then this is the Riesz representation theorem.

Proof. By the Riesz representation theorem, we may assume that there is some $w$ such that

\[ \langle f, v \rangle = (w, v). \]

For each fixed $u \in H$, the map

\[ v \mapsto B[u, v] \]

is a bounded linear functional on $H$. So by the Riesz representation theorem, we can find some $Au$ such that

\[ B[u, v] = (Au, v). \]

It then suffices to show that $A$ is invertible, for then we can take $u = A^{-1}w$.

– Since $B$ is bilinear, it is immediate that $A: H \to H$ is linear.

– $A$ is bounded, since we have

\[ \|Au\|^2 = (Au, Au) = B[u, Au] \leq \alpha\|u\|\|Au\|. \]

– $A$ is injective and has closed image. Indeed, by coercivity, we know

\[ \beta\|u\|^2 \leq B[u, u] = (Au, u) \leq \|Au\|\|u\|. \]

Dividing by $\|u\|$, we see that $A$ is bounded below, hence is injective and has closed image (since $H$ is complete).

(Indeed, injectivity is clear, and if $Au_m \to v$ for some $v$, then $\|u_m - u_n\| \leq \frac{1}{\beta}\|Au_m - Au_n\| \to 0$ as $m, n \to \infty$. So $(u_m)$ is Cauchy, and hence has a limit $u$. Then by continuity, $Au = v$, and in particular, $v \in \operatorname{im} A$.)

– Since $\operatorname{im} A$ is closed, we know

\[ H = \operatorname{im} A \oplus (\operatorname{im} A)^\perp. \]

Now let $w \in (\operatorname{im} A)^\perp$. Then we can estimate

\[ \beta\|w\|^2 \leq B[w, w] = (Aw, w) = 0. \]

So $w = 0$. Thus, in fact $(\operatorname{im} A)^\perp = \{0\}$, and so $A$ is surjective.
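In finite dimensions the theorem reduces to solving a linear system, which makes the roles of $A$, coercivity and the bound $\|u\| \leq \|f\|/\beta$ easy to see. The following sketch takes $H = \mathbb{R}^2$ with a non-symmetric matrix chosen here purely for illustration; the helper names `solve_2x2` and `bilinear` are not from the notes.

```python
import math

def solve_2x2(m, f):
    """Solve the 2x2 linear system m u = f by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return ((d * f[0] - b * f[1]) / det, (a * f[1] - c * f[0]) / det)

def bilinear(m, u, v):
    """B[u, v] = (M u, v) with the Euclidean inner product on R^2."""
    mu = (m[0][0] * u[0] + m[0][1] * u[1], m[1][0] * u[0] + m[1][1] * u[1])
    return mu[0] * v[0] + mu[1] * v[1]

# Hypothetical example: symmetric part 2*I (so B[u, u] = 2|u|^2, beta = 2) plus a skew part.
M = ((2.0, 1.0), (-1.0, 2.0))
f = (1.0, 3.0)  # the functional v -> f . v

u = solve_2x2(M, f)
norm_u = math.hypot(u[0], u[1])
norm_f = math.hypot(f[0], f[1])
```

Here `u` is approximately $(-0.2, 1.4)$, it satisfies $B[u, v] = f \cdot v$ on the standard basis vectors, and $\|u\| \leq \|f\|/\beta$ with $\beta = 2$. The bilinear form is not symmetric, so this is a genuine Lax–Milgram situation rather than just the Riesz representation theorem.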

We would like to apply this to our elliptic PDE. To do so, we need to prove that our $B$ satisfies boundedness and coercivity. Unfortunately, this is not always true.

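Theorem (Energy estimates for $B$). Suppose $a^{ij} = a^{ji}, b^i, c \in L^\infty(U)$, and there exists $\theta > 0$ such that

\[ \sum_{i,j=1}^n a^{ij}(x) \xi_i \xi_j \geq \theta|\xi|^2 \]

for almost every $x \in U$ and all $\xi \in \mathbb{R}^n$. Then if $B$ is defined by

\[ B[u, v] = \int_U \left(\sum_{i,j} v_{x_i} a^{ij} u_{x_j} + \sum_i b^i u_{x_i} v + cuv\right) dx, \]

then there exist $\alpha, \beta > 0$ and $\gamma \geq 0$ such that

(i) $|B[u, v]| \leq \alpha\|u\|_{H^1(U)}\|v\|_{H^1(U)}$ for all $u, v \in H_0^1(U)$

(ii) $\beta\|u\|_{H^1(U)}^2 \leq B[u, u] + \gamma\|u\|_{L^2(U)}^2$.

Moreover, if $b^i \equiv 0$ and $c \geq 0$, then we can take $\gamma = 0$.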

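Proof.

(i) We estimate

\[ |B[u, v]| \leq \sum_{i,j} \|a^{ij}\|_{L^\infty(U)} \int_U |Du||Dv| \,dx + \sum_i \|b^i\|_{L^\infty(U)} \int_U |Du||v| \,dx + \|c\|_{L^\infty(U)} \int_U |u||v| \,dx \]

\[ \leq c_1\|Du\|_{L^2(U)}\|Dv\|_{L^2(U)} + c_2\|Du\|_{L^2(U)}\|v\|_{L^2(U)} + c_3\|u\|_{L^2(U)}\|v\|_{L^2(U)} \leq \alpha\|u\|_{H^1(U)}\|v\|_{H^1(U)} \]

for some $\alpha$.

(ii) We start from uniform ellipticity. This implies

\[ \theta \int_U |Du|^2 \,dx \leq \int_U \sum_{i,j=1}^n a^{ij}(x) u_{x_i} u_{x_j} \,dx = B[u, u] - \int_U \left(\sum_{i=1}^n b^i u_{x_i} u + cu^2\right) dx \]

\[ \leq B[u, u] + \sum_{i=1}^n \|b^i\|_{L^\infty(U)} \int_U |Du||u| \,dx + \|c\|_{L^\infty(U)} \int_U |u|^2 \,dx. \]

Now by Young's inequality, we have

\[ \int_U |Du||u| \,dx \leq \varepsilon \int_U |Du|^2 \,dx + \frac{1}{4\varepsilon} \int_U |u|^2 \,dx \]

for any $\varepsilon > 0$. We choose $\varepsilon$ small enough so that

\[ \varepsilon \sum_{i=1}^n \|b^i\|_{L^\infty(U)} \leq \frac{\theta}{2}. \]

So we have

\[ \theta \int_U |Du|^2 \,dx \leq B[u, u] + \frac{\theta}{2} \int_U |Du|^2 \,dx + \gamma \int_U |u|^2 \,dx \]

for some $\gamma$. This implies

\[ \frac{\theta}{2}\|Du\|_{L^2(U)}^2 \leq B[u, u] + \gamma\|u\|_{L^2(U)}^2. \]

We can add $\frac{\theta}{2}\|u\|_{L^2(U)}^2$ on both sides to get the desired bound on $\|u\|_{H^1(U)}$.

To get the "moreover" statement, we see that under these conditions, we have

\[ \theta \int_U |Du|^2 \,dx \leq B[u, u]. \]

Then we apply the Poincaré inequality, which tells us there is some $C > 0$ such that for all $u \in H_0^1(U)$, we have

\[ \|u\|_{L^2(U)} \leq C\|Du\|_{L^2(U)}. \]

The estimate (ii) is sometimes called Gårding's inequality.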
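The form of Young's inequality used in part (ii) follows from expanding a square. For reference, here is the one-line derivation for real numbers $a, b \geq 0$ and $\varepsilon > 0$ (added here; it is applied pointwise with $a = |Du|$ and $b = |u|$ and then integrated over $U$).

```latex
0 \leq \left( \sqrt{2\varepsilon}\, a - \frac{b}{\sqrt{2\varepsilon}} \right)^2
  = 2\varepsilon a^2 - 2ab + \frac{b^2}{2\varepsilon}
\quad\Longrightarrow\quad
ab \leq \varepsilon a^2 + \frac{1}{4\varepsilon} b^2 .
```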

Theorem. Let $U, L$ be as above. There is a $\gamma \geq 0$ such that for any $\mu \geq \gamma$ and any $f \in L^2(U)$, there exists a unique weak solution to

\[ Lu + \mu u = f \text{ on } U, \qquad u = 0 \text{ on } \partial U. \]

Moreover, we have

\[ \|u\|_{H^1(U)} \leq C\|f\|_{L^2(U)} \]

for some $C = C(L, U) \geq 0$.

Again, if $b^i \equiv 0$ and $c \geq 0$, then we may take $\gamma = 0$.

Proof. Take $\gamma$ from the previous theorem when applied to $L$. Then if $\mu \geq \gamma$, we set

\[ B_\mu[u, v] = B[u, v] + \mu(u, v)_{L^2(U)}. \]

This is the bilinear form corresponding to the operator

\[ L_\mu = L + \mu. \]

Then by the previous theorem, $B_\mu$ satisfies boundedness and coercivity. So if we fix any $f \in L^2$, and think of it as an element of $H_0^1(U)^*$ by

\[ \langle f, v \rangle = (f, v)_{L^2(U)} = \int_U fv \,dx, \]

then we can apply Lax–Milgram to find a unique $u \in H_0^1(U)$ satisfying $B_\mu[u, v] = \langle f, v \rangle = (f, v)_{L^2(U)}$ for all $v \in H_0^1(U)$. This is precisely the condition for $u$ to be a weak solution.

Finally, the Gårding inequality tells us

\[ \beta\|u\|_{H^1(U)}^2 \leq B_\mu[u, u] = (f, u)_{L^2(U)} \leq \|f\|_{L^2(U)}\|u\|_{L^2(U)}. \]

So we know that

\[ \beta\|u\|_{H^1(U)} \leq \|f\|_{L^2(U)}. \]
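To see the weak formulation in action, here is a minimal Galerkin sketch for the model problem $-u'' = 1$ on $(0, 1)$ with $u(0) = u(1) = 0$. In this case $B[u, v] = \int_0^1 u'v' \,dx$ and $\gamma = 0$. The discretisation by piecewise linear "hat" functions on a uniform mesh is an assumption of this sketch and not something covered in the lectures, and the function names are made up for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm. lower[i] multiplies x[i-1] and upper[i] multiplies x[i+1] in row i."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def galerkin_poisson(n_interior):
    """Galerkin solution of B[u, v] = (1, v) over the span of n_interior hat functions."""
    h = 1.0 / (n_interior + 1)
    lower = [-1.0 / h] * n_interior   # B[phi_{i-1}, phi_i]
    diag = [2.0 / h] * n_interior     # B[phi_i, phi_i]
    upper = [-1.0 / h] * n_interior   # B[phi_{i+1}, phi_i]
    rhs = [h] * n_interior            # (1, phi_i) = area under a hat function
    return h, solve_tridiagonal(lower, diag, upper, rhs)

h, nodal = galerkin_poisson(9)
exact = [(i * h) * (1.0 - i * h) / 2.0 for i in range(1, 10)]
max_error = max(abs(a - b) for a, b in zip(nodal, exact))

# Energy identity B[u_h, u_h] = (f, u_h), the discrete analogue of the last display above.
values = [0.0] + nodal + [0.0]
energy = sum((values[i + 1] - values[i]) ** 2 for i in range(len(values) - 1)) / h
load = h * sum(nodal)
```

For this particular one-dimensional problem the Galerkin solution agrees with the classical solution $u(x) = x(1 - x)/2$ at the mesh nodes, so `max_error` is at rounding level. In general one only expects convergence as the mesh is refined.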

In some way, this is a magical result. We managed to solve a PDE without

having to actually work with a PDE. There are a few things we might object

to. First of all, we only obtained a weak solution, and not a genuine solution.

We will show that under some reasonable assumptions on $a, b, c$, if $f$ is better behaved, then $u$ is also better behaved, and in general, if $f \in H^k$, then $u \in H^{k+2}$. This is known as elliptic regularity. Together with Sobolev inequalities, this tells us $u$ is genuinely a classical solution.

Another problem is the presence of the $\mu$. We noted that if $L$ is, say, the Laplacian, then we can take $\gamma = 0$ (and hence $\mu = 0$), and so we don't have this problem. But in general, this theorem requires it, and this is a bit unsatisfactory. We would like to think a bit more about it.

4.2 The Fredholm alternative

To understand the second problem, we shall seek to prove the following theorem:

Theorem (Fredholm alternative). Consider the problem

\[ Lu = f, \quad u|_{\partial U} = 0. \quad (*) \]

For $L$ a uniformly elliptic operator on an open bounded set $U$ with $C^1$ boundary, either

(i) For each $f \in L^2(U)$, there is a unique weak solution $u \in H_0^1(U)$ to $(*)$; or

(ii) There exists a non-zero weak solution $u \in H_0^1(U)$ to the homogeneous problem, i.e. $(*)$ with $f = 0$.

This is similar to what we know about solving matrix equations $Ax = b$: either there is a solution for all $b$, or there are infinitely many solutions to the homogeneous problem.

Similar to the previous theorem, this follows from some general functional

analytic result. Recall the definition of a compact operator:

Definition (Compact operator). A bounded operator $K: H \to H$ is compact if every bounded sequence $(u_m)_{m=1}^\infty$ has a subsequence $(u_{m_j})$ such that $(Ku_{m_j})_{j=1}^\infty$ converges strongly in $H$.

Recall (or prove as an exercise) the following theorem regarding compact

operators.

Theorem (Fredholm alternative). Let $H$ be a Hilbert space and $K : H \to H$ be a compact operator. Then

(i) $\ker(I - K)$ is finite-dimensional.

(ii) $\operatorname{im}(I - K)$ is closed.

(iii) $\operatorname{im}(I - K) = \ker(I - K^\dagger)^\perp$.

(iv) $\ker(I - K) = \{0\}$ iff $\operatorname{im}(I - K) = H$.

(v) $\dim \ker(I - K) = \dim \ker(I - K^\dagger) = \dim \operatorname{coker}(I - K)$.

How do we apply this to our situation? Our previous theorem told us that $L + \gamma$ is invertible for large $\gamma$, and we claim that $(L + \gamma)^{-1}$ is compact. We can then deduce the previous result by applying (iv) of the Fredholm alternative with $K$ a (scalar multiple of) $(L + \gamma)^{-1}$ (plus some bookkeeping).

So let us show that $(L + \gamma)^{-1}$ is compact. Note that this map sends $f \in L^2(U)$ to $u \in H_0^1(U)$. To make it an endomorphism, we have to compose this with the inclusion $H_0^1(U) \to L^2(U)$. The proof that $(L + \gamma)^{-1}$ is compact will not involve $(L + \gamma)^{-1}$ in any way: we shall show that the inclusion $H_0^1(U) \to L^2(U)$ is compact!

We shall prove this in two steps. First, we need the notion of weak convergence.

Definition (Weak convergence). Suppose $(u_n)_{n=1}^\infty$ is a sequence in a Hilbert space $H$. We say $u_n$ converges weakly to $u \in H$ if
\[ (u_n, w) \to (u, w) \]
for all $w \in H$. We write $u_n \rightharpoonup u$.

Of course, we have

Lemma. Weak limits are unique.

Lemma. Strong convergence implies weak convergence.
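The converse of the second lemma fails in infinite dimensions. A standard example, sketched here as an illustration rather than taken from the lectures, is an orthonormal sequence $(e_n)$ in $H$:

```latex
% (e_n) is assumed orthonormal in H. For any w in H, Bessel's inequality gives
\sum_{n=1}^\infty |(e_n, w)|^2 \le \|w\|^2 < \infty
\quad\Longrightarrow\quad (e_n, w) \to 0 = (0, w),
% so e_n converges weakly to 0. On the other hand
\|e_n - 0\| = 1 \quad \text{for all } n,
% so e_n does not converge strongly to 0.
```

Since weak limits are unique and strong convergence implies weak convergence, any strong limit of a subsequence would have to be $0$, so no subsequence converges strongly either.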

We shall show that given any bounded sequence in $H_0^1(U)$, we can find a subsequence that is weakly convergent. We then show that every weakly convergent sequence in $H_0^1(U)$ is strongly convergent in $L^2(U)$.

In fact, the first result is completely general:

Theorem (Weak compactness). Let $H$ be a separable Hilbert space, and suppose $(u_m)_{m=1}^\infty$ is a bounded sequence in $H$ with $\|u_m\| \le K$ for all $m$. Then $u_m$ admits a subsequence $(u_{m_j})_{j=1}^\infty$ such that $u_{m_j} \rightharpoonup u$ for some $u \in H$ with $\|u\| \le K$.

One can prove this theorem without assuming $H$ is separable, but it is slightly messier.

Proof. Let $(e_i)_{i=1}^\infty$ be an orthonormal basis for $H$. Consider $(e_1, u_m)$. By Cauchy–Schwarz, we have
\[ |(e_1, u_m)| \le \|e_1\| \|u_m\| \le K. \]
So by Bolzano–Weierstrass, there exists a subsequence $(u_{m_j})$ such that $(e_1, u_{m_j})$ converges.

Doing this iteratively (and passing to a diagonal subsequence), we can find a subsequence $(v_\ell)$ such that for each $i$, there is some $c_i$ such that $(e_i, v_\ell) \to c_i$ as $\ell \to \infty$.

We would expect the weak limit to be $\sum_i c_i e_i$. To prove this, we need to first show it converges. For any $p$, we have
\[ \sum_{j=1}^p |c_j|^2 = \lim_{k \to \infty} \sum_{j=1}^p |(e_j, v_k)|^2 \le \sup_k \sum_{j=1}^p |(e_j, v_k)|^2 \le \sup_k \|v_k\|^2 \le K^2 \]
using Bessel's inequality. So
\[ u = \sum_{j=1}^\infty c_j e_j \]
converges in $H$, and $\|u\| \le K$. We already have
\[ (e_j, v_\ell) \to (e_j, u) \]
for all $j$. Since $\|v_\ell - u\|$ is bounded by $2K$, it follows that the set of all $w$ such that
\[ (w, v_\ell) \to (w, u) \qquad (†) \]
is closed under finite linear combinations and taking limits, hence is all of $H$. To see that it is closed under limits, suppose $w_k \to w$, and the $w_k$ satisfy (†). Then
\[ |(w, v_\ell) - (w, u)| \le |(w - w_k, v_\ell - u)| + |(w_k, v_\ell - u)| \le 2K \|w - w_k\| + |(w_k, v_\ell - u)|. \]
So we can first find $k$ large enough such that the first term is small, then pick $\ell$ such that the second is small.

We next want to show that if $u_m \rightharpoonup u$ in $H^1(U)$, then $u_m \to u$ in $L^2$. We may as well assume that $U$ is some large cube of length $L$ by extension. Notice that since $U$ is bounded, the constant function $1$ is in $H^1(U)$. So $u_m \rightharpoonup u$ in particular implies
\[ \int_U (u_m - u)\,dx \to 0. \]

Recall that the Poincaré inequality tells us if $u \in H_0^1(Q)$, then we can bound $\|u\|_{L^2(Q)}$ by some multiple of $\|Du\|_{L^2(Q)}$. If we try to prove this without the assumption that $u$ vanishes on the boundary, then we find that we need a correction term. The resulting lemma is as follows:

Lemma (Poincaré revisited). Suppose $u \in H^1(\mathbb{R}^n)$. Let $Q = [\xi_1, \xi_1 + L] \times \cdots \times [\xi_n, \xi_n + L]$ be a cube of length $L$. Then we have
\[ \|u\|_{L^2(Q)}^2 \le \frac{1}{|Q|} \left( \int_Q u(x)\,dx \right)^2 + \frac{nL^2}{2} \|Du\|_{L^2(Q)}^2. \]
We can improve this to obtain better bounds by subdividing $Q$ into smaller cubes, and then applying this to each of the cubes individually. By subdividing enough, this leads to a proof that $u_m \rightharpoonup u$ in $H^1$ implies $u_m \to u$ in $L^2$.
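As a quick consistency check of the shape of this inequality, here is a one-dimensional example. It is an illustration only, and it assumes the inequality in the form $\|u\|_{L^2(Q)}^2 \le |Q|^{-1} (\int_Q u)^2 + \tfrac{nL^2}{2} \|Du\|_{L^2(Q)}^2$, which is the constant produced by the proof that follows.

```latex
% Assumed test case: n = 1, Q = [0, L], u(x) = x.
\|u\|_{L^2(Q)}^2 = \int_0^L x^2\,dx = \frac{L^3}{3}, \qquad
\frac{1}{|Q|}\Big(\int_0^L x\,dx\Big)^2 = \frac{1}{L}\cdot\frac{L^4}{4} = \frac{L^3}{4}, \qquad
\frac{nL^2}{2}\,\|Du\|_{L^2(Q)}^2 = \frac{L^2}{2}\cdot L = \frac{L^3}{2}.
% The inequality reads L^3/3 <= L^3/4 + L^3/2 = 3L^3/4, which holds.
% For a constant function u = c both sides equal c^2 L, so the mean term cannot be dropped.
```

The gradient term carries a factor $L^2$, which is why subdividing into small cubes makes it negligible.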

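Proof. By approximation, we can assume $u \in C^\infty(\bar{Q})$. For $x, y \in Q$, we write
\[ u(x) - u(y) = \int_{y_1}^{x_1} \frac{d}{dt} u(t, x_2, \ldots, x_n)\,dt + \int_{y_2}^{x_2} \frac{d}{dt} u(y_1, t, x_3, \ldots, x_n)\,dt + \cdots + \int_{y_n}^{x_n} \frac{d}{dt} u(y_1, \ldots, y_{n-1}, t)\,dt. \]
Squaring, and using $2ab \le a^2 + b^2$, we have
\[ u(x)^2 + u(y)^2 - 2u(x)u(y) \le n \left( \int_{y_1}^{x_1} \frac{d}{dt} u(t, x_2, \ldots, x_n)\,dt \right)^2 + \cdots + n \left( \int_{y_n}^{x_n} \frac{d}{dt} u(y_1, \ldots, y_{n-1}, t)\,dt \right)^2. \]
Now integrate over $x$ and $y$. On the left, we get
\[ \iint_{Q \times Q} dx\,dy\, \big(u(x)^2 + u(y)^2 - 2u(x)u(y)\big) = 2|Q| \|u\|_{L^2(Q)}^2 - 2 \left( \int_Q u(x)\,dx \right)^2. \]
On the right we have
\[ I_1 = \left( \int_{y_1}^{x_1} \frac{d}{dt} u(t, x_2, \ldots, x_n)\,dt \right)^2 \le \left| \int_{y_1}^{x_1} dt \right| \left| \int_{y_1}^{x_1} \left( \frac{d}{dt} u(t, x_2, \ldots, x_n) \right)^2 dt \right| \quad \text{(Cauchy–Schwarz)} \]
\[ \le L \int_{\xi_1}^{\xi_1 + L} \left( \frac{d}{dt} u(t, x_2, \ldots, x_n) \right)^2 dt. \]
Integrating over all $x, y \in Q$, we get
\[ \iint_{Q \times Q} dx\,dy\, I_1 \le L^2 |Q| \|D_1 u\|_{L^2(Q)}^2. \]
Similarly estimating the other terms on the right-hand side, we find that
\[ 2|Q| \|u\|_{L^2(Q)}^2 - 2 \left( \int_Q u(x)\,dx \right)^2 \le n|Q| \sum_{i=1}^n L^2 \|D_i u\|_{L^2(Q)}^2 = n|Q| L^2 \|Du\|_{L^2(Q)}^2. \]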

It now follows that

Theorem (Rellich–Kondrachov). Let $U \subseteq \mathbb{R}^n$ be open, bounded with $C^1$ boundary. Then if $(u_m)_{m=1}^\infty$ is a sequence in $H^1(U)$ with $u_m \rightharpoonup u$, then $u_m \to u$ in $L^2$.

In particular, by weak compactness any bounded sequence in $H^1(U)$ has a subsequence that is convergent in $L^2(U)$.

Note that to obtain the “in particular” part, we need to know that $H^1(U)$ is separable. This is an exercise on the example sheet. Alternatively, we can appeal to a stronger version of weak compactness that does not assume separability.

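Proof. By the extension theorem, we may assume $U = Q$ for some large cube $Q$ with $U \Subset Q$.

We subdivide $Q$ into $N$ many cubes of side length $\delta$, such that the cubes only intersect at their faces. Call these $\{Q_a\}_{a=1}^N$.

We apply Poincaré separately to each of these to obtain
\[ \|u_m - u\|_{L^2(Q)}^2 = \sum_{a=1}^N \|u_m - u\|_{L^2(Q_a)}^2 \le \sum_{a=1}^N \left[ \frac{1}{|Q_a|} \left( \int_{Q_a} (u_m - u)\,dx \right)^2 + \frac{n\delta^2}{2} \|Du_m - Du\|_{L^2(Q_a)}^2 \right] \]
\[ = \sum_{a=1}^N \frac{1}{|Q_a|} \left( \int_{Q_a} (u_m - u)\,dx \right)^2 + \frac{n\delta^2}{2} \|Du_m - Du\|_{L^2(Q)}^2. \]
Now since $\|Du_m - Du\|_{L^2(Q)}^2$ is bounded independently of $m$ (weakly convergent sequences are bounded), for $\delta$ small enough, the second term is $< \frac{\varepsilon}{2}$. Then since $u_m \rightharpoonup u$, we in particular have
\[ \int_{Q_a} (u_m - u)\,dx \to 0 \text{ as } m \to \infty \]
for all $a$, since this is just the inner product with the constant function $1$ on $Q_a$. So for $m$ large enough, the first term is also $< \frac{\varepsilon}{2}$.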

The same result holds with $H^1(U)$ replaced by $H_0^1(U)$. The proof is in fact simpler, and we wouldn't need the assumption that the boundary is $C^1$.

Corollary. Suppose $K : L^2(U) \to H^1(U)$ is a bounded linear operator. Then the composition
\[ L^2(U) \xrightarrow{\;K\;} H^1(U) \hookrightarrow L^2(U) \]
is compact.

The slogan is that we get compactness whenever we improve regularity, which is something that happens in much more generality.

Proof. Indeed, if $u_m \in L^2(U)$ is bounded, then $K u_m$ is also bounded in $H^1(U)$. So by Rellich–Kondrachov, there exists a subsequence with $K u_{m_j} \to u$ in $L^2(U)$.

We are now ready to prove the Fredholm alternative for elliptic boundary value problems. Recall that in our description of the Fredholm alternative, we had the direct characterization $\operatorname{im}(I - K) = \ker(I - K^\dagger)^\perp$. We can make the analogous statement here. To do so, we need to talk about the adjoint of $L$. Since $L$ is not an operator defined on $L^2(U)$, trying to write down what it means to be an adjoint is slightly messy. Instead, we shall be content with talking about “formal adjoints”.

It’s been a while since we’ve met a PDE, so let’s recall the setting we had.

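We have a uniformly elliptic operator
\[ Lu = -\sum_{i,j=1}^n (a^{ij}(x) u_{x_j})_{x_i} + \sum_{i=1}^n b^i(x) u_{x_i} + c(x) u \]
on an open bounded set $U$ with $C^1$ boundary. The associated bilinear form is
\[ B[u, v] = \int_U \left( \sum_{i,j} a^{ij}(x) u_{x_i} v_{x_j} + \sum_{i=1}^n b^i(x) u_{x_i} v + c(x) u v \right) dx. \]
We are interested in solving the boundary value problem
\[ Lu = f, \quad u|_{\partial U} = 0 \]
with $f \in L^2(U)$.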

The formal adjoint of $L$ is defined by the relation
\[ (L\varphi, \psi)_{L^2(U)} = (\varphi, L^\dagger \psi)_{L^2(U)} \]
for all $\varphi, \psi \in C_c^\infty(U)$. By integration by parts, we know $L^\dagger$ should be given by
\[ L^\dagger v = -\sum_{i,j=1}^n (a^{ij} v_{x_j})_{x_i} - \sum_{i=1}^n b^i(x) v_{x_i} + \left( c - \sum_{i=1}^n b^i_{x_i} \right) v. \]
Note that here we have to assume that $b^i \in C^1(\bar{U})$. However, what really interests us is the adjoint bilinear form, which is simply given by
\[ B^\dagger[v, u] = B[u, v]. \]
We are actually just interested in $B^\dagger$, and not $L^\dagger$, and we can sensibly talk about $B^\dagger$ even if $b^i$ is not differentiable.

As usual, we say $v \in H_0^1(U)$ is a weak solution of the adjoint problem $L^\dagger v = f$, $v|_{\partial U} = 0$ if
\[ B^\dagger[v, u] = (f, u)_{L^2(U)} \]
for all $u \in H_0^1(U)$.

Given this set up, we can now state and prove the Fredholm alternative.

Theorem (Fredholm alternative for elliptic BVP). Let $L$ be a uniformly elliptic operator on an open bounded set $U$ with $C^1$ boundary. Consider the problem
\[ Lu = f, \quad u|_{\partial U} = 0. \qquad (∗) \]
Then exactly one of the following is true:

(i) For each $f \in L^2(U)$, there is a unique weak solution $u \in H_0^1(U)$ to (∗).

(ii) There exists a non-zero weak solution $u \in H_0^1(U)$ to the homogeneous problem, i.e. (∗) with $f = 0$.

If this holds, then the dimension of $N = \ker L \subseteq H_0^1(U)$ is equal to the dimension of $N^* = \ker L^\dagger \subseteq H_0^1(U)$.

Finally, (∗) has a solution if and only if $(f, v)_{L^2(U)} = 0$ for all $v \in N^*$.

Proof. We know that there exists $\gamma > 0$ such that for any $f \in L^2(U)$, there is a unique weak solution $u \in H_0^1(U)$ to
\[ L_\gamma u = Lu + \gamma u = f, \quad u|_{\partial U} = 0. \]
Moreover, we have the bound $\|u\|_{H^1(U)} \le C \|f\|_{L^2(U)}$ (which gives uniqueness). Thus, we can set $L_\gamma^{-1} f$ to be this $u$, and then $L_\gamma^{-1} : L^2(U) \to H_0^1(U)$ is a bounded linear map. Composing with the inclusion $H_0^1(U) \hookrightarrow L^2(U)$, we get a compact endomorphism of $L^2(U)$.

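Now suppose $u \in H_0^1$ is a weak solution to (∗). Then
\[ B[u, v] = (f, v)_{L^2(U)} \text{ for all } v \in H_0^1(U) \]
is true if and only if
\[ B_\gamma[u, v] \equiv B[u, v] + \gamma (u, v) = (f + \gamma u, v) \text{ for all } v \in H_0^1(U). \]
Hence, $u$ is a weak solution of (∗) if and only if
\[ u = L_\gamma^{-1}(f + \gamma u) = \gamma L_\gamma^{-1} u + L_\gamma^{-1} f. \]
In other words, $u$ solves (∗) iff
\[ u - Ku = h, \]
for
\[ K = \gamma L_\gamma^{-1}, \quad h = L_\gamma^{-1} f. \]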

Since we know that $K : L^2(U) \to L^2(U)$ is compact, by the Fredholm alternative for compact operators, either

(i) $u - Ku = h$ admits a solution $u \in L^2(U)$ for all $h \in L^2(U)$; or

(ii) There exists a non-zero $u \in L^2(U)$ such that $u - Ku = 0$. Moreover, $\operatorname{im}(I - K) = \ker(I - K^\dagger)^\perp$ and $\dim \ker(I - K) = \dim \operatorname{im}(I - K)^\perp$.

There is a bit of bookkeeping to show that this corresponds to the two alternatives in the theorem.

(i) We need to show that $u \in H_0^1(U)$. But this is trivial, since we have
\[ u = \gamma L_\gamma^{-1} u + L_\gamma^{-1} f \]
and we know that $L_\gamma^{-1}$ maps $L^2(U)$ into $H_0^1(U)$.

(ii) As above, we know that the non-zero solution $u$ lies in $H_0^1(U)$. There are two things to show. First, we have to show that $v - K^\dagger v = 0$ iff $v$ is a weak solution to
\[ L^\dagger v = 0, \quad v|_{\partial U} = 0. \]
Next, we need to show that $h = L_\gamma^{-1} f \in (N^*)^\perp$ iff $f \in (N^*)^\perp$.

For the first part, we want to show that $v \in \ker(I - K^\dagger)$ iff $B^\dagger[v, u] = B[u, v] = 0$ for all $u \in H_0^1(U)$.

We are good at evaluating $B_\gamma[u, v]$ when $u$ is of the form $L_\gamma^{-1} w$, by definition of a weak solution. Fortunately, $\operatorname{im} L_\gamma^{-1}$ contains $C_c^\infty(U)$, since $L_\gamma^{-1} L_\gamma \varphi = \varphi$ for all $\varphi \in C_c^\infty(U)$. In particular, $\operatorname{im} L_\gamma^{-1}$ is dense in $H_0^1(U)$. So it suffices to show that $v \in \ker(I - K^\dagger)$ iff $B[L_\gamma^{-1} w, v] = 0$ for all $w \in L^2(U)$. This is immediate from the computation
\[ B[L_\gamma^{-1} w, v] = B_\gamma[L_\gamma^{-1} w, v] - \gamma (L_\gamma^{-1} w, v) = (w, v) - (Kw, v) = (w, v - K^\dagger v). \]
The second is also easy: if $v \in N^* = \ker(I - K^\dagger)$, then
\[ (L_\gamma^{-1} f, v) = \frac{1}{\gamma} (Kf, v) = \frac{1}{\gamma} (f, K^\dagger v) = \frac{1}{\gamma} (f, v). \]

4.3 The spectrum of elliptic operators

Let's recap what we have obtained so far. Given $L$, we have found some $\gamma$ such that whenever $\mu \ge \gamma$, there is a unique solution to $(L + \mu)u = f$ (plus boundary conditions). In particular, $L + \mu$ has trivial kernel. For $\mu \le \gamma$, $(L + \mu)u = 0$ may or may not have a non-trivial solution, but we know this satisfies the Fredholm alternative, since $L + \mu$ is still an elliptic operator.

Rewriting $(L + \mu)u = 0$ as $Lu = -\mu u$, we are essentially considering eigenvalues of $L$. Of course, $L$ is not a bounded linear operator, so our usual spectral theory does not apply to $L$. However, as always, we know that $L_\gamma^{-1}$ is compact for large enough $\gamma$, and so the spectral theory of compact operators can tell us something about what the eigenvalues of $L$ look like.

We first recall some elementary definitions. Note that we are explicitly

working with real Hilbert spaces and spectra.

Definition (Resolvent set). Let $A : H \to H$ be a bounded linear operator. Then the resolvent set is
\[ \rho(A) = \{\lambda \in \mathbb{R} : A - \lambda I \text{ is bijective}\}. \]
Definition (Spectrum). The spectrum of a bounded linear $A : H \to H$ is
\[ \sigma(A) = \mathbb{R} \setminus \rho(A). \]
Definition (Point spectrum). We say $\eta \in \sigma(A)$ belongs to the point spectrum $\sigma_p(A)$ of $A$ if
\[ \ker(A - \eta I) \neq \{0\}. \]
If $\eta \in \sigma_p(A)$ and $w \neq 0$ satisfies $Aw = \eta w$, then $w$ is an associated eigenvector.

Our knowledge of the spectrum of $L$ will come from known results about the spectrum of compact operators.

Theorem (Spectral theorem of compact operators). Let $\dim H = \infty$, and $K : H \to H$ a compact operator. Then

– $\sigma(K) = \sigma_p(K) \cup \{0\}$. Note that $0$ may or may not be in $\sigma_p(K)$.

– $\sigma(K) \setminus \{0\}$ is either finite or is a sequence tending to $0$.

– If $\lambda \in \sigma_p(K)$ and $\lambda \neq 0$, then $\ker(K - \lambda I)$ is finite-dimensional.

– If $K$ is self-adjoint, i.e. $K = K^\dagger$, and $H$ is separable, then there exists a countable orthonormal basis of eigenvectors.

From this, it follows easily that

Theorem (Spectrum of $L$).

(i) There exists a countable set $\Sigma \subseteq \mathbb{R}$ such that there is a non-trivial solution to $Lu = \lambda u$ iff $\lambda \in \Sigma$.

(ii) If $\Sigma$ is infinite, then $\Sigma = \{\lambda_k\}_{k=1}^\infty$, the values of an increasing sequence with $\lambda_k \to \infty$.

(iii) To each $\lambda \in \Sigma$ there is an associated finite-dimensional space
\[ E(\lambda) = \{u \in H_0^1(U) \mid u \text{ is a weak solution of } Lu = \lambda u,\ u|_{\partial U} = 0\}. \]
We say $\lambda \in \Sigma$ is an eigenvalue and $u \in E(\lambda)$ is the associated eigenfunction.

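Proof. Apply the spectral theorem to the compact operator $L_\gamma^{-1} : L^2(U) \to L^2(U)$, and observe that
\[ L_\gamma^{-1} u = \lambda u \iff u = \lambda (L + \gamma) u \iff Lu = \frac{1 - \lambda \gamma}{\lambda} u. \]
Note that $L_\gamma^{-1}$ does not have a zero eigenvalue.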
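To see the correspondence between the eigenvalues of $L_\gamma^{-1}$ and those of $L$ in a concrete case, here is a one-dimensional sketch. The choice $L = -\frac{d^2}{dx^2}$ on $U = (0, \pi)$ is an assumption made for illustration and is not an example from the lectures.

```latex
% Assumed example: L = -d^2/dx^2 on U = (0, \pi) with Dirichlet boundary conditions.
L \sin(kx) = k^2 \sin(kx), \qquad k = 1, 2, 3, \ldots
% Hence (L + \gamma) \sin(kx) = (k^2 + \gamma) \sin(kx), and so
L_\gamma^{-1} \sin(kx) = \lambda_k \sin(kx), \qquad \lambda_k = \frac{1}{k^2 + \gamma} \to 0 .
% Feeding \lambda_k into the formula from the proof recovers the eigenvalue of L:
\frac{1 - \lambda_k \gamma}{\lambda_k} = (k^2 + \gamma) - \gamma = k^2 .
```

The eigenvalues of the compact operator accumulate only at $0$, which is exactly the statement that the eigenvalues $k^2$ of $L$ tend to infinity.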

In certain cases, such as Laplace’s equation, our operator is “self-adjoint”,

and more things can be said. As before, we want the “formally” quantifier:

Definition (Formally self-adjoint). An operator $L$ is formally self-adjoint if $L = L^\dagger$. Equivalently, if $b^i \equiv 0$.

Definition (Positive operator). We say $L$ is positive if there exists $C > 0$ such that
\[ \|u\|_{H_0^1(U)}^2 \le C B[u, u] \text{ for all } u \in H_0^1(U). \]

Theorem. Suppose $L$ is a formally self-adjoint, positive, uniformly elliptic operator on $U$, an open bounded set with $C^1$ boundary. Then we can represent the eigenvalues of $L$ as
\[ 0 < \lambda_1 \le \lambda_2 \le \cdots, \]
where each eigenvalue appears according to its multiplicity ($\dim E(\lambda)$), and there exists an orthonormal basis $\{w_k\}_{k=1}^\infty$ of $L^2(U)$ with $w_k \in H_0^1(U)$ an eigenfunction of $L$ with eigenvalue $\lambda_k$.

Proof. Note that positivity implies $c \ge 0$. So the inverse $L^{-1} : L^2(U) \to L^2(U)$ exists and is a compact operator. We are done if we can show that $L^{-1}$ is self-adjoint. This is trivial, since for any $f, g$, writing $u = L^{-1} f$ and $v = L^{-1} g$, we have
\[ (L^{-1} f, g)_{L^2(U)} = B[v, u] = B[u, v] = (L^{-1} g, f)_{L^2(U)}. \]

4.4 Elliptic regularity

We can finally turn to the problem of regularity. We previously saw that when solving $Lu = f$, if $f \in L^2(U)$, then by definition of a weak solution, we have $u \in H_0^1(U)$, so we have gained some regularity when solving the differential equation. However, it is not clear that $u \in H^2(U)$, so we cannot actually say $u$ solves $Lu = f$. Even if $u \in H^2(U)$, it may not be classically differentiable, so $Lu = f$ still isn't holding in the strongest possible sense. So we might hope that under reasonable circumstances, $u$ is in fact twice continuously differentiable. But human desires are unlimited. If $f$ is smooth, we might hope further that $u$ is also smooth. All of these will be true.

Let's think about how regularity may fail. It could be that the individual derivatives of $u$ are quite singular, but in $Lu$ all these singularities happen to cancel with each other. Thus, the content of elliptic regularity is that this doesn't happen.

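To see why we should expect this to be true, suppose for convenience that $u, f \in C_c^\infty(\mathbb{R}^n)$ and
\[ -\Delta u = f. \]
Using integration by parts, we compute
\[ \int_{\mathbb{R}^n} f^2\,dx = \int_{\mathbb{R}^n} (\Delta u)^2\,dx = \sum_{i,j} \int_{\mathbb{R}^n} (D_i D_i u)(D_j D_j u)\,dx = \sum_{i,j} \int_{\mathbb{R}^n} (D_i D_j u)(D_i D_j u)\,dx = \|D^2 u\|_{L^2(\mathbb{R}^n)}^2. \]
So we have deduced that
\[ \|D^2 u\|_{L^2(\mathbb{R}^n)} = \|\Delta u\|_{L^2(\mathbb{R}^n)}. \]
This is of course not a very useful result, because we have a priori assumed that $u$ and $f$ are $C^\infty$, while what we want to prove is that $u$ is, for example, in $H^2(U)$. However, the fact that we can control the $H^2$ norm if we assumed that $u \in H^2(U)$ gives us some strong indication that we should be able to show that $u$ must always be in $H^2(U)$.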
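The integration by parts used in the computation above moves one derivative off each factor in turn. Spelled out for a fixed pair $(i, j)$, and assuming as above that $u$ is smooth with compact support so that no boundary terms appear, the step is:

```latex
% u is assumed smooth and compactly supported, so all boundary terms vanish.
\int_{\mathbb{R}^n} (D_i D_i u)(D_j D_j u)\,dx
  = -\int_{\mathbb{R}^n} (D_i u)(D_i D_j D_j u)\,dx   % move one D_i to the right
  = \int_{\mathbb{R}^n} (D_j D_i u)(D_i D_j u)\,dx .  % move one D_j to the left
% Summing over i and j, the left side is \int (\Delta u)^2 and the right side is \|D^2 u\|_{L^2}^2.
```

Two sign changes cancel, and the equality of mixed partials gives the square of the Hessian entry on the right.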

The idea is to run essentially the same argument for weak solutions, without

mentioning the word “second derivative”. This involves the use of difference

quotients.

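Definition (Difference quotient). Suppose $U \subseteq \mathbb{R}^n$ is open and $V \Subset U$. For $0 < |h| < \operatorname{dist}(V, \partial U)$, we define
\[ \Delta_i^h u(x) = \frac{u(x + h e_i) - u(x)}{h}, \qquad \Delta^h u(x) = (\Delta_1^h u, \ldots, \Delta_n^h u). \]
Observe that if $u \in L^2(U)$, then $\Delta^h u \in L^2(V)$. If further $u \in H^1(U)$, then $\Delta^h u \in H^1(V)$ and $D \Delta^h u = \Delta^h D u$.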

What makes difference quotients useful is the following lemma:

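Lemma. If $u \in L^2(U)$, then $u \in H^1(V)$ iff
\[ \|\Delta^h u\|_{L^2(V)} \le C \]
for some $C$ and all $0 < |h| < \frac{1}{2} \operatorname{dist}(V, \partial U)$. In this case, we have
\[ \frac{1}{\tilde{C}} \|Du\|_{L^2(V)} \le \|\Delta^h u\|_{L^2(V)} \le \tilde{C} \|Du\|_{L^2(V)}. \]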

Proof. See example sheet.

Thus, if we are able to establish the bounds we had for the Laplacian using difference quotients, then this tells us $u$ is in $H^2_{\mathrm{loc}}(U)$.

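Lemma. If $w, v$ are compactly supported in $U$, then
\[ \int_U w \Delta_k^{-h} v\,dx = -\int_U (\Delta_k^h w) v\,dx, \]
\[ \Delta_k^h (wv) = (\tau_k^h w) \Delta_k^h v + (\Delta_k^h w) v, \]
where $\tau_k^h w(x) = w(x + h e_k)$.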
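Both identities are the discrete analogues of integration by parts and the product rule. The product rule can be checked directly from the definition of the difference quotient; the following short derivation is added as a sketch and uses only $\Delta_k^h u(x) = (u(x + h e_k) - u(x))/h$ and $\tau_k^h w(x) = w(x + h e_k)$.

```latex
% Add and subtract w(x + h e_k) v(x) in the numerator.
\Delta_k^h (wv)(x)
  = \frac{w(x + h e_k) v(x + h e_k) - w(x) v(x)}{h}
  = w(x + h e_k)\,\frac{v(x + h e_k) - v(x)}{h} + \frac{w(x + h e_k) - w(x)}{h}\, v(x)
  = (\tau_k^h w)(x)\, \Delta_k^h v(x) + (\Delta_k^h w)(x)\, v(x) .
```

The shift $\tau_k^h$ on the first factor is the only difference from the usual Leibniz rule, and it disappears in the limit $h \to 0$.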

Theorem (Interior regularity). Suppose $L$ is uniformly elliptic on an open set $U \subseteq \mathbb{R}^n$, and assume $a^{ij} \in C^1(U)$, $b^i, c \in L^\infty(U)$ and $f \in L^2(U)$. Suppose further that $u \in H^1(U)$ is such that
\[ B[u, v] = (f, v)_{L^2(U)} \qquad (†) \]
for all $v \in H_0^1(U)$. Then $u \in H^2_{\mathrm{loc}}(U)$, and for each $V \Subset U$, we have
\[ \|u\|_{H^2(V)} \le C (\|f\|_{L^2(U)} + \|u\|_{L^2(U)}), \]
with $C$ depending on $L, V, U$, but not $f$ or $u$.

Note that we don't require $u \in H_0^1(U)$, so we don't require $u$ to satisfy the boundary conditions. In this case, there may be multiple solutions, so we need the $\|u\|_{L^2(U)}$ on the right. Also, observe that we don't actually need uniform ellipticity, as the property of being in $H^2_{\mathrm{loc}}(U)$ can be checked locally, and $L$ is always locally uniformly elliptic.

The proof is essentially what we did for the Laplacian just now, except this time it is much messier since we need to use difference quotients instead of derivatives, and there are lots of derivatives of $a^{ij}$'s that have to be kept track of.

When using regularity results, it is often convenient to not think about it in terms of “solving equations”, but as something that (roughly) says “if $u$ is such that $Lu$ happens to be in $L^2$ (say), then $u$ is in $H^2_{\mathrm{loc}}(U)$”.

Proof. We first show that we may in fact assume $b^i = c = 0$. Indeed, if we know the theorem for such $L$, then given a general $L$, we write
\[ \tilde{L} u = -\sum_{i,j} (a^{ij} u_{x_j})_{x_i}, \qquad Ru = \sum_i b^i u_{x_i} + cu. \]
Then if $u$ is a weak solution to $Lu = f$, then it is also a weak solution to $\tilde{L} u = f - Ru$. Noting that $Ru \in L^2(U)$, this tells us $u \in H^2_{\mathrm{loc}}(U)$. Moreover, on $V \Subset U$,

– We can control $\|u\|_{H^2(V)}$ by $\|f - Ru\|_{L^2(V)}$ and $\|u\|_{L^2(V)}$ (by theorem).

– We can control $\|f - Ru\|_{L^2(V)}$ by $\|f\|_{L^2(V)}$, $\|u\|_{L^2(V)}$ and $\|Du\|_{L^2(V)}$.

– By Gårding's inequality, we can control $\|Du\|_{L^2(V)}$ by $\|u\|_{L^2(V)}$ and $B[u, u] = (f, u)_{L^2(V)}$.

– By Hölder, we can control $(f, u)_{L^2(V)}$ by $\|f\|_{L^2(V)}$ and $\|u\|_{L^2(V)}$.

So it suffices to consider the case where $L$ only has second derivatives. Fix $V \Subset U$ and choose $W$ such that $V \Subset W \Subset U$. Take $\zeta \in C_c^\infty(W)$ such that $\zeta \equiv 1$ on $V$.

Recall that in our example of Laplace's equation, we considered the integral $\int f^2\,dx$ and did some integration by parts. Essentially, what we did was to apply the definition of a weak solution to $\Delta u$. There we were lucky, and we could obtain the result in one go. In general, we should consider the second derivatives one by one.

For $k \in \{1, \ldots, n\}$, we consider the function
\[ v = -\Delta_k^{-h} (\zeta^2 \Delta_k^h u). \]
As we shall see, this is the correct way to express $u_{x_k x_k}$ in terms of difference quotients (the $-h$ in the first $\Delta_k^{-h}$ comes from the fact that we want to integrate by parts). We shall put this into the definition of a weak solution to say $B[u, v] = (f, v)$. The plan is to isolate a $\|\Delta_k^h Du\|^2$ term on the left and then bound it.

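We first compute
\[ B[u, v] = -\sum_{i,j} \int_U a^{ij} u_{x_i} \Delta_k^{-h} (\zeta^2 \Delta_k^h u)_{x_j}\,dx = \sum_{i,j} \int_U \Delta_k^h (a^{ij} u_{x_i}) (\zeta^2 \Delta_k^h u)_{x_j}\,dx \]
\[ = \sum_{i,j} \int_U \big( \tau_k^h a^{ij}\, \Delta_k^h u_{x_i} + (\Delta_k^h a^{ij}) u_{x_i} \big) \big( \zeta^2 \Delta_k^h u_{x_j} + 2 \zeta \zeta_{x_j} \Delta_k^h u \big)\,dx \equiv A_1 + A_2, \]
where
\[ A_1 = \sum_{i,j} \int_U \zeta^2 (\tau_k^h a^{ij}) (\Delta_k^h u_{x_i}) (\Delta_k^h u_{x_j})\,dx, \]
\[ A_2 = \sum_{i,j} \int_U \Big[ (\Delta_k^h a^{ij}) u_{x_i} \zeta^2 \Delta_k^h u_{x_j} + 2 \zeta \zeta_{x_j} \Delta_k^h u \big( \tau_k^h a^{ij}\, \Delta_k^h u_{x_i} + (\Delta_k^h a^{ij}) u_{x_i} \big) \Big]\,dx. \]
By uniform ellipticity, we can bound
\[ A_1 \ge \theta \int_U \zeta^2 |\Delta_k^h Du|^2\,dx. \]
This is the quantity we want to bound from above.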

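Note that $A_2$ looks scary, but every term either only involves “first derivatives” of $u$, or a product of a second derivative of $u$ with a first derivative. Thus, applying Young's inequality, we can bound $|A_2|$ by a linear combination of $|\Delta_k^h Du|^2$ and $|Du|^2$, and we can make the coefficient of $|\Delta_k^h Du|^2$ as small as we like.

In detail, since $a^{ij} \in C^1(U)$ and $\zeta$ is supported in $W$, we can uniformly bound $a^{ij}$, $\Delta_k^h a^{ij}$, $\zeta_{x_j}$, and we have
\[ |A_2| \le C \int_W \big( \zeta |\Delta_k^h Du| |Du| + \zeta |Du| |\Delta_k^h u| + \zeta |\Delta_k^h Du| |\Delta_k^h u| \big)\,dx. \]
Now recall that $\|\Delta_k^h u\|$ is bounded by $\|Du\|$. So applying Young's inequality, we may bound (for a different $C$)
\[ |A_2| \le \varepsilon \int_W \zeta^2 |\Delta_k^h Du|^2\,dx + C \int_W |Du|^2\,dx. \]
Thus, taking $\varepsilon = \frac{\theta}{2}$, it follows that
\[ (f, v) = B[u, v] \ge \frac{\theta}{2} \int_U \zeta^2 |\Delta_k^h Du|^2\,dx - C \int_W |Du|^2\,dx. \]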

This is promising.
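The form of Young's inequality being used here is the one with a free small parameter. As a reminder (this derivation is added for convenience and the symbols $a, b, \varepsilon$ are generic non-negative reals, not quantities from the notes):

```latex
% For a, b >= 0 and any \varepsilon > 0, expand a square:
0 \le \Big( \sqrt{\varepsilon}\, a - \frac{b}{2\sqrt{\varepsilon}} \Big)^2
  = \varepsilon a^2 - ab + \frac{b^2}{4\varepsilon}
\quad\Longrightarrow\quad
ab \le \varepsilon a^2 + \frac{1}{4\varepsilon}\, b^2 .
% Applied with a = \zeta |\Delta_k^h Du| and b = C |Du|:
C\, \zeta\, |\Delta_k^h Du|\, |Du| \le \varepsilon\, \zeta^2 |\Delta_k^h Du|^2 + \frac{C^2}{4\varepsilon}\, |Du|^2 .
```

The price of a small coefficient $\varepsilon$ on the second-derivative term is a large constant on the lower-order term, which is harmless because that term is already controlled.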

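It now suffices to bound $(f, v)$ from above. By Young's inequality,
\[ |(f, v)| \le \int |f| |\Delta_k^{-h} (\zeta^2 \Delta_k^h u)|\,dx \le C \int_W |f| |D(\zeta^2 \Delta_k^h u)|\,dx \]
\[ \le \varepsilon \int_W |D(\zeta^2 \Delta_k^h u)|^2\,dx + C \int_W |f|^2\,dx \le \varepsilon \int_W |\zeta \Delta_k^h Du|^2\,dx + C (\|f\|_{L^2(U)}^2 + \|Du\|_{L^2(U)}^2). \]
Setting $\varepsilon = \frac{\theta}{4}$, we get
\[ \int_U \zeta^2 |\Delta_k^h Du|^2\,dx \le C (\|f\|_{L^2(W)}^2 + \|Du\|_{L^2(W)}^2), \]
and so, in particular, we get a uniform bound on $\|\Delta_k^h Du\|_{L^2(V)}$. Now as before, we can use Gårding to get rid of the $\|Du\|_{L^2(W)}$ dependence on the right.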

Notice that this is a local result. In order to have $u \in H^2(V)$, it is enough for us to have $f \in L^2(W)$ for some $W$ slightly larger than $V$. Thus, singularities do not propagate either in from the boundary or from regions where $f$ is not well-behaved.

With elliptic regularity, we can understand weak solutions as genuine solutions to the equation $Lu = f$. Indeed, if $u$ is a weak solution, then for any $v \in C_c^\infty(U)$, we have $B[u, v] = (f, v)$, hence after integrating by parts, we recover $(Lu - f, v) = 0$ for all $v \in C_c^\infty(U)$. So in fact $Lu = f$ almost everywhere.

It is natural to hope that we can get better than $u \in H^2_{\mathrm{loc}}(U)$. This is actually not hard given our current work. If $Lu = f$, and all $a^{ij}, b^i, c, f$ are sufficiently well-behaved, then we can simply differentiate the whole equation with respect to $x_i$, and then observe that $u_{x_i}$ satisfies some second-order elliptic PDE of the form previously understood, and if we do this for all $i$, then we can conclude that $u \in H^3_{\mathrm{loc}}(U)$. Of course, some bookkeeping has to be done if we were to do this properly, since we need to write everything in weak form. However, this is not particularly hard, and the details are left as an exercise.

Theorem (Elliptic regularity). If $a^{ij}$, $b^i$ and $c$ are $C^{m+1}(U)$ for some $m \in \mathbb{N}$, and $f \in H^m(U)$, then $u \in H^{m+2}_{\mathrm{loc}}(U)$ and for $V \Subset W \Subset U$, we can estimate
\[ \|u\|_{H^{m+2}(V)} \le C (\|f\|_{H^m(W)} + \|u\|_{L^2(W)}). \]
In particular, if $m$ is large enough, then $u \in C^2_{\mathrm{loc}}(U)$, and if all $a^{ij}, b^i, c, f$ are smooth, then $u$ is also smooth.

We can similarly obtain a Hölder theory of elliptic regularity, which gives (roughly) $f \in C^{k,\alpha}(U)$ implies $u \in C^{k+2,\alpha}(U)$.

The final loose end is to figure out what happens at the boundary.

Theorem (Boundary regularity). Assume $a^{ij} \in C^1(\bar{U})$, $b^i, c \in L^\infty(U)$, and $f \in L^2(U)$. Suppose $u \in H_0^1(U)$ is a weak solution of $Lu = f$, $u|_{\partial U} = 0$. Finally, we assume that $\partial U$ is $C^2$. Then
\[ \|u\|_{H^2(U)} \le C (\|f\|_{L^2(U)} + \|u\|_{L^2(U)}). \]
If $u$ is the unique weak solution, we can drop the $\|u\|_{L^2(U)}$ from the right hand side.

Proof. Note that we already know that $u$ is locally in $H^2_{\mathrm{loc}}(U)$. So we only have to show that the second derivative is well-behaved near the boundary.

By a partition of unity and change of coordinates, we may assume we are in the case
\[ U = B_1(0) \cap \{x_n > 0\}. \]
Let $V = B_{1/2}(0) \cap \{x_n > 0\}$. Choose a $\zeta \in C_c^\infty(B_1(0))$ with $\zeta \equiv 1$ on $V$ and $0 \le \zeta \le 1$.

Most of the previous proof goes through, as long as we restrict to
\[ v = -\Delta_k^{-h} (\zeta^2 \Delta_k^h u) \]
with $k \neq n$, since all the translations keep us within $U$, and hence are well-defined.

Thus, we control all second derivatives of the form $D_k D_i u$, where $k \in \{1, \ldots, n - 1\}$ and $i \in \{1, \ldots, n\}$. The only remaining second derivative to control is $D_n D_n u$. To understand this, we go back to the PDE itself. Recall that we know it holds pointwise almost everywhere, so
\[ -\sum_{i,j=1}^n (a^{ij} u_{x_j})_{x_i} + \sum_{i=1}^n b^i u_{x_i} + cu = f. \]
So we can write $a^{nn} u_{x_n x_n} = F$ almost everywhere, where $F$ depends on $a, b, c, f$ and all (up to) second derivatives of $u$ that are not $u_{x_n x_n}$. Thus, $a^{nn} u_{x_n x_n}$ is controlled in $L^2$. But uniform ellipticity implies $a^{nn}$ is bounded away from $0$. So we are done.

Similarly, we can reiterate this to obtain higher regularity results.

5 Hyperbolic equations

So far, we have been looking at elliptic PDEs. Since the operator is elliptic,

there is no preferred “time direction”. For example, Laplace’s equation models

static electric fields. Thus, it is natural to consider boundary value problems in

these cases.

Hyperbolic equations single out a time direction, and these model quantities that evolve in time. In this case, we are often interested in initial value problems instead. Let's first define what it means for an equation to be hyperbolic.

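Definition (Hyperbolic PDE). A second-order linear hyperbolic PDE is a PDE of the form
\[ \sum_{i,j=1}^{n+1} (a^{ij}(y) u_{y_j})_{y_i} + \sum_{i=1}^{n+1} b^i(y) u_{y_i} + c(y) u = f \]
with $y \in \mathbb{R}^{n+1}$, $a^{ij} = a^{ji}$, $b^i, c \in C^\infty(\mathbb{R}^{n+1})$, such that the principal symbol
\[ Q(\xi) = \sum_{i,j=1}^{n+1} a^{ij}(y) \xi_i \xi_j \]
has signature $(+, -, -, \ldots)$ for all $y$. That is to say, after perhaps changing basis, at each point we can write
\[ Q(\xi) = \lambda_{n+1}^2 \xi_{n+1}^2 - \sum_{i=1}^n \lambda_i^2 \xi_i^2 \]
with $\lambda_i > 0$.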

It turns out not to be too helpful to treat this equation at this generality. We would like to pick out a direction that corresponds to the positive eigenvalue. By a coordinate transformation, we can locally put our equation in the form
\[ u_{tt} = \sum_{i,j=1}^n (a^{ij}(x, t) u_{x_j})_{x_i} + \sum_{i=1}^n b^i(x, t) u_{x_i} + c(x, t) u. \]
Note that we did not write down a $u_t$ term. It doesn't make much difference, and it is notationally convenient to leave it out.

In this form, hyperbolicity is equivalent to the statement that the operator on the right is elliptic for each $t$ (or rather, the negative of the right hand side).

We observe that $t = 0$ is a non-characteristic surface. So we can hope to solve the Cauchy problem. In other words, we shall specify $u|_{t=0}$ and $u_t|_{t=0}$.

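Actually, we'll look at an initial boundary value problem. Consider a region of the form $\mathbb{R} \times U$, where $U \subseteq \mathbb{R}^n$ is open bounded with $C^1$ boundary.

[Figure: the space-time cylinder over $U$, with the slices $t = 0$ and $t = T$ marked.]

We define
\[ U_t = (0, t) \times U, \qquad \Sigma_t = \{t\} \times U, \qquad \partial^* U_t = [0, t] \times \partial U. \]
Then
\[ \partial U_T = \Sigma_0 \sqcup \Sigma_T \sqcup \partial^* U_T. \]
The general initial boundary value problem (IBVP) is as follows: Let $L$ be a (time-dependent) uniformly elliptic operator. We want to solve
\[ u_{tt} + Lu = f \text{ on } U_T, \qquad u = \psi \text{ on } \Sigma_0, \qquad u_t = \psi' \text{ on } \Sigma_0, \qquad u = 0 \text{ on } \partial^* U_T. \]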

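In the case of elliptic PDEs, we saw that Laplace's equation was a canonical, motivating example. In this case, if we take $L = -\Delta$, then we obtain the wave equation. Let's see what we can do with it.

Example. Start with the equation $u_{tt} - \Delta u = 0$. Multiply by $u_t$ and integrate over $U_t$ to obtain
\[ 0 = \int_{U_t} \big( u_{tt} u_t - u_t \Delta u \big)\,dx\,dt = \int_{U_t} \left( \frac{1}{2} \frac{\partial}{\partial t} u_t^2 - \nabla \cdot (u_t Du) + Du_t \cdot Du \right) dx\,dt \]
\[ = \int_{U_t} \left( \frac{1}{2} \frac{\partial}{\partial t} \big( u_t^2 + |Du|^2 \big) - \nabla \cdot (u_t Du) \right) dx\,dt = \frac{1}{2} \left( \int_{\Sigma_t} - \int_{\Sigma_0} \right) \big( u_t^2 + |Du|^2 \big)\,dx - \int_{\partial^* U_t} u_t \frac{\partial u}{\partial \nu}\,dS. \]
But $u$ vanishing on $\partial^* U_T$ implies $u_t$ vanishes there as well. So the second term vanishes, and we obtain
\[ \int_{\Sigma_t} \big( u_t^2 + |Du|^2 \big)\,dx = \int_{\Sigma_0} \big( u_t^2 + |Du|^2 \big)\,dx. \]
This is the conservation of energy! Thus, if a solution exists, we control $\|u\|_{H^1(\Sigma_t)}$ in terms of $\|\psi\|_{H^1(\Sigma_0)}$ and $\|\psi'\|_{L^2(\Sigma_0)}$. We also see that the solution is uniquely determined by $\psi$ and $\psi'$, since if $\psi = \psi' = 0$, then $u_t = Du = 0$ and $u$ is zero at the boundary.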
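Conservation of energy can be checked by hand on an explicit solution. The following standing wave is an illustration chosen for this sketch (it does not appear in the lectures): take $U = (0, \pi)$ and $u(x, t) = \sin x \cos t$, which solves $u_{tt} - u_{xx} = 0$ and vanishes at $x = 0, \pi$.

```latex
% Assumed example: U = (0, \pi), u(x, t) = \sin x \cos t.
u_t = -\sin x \sin t, \qquad u_x = \cos x \cos t,
% so the energy on the slice \Sigma_t is
\int_0^\pi \big( u_t^2 + u_x^2 \big)\,dx
  = \sin^2 t \int_0^\pi \sin^2 x\,dx + \cos^2 t \int_0^\pi \cos^2 x\,dx
  = \frac{\pi}{2} \big( \sin^2 t + \cos^2 t \big) = \frac{\pi}{2},
% which is independent of t, as the general argument predicts.
```

The kinetic and potential parts each oscillate in time, but their sum is constant.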

Estimates like this that control a solution without needing to construct it are known as a priori estimates. These are often crucial to establish the existence of solutions (cf. Gårding).

We shall first find a weak formulation of this problem that only requires $u \in H^1(U_T)$. Note that when we do so, we have to understand carefully what we mean by $u_t = \psi'$. We shall see how we will deal with that in the derivation of the weak formulation.

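Assume that $u \in C^2(\bar{U}_T)$ is a classical solution. Multiply the equation by $v \in C^2(\bar{U}_T)$ which satisfies $v = 0$ on $\partial^* U_T \cup \Sigma_T$. Then we have
\[ \int_{U_T} dx\,dt\, (fv) = \int_{U_T} dx\,dt\, (u_{tt} v + Lu\, v) \]
\[ = \int_{U_T} dx\,dt \left( -u_t v_t + \sum_{i,j} a^{ij} u_{x_i} v_{x_j} + \sum_i b^i u_{x_i} v + cuv \right) + \left[ \int_{\Sigma_t} u_t v\,dx \right]_{t=0}^{t=T} - \int_0^T \left( \int_{\partial U} \sum_{i,j} a^{ij} u_{x_j} \nu_i\, v\,dS \right) dt. \]
Using the boundary conditions, we find that
\[ \int_{U_T} fv\,dx\,dt = \int_{U_T} \left( -u_t v_t + \sum_{i,j} a^{ij} u_{x_i} v_{x_j} + \sum_i b^i u_{x_i} v + cuv \right) dx\,dt - \int_{\Sigma_0} \psi' v\,dx. \qquad (†) \]
Conversely, suppose $u \in C^2(\bar{U}_T)$ satisfies (†) for all such $v$, and $u|_{\Sigma_0} = \psi$ and $u|_{\partial^* U_T} = 0$. Then by first testing on $v \in C_c^\infty(U_T)$, reversing the integration by parts tells us
\[ 0 = \int_{U_T} (u_{tt} + Lu - f) v\,dx\,dt, \]
since there is no boundary term. Hence we get
\[ u_{tt} + Lu = f \]
on $U_T$. To check the boundary conditions, if $v \in C^\infty(\bar{U}_T)$ vanishes on $\partial^* U_T \cup \Sigma_T$, then again reversing the integration by parts shows that
\[ \int_{U_T} (u_{tt} + Lu - f) v\,dx\,dt = \int_{\Sigma_0} (\psi' - u_t) v\,dx. \]
Since we know that the LHS vanishes, it follows that $\psi' = u_t$ on $\Sigma_0$. So we see that our weak formulation can encapsulate the boundary condition on $\Sigma_0$.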

Definition (Weak solution). Suppose $f \in L^2(U_T)$, $\psi \in H_0^1(\Sigma_0)$ and $\psi' \in L^2(\Sigma_0)$. We say $u \in H^1(U_T)$ is a weak solution to the hyperbolic PDE if

(i) $u|_{\Sigma_0} = \psi$ in the trace sense;

(ii) $u|_{\partial^* U_T} = 0$ in the trace sense; and

(iii) (†) holds for all $v \in H^1(U_T)$ with $v = 0$ on $\partial^* U_T \cup \Sigma_T$ in a trace sense.

Theorem (Uniqueness of weak solution). A weak solution, if it exists, is unique.

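Proof. It suffices to consider the case $f = \psi = \psi' = 0$, and show any solution must be zero. Let
\[ v(x, t) = \int_t^T e^{-\lambda s} u(x, s)\,ds, \]
where $\lambda$ is a real number we will pick later. The point of introducing this $e^{-\lambda t}$ is that in general, we do not expect conservation of energy. There could be some exponential growth in the energy, so we want to suppress this.

Then this function belongs to $H^1(U_T)$, $v = 0$ on $\Sigma_T \cup \partial^* U_T$, and $v_t = -e^{-\lambda t} u$. Using the fact that $u$ is a weak solution, we have
\[ \int_{U_T} \left( u_t u e^{-\lambda t} - \sum_{i,j} a^{ij} v_{t x_j} v_{x_i} e^{\lambda t} + \sum_i b^i u_{x_i} v + (c - 1) uv - v v_t e^{\lambda t} \right) dx\,dt = 0. \]
Integrating by parts, we can write this as $A = B$, where
\[ A = \int_{U_T} \left[ \frac{d}{dt} \left( \frac{1}{2} u^2 e^{-\lambda t} - \frac{1}{2} \sum_{i,j} a^{ij} v_{x_i} v_{x_j} e^{\lambda t} - \frac{1}{2} v^2 e^{\lambda t} \right) + \frac{\lambda}{2} \left( u^2 e^{-\lambda t} + \sum_{i,j} a^{ij} v_{x_i} v_{x_j} e^{\lambda t} + v^2 e^{\lambda t} \right) \right] dx\,dt, \]
\[ B = -\int_{U_T} \left( \frac{e^{\lambda t}}{2} \sum_{i,j} \dot{a}^{ij} v_{x_i} v_{x_j} - \sum_i b^i_{x_i} uv - \sum_i b^i v_{x_i} u + (c - 1) uv \right) dx\,dt. \]
Here $A$ is the nice bit, which we can control, and $B$ is the junk bit, which we will show that we can absorb elsewhere.

Integrating the time derivative in $A$, using $v = 0$ on $\Sigma_T$ and $u = 0$ on $\Sigma_0$, we have
\[ A = \frac{1}{2} e^{-\lambda T} \int_{\Sigma_T} u^2\,dx + \frac{1}{2} \int_{\Sigma_0} \left( \sum_{i,j} a^{ij} v_{x_i} v_{x_j} + v^2 \right) dx + \frac{\lambda}{2} \int_{U_T} \left( u^2 e^{-\lambda t} + \sum_{i,j} a^{ij} v_{x_i} v_{x_j} e^{\lambda t} + v^2 e^{\lambda t} \right) dx\,dt. \]
Using the uniform ellipticity condition (and the observation that the first line is always non-negative), we can bound
\[ A \ge \frac{\lambda}{2} \int_{U_T} \left( u^2 e^{-\lambda t} + \theta |Dv|^2 e^{\lambda t} + v^2 e^{\lambda t} \right) dx\,dt. \]
Doing some integration by parts, we can also bound
\[ B \le \frac{c}{2} \int_{U_T} \left( u^2 e^{-\lambda t} + \theta |Dv|^2 e^{\lambda t} + v^2 e^{\lambda t} \right) dx\,dt, \]
where the constant $c$ does not depend on $\lambda$. Taking this together, we have
\[ \frac{\lambda - c}{2} \int_{U_T} \left( u^2 e^{-\lambda t} + \theta |Dv|^2 e^{\lambda t} + v^2 e^{\lambda t} \right) dx\,dt \le 0. \]
Taking $\lambda > c$, this tells us the integral must vanish. In particular, the integral of $u^2 e^{-\lambda t}$ is $0$. So $u = 0$.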

We now want to prove the existence of weak solutions. While we didn't need to assume much regularity in the uniqueness result, since we are going to subtract the boundary conditions off anyway, we expect that we need more regularity to prove existence.

Theorem (Existence of weak solution). Given $\psi \in H_0^1(U)$ and $\psi' \in L^2(U)$, $f \in L^2(U_T)$, there exists a (unique) weak solution with
\[ \|u\|_{H^1(U_T)} \le C (\|\psi\|_{H^1(U)} + \|\psi'\|_{L^2(U)} + \|f\|_{L^2(U_T)}). \qquad (†) \]

Proof.

We use Galerkin’s method . The way we write our equations suggests we

should think of our hyperbolic PDE as a second-order ODE taking values in the

infinite-dimensional space

(

). To apply the ODE theorems we know, we

project our equation onto a finite-dimensional subspace, and then take the limit.

First note that by density arguments, we may assume

ψ, ψ

∈ C

∞

(

) and

f ∈ C

∞

), as long as we prove the estimate (†). So let us do so.

Let

{ϕ

}

∞

k=1

be an orthonormal basis for

(

), with

∈ H

(

). For

example, we can take

to be eigenfunctions of

−

∆ with Dirichlet boundary

conditions.

We shall consider “solutions” of the form

(x, t) =

k=1

(t)ϕ

(x).

We want this to be a solution after projecting to the subspace spanned by

, . . . , ϕ

. Thus, we want (

Lu − f, ϕ

)

(Σ

)

= 0 for all

= 1

, . . . , N

After some integration by parts, we see that we want



¨u

, ϕ



(U)



(ϕ

)

+ b

+ cu



dx = (f, ϕ

)

(U)

(∗)

We also require

(0) = (ψ, ϕ

)

(U)

˙u

(0) = (ψ

, ϕ

)

(U)

Notice that if we have a genuine solution

that can be written as a finite sum

of the ϕ

(x), then these must be satisfied.

This is a system of ODEs for the functions $u_k(t)$, and the RHS is uniformly $C^1$ in $t$ and linear in the $u_k$’s. By Picard–Lindelöf, a solution exists for $t \in [0, T]$.
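To make the Galerkin system concrete, here is a minimal sketch in Python for the simplest case. The setting is an assumption made for illustration and is not the general one of the theorem: we take the wave equation $u_{tt} - u_{xx} = 0$ on $U = (0, \pi)$ (so $a = 1$ and $b = c = f = 0$) with $\varphi_k(x) = \sqrt{2/\pi} \sin(kx)$. Then (∗) decouples into $\ddot{u}_k + k^2 u_k = 0$, which we solve in closed form rather than with a general ODE solver. The function names are hypothetical.

```python
import math

def sine_coefficients(g, n_modes, n_quad=2000):
    """Approximate (g, phi_k)_{L^2(0, pi)} for phi_k(x) = sqrt(2/pi) sin(k x),
    k = 1..n_modes, using the midpoint rule with n_quad cells."""
    h = math.pi / n_quad
    scale = math.sqrt(2.0 / math.pi)
    coeffs = []
    for k in range(1, n_modes + 1):
        total = sum(g((j + 0.5) * h) * scale * math.sin(k * (j + 0.5) * h)
                    for j in range(n_quad))
        coeffs.append(total * h)
    return coeffs

def galerkin_wave(psi, dpsi, n_modes, t):
    """Return the lists (u_k(t)) and (du_k/dt(t)) of the Galerkin solution u^N.

    Assumes a = 1, b = c = f = 0 on (0, pi), so each mode solves
    u_k'' + k^2 u_k = 0 with u_k(0) = (psi, phi_k), u_k'(0) = (dpsi, phi_k).
    """
    a = sine_coefficients(psi, n_modes)
    b = sine_coefficients(dpsi, n_modes)
    u, du = [], []
    for k in range(1, n_modes + 1):
        u.append(a[k - 1] * math.cos(k * t) + b[k - 1] * math.sin(k * t) / k)
        du.append(-k * a[k - 1] * math.sin(k * t) + b[k - 1] * math.cos(k * t))
    return u, du

def energy(u, du):
    """(1/2) (||u^N_t||^2 + ||u^N_x||^2) in L^2(0, pi), computed from coefficients."""
    return 0.5 * sum(d * d + (k * c) ** 2
                     for k, (c, d) in enumerate(zip(u, du), start=1))

# Example data: psi(x) = x (pi - x), psi'(x) = sin(3 x), N = 6 modes.
u0, du0 = galerkin_wave(lambda x: x * (math.pi - x), lambda x: math.sin(3 * x), 6, 0.0)
u1, du1 = galerkin_wave(lambda x: x * (math.pi - x), lambda x: math.sin(3 * x), 6, 1.7)
print(energy(u0, du0), energy(u1, du1))  # the two values agree up to rounding
```

In this special case the energy of $u^N$ is exactly conserved, which is the simplest instance of the kind of bound we need in order to pass to the limit $N \to \infty$.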

So for each $N$, we have an approximate solution $u^N$ that solves the equation when projected onto $\langle \varphi_1, \ldots, \varphi_N \rangle$. What we need to do is to extract from this solution a genuine weak solution. To do so, we need some estimates to show that the functions $u^N$ converge.

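We multiply (∗) by $e^{-\lambda t} \dot{u}_k(t)$, sum over $k = 1, \ldots, N$, and integrate from $0$ to $\tau \in (0, T)$, and end up with

\[
  \int_0^\tau dt \int_U dx \Big(\ddot{u}^N \dot{u}^N + \sum_{i,j} a^{ij} u^N_{x_i} \dot{u}^N_{x_j} + \sum_i b^i u^N_{x_i} \dot{u}^N + c u^N \dot{u}^N\Big) e^{-\lambda t} = \int_0^\tau dt \int_U dx\, f \dot{u}^N e^{-\lambda t}.
\]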

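As before, we can rearrange this to get $A = B$, where

\[
  A = \int_{U_\tau} dt\,dx \left(\frac{d}{dt}\Big[\frac{1}{2}\Big((\dot{u}^N)^2 + \sum_{i,j} a^{ij} u^N_{x_i} u^N_{x_j} + (u^N)^2\Big) e^{-\lambda t}\Big] + \frac{\lambda}{2}\Big((\dot{u}^N)^2 + \sum_{i,j} a^{ij} u^N_{x_i} u^N_{x_j} + (u^N)^2\Big) e^{-\lambda t}\right)
\]

and

\[
  B = \int_{U_\tau} dt\,dx \Big(\frac{1}{2}\sum_{i,j} \dot{a}^{ij} u^N_{x_i} u^N_{x_j} - \sum_i b^i u^N_{x_i} \dot{u}^N + (1 - c) u^N \dot{u}^N + f \dot{u}^N\Big) e^{-\lambda t}.
\]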

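Integrating in time, and estimating as before, for $\lambda$ sufficiently large, we get

\[
  \int_{\Sigma_\tau} \left((\dot{u}^N)^2 + |Du^N|^2\right) dx + \int_{U_\tau} \left((\dot{u}^N)^2 + |Du^N|^2 + (u^N)^2\right) dx\,dt \leq C\left(\|\psi\|^2_{H^1(U)} + \|\psi'\|^2_{L^2(U)} + \|f\|^2_{L^2(U_T)}\right).
\]

This, in particular, tells us $u^N$ is bounded in $H^1(U_T)$.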

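Since $u^N(0) = \sum_{k=1}^N (\psi, \varphi_k)_{L^2(U)} \varphi_k$, we know this tends to $\psi$ in $H^1(U)$. So for $N$ large enough, we have

\[
  \|u^N\|_{H^1(\Sigma_0)} \leq 2\|\psi\|_{H^1(U)}.
\]

Similarly, $\|\dot{u}^N\|_{L^2(\Sigma_0)} \leq 2\|\psi'\|_{L^2(U)}$.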

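Thus, we can extract a convergent subsequence $u^{N_m} \rightharpoonup u$ in $H^1(U_T)$ for some $u \in H^1(U_T)$ such that

\[
  \|u\|_{H^1(U_T)} \leq C\left(\|\psi\|_{H^1(U)} + \|\psi'\|_{L^2(U)} + \|f\|_{L^2(U_T)}\right).
\]

For convenience, we may relabel the sequence so that in fact $u^N \rightharpoonup u$.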

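To check that $u$ is a solution, suppose $v = \sum_{k=1}^M v_k(t) \varphi_k$ for some $v_k \in H^1((0, T))$ with $v_k(T) = 0$. By definition of $u^N$, we have

\[
  (\ddot{u}^N, v)_{L^2(U)} + \int_{\Sigma_t} \Big(\sum_{i,j} a^{ij} u^N_{x_i} v_{x_j} + \sum_i b^i u^N_{x_i} v + c u^N v\Big) dx = (f, v)_{L^2(U)}.
\]

Integrating $\int_0^T dt$ using $v(T) = 0$, we have

\[
  \int_{U_T} \Big(-u^N_t v_t + \sum_{i,j} a^{ij} u^N_{x_i} v_{x_j} + \sum_i b^i u^N_{x_i} v + c u^N v\Big) dx\,dt - \int_{\Sigma_0} u^N_t v\,dx = \int_{U_T} f v\,dx\,dt.
\]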

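Now note that if $N > M$, then $\int_{\Sigma_0} u^N_t v\,dx = \int_{\Sigma_0} \psi' v\,dx$. Now, passing to the weak limit, we have

\[
  \int_{U_T} \Big(-u_t v_t + \sum_{i,j} a^{ij} u_{x_i} v_{x_j} + \sum_i b^i u_{x_i} v + c u v\Big) dx\,dt - \int_{\Sigma_0} \psi' v\,dx = \int_{U_T} f v\,dx\,dt.
\]

So $u$ satisfies the identity required of a weak solution.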

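Now for $k = 1, \ldots, M$, the map $w \in H^1(U_T) \mapsto \int_{\Sigma_0} w \varphi_k\,dx$ is a bounded linear map, since the trace is bounded in $L^2$. So we conclude that

\[
  \int_{\Sigma_0} u \varphi_k\,dx = \lim_{N \to \infty} \int_{\Sigma_0} u^N \varphi_k\,dx = (\psi, \varphi_k)_{L^2(U)}.
\]

Since this is true for all $\varphi_k$, it follows that $u|_{\Sigma_0} = \psi$, and $v$ of the form considered are dense in the space of $v \in H^1(U_T)$ with $v = 0$ on $\partial^* U_T \cup \Sigma_T$. So we are done.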

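In fact, we have

\[
  \operatorname*{ess\,sup}_{t \in (0, T)} \left(\|\dot{u}\|_{L^2(\Sigma_t)} + \|u\|_{H^1(\Sigma_t)}\right) \leq C \cdot (\text{data}).
\]

So we can say $u \in L^\infty((0, T), H^1(U))$ and $\dot{u} \in L^\infty((0, T), L^2(U))$.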

We would like to improve the regularity of the solution. To motivate how we

are going to do that, let’s go back to the wave equation for a bit.

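Suppose that in fact $u \in C^\infty(\bar{U}_T)$ is a smooth solution to the wave equation with initial conditions $(\psi, \psi')$. We want a quantitative estimate for $u$ in $H^2(\Sigma_t)$. The idea is to differentiate the equation with respect to $t$. Writing $w = u_t$, we get

\begin{align*}
  w_{tt} - \Delta w &= 0\\
  w|_{\Sigma_0} &= \psi'\\
  w_t|_{\Sigma_0} &= \Delta \psi\\
  w|_{\partial^* U_T} &= 0.
\end{align*}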

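By the energy estimate we have for the wave equation, we get

\begin{align*}
  \|w_t\|_{L^2(\Sigma_t)} + \|w\|_{H^1(\Sigma_t)} &\leq C\left(\|\psi'\|_{H^1(U)} + \|\Delta \psi\|_{L^2(U)}\right)\\
  &\leq C\left(\|\psi'\|_{H^1(U)} + \|\psi\|_{H^2(U)}\right).
\end{align*}

So we now have control of $u_{tt}$ in $L^2(\Sigma_t)$ and $u_t$ in $H^1(\Sigma_t)$. But once we know that $u_{tt}$ is controlled in $L^2$, then we can use the elliptic estimate to gain control on the second-order spatial derivatives of $u$. So

\[
  \|u\|_{H^2(\Sigma_t)} \leq C\|\Delta u\|_{L^2(\Sigma_t)} = C\|u_{tt}\|_{L^2(\Sigma_t)}.
\]

So we control all second derivatives of $u$ in terms of the data.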
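As a sanity check on the elliptic estimate in the simplest possible setting, we can take $U = (0, \pi)$ and $u$ a finite sine series, so that $u = 0$ on $\partial U$. This setting, the constant $3$ and the function name below are choices made for illustration only. All three norms can then be read off from the coefficients, and $\|u\|^2_{H^2} \leq 3 \|\Delta u\|^2_{L^2}$ because every frequency satisfies $k \geq 1$.

```python
def h2_and_laplacian_norms_sq(coeffs):
    """For u(x) = sum_k coeffs[k-1] * sqrt(2/pi) * sin(k x) on (0, pi),
    return (||u||_{H^2}^2, ||u''||_{L^2}^2), both computed from the coefficients."""
    l2 = sum(c * c for c in coeffs)                                     # ||u||^2
    d1 = sum((k * c) ** 2 for k, c in enumerate(coeffs, start=1))       # ||u'||^2
    d2 = sum((k * k * c) ** 2 for k, c in enumerate(coeffs, start=1))   # ||u''||^2
    return l2 + d1 + d2, d2

h2_sq, lap_sq = h2_and_laplacian_norms_sq([1.0, -0.5, 0.25])
# Elliptic estimate in this model case: ||u||_{H^2}^2 <= 3 ||u''||_{L^2}^2.
print(h2_sq <= 3.0 * lap_sq)
```

The general estimate needs genuine elliptic regularity theory; the sketch only shows why it is plausible on a domain where the eigenfunctions are explicit.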

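Theorem. If $a^{ij}, b^i, c \in C^2(\bar{U}_T)$ and $\partial U \in C^2$, then for $\psi \in H^2(U)$ and $\psi' \in H^1_0(U)$, and $f, f_t \in L^2(U_T)$, we have

\begin{align*}
  u &\in H^2(U_T) \cap L^\infty((0, T); H^2(U))\\
  u_t &\in L^\infty((0, T); H^1_0(U))\\
  u_{tt} &\in L^\infty((0, T); L^2(U)).
\end{align*}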

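Proof. We return to the Galerkin approximation. Now by assumption, we have a linear system with $C^2$ coefficients. So $u_k \in C^3((0, T))$. Differentiating with respect to $t$ (assuming as we can $f, f_t \in C^0(\bar{U}_T)$), we have

\begin{align*}
  (\partial_t^3 u^N, \varphi_k)_{L^2(U)} &+ \int_{\Sigma_t} \Big(\sum_{i,j} a^{ij} \dot{u}^N_{x_i} (\varphi_k)_{x_j} + \sum_i b^i \dot{u}^N_{x_i} \varphi_k + c \dot{u}^N \varphi_k\Big) dx\\
  &= (\dot{f}, \varphi_k)_{L^2(U)} - \int_{\Sigma_t} \Big(\sum_{i,j} \dot{a}^{ij} u^N_{x_i} (\varphi_k)_{x_j} + \sum_i \dot{b}^i u^N_{x_i} \varphi_k + \dot{c} u^N \varphi_k\Big) dx.
\end{align*}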

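Multiplying by $\ddot{u}_k e^{-\lambda t}$, summing $k = 1, \ldots, N$, integrating $\int_0^\tau dt$, and recalling we already control $u^N$ in $H^1(U_T)$, we get

\begin{align*}
  \sup_{t \in (0, T)} &\left(\|u^N_t\|_{H^1(\Sigma_t)} + \|u^N_{tt}\|_{L^2(\Sigma_t)} + \|u^N_t\|_{H^1(U_T)}\right)\\
  &\leq C\left(\|u^N_t\|_{H^1(\Sigma_0)} + \|u^N_{tt}\|_{L^2(\Sigma_0)} + \|\psi\|_{H^1(\Sigma_0)} + \|\psi'\|_{L^2(\Sigma_0)} + \|f\|_{L^2(U_T)} + \|f_t\|_{L^2(U_T)}\right).
\end{align*}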

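We know

\[
  u^N_t|_{t=0} = \sum_{k=1}^N (\psi', \varphi_k)_{L^2(U)} \varphi_k.
\]

Since the $\varphi_k$ are a basis for $H^1$, we have

\[
  \|u^N_t\|_{H^1(\Sigma_0)} \leq \|\psi'\|_{H^1(\Sigma_0)}.
\]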

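To control $u^N_{tt}$, let us assume for convenience that in fact the $\varphi_k$ are the eigenfunctions of $-\Delta$. From the fact that

\[
  (\ddot{u}^N, \varphi_k)_{L^2(U)} + \int_{\Sigma_t} \Big(\sum_{i,j} a^{ij} u^N_{x_i} (\varphi_k)_{x_j} + \sum_i b^i u^N_{x_i} \varphi_k + c u^N \varphi_k\Big) dx = (f, \varphi_k)_{L^2(U)},
\]

integrate the first term in the integral by parts, multiply by $\ddot{u}_k$, and sum to get

\[
  \|u^N_{tt}\|_{L^2(\Sigma_0)} \leq C\left(\|u^N\|_{H^2(\Sigma_0)} + \|f\|_{L^2(U_T)} + \|f_t\|_{L^2(U_T)}\right).
\]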

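We need to control $\|u^N\|_{H^2(\Sigma_0)}$ by $\|\psi\|_{H^2(\Sigma_0)}$. Using that $\Delta \varphi_k|_{\partial U} = 0$ and that $u^N$ is a finite sum of these $\varphi_k$’s,

\[
  (\Delta u^N, \Delta u^N)_{L^2(\Sigma_0)} = (u^N, \Delta^2 u^N)_{L^2(\Sigma_0)} = (\psi, \Delta^2 u^N)_{L^2(\Sigma_0)} = (\Delta \psi, \Delta u^N)_{L^2(\Sigma_0)}.
\]

So by Cauchy–Schwarz and the elliptic estimate,

\[
  \|u^N\|_{H^2(\Sigma_0)} \leq C\|\Delta u^N\|_{L^2(\Sigma_0)} \leq C\|\psi\|_{H^2(U)}.
\]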

Passing to the weak limit, we conclude that

\begin{align*}
  u_t &\in H^1(U_T)\\
  u_t &\in L^\infty((0, T), H^1_0(U))\\
  u_{tt} &\in L^\infty((0, T), L^2(U)).
\end{align*}

Since $u_{tt} + Lu = f$, by an elliptic estimate on (almost) every constant $t$, we obtain $u \in L^\infty((0, T), H^2(U))$.

We can now understand the equation as holding pointwise almost everywhere

by undoing the integration by parts that gave us the definition of the weak

solution. The initial conditions can also be understood in a trace sense.

Returning to the case $\psi \in H^1_0(U)$ and $\psi' \in L^2(U)$, by approximating in $H^2(U)$ and $H^1_0(U)$ respectively, we can show that a weak solution can be constructed as a strong limit in $H^1(U_T)$. This implies the energy identity, so that in fact weak solutions satisfy

\begin{align*}
  u &\in C^0((0, T); H^1_0(U))\\
  u_t &\in C^0((0, T); L^2(U)).
\end{align*}

This requires slightly stronger regularity assumptions on $a^{ij}$, $b^i$ and $c$. Such solutions are said to be in the energy class.

Finally, note that we can iterate the argument to get higher regularity.

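Theorem. If $a^{ij}, b^i, c \in C^{k+1}(\bar{U}_T)$ and $\partial U$ is $C^{k+1}$, and

\begin{align*}
  \partial_t^i u|_{\Sigma_0} &\in H^1_0(U) & i &= 0, \ldots, k\\
  \partial_t^{k+1} u|_{\Sigma_0} &\in L^2(U)\\
  \partial_t^i f &\in L^2((0, T); H^{k-i}(U)) & i &= 0, \ldots, k,
\end{align*}

then $u \in H^{k+1}(U_T)$ and

\[
  \partial_t^i u \in L^\infty((0, T); H^{k+1-i}(U))
\]

for $i = 0, \ldots, k + 1$.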

In particular, if everything is smooth, then we get a smooth solution.

The first two conditions should be understood as conditions on $\psi$ and $\psi'$, using the fact that the equation allows us to express higher time derivatives in terms of lower time derivatives and spatial derivatives. One can check that these conditions imply $\psi \in H^{k+1}(U)$ and $\psi' \in H^k(U)$, but the conditions we wrote down also encode some compatibility conditions, since we know $u$ ought to vanish at the boundary, hence all time derivatives should.

Those were the standard existence and regularity theorems for hyperbolic

PDEs. However, there are more things to say about hyperbolic equations. The

“physicist’s version” of the wave equation involves a constant c, and says

\[
  \ddot{u} - c^2 \Delta u = 0.
\]

This constant $c$ is the speed of propagation. This tells us that in the wave equation, information propagates at a speed of at most $c$. We can see this very concretely in the 1-dimensional wave equation, where d’Alembert wrote down an explicit solution to the wave equation given by

\[
  u(x, t) = \frac{1}{2}\left(\psi(x - ct) + \psi(x + ct)\right) + \frac{1}{2c} \int_{x - ct}^{x + ct} \psi'(y)\,dy.
\]

Thus, we see that the value of $u$ at any point $(x, t)$ is completely determined by the values of $\psi$ and $\psi'$ in the interval $[x - ct, x + ct]$.

[Figure: the point $(x, t)$ above the line $t = 0$, illustrating its domain of dependence.]
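The dependence on the data in $[x - ct, x + ct]$ only can be checked numerically. The following Python sketch evaluates d’Alembert’s formula with the midpoint rule; the function name and the choice of quadrature are ours. It only ever samples $\psi$ and $\psi'$ inside $[x - ct, x + ct]$, so changing the data outside that interval cannot change $u(x, t)$.

```python
import math

def dalembert(psi, dpsi, c, x, t, n_quad=2000):
    """Evaluate u(x, t) = (psi(x - ct) + psi(x + ct)) / 2
    + (1 / (2c)) * integral of dpsi over [x - ct, x + ct],
    approximating the integral by the midpoint rule. Requires c != 0."""
    lo, hi = x - c * t, x + c * t
    h = (hi - lo) / n_quad
    integral = h * sum(dpsi(lo + (j + 0.5) * h) for j in range(n_quad))
    return 0.5 * (psi(lo) + psi(hi)) + integral / (2.0 * c)

# Data psi = sin, psi' = 0 with c = 2: the exact solution is sin(x) cos(2 t).
u = dalembert(math.sin, lambda y: 0.0, 2.0, 0.3, 0.5)
print(abs(u - math.sin(0.3) * math.cos(1.0)))

# Perturb psi far outside [x - ct, x + ct] = [-0.7, 1.3]: u(x, t) is unchanged.
bumped = lambda y: math.sin(y) + (5.0 if y > 10.0 else 0.0)
print(dalembert(bumped, lambda y: 0.0, 2.0, 0.3, 0.5) == u)
```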

This is true for a general hyperbolic PDE. In this case, the speed of propagation should be measured by the principal symbol $Q(\xi) = \sum_{i,j} a^{ij}(y) \xi_i \xi_j$. The correct way to formulate this result is as follows:

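Let $S_0 \subseteq U$ be an open set with (say) smooth boundary. Let $\tau: S_0 \to [0, T]$ be a smooth function vanishing on $\partial S_0$, and define

\begin{align*}
  D &= \{(t, x) \in U_T : x \in S_0,\ 0 < t < \tau(x)\}\\
  S' &= \{(\tau(x), x) : x \in S_0\}.
\end{align*}

We say $S'$ is spacelike if

\[
  \sum_{i,j=1}^n a^{ij} \tau_{x_i} \tau_{x_j} < 1
\]

for all $x \in S_0$.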
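To get a feel for the spacelike condition, here is a small Python sketch that tests the inequality at a list of sample points. It is only a sketch under simplifying assumptions: the coefficients are taken to be independent of $t$, the helper name `is_spacelike` is hypothetical, and a positive answer only says the inequality holds at the sampled points, not on all of $S_0$.

```python
def is_spacelike(a, grad_tau, points):
    """Check sum_{i,j} a^{ij}(x) tau_{x_i}(x) tau_{x_j}(x) < 1 at each sample point.

    a(x) returns the matrix (a^{ij}(x)) as a nested list (assumed time-independent),
    grad_tau(x) returns the gradient D tau(x) as a list.
    """
    for x in points:
        A, g = a(x), grad_tau(x)
        n = len(g)
        q = sum(A[i][j] * g[i] * g[j] for i in range(n) for j in range(n))
        if not q < 1.0:
            return False
    return True

# 1-dimensional wave equation with speed c = 2, so a^{11} = c^2 = 4,
# on S_0 = (-1, 1) with tau(x) = k (1 - x^2), which vanishes on the boundary.
samples = [-0.99 + 0.01 * j for j in range(199)]
wave = lambda x: [[4.0]]
print(is_spacelike(wave, lambda x: [-2 * 0.2 * x], samples))  # k = 0.2: slope below 1/c
print(is_spacelike(wave, lambda x: [-2 * 0.3 * x], samples))  # k = 0.3: too steep near the edge
```

For $a^{ij} = c^2 \delta^{ij}$ the condition reads $|D\tau| < 1/c$, so the surface $S'$ must be flatter than the characteristic cones of the wave equation.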

Theorem. If $u$ is a weak solution of the hyperbolic problem considered above, and $S'$ is spacelike, then $u|_D$ depends only on $\psi|_{S_0}$, $\psi'|_{S_0}$ and $f|_D$.

The proof is rather similar to the proof of uniqueness of solutions.

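Proof. Returning to the definition of a weak solution, we have

\[
  \int_{U_T} \Big(-u_t v_t + \sum_{i,j=1}^n a^{ij} u_{x_i} v_{x_j} + \sum_{i=1}^n b^i u_{x_i} v + c u v\Big) dx\,dt - \int_{\Sigma_0} \psi' v\,dx = \int_{U_T} f v\,dx\,dt.
\]

By linearity it suffices to show that $u|_D = 0$ if $\psi|_{S_0} = \psi'|_{S_0} = 0$ and $f|_D = 0$. We take as test function

\[
  v(t, x) =
  \begin{cases}
    \int_t^{\tau(x)} e^{-\lambda s} u(s, x)\,ds & (t, x) \in D\\
    0 & (t, x) \notin D.
  \end{cases}
\]

One checks that this is in $H^1(U_T)$, and $v = 0$ on $\Sigma_T \cup \partial^* U_T$, with

\begin{align*}
  v_{x_i} &= \tau_{x_i} e^{-\lambda \tau} u(x, \tau) + \int_t^{\tau(x)} e^{-\lambda s} u_{x_i}(x, s)\,ds\\
  v_t &= -e^{-\lambda t} u(x, t).
\end{align*}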

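Plugging these into the definition of a weak solution, we argue as in the previous uniqueness proof. Then

\begin{align*}
  \int_D &\left(\frac{d}{dt}\Big(\frac{1}{2} u^2 e^{-\lambda t} - \frac{1}{2} \sum_{i,j} a^{ij} v_{x_i} v_{x_j} e^{\lambda t} - \frac{1}{2} v^2 e^{\lambda t}\Big) + \frac{\lambda}{2}\Big(u^2 e^{-\lambda t} + \sum_{i,j} a^{ij} v_{x_i} v_{x_j} e^{\lambda t} + v^2 e^{\lambda t}\Big)\right) dx\,dt\\
  &= \int_D \Big(-\frac{1}{2} \sum_{i,j} \dot{a}^{ij} v_{x_i} v_{x_j} e^{\lambda t} - \sum_i b^i u_{x_i} v - (c - 1) u v\Big) dx\,dt.
\end{align*}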

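Noting that $\int_D dx\,dt = \int_{S_0} dx \int_0^{\tau(x)} dt$, we can perform the $t$ integral of the $\frac{d}{dt}$ term, and we get a contribution from $S'$ which is given by

\[
  I_{S'} = \int_{S_0} \frac{1}{2}\Big(u^2(\tau(x), x) e^{-\lambda \tau(x)} - \sum_{i,j} a^{ij} \tau_{x_i} \tau_{x_j} u^2 e^{-\lambda \tau}\Big) dx.
\]

We have used $v = 0$ on $S'$ and $v_{x_i} = \tau_{x_i} u e^{-\lambda \tau}$ there. Using the definition of a spacelike surface, we have $I_{S'} \geq 0$. The rest of the argument of the uniqueness of solutions goes through to conclude that $u = 0$ on $D$.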

This implies no signal can travel faster than a certain speed. In particular, if

\[
  \sum_{i,j} a^{ij} \xi_i \xi_j \leq \mu |\xi|^2
\]

for some $\mu$, then no signal can travel faster than $\sqrt{\mu}$. This allows us to solve hyperbolic equations on unbounded domains by restricting to bounded domains.