IB Methods - Sturm-Liouville Theory

3 Sturm-Liouville Theory

3.1 Sturm-Liouville operators

In finite dimensions, we often consider linear maps $M: V \to W$. If $\{v_i\}$ is a basis for $V$ and $\{w_a\}$ is a basis for $W$, then we can represent the map by a matrix with entries

\[ M_{ai} = (w_a, M v_i). \]

A map $M: V \to V$ is called self-adjoint if $M^\dagger = M$ as matrices. However, it is not obvious how we can extend this notion to arbitrary maps between arbitrary vector spaces (with an inner product) when they cannot be represented by a matrix.

Instead, we make the following definitions:

Definition (Adjoint and self-adjoint). The adjoint $B$ of a map $A: V \to V$ is a map such that

\[ (Bu, v) = (u, Av) \]

for all vectors $u, v \in V$. A map $M$ is then self-adjoint if

\[ (Mu, v) = (u, Mv). \]

Self-adjoint matrices come with a natural basis. Recall that the eigenvalues of a matrix are the roots of $\det(M - \lambda I) = 0$. The eigenvector $v_i$ corresponding to an eigenvalue $\lambda_i$ is defined by $M v_i = \lambda_i v_i$.

In general, eigenvalues can be any complex number. However, self-adjoint maps have real eigenvalues. Suppose $M v_i = \lambda_i v_i$. Then we have

\[ \lambda_i (v_i, v_i) = (v_i, M v_i) = (M v_i, v_i) = \lambda_i^* (v_i, v_i). \]

So $\lambda_i = \lambda_i^*$.

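Furthermore, eigenvectors with distinct eigenvalues are orthogonal with respect to the inner product. Suppose that $M v_i = \lambda_i v_i$ and $M v_j = \lambda_j v_j$. Then

\[ \lambda_i (v_j, v_i) = (v_j, M v_i) = (M v_j, v_i) = \lambda_j (v_j, v_i). \]

Since $\lambda_i \neq \lambda_j$, we must have $(v_j, v_i) = 0$.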
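The two facts above can be checked on a small example. The following is a minimal sketch, assuming an illustrative 2×2 Hermitian matrix (the entries and the helper names `inner` and `matvec` are not from the notes); it finds the eigenvalues from the characteristic polynomial and confirms that they are real and that the eigenvectors are orthogonal.

```python
import cmath

def inner(u, v):
    """Standard inner product on C^n, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# An illustrative 2x2 Hermitian matrix (equal to its conjugate transpose).
M = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]

# Eigenvalues are the roots of lambda^2 - tr(M) lambda + det(M) = 0.
tr = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
disc = cmath.sqrt(tr * tr - 4 * det)
eigenvalues = [(tr - disc) / 2, (tr + disc) / 2]

# For a 2x2 matrix with M[0][1] != 0, v = (M01, lambda - M00) solves (M - lambda I) v = 0.
eigenvectors = [[M[0][1], lam - M[0][0]] for lam in eigenvalues]

for lam, v in zip(eigenvalues, eigenvectors):
    residual = [a - lam * b for a, b in zip(matvec(M, v), v)]
    print("lambda =", lam, " |Mv - lambda v| =", max(abs(r) for r in residual))

print("(v_1, v_2) =", inner(eigenvectors[0], eigenvectors[1]))
```

The imaginary parts of the eigenvalues and the inner product of the two eigenvectors both come out as zero up to rounding.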

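Knowing the eigenvalues and eigenvectors gives a neat way to solve linear equations of the form

\[ Mu = f. \]

Here we are given $M$ and $f$, and want to find $u$. Of course, the answer is $u = M^{-1} f$. However, if we expand $u = \sum_i u_i v_i$ in terms of eigenvectors, we obtain

\[ Mu = M \sum_i u_i v_i = \sum_i u_i \lambda_i v_i. \]

Expanding $f = \sum_i f_i v_i$ in the same basis, we hence have

\[ \sum_i u_i \lambda_i v_i = \sum_i f_i v_i. \]

Taking the inner product with $v_j$, we know that

\[ u_j = \frac{f_j}{\lambda_j}. \]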
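As a concrete illustration of dividing each component by its eigenvalue, here is a minimal sketch for the real symmetric matrix $M = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. The matrix, the right-hand side and the function name `solve_by_eigenexpansion` are illustrative assumptions, and the eigenvectors are taken to be orthonormal.

```python
import math

def dot(u, v):
    """Real inner product."""
    return sum(a * b for a, b in zip(u, v))

# M = [[2, 1], [1, 2]] has eigenvalues 3 and 1 with these orthonormal eigenvectors.
s = 1 / math.sqrt(2)
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]

def solve_by_eigenexpansion(f):
    """Return u with M u = f, using u_j = f_j / lambda_j in the eigenbasis."""
    u = [0.0, 0.0]
    for lam, v in eigenpairs:
        f_j = dot(v, f)   # component of f along v_j
        u_j = f_j / lam   # needs lam != 0
        u = [ui + u_j * vi for ui, vi in zip(u, v)]
    return u

u = solve_by_eigenexpansion([1.0, 0.0])
print(u)  # agrees with M^{-1} f = (2/3, -1/3) up to rounding
```

The division by `lam` is where a zero eigenvalue would make the method fail, just as $M^{-1}$ would fail to exist.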

So far, these are all things from IA Vectors and Matrices. Sturm-Liouville theory

is the infinite-dimensional analogue.

In our vector space of differentiable functions, our “matrices” would be linear differential operators $\mathcal{L}$. For example, we could have

\[ \mathcal{L} = A_p(x) \frac{d^p}{dx^p} + A_{p-1}(x) \frac{d^{p-1}}{dx^{p-1}} + \cdots + A_1(x) \frac{d}{dx} + A_0(x). \]

It is an easy check that this is in fact linear.

We say $\mathcal{L}$ has order $p$ if the highest derivative that appears is $\frac{d^p}{dx^p}$. In most applications, we will be interested in the case $p = 2$. When will our $\mathcal{L}$ be self-adjoint?

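In the $p = 2$ case, we have

\[ \mathcal{L}y = P \frac{d^2 y}{dx^2} + R \frac{dy}{dx} - Qy = P \left[ \frac{d^2 y}{dx^2} + \frac{R}{P} \frac{dy}{dx} - \frac{Q}{P} y \right] = P \left[ e^{-\int \frac{R}{P} \, dx} \frac{d}{dx} \left( e^{\int \frac{R}{P} \, dx} \frac{dy}{dx} \right) - \frac{Q}{P} y \right]. \]

Let $p = \exp\left( \int \frac{R}{P} \, dx \right)$. Then we can write this as

\[ \mathcal{L}y = P p^{-1} \left[ \frac{d}{dx} \left( p \frac{dy}{dx} \right) - \frac{Q}{P} p y \right]. \]

We further define $q = \frac{Q}{P} p$. We also drop a factor of $P p^{-1}$. Then we are left with

\[ \mathcal{L} = \frac{d}{dx} \left( p(x) \frac{d}{dx} \right) - q(x). \]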
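As a worked instance of this reduction, consider the following sketch. The coefficients $P = x^2$, $R = x$, $Q = n^2$ on $x > 0$ (with $n$ a constant) are an illustrative choice, not one made in the notes.

```latex
\begin{align*}
\mathcal{L}y &= x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} - n^2 y, \\
p(x) &= \exp\left( \int \frac{R}{P} \, dx \right) = \exp\left( \int \frac{dx}{x} \right) = x, \\
q(x) &= \frac{Q}{P} \, p = \frac{n^2}{x}, \qquad P p^{-1} = x, \\
\mathcal{L}y &= x \left[ \frac{d}{dx} \left( x \frac{dy}{dx} \right) - \frac{n^2}{x} \, y \right].
\end{align*}
```

Expanding the bracket in the last line gives $x y' + x^2 y'' - n^2 y$, which recovers the first line. Dropping the overall factor $P p^{-1} = x$ leaves an operator in Sturm-Liouville form with $p(x) = x$ and $q(x) = n^2 / x$.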

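This is the Sturm-Liouville form of the operator. Now let’s compute $(f, \mathcal{L}g)$. We integrate by parts twice to obtain

\[ (f, \mathcal{L}g) = \int_a^b f^* \left( \frac{d}{dx} \left( p \frac{dg}{dx} \right) - qg \right) dx \]

\[ = [f^* p g']_a^b - \int_a^b \left( \frac{df^*}{dx} p \frac{dg}{dx} + f^* q g \right) dx \]

\[ = [f^* p g' - f'^* p g]_a^b + \int_a^b \left( \frac{d}{dx} \left( p \frac{df^*}{dx} \right) - q f^* \right) g \, dx \]

\[ = [(f^* g' - f'^* g) p]_a^b + (\mathcal{L}f, g), \]

assuming that $p, q$ are real.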

So second-order linear differential operators in Sturm-Liouville form are self-adjoint with respect to this inner product if $p, q$ are real and the boundary terms vanish. When do the boundary terms vanish? One possibility is when $p$ is periodic (with the right period), or if we constrain $f$ and $g$ to be periodic.

Example. We can consider a simple case, where

\[ \mathcal{L} = \frac{d^2}{dx^2}. \]

Here we have $p = 1$, $q = 0$. If we ask for functions to be periodic on $[a, b]$, then

\[ \int_a^b f^* \frac{d^2 g}{dx^2} \, dx = \int_a^b \frac{d^2 f^*}{dx^2} g \, dx. \]

Note that it is important that we have a second-order differential operator. If it is first-order, then we would have a negative sign, since we integrated by parts once.
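The sign difference between the first-order and second-order cases can be seen numerically. The following is a minimal sketch, assuming the interval $[0, 1]$, two illustrative periodic test functions with hand-computed derivatives, and a simple rectangle-rule quadrature; none of these choices come from the notes.

```python
import cmath
import math

N = 64           # sample points on [0, 1); an assumed resolution
K = 2 * math.pi  # fundamental angular frequency on [0, 1]

def integrate_periodic(h):
    """Rectangle rule on [0, 1), which is very accurate for smooth periodic integrands."""
    return sum(h(k / N) for k in range(N)) / N

def inner(f, g):
    """Unweighted inner product (f, g): the integral of f* g over one period."""
    return integrate_periodic(lambda x: f(x).conjugate() * g(x))

# f(x) = exp(iKx) + cos(2Kx)/2 and its first two derivatives.
def f(x):
    return cmath.exp(1j * K * x) + 0.5 * math.cos(2 * K * x)

def d1f(x):
    return 1j * K * cmath.exp(1j * K * x) - K * math.sin(2 * K * x)

def d2f(x):
    return -K**2 * cmath.exp(1j * K * x) - 2 * K**2 * math.cos(2 * K * x)

# g(x) = (1 + 2i) exp(iKx) + sin(2Kx) and its first two derivatives.
def g(x):
    return (1 + 2j) * cmath.exp(1j * K * x) + math.sin(2 * K * x)

def d1g(x):
    return (1 + 2j) * 1j * K * cmath.exp(1j * K * x) + 2 * K * math.cos(2 * K * x)

def d2g(x):
    return -(1 + 2j) * K**2 * cmath.exp(1j * K * x) - 4 * K**2 * math.sin(2 * K * x)

second_order_gap = abs(inner(f, d2g) - inner(d2f, g))  # (f, g'') - (f'', g)
first_order_sum = abs(inner(f, d1g) + inner(d1f, g))   # (f, g') + (f', g)
print(second_order_gap, first_order_sum)
```

Both printed numbers vanish up to rounding: $d^2/dx^2$ passes across the inner product unchanged, while $d/dx$ picks up a minus sign.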

Just as in finite dimensions, self-adjoint operators have eigenfunctions and

eigenvalues with special properties. First, we define a more sophisticated inner

product.

Definition (Inner product with weight). An inner product with weight $w$, written $(\,\cdot\,, \,\cdot\,)_w$, is defined by

\[ (f, g)_w = \int_a^b f^*(x) g(x) w(x) \, dx, \]

where $w$ is real, non-negative, and has only finitely many zeroes.

Why do we want a weight $w(x)$? In the future, we might want to work with the unit disk, instead of a square in $\mathbb{R}^2$. When we want to use polar coordinates, we will have to integrate with $r \, dr \, d\theta$, instead of just $dr \, d\theta$. Hence we need the weight of $r$. Also, we allow it to have finitely many zeroes, so that the radius can be 0 at the origin.

Why can’t we have more zeroes? We want the inner product to keep the property that $(f, f)_w = 0$ iff $f = 0$ (for continuous $f$). If $w$ is zero at too many places, then the inner product could be zero without $f$ being zero.

We now define what it means to be an eigenfunction.

Definition (Eigenfunction with weight). An eigenfunction with weight $w$ of $\mathcal{L}$ is a function $y: [a, b] \to \mathbb{C}$ obeying the differential equation

\[ \mathcal{L}y = \lambda w y, \]

where $\lambda \in \mathbb{C}$ is the eigenvalue.

This might be strange at first sight. It seems like we can take any nonsense $y$, apply $\mathcal{L}$ to get some nonsense $\mathcal{L}y$, and then write the result as some nonsense $w$ times our original $y$. So is any function an eigenfunction?

No! There are many constraints $w$ has to satisfy, like being real, non-negative and having only finitely many zeroes. It turns out this severely restricts what values $y$ can take, so not everything will be an eigenfunction. In fact, we can develop this theory without the weight function $w$. However, weight functions are much more convenient when, say, dealing with the unit disk.

Proposition. The eigenvalues of a Sturm-Liouville operator are real.

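Proof. Suppose $\mathcal{L}y_i = \lambda_i w y_i$. Then

\[ \lambda_i (y_i, y_i)_w = \lambda_i (y_i, w y_i) = (y_i, \mathcal{L}y_i) = (\mathcal{L}y_i, y_i) = (\lambda_i w y_i, y_i) = \lambda_i^* (y_i, y_i)_w. \]

Since $(y_i, y_i)_w \neq 0$, we have $\lambda_i = \lambda_i^*$.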

Note that the first and last terms use the weighted inner product, but the

middle terms use the unweighted inner product.

Proposition.

Eigenfunctions with different eigenvalues (but same weight) are

orthogonal.

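Proof. Let $\mathcal{L}y_i = \lambda_i w y_i$ and $\mathcal{L}y_j = \lambda_j w y_j$. Then

\[ \lambda_i (y_j, y_i)_w = (y_j, \mathcal{L}y_i) = (\mathcal{L}y_j, y_i) = \lambda_j (y_j, y_i)_w. \]

Since $\lambda_i \neq \lambda_j$, we must have $(y_j, y_i)_w = 0$.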

Those were pretty straightforward manipulations. However, the main results

of Sturm–Liouville theory are significantly harder, and we will not prove them.

We shall just state them and explore some examples.

Theorem. On a compact domain, the eigenvalues $\lambda_1, \lambda_2, \cdots$ form a countably infinite sequence and are discrete.

This will be a rather helpful result in quantum mechanics, since in quantum

mechanics, the possible values of, say, the energy are the eigenvalues of the

Hamiltonian operator. Then this result says that the possible values of the

energy are discrete and form an infinite sequence.

Note here the word compact. In quantum mechanics, if we restrict a particle to a well $[0, 1]$, then it will have quantized energy levels since the domain is compact. However, if the particle is free, then it can have any energy at all since we no longer have a compact domain. Similarly, angular momentum is quantized, since it describes rotations, which take values in $S^1$, which is compact.

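Theorem. The eigenfunctions are complete: any function $f: [a, b] \to \mathbb{C}$ (obeying appropriate boundary conditions) can be expanded as

\[ f(x) = \sum_n \hat{f}_n y_n(x), \]

where, for eigenfunctions normalized so that $(y_n, y_n)_w = 1$,

\[ \hat{f}_n = \int_a^b y_n^*(x) f(x) w(x) \, dx. \]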

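Example. Let $[a, b] = [-L, L]$, $\mathcal{L} = \frac{d^2}{dx^2}$, $w = 1$, restricting to periodic functions. Then our eigenfunction obeys

\[ \frac{d^2 y_n}{dx^2} = \lambda_n y_n(x). \]

Then our eigenfunctions are

\[ y_n(x) = \exp\left( \frac{i n \pi x}{L} \right) \]

with eigenvalues

\[ \lambda_n = -\frac{n^2 \pi^2}{L^2} \]

for $n \in \mathbb{Z}$. This is just the Fourier series!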
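The exponentials in this example are orthogonal but not of unit norm, so the expansion coefficients pick up a factor of $1/(2L)$. A short worked computation, with $\delta_{nm}$ denoting the Kronecker delta (notation assumed here), makes this explicit:

```latex
\begin{align*}
(y_n, y_m) &= \int_{-L}^{L} e^{-i n \pi x / L} \, e^{i m \pi x / L} \, dx = 2L \, \delta_{nm}, \\
\hat{f}_n &= \frac{(y_n, f)}{(y_n, y_n)} = \frac{1}{2L} \int_{-L}^{L} e^{-i n \pi x / L} f(x) \, dx.
\end{align*}
```

Equivalently, one can rescale to $y_n / \sqrt{2L}$, which have unit norm, and use the coefficient formula without the extra factor.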

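Example (Hermite polynomials). We are going to cheat a little bit and pick our domain to be $\mathbb{R}$. We want to study the differential equation

\[ \frac{1}{2} H'' - x H' = \lambda H, \]

with $H: \mathbb{R} \to \mathbb{C}$. We want to put this in Sturm-Liouville form. We have

\[ p(x) = \exp\left( -\int_0^x 2t \, dt \right) = e^{-x^2}, \]

ignoring constant factors. Then $q(x) = 0$. Multiplying the equation by $2 e^{-x^2}$, we can rewrite this as

\[ \frac{d}{dx} \left( e^{-x^2} \frac{dH}{dx} \right) = 2 \lambda e^{-x^2} H(x). \]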

So we take our weight function to be $w(x) = e^{-x^2}$.

We now ask that $H(x)$ grows at most polynomially as $|x| \to \infty$. In particular, we want $e^{-x^2} H(x)^2 \to 0$. This ensures that the boundary terms from integration by parts vanish at the infinite boundary, so that our Sturm-Liouville operator is self-adjoint.

The eigenfunctions turn out to be

\[ H_n(x) = (-1)^n e^{x^2} \frac{d^n}{dx^n} \left( e^{-x^2} \right). \]

These are known as the Hermite polynomials. Note that these are indeed polynomials. When we differentiate the $e^{-x^2}$ term many times, we get a lot of things from the product rule, but they will always keep an $e^{-x^2}$, which will ultimately cancel with $e^{x^2}$.
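To see these polynomials explicitly, one can use the recurrence $H_{n+1} = 2x H_n - H_n'$, which follows from differentiating the formula for $H_n$ once more. The sketch below stores polynomials as coefficient lists (a representation and helper names chosen here for illustration) and checks the combination $\frac{1}{2} H'' - x H'$ against a multiple of $H$.

```python
def derivative(c):
    """Coefficients of p'(x), where c[k] is the coefficient of x**k."""
    return [k * c[k] for k in range(1, len(c))] or [0]

def times_x(c):
    """Coefficients of x * p(x)."""
    return [0] + c

def sub(a, b):
    """Coefficients of a(x) - b(x), padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [s - t for s, t in zip(a, b)]

def hermite(n):
    """H_n built from H_0 = 1 and H_{k+1} = 2x H_k - H_k'."""
    h = [1]
    for _ in range(n):
        h = sub([2 * t for t in times_x(h)], derivative(h))
    return h

def half_second_minus_x_first(h):
    """Coefficients of (1/2) H'' - x H'."""
    d1 = derivative(h)
    d2 = derivative(d1)
    return sub([t / 2 for t in d2], times_x(d1))

for n in range(1, 5):
    h = hermite(n)
    print(n, h, half_second_minus_x_first(h))
```

For each $n$ shown, the second list is $-n$ times the first, so in the convention $\frac{1}{2} H'' - x H' = \lambda H$ the polynomial $H_n$ has eigenvalue $\lambda = -n$.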

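Just as for matrices, we can use the eigenfunction expansion to solve forced differential equations. For example, we might want to solve

\[ \mathcal{L}g = f(x), \]

where $f(x)$ is a forcing term. We can write this as

\[ \mathcal{L}g = w(x) F(x). \]

We expand our $g$ as

\[ g(x) = \sum_{n \in \mathbb{Z}} \hat{g}_n y_n(x). \]

Then by linearity,

\[ \mathcal{L}g = \sum_{n \in \mathbb{Z}} \hat{g}_n \mathcal{L}y_n(x) = \sum_{n \in \mathbb{Z}} \hat{g}_n \lambda_n w(x) y_n(x). \]

We can also expand our forcing function as

\[ w(x) F(x) = w(x) \left( \sum_{n \in \mathbb{Z}} \hat{F}_n y_n(x) \right). \]

Taking the (regular) inner product with $y_m(x)$, the factor $w(x)$ turns both sides into weighted inner products, and orthogonality of the eigenfunctions gives

\[ \hat{g}_m \lambda_m = \hat{F}_m. \]

This tells us that

\[ \hat{g}_m = \frac{\hat{F}_m}{\lambda_m}. \]

So we have

\[ g(x) = \sum_{n \in \mathbb{Z}} \frac{\hat{F}_n}{\lambda_n} y_n(x), \]

provided all $\lambda_n$ are non-zero.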
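The recipe can be tried on the periodic example, where the operator is $d^2/dx^2$, the weight is $1$ and the eigenfunctions are complex exponentials. The following is a minimal sketch, assuming the domain $[-1, 1]$, an illustrative forcing with zero mean, a truncation to finitely many modes and a rectangle-rule quadrature; the function names are hypothetical. Since the exponentials are not of unit norm, the coefficient is divided by $(y_n, y_n)$.

```python
import cmath
import math

HALF_WIDTH = 1.0  # the domain is [-HALF_WIDTH, HALF_WIDTH]
N_GRID = 64       # quadrature points (assumed resolution)
N_MAX = 8         # keep the modes n = -N_MAX, ..., N_MAX

def y(n, x):
    """Eigenfunction exp(i n pi x / L) of d^2/dx^2 with periodic boundary conditions."""
    return cmath.exp(1j * n * math.pi * x / HALF_WIDTH)

def eigenvalue(n):
    return -((n * math.pi / HALF_WIDTH) ** 2)

def coefficient(F, n):
    """F_hat_n = (y_n, F) / (y_n, y_n), using the rectangle rule on a periodic grid."""
    dx = 2 * HALF_WIDTH / N_GRID
    xs = [-HALF_WIDTH + k * dx for k in range(N_GRID)]
    overlap = sum(y(n, x).conjugate() * F(x) for x in xs) * dx
    return overlap / (2 * HALF_WIDTH)

def solve_forced(F):
    """Return g with g'' = F, built from g_hat_n = F_hat_n / lambda_n."""
    coeffs = {n: coefficient(F, n) for n in range(-N_MAX, N_MAX + 1)}
    if abs(coeffs[0]) > 1e-12:
        raise ValueError("F has a component along the zero mode, where lambda_0 = 0")
    def g(x):
        total = sum(c / eigenvalue(n) * y(n, x) for n, c in coeffs.items() if n != 0)
        return total.real
    return g

def forcing(x):
    return math.cos(math.pi * x) + 3.0 * math.sin(2.0 * math.pi * x)

g = solve_forced(forcing)
exact = lambda x: -math.cos(math.pi * x) / math.pi**2 - 3.0 * math.sin(2.0 * math.pi * x) / (4.0 * math.pi**2)
print(g(0.3), exact(0.3))
```

The check on the $n = 0$ coefficient reflects the proviso above: here $\lambda_0 = 0$, so the method only works when the forcing has no constant component.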

This is a systematic way of solving forced differential equations. We used to

solve these by “being smart”. We just looked at the forcing term and tried to

guess what would work. Unsurprisingly, this approach does not succeed all the

time. Thus it is helpful to have a systematic way of solving the equations.

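It is often helpful to rewrite this into another form, using the fact that $\hat{F}_n = (y_n, F)_w$. Writing $\Omega$ for the domain, we have

\[ g(x) = \sum_{n \in \mathbb{Z}} \frac{1}{\lambda_n} (y_n, F)_w \, y_n(x) = \int_\Omega \sum_{n \in \mathbb{Z}} \frac{1}{\lambda_n} y_n^*(t) y_n(x) w(t) F(t) \, dt. \]

Note that we swapped the sum and the integral, which is in general a dangerous thing to do, but we don’t really care because this is an applied course. We can further write the above as

\[ g(x) = \int_\Omega G(x, t) F(t) w(t) \, dt, \]

where $G(x, t)$ is the infinite sum

\[ G(x, t) = \sum_{n \in \mathbb{Z}} \frac{1}{\lambda_n} y_n^*(t) y_n(x). \]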

We call this the Green’s function. Note that this depends on $\lambda_n$ and $y_n$ only. It depends on the differential operator $\mathcal{L}$, but not the forcing term $F$. We can think of this as something like the “inverse matrix”, which we can use to solve the forced differential equation for any forcing term.

Recall that for a matrix, the inverse exists if the determinant is non-zero,

which is true if the eigenvalues are all non-zero. Similarly, here a necessary

condition for the Green’s function to exist is that all the eigenvalues are non-zero.
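To make the eigenfunction sum for the Green’s function concrete, the sketch below uses a setting that is not in the notes and is assumed purely for illustration: $\mathcal{L} = d^2/dx^2$ on $[0, 1]$ with $y(0) = y(1) = 0$ and $w = 1$. The normalized eigenfunctions are then $y_n(x) = \sqrt{2} \sin(n \pi x)$ with $\lambda_n = -n^2 \pi^2$ for $n \geq 1$, and solving $g'' = \delta(x - t)$ directly gives the closed form $G(x, t) = -\min(x, t) \, (1 - \max(x, t))$ to compare against.

```python
import math

def green_truncated(x, t, n_terms=2000):
    """Partial sum of G(x, t) = sum_n y_n(t) y_n(x) / lambda_n for the sine eigenfunctions."""
    total = 0.0
    for n in range(1, n_terms + 1):
        lam = -((n * math.pi) ** 2)
        # y_n(t) y_n(x) = 2 sin(n pi t) sin(n pi x); the eigenfunctions are real.
        total += 2.0 * math.sin(n * math.pi * t) * math.sin(n * math.pi * x) / lam
    return total

def green_closed_form(x, t):
    """Solution of g'' = delta(x - t) with g(0) = g(1) = 0 (assumed boundary conditions)."""
    return -min(x, t) * (1.0 - max(x, t))

for x, t in [(0.5, 0.5), (0.2, 0.7), (0.9, 0.1)]:
    print(x, t, green_truncated(x, t), green_closed_form(x, t))
```

All eigenvalues are non-zero in this setting, so the sum is well defined. The terms decay like $1/n^2$, which is why a couple of thousand terms already agree with the closed form to about three decimal places.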

We now have a second version of Parseval’s theorem.

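Theorem (Parseval’s theorem II).

\[ (f, f)_w = \sum_{n \in \mathbb{Z}} |\hat{f}_n|^2. \]

Proof. We have

\[ (f, f)_w = \int_\Omega f^*(x) f(x) w(x) \, dx \]

\[ = \sum_{n, m \in \mathbb{Z}} \int_\Omega \hat{f}_n^* y_n^*(x) \hat{f}_m y_m(x) w(x) \, dx \]

\[ = \sum_{n, m \in \mathbb{Z}} \hat{f}_n^* \hat{f}_m (y_n, y_m)_w \]

\[ = \sum_{n \in \mathbb{Z}} |\hat{f}_n|^2. \]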
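The last step of the proof uses $(y_n, y_m)_w = \delta_{nm}$, so the eigenfunctions must have unit norm. A minimal numerical check, assuming the periodic setting on $[-1, 1]$ with $w = 1$, the rescaled eigenfunctions $e^{i n \pi x} / \sqrt{2}$ and an illustrative choice of $f$:

```python
import cmath
import math

N_GRID = 64  # quadrature points on [-1, 1); an assumed resolution
N_MAX = 5    # modes n = -N_MAX, ..., N_MAX

def integrate(h):
    """Rectangle rule on the periodic interval [-1, 1)."""
    dx = 2.0 / N_GRID
    return sum(h(-1.0 + k * dx) for k in range(N_GRID)) * dx

def y(n, x):
    """Orthonormal eigenfunction exp(i n pi x) / sqrt(2), with weight w = 1."""
    return cmath.exp(1j * n * math.pi * x) / math.sqrt(2.0)

def f(x):
    return 3.0 + 2.0 * math.cos(math.pi * x) - math.sin(3.0 * math.pi * x)

# Left-hand side: (f, f).
norm_squared = integrate(lambda x: abs(f(x)) ** 2)

# Right-hand side: sum of |f_hat_n|^2 with f_hat_n = (y_n, f).
coeffs = [integrate(lambda x, n=n: y(n, x).conjugate() * f(x))
          for n in range(-N_MAX, N_MAX + 1)]
sum_of_squares = sum(abs(c) ** 2 for c in coeffs)

print(norm_squared, sum_of_squares)
```

Both sides equal $23$ here: $18$ from the constant term, $4$ from the cosine and $1$ from the sine. With the unnormalized exponentials the right-hand side would need an extra factor of $(y_n, y_n) = 2$ in each term.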