Part III Hydrodynamic Stability
Based on lectures by C. P. Caulfield
Notes taken by Dexter Chua
Michaelmas 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
Developing an understanding by which “small” perturbations grow, saturate and modify
fluid flows is central to addressing many challenges of interest in fluid mechanics.
Furthermore, many applied mathematical tools of much broader relevance have been
developed to solve hydrodynamic stability problems, and hydrodynamic stability theory
remains an exceptionally active area of research, with several exciting new developments
being reported over the last few years.
In this course, an overview of some of these recent developments will be presented.
After an introduction to the general concepts of flow instability, presenting a range of
examples, the major content of this course will be focussed on the broad class of flow
instabilities where velocity “shear” and fluid inertia play key dynamical roles. Such
flows, typically characterised by sufficiently“high” Reynolds number
Ud/ν
, where
U
and
d
are characteristic velocity and length scales of the flow, and
ν
is the kinematic
viscosity of the fluid, are central to modelling flows in the environment and industry.
They typically demonstrate the key role played by the redistribution of vorticity within
the flow, and such vortical flow instabilities often trigger the complex, yet hugely
important process of “transition to turbulence”.
A hierarchy of mathematical approaches will be discussed to address a range of
“stability” problems, from more traditional concepts of “linear” infinitesimal normal
mode perturbation energy growth on laminar parallel shear flows to transient, inherently
nonlinear perturbation growth of general measures of perturbation magnitude over
finite time horizons where flow geometry and/or fluid properties play a dominant
role. The course will also discuss in detail physical interpretations of the various flow
instabilities considered, as well as the industrial and environmental application of the
results of the presented mathematical analyses
Pre-requisites
Elementary concepts from undergraduate real analysis. Some knowledge of complex
analysis would be advantageous (e.g. the level of IB Complex Analysis/Methods). No
knowledge of functional analysis is assumed.
Contents
1 Linear stability analysis
1.1 Rayleigh–Taylor instability
1.2 Rayleigh–B´enard convection
1.3 Classical Kelvin–Helmholtz instability
1.4 Finite depth shear flow
1.5 Stratified flows
2 Absolute and convective instabilities
3 Transient growth
3.1 Motivation
3.2 A toy model
3.3 A general mathematical framework
3.4 Orr-Sommerfeld and Squire equations
4 A variational point of view
1 Linear stability analysis
1.1 Rayleigh–Taylor instability
In this section, we would like to investigate the stability of a surface between
two fluids:
η
ρ
2
ρ
1
We assume the two fluids have densities
ρ
1
and
ρ
2
, and are separated by a
smooth interface parametrized by the deviation η.
Recall that the Navier–Stokes equations for an incompressible fluid are
ρ
u
t
+ u ·u
= −∇P gρ
ˆ
z + µ
2
u, ·u = 0.
Usually, when doing fluid mechanics, we assume density is constant, and divide
the whole equation across by
ρ
. We can then forget the existence of
ρ
and carry
on. We can also get rid of the gravity term. We know that the force of gravity
is balanced out by a hydrostatic pressure gradient. More precisely, we can write
P = P
h
(z) + p(x, t),
where P
h
satisfies
P
h
z
= gρ.
We can then write our equation as
u
t
+ u ·u = −∇
p
ρ
+ ν
2
u.
To do all of these, we must assume that the density
ρ
is constant, but this is
clearly not the case when we have tow distinct fluids.
Since it is rather difficult to move forward without making any simplifying
assumptions, we shall focus on the case of inviscid fluids, so that we take
ν
= 0.
We will also assume that the fluid is irrotational. In this case, it is convenient to
use the vector calculus identity
u ·u =
1
2
|u|
2
u ×(× u),
where we can now drop the second term. Moreover, since
× u
= 0, Stokes’
theorem allows us to write
u
=
φ
for some velocity potential
φ
. Finally, in
each separate region, the density ρ is constant. So we can now write
ρ
φ
t
+
1
2
ρ|∇φ|
2
+ P + gρz
= 0.
This is Bernoulli’s theorem.
Applying these to our scenario, since we have two fluids, we have two separate
velocity potentials
φ
1
, φ
2
for the two two regions. Both of these independently
satisfy the incompressibility hypothesis
2
φ
1,2
= 0.
Bernoulli’s theorem tells us the quantity
ρ
φ
t
+
1
2
ρ|∇φ|
2
+ P + gρz
should be constant across the fluid. Our investigation will not use the full
equation. All we will need is that this quantity matches at the interface, so that
we have
ρ
1
φ
1
t
+
1
2
ρ
1
|∇φ
1
|
2
+ P
1
+ gρ
1
η
z=η
= ρ
2
φ
2
t
+
2
2
ρ
2
|∇φ
2
|
2
+ P
2
+ gρ
2
η
z=η
.
To understand the interface, we must impose boundary conditions. First of all
the vertical velocities of the fluids must match with the interface, so we impose
the kinematic boundary condition
φ
1
z
z=η
=
φ
2
z
z=η
=
Dη
Dt
,
where
D =
t
+ u ·
We also make another, perhaps dubious boundary condition, namely that the
pressure is continuous along the interface. This does not hold at all interfaces.
For example, if we have a balloon, then the pressure inside is greater than the
pressure outside, since that is what holds the balloon up. In this case, it is the
rubber that is exerting a force to maintain the pressure difference. In our case,
we only have two fluids meeting, and there is no reason to assume a discontinuity
in pressure, and thus we shall assume it is continuous. If you are not convinced,
this is good, and we shall later see this is indeed a dubious assumption.
But assuming it for the moment, this allows us to simplify our Bernoulli’s
theorem to give
ρ
1
φ
1
t
+
1
2
ρ
1
|∇φ
1
|
2
+ gρ
1
η = ρ
2
φ
2
t
+
2
2
ρ
2
|∇φ
2
|
2
+ gρ
2
η.
There is a final boundary condition that is specific to our model. What we are
going to do is that we will start with a flat, static solution with
u
= 0 and
η
= 0.
We then perturb
η
a bit, and see what happens to our fluid. In particular, we
want to see if the perturbation is stable.
Since we expect the interesting behaviour to occur only near the interface,
we make the assumption that there is no velocity in the far field, i.e.
φ
1
0
as
z
and
φ
2
0 as
z −∞
(and similarly for the derivatives). For
simplicity, we also assume there is no y dependence.
We now have equations and boundary conditions, so we can solve them. But
these equations are pretty nasty and non-linear. At this point, one sensible
approach is to linearize the equations by assuming everything is small. In addition
to assuming that
η
is small, we also need to assume that various derivatives such
as
φ
are small, so that we can drop all second-order terms. Since
η
is small,
the value of, say,
φ
1
z
at
η
should be similar to that at
η
= 0. Since
φ
1
z
itself is
already assumed to be small, the difference would be second order in smallness.
So we replace all evaluations at
η
with evaluations at
z
= 0. We are then left
with the collection of 3 equations
2
φ
1,2
= 0
φ
1,2
z
z=0
= η
t
ρ
1
φ
1
t
+ gρ
1
η
z=0
= ρ
2
φ
2
t
+ gρ
2
η
z=0
.
This is a nice linear problem, and we can analyze the Fourier modes of the
solutions. We plug in an ansatz
φ
1,2
(x, z, t) =
ˆ
φ
1,2
(z)e
i(kxωt)
η(x, t) = Be
i(kxωt)
.
Substituting into Laplace’s equation gives
ˆ
φ
1,2
k
2
ˆ
φ
1,2
= 0.
Using the far field boundary conditions, we see that we have a family of solutions
ˆ
φ
1
= A
1
e
kz
,
ˆ
φ
2
= A
2
e
kz
.
The kinematic boundary condition tells us we must have
ˆ
φ
0
1,2
(0) = B.
We can solve this to get
B =
kA
1
=
kA
2
.
In particular, we must have A A
1
= A
2
. We can then write
η =
kA
e
ik(kxωt)
.
Plugging these into the final equation gives us
ρ
1
(A) + gρ
1
kA
= ρ
2
A + gρ
2
kA
.
Crucially, we can cancel the
A
throughout the equation, and gives us a result
independent of
A
. This is, after all, the point of linearization. Solving this gives
us a dispersion relation relating the frequency (or phase speed
c
p
=
ω/k
) to the
wavenumber:
ω
2
=
g(ρ
2
ρ
1
)k
ρ
1
+ ρ
2
.
If
ρ
2
ρ
1
, then this reduces to
ω
2
gk
, and this is the usual dispersion relation
for deep water waves. On the other hand, if
ρ
2
ρ
1
>
0 is small, this can lead
to waves of rather low frequency, which is something we can observe in cocktails,
apparently.
But nothing in our analysis actually required
ρ
2
> ρ
1
. Suppose we had
ρ
1
> ρ
2
. This amounts to putting the heavier fluid on top of the lighter one.
Anyone who has tried to turn a cup of water upside down will know this is highly
unstable. Can we see this from our analysis?
If
ρ
1
< ρ
2
, then
ω
has to be imaginary. We shall write it as
ω
=
±
, where
σ > 0. We can then compute
σ =
s
g(ρ
1
ρ
2
)k
ρ
1
+ ρ
2
.
Writing out our expression for
φ
1,2
, we see that there are
e
±σt
terms, and the
e
σt
term dominates in the long run, causing
φ
1,2
to grow exponentially. This is
the Rayleigh–Taylor instability.
There is more to say about the instability. As the wavelength decreases,
k
increases, and we see that
σ
increases as well. Thus, we see that short-scale
perturbations blow up exponentially much more quickly, which means the system
is not very well-posed. This is the ultra-violet catastrophe. Of course, we should
not trust our model here. Recall that in our simplifying assumptions, we not
only assumed that the amplitudes of our
φ
and
η
were small, but also that their
derivatives were small. The large
k
behaviour gives us large
x
derivatives, and
so we have to take into account the higher order terms as well.
But we can also provide physical reasons for why small scales perturbations
should be suppressed by such a system. In our model, we assumed there is no
surface tension. Surface tension quantifies the tendency of interfaces between
fluids to minimize surface area. We know that small-scale variations will cause
a large surface area, and so the surface tension will suppress these variations.
Mathematically, what the surface allows for is a pressure jump across the
interface.
Surface tension is quantified by
γ
, the force per unit length. This has
dimensions [
γ
] =
MT
1
. Empirically, we observe that the pressure difference
across the interface is
p = γ ·
ˆ
n = 2γH = γ
1
R
x
+
1
R
y
,
where
ˆ
n
is the unit normal and
H
is the mean curvature. This is an empirical
result.
Again, we assume no dependence in the
y
direction, so we have a cylindrical
interface with a single radius of curvature. Linearizing, we have a pressure
difference of
(P
2
P
1
)|
z=η
=
γ
R
x
= γ
2
η
x
2
1 +
η
x
2
3/2
γ
2
η
x
2
.
Therefore the (linearized) dynamic boundary condition becomes
ρ
1
φ
1
t
+ gρ
1
η
z=0
+ γ
2
η
x
2
= ρ
2
φ
2
t
+ gρ
2
η
z=0
.
If we run the previous analysis again, we find a dispersion relation of
ω
2
=
g(ρ
2
ρ
1
)k + γk
3
ρ
1
+ ρ
2
.
Since
γ
is always positive, even if we are in the situation when
ρ
1
> ρ
2
, for
k
sufficiently large, the system is stable. Of course, for small
k
, the system is still
unstable we still expect water to fall out even with surface tension. In the
case ρ
1
< ρ
2
, what we get is known as internal gravity-capillary waves.
In the unstable case, we have
σ
2
= k
g(ρ
2
ρ
1
)
ρ
1
+ ρ
2
(1 l
2
c
k
2
),
where
l
2
c
=
γ
g(ρ
1
ρ
2
)
is a characteristic length scale. For
k`
c
>
1, the oscillations are stable, and the
maximum value of k is attained at k =
l
c
3
.
In summary, we have a range of wavelength, and we have identified a “most
unstable wavelength”. Over time, if all frequencies modes are triggered, we
should expect this most unstable wavelength to dominate. But it is rather
hopeful thinking, because this is the result of linear analysis, and we can’t expect
it to carry on to the non-linear regime.
1.2 Rayleigh–B´enard convection
The next system we want to study is something that looks like this:
T
0
T
0
+ TT
0
+ T
d
There is a hot plate at the bottom, a cold plate on top, and some fluid in between.
We would naturally expect heat to transmit from the bottom to the top. There
are two ways this can happen:
Conduction: Heat simply diffuses from the bottom to the top, without any
fluid motion.
Convection: The fluid at the bottom heats up, expands, becomes lighter,
and moves to the top.
The first factor is controlled purely by thermal diffusivity
κ
, while the latter
is mostly controlled by the viscosity
ν
. It is natural to expect that when the
temperature gradient
T
is small, most of the heat transfer is due to conduction,
as there isn’t enough force to overcome the viscosity to move fluid around. When
T
is large, the heat transfer will be mostly due to conduction, and we would
expect such a system to be unstable.
To understand the system mathematically, we must honestly deal with the
case where we have density variations throughout the fluid. Again, recall that
the Navier–Stokes equation
ρ
u
t
+ u · u
= −∇P gρ
ˆ
z + µ
2
u, · u = 0.
The static solution to our system is the pure conduction solution, where
u
= 0,
and there is a uniform vertical temperature gradient across the fluid. Since the
density is a function of temperature, which in the static case is just a function
of
z
, we know
ρ
=
ρ
(
z
). When we perturb the system, we allow some horizontal
and time-dependent fluctuations, and assume we can decompose our density
field into
ρ = ρ
h
(z) + ρ
0
(x, t).
We can then define the hydrostatic pressure by the equation
dP
h
dz
= gρ
h
.
This then allows us to decompose the pressure as
P = P
h
(z) + p
0
(x, t).
Then our equations of motion become
u
t
+ u · u =
1
ρ
p
0
gρ
0
ρ
ˆ
z + ν
2
u, · u = 0
We have effectively “integrated out” the hydrostatic part, and it is now the
deviation from the average that leads to buoyancy forces.
An important component of our analysis involves looking at the vorticity.
Indeed, vorticity necessarily arises if we have non-trivial density variations. For
concreteness, suppose we have a interface between two fluids with different
densities, ρ and ρ + ρ. If the interface is horizontal, then nothing happens:
ρ
ρ + ρ
However, if the interface is tilted, then there is a naturally induced torque:
ρ
ρ + ρ
In other words, vorticity is induced. If we think about it, we see that the reason
this happens is that the direction of the density gradient does not align with the
direction of gravity. More precisely, this is because the density gradient does not
align with the pressure gradient.
Let’s try to see this from our equations. Recall that the vorticity is defined
by
ω = × u.
Taking the curl of the Navier–Stokes equation, and doing some vector calculus,
we obtain the equation.
ω
t
+ u · ω = ω ·u +
1
ρ
2
ρ × P + ν
2
ω
The term on the left hand side is just the material derivative of the vorticity. So
we should interpret the terms on the right-hand-side as the terms that contribute
to the change in vorticity.
The first and last terms on the right are familiar terms that have nothing to
do with the density gradient. The interesting term is the second one. This is
what we just described whenever the pressure gradient and density gradient
do not align, we have a baroclinic torque.
In general, equations are hard to solve, and we want to make approximations.
A common approximation is the Boussinesq approximation. The idea is that
even though the density difference is often what drives the motion, from the
point of view of inertia, the variation in density is often small enough to be
ignored. To take some actual, physical examples, salt water is roughly 4% more
dense than fresh water, and every 10 degrees Celsius changes the density of air
by approximately 4%.
Thus, what we do is that we assume that the density is constant except in
the buoyancy force. The mathematically inclined people could think of this as
taking the limit g but ρ
0
0 with gρ
0
remaining finite.
Under this approximation, we can write our equations as
u
t
+ u · u =
1
ρ
0
p
0
g
0
ˆ
z + ν
2
u
ω
t
+ u · ω = ω ·u +
g
ρ
0
ˆ
z × ρ + ν
2
ω,
where we define the reduced gravity to be
g
0
=
gρ
0
ρ
0
Recall that our density is to be given by a function of temperature
T
. We
must write down how we want the two to be related. In reality, this relation is
extremely complicated, and may be affected by other external factors such as
salinity (in the sea). However, we will use a “leading-order” approximation
ρ = ρ
0
(1 α(T T
0
)).
We will also need to know how temperature
T
evolves in time. This is simply
given by the diffusion equation
T
t
+ u · T = κ
2
T.
Note that in thermodynamics, temperature and pressure are related quantities.
In our model, we will completely forget about this. The only purpose of pressure
will be to act as a non-local Lagrange multiplier that enforces incompressibility.
There are some subtleties with this approximation. Inverting the relation,
T =
1
α
1
ρ
ρ
0
+ T
0
.
Plugging this into the diffusion equation, we obtain
ρ
t
+ u · ρ = κ
2
ρ.
The left-hand side is the material derivative of density. So this equation is saying
that density, or mass, can “diffuse” along the fluid, independent of fluid motion.
Mathematically, this follows from the fact that density is directly related to
temperature, and temperature diffuses.
This seems to contradict the conservation of mass. Recall that the conserva-
tion of mass says
ρ
t
+ · (uρ) = 0.
We can than expand this to say
ρ
t
+ u · ρ = ρ · u.
We just saw that the left-hand side is given by
κ
2
ρ
, which can certainly be
non-zero, but the right-hand side should vanish by incompressibility! The issue
is that here the change in
ρ
is not due to compressing the fluid, but simply
thermal diffusion.
If we are not careful, we may run into inconsistencies if we require
· u
= 0.
For our purposes, we will not worry about this too much, as we will not use the
conservation of mass.
We can now return to our original problem. We have a fluid of viscosity
ν
and thermal diffusivity
κ
. There are two plates a distance
d
apart, with the
top held at temperature
T
0
and bottom at
T
0
+
T
. We make the Boussinesq
approximation.
T
0
T
0
+ TT
0
+ T
d
We first solve for our steady state, which is simply
U = 0
T
h
= T
0
T
(z d)
d
ρ
h
= ρ
0
1 + αT
z d
d
P
h
= P
0
gρ
0
z +
αT z
2d
(z 2d)
,
In this state, the only mode of heat transfer is conduction. What we would
like to investigate, of course, is whether this state is stable. Consider small
perturbations
u = U + u
0
T = T
h
+ θ
P = P
h
+ p
0
.
We substitute these into the Navier–Stokes equation. Starting with
u
t
+ u · u =
1
ρ
0
p
0
+ gαθ
ˆ
z + ν
2
u,
we assume the term u · u will be small, and end up with the equation
u
0
t
=
1
ρ
0
p
0
+ gαθ
ˆ
z + ν
2
u,
together with the incompressibility condition · u
0
= 0.
Similarly, plugging these into the temperature diffusion equation and writing
u = (u, v, w) gives us
θ
t
w
0
T
d
= κ
2
θ.
We can gain further understanding of these equations by expressing them in
terms of dimensionless quantities. We introduce new variables
˜
t =
κt
d
2
˜
x =
x
d
˜
θ =
θ
T
˜p =
d
2
p
0
ρ
0
κ
2
In terms of these new variables, our equations of motion become rather elegant:
˜
θ
˜
t
˜w =
˜
2
˜
θ
˜
u
˜
t
=
˜
˜p +
gαT d
3
νκ
ν
κ
˜
θ
ˆ
z +
ν
κ
˜
2
˜
u
Ultimately, we see that the equations of motion depend on two dimensionless
constants: the Rayleigh number and Prandtl number
Ra =
gαT d
3
νκ
Pr =
ν
κ
,
These are the two parameters control the behaviour of the system. In particular,
the Prandtl number measures exactly the competition between viscous and
diffusion forces. Different fluids have different Prandtl numbers:
In a gas, then
ν
κ
1, since both are affected by the mean free path of a
particle.
In a non-metallic liquid, the diffusion of heat is usually much slower than
that of momentum, so Pr 10.
In a liquid metal,
Pr
is very very low, since, as we know, metal transmits
heat quite well.
Finally, the incompressibility equation still reads
˜
·
˜
u = 0
Under what situations do we expect an unstable flow? Let’s start with some
rather more heuristic analysis. Suppose we try to overturn our system as shown
in the diagram:
Suppose we do this at a speed
U
. For each
d × d × d
block, the potential
energy released is
P E g · d · (d
3
ρ) gρ
0
αd
4
T,
and the time scale for this to happen is τ d/U .
What is the friction we have to overcome in order to achieve this? The
viscous stress is approximately
µ
U
z
µ
U
d
.
The force is the integral over the area, which is
µU
d
d
2
. The work done, being
R
F dz, then evaluates to
µU
d
· d
2
· d = µU d
2
.
We can figure out
U
by noting that the heat is diffused away on a time scale
of
τ d
2
. For convection to happen, it must occur on a time scale at least
as fast as that of diffusion. So we have
U κ/d
. All in all, for convection to
happen, we need the potential energy gain to be greater than the work done,
which amounts to
gρ
0
αT d
4
& ρ
0
νκd.
In other words, we have
Ra =
gαT d
3
νκ
=
g
0
d
3
νκ
& 1.
Now it is extremely unlikely that
Ra
1 is indeed the correct condition, as
our analysis was not very precise. However, it is generally true that instability
occurs only when Ra is large.
To work this out, let’s take the curl of the Navier–Stokes equation we
previously obtained, and dropping the tildes, we get
ω
t
= RaPrθ ×
ˆ
z + Pr
2
ω
This gets rid of the pressure term, so that’s one less thing to worry about. But
this is quite messy, so we take the curl yet again to obtain
t
2
u = RaPr
2
θ
ˆ
z
θ
z

+ Pr
4
u
Reading off the z component, we get
t
2
w = RaPr
2
x
2
+
2
y
2
θ + Pr
4
w.
We will combine this with the temperature equation
θ
t
w =
2
θ
to understand the stability problem.
We first try to figure out what boundary conditions to impose. We’ve got a
6th order partial differential equation (4 from the Navier–Stokes and 2 from the
temperature equation), and so we need 6 boundary conditions. It is reasonable
to impose that w = θ = 0 at z = 0, 1, as this gives us impermeable boundaries.
It is also convenient to assume the top and bottom surfaces are stress-free:
u
z
=
v
z
= 0, which implies
z
u
x
+
v
y
= 0.
This then gives us
2
w
z
2
= 0. This is does not make much physical sense, because
our fluid is viscous, and we should expect no-slip boundary conditions, not
no-stress. But they are mathematically convenient, and we shall assume them.
Let’s try to look for unstable solutions using separation of variables. We see
that the equations are symmetric in x and y, and so we write our solution as
w = W (z)X(x, y)e
σt
θ = Θ(z)X(x, y)e
σt
.
If we can find solutions of this form with σ > 0, then the system is unstable.
What follows is just a classic separation of variables. Plugging these into the
temperature equation yields
d
2
dz
2
σ +
2
x
2
+
2
y
2

XΘ = XW.
Or equivalently
d
2
dz
2
σ
XΘ + XW =
2
x
2
+
2
y
2
XΘ.
We see that the differential operator on the left only acts in Θ, while those on
the right only act on X. We divide both sides by XΘ to get
d
2
dy
2
σ
Θ + W
Θ
=
d
2
dx
2
+
d
2
dy
2
X
X
.
Since the LHS is purely a function of
z
, and the RHS is purely a function of
X
,
they must be constant, and by requiring that the solution is bounded, we see
that this constant must be positive. Our equations then become
2
x
2
+
2
y
2
X = λ
2
X
d
2
dz
2
λ
2
σ
Θ = W.
We now look at our other equation. Plugging in our expressions for
w
and
θ
,
and using what we just obtained, we have
σ
d
2
dz
2
λ
2
W = RaPrλ
2
Θ + Pr
d
2
dz
2
λ
2
2
W.
On the boundary, we have Θ = 0 =
W
=
2
W
z
2
= 0, by assumption. So it follows
that we must have
d
4
W
dz
4
z=0,1
= 0.
We can eliminate Θ by letting
d
2
dz
2
σ λ
2
act on both sides, which converts
the Θ into W . We are then left with a 6th order differential equation
d
2
dz
2
σ λ
2
Pr
d
2
dz
2
λ
2
2
σ
d
2
dz
2
λ
2
!
W = RaPrλ
2
W.
This is an eigenvalue problem, and we see that our operator is self-adjoint. We
can factorize and divide across by Pr to obtain
d
2
dz
2
λ
2
d
2
dz
2
σ λ
2
d
2
dz
2
λ
2
σ
Pr
W = Raλ
2
W.
The boundary conditions are that
W =
d
2
W
dz
2
=
d
4
W
dx
4
= 0 at z = 0, 1.
We see that any eigenfunction of
d
2
dz
2
gives us an eigenfunction of our big scary
differential operator, and our solutions are quantized sine functions,
W
=
sin z
with n N. In this case, the eigenvalue equation is
(n
2
π
2
+ λ
2
)(n
2
π
2
+ σ + λ
2
)
n
2
π
2
+ λ
2
+
σ
Pr
= Raλ
2
.
When
σ >
0, then, noting that the LHS is increasing with
σ
on the range [0
,
),
this tells us that we have
Ra(n)
(n
2
π
2
+ λ
2
)
3
λ
2
.
We minimize the RHS with respect to
λ
and
n
. We clearly want to set
n
= 1,
and then solve for
0 =
d
d[λ
2
]
(n
2
π
2
+ λ
2
)
3
λ
2
=
(π
2
+ λ
2
)
2
λ
4
(2λ
2
π
2
).
So the minimum value of Ra is obtained when
λ
c
=
π
2
.
So we find that we have a solution only when
Ra Ra
c
=
27π
4
4
657.5.
If we are slightly about
Ra
c
, then there is the
n
= 1 unstable mode. As
Ra
increases, the number of available modes increases. While the critical Rayleigh
number depends on boundary conditions, but the picture is generic, and the
width of the unstable range grows like
Ra Ra
c
:
kd
Ra
stable
unstable
We might ask how does convection enhance the heat flux? This is
quantified by the Nusselt number
Nu =
convective heat transfer
conductive heat transfer
=
hwTid
κT
= h˜w
˜
θi.
Since this is a non-dimensional number, we know it is a function of Ra and Pr.
There are some physical arguments we can make about the value of
Nu
. The
claim is that the convective heat transfer is independent of layer depth. If there
is no convection at all, then the temperature gradient will look approximately
like
There is some simple arguments we can make about this. If we have a purely
conductive state, then we expect the temperature gradient to be linear:
This is pretty boring. If there is convection, then we claim that the temperature
gradient will look like
The reasoning is that the way convection works is that once a parcel at the
bottom is heated to a sufficiently high temperature, it gets kicked over to the
other side; if a parcel at the top is cooled to a sufficiently low temperature, it gets
kicked over to the other side. In between the two boundaries, the temperature is
roughly constant, taking the average temperature of the two boundaries.
In this model, it doesn’t matter how far apart the two boundaries are, since
once the hot or cold parcels are shot off, they can just travel on their own until
they reach the other side (or mix with the fluid in the middle).
If we believe this, then we must have
hwTi d
0
, and hence
Nu d
. Since
only Ra depends on d, we must have N u Ra
1/3
.
Of course, the real situation is much more subtle.
1.3 Classical Kelvin–Helmholtz instability
Let’s return to the situation of Rayleigh–Taylor instability, but this time, there
is some horizontal flow in the two layers, and they may be of different velocities.
ρ
2
ρ
1
U
1
U
2
Our analysis here will be not be very detailed, partly because what we do
is largely the same as the analysis of the Rayleigh–Taylor instability, but also
because the analysis makes quite a few assumptions which we wish to examine
more deeply.
In this scenario, we can still use a velocity potential
(u, w) =
φ
x
,
φ
z
.
We shall only consider the system in 2D. The far field now has velocity
φ
1
= U
1
x + φ
0
1
φ
2
= U
2
x + φ
0
2
with φ
0
1
0 as z and φ
0
2
0 as z −∞.
The boundary conditions are the same. Continuity of vertical velocity requires
φ
1,2
z
z=η
=
Dη
Dt
.
The dynamic boundary condition is that we have continuity of pressure at the
interface if there is no surface tension, in which case Bernoulli tells us
ρ
1
φ
1
t
+
1
2
ρ
1
|∇φ
1
|
2
+ gρ
1
η = ρ
2
φ
2
t
+
1
2
ρ
2
|∇φ
2
|
2
+ gρ
2
η.
The interface conditions are non-linear, and again we want to linearize. But
since U
1,2
is of order 1, linearization will be different. We have
φ
0
1,2
z
z=0
=
t
+ U
1,2
x
η.
So the Bernoulli condition gives us
ρ
1

t
+ U
1
x
φ
0
1
+ gη
= ρ
2

t
+ U
2
x
φ
0
2
+ gη
This modifies our previous eigenvalue problem for the phase speed and wavenum-
ber
ω
=
kc
,
k
=
2π
λ
. We go exactly as before, and after some work, we find that
we have
c =
ρ
1
U
1
+ ρ
2
U
2
ρ
1
+ ρ
2
±
1
ρ
1
+ ρ
2
g(ρ
2
2
ρ
2
1
)
k
ρ
1
ρ
2
(U
1
U
2
)
2
1/2
.
So we see that we have instability if c has an imaginary part, i.e.
k >
g(ρ
2
2
ρ
2
1
)
ρ
1
ρ
2
(U
1
U
2
)
2
.
So we see that there is instability for sufficiently large wavenumbers, even for
static stability. Similar to Rayleigh–Taylor instability, the growth rate
kc
i
grows monotonically with wavenumber, and as
k
, the instability becomes
proportional to the difference
|U
1
U
2
|
(as opposed to Rayleigh–Taylor instability,
where it grows unboundedly).
How can we think about this result? If
U
1
6
=
U
2
and we have a discrete
change in velocity, then this means there is a
δ
-function of vorticity at the
interface. So it should not be surprising that the result is unstable!
Another way to think about it is to change our frame of reference so that
U
1
=
U
=
U
2
. In the Boussinesq limit, we have
c
r
= 0, and instability arises
whenever
gρλ
ρU
2
< 4π. We can see the numerator as the potential energy cost if
we move a parcel from the bottom layer to the top layer, while the denominator
as some sort of kinetic energy. So this says we are unstable if there is enough
kinetic energy to move a parcel up.
This analysis of the shear flow begs a few questions:
How might we regularize this expression? The vortex sheet is obviously
wildly unstable.
Was it right to assume two-dimensional perturbations?
What happens if we regularize the depth of the shear?
1.4 Finite depth shear flow
We now consider a finite depth shear flow. This means we have fluid moving
between two layers, with a z-dependent velocity profile:
z = L
z = L z = L
U(z)
We first show that it suffices to work in 2 dimensions. We will assume that the
mean flow points in the
ˆ
x
direction, but the perturbations can point at an angle.
The inviscid homogeneous incompressible Navier–Stokes equations are again
u
t
+ u · u
=
Du
Dt
= −∇
p
0
ρ
, · u = 0.
We linearize about a shear flow, and consider some 3D normal modes
u =
¯
U(z)
ˆ
x + u
0
(x, y, z, t),
where
(u
0
, p
0
) = [
ˆ
u(z), ˆp(z)]e
i(kx+`ykct)
.
The phase speed is then
c
p
=
ω
κ
=
ω
(k
2
+ `
2
)
1/2
=
kc
r
(k
2
+ `
2
)
1/2
and the growth rate of the perturbation is simply σ
3d
= kc
i
.
We substitute our expression of
u
and
p
0
into the Navier–Stokes equations
to obtain, in components,
ik(
¯
U c)ˆu + ˆw
d
¯
U
dz
= ikˆp
ik(
¯
U c)ˆv = ilˆp
ik(
¯
U c) ˆw =
dˆp
dz
ikˆu + i`ˆv +
d ˆw
dz
= 0
Our strategy is to rewrite everything to express things in terms of new variables
κ =
p
k
2
+ `
2
, κ˜u = kˆu + `ˆv, ˜p =
κˆp
k
and if we are successful in expressing everything in terms of
˜u
, then we could
see how we can reduce this to the two-dimensional problem.
To this end, we can slightly rearrange our first two equations to say
ik(
¯
U c)ˆu + ˆw
d
¯
U
dz
= ik
2
ˆp
k
i`(
¯
U c)ˆv = i`
2
ˆp
k
which combine to give
i(
¯
U c)κ˜u + ˆw
d
¯
U
dz
= ˜p.
We can rewrite the remaining two equations in terms of ˜p as well:
(
¯
U c) ˆw =
d
dz
˜p
κ
˜
U +
d ˆw
dz
= 0.
This looks just like a 2d system, but with an instability growth rate of
σ
2d
=
κc
i
> kc
i
=
σ
3d
. Thus, our 3d problem is “equivalent” to a 2d problem with
greater growth rate. However, whether or not instability occurs is unaffected.
One way to think about the difference in growth rate is that the
y
component
of the perturbation sees less of the original velocity
¯
U
, and so it is more stable.
This result is known as Squire’s Theorem.
We now restrict to two-dimensional flows, and have equations
ik(
¯
U c)ˆu + ˆw
d
¯
U
dz
= ikˆp
ik(
¯
U c) ˆw =
dˆp
dz
ikˆu +
d ˆw
dz
= 0.
We can use the last incompressibility equation to eliminate
ˆu
from the first
equation, and be left with
(
¯
U c)
d ˆw
dz
+ ˆw
d
¯
U
dz
= ikˆp.
We wish to get rid of the
ˆp
, and to do so, we differentiate this equation with
respect to z and use the second equation to get
(
¯
U c)
d
2
ˆw
dz
2
d ˆw
dz
d
¯
U
dz
+
d ˆw
dz
d
¯
U
dz
+ ˆw
d
2
¯
U
dz
2
= k
2
(
¯
U c) ˆw.
The terms in the middle cancel, and we can rearrange to obtain the Rayleigh
equation
(
¯
U c)
d
2
dz
2
k
2
d
2
¯
U
dz
2
ˆw = 0.
We see that when
¯
U
=
c
, we have a regular singular point. The natural boundary
conditions are that ˆw 0 at the edge of the domain.
Note that this differential operator is not self-adjoint! This is since
¯
U
has non-
trivial
z
-dependence. This means we do not have a complete basis of orthogonal
eigenfunctions. This is manifested by the fact that it can have transient growth.
To analyze this scenario further, we rewrite Rayleigh equation in conventional
form
d
2
ˆw
dz
2
k
2
ˆw
d
2
¯
U/dz
2
¯
U c
ˆw = 0.
The trick is to multiply by
w
, integrate across the domain, and apply boundary
conditions to obtain
Z
L
L
¯
U
00
¯
U c
|ˆw|
2
dz =
Z
L
L
(|ˆw
0
|
2
+ k
2
|ˆw|
2
) dz
We can split the LHS into real and imaginary parts:
Z
L
L
¯
U
00
(
¯
U c
r
)
|
¯
U c|
2
|ˆw|
2
dz + ic
i
Z
L
L
¯
U
00
|
¯
U c|
2
|ˆw|
2
dz.
But since the RHS is purely real, we know the imaginary part must vanish.
One way for the imaginary part to vanish is for
c
i
to vanish, and this
corresponds to stable flow. If we want
c
i
to be non-zero, then the integral must
vanish. So we obtain the Rayleigh inflection point criterion:
d
2
dz
2
¯
U
must change
sign at least once in L < z < L.
Of course, this condition is not sufficient for instability. If we want to get
more necessary conditions for instability to occur, it might be wise to inspect
the imaginary part, as Fjortoft noticed. If instability occurs, then we know that
we must have
Z
L
L
¯
U
00
|
¯
U c|
2
|ˆw|
2
dz = 0.
Let’s assume that there is a unique (non-degenerate) inflection point at
z
=
z
s
,
with
¯
U
s
=
¯
U
(
z
s
). We can then add (
c
r
¯
U
s
) times the above equation to the
real part to see that we must have
Z
L
L
(|ˆw
0
|
2
+ k
2
|ˆw|
2
) dz =
Z
L
L
¯
U
00
(
¯
U
¯
U
s
)
|
¯
U c|
2
|ˆw|
2
dz.
Assuming further that
¯
U
is monotonic, we see that both
¯
U
¯
U
s
and
U
00
change
sign at
z
s
, so the sign of the product is unchanged. So for the equation to be
consistent, we must have
¯
U
00
(
¯
U
¯
U
s
) 0 with equality only at z
s
.
We can look at some examples of different flow profiles:
In the leftmost example, Rayleigh’s criterion tells us it is stable, because there is
no inflection point. The second example has an inflection point, but does not
satisfy Fjortoft’s criterion. So this is also stable. The third example satisfies
both criteria, so it may be unstable. Of course, the condition is not sufficient, so
we cannot make a conclusive statement.
Is there more information we can extract from them Rayleigh equation?
Suppose we indeed have an unstable mode. Can we give a bound on the growth
rate c
i
given the phase speed c
r
?
The trick is to perform the substitution
ˆ
W =
ˆw
(
¯
U c)
.
Note that this substitution is potentially singular when
¯
U
=
c
, which is the
singular point of the equation. By expressing everything in terms of
ˆ
W
, we
are essentially hiding the singularity in the definition of
ˆ
W
instead of in the
equation.
Under this substitution, our Rayleigh equation then takes the self-adjoint
form
d
dz
(
¯
U c)
2
d
ˆ
W
dz
!
k
2
(
¯
U c)
2
ˆ
W = 0.
We can multiply by
ˆ
W
and integrate over the domain to obtain
Z
L
L
(
¯
U c)
2
d
ˆ
W
dz
2
+ k
2
|
ˆ
W
2
| {z }
Q
dz = 0.
Since Q 0, we may again take imaginary part to require
2c
i
Z
L
L
(
¯
U c
r
)Q dz = 0.
This implies that we must have
U
min
< c
r
< U
max
, and gives a bound on the
phase speed.
Taking the real part implies
Z
L
L
[(
¯
U c
r
)
2
c
2
i
]Q dz = 0.
But we can combine this with the imaginary part condition, which tells us
Z
L
L
¯
UQ dz = c
r
Z
L
L
Q dz.
So we can expand the real part to give us
Z
L
L
¯
U
2
Q dz =
Z
L
L
(c
2
r
+ c
2
i
)Q dz.
Putting this aside, we note that tautologically, we have
U
min
¯
U U
max
. So
we always have
Z
L
L
(
¯
U U
max
)(
¯
U U
min
)Q dz 0.
expanding this, and using our expression for
R
L
L
¯
U
2
Q dz, we obtain
Z
L
L
((c
2
r
+ c
2
i
) (U
max
U
min
)c
r
+ U
max
U
min
)Q dz 0.
But we see that we are just multiplying
Q
d
z
by a constant and integrating.
Since we know that
R
Q dz > 0, we must have
(c
2
r
+ c
2
i
) (U
max
+ U
min
)c
r
+ U
max
U
min
0.
By completing the square, we can rearrange this to say
c
r
U
max
+ U
min
2
2
+ (c
i
0)
2
U
max
U
min
2
2
.
This is just an equation for the circle! We can now concretely plot the region of
possible c
r
and c
i
:
c
r
c
i
U
max
+U
min
2
U
min
U
max
Of course, lying within this region is a necessary condition for instability to
occur, but not sufficient. The actual region of instability is a subset of this
semi-circle, and this subset depends on the actual
¯
U
. But this is already very
helpful, since if we way to search for instabilities, say numerically, then we know
where to look.
1.5 Stratified flows
In the Kelvin–Helmholtz scenario, we had a varying density. Let’s now try to
model the situation in complete generality.
Recall that the set of (inviscid) equations for Boussinesq stratified fluids is
ρ
u
t
+ u · u
= −∇p
0
gρ
0
ˆ
z,
with incompressibility equations
· u = 0,
Dρ
Dt
= 0.
We consider a mean base flow and a 2D perturbation as a normal mode:
u =
¯
U(z)
ˆ
x + u
0
(x, z, t)
p = ¯p(z) + p
0
(x, z, t)
ρ = ¯ρ(z) + ρ
0
(x, z, t),
with
(u
0
, p
0
, ρ
0
) = [
ˆ
u(z), ˆp(z), ˆρ(z)]e
i(kxωt)
= [
ˆ
u(z), ˆp(z), ˆρ(z)]e
ik(xct)
.
We wish to obtain an equation that involves only
ˆw
. We first linearize our
equations to obtain
¯ρ
u
0
t
+
¯
U
u
0
x
+ w
0
d
¯
U
dz
=
p
0
x
¯ρ
w
0
t
+
¯
U
w
0
x
=
p
0
z
gρ
0
ρ
0
t
+
¯
U
ρ
0
x
+ w
0
d¯ρ
dz
= 0
u
0
x
+
w
0
z
= 0.
Plugging in our normal mode solution into the four equations, we obtain
ik¯ρ(
¯
U c)ˆu + ¯ρ ˆw
d
dz
¯
U = ikˆp
ik¯ρ(
¯
U c) ˆw =
dˆp
dz
g ˆρ. ()
ik(
¯
U c)ˆρ + w
0
d¯ρ
dz
= 0 ()
ikˆu +
w
0
z
= 0.
The last (incompressibility) equation helps us eliminate
ˆu
from the first equation
to obtain
¯ρ(
¯
U c)
d ˆw
dz
+ ¯ρ ˆw
d
¯
U
dz
= ikˆp.
To eliminate
ˆp
as well, we differentiate with respect to
z
and apply (
), and then
further apply () to get rid of the ˆρ term, and end up with
¯ρ(
¯
U c)
d
2
ˆw
dz
2
+ ¯ρ ˆw
d
2
¯
U
dz
2
+ k
2
¯ρ(
¯
U c) ˆw =
g ˆw
¯
U c
d¯ρ
dz
.
This allows us to write our equation as the Taylor–Goldstein equation
d
2
dz
2
k
2
ˆw
ˆw
(
¯
U c)
d
2
¯
U
dz
2
+
N
2
ˆw
(
¯
U c)
2
= 0,
where
N
2
=
g
¯ρ
d¯ρ
dz
.
This
N
is known as the buoyancy frequency, and has dimensions
T
2
. This is
the frequency at which a slab of stratified fluid would oscillate vertically.
Indeed, if we have a slab of volume
V
, and we displace it by an infinitesimal
amount ζ in the vertical direction, then the buoyancy force is
F = V gζ
ρ
z
.
Since the mass of the fluid is
ρ
0
V
, by Newton’s second law, the acceleration of
the parcel satisfies
d
2
ζ
dt
2
+
g
ρ
0
ρ
z
ζ = 0.
This is just simple harmonic oscillation of frequency N .
Crucially, what we see from this computation is that a density stratification
of the fluid can lead to internal waves. We will later see that the presence of
these internal waves can destabilize our system.
Miles–Howard theorem
In 1961, John Miles published a series of papers establishing a sufficient condition
for infinitesimal stability in the case of a stratified fluid. When he submitted
this to review, Louis Howard read the paper, and replied with a 3-page proof of
a more general result. Howard’s proof was published as a follow-up paper titled
“Note on a paper of John W. Miles”.
Recall that we just derived the Taylor–Goldstein equation
d
2
dz
2
k
2
ˆw
ˆw
(
¯
U c)
d
2
¯
U
dz
2
+
N
2
ˆw
(
¯
U c)
2
= 0,
The magical insight of Howard was to introduce the new variable
H =
ˆ
W
(
¯
U c)
1/2
.
We can regretfully compute the derivatives
ˆw = (
¯
U c)
1/2
H
d
dz
ˆw =
1
2
(
¯
U c)
1/2
H
d
¯
U
dz
+ (
¯
U c)
1/2
dH
dz
d
2
dz
2
ˆw =
1
4
(
¯
U c)
3/2
H
d
¯
U
dz
2
+
1
2
(
¯
U c)
1/2
H
d
2
¯
U
dz
2
+ (
¯
U c)
1/2
dH
dz
d
¯
U
dz
+ (
¯
U c)
1/2
d
2
H
dz
2
.
We can substitute this into the Taylor–Goldstein equation, and after some
algebraic mess, we obtain the decent-looking equation
d
dz
(
¯
U c)
dH
dz
H
k
2
(
¯
U c) +
1
2
d
2
¯
U
dz
2
+
1
4
d
¯
U
dz
2
N
2
¯
U c
= 0.
This is now self-adjoint. Imposing boundary conditions
ˆw,
d ˆw
dz
0 in the far
field, we multiply by H
and integrate over all space. The first term gives
Z
H
d
dz
(
¯
U c)
dH
dz
dz =
Z
(
¯
U c)
dH
dz
2
dz,
while the second term is given by
Z
k
2
|H|
2
(
¯
U c)
1
2
d
2
¯
U
dz
2
|H|
2
|H|
2
1
4
d
¯
U
dz
2
N
2
(
¯
U c
)
|
¯
U c|
2
dz.
Both the real and imaginary parts of the sum of these two terms must be zero.
In particular, the imaginary part reads
c
i
Z
dH
dz
2
+ k
2
|H|
2
+ |H|
2
N
2
1
4
d
¯
U
dz
2
|
¯
U c|
2
dz = 0.
So a necessary condition for instability is that
N
2
1
4
d
¯
U
dz
2
< 0
somewhere. Defining the Richardson number to be
Ri(z) =
N
2
(d
¯
U/dz)
2
,
the necessary condition is
Ri(z) <
1
4
.
Equivalently, a sufficient condition for stability is that Ri(z)
1
4
everywhere.
How can we think about this? When we move a parcel, there is the buoyancy
force that attempts to move it back to its original position. However, if we move
the parcel to the high velocity region, we can gain kinetic energy from the mean
flow. Thus, if the velocity gradient is sufficiently high, it becomes “advantageous”
for parcels to move around.
Piecewise-linear profiles
If we want to make our lives simpler, we can restrict our profile with piecewise
linear velocity and layered density. In this case, to solve the Taylor–Goldstein
equation, we observe that away from the interfaces, we have
N
=
U
00
= 0. So
the Taylor–Goldstein equation reduces to
d
2
dz
2
k
2
ˆw = 0.
This has trivial exponential solutions in each of the individual regions, and to
find the correct solutions, we need to impose some matching conditions at the
interface.
So assume that the pressure is continuous across all interfaces. Then using
the equation
¯ρ(
¯
U c)
d ˆw
dz
+ ρ ˆw
d
¯
U
dz
= ikˆp,
we see that the left hand seems like the derivative of a product, except the sign
is wrong. So we divide by
1
(
¯
Uc)
2
, and then we can write this as
ik
¯ρ
ˆp
(
¯
U c)
2
=
d
dz
ˆw
¯
U c
.
For a general
c
, we integrate over an infinitesimal distance at the interface. We
assume
ˆp
is continuous, and that
¯ρ
and (
¯
U c
) have bounded discontinuity.
Then the integral of the LHS vanishes in the limit, and so the integral of the
right-hand side must be zero. This gives the matching condition
ˆw
¯
U c
+
= 0.
To obtain a second matching condition, we rewrite the Taylor–Goldstein equa-
tion as a derivative, which allows for the determination of the other matching
condition:
d
dz
(
¯
U c)
d ˆw
dz
ˆw
d
¯
U
dz
g ¯ρ
ρ
0
ˆw
¯
U c

= k
2
(
¯
U c) ˆw
g ¯ρ
ρ
0
d
dz
ˆw
¯
U c
.
Again integrating over an infinitesimal distance over at the interface, we see
that the LHS must be continuous along the interface. So we obtain the second
matching condition.
(
¯
U c)
d ˆw
dz
ˆw
d
¯
U
dz
g ¯ρ
ρ
0
ˆw
¯
U c

+
= 0.
We begin by applying this to a relatively simple profile with constant density,
and whose velocity profile looks like
h
U
For convenience, we scale distances by
h
2
and speeds by
U
2
, and define
˜c
and α by
c =
U
2
˜c, α =
kh
2
.
These quantities
α
and
˜c
(which we will shortly start writing as
c
instead) are
nice dimensionless quantities to work with. After the scaling, the profile can be
described by
z = 1
z = 1
¯
U = 1
¯
U = z
¯
U = 1
III
II
I
ˆw = Ae
α(z1)
ˆw = Be
αz
+ Ce
αz
ˆw = De
α(z+1)
We have also our exponential solutions in each region between the interfaces,
noting that we require our solution to vanish at the far field.
We now apply the matching conditions. Since
¯
U c
is continuous, the first
matching condition just says ˆw has to be continuous. So we get
A = Be
α
+ Ce
α
D = Be
α
+ Ce
α
.
The other matching condition is slightly messier. It says
(
¯
U c)
d ˆw
dz
ˆw
d
¯
U
dz
g ¯ρ
ρ
0
ˆw
¯
U c

+
= 0,
which gives us two equations
(Be
α
+ Ce
α
)(1 c)(α) = (Be
α
Ce
α
)(1 c)(α) (Be
α
+ Ce
α
)
(Be
α
+ Ce
α
)(1 c)(α) = (Be
α
Ce
α
)(1 c)(α) (Be
α
+ Ce
α
).
Simplifying these gives us
(2α(1 c) 1)Be
α
= Ce
α
(2α(1 + c) 1)Ce
α
= Be
α
.
Thus, we find that
(2α 1)
2
4α
2
c
2
= e
4α
,
and hence we have the dispersion relation
c
2
=
(2α 1)
2
e
4α
4α
2
.
We see that this has the possibility of instability. Indeed, we can expand the
numerator to get
c
2
=
(1 4α + 4α
2
) (1 4α + 8α
2
+ O(α
3
))
4α
2
= 1 + O(α).
So for small
α
, we have instability. This is known as Rayleigh instability. On
the other hand, as
α
grows very large, this is stable. We can plot these out in a
graph
k
ω
ω
i
ω
r
0.64
We see that the critical value of
α
is 0
.
64, and the maximally instability point
is α 0.4.
Let’s try to understand where this picture came from. For large
k
, the
wavelength of the oscillations is small, and so it seems reasonable that we can
approximate the model by one where the two interfaces don’t interact. So
consider the case where we only have one interface.
z = 1
¯
U = z
¯
U = 1
II
I
Ae
α(z1)
Be
α(z1)
We can perform the same procedure as before, solving
ˆw
¯
U c
+
= 0.
This gives the condition A = B. The other condition then tell us
(1 c)(α)A = (1 c)αA Ae
α(z1)
.
We can cancel the A, and rearrange this to say
c = 1
1
2α
.
So we see that this interface supports a wave at speed
c
= 1
1
2α
at a speed
lower than
¯
U
. Similarly, if we treated the other interface at isolation, it would
support a wave at c
= 1 +
1
2α
.
Crucially these two waves are individually stable, and when
k
is large, so that
the two interfaces are effectively in isolation, we simply expect two independent
and stable waves. Indeed, in the dispersion relation above, we can drop the
e
4α
term when α is large, and approximate
c = ±
2α 1
2α
= ±
1
1
2α
.
When
k
is small, then we expect the two modes to interact with each other,
which manifests itself as the
e
4α
term. Moreover, the resonance should be the
greatest when the two modes have equal speed, i.e. at
α
1
2
, which is quite
close to actual maximally unstable mode.
Density stratification
We next consider the case where we have density stratification. After scaling,
we can draw our region and solutions as
z = 1
z = 0
z = 1
¯
U = 1
¯
U = z
¯
U = 1
III
IV
II
I
Ae
α(z1)
Be
α(z1)
+ Ce
α(z1)
De
α(z+1)
+ Ee
α(z+1)
F e
α(z+1)
¯ρ = 1
¯ρ = +1
Note that the
¯ρ
is the relative density, since it is the density difference that
matters (fluids cannot have negative density).
Let’s first try to understand how this would behave heuristically. As before,
at the I-II and III-IV interfaces, we have waves with
c
=
±
1
1
2α
. At the
density interface, we previously saw that we can have internal gravity waves
with c
igw
= ±
q
Ri
0
α
, where
Ri
0
=
gh
ρ
0
is the bulk Richardson number (before scaling, it is Ri
0
=
gρh
ρ
0
U
2
).
We expect instability to occur when the frequency of this internal gravity
wave aligns with the frequency of the waves at the velocity interfaces, and this
is given by
1
1
2α
2
=
Ri
0
α
.
It is easy to solve this to get the condition
Ri
0
' α 1.
This is in fact what we find if we solve it exactly.
So let’s try to understand the situation more responsibly now. The procedure
is roughly the same. The requirement that ˆw is continuous gives
A = B + C
Be
α
+ Ce
α
= De
α
+ Ee
α
D + E = F.
If we work things out, then the other matching condition
(
¯
U c)
d ˆw
dz
ˆw
d
¯
U
dz
Ri
0
¯ρ
ˆw
¯
U c

+
= 0
gives
(1 c)(α)A = (1 c)(α)(B C) (B + C)
(1 c)(α)F = (1 c)(α)(D E) (D + E)
(c)(α)(Be
α
Ce
α
) +
Ri
0
(c)
(Be
α
+ Ce
α
)
= (c)(α)(De
α
Ee
α
)
Ri
0
(c)
(De
α
+ Ee
α
).
This defines a 6
×
6 matrix problem, 4th order in
c
. Writing it as
SX
= 0 with
X
= (
A, B, C, D, E, F
)
T
, the existence of non-trivial solutions is given by the
requirement that det S = 0, which one can check (!) is given by
c
4
+ c
2
e
4α
(2α 1)
2
4α
2
Ri
0
α
+
Ri
0
α
e
2α
+ (2α 1)
2α
2
= 0.
This is a biquadratic equation, which we can solve.
We can inspect the case when
Ri
0
is very small. In this case, the dispersion
relation becomes
c
4
+ c
2
e
4α
(2α 1)
2
4α
2
.
Two of these solutions are simply given by
c
= 0, and the other two are just those
we saw from Rayleigh instability. This is saying that when the density difference
is very small, then the internal gravitational waves are roughly non-existent, and
we can happily ignore them.
In general, we have instability for all
Ri
0
! This is Holmboe instability. The
scenario is particularly complicated if we look at small
α
, where we expect the
effects of Rayleigh instability to kick in as well. So let’s fix some 0
< α <
0
.
64,
and see what happens when we increase the Richardson number.
As before, we use dashed lines to denote the imaginary parts and solid lines
to denote the real part. For each
Ri
0
, any combination of imaginary part and
real part gives a solution for
c
, and, except when there are double roots, there
should be four such possible combinations.
Ri
0
c
We can undersatnd this as follows. When
Ri
0
= 0, all we have is Rayleigh
instability, which are the top and bottom curves. There are then two
c
= 0
solutions, as we have seen. As we increase
Ri
0
to a small, non-sero amount, this
mode becomes non-zero, and turns into a genuine unstable mode. As the value
of
Ri
0
increases, the two imaginary curves meet and merge to form a new curve.
The solutions then start to have a non-zero real part, which gives a non-zero
phase speed to our pertrubation.
Note that when
Ri
0
becomes very large, then our modes become unstable
again, though this is not clear from our graph.
Taylor instability
While Holmboe instability is quite interesting, our system was unstable even
without the density stratification. Since we saw that instability is in general trig-
gered by the interaction between two interfaces, it should not be surprising that
even in a Rayleigh stable scenario, if we have two layers of density stratification,
then we can still get instability.
Consider the following flow profile:
z = 1
z = 1
¯
U = z
III
II
I
Ae
α(z1)
Be
αz
+ Ce
αz
De
α(z+1)
¯ρ = R 1
¯ρ = R + 1
As before, continuity in ˆw requires
A = Be
α
+ Ce
α
D = Be
α
+ Ce
α
.
Now there is no discontinuity in vorticity, but the density field has jumps. So
we need to solve the second matching condition. They are
(1 c)(α)(Be
α
+ Ce
α
) +
Ri
0
1 c
(Be
α
+ Ce
α
) = (1 c)(α)(Be
α
Ce
α
)
(1 c)(α)(Be
α
+ Ce
α
) +
Ri
0
1 + c
(Be
α
+ Ce
α
) = (1 c)(α)(Be
α
Ce
α
)
These give us, respectively,
(2α(1 c)
2
Ri
0
)B = Ri
0
Ce
2α
(2α(1 + c)
2
Ri
0
)C = Ri
0
Be
2α
.
So we get the biquadratic equation
c
4
c
2
2 +
Ri
0
α
+
1
Ri
0
2α
2
Ri
2
0
e
4α
4α
2
= 0.
We then apply the quadratic formula to say
c
2
= 1 +
Ri
0
2α
±
r
2Ri
0
α
+
Ri
2
0
e
4α
4α
2
.
So it is possible to have instability with no inflection! Indeed, instability occurs
when c
2
< 0, which is equivalent to requiring
2α
1 + e
2α
< Ri
0
<
2α
1 e
2α
.
Thus, for any
Ri
0
>
0, we can have instability. This is known as Taylor
instability.
Heuristically, the density jumps at the interface have speed
c
±
igw
= ±1
Ri
0
2α
1/2
.
We get equality when
c
= 0, so that
Ri
0
= 2
α
. This is in very good agreement
with what our above analysis gives us.
2 Absolute and convective instabilities
So far, we have only been considering perturbations that are Fourier modes,
(u
0
, p
0
, ρ
0
) = [
ˆ
u(z), ˆp(z), ˆρ(z)]e
i(k·xωt
.
This gives rise to a dispersion relation
D
(
k, ω
) = 0. This is an eigenvalue problem
for
ω
(
k
), and the
k
th mode is unstable if the solution
ω
(
k
) has positive imaginary
part.
When we focus on Fourier modes, they are necessarily non-local. In reality,
perturbations tend to be local. We perturb the fluid at some point, and the
perturbation spreads out. In this case, we might be interested in how the
perturbations spread.
To understand this, we need the notion of group velocity. For simplicity,
suppose we have a sum of two Fourier modes of slightly different wavenumbers
k
0
± k
. The corresponding frequencies are then
ω
0
± ω
, and for small
k
, we
may approximate
ω
= k
ω
k
k=k
0
.
We can then look at how our wave propagates:
η = cos[(k
0
+ k
)x (ω
0
+ ω
)t] + cos[k
0
k
)x (ω
0
ω
)t]
= 2 cos(k
x ω
t) cos(k
0
x ω
0
t)
= 2 cos
k
w
ω
k
k=k
0
t
!!
cos(k
0
x ω
0
t)
Since
k
is small, we know the first term has a long wavelength, and determines
the “overall shape” of the wave.
η
As time evolves, these “packets” propagate with group velocity
c
g
=
ω
k
. This
is also the speed at which the energy in the wave packets propagate.
In general, there are 4 key characteristics of interest for these waves:
The energy in a wave packet propagates at a group velocity.
They can disperse (different wavelengths have different speeds)
They can be advected by a streamwise flow.
They can be unstable, i.e. their amplitude can grow in time and space.
In general, if we have an unstable system, then we expect the perturbation to
grow with time, but also “move away” from the original source of perturbation.
We can consider two possibilities we can have convective instability, where
the perturbation “goes away”; and absolute instability, where the perturbation
doesn’t go away.
We can imagine the evolution of a convective instability as looking like this:
t
whereas an absolute instability would look like this:
t
Note that even in the absolute case, the perturbation may still have non-zero
group velocity. It’s just that the perturbations grow more quickly than the group
velocity.
To make this more precise, we consider the response of the system to an
impulse. We can understand the dispersion relation as saying that in phase space,
a quantity χ (e.g. velocity, pressure, density) must satisfy
D(k, ω) ˜χ(k, ω) = 0,
where
˜χ(k, ω) =
Z
−∞
Z
−∞
χ(x, t)e
i(kxωt)
dx dt.
Indeed, this equation says we can have a non-zero (
k, ω
) mode iff they satisfy
D
(
k, ω
) = 0. The point of writing it this way is that this is now linear in
˜χ
, and
so it applies to any χ, not necessarily a Fourier mode.
Going back to position space, we can think of this as saying
D
i
x
, i
t
χ(x, t) = 0.
This equation allows us to understand how the system responds to some external
forcing F (x, t). We simply replace the above equation by
D
i
x
, i
t
χ(x, t) = F (x, t).
Usually, we want to solve this in Fourier space, so that
D(k, ω) ˜χ(k, ω) =
˜
F (k, ω).
In particular, the Green’s function (or impulse response) is given by the response
to the impulse
F
ξ,τ
(
x, t
) =
δ
(
x ξ
)
δ
(
t τ
). We may wlog assume
ξ
=
τ
= 0,
and just call this
F
. The solution is, by definition, the Green’s function
G
(
x, t
),
satisfying
D
i
x
, i
t
G(x, t) = δ(x)δ(t).
Given the Green’s function, the response to an arbitrary forcing
F
is “just” given
by
χ(x, t) =
Z
G(x ξ, t τ)F (ξ, τ) dξ dτ.
Thus, the Green’s function essentially controls the all behaviour of the system.
With the Green’s function, we may now make some definitions.
Definition (Linear stability). The base flow of a system is linearly stable if
lim
t→∞
G(x, t) = 0
along all rays
x
t
= C.
A flow is unstable if it is not stable.
Definition
(Linearly convectively unstable)
.
An unstable flow is linearly con-
vectively unstable if lim
t→∞
G(x, t) = 0 along the ray
x
t
= 0.
Definition
(Linearly absolutely unstable)
.
An unstable flow is linearly absolutely
unstable if lim
t→∞
G(x, t) 6= 0 along the ray
x
t
= 0.
The first case is what is known as an amplifier , where the instability grows
but travels away at the same time. In the second case, we have an oscillator .
Even for a general
F
, it is easy to solve for
˜χ
in Fourier space. Indeed, we
simply have
˜χ(k, ω) =
˜
F (k, ω)
D(k, ω)
.
To recover χ from this, we use the Fourier inversion formula
χ(x, t) =
1
(2π)
2
Z
L
ω
Z
F
k
˜χ(k, ω)e
i(kxωt)
dk dω.
Note that here we put some general contours for
ω
and
k
instead of integrating
along the real axis, so this is not the genuine Fourier inversion formula. However,
we notice that as we deform our contour, when we pass through a singularity,
we pick up a multiple of
e
i(kxωt)
. Moreover, since the singularities occur when
D
(
k, ω
) = 0, it follows that these extra terms we pick up are in fact solutions to
the homogeneous equation
D
(
i∂
x
, i∂
t
)
χ
= 0. Thus, no matter which contour
we pick, we do get a genuine solution to our problem.
So how should we pick a contour? The key concept is causality the
response must come after the impulse. Thus, if
F
(
x, t
) = 0 for
t <
0, then we
also need
χ
(
x, t
) = 0 in that region. To understand this, we perform only the
temporal part of the Fourier inversion, so that
˜χ(k, t) =
1
2π
Z
L
ω
˜
F (k, ω)
D(k, ω; R)
e
t
dω.
To perform this contour integral, we close our contour either upwards or down-
wards, compute residues, and then take the limit as our semi-circle tends to
infinity. For this to work, it must be the case that the contribution by the
circular part of the contour vanishes in the limit, and this determines whether
we should close upwards or downwards.
If we close upwards, then $\omega$ will have positive imaginary part. So for $e^{-i\omega t}$ not to blow up, we need $t < 0$. Similarly, we close downwards when $t > 0$. Thus, if we want $\chi$ to vanish whenever $t < 0$, we should pick our contour so that it lies above all the singularities, so that it picks up no residue when we close upwards. This determines the choice of contour.
But the key question is which of these are causal. When $t < 0$, we require $\tilde\chi(k, t) = 0$. By Jordan's lemma, for $t < 0$, we should close the contour upwards when performing the $\omega$ integral; when $t > 0$, we close it downwards. Thus, for $\tilde\chi(k, t) = 0$ when $t < 0$, we must pick $L_\omega$ to lie above all singularities of $\tilde\chi(k, \omega)$.
Assume that $D$ has a single simple zero for each $k$. Let $\omega(k)$ be the corresponding value of $\omega$. Then by complex analysis, we obtain
\[ \tilde\chi(k, t) = -i\, \frac{\tilde F[k, \omega(k)]}{\frac{\partial D}{\partial \omega}[k, \omega(k)]}\, e^{-i\omega(k) t}. \]
We finally want to take the inverse Fourier transform with respect to $x$:
\[ \chi(x, t) = -\frac{i}{2\pi} \int_{F_k} \frac{\tilde F[k, \omega(k)]}{\frac{\partial D}{\partial \omega}[k, \omega(k)]}\, e^{i(kx - \omega(k)t)}\,\mathrm{d}k. \]
We are interested in the case $F = \delta(x)\delta(t)$, i.e. $\tilde F(k, \omega) = 1$. So the central question is the evaluation of the integral
\[ G(x, t) = -\frac{i}{2\pi} \int_{F_k} \frac{\exp\big(i(kx - \omega(k)t)\big)}{\frac{\partial D}{\partial \omega}[k, \omega(k)]}\,\mathrm{d}k. \]
Recall that our objective is to determine the behaviour of $G$ as $t \to \infty$ with $V = x/t$ fixed. Since we are interested in the large $t$ behaviour instead of obtaining exact values at finite $t$, we may use what is known as the method of steepest descent.
The method of steepest descent is a very general technique to approximate integrals of the form
\[ H(t) = -\frac{i}{2\pi} \int_{F_k} f(k)\, \exp\big(t\rho(k)\big)\,\mathrm{d}k \]
in the limit $t \to \infty$. In our case, we take
\[ f(k) = \frac{1}{\frac{\partial D}{\partial \omega}[k, \omega(k)]}, \qquad \rho(k) = i\big(kV - \omega(k)\big). \]
In an integral of this form, there are different factors that may affect the limiting behaviour when $t \to \infty$. The first is that as $t$ gets very large, the fast oscillations cause the integral to cancel except where the phase is stationary, i.e. where $\frac{\partial \rho_i}{\partial k} = 0$. On the other hand, since we're taking the exponential of $t\rho$, we'd expect the largest contribution to the integral to come from the $k$ where $\rho_r(k)$ is the greatest.

The idea of the method of steepest descent is to deform the contour so that we integrate along paths of stationary phase, i.e. paths with constant $\rho_i$, so that we don't have to worry about the oscillating phase.
To do so, observe that the Cauchy–Riemann equations tell us
\[ \nabla \rho_r \cdot \nabla \rho_i = 0, \]
where the gradient $\nabla$ is taken with respect to $k_r$ and $k_i$, viewing the real and imaginary parts as separate variables.

Since the gradient of a function is normal to its contours, this tells us the curves of constant $\rho_i$ are those that point in the direction of $\nabla \rho_r$. In other words, the stationary phase curves are exactly the curves of steepest descent of $\rho_r$.
Often, the function $\rho$ has some stationary points, i.e. points $k_*$ satisfying $\rho'(k_*) = 0$. Generically, we expect this to be a second-order zero, and thus $\rho$ looks like $(k - k_*)^2$ near $k_*$. We can plot the contours of $\rho_i$ near $k_*$ as follows:

[diagram: contours of $\rho_i$ near the saddle point $k_*$, with arrows denoting the direction of steepest descent of $\rho_r$]

Note that since the real part satisfies Laplace's equation, such a stationary point must be a saddle, instead of a local maximum or minimum, even if the zero is not second-order.
We now see that if our start and end points lie on opposite sides of the ridge,
i.e. one is below the horizontal line and the other is above, then the only way to
do so while staying on a path of stationary phase is to go through the stationary
point.
Along such a contour, we would expect the greatest contribution to the integral to occur where $\rho_r$ is the greatest, i.e. at $k_*$. We can expand $\rho$ about $k_*$ as
\[ \rho(k) \approx \rho(k_*) + \frac{1}{2} \frac{\partial^2 \rho}{\partial k^2}(k_*)\,(k - k_*)^2. \]
We can then approximate
\[ H(t) \approx -\frac{i}{2\pi} \int_\varepsilon f(k)\, e^{t\rho(k)}\,\mathrm{d}k, \]
where we are just integrating over a tiny portion of our contour near $k_*$. Putting in our series expansion of $\rho$, we can write this as
\[ H(t) \approx -\frac{i}{2\pi}\, f(k_*)\, e^{t\rho(k_*)} \int_\varepsilon \exp\left( \frac{t}{2} \frac{\partial^2 \rho}{\partial k^2}(k_*)\,(k - k_*)^2 \right)\mathrm{d}k. \]
Recall that we picked our path to be the path of steepest descent on both sides of the ridge. So we can parametrize our path by $K$ such that
\[ (iK)^2 = \frac{t}{2} \frac{\partial^2 \rho}{\partial k^2}(k_*)\,(k - k_*)^2, \]
where $K$ is purely real. So our approximation becomes
\[ H(t) \approx \frac{f(k_*)\, e^{t\rho(k_*)}}{\sqrt{2\pi^2 t\, \rho''(k_*)}} \int_{-\varepsilon}^{\varepsilon} e^{-K^2}\,\mathrm{d}K. \]
Since $e^{-K^2}$ falls off so quickly as $K$ gets away from $0$, we may approximate this by an integral over the whole real line, which we know gives $\sqrt{\pi}$. So our final approximation is
\[ H(t) \approx \frac{f(k_*)\, e^{t\rho(k_*)}}{\sqrt{2\pi t\, \rho''(k_*)}}. \]
We then look at the maxima of $\rho_r(k)$ along these paths and see which has the greatest contribution.
Now let's apply this to our situation. Our $\rho$ was given by
\[ \rho(k) = i\big(kV - \omega(k)\big). \]
So $k_*$ is given by solving
\[ \frac{\partial \omega}{\partial k}(k_*) = V. \tag{$*$} \]
Thus, what we have shown is that the greatest contribution to the Green's function along the $V$ direction comes from the modes whose group velocity is $V$! Note that in general, this $k_*$ is complex.

Thus, the conclusion is that given any $V$, we should find the $k_*$ such that ($*$) is satisfied. The temporal growth rate along $x/t = V$ is then
\[ \sigma(V) = \omega_i(k_*) - k_{*,i}\, V. \]
This is the growth rate we would experience if we moved at velocity $V$. However, it is often more useful to consider the absolute growth rate. In this case, we should try to maximize $\omega_i$. Suppose this is achieved at $k = k_{\max}$, possibly complex. Then we have
\[ \frac{\partial \omega_i}{\partial k}(k_{\max}) = 0. \]
But this means that $c_g = \frac{\partial \omega}{\partial k}$ is purely real. Thus, this maximally unstable mode can be realized along some physically existent $V = c_g$.
We can now say
– If $\omega_{i,\max} < 0$, then the flow is linearly stable.
– If $\omega_{i,\max} > 0$, then the flow is linearly unstable. In this case,
  – if $\omega_{0,i} < 0$, then the flow is convectively unstable;
  – if $\omega_{0,i} > 0$, then the flow is absolutely unstable.
Here $\omega_0$ is defined by first solving $\frac{\partial \omega}{\partial k}(k_0) = 0$, thus determining the $k_0$ that leads to a zero group velocity, and then setting $\omega_0 = \omega(k_0)$. These quantities are known as the absolute frequency and absolute growth rate respectively.
Example. We can consider a "model dispersion relation" given by the linear complex Ginzburg–Landau equation
\[ \left( \frac{\partial}{\partial t} + U \frac{\partial}{\partial x} \right) \chi - \mu \chi - (1 + i c_d) \frac{\partial^2}{\partial x^2} \chi = 0. \]
This has advection ($U$), dispersion ($c_d$) and instability ($\mu$). Indeed, if we replace $\frac{\partial}{\partial t} \mapsto -i\omega$ and $\frac{\partial}{\partial x} \mapsto ik$, then we have
\[ -i(\omega - Uk)\chi - \mu \chi + (1 + i c_d) k^2 \chi = 0. \]
This gives
\[ \omega = Uk + c_d k^2 + i(\mu - k^2). \]
We see that we have temporal instability for $|k| < \sqrt{\mu}$, where we assume $\mu > 0$. This has
\[ c_r = \frac{\omega_r}{k} = U + c_d k. \]
On the other hand, if we force $\omega \in \mathbb{R}$, then we have spatial instability. Solving
\[ k^2 (c_d - i) + Uk + (i\mu - \omega) = 0, \]
the quadratic formula gives us two branches
\[ k_\pm = \frac{-U \pm \sqrt{U^2 - 4(i\mu - \omega)(c_d - i)}}{2(c_d - i)}. \]
To understand whether this instability is convective or absolute, we can complete the square to obtain
\[ \omega = \omega_0 + \frac{1}{2} \omega_{kk} (k - k_0)^2, \]
where
\[ \omega_{kk} = 2(c_d - i), \qquad k_0 = \frac{U}{2(i - c_d)}, \qquad \omega_0 = -\frac{c_d U^2}{4(1 + c_d^2)} + i\left( \mu - \frac{U^2}{4(1 + c_d^2)} \right). \]
These $k_0$ and $\omega_0$ are then the absolute wavenumber and absolute frequency.

Note that after completing the square, it is clear that $k_0$ is a double root at $\omega = \omega_0$. Of course, there is nothing special about this example, since $k_0$ was defined to solve $\frac{\partial \omega}{\partial k}(k_0) = 0$!
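To make the example concrete, here is a minimal numerical sketch (not from the lectures; the function names are made up) that evaluates $k_0$ and $\omega_0$ for the linear complex Ginzburg–Landau model and classifies the instability accordingly.

```python
import numpy as np

def absolute_mode(U, c_d, mu):
    """omega(k) = U k + c_d k^2 + i(mu - k^2); return (k0, omega0)."""
    k0 = U / (2 * (1j - c_d))                       # solves d(omega)/dk = 0
    omega0 = U * k0 + (c_d - 1j) * k0**2 + 1j * mu  # omega evaluated at k0
    return k0, omega0

def classify(U, c_d, mu):
    if mu <= 0:                                     # max over real k of omega_i is mu
        return "linearly stable"
    _, omega0 = absolute_mode(U, c_d, mu)
    return "absolutely unstable" if omega0.imag > 0 else "convectively unstable"

# strong advection sweeps the instability away; weak advection lets it pin in place
for U in (3.0, 1.0):
    print(U, classify(U, c_d=1.0, mu=0.5))
```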
I really ought to say something about Bers’ method at this point.
In practice, it is not feasible to apply a $\delta$ function of perturbation and see how it grows. Instead, we can try to understand convective versus absolute instabilities using a periodic and switched-on forcing
\[ F(x, t) = \delta(x)\, H(t)\, e^{-i\omega_f t}, \]
where $H$ is the Heaviside step function. The response in spectral space is then
\[ \tilde\chi(k, \omega) = \frac{\tilde F(k, \omega)}{D(k, \omega)} = \frac{i}{D(k, \omega)(\omega - \omega_f)}. \]
There is a new (simple) pole precisely at $\omega = \omega_f$ on the real axis. We can invert for $t$ to obtain
\[ \tilde\chi(k, t) = \frac{i}{2\pi} \int_{L_\omega} \frac{e^{-i\omega t}}{D(k, \omega)(\omega - \omega_f)}\,\mathrm{d}\omega = \frac{e^{-i\omega_f t}}{D(k, \omega_f)} + \frac{e^{-i\omega(k)t}}{(\omega(k) - \omega_f)\,\frac{\partial D}{\partial \omega}[k, \omega(k)]}. \]
We can then invert for $x$ to obtain
\[ \chi(x, t) = \underbrace{\frac{e^{-i\omega_f t}}{2\pi} \int_{F_k} \frac{e^{ikx}}{D(k, \omega_f)}\,\mathrm{d}k}_{\chi_F(x, t)} + \underbrace{\frac{1}{2\pi} \int_{F_k} \frac{e^{i[kx - \omega(k)t]}}{[\omega(k) - \omega_f]\,\frac{\partial D}{\partial \omega}[k, \omega(k)]}\,\mathrm{d}k}_{\chi_T(x, t)}. \]
The second term is associated with the switch-on transients, and is very similar to what we got in the previous steepest descent analysis.

If the flow is absolutely unstable, then the transients dominate and invade the entire domain. But if the flow is convectively unstable, then this term is swept away, and all that is left is $\chi_F(x, t)$. This gives us a way to distinguish between the two cases.
Note that in the forcing term, we get contributions at the singularities $k_\pm(\omega_f)$. Which one contributes depends on causality. One then checks that the correct result is
\[ \chi_F(x, t) = iH(x)\, \frac{e^{i(k_+(\omega_f)x - \omega_f t)}}{\frac{\partial D}{\partial k}[k_+(\omega_f), \omega_f]} - iH(-x)\, \frac{e^{i(k_-(\omega_f)x - \omega_f t)}}{\frac{\partial D}{\partial k}[k_-(\omega_f), \omega_f]}. \]
Note that $k_\pm(\omega_f)$ may have imaginary parts! If there is some $\omega_f$ such that $k_{+,i}(\omega_f) < 0$, then we will see spatially growing waves in $x > 0$. Similarly, if there exists some $\omega_f$ such that $k_{-,i}(\omega_f) > 0$, then we see spatially growing waves in $x < 0$.
Note that we will see this effect only when we are convectively unstable.
Perhaps it is wise to apply these ideas to an actual fluid dynamics problem. We revisit the broken line shear layer profile, scaled with the velocity jump, but allow a non-zero mean $U_m$:

[Figure: piecewise-linear shear layer with $\bar U = 1 + U_m$ for $z > 1$, $\bar U = z + U_m$ for $|z| < 1$, and $\bar U = -1 + U_m$ for $z < -1$; the perturbation takes the form $A e^{-\alpha(z-1)}$ in the top region, $B e^{-\alpha z} + C e^{\alpha z}$ in the middle region, and $D e^{\alpha(z+1)}$ in the bottom region.]
We do the same interface matching conditions, and after doing the same computations (or waving your hands with Galilean transforms), we get the dispersion relation
\[ 4(\omega - U_m \alpha)^2 = (2\alpha - 1)^2 - e^{-4\alpha}. \]
It is now more natural to scale with $U_m$ rather than $\Delta U/2$, and this involves expressing everything in terms of the velocity ratio $R = \frac{\Delta U}{2U_m}$. Then we can write the dispersion relation as
\[ D(k, \omega; R) = 4(\omega - k)^2 - R^2\left[ (2k - 1)^2 - e^{-4k} \right] = 0. \]
Note that under this scaling, the velocity for $z < -1$ is $u = 1 - R$. In particular, if $R < 1$, then all of the fluid is flowing in the same direction, and we might expect the flow to "carry away" the perturbations, and this is indeed true.
The absolute/convective boundary is given by the frequency at the wavenumber of zero group velocity:
\[ \frac{\partial \omega}{\partial k}(k_0) = 0. \]
This gives us
\[ \omega_0 = k_0 - \frac{R^2}{2}\left[ 2k_0 - 1 + e^{-4k_0} \right]. \]
Plugging this into the dispersion relation, we obtain
\[ R^2\left[ 2k_0 - 1 + e^{-4k_0} \right]^2 - \left[ (2k_0 - 1)^2 - e^{-4k_0} \right] = 0. \]
We solve for $k_0$, which is in general complex, and then solve for $\omega_0$ to see if it leads to $\omega_{0,i} > 0$. This has to be done numerically, and when we do this, we see that the convective/absolute boundary occurs precisely at $R = 1$.
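Since the saddle-point condition has to be solved numerically anyway, here is a hedged sketch of how one might do it (assumptions: scipy is available, and the initial guess lands on the physically relevant pinch point; in practice one should verify Briggs' pinching criterion for the root that is found).

```python
import numpy as np
from scipy.optimize import fsolve

def saddle_condition(kv, R):
    """Equation satisfied by the complex k0 after eliminating omega0 (derived above)."""
    k = kv[0] + 1j * kv[1]
    f = R**2 * (2*k - 1 + np.exp(-4*k))**2 - ((2*k - 1)**2 - np.exp(-4*k))
    return [f.real, f.imag]

def omega0(R, guess=(0.2, 0.2)):
    kr, ki = fsolve(saddle_condition, guess, args=(R,))
    k0 = kr + 1j * ki
    return k0 - 0.5 * R**2 * (2*k0 - 1 + np.exp(-4*k0))

for R in (0.9, 1.0, 1.1):
    print(R, omega0(R).imag)    # expect a sign change near R = 1
```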
Gaster relation
Temporal and spatial instabilities are related close to a marginally stable state. This is a flow at a critical parameter $R_c$ with critical (real) wavenumber and frequency: $D(k_c, \omega_c; R_c) = 0$ with $\omega_{c,i} = k_{c,i} = 0$.
We can Taylor expand the dispersion relation
\[ \omega = \omega_c + \frac{\partial \omega}{\partial k}(k_c; R_c)\,[k - k_c]. \]
We take the imaginary part
\[ \omega_i = \frac{\partial \omega_i}{\partial k_r}(k_c, R_c)\,(k_r - k_c) + \frac{\partial \omega_r}{\partial k_r}(k_c, R_c)\, k_i. \]
For the temporal mode, we have $k_i = 0$, and so
\[ \omega_i^{(T)} = \frac{\partial \omega_i}{\partial k_r}(k_c, R_c)\,(k_r - k_c). \]
For the spatial mode, we have $\omega_i = 0$, and so
\[ 0 = \frac{\partial \omega_i}{\partial k_r}(k_c, R_c)\,[k_r - k_c] + \frac{\partial \omega_r}{\partial k_r}(k_c, R_c)\, k_i^{(S)}. \]
Remembering that $c_g = \frac{\partial \omega_r}{\partial k_r}$, we find that
\[ \omega_i^{(T)} = -c_g\, k_i^{(S)}. \]
This gives us a relation between the growth rates of the temporal mode and the spatial mode when we are near the marginally stable state.

Bizarrely, this is often a good approximation even when we are far from the marginal state.
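As a quick, hypothetical check of the Gaster relation, we can reuse the Ginzburg–Landau dispersion relation from the earlier example, where both the temporal and the spatial branches are available in closed form; the parameter values below are illustrative only.

```python
import numpy as np

U, c_d, mu = 2.0, 0.5, 1.0
k_c = np.sqrt(mu)                      # marginal (real) wavenumber
c_g = U + 2 * c_d * k_c                # group velocity d(omega_r)/dk_r at marginality

k_r = 0.95 * k_c                       # slightly inside the unstable band
omega_T = mu - k_r**2                  # temporal growth rate at real k = k_r

omega_r = U * k_r + c_d * k_r**2       # impose a real forcing frequency
roots = np.roots([c_d - 1j, U, 1j*mu - omega_r])   # spatial branches k(omega_r)
k_S = roots[np.argmin(np.abs(roots - k_r))]        # pick the branch near k_r

print(omega_T, -c_g * k_S.imag)        # Gaster relation: roughly equal near marginality
```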
Global instabilities
So far, we have always been looking at flows that were parallel, i.e. the base flow depends on $z$ alone. However, in real life, flows tend to evolve downstream. Thus, we want to consider base flows $U = U(x, z)$.

Let $\lambda$ be the characteristic wavelength of the perturbation, and $L$ be the characteristic length scale of the change in $U$. If we assume $\varepsilon \equiv \frac{\lambda}{L} \ll 1$, then we may want to perform some local analysis.

To leading order in $\varepsilon$, the evolution is governed by the frozen dispersion relation at each $X = \varepsilon x$. We can then extend notions of stability/convective/absolute to local notions, e.g. a flow is locally convectively unstable if there is some $X$ such that $\omega_{i,\max}(X) > 0$, but $\omega_{0,i}(X) < 0$ for all $X$.
However, we can certainly imagine some complicated interactions between the different regions. For example, a perturbation upstream may be swept downstream by the flow, and then get "stuck" somewhere down there. In general, we can define

Definition (Global stability). A flow is globally stable if $\lim_{t \to \infty} G(x, t) = 0$ for all $x$.

A flow is globally unstable if there is some $x$ such that $\lim_{t \to \infty} G(x, t) \to \infty$.
For a steady spatially developing flow, global modes are
\[ \chi(x, t) = \phi(x)\, e^{-i\omega_G t}. \]
The complex global frequency $\omega_G$ is determined analogously to before using complex integration.

It can be established that $\omega_{G,i} \leq \omega_{0,i,\max}$. This then gives a necessary condition for global instability: there must be a region of local absolute instability within the flow.
Sometimes this is a good predictor, i.e. $R_{Gc} \simeq R_t$. For example, with mixing layers, we have $R_t = 1.315$ while $R_{Gc} = 1.34$. On the other hand, it is sometimes poor. For example, for bluff-body wakes, we have $\mathrm{Re}_t = 25$ while $\mathrm{Re}_{Gc} = 48.5$.
3 Transient growth
3.1 Motivation
So far, our model of stability is quite simple. We linearize our theory, look at the individual perturbation modes, and say the system is unstable if there is exponential growth. In certain circumstances, it works quite well. In others, it is just completely wrong.
There are six billion kilometers of pipes in the United States alone, where the
flow is turbulent. A lot of energy is spent pumping fluids through these pipes,
and turbulence is not helping. So we might think we should try to understand
flow in a pipe mathematically, and see if it gives ways to improve the situation.
Unfortunately, we can prove that flow in a pipe is linearly stable for all
Reynolds numbers. The analysis is not wrong. The flow is indeed linearly stable.
The real problem is that linear stability is not the right thing to consider.
We get similar issues with plane Poiseuille flow, i.e. a pressure driven flow
between horizontal plates. As we know from IB Fluids, the flow profile is a
parabola:
One can analyze this and prove that it is linearly unstable only when
\[ \mathrm{Re} = \frac{U_c d}{\nu} > 5772. \]
However, it is observed to be unstable at much lower $\mathrm{Re}$.
We have an even more extreme issue for plane Couette flow. This is flow between two plates at $z = \pm 1$ (after rescaling) driven at speeds of $\pm 1$ (after rescaling). Thus, the base flow is given by
\[ \bar U = z, \qquad |z| \leq 1. \]
Assuming the fluid is inviscid, the Rayleigh equation then tells us perturbations obey
\[ (\bar U - c)\left( \frac{\mathrm{d}^2}{\mathrm{d}z^2} - k^2 \right)\hat w - \left( \frac{\mathrm{d}^2 \bar U}{\mathrm{d}z^2} \right)\hat w = 0. \]
Since $\bar U = z$, the second derivative term drops out and this becomes
\[ (\bar U - c)\left( \frac{\mathrm{d}^2}{\mathrm{d}z^2} - k^2 \right)\hat w = 0. \]
If we want our solution to be smooth, or even just continuously differentiable, then we need $\left( \frac{\mathrm{d}^2}{\mathrm{d}z^2} - k^2 \right)\hat w = 0$. So the solution is of the form
\[ \hat w = A \sinh k(z + 1) + B \sinh k(z - 1). \]
However, to satisfy the boundary conditions $\hat w(\pm 1) = 0$, we must have $A = B = 0$, i.e. $\hat w = 0$.
Of course, it is not true that there can be no perturbations. Instead, we have
to relax the requirement that the eigenfunction is smooth. We shall allow it
to be non-differentiable at certain points, but still require that it is continuous
(alternatively, we relax differentiability to weak differentiability).
The fundamental assumption that the eigenfunction is smooth must be
relaxed. Let’s instead consider a solution of the form
\[ \hat w_+ = A_+ \sinh k(z - 1) \quad \text{for } z > z_c, \qquad \hat w_- = A_- \sinh k(z + 1) \quad \text{for } z < z_c. \]
If we require the vertical velocity to be continuous at the critical layer, then we must have the matching condition
\[ A_+ \sinh k(z_c - 1) = A_- \sinh k(z_c + 1). \]
This still satisfies the Rayleigh equation if $\bar U = c$ at the critical layer $z = z_c$. Note that $u$ is discontinuous at the critical layer, because incompressibility requires
\[ \frac{\partial w}{\partial z} = -\frac{\partial u}{\partial x} = -iku. \]
So for all
|c|
=
|ω/k| <
1, there is a (marginally stable) mode. The spectrum
is continuous. There is no discrete spectrum. This is quite weird, compared to
what we have previously seen.
But still, we have only found modes with a real $c$, since $\bar U$ is real! Thus, we conclude that inviscid plane Couette flow is stable! Viscosity regularizes the flow, but it turns out that it does not linearly destabilize the flow at any Reynolds number (Romanov, 1973).
Experimentally, and numerically, plane Couette flow is known to exhibit a
rich array of dynamics.
– Up to $\mathrm{Re} \approx 280$, we have laminar flow.
– Up to $\mathrm{Re} \approx 325$, we have transient spots.
– Up to $\mathrm{Re} \approx 415$, we have sustained spots and stripes.
– For $\mathrm{Re} > 415$, we have fully-developed turbulence.
In this chapter, we wish to understand transient growth. This is the case when
small perturbations can grow up to some visible, significant size, and then die
off.
3.2 A toy model
Let’s try to understand transient dynamics in a finite-dimensional setting. Ul-
timately, the existence of transient growth is due to the non-normality of the
operator.
Recall that a matrix $A$ is normal iff $A^\dagger A = A A^\dagger$. Of course, self-adjoint matrices are examples of normal operators. The spectral theorem says a normal operator has a complete basis of orthonormal eigenvectors, which is the situation we understand well. However, if our operator is not normal, then we don't necessarily have a basis of eigenvectors, and even if we do, they need not be orthonormal.
So suppose we are in a 2-dimensional world, and we have two eigenvectors $\Phi_1$ and $\Phi_2$ that are very close to each other:

[diagram: two nearly parallel unit vectors $\Phi_1$ and $\Phi_2$, with their small difference marked in red]

Now suppose we have a small perturbation given by $\boldsymbol{\varepsilon} = (\varepsilon, 0)^T$. While this perturbation is very small in magnitude, if we want to expand it in the basis $\Phi_1$ and $\Phi_2$, we must use coefficients that are themselves quite large. In this case, we might have $\boldsymbol{\varepsilon} = \Phi_2 - \Phi_1$, as indicated in red in the diagram above.
Let's let this evolve in time. Suppose both $\Phi_1$ and $\Phi_2$ are stable modes, but $\Phi_1$ decays much more quickly than $\Phi_2$. Then after some time, the $\Phi_1$ component has died away and the perturbation looks like a whole multiple of $\Phi_2$.

Note that the perturbation grows to reach a finite, large size, until it eventually vanishes again as the $\Phi_2$ component also decays.
Let’s try to put this down more concretely in terms of equations. We shall
consider a linear ODE of the form
\[ \dot{\mathbf{x}} = A\mathbf{x}. \]
We first begin by considering the matrix
\[ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \]
which does not exhibit transient growth. We first check that $A$ is not normal. Indeed,
\[ A A^T = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = A^T A. \]
Note that the matrix $A$ has a repeated eigenvalue of $0$ with a single eigenvector $(1\ 0)^T$.
To solve this system, write the equations more explicitly as
\[ \dot x_1 = x_2, \qquad \dot x_2 = 0. \]
If we impose the initial condition
\[ (x_1, x_2)(0) = (x_{10}, x_{20}), \]
then the solution is
\[ x_1(t) = x_{20} t + x_{10}, \qquad x_2(t) = x_{20}. \]
This exhibits linear, algebraic growth instead of the familiar exponential growth.
Let's imagine we perturb this system slightly. We can think of our previous system as the $\mathrm{Re} = \infty$ approximation, and this small perturbation as the effect of a large but finite Reynolds number. For $\varepsilon > 0$, set
\[ A_\varepsilon = \begin{pmatrix} -\varepsilon & 1 \\ 0 & -2\varepsilon \end{pmatrix}. \]
We then have eigenvalues
\[ \lambda_1 = -\varepsilon, \qquad \lambda_2 = -2\varepsilon, \]
corresponding to eigenvectors
\[ \mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \mathbf{e}_2 = \begin{pmatrix} 1 \\ -\varepsilon \end{pmatrix}. \]
Notice that the system is now stable. However, the eigenvectors are very close to being parallel.
As we previously discussed, if we have an initial perturbation $(0, \varepsilon)^T$, then it is expressed in this basis as $\mathbf{e}_1 - \mathbf{e}_2$. As we evolve this in time, the $\mathbf{e}_2$ component decays more quickly than $\mathbf{e}_1$. So after some time, the $\mathbf{e}_2$ term is mostly gone, and what is left is a finite multiple of $\mathbf{e}_1$, and generically, we expect this to have magnitude larger than $\varepsilon$!
We try to actually solve this. The second row of $\dot{\mathbf{x}} = A_\varepsilon \mathbf{x}$ gives us
\[ \dot x_2 = -2\varepsilon x_2. \]
This is easy to solve to get
\[ x_2 = x_{20}\, e^{-2\varepsilon t}. \]
Plugging this into the first equation, we have
\[ \dot x_1 = -\varepsilon x_1 + x_{20}\, e^{-2\varepsilon t}. \]
We would expect a solution of the form $x_1 = A e^{-\varepsilon t} + B e^{-2\varepsilon t}$, where the first term is the homogeneous solution and the second comes from a particular solution. Plugging this into the equation and applying our initial conditions, we find
\[ x_1 = \left( x_{10} + \frac{x_{20}}{\varepsilon} \right) e^{-\varepsilon t} - \frac{x_{20}}{\varepsilon}\, e^{-2\varepsilon t}. \]
Let us set
\[ y_{10} = x_{10} + \frac{x_{20}}{\varepsilon}, \qquad y_{20} = -\frac{x_{20}}{\varepsilon}. \]
Then the full solution is
\[ \mathbf{x} = y_{10}\, e^{-\varepsilon t}\, \mathbf{e}_1 + y_{20}\, e^{-2\varepsilon t}\, \mathbf{e}_2. \]
For $\varepsilon t \gg 1$, we know that $\mathbf{x} \approx y_{10}\, e^{-\varepsilon t}\, \mathbf{e}_1$. So our solution is eventually an exponentially decaying solution.
But how about early times? Let's consider the magnitude of $\mathbf{x}$. We have
\[ \|\mathbf{x}\|^2 = y_{10}^2\, e^{-2\varepsilon t}\, \mathbf{e}_1 \cdot \mathbf{e}_1 + 2 y_{10} y_{20}\, e^{-3\varepsilon t}\, \mathbf{e}_1 \cdot \mathbf{e}_2 + y_{20}^2\, e^{-4\varepsilon t}\, \mathbf{e}_2 \cdot \mathbf{e}_2 = y_{10}^2\, e^{-2\varepsilon t} + 2 y_{10} y_{20}\, e^{-3\varepsilon t} + (1 + \varepsilon^2)\, y_{20}^2\, e^{-4\varepsilon t}. \]
If y
10
= 0 or y
20
= 0, then this corresponds to pure exponential decay.
Thus, consider the situation where $y_{20} = -a y_{10}$ for $a \neq 0$. Doing some manipulations, we find that
\[ \|\mathbf{x}\|^2 = y_{10}^2\left( 1 - 2a + a^2(1 + \varepsilon^2) \right) + y_{10}^2\left( -2 + 6a - 4a^2(1 + \varepsilon^2) \right)\varepsilon t + O(\varepsilon^2 t^2). \]
Therefore we have initial growth if
\[ 4a^2(1 + \varepsilon^2) - 6a + 2 < 0. \]
Equivalently, if $a_- < a < a_+$ with
\[ a_\pm = \frac{3 \pm \sqrt{1 - 8\varepsilon^2}}{4(1 + \varepsilon^2)}. \]
Expanding in $\varepsilon$, we have
\[ a_+ = 1 - 2\varepsilon^2 + O(\varepsilon^4), \qquad a_- = \frac{1 + \varepsilon^2}{2} + O(\varepsilon^4). \]
These correspond to $x_{20} \simeq \frac{x_{10}}{2\varepsilon}$ and $x_{20} \simeq \varepsilon x_{10}$ respectively, for $\varepsilon \ll 1$. This is interesting, since in the first case, we have $x_{20} \gg x_{10}$, while in the second case, we have the opposite. So this covers a wide range of possible $x_{10}$, $x_{20}$.
What is the best initial condition to start with if we want the largest possible growth? Let's write everything in terms of $a$ for convenience, so that the energy is given by
\[ E = \frac{\mathbf{x} \cdot \mathbf{x}}{2} = \frac{y_{10}^2}{2}\left( e^{-2\varepsilon t} - 2a\, e^{-3\varepsilon t} + a^2(1 + \varepsilon^2)\, e^{-4\varepsilon t} \right). \]
Take the time derivative of this to get
\[ \frac{\mathrm{d}E}{\mathrm{d}t} = -\varepsilon\, \frac{y_{10}^2}{2}\, e^{-2\varepsilon t}\left( 2 - 6(a e^{-\varepsilon t}) + 4(1 + \varepsilon^2)(a e^{-\varepsilon t})^2 \right). \]
Setting $\hat a = a e^{-\varepsilon t}$, we have $\frac{\mathrm{d}E}{\mathrm{d}t} > 0$ iff
\[ 2 - 6\hat a + 4\hat a^2(1 + \varepsilon^2) < 0. \]
When $t = 0$, we have $\hat a = a$, and we saw that there is initial growth if $a_- < a < a_+$. We now see that we continue to have growth as long as $\hat a$ lies in this region. To see when $E$ reaches a maximum, we set $\frac{\mathrm{d}E}{\mathrm{d}t} = 0$, and so we have
\[ (a_- - a e^{-\varepsilon t})(a e^{-\varepsilon t} - a_+) = 0. \]
So a priori, this may happen when $\hat a = a_-$ or $a_+$. However, we know it must occur when $\hat a$ hits $a_-$, since $\hat a$ is decreasing with time. Call the time when this happens $t_{\max}$, which we compute to be given by
\[ \varepsilon t_{\max} = \log \frac{a}{a_-}. \]
Now consider the energy gain
\[ G = \frac{E(t_{\max})}{E(0)} = \frac{a_-^2}{a^2} \cdot \frac{a_-^2(1 + \varepsilon^2) - 2a_- + 1}{a^2(1 + \varepsilon^2) - 2a + 1}. \]
Setting $\frac{\mathrm{d}G}{\mathrm{d}a} = 0$, we find that we need $a = a_+$, and so
\[ \max_a G = \frac{(3a_- - 1)(1 - a_-)}{(3a_+ - 1)(1 - a_+)}. \]
We can try to explicitly compute the value of the maximum energy gain, using our first-order approximations to $a_\pm$:
\[ \max_a G = \frac{\left( \frac{3(1 + \varepsilon^2)}{2} - 1 \right)\left( 1 - \frac{1 + \varepsilon^2}{2} \right)}{\left( 3(1 - 2\varepsilon^2) - 1 \right)\left( 1 - (1 - 2\varepsilon^2) \right)} = \frac{(1 + 3\varepsilon^2)(1 - \varepsilon^2)}{16(1 - 3\varepsilon^2)\varepsilon^2} \approx \frac{1}{16\varepsilon^2}. \]
So we see that we can have very large transient growth for a small $\varepsilon$.
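This prediction is easy to test numerically. The following is a minimal sketch (assuming numpy/scipy; not part of the lectures) that evolves the toy system from the near-optimal initial condition $x_{20} \simeq x_{10}/(2\varepsilon)$ and compares the peak energy gain with $1/(16\varepsilon^2)$.

```python
import numpy as np
from scipy.linalg import expm

eps = 0.01
A = np.array([[-eps, 1.0],
              [0.0, -2.0*eps]])

# near-optimal initial condition from the analysis: x20 ~ x10 / (2 eps)
x0 = np.array([1.0, 1.0/(2.0*eps)])
x0 /= np.linalg.norm(x0)

ts = np.linspace(0.0, 5.0/eps, 2000)
gains = [np.linalg.norm(expm(A*t) @ x0)**2 for t in ts]

print("peak gain    :", max(gains))
print("1/(16 eps^2) :", 1.0/(16.0*eps**2))
```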
How about the case where there is an unstable mode? We can consider a perturbation of the form
\[ A = \begin{pmatrix} \varepsilon_1 & 1 \\ 0 & -\varepsilon_2 \end{pmatrix}, \]
with eigenvectors
\[ \mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \mathbf{e}_2 = \frac{1}{\sqrt{1 + (\varepsilon_1 + \varepsilon_2)^2}} \begin{pmatrix} 1 \\ -(\varepsilon_1 + \varepsilon_2) \end{pmatrix}. \]
We then have one growing mode and another decaying mode. Again, we have two eigenvectors that are very close to being parallel. We can do very similar computations, and see that what this gives us is the possibility of a large initial growth despite the fact that $\varepsilon_1$ is very small. In general, this growth scales as $\frac{1}{1 - \mathbf{e}_1 \cdot \mathbf{e}_2}$.
3.3 A general mathematical framework
Let’s try to put this phenomenon in a general framework, which would be
helpful since the vector spaces we deal with in fluid dynamics are not even
finite-dimensional. Suppose $\mathbf{x}$ evolves under an equation
\[ \dot{\mathbf{x}} = A\mathbf{x}. \]
Given an inner product $\langle \cdot, \cdot \rangle$ on our vector space, we can define the adjoint of $A$ by requiring
\[ \langle \mathbf{x}, A\mathbf{y} \rangle = \langle A^\dagger \mathbf{x}, \mathbf{y} \rangle \]
for all $\mathbf{x}, \mathbf{y}$. To see why we should care about the adjoint, note that in our previous example, the optimal perturbation came from an eigenvector of $A^\dagger$ for $\lambda_1 = \varepsilon_1$, namely $\begin{pmatrix} \varepsilon_1 + \varepsilon_2 \\ 1 \end{pmatrix}$, and we might conjecture that this is a general phenomenon.
For our purposes, we assume $A$ and hence $A^\dagger$ have a basis of eigenvectors. First of all, observe that the eigenvalues of $A^\dagger$ are the complex conjugates of the eigenvalues of $A$. Indeed, let $\mathbf{v}_1, \ldots, \mathbf{v}_n$ be a basis of eigenvectors of $A$ with eigenvalues $\lambda_1, \ldots, \lambda_n$, and $\mathbf{w}_1, \ldots, \mathbf{w}_n$ a basis of eigenvectors of $A^\dagger$ with eigenvalues $\mu_1, \ldots, \mu_n$. Then we have
\[ \lambda_j \langle \mathbf{w}_i, \mathbf{v}_j \rangle = \langle \mathbf{w}_i, A\mathbf{v}_j \rangle = \langle A^\dagger \mathbf{w}_i, \mathbf{v}_j \rangle = \bar\mu_i \langle \mathbf{w}_i, \mathbf{v}_j \rangle. \]
But since the inner product is non-degenerate, for each $i$, it cannot be that $\langle \mathbf{w}_i, \mathbf{v}_j \rangle = 0$ for all $j$. So there must be some $j$ such that $\lambda_j = \bar\mu_i$.
By picking an appropriate basis for each eigenspace, we can arrange the eigenvectors so that $\langle \mathbf{w}_i, \mathbf{v}_j \rangle = 0$ unless $i = j$, and $\|\mathbf{w}_i\| = \|\mathbf{v}_i\| = 1$. This is the biorthogonality property. Crucially, the basis is not orthonormal.
Now suppose we are given an initial condition $\mathbf{x}_0$, and we want to solve the equation $\dot{\mathbf{x}} = A\mathbf{x}$. Note that this is trivial if $\mathbf{x}_0 = \mathbf{v}_j$ for some $j$: the solution is simply
\[ \mathbf{x} = e^{\lambda_j t}\, \mathbf{v}_j. \]
Thus, for a general $\mathbf{x}_0$, we should express $\mathbf{x}_0$ as a linear combination of the $\mathbf{v}_j$'s. If we want to write
\[ \mathbf{x}_0 = \sum_{i=1}^{n} \alpha_i \mathbf{v}_i, \]
then using the biorthogonality condition, we should set
\[ \alpha_i = \frac{\langle \mathbf{x}_0, \mathbf{w}_i \rangle}{\langle \mathbf{v}_i, \mathbf{w}_i \rangle}. \]
Note that since we normalized our eigenvectors so that each eigenvector has
norm 1, the denominator
hv
i
, w
i
i
can be small, hence
α
i
can be large, even if
the norm of
x
0
is quite small, and as we have previously seen, this gives rise to
transient growth if the eigenvalue of
v
i
is also larger than the other eigenvalues.
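As a small, hypothetical sanity check, we can compute these expansion coefficients numerically for the toy matrix $A_\varepsilon$ from the previous section: the coefficients come out of order one even though the perturbation itself has size $\varepsilon$.

```python
import numpy as np

eps = 0.01
A = np.array([[-eps, 1.0], [0.0, -2.0*eps]])

lam, V = np.linalg.eig(A)            # right eigenvectors v_i (columns)
mu, W = np.linalg.eig(A.conj().T)    # eigenvectors w_i of the adjoint
order = [int(np.argmin(np.abs(mu.conj() - l))) for l in lam]
W = W[:, order]                      # pair w_i with v_i by eigenvalue

x0 = np.array([0.0, eps])            # tiny perturbation, |x0| = eps
alpha = [np.vdot(W[:, i], x0) / np.vdot(W[:, i], V[:, i]) for i in range(2)]
print(np.abs(alpha))                 # coefficients of order one
```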
In our toy example, our condition for the existence of transient growth is that
we have two eigenvectors that are very close to each other. In this formulation,
the requirement is that
hv
i
, w
i
i
is very small, i.e.
v
i
and
w
i
are close to being
orthogonal. But these are essentially the same, since by the biorthogonality
conditions,
w
i
is normal to all other eigenvectors of
A
. So if there is some
eigenvector of
A
that is very close to
v
i
, then
w
i
must be very close to being
orthogonal to v
i
.
Now assuming we have transient growth, the natural question to ask is how large this growth is. We can write the solution to the initial value problem as
\[ \mathbf{x}(t) = e^{At}\, \mathbf{x}_0. \]
The maximum gain at time $t$ is given by
\[ G(t) = \max_{\mathbf{x}_0 \neq 0} \frac{\|\mathbf{x}(t)\|^2}{\|\mathbf{x}(0)\|^2} = \max_{\mathbf{x}_0 \neq 0} \frac{\|e^{At}\mathbf{x}_0\|^2}{\|\mathbf{x}_0\|^2}. \]
This is, by definition, the square of the matrix norm of $e^{At}$.
Definition (Matrix norm). Let $B$ be an $n \times n$ matrix. Then the matrix norm is
\[ \|B\| = \max_{\mathbf{v} \neq 0} \frac{\|B\mathbf{v}\|}{\|\mathbf{v}\|}. \]
To understand the matrix norm, we may consider the eigenvalues of the matrix. Order the eigenvalues of $A$ by their real parts, so that $\operatorname{Re}(\lambda_1) \geq \cdots \geq \operatorname{Re}(\lambda_n)$. Then the gain is clearly bounded below by
\[ G(t) \geq e^{2 \operatorname{Re}(\lambda_1) t}, \]
achieved by the associated eigenvector.
If $A$ is normal, then this is the complete answer. We know the eigenvectors form an orthonormal basis, so we can write $A = V \Lambda V^{-1}$ for $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ and $V$ unitary. Then we have
\[ G(t) = \|e^{At}\|^2 = \|V e^{\Lambda t} V^{-1}\|^2 = \|e^{\Lambda t}\|^2 = e^{2 \operatorname{Re}(\lambda_1) t}. \]
But life gets enormously more complicated when matrices are non-normal. As a simple example, the matrix $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ has $1$ as the unique eigenvalue, but applying it to $(0, 1)^T$ results in a vector of length $\sqrt{2}$.
In the non-normal case, it would still be convenient to be able to diagonalize $e^{At}$ in some sense, so that we can read off its norm. To do so, we must relax what we mean by diagonalizing. Instead of finding a $U$ such that $U^\dagger e^{At} U$ is diagonal, we find unitary matrices $U, V$ such that
\[ U^\dagger e^{At} V = \Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_n), \]
where $\sigma_i \in \mathbb{R}$ and $\sigma_1 \geq \cdots \geq \sigma_n \geq 0$. We can always do this. This is known as the singular value decomposition, and the diagonal entries $\sigma_i$ are called the singular values. We then have
\[ G(t) = \|e^{At}\|^2 = \max_{\mathbf{x} \neq 0} \frac{(e^{At}\mathbf{x}, e^{At}\mathbf{x})}{(\mathbf{x}, \mathbf{x})} = \max_{\mathbf{x} \neq 0} \frac{(U\Sigma V^\dagger \mathbf{x}, U\Sigma V^\dagger \mathbf{x})}{(\mathbf{x}, \mathbf{x})} = \max_{\mathbf{x} \neq 0} \frac{(\Sigma V^\dagger \mathbf{x}, \Sigma V^\dagger \mathbf{x})}{(\mathbf{x}, \mathbf{x})} = \max_{\mathbf{y} \neq 0} \frac{(\Sigma \mathbf{y}, \Sigma \mathbf{y})}{(\mathbf{y}, \mathbf{y})} = \sigma_1^2(t). \]
If we have an explicit singular value decomposition, then this tells us the optimal initial condition if we want to maximize $G(t)$, namely the first column of $V$.
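For a finite-dimensional system this recipe is one line of linear algebra. The sketch below (hypothetical, reusing the toy matrix $A_\varepsilon$) reads off $G(t)$ and the optimal initial condition from the SVD of the propagator.

```python
import numpy as np
from scipy.linalg import expm, svd

eps, t = 0.01, 69.0                  # t chosen near the analytic t_max
A = np.array([[-eps, 1.0], [0.0, -2.0*eps]])

U, s, Vh = svd(expm(A*t))            # e^{At} = U diag(s) V^dagger
print("G(t)       =", s[0]**2)       # maximum energy gain at time t
print("optimal x0 =", Vh[0])         # first column of V (first row of V^dagger)
```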
3.4 Orr-Sommerfeld and Squire equations
Let’s now see how this is relevant to our fluid dynamics problems. For this
chapter, we will use the “engineering” coordinate system, so that the
y
direction
is the vertical direction. The
x, y, z
directions are known as the streamwise
direction, wall-normal direction and spanwise direction respectively.
Again suppose we have a base shear flow $U(y)$ subject to some small perturbations $(u, v, w)$. We can write down our equations as
\begin{align*}
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} &= 0 \\
\frac{\partial u}{\partial t} + U \frac{\partial u}{\partial x} + v U' &= -\frac{\partial p}{\partial x} + \frac{1}{\mathrm{Re}} \nabla^2 u \\
\frac{\partial v}{\partial t} + U \frac{\partial v}{\partial x} &= -\frac{\partial p}{\partial y} + \frac{1}{\mathrm{Re}} \nabla^2 v \\
\frac{\partial w}{\partial t} + U \frac{\partial w}{\partial x} &= -\frac{\partial p}{\partial z} + \frac{1}{\mathrm{Re}} \nabla^2 w.
\end{align*}
Again our strategy is to reduce these to a single, higher order equation in $v$. To get rid of the pressure term, we differentiate the second, third and fourth equations with respect to $x$, $y$ and $z$ respectively, add them, and apply incompressibility to obtain
\[ \nabla^2 p = -2 U' \frac{\partial v}{\partial x}. \]
By applying $\nabla^2$ to the third equation, we get an equation for the wall-normal velocity:
\[ \left[ \left( \frac{\partial}{\partial t} + U \frac{\partial}{\partial x} \right) \nabla^2 - U'' \frac{\partial}{\partial x} - \frac{1}{\mathrm{Re}} \nabla^4 \right] v = 0. \]
We would like to impose the boundary conditions $v = \frac{\partial v}{\partial y} = 0$, but together with initial conditions, this is not enough to specify the solution uniquely, as we have a fourth order equation. Thus, we shall require the vorticity to vanish at the boundary as well. The wall-normal vorticity is defined by
\[ \eta = \omega_y = \frac{\partial u}{\partial z} - \frac{\partial w}{\partial x}. \]
By taking $\frac{\partial}{\partial z}$ of the second equation and subtracting $\frac{\partial}{\partial x}$ of the last, we then get the equation
\[ \left[ \frac{\partial}{\partial t} + U \frac{\partial}{\partial x} - \frac{1}{\mathrm{Re}} \nabla^2 \right] \eta = -U' \frac{\partial v}{\partial z}. \]
As before, we decompose our perturbations into Fourier modes:
\[ v(x, y, z, t) = \hat v(y)\, e^{i(\alpha x + \beta z - \omega t)}, \qquad \eta(x, y, z, t) = \hat\eta(y)\, e^{i(\alpha x + \beta z - \omega t)}. \]
For convenience, we set $k^2 = \alpha^2 + \beta^2$ and write $D$ for the $y$ derivative. We then obtain the Orr–Sommerfeld equation and Squire equation with boundary conditions $\hat v = D\hat v = \hat\eta = 0$:
\begin{align*}
\left[ (-i\omega + i\alpha U)(D^2 - k^2) - i\alpha U'' - \frac{1}{\mathrm{Re}}(D^2 - k^2)^2 \right] \hat v &= 0 \\
\left[ (-i\omega + i\alpha U) - \frac{1}{\mathrm{Re}}(D^2 - k^2) \right] \hat\eta &= -i\beta U' \hat v.
\end{align*}
Note that if we set $\mathrm{Re} = \infty$, then the first equation reduces to the Rayleigh equation.
Let’s think a bit more about the Squire equation. Notice that there is an
explicit
ˆv
term on the right. Thus, the equation is forced by the wall-normal
velocity. In general, we can distinguish between two classes of modes:
(i)
Squire modes, which are solutions to the homogeneous problem with
ˆv
= 0;
(ii) Orr-Sommerfeld modes, which are particular integrals for the actual ˆv;
The Squire modes are always damped. Indeed, set $\hat v = 0$, write $\omega = \alpha c$, multiply the Squire equation by $\hat\eta^*$ and integrate across the domain:
\[ c \int_{-1}^{1} \hat\eta^* \hat\eta\,\mathrm{d}y = \int_{-1}^{1} U \hat\eta^* \hat\eta\,\mathrm{d}y - \frac{i}{\alpha \mathrm{Re}} \int_{-1}^{1} \hat\eta^* (k^2 - D^2)\hat\eta\,\mathrm{d}y. \]
Taking the imaginary part and integrating by parts, we obtain
\[ c_i \int_{-1}^{1} |\hat\eta|^2\,\mathrm{d}y = -\frac{1}{\alpha \mathrm{Re}} \int_{-1}^{1} \left( |D\hat\eta|^2 + k^2 |\hat\eta|^2 \right)\mathrm{d}y < 0. \]
So we see that the Squire modes are always stable, and instability in vorticity
comes from the forcing due to the velocity.
It is convenient to express the various operators in compact vector form. Define
\[ \hat{\mathbf{q}} = \begin{pmatrix} \hat v \\ \hat\eta \end{pmatrix}, \qquad M = \begin{pmatrix} k^2 - D^2 & 0 \\ 0 & 1 \end{pmatrix}, \qquad L = \begin{pmatrix} L_{OS} & 0 \\ i\beta U' & L_{SQ} \end{pmatrix}, \]
where
\begin{align*}
L_{OS} &= i\alpha U(k^2 - D^2) + i\alpha U'' + \frac{1}{\mathrm{Re}}(k^2 - D^2)^2 \\
L_{SQ} &= i\alpha U + \frac{1}{\mathrm{Re}}(k^2 - D^2).
\end{align*}
We can then write our equations as
\[ L \hat{\mathbf{q}} = i\omega M \hat{\mathbf{q}}. \]
This form of equation reminds us of what we saw in Sturm–Liouville theory, where we had an eigenvalue equation of the form $Lu = \lambda w u$ for some weight function $w$. Here $M$ is not just a weight function, but a differential operator. However, the principle is the same. First of all, this tells us the correct inner product to use on our space of functions is
\[ \langle \mathbf{p}, \mathbf{q} \rangle = \int \mathbf{p}^\dagger M \mathbf{q}\,\mathrm{d}y. \tag{$\dagger$} \]
We can make the above equation look like an actual eigenvalue problem by writing it as
\[ M^{-1} L \hat{\mathbf{q}} = i\omega\, \hat{\mathbf{q}}. \]
We want to figure out if the operator $M^{-1}L$ is self-adjoint under ($\dagger$), because this tells us something about its eigenvalues and eigenfunctions. So in particular, we should be able to figure out what the adjoint should be. By definition, we have
\[ \langle \mathbf{p}, M^{-1}L \mathbf{q} \rangle = \int \mathbf{p}^\dagger M M^{-1} L \mathbf{q}\,\mathrm{d}y = \int \mathbf{p}^\dagger L \mathbf{q}\,\mathrm{d}y = \int \left( \mathbf{q}^\dagger (L^\dagger \mathbf{p}) \right)^*\,\mathrm{d}y = \int \left( \mathbf{q}^\dagger M (M^{-1} L^\dagger \mathbf{p}) \right)^*\,\mathrm{d}y, \]
where $L^\dagger$ is the adjoint of $L$ under the $L^2$ norm. So the adjoint eigenvalue equation is
\[ L^\dagger \tilde{\mathbf{q}} = -i\omega M \tilde{\mathbf{q}}. \]
Here we introduced a negative sign on the right hand side, which morally comes from the fact that we "took the conjugate" of $i$. Practically, adopting this sign convention makes, for example, the statement of biorthogonality cleaner.
So mathematically, we know we should take the inner product as $\int \mathbf{p}^\dagger M \mathbf{q}\,\mathrm{d}y$. Physically, what does this mean? From incompressibility
\[ \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0, \]
plugging in our series expansions of $v$, $\eta$ etc. gives us
\[ i\alpha \hat u + i\beta \hat w = -D\hat v, \qquad i\beta \hat u - i\alpha \hat w = \hat\eta. \]
So we find that
\[ \hat u = \frac{i}{k^2}(\alpha D\hat v - \beta \hat\eta), \qquad \hat w = \frac{i}{k^2}(\beta D\hat v + \alpha \hat\eta). \]
Thus, we have
\[ \frac{1}{2}\left( |\hat u|^2 + |\hat w|^2 \right) = \frac{1}{2k^2}\left( |D\hat v|^2 + |\hat\eta|^2 \right), \]
and the total energy is
\[ E = \int \frac{1}{2k^2}\left( |D\hat v|^2 + k^2 |\hat v|^2 + |\hat\eta|^2 \right)\mathrm{d}y = \frac{1}{2k^2} \int_{-1}^{1} \begin{pmatrix} \hat v^* & \hat\eta^* \end{pmatrix} \begin{pmatrix} k^2 - D^2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \hat v \\ \hat\eta \end{pmatrix} \mathrm{d}y = \frac{1}{2k^2} \langle \mathbf{q}, \mathbf{q} \rangle. \]
So the inner product the mathematics told us to use is in fact, up to a constant scaling, the energy!
We can now try to discretize this, do an SVD (numerically), and find out what the growth rates are like. However, in the next chapter, we shall see that there is a better way of doing so. Thus, in the remainder of this chapter, we shall look at ways in which transient growth can physically manifest itself.
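For completeness, here is a hedged sketch of the "discretize and take an SVD" recipe. It assumes discretized operators are already available (e.g. from a Chebyshev collocation code, which is not shown), that $A$ generates the time evolution $\mathrm{d}\mathbf{q}/\mathrm{d}t = A\mathbf{q}$, and that the Hermitian positive definite matrix $M$ encodes the energy inner product.

```python
import numpy as np
from scipy.linalg import expm, svd, cholesky

def max_gain(A, M, t):
    """G(t) for dq/dt = A q, with energy norm <q, q>_E = q^H M q (M Hermitian > 0)."""
    F = cholesky(M)                              # M = F^H F
    P = F @ expm(A * t) @ np.linalg.inv(F)       # propagator in energy coordinates
    return svd(P, compute_uv=False)[0] ** 2      # largest singular value squared
```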
Orr’s mechanisms
Suppose we have a simple flow profile that looks like this:

[Figure: a simple shear profile $U(y)$.]

Recall that the $z$-direction vorticity is given by
\[ \omega_3 = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}, \]
and evolves as
\[ \frac{\partial \omega_3}{\partial t} + U \frac{\partial \omega_3}{\partial x} = U'\left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right) + v U'' + \frac{1}{\mathrm{Re}} \nabla^2 \omega_3. \]
Assuming constant shear and $\beta = 0$ at high Reynolds number, we have
\[ \frac{\mathrm{D} \omega_3}{\mathrm{D} t} \simeq 0. \]
Suppose we have some striped patch of vorticity, tilted against the shear. In this stripe, we always have $\omega_3 > 0$. Now over time, due to the shear flow, this evolves into a stripe that is shorter and more upright.

Now by incompressibility, the area of this new region is the same as the area of the old. We also argued that $\omega_3$ does not change. So the total quantity $\int_D \omega_3\,\mathrm{d}A = \int_D \nabla \times \mathbf{u} \cdot \mathrm{d}\mathbf{A}$ does not change. But Stokes' theorem says
\[ \int_D \nabla \times \mathbf{u} \cdot \mathrm{d}\mathbf{A} = \oint_{\partial D} \mathbf{u} \cdot \mathrm{d}\boldsymbol{\ell}. \]
Since the left-hand side didn't change, the same must be true for the right hand side. But since the boundary $\partial D$ decreased in length, this implies $\mathbf{u}$ must have increased in magnitude! This growth is only transient, since after some further time, the vorticity patch gets sheared out into a long thin stripe tilted with the shear, and the perimeter grows again.

Thus, if we have vortices that are initially tilted into the shear, then they get advected by the mean shear. In this process, the perimeter of each vortex patch first gets smaller, then grows again. Since $\oint_{\partial D} \mathbf{u} \cdot \mathrm{d}\boldsymbol{\ell}$ is constant, we know $\mathbf{u}$ grows transiently, and then vanishes again.
Lift up
Another mechanism is lift-up. This involves doing some mathematics. Suppose we instead expand our solutions as
\[ v = \tilde v(y, t)\, e^{i(\alpha x + \beta z)}, \qquad \eta = \tilde\eta(y, t)\, e^{i(\alpha x + \beta z)}. \]
Focusing on the $\alpha = 0$ and $\mathrm{Re} = \infty$ case, the Orr–Sommerfeld and Squire equations are
\[ \frac{\partial}{\partial t} \tilde\eta(y, t) = -i\beta U' \tilde v, \qquad \frac{\partial}{\partial t}(D^2 - k^2)\tilde v = 0. \]
Since we have finite depth, except for a few specific values of $k$, the only solution to the second equation is $\tilde v(y, t) = \tilde v_0(y)$, and then
\[ \tilde\eta = \tilde\eta_0 - i\beta U' \tilde v_0\, t. \]
This is an algebraic instability. The constant $\tilde v(y, t)$ means fluid is constantly being lifted up, and we can attribute this to the effect of the streamwise vorticity $\omega_1$ of streamwise rolls.
We should be a bit careful when we consider the case where the disturbance is localized. In this case, we should consider quantities integrated over all $x$. We use a bar to denote this integral, so that, for example, $\bar v = \int_{-\infty}^{\infty} v\,\mathrm{d}x$. Of course, this makes sense only if the disturbance is local, so that the integral converges. Ultimately, we want to understand the growth in the energy, but it is convenient to first understand $\bar v$, $\bar u$, and the long-forgotten $\bar p$. We have three equations
\begin{align*}
\nabla^2 p &= -2U' \frac{\partial v}{\partial x} \\
\frac{\partial v}{\partial t} + U \frac{\partial v}{\partial x} &= -\frac{\partial p}{\partial y} \\
\frac{\partial u}{\partial t} + U \frac{\partial u}{\partial x} + vU' &= -\frac{\partial p}{\partial x}.
\end{align*}
Note that $U$ does not depend on $x$, and all our (small lettered) variables vanish at infinity since the disturbance is local. Thus, integrating the first equation over all $x$, we get $\nabla^2 \bar p = 0$. So $\bar p = 0$. Integrating the second equation then tells us $\frac{\partial \bar v}{\partial t} = 0$. Finally, plugging this into the integral of the last equation tells us
\[ \bar u = \bar u_0 - \bar v_0 U'\, t. \]
Thus, $\bar u$ grows linearly with time. However, this does not immediately imply that the energy is growing, since the domain is growing as well, and the velocity may be spread over a larger region of space.

Let's suppose $u(x, 0) = 0$ for $|x| > \delta$. Then at time $t$, we would expect $u$ to be non-zero only for $U_{\min} t - \delta < x < U_{\max} t + \delta$. Recall that Cauchy–Schwarz says
\[ \left( \int_{-\infty}^{\infty} f(x) g(x)\,\mathrm{d}x \right)^2 \leq \left( \int_{-\infty}^{\infty} |f(x)|^2\,\mathrm{d}x \right)\left( \int_{-\infty}^{\infty} |g(x)|^2\,\mathrm{d}x \right). \]
Here we can take $f = u$, and
\[ g(x) = \begin{cases} 1 & U_{\min} t - \delta < x < U_{\max} t + \delta \\ 0 & \text{otherwise} \end{cases}. \]
Then applying Cauchy–Schwarz gives us
\[ \bar u^2 \leq \left[ \Delta U\, t + 2\delta \right] \int_{-\infty}^{\infty} u^2\,\mathrm{d}x. \]
So we can bound
\[ E \geq \frac{[\bar u]^2}{2(\Delta U\, t + 2\delta)} \approx \frac{[\bar v_0 U']^2\, t}{2\Delta U} \]
provided $t \gg \frac{2\delta}{\Delta U}$. Therefore the energy grows at least as fast as $t$, but not necessarily as fast as $t^2$.
4 A variational point of view
In this chapter, we are going to learn about a more robust and powerful way
to approach and understand transient growth. Instead of trying to think about
G(t) as a function of t, let us fix some target time T , and just look at G(T ).
For any two times $t_i < t_f$, we can define a propagator function such that
\[ \mathbf{u}(t_f) = \Phi(t_f, t_i)\, \mathbf{u}(t_i) \]
for any solution $\mathbf{u}$ of our equations. In the linear approximation, this propagator is a linear map between appropriate function spaces. Writing $\Phi = \Phi(T, T_0)$, the gain problem is then equivalent to maximizing
\[ G(T, T_0) = \frac{E(T)}{E(T_0)} = \frac{\langle \mathbf{u}_p(T), \mathbf{u}_p(T) \rangle}{\langle \mathbf{u}_p(T_0), \mathbf{u}_p(T_0) \rangle} = \frac{\langle \Phi \mathbf{u}_p(T_0), \Phi \mathbf{u}_p(T_0) \rangle}{\langle \mathbf{u}_p(T_0), \mathbf{u}_p(T_0) \rangle} = \frac{\langle \mathbf{u}_p(T_0), \Phi^\dagger \Phi\, \mathbf{u}_p(T_0) \rangle}{\langle \mathbf{u}_p(T_0), \mathbf{u}_p(T_0) \rangle}. \]
Here the angled brackets denote the natural inner product leading to the energy norm. Note that the operator $\Phi^\dagger \Phi$ is necessarily self-adjoint, and so this $G$ is maximized when $\mathbf{u}_p(T_0)$ is chosen to be the eigenvector of $\Phi^\dagger \Phi$ of maximum eigenvalue.
There is a general method to find the maximum eigenvector of a (self-adjoint) operator such as $\Phi^\dagger \Phi$. We start with a random vector $\mathbf{x}$. Then $(\Phi^\dagger \Phi)^n \mathbf{x} \approx \lambda_1^n \mathbf{v}_1$ as $n \to \infty$, where $\lambda_1$ is the maximum eigenvalue with associated eigenvector $\mathbf{v}_1$. Indeed, if we write $\mathbf{x}$ as a linear combination of eigenvectors, then as we apply $\Phi^\dagger \Phi$ many times, the sum is dominated by the term with the largest eigenvalue.
So if we want to find the mode with maximal transient growth, we only need to be able to compute $\Phi^\dagger \Phi$. The forward propagator $\Phi(T, T_0)$ is something we know how to compute (at least numerically): we simply numerically integrate the Navier–Stokes equations. So we need to understand $\Phi(T, T_0)^\dagger$.
Let $\mathbf{u}(t)$ be a solution to the linear equation
\[ \frac{\mathrm{d}\mathbf{u}}{\mathrm{d}t}(t) = L(t)\, \mathbf{u}(t). \]
Here we may allow $L$ to depend on time. Let $L^\dagger$ be the adjoint, and suppose $\mathbf{v}(t)$ satisfies
\[ \frac{\mathrm{d}\mathbf{v}}{\mathrm{d}t}(t) = -L(t)^\dagger\, \mathbf{v}(t). \]
Then the chain rule tells us
\[ \frac{\mathrm{d}}{\mathrm{d}t} \langle \mathbf{v}(t), \mathbf{u}(t) \rangle = \langle -L(t)^\dagger \mathbf{v}(t), \mathbf{u}(t) \rangle + \langle \mathbf{v}(t), L(t)\mathbf{u}(t) \rangle = 0. \]
So we know that
\[ \langle \mathbf{v}(T_0), \mathbf{u}(T_0) \rangle = \langle \mathbf{v}(T), \mathbf{u}(T) \rangle = \langle \mathbf{v}(T), \Phi(T, T_0)\mathbf{u}(T_0) \rangle. \]
Thus, given a $\mathbf{v}_0$, to compute $\Phi(T, T_0)^\dagger \mathbf{v}_0$, we have to integrate the adjoint equation $\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}t} = -L(t)^\dagger \mathbf{v}$ backwards in time, using the "initial" condition $\mathbf{v}(T) = \mathbf{v}_0$, and then we have
\[ \Phi(T, T_0)^\dagger \mathbf{v}_0 = \mathbf{v}(T_0). \]
What does the adjoint equation look like? For a time-dependent background shear flow, the linearized forward/direct equation for a perturbation $\mathbf{u}_p$ is given by
\[ \frac{\partial \mathbf{u}_p}{\partial t} + (\mathbf{U}(t) \cdot \nabla)\mathbf{u}_p = -\nabla p_p - (\mathbf{u}_p \cdot \nabla)\mathbf{U}(t) + \mathrm{Re}^{-1} \nabla^2 \mathbf{u}_p. \]
The adjoint (linearized) Navier–Stokes equation is then
\[ \frac{\partial \mathbf{u}_a}{\partial t} = -\boldsymbol{\Omega} \times \mathbf{u}_a - \nabla \times (\mathbf{U} \times \mathbf{u}_a) - \nabla p_a - \mathrm{Re}^{-1} \nabla^2 \mathbf{u}_a, \]
where we again have
\[ \nabla \cdot \mathbf{u}_a = 0, \qquad \boldsymbol{\Omega} = \nabla \times \mathbf{U}. \]
This PDE is ill-posed if we wanted to integrate it forwards in time, but that does not concern us, because to compute $\Phi^\dagger$, we have to integrate it backwards in time.
Thus, to find the maximal transient mode, we have to run the direct-adjoint loop: starting from a guess $\mathbf{u}_p(T_0)$, we apply $\Phi$ (integrate forwards) to obtain $\mathbf{u}_p(T)$, then apply $\Phi^\dagger$ (integrate the adjoint backwards) to obtain $\Phi^\dagger\Phi\,\mathbf{u}_p(T_0)$, which we normalize and use as the new guess, and we keep running this until it converges.
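A minimal sketch of this loop, with the forward and adjoint solvers abstracted away as callables (in practice they would wrap a direct numerical simulation; the Euclidean norm stands in for the energy norm), might look as follows.

```python
import numpy as np

def optimal_perturbation(forward, adjoint, q0, n_iter=50, tol=1e-8):
    """forward(q): integrate the direct equations from T0 to T (apply Phi);
       adjoint(q): integrate the adjoint equations backwards (apply Phi^dagger)."""
    q = q0 / np.linalg.norm(q0)
    gain = 0.0
    for _ in range(n_iter):
        qT = forward(q)                          # Phi q
        new_gain = np.linalg.norm(qT)**2         # E(T)/E(T0) for the current guess
        q_new = adjoint(qT)                      # Phi^dagger Phi q
        q = q_new / np.linalg.norm(q_new)
        if abs(new_gain - gain) < tol * max(new_gain, 1.0):
            break
        gain = new_gain
    return gain, q
```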
Using this analysis, we can find some interesting results. For example, in a shear layer flow, 3-dimensional modes with both streamwise and spanwise perturbations are the most unstable. However, in the long run, the Kelvin–Helmholtz modes dominate.
Variational formulation
We can also use variational calculus to find the maximally unstable mode. We think of the calculation as a constrained optimization problem with the following requirements:
(i) For all $T_0 \leq t \leq T$, we have $\frac{\partial \mathbf{q}}{\partial t} = D_t \mathbf{q} = L\mathbf{q}$.
(ii) The initial state is given by $\mathbf{q}(0) = \mathbf{q}_0$.
We will need Lagrange multipliers to impose these constraints, and so the augmented Lagrangian is
\[ G = \frac{\langle \mathbf{q}_T, \mathbf{q}_T \rangle}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle} - \int_0^T \langle \tilde{\mathbf{q}}, (D_t - L)\mathbf{q} \rangle\,\mathrm{d}t + \langle \tilde{\mathbf{q}}_0, \mathbf{q}(0) - \mathbf{q}_0 \rangle. \]
Taking variations with respect to $\tilde{\mathbf{q}}$ and $\tilde{\mathbf{q}}_0$ recovers the evolution equation and the initial condition. Integrating by parts, the integral can be written as
\[ \int_0^T \langle \tilde{\mathbf{q}}, (D_t - L)\mathbf{q} \rangle\,\mathrm{d}t = -\int_0^T \langle \mathbf{q}, (D_t + L^\dagger)\tilde{\mathbf{q}} \rangle\,\mathrm{d}t - \langle \tilde{\mathbf{q}}_0, \mathbf{q}_0 \rangle + \langle \tilde{\mathbf{q}}_T, \mathbf{q}_T \rangle. \]
Now if we take a variation with respect to $\mathbf{q}$, we see that $\tilde{\mathbf{q}}$ has to satisfy
\[ (D_t + L^\dagger)\tilde{\mathbf{q}} = 0. \]
So the Lagrange multiplier within such a variational problem evolves according to the adjoint equation!
Taking appropriate adjoints, we can write
\[ G = \frac{\langle \mathbf{q}_T, \mathbf{q}_T \rangle}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle} + \int_0^T \langle \mathbf{q}, (D_t + L^\dagger)\tilde{\mathbf{q}} \rangle\,\mathrm{d}t + \langle \tilde{\mathbf{q}}_0, \mathbf{q}_0 \rangle - \langle \tilde{\mathbf{q}}_T, \mathbf{q}_T \rangle + \text{boundary terms}. \]
But if we take variations with respect to our initial conditions, then $\frac{\delta G}{\delta \mathbf{q}_0} = 0$ gives
\[ \tilde{\mathbf{q}}_0 = \frac{2\langle \mathbf{q}_T, \mathbf{q}_T \rangle}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle^2}\, \mathbf{q}_0. \]
Similarly, setting $\frac{\delta G}{\delta \mathbf{q}_T} = 0$, we get
\[ \tilde{\mathbf{q}}_T = \frac{2}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle}\, \mathbf{q}_T. \]
Applying $\langle \cdot, \mathbf{q}_0 \rangle$ to the first equation and $\langle \cdot, \mathbf{q}_T \rangle$ to the second, we find that
\[ \langle \tilde{\mathbf{q}}_0, \mathbf{q}_0 \rangle = \langle \tilde{\mathbf{q}}_T, \mathbf{q}_T \rangle. \]
This is the relation we previously saw for the adjoint propagator, and it provides a consistency condition to check whether we have found the optimal solution.
Previously our algorithm used power iteration, which requires us to integrate forwards and backwards many times. However, we now have gradient information for the gain:
\[ \frac{\delta G}{\delta \mathbf{q}_0} = \tilde{\mathbf{q}}_0 - \frac{2\langle \mathbf{q}_T, \mathbf{q}_T \rangle}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle^2}\, \mathbf{q}_0. \]
This allows us to exploit optimization algorithms (steepest descent, conjugate gradient, etc.), which gives an opportunity for faster convergence.
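Schematically, a gradient-based version of the loop might look like the following sketch (hypothetical; fixed step size, no line search, and the Euclidean inner product standing in for the energy inner product).

```python
import numpy as np

def inner(a, b):
    return float(np.real(np.vdot(a, b)))

def gradient_ascent_gain(forward, adjoint, q0, step=0.1, n_iter=100):
    q0 = q0 / np.sqrt(inner(q0, q0))
    gain = 0.0
    for _ in range(n_iter):
        qT = forward(q0)                                      # Phi q0
        gain = inner(qT, qT) / inner(q0, q0)
        q_tildeT = (2.0 / inner(q0, q0)) * qT                 # terminal condition for q~
        q_tilde0 = adjoint(q_tildeT)                          # adjoint integration backwards
        grad = q_tilde0 - (2.0 * inner(qT, qT) / inner(q0, q0)**2) * q0
        q0 = q0 + step * grad                                 # steepest ascent on the gain
        q0 = q0 / np.sqrt(inner(q0, q0))                      # keep the initial energy fixed
    return gain, q0
```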
There are many ways we can modify this. One way is to modify the inner product. Note that we actually had to use the inner product on three occasions:
(i) the inner product in the objective functional $J$;
(ii) the inner product in the initial normalization/constraint of the state vector;
(iii) the inner product in the definition of the adjoint operator.
We used the same energy norm for all of these, but there is no reason we have to use the same norm for all of them. However, there are strong arguments that an energy norm is natural for (ii). It is a wide open research question as to whether there is an appropriate choice for (iii). On the other hand, variation of inner product (i) has been widely explored.
As an example, consider $p$-norms
\[ J = \left( \frac{1}{V} \int e(\mathbf{x}, T)^p\,\mathrm{d}\Omega \right)^{1/p}, \qquad e(\mathbf{x}, T) = \frac{1}{2}|\mathbf{u}(\mathbf{x}, T)|^2. \]
This has the attraction that for large values of $p$, this is dominated by peak values of $e$, not the average.

When we set $p = 1$, i.e. we use the usual energy norm, then we get a beautiful example of the Orr mechanism, in perfect agreement with what the SVD tells us. For, say, $p = 50$, we get more exotic centre/wall modes.
Non-linear adjoints
In the variational formulation, there is nothing in the world that stops us from
using a non-linear evolution equation! We shall see that this results in slightly
less pleasant formulas, but it can still be done.
Consider plane Couette flow with $\mathrm{Re} = 1000$, and allow arbitrary amplitude in the perturbation:
\[ \mathbf{u}_{\mathrm{tot}} = \mathbf{U} + \mathbf{u}, \qquad \partial_t \mathbf{u} + (\mathbf{u} + \mathbf{U}) \cdot \nabla (\mathbf{u} + \mathbf{U}) = -\nabla p + \mathrm{Re}^{-1} \nabla^2 \mathbf{u}, \]
where $\mathbf{U} = y\hat{\mathbf{x}}$. We can define a variational problem with Lagrangian
\[ \mathcal{L} = \frac{E(T)}{E_0} - [\partial_t \mathbf{u} + N(\mathbf{u}) + \nabla p, \mathbf{v}] - [\nabla \cdot \mathbf{u}, q] - \left( \tfrac{1}{2}\langle \mathbf{u}_0, \mathbf{u}_0 \rangle - E_0 \right)c - \langle \mathbf{u}(0) - \mathbf{u}_0, \mathbf{v}_0 \rangle, \]
where
\[ N(\mathbf{u})_i = U_j \partial_j u_i + u_j \partial_j U_i + u_j \partial_j u_i - \frac{1}{\mathrm{Re}} \partial_j \partial_j u_i, \qquad [\mathbf{v}, \mathbf{u}] = \frac{1}{T} \int_0^T \langle \mathbf{v}, \mathbf{u} \rangle\,\mathrm{d}t. \]
Variations with respect to the direct variable $\mathbf{u}$ can once again define a non-linear adjoint equation
\[ \frac{\delta \mathcal{L}}{\delta \mathbf{u}} = \partial_t \mathbf{v} + N^\dagger(\mathbf{v}, \mathbf{u}) + \nabla q + \left( \frac{\mathbf{u}}{E_0} - \mathbf{v} \right)\Big|_{t = T} + (\mathbf{v} - \mathbf{v}_0)\Big|_{t = 0} = 0, \]
where
\[ N^\dagger(\mathbf{v}, \mathbf{u})_i = \partial_j(u_j v_i) - v_j \partial_i u_j + \partial_j(U_j v_i) - v_j \partial_i U_j + \frac{1}{\mathrm{Re}} \partial_j \partial_j v_i. \]
We also have
\[ \frac{\delta \mathcal{L}}{\delta p} = \nabla \cdot \mathbf{v} = 0, \qquad \frac{\delta \mathcal{L}}{\delta \mathbf{u}_0} = \mathbf{v}_0 - c\mathbf{u}_0 = 0. \]
Note that the equation for the adjoint variable
v
depends on the direct variable
u
, but is linear in
v
. Computationally, this means that we need to remember
our solution to
u
when we do our adjoint loop. If
T
is large, then it may not be
feasible to store all information about
u
across the whole period, as that is too
much data. Instead, the method of checkpointing is used:
(i) Pick "checkpoints" $0 = T_0 < \cdots < T_K = T$.
(ii) When integrating $\mathbf{u}$ forwards, we remember high resolution data for $\mathbf{u}(\mathbf{x}, T_k)$ at each checkpoint.
(iii) When integrating $\mathbf{v}$ backwards in the interval $(T_{k-1}, T_k)$, we use the data remembered at $T_{k-1}$ to re-integrate and obtain detailed information about $\mathbf{u}$ in the interval $(T_{k-1}, T_k)$.
A minimal sketch of this bookkeeping is given below.
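The following is a hypothetical sketch of the checkpointing loop; the time steppers `step_u` and `step_v` are assumed to be supplied by the direct and adjoint solvers, and the number of steps is assumed to divide evenly into the checkpoints.

```python
def adjoint_with_checkpoints(step_u, step_v, u0, vT, n_steps, n_checkpoints):
    """step_u(u): one forward step of the direct solution;
       step_v(v, u): one backward step of the adjoint, which needs u at that time."""
    chunk = n_steps // n_checkpoints
    checkpoints = [u0]
    u = u0
    for _ in range(n_checkpoints):            # forward sweep: store only the checkpoints
        for _ in range(chunk):
            u = step_u(u)
        checkpoints.append(u)

    v = vT
    for k in reversed(range(n_checkpoints)):  # backward sweep, one sub-interval at a time
        us = [checkpoints[k]]                 # re-integrate u across this sub-interval
        for _ in range(chunk - 1):
            us.append(step_u(us[-1]))
        for u in reversed(us):                # march the adjoint backwards through it
            v = step_v(v, u)
    return v
```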
This powerful algorithmic approach allows us to identify the minimal seed of turbulence, i.e. the smallest finite perturbation required to trigger turbulence. Note that this is something that cannot be understood by linear approximations! This is rather useful, because in real life, it allows us to figure out how to modify our system to reduce the chance of turbulence arising.