Part III Hydrodynamic Stability
Based on lectures by C. P. Caulfield
Notes taken by Dexter Chua
Michaelmas 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
Developing an understanding by which “small” perturbations grow, saturate and modify
fluid flows is central to addressing many challenges of interest in fluid mechanics.
Furthermore, many applied mathematical tools of much broader relevance have been
developed to solve hydrodynamic stability problems, and hydrodynamic stability theory
remains an exceptionally active area of research, with several exciting new developments
being reported over the last few years.
In this course, an overview of some of these recent developments will be presented.
After an introduction to the general concepts of flow instability, presenting a range of
examples, the major content of this course will be focussed on the broad class of flow
instabilities where velocity “shear” and fluid inertia play key dynamical roles. Such
flows, typically characterised by sufficiently“high” Reynolds number
Ud/ν
, where
U
and
d
are characteristic velocity and length scales of the flow, and
ν
is the kinematic
viscosity of the fluid, are central to modelling flows in the environment and industry.
They typically demonstrate the key role played by the redistribution of vorticity within
the flow, and such vortical flow instabilities often trigger the complex, yet hugely
important process of “transition to turbulence”.
A hierarchy of mathematical approaches will be discussed to address a range of
“stability” problems, from more traditional concepts of “linear” infinitesimal normal
mode perturbation energy growth on laminar parallel shear flows to transient, inherently
nonlinear perturbation growth of general measures of perturbation magnitude over
finite time horizons where flow geometry and/or fluid properties play a dominant
role. The course will also discuss in detail physical interpretations of the various flow
instabilities considered, as well as the industrial and environmental application of the
results of the presented mathematical analyses
Pre-requisites
Elementary concepts from undergraduate real analysis. Some knowledge of complex
analysis would be advantageous (e.g. the level of IB Complex Analysis/Methods). No
knowledge of functional analysis is assumed.
Contents
1 Linear stability analysis
1.1 Rayleigh–Taylor instability
1.2 Rayleigh–B´enard convection
1.3 Classical Kelvin–Helmholtz instability
1.4 Finite depth shear flow
1.5 Stratified flows
2 Absolute and convective instabilities
3 Transient growth
3.1 Motivation
3.2 A toy model
3.3 A general mathematical framework
3.4 Orr-Sommerfeld and Squire equations
4 A variational point of view
1 Linear stability analysis
1.1 Rayleigh–Taylor instability
In this section, we would like to investigate the stability of a surface between
two fluids:
η
ρ
2
ρ
1
We assume the two fluids have densities
ρ
1
and
ρ
2
, and are separated by a
smooth interface parametrized by the deviation η.
Recall that the Navier–Stokes equations for an incompressible fluid are
ρ
u
t
+ u ·u
= −∇P gρ
ˆ
z + µ
2
u, ·u = 0.
Usually, when doing fluid mechanics, we assume density is constant, and divide
the whole equation across by
ρ
. We can then forget the existence of
ρ
and carry
on. We can also get rid of the gravity term. We know that the force of gravity
is balanced out by a hydrostatic pressure gradient. More precisely, we can write
P = P
h
(z) + p(x, t),
where P
h
satisfies
P
h
z
= gρ.
We can then write our equation as
u
t
+ u ·u = −∇
p
ρ
+ ν
2
u.
To do all of these, we must assume that the density
ρ
is constant, but this is
clearly not the case when we have tow distinct fluids.
Since it is rather difficult to move forward without making any simplifying
assumptions, we shall focus on the case of inviscid fluids, so that we take
ν
= 0.
We will also assume that the fluid is irrotational. In this case, it is convenient to
use the vector calculus identity
u ·u =
1
2
|u|
2
u ×(× u),
where we can now drop the second term. Moreover, since
× u
= 0, Stokes’
theorem allows us to write
u
=
φ
for some velocity potential
φ
. Finally, in
each separate region, the density ρ is constant. So we can now write
ρ
φ
t
+
1
2
ρ|∇φ|
2
+ P + gρz
= 0.
This is Bernoulli’s theorem.
Applying these to our scenario, since we have two fluids, we have two separate
velocity potentials
φ
1
, φ
2
for the two two regions. Both of these independently
satisfy the incompressibility hypothesis
2
φ
1,2
= 0.
Bernoulli’s theorem tells us the quantity
ρ
φ
t
+
1
2
ρ|∇φ|
2
+ P + gρz
should be constant across the fluid. Our investigation will not use the full
equation. All we will need is that this quantity matches at the interface, so that
we have
ρ
1
φ
1
t
+
1
2
ρ
1
|∇φ
1
|
2
+ P
1
+ gρ
1
η
z=η
= ρ
2
φ
2
t
+
2
2
ρ
2
|∇φ
2
|
2
+ P
2
+ gρ
2
η
z=η
.
To understand the interface, we must impose boundary conditions. First of all
the vertical velocities of the fluids must match with the interface, so we impose
the kinematic boundary condition
φ
1
z
z=η
=
φ
2
z
z=η
=
Dη
Dt
,
where
D =
t
+ u ·
We also make another, perhaps dubious boundary condition, namely that the
pressure is continuous along the interface. This does not hold at all interfaces.
For example, if we have a balloon, then the pressure inside is greater than the
pressure outside, since that is what holds the balloon up. In this case, it is the
rubber that is exerting a force to maintain the pressure difference. In our case,
we only have two fluids meeting, and there is no reason to assume a discontinuity
in pressure, and thus we shall assume it is continuous. If you are not convinced,
this is good, and we shall later see this is indeed a dubious assumption.
But assuming it for the moment, this allows us to simplify our Bernoulli’s
theorem to give
ρ
1
φ
1
t
+
1
2
ρ
1
|∇φ
1
|
2
+ gρ
1
η = ρ
2
φ
2
t
+
2
2
ρ
2
|∇φ
2
|
2
+ gρ
2
η.
There is a final boundary condition that is specific to our model. What we are
going to do is that we will start with a flat, static solution with
u
= 0 and
η
= 0.
We then perturb
η
a bit, and see what happens to our fluid. In particular, we
want to see if the perturbation is stable.
Since we expect the interesting behaviour to occur only near the interface,
we make the assumption that there is no velocity in the far field, i.e.
φ
1
0
as
z
and
φ
2
0 as
z −∞
(and similarly for the derivatives). For
simplicity, we also assume there is no y dependence.
We now have equations and boundary conditions, so we can solve them. But
these equations are pretty nasty and non-linear. At this point, one sensible
approach is to linearize the equations by assuming everything is small. In addition
to assuming that
η
is small, we also need to assume that various derivatives such
as
φ
are small, so that we can drop all second-order terms. Since
η
is small,
the value of, say,
φ
1
z
at
η
should be similar to that at
η
= 0. Since
φ
1
z
itself is
already assumed to be small, the difference would be second order in smallness.
So we replace all evaluations at
η
with evaluations at
z
= 0. We are then left
with the collection of 3 equations
2
φ
1,2
= 0
φ
1,2
z
z=0
= η
t
ρ
1
φ
1
t
+ gρ
1
η
z=0
= ρ
2
φ
2
t
+ gρ
2
η
z=0
.
This is a nice linear problem, and we can analyze the Fourier modes of the
solutions. We plug in an ansatz
φ
1,2
(x, z, t) =
ˆ
φ
1,2
(z)e
i(kxωt)
η(x, t) = Be
i(kxωt)
.
Substituting into Laplace’s equation gives
ˆ
φ
1,2
k
2
ˆ
φ
1,2
= 0.
Using the far field boundary conditions, we see that we have a family of solutions
ˆ
φ
1
= A
1
e
kz
,
ˆ
φ
2
= A
2
e
kz
.
The kinematic boundary condition tells us we must have
ˆ
φ
0
1,2
(0) = B.
We can solve this to get
B =
kA
1
=
kA
2
.
In particular, we must have A A
1
= A
2
. We can then write
η =
kA
e
ik(kxωt)
.
Plugging these into the final equation gives us
ρ
1
(A) + gρ
1
kA
= ρ
2
A + gρ
2
kA
.
Crucially, we can cancel the
A
throughout the equation, and gives us a result
independent of
A
. This is, after all, the point of linearization. Solving this gives
us a dispersion relation relating the frequency (or phase speed
c
p
=
ω/k
) to the
wavenumber:
ω
2
=
g(ρ
2
ρ
1
)k
ρ
1
+ ρ
2
.
If
ρ
2
ρ
1
, then this reduces to
ω
2
gk
, and this is the usual dispersion relation
for deep water waves. On the other hand, if
ρ
2
ρ
1
>
0 is small, this can lead
to waves of rather low frequency, which is something we can observe in cocktails,
apparently.
But nothing in our analysis actually required
ρ
2
> ρ
1
. Suppose we had
ρ
1
> ρ
2
. This amounts to putting the heavier fluid on top of the lighter one.
Anyone who has tried to turn a cup of water upside down will know this is highly
unstable. Can we see this from our analysis?
If
ρ
1
< ρ
2
, then
ω
has to be imaginary. We shall write it as
ω
=
±
, where
σ > 0. We can then compute
σ =
s
g(ρ
1
ρ
2
)k
ρ
1
+ ρ
2
.
Writing out our expression for
φ
1,2
, we see that there are
e
±σt
terms, and the
e
σt
term dominates in the long run, causing
φ
1,2
to grow exponentially. This is
the Rayleigh–Taylor instability.
There is more to say about the instability. As the wavelength decreases,
k
increases, and we see that
σ
increases as well. Thus, we see that short-scale
perturbations blow up exponentially much more quickly, which means the system
is not very well-posed. This is the ultra-violet catastrophe. Of course, we should
not trust our model here. Recall that in our simplifying assumptions, we not
only assumed that the amplitudes of our
φ
and
η
were small, but also that their
derivatives were small. The large
k
behaviour gives us large
x
derivatives, and
so we have to take into account the higher order terms as well.
But we can also provide physical reasons for why small scales perturbations
should be suppressed by such a system. In our model, we assumed there is no
surface tension. Surface tension quantifies the tendency of interfaces between
fluids to minimize surface area. We know that small-scale variations will cause
a large surface area, and so the surface tension will suppress these variations.
Mathematically, what the surface allows for is a pressure jump across the
interface.
Surface tension is quantified by
γ
, the force per unit length. This has
dimensions [
γ
] =
MT
1
. Empirically, we observe that the pressure difference
across the interface is
p = γ ·
ˆ
n = 2γH = γ
1
R
x
+
1
R
y
,
where
ˆ
n
is the unit normal and
H
is the mean curvature. This is an empirical
result.
Again, we assume no dependence in the
y
direction, so we have a cylindrical
interface with a single radius of curvature. Linearizing, we have a pressure
difference of
(P
2
P
1
)|
z=η
=
γ
R
x
= γ
2
η
x
2
1 +
η
x
2
3/2
γ
2
η
x
2
.
Therefore the (linearized) dynamic boundary condition becomes
ρ
1
φ
1
t
+ gρ
1
η
z=0
+ γ
2
η
x
2
= ρ
2
φ
2
t
+ gρ
2
η
z=0
.
If we run the previous analysis again, we find a dispersion relation of
ω
2
=
g(ρ
2
ρ
1
)k + γk
3
ρ
1
+ ρ
2
.
Since
γ
is always positive, even if we are in the situation when
ρ
1
> ρ
2
, for
k
sufficiently large, the system is stable. Of course, for small
k
, the system is still
unstable we still expect water to fall out even with surface tension. In the
case ρ
1
< ρ
2
, what we get is known as internal gravity-capillary waves.
In the unstable case, we have
σ
2
= k
g(ρ
2
ρ
1
)
ρ
1
+ ρ
2
(1 l
2
c
k
2
),
where
l
2
c
=
γ
g(ρ
1
ρ
2
)
is a characteristic length scale. For
k`
c
>
1, the oscillations are stable, and the
maximum value of k is attained at k =
l
c
3
.
In summary, we have a range of wavelength, and we have identified a “most
unstable wavelength”. Over time, if all frequencies modes are triggered, we
should expect this most unstable wavelength to dominate. But it is rather
hopeful thinking, because this is the result of linear analysis, and we can’t expect
it to carry on to the non-linear regime.
1.2 Rayleigh–B´enard convection
The next system we want to study is something that looks like this:
T
0
T
0
+ TT
0
+ T
d
There is a hot plate at the bottom, a cold plate on top, and some fluid in between.
We would naturally expect heat to transmit from the bottom to the top. There
are two ways this can happen:
Conduction: Heat simply diffuses from the bottom to the top, without any
fluid motion.
Convection: The fluid at the bottom heats up, expands, becomes lighter,
and moves to the top.
The first factor is controlled purely by thermal diffusivity
κ
, while the latter
is mostly controlled by the viscosity
ν
. It is natural to expect that when the
temperature gradient
T
is small, most of the heat transfer is due to conduction,
as there isn’t enough force to overcome the viscosity to move fluid around. When
T
is large, the heat transfer will be mostly due to conduction, and we would
expect such a system to be unstable.
To understand the system mathematically, we must honestly deal with the
case where we have density variations throughout the fluid. Again, recall that
the Navier–Stokes equation
ρ
u
t
+ u · u
= −∇P gρ
ˆ
z + µ
2
u, · u = 0.
The static solution to our system is the pure conduction solution, where
u
= 0,
and there is a uniform vertical temperature gradient across the fluid. Since the
density is a function of temperature, which in the static case is just a function
of
z
, we know
ρ
=
ρ
(
z
). When we perturb the system, we allow some horizontal
and time-dependent fluctuations, and assume we can decompose our density
field into
ρ = ρ
h
(z) + ρ
0
(x, t).
We can then define the hydrostatic pressure by the equation
dP
h
dz
= gρ
h
.
This then allows us to decompose the pressure as
P = P
h
(z) + p
0
(x, t).
Then our equations of motion become
u
t
+ u · u =
1
ρ
p
0
gρ
0
ρ
ˆ
z + ν
2
u, · u = 0
We have effectively “integrated out” the hydrostatic part, and it is now the
deviation from the average that leads to buoyancy forces.
An important component of our analysis involves looking at the vorticity.
Indeed, vorticity necessarily arises if we have non-trivial density variations. For
concreteness, suppose we have a interface between two fluids with different
densities, ρ and ρ + ρ. If the interface is horizontal, then nothing happens:
ρ
ρ + ρ
However, if the interface is tilted, then there is a naturally induced torque:
ρ
ρ + ρ
In other words, vorticity is induced. If we think about it, we see that the reason
this happens is that the direction of the density gradient does not align with the
direction of gravity. More precisely, this is because the density gradient does not
align with the pressure gradient.
Let’s try to see this from our equations. Recall that the vorticity is defined
by
ω = × u.
Taking the curl of the Navier–Stokes equation, and doing some vector calculus,
we obtain the equation.
ω
t
+ u · ω = ω ·u +
1
ρ
2
ρ × P + ν
2
ω
The term on the left hand side is just the material derivative of the vorticity. So
we should interpret the terms on the right-hand-side as the terms that contribute
to the change in vorticity.
The first and last terms on the right are familiar terms that have nothing to
do with the density gradient. The interesting term is the second one. This is
what we just described whenever the pressure gradient and density gradient
do not align, we have a baroclinic torque.
In general, equations are hard to solve, and we want to make approximations.
A common approximation is the Boussinesq approximation. The idea is that
even though the density difference is often what drives the motion, from the
point of view of inertia, the variation in density is often small enough to be
ignored. To take some actual, physical examples, salt water is roughly 4% more
dense than fresh water, and every 10 degrees Celsius changes the density of air
by approximately 4%.
Thus, what we do is that we assume that the density is constant except in
the buoyancy force. The mathematically inclined people could think of this as
taking the limit g but ρ
0
0 with gρ
0
remaining finite.
Under this approximation, we can write our equations as
u
t
+ u · u =
1
ρ
0
p
0
g
0
ˆ
z + ν
2
u
ω
t
+ u · ω = ω ·u +
g
ρ
0
ˆ
z × ρ + ν
2
ω,
where we define the reduced gravity to be
g
0
=
gρ
0
ρ
0
Recall that our density is to be given by a function of temperature
T
. We
must write down how we want the two to be related. In reality, this relation is
extremely complicated, and may be affected by other external factors such as
salinity (in the sea). However, we will use a “leading-order” approximation
ρ = ρ
0
(1 α(T T
0
)).
We will also need to know how temperature
T
evolves in time. This is simply
given by the diffusion equation
T
t
+ u · T = κ
2
T.
Note that in thermodynamics, temperature and pressure are related quantities.
In our model, we will completely forget about this. The only purpose of pressure
will be to act as a non-local Lagrange multiplier that enforces incompressibility.
There are some subtleties with this approximation. Inverting the relation,
T =
1
α
1
ρ
ρ
0
+ T
0
.
Plugging this into the diffusion equation, we obtain
ρ
t
+ u · ρ = κ
2
ρ.
The left-hand side is the material derivative of density. So this equation is saying
that density, or mass, can “diffuse” along the fluid, independent of fluid motion.
Mathematically, this follows from the fact that density is directly related to
temperature, and temperature diffuses.
This seems to contradict the conservation of mass. Recall that the conserva-
tion of mass says
ρ
t
+ · (uρ) = 0.
We can than expand this to say
ρ
t
+ u · ρ = ρ · u.
We just saw that the left-hand side is given by
κ
2
ρ
, which can certainly be
non-zero, but the right-hand side should vanish by incompressibility! The issue
is that here the change in
ρ
is not due to compressing the fluid, but simply
thermal diffusion.
If we are not careful, we may run into inconsistencies if we require
· u
= 0.
For our purposes, we will not worry about this too much, as we will not use the
conservation of mass.
We can now return to our original problem. We have a fluid of viscosity
ν
and thermal diffusivity
κ
. There are two plates a distance
d
apart, with the
top held at temperature
T
0
and bottom at
T
0
+
T
. We make the Boussinesq
approximation.
T
0
T
0
+ TT
0
+ T
d
We first solve for our steady state, which is simply
U = 0
T
h
= T
0
T
(z d)
d
ρ
h
= ρ
0
1 + αT
z d
d
P
h
= P
0
gρ
0
z +
αT z
2d
(z 2d)
,
In this state, the only mode of heat transfer is conduction. What we would
like to investigate, of course, is whether this state is stable. Consider small
perturbations
u = U + u
0
T = T
h
+ θ
P = P
h
+ p
0
.
We substitute these into the Navier–Stokes equation. Starting with
u
t
+ u · u =
1
ρ
0
p
0
+ gαθ
ˆ
z + ν
2
u,
we assume the term u · u will be small, and end up with the equation
u
0
t
=
1
ρ
0
p
0
+ gαθ
ˆ
z + ν
2
u,
together with the incompressibility condition · u
0
= 0.
Similarly, plugging these into the temperature diffusion equation and writing
u = (u, v, w) gives us
θ
t
w
0
T
d
= κ
2
θ.
We can gain further understanding of these equations by expressing them in
terms of dimensionless quantities. We introduce new variables
˜
t =
κt
d
2
˜
x =
x
d
˜
θ =
θ
T
˜p =
d
2
p
0
ρ
0
κ
2
In terms of these new variables, our equations of motion become rather elegant:
˜
θ
˜
t
˜w =
˜
2
˜
θ
˜
u
˜
t
=
˜
˜p +
gαT d
3
νκ
ν
κ
˜
θ
ˆ
z +
ν
κ
˜
2
˜
u
Ultimately, we see that the equations of motion depend on two dimensionless
constants: the Rayleigh number and Prandtl number
Ra =
gαT d
3
νκ
Pr =
ν
κ
,
These are the two parameters control the behaviour of the system. In particular,
the Prandtl number measures exactly the competition between viscous and
diffusion forces. Different fluids have different Prandtl numbers:
In a gas, then
ν
κ
1, since both are affected by the mean free path of a
particle.
In a non-metallic liquid, the diffusion of heat is usually much slower than
that of momentum, so Pr 10.
In a liquid metal,
Pr
is very very low, since, as we know, metal transmits
heat quite well.
Finally, the incompressibility equation still reads
˜
·
˜
u = 0
Under what situations do we expect an unstable flow? Let’s start with some
rather more heuristic analysis. Suppose we try to overturn our system as shown
in the diagram:
Suppose we do this at a speed
U
. For each
d × d × d
block, the potential
energy released is
P E g · d · (d
3
ρ) gρ
0
αd
4
T,
and the time scale for this to happen is τ d/U .
What is the friction we have to overcome in order to achieve this? The
viscous stress is approximately
µ
U
z
µ
U
d
.
The force is the integral over the area, which is
µU
d
d
2
. The work done, being
R
F dz, then evaluates to
µU
d
· d
2
· d = µU d
2
.
We can figure out
U
by noting that the heat is diffused away on a time scale
of
τ d
2
. For convection to happen, it must occur on a time scale at least
as fast as that of diffusion. So we have
U κ/d
. All in all, for convection to
happen, we need the potential energy gain to be greater than the work done,
which amounts to
gρ
0
αT d
4
& ρ
0
νκd.
In other words, we have
Ra =
gαT d
3
νκ
=
g
0
d
3
νκ
& 1.
Now it is extremely unlikely that
Ra
1 is indeed the correct condition, as
our analysis was not very precise. However, it is generally true that instability
occurs only when Ra is large.
To work this out, let’s take the curl of the Navier–Stokes equation we
previously obtained, and dropping the tildes, we get
ω
t
= RaPrθ ×
ˆ
z + Pr
2
ω
This gets rid of the pressure term, so that’s one less thing to worry about. But
this is quite messy, so we take the curl yet again to obtain
t
2
u = RaPr
2
θ
ˆ
z
θ
z

+ Pr
4
u
Reading off the z component, we get
t
2
w = RaPr
2
x
2
+
2
y
2
θ + Pr
4
w.
We will combine this with the temperature equation
θ
t
w =
2
θ
to understand the stability problem.
We first try to figure out what boundary conditions to impose. We’ve got a
6th order partial differential equation (4 from the Navier–Stokes and 2 from the
temperature equation), and so we need 6 boundary conditions. It is reasonable
to impose that w = θ = 0 at z = 0, 1, as this gives us impermeable boundaries.
It is also convenient to assume the top and bottom surfaces are stress-free:
u
z
=
v
z
= 0, which implies
z
u
x
+
v
y
= 0.
This then gives us
2
w
z
2
= 0. This is does not make much physical sense, because
our fluid is viscous, and we should expect no-slip boundary conditions, not
no-stress. But they are mathematically convenient, and we shall assume them.
Let’s try to look for unstable solutions using separation of variables. We see
that the equations are symmetric in x and y, and so we write our solution as
w = W (z)X(x, y)e
σt
θ = Θ(z)X(x, y)e
σt
.
If we can find solutions of this form with σ > 0, then the system is unstable.
What follows is just a classic separation of variables. Plugging these into the
temperature equation yields
d
2
dz
2
σ +
2
x
2
+
2
y
2

XΘ = XW.
Or equivalently
d
2
dz
2
σ
XΘ + XW =
2
x
2
+
2
y
2
XΘ.
We see that the differential operator on the left only acts in Θ, while those on
the right only act on X. We divide both sides by XΘ to get
d
2
dy
2
σ
Θ + W
Θ
=
d
2
dx
2
+
d
2
dy
2
X
X
.
Since the LHS is purely a function of
z
, and the RHS is purely a function of
X
,
they must be constant, and by requiring that the solution is bounded, we see
that this constant must be positive. Our equations then become
2
x
2
+
2
y
2
X = λ
2
X
d
2
dz
2
λ
2
σ
Θ = W.
We now look at our other equation. Plugging in our expressions for
w
and
θ
,
and using what we just obtained, we have
σ
d
2
dz
2
λ
2
W = RaPrλ
2
Θ + Pr
d
2
dz
2
λ
2
2
W.
On the boundary, we have Θ = 0 =
W
=
2
W
z
2
= 0, by assumption. So it follows
that we must have
d
4
W
dz
4
z=0,1
= 0.
We can eliminate Θ by letting
d
2
dz
2
σ λ
2
act on both sides, which converts
the Θ into W . We are then left with a 6th order differential equation
d
2
dz
2
σ λ
2
Pr
d
2
dz
2
λ
2
2
σ
d
2
dz
2
λ
2
!
W = RaPrλ
2
W.
This is an eigenvalue problem, and we see that our operator is self-adjoint. We
can factorize and divide across by Pr to obtain
d
2
dz
2
λ
2
d
2
dz
2
σ λ
2
d
2
dz
2
λ
2
σ
Pr
W = Raλ
2
W.
The boundary conditions are that
W =
d
2
W
dz
2
=
d
4
W
dx
4
= 0 at z = 0, 1.
We see that any eigenfunction of
d
2
dz
2
gives us an eigenfunction of our big scary
differential operator, and our solutions are quantized sine functions,
W
=
sin z
with n N. In this case, the eigenvalue equation is
(n
2
π
2
+ λ
2
)(n
2
π
2
+ σ + λ
2
)
n
2
π
2
+ λ
2
+
σ
Pr
= Raλ
2
.
When
σ >
0, then, noting that the LHS is increasing with
σ
on the range [0
,
),
this tells us that we have
Ra(n)
(n
2
π
2
+ λ
2
)
3
λ
2
.
We minimize the RHS with respect to
λ
and
n
. We clearly want to set
n
= 1,
and then solve for
0 =
d
d[λ
2
]
(n
2
π
2
+ λ
2
)
3
λ
2
=
(π
2
+ λ
2
)
2
λ
4
(2λ
2
π
2
).
So the minimum value of Ra is obtained when
λ
c
=
π
2
.
So we find that we have a solution only when
Ra Ra
c
=
27π
4
4
657.5.
If we are slightly about
Ra
c
, then there is the
n
= 1 unstable mode. As
Ra
increases, the number of available modes increases. While the critical Rayleigh
number depends on boundary conditions, but the picture is generic, and the
width of the unstable range grows like
Ra Ra
c
:
kd
Ra
stable
unstable
We might ask how does convection enhance the heat flux? This is
quantified by the Nusselt number
Nu =
convective heat transfer
conductive heat transfer
=
hwTid
κT
= h˜w
˜
θi.
Since this is a non-dimensional number, we know it is a function of Ra and Pr.
There are some physical arguments we can make about the value of
Nu
. The
claim is that the convective heat transfer is independent of layer depth. If there
is no convection at all, then the temperature gradient will look approximately
like
There is some simple arguments we can make about this. If we have a purely
conductive state, then we expect the temperature gradient to be linear:
This is pretty boring. If there is convection, then we claim that the temperature
gradient will look like
The reasoning is that the way convection works is that once a parcel at the
bottom is heated to a sufficiently high temperature, it gets kicked over to the
other side; if a parcel at the top is cooled to a sufficiently low temperature, it gets
kicked over to the other side. In between the two boundaries, the temperature is
roughly constant, taking the average temperature of the two boundaries.
In this model, it doesn’t matter how far apart the two boundaries are, since
once the hot or cold parcels are shot off, they can just travel on their own until
they reach the other side (or mix with the fluid in the middle).
If we believe this, then we must have
hwTi d
0
, and hence
Nu d
. Since
only Ra depends on d, we must have N u Ra
1/3
.
Of course, the real situation is much more subtle.
1.3 Classical Kelvin–Helmholtz instability
Let’s return to the situation of Rayleigh–Taylor instability, but this time, there
is some horizontal flow in the two layers, and they may be of different velocities.
ρ
2
ρ
1
U
1
U
2
Our analysis here will be not be very detailed, partly because what we do
is largely the same as the analysis of the Rayleigh–Taylor instability, but also
because the analysis makes quite a few assumptions which we wish to examine
more deeply.
In this scenario, we can still use a velocity potential
(u, w) =
φ
x
,
φ
z
.
We shall only consider the system in 2D. The far field now has velocity
φ
1
= U
1
x + φ
0
1
φ
2
= U
2
x + φ
0
2
with φ
0
1
0 as z and φ
0
2
0 as z −∞.
The boundary conditions are the same. Continuity of vertical velocity requires
φ
1,2
z
z=η
=
Dη
Dt
.
The dynamic boundary condition is that we have continuity of pressure at the
interface if there is no surface tension, in which case Bernoulli tells us
ρ
1
φ
1
t
+
1
2
ρ
1
|∇φ
1
|
2
+ gρ
1
η = ρ
2
φ
2
t
+
1
2
ρ
2
|∇φ
2
|
2
+ gρ
2
η.
The interface conditions are non-linear, and again we want to linearize. But
since U
1,2
is of order 1, linearization will be different. We have
φ
0
1,2
z
z=0
=
t
+ U
1,2
x
η.
So the Bernoulli condition gives us
ρ
1

t
+ U
1
x
φ
0
1
+ gη
= ρ
2

t
+ U
2
x
φ
0
2
+ gη
This modifies our previous eigenvalue problem for the phase speed and wavenum-
ber
ω
=
kc
,
k
=
2π
λ
. We go exactly as before, and after some work, we find that
we have
c =
ρ
1
U
1
+ ρ
2
U
2
ρ
1
+ ρ
2
±
1
ρ
1
+ ρ
2
g(ρ
2
2
ρ
2
1
)
k
ρ
1
ρ
2
(U
1
U
2
)
2
1/2
.
So we see that we have instability if c has an imaginary part, i.e.
k >
g(ρ
2
2
ρ
2
1
)
ρ
1
ρ
2
(U
1
U
2
)
2
.
So we see that there is instability for sufficiently large wavenumbers, even for
static stability. Similar to Rayleigh–Taylor instability, the growth rate
kc
i
grows monotonically with wavenumber, and as
k
, the instability becomes
proportional to the difference
|U
1
U
2
|
(as opposed to Rayleigh–Taylor instability,
where it grows unboundedly).
How can we think about this result? If
U
1
6
=
U
2
and we have a discrete
change in velocity, then this means there is a
δ
-function of vorticity at the
interface. So it should not be surprising that the result is unstable!
Another way to think about it is to change our frame of reference so that
U
1
=
U
=
U
2
. In the Boussinesq limit, we have
c
r
= 0, and instability arises
whenever
gρλ
ρU
2
< 4π. We can see the numerator as the potential energy cost if
we move a parcel from the bottom layer to the top layer, while the denominator
as some sort of kinetic energy. So this says we are unstable if there is enough
kinetic energy to move a parcel up.
This analysis of the shear flow begs a few questions:
How might we regularize this expression? The vortex sheet is obviously
wildly unstable.
Was it right to assume two-dimensional perturbations?
What happens if we regularize the depth of the shear?
1.4 Finite depth shear flow
We now consider a finite depth shear flow. This means we have fluid moving
between two layers, with a z-dependent velocity profile:
z = L
z = L z = L
U(z)
We first show that it suffices to work in 2 dimensions. We will assume that the
mean flow points in the
ˆ
x
direction, but the perturbations can point at an angle.
The inviscid homogeneous incompressible Navier–Stokes equations are again
u
t
+ u · u
=
Du
Dt
= −∇
p
0
ρ
, · u = 0.
We linearize about a shear flow, and consider some 3D normal modes
u =
¯
U(z)
ˆ
x + u
0
(x, y, z, t),
where
(u
0
, p
0
) = [
ˆ
u(z), ˆp(z)]e
i(kx+`ykct)
.
The phase speed is then
c
p
=
ω
κ
=
ω
(k
2
+ `
2
)
1/2
=
kc
r
(k
2
+ `
2
)
1/2
and the growth rate of the perturbation is simply σ
3d
= kc
i
.
We substitute our expression of
u
and
p
0
into the Navier–Stokes equations
to obtain, in components,
ik(
¯
U c)ˆu + ˆw
d
¯
U
dz
= ikˆp
ik(
¯
U c)ˆv = ilˆp
ik(
¯
U c) ˆw =
dˆp
dz
ikˆu + i`ˆv +
d ˆw
dz
= 0
Our strategy is to rewrite everything to express things in terms of new variables
κ =
p
k
2
+ `
2
, κ˜u = kˆu + `ˆv, ˜p =
κˆp
k
and if we are successful in expressing everything in terms of
˜u
, then we could
see how we can reduce this to the two-dimensional problem.
To this end, we can slightly rearrange our first two equations to say
ik(
¯
U c)ˆu + ˆw
d
¯
U
dz
= ik
2
ˆp
k
i`(
¯
U c)ˆv = i`
2
ˆp
k
which combine to give
i(
¯
U c)κ˜u + ˆw
d
¯
U
dz
= ˜p.
We can rewrite the remaining two equations in terms of ˜p as well:
(
¯
U c) ˆw =
d
dz
˜p
κ
˜
U +
d ˆw
dz
= 0.
This looks just like a 2d system, but with an instability growth rate of
σ
2d
=
κc
i
> kc
i
=
σ
3d
. Thus, our 3d problem is “equivalent” to a 2d problem with
greater growth rate. However, whether or not instability occurs is unaffected.
One way to think about the difference in growth rate is that the
y
component
of the perturbation sees less of the original velocity
¯
U
, and so it is more stable.
This result is known as Squire’s Theorem.
We now restrict to two-dimensional flows, and have equations
ik(
¯
U c)ˆu + ˆw
d
¯
U
dz
= ikˆp
ik(
¯
U c) ˆw =
dˆp
dz
ikˆu +
d ˆw
dz
= 0.
We can use the last incompressibility equation to eliminate
ˆu
from the first
equation, and be left with
(
¯
U c)
d ˆw
dz
+ ˆw
d
¯
U
dz
= ikˆp.
We wish to get rid of the
ˆp
, and to do so, we differentiate this equation with
respect to z and use the second equation to get
(
¯
U c)
d
2
ˆw
dz
2
d ˆw
dz
d
¯
U
dz
+
d ˆw
dz
d
¯
U
dz
+ ˆw
d
2
¯
U
dz
2
= k
2
(
¯
U c) ˆw.
The terms in the middle cancel, and we can rearrange to obtain the Rayleigh
equation
(
¯
U c)
d
2
dz
2
k
2
d
2
¯
U
dz
2
ˆw = 0.
We see that when
¯
U
=
c
, we have a regular singular point. The natural boundary
conditions are that ˆw 0 at the edge of the domain.
Note that this differential operator is not self-adjoint! This is since
¯
U
has non-
trivial
z
-dependence. This means we do not have a complete basis of orthogonal
eigenfunctions. This is manifested by the fact that it can have transient growth.
To analyze this scenario further, we rewrite Rayleigh equation in conventional
form
d
2
ˆw
dz
2
k
2
ˆw
d
2
¯
U/dz
2
¯
U c
ˆw = 0.
The trick is to multiply by
w
, integrate across the domain, and apply boundary
conditions to obtain
Z
L
L
¯
U
00
¯
U c
|ˆw|
2
dz =
Z
L
L
(|ˆw
0
|
2
+ k
2
|ˆw|
2
) dz
We can split the LHS into real and imaginary parts:
Z
L
L
¯
U
00
(
¯
U c
r
)
|
¯
U c|
2
|ˆw|
2
dz + ic
i
Z
L
L
¯
U
00
|
¯
U c|
2
|ˆw|
2
dz.
But since the RHS is purely real, we know the imaginary part must vanish.
One way for the imaginary part to vanish is for
c
i
to vanish, and this
corresponds to stable flow. If we want
c
i
to be non-zero, then the integral must
vanish. So we obtain the Rayleigh inflection point criterion:
d
2
dz
2
¯
U
must change
sign at least once in L < z < L.
Of course, this condition is not sufficient for instability. If we want to get
more necessary conditions for instability to occur, it might be wise to inspect
the imaginary part, as Fjortoft noticed. If instability occurs, then we know that
we must have
Z
L
L
¯
U
00
|
¯
U c|
2
|ˆw|
2
dz = 0.
Let’s assume that there is a unique (non-degenerate) inflection point at
z
=
z
s
,
with
¯
U
s
=
¯
U
(
z
s
). We can then add (
c
r
¯
U
s
) times the above equation to the
real part to see that we must have
Z
L
L
(|ˆw
0
|
2
+ k
2
|ˆw|
2
) dz =
Z
L
L
¯
U
00
(
¯
U
¯
U
s
)
|
¯
U c|
2
|ˆw|
2
dz.
Assuming further that
¯
U
is monotonic, we see that both
¯
U
¯
U
s
and
U
00
change
sign at
z
s
, so the sign of the product is unchanged. So for the equation to be
consistent, we must have
¯
U
00
(
¯
U
¯
U
s
) 0 with equality only at z
s
.
We can look at some examples of different flow profiles:
In the leftmost example, Rayleigh’s criterion tells us it is stable, because there is
no inflection point. The second example has an inflection point, but does not
satisfy Fjortoft’s criterion. So this is also stable. The third example satisfies
both criteria, so it may be unstable. Of course, the condition is not sufficient, so
we cannot make a conclusive statement.
Is there more information we can extract from them Rayleigh equation?
Suppose we indeed have an unstable mode. Can we give a bound on the growth
rate c
i
given the phase speed c
r
?
The trick is to perform the substitution
ˆ
W =
ˆw
(
¯
U c)
.
Note that this substitution is potentially singular when
¯
U
=
c
, which is the
singular point of the equation. By expressing everything in terms of
ˆ
W
, we
are essentially hiding the singularity in the definition of
ˆ
W
instead of in the
equation.
Under this substitution, our Rayleigh equation then takes the self-adjoint
form
d
dz
(
¯
U c)
2
d
ˆ
W
dz
!
k
2
(
¯
U c)
2
ˆ
W = 0.
We can multiply by
ˆ
W
and integrate over the domain to obtain
Z
L
L
(
¯
U c)
2
d
ˆ
W
dz
2
+ k
2
|
ˆ
W
2
| {z }
Q
dz = 0.
Since Q 0, we may again take imaginary part to require
2c
i
Z
L
L
(
¯
U c
r
)Q dz = 0.
This implies that we must have
U
min
< c
r
< U
max
, and gives a bound on the
phase speed.
Taking the real part implies
Z
L
L
[(
¯
U c
r
)
2
c
2
i
]Q dz = 0.
But we can combine this with the imaginary part condition, which tells us
Z
L
L
¯
UQ dz = c
r
Z
L
L
Q dz.
So we can expand the real part to give us
Z
L
L
¯
U
2
Q dz =
Z
L
L
(c
2
r
+ c
2
i
)Q dz.
Putting this aside, we note that tautologically, we have
U
min
¯
U U
max
. So
we always have
Z
L
L
(
¯
U U
max
)(
¯
U U
min
)Q dz 0.
expanding this, and using our expression for
R
L
L
¯
U
2
Q dz, we obtain
Z
L
L
((c
2
r
+ c
2
i
) (U
max
U
min
)c
r
+ U
max
U
min
)Q dz 0.
But we see that we are just multiplying
Q
d
z
by a constant and integrating.
Since we know that
R
Q dz > 0, we must have
(c
2
r
+ c
2
i
) (U
max
+ U
min
)c
r
+ U
max
U
min
0.
By completing the square, we can rearrange this to say
c
r
U
max
+ U
min
2
2
+ (c
i
0)
2
U
max
U
min
2
2
.
This is just an equation for the circle! We can now concretely plot the region of
possible c
r
and c
i
:
c
r
c
i
U
max
+U
min
2
U
min
U
max
Of course, lying within this region is a necessary condition for instability to
occur, but not sufficient. The actual region of instability is a subset of this
semi-circle, and this subset depends on the actual
¯
U
. But this is already very
helpful, since if we way to search for instabilities, say numerically, then we know
where to look.
1.5 Stratified flows
In the Kelvin–Helmholtz scenario, we had a varying density. Let’s now try to
model the situation in complete generality.
Recall that the set of (inviscid) equations for Boussinesq stratified fluids is
ρ
u
t
+ u · u
= −∇p
0
gρ
0
ˆ
z,
with incompressibility equations
· u = 0,
Dρ
Dt
= 0.
We consider a mean base flow and a 2D perturbation as a normal mode:
u =
¯
U(z)
ˆ
x + u
0
(x, z, t)
p = ¯p(z) + p
0
(x, z, t)
ρ = ¯ρ(z) + ρ
0
(x, z, t),
with
(u
0
, p
0
, ρ
0
) = [
ˆ
u(z), ˆp(z), ˆρ(z)]e
i(kxωt)
= [
ˆ
u(z), ˆp(z), ˆρ(z)]e
ik(xct)
.
We wish to obtain an equation that involves only
ˆw
. We first linearize our
equations to obtain
¯ρ
u
0
t
+
¯
U
u
0
x
+ w
0
d
¯
U
dz
=
p
0
x
¯ρ
w
0
t
+
¯
U
w
0
x
=
p
0
z
gρ
0
ρ
0
t
+
¯
U
ρ
0
x
+ w
0
d¯ρ
dz
= 0
u
0
x
+
w
0
z
= 0.
Plugging in our normal mode solution into the four equations, we obtain
ik¯ρ(
¯
U c)ˆu + ¯ρ ˆw
d
dz
¯
U = ikˆp
ik¯ρ(
¯
U c) ˆw =
dˆp
dz
g ˆρ. ()
ik(
¯
U c)ˆρ + w
0
d¯ρ
dz
= 0 ()
ikˆu +
w
0
z
= 0.
The last (incompressibility) equation helps us eliminate
ˆu
from the first equation
to obtain
¯ρ(
¯
U c)
d ˆw
dz
+ ¯ρ ˆw
d
¯
U
dz
= ikˆp.
To eliminate
ˆp
as well, we differentiate with respect to
z
and apply (
), and then
further apply () to get rid of the ˆρ term, and end up with
¯ρ(
¯
U c)
d
2
ˆw
dz
2
+ ¯ρ ˆw
d
2
¯
U
dz
2
+ k
2
¯ρ(
¯
U c) ˆw =
g ˆw
¯
U c
d¯ρ
dz
.
This allows us to write our equation as the Taylor–Goldstein equation
d
2
dz
2
k
2
ˆw
ˆw
(
¯
U c)
d
2
¯
U
dz
2
+
N
2
ˆw
(
¯
U c)
2
= 0,
where
N
2
=
g
¯ρ
d¯ρ
dz
.
This
N
is known as the buoyancy frequency, and has dimensions
T
2
. This is
the frequency at which a slab of stratified fluid would oscillate vertically.
Indeed, if we have a slab of volume
V
, and we displace it by an infinitesimal
amount ζ in the vertical direction, then the buoyancy force is
F = V gζ
ρ
z
.
Since the mass of the fluid is
ρ
0
V
, by Newton’s second law, the acceleration of
the parcel satisfies
d
2
ζ
dt
2
+
g
ρ
0
ρ
z
ζ = 0.
This is just simple harmonic oscillation of frequency N .
Crucially, what we see from this computation is that a density stratification
of the fluid can lead to internal waves. We will later see that the presence of
these internal waves can destabilize our system.
Miles–Howard theorem
In 1961, John Miles published a series of papers establishing a sufficient condition
for infinitesimal stability in the case of a stratified fluid. When he submitted
this to review, Louis Howard read the paper, and replied with a 3-page proof of
a more general result. Howard’s proof was published as a follow-up paper titled
“Note on a paper of John W. Miles”.
Recall that we just derived the Taylor–Goldstein equation
d
2
dz
2
k
2
ˆw
ˆw
(
¯
U c)
d
2
¯
U
dz
2
+
N
2
ˆw
(
¯
U c)
2
= 0,
The magical insight of Howard was to introduce the new variable
H =
ˆ
W
(
¯
U c)
1/2
.
We can regretfully compute the derivatives
ˆw = (
¯
U c)
1/2
H
d
dz
ˆw =
1
2
(
¯
U c)
1/2
H
d
¯
U
dz
+ (
¯
U c)
1/2
dH
dz
d
2
dz
2
ˆw =
1
4
(
¯
U c)
3/2
H
d
¯
U
dz
2
+
1
2
(
¯
U c)
1/2
H
d
2
¯
U
dz
2
+ (
¯
U c)
1/2
dH
dz
d
¯
U
dz
+ (
¯
U c)
1/2
d
2
H
dz
2
.
We can substitute this into the Taylor–Goldstein equation, and after some
algebraic mess, we obtain the decent-looking equation
d
dz
(
¯
U c)
dH
dz
H
k
2
(
¯
U c) +
1
2
d
2
¯
U
dz
2
+
1
4
d
¯
U
dz
2
N
2
¯
U c
= 0.
This is now self-adjoint. Imposing boundary conditions
ˆw,
d ˆw
dz
0 in the far
field, we multiply by H
and integrate over all space. The first term gives
Z
H
d
dz
(
¯
U c)
dH
dz
dz =
Z
(
¯
U c)
dH
dz
2
dz,
while the second term is given by
Z
k
2
|H|
2
(
¯
U c)
1
2
d
2
¯
U
dz
2
|H|
2
|H|
2
1
4
d
¯
U
dz
2
N
2
(
¯
U c
)
|
¯
U c|
2
dz.
Both the real and imaginary parts of the sum of these two terms must be zero.
In particular, the imaginary part reads
c
i
Z
dH
dz
2
+ k
2
|H|
2
+ |H|
2
N
2
1
4
d
¯
U
dz
2
|
¯
U c|
2
dz = 0.
So a necessary condition for instability is that
N
2
1
4
d
¯
U
dz
2
< 0
somewhere. Defining the Richardson number to be
Ri(z) =
N
2
(d
¯
U/dz)
2
,
the necessary condition is
Ri(z) <
1
4
.
Equivalently, a sufficient condition for stability is that Ri(z)
1
4
everywhere.
How can we think about this? When we move a parcel, there is the buoyancy
force that attempts to move it back to its original position. However, if we move
the parcel to the high velocity region, we can gain kinetic energy from the mean
flow. Thus, if the velocity gradient is sufficiently high, it becomes “advantageous”
for parcels to move around.
Piecewise-linear profiles
If we want to make our lives simpler, we can restrict our profile with piecewise
linear velocity and layered density. In this case, to solve the Taylor–Goldstein
equation, we observe that away from the interfaces, we have
N
=
U
00
= 0. So
the Taylor–Goldstein equation reduces to
d
2
dz
2
k
2
ˆw = 0.
This has trivial exponential solutions in each of the individual regions, and to
find the correct solutions, we need to impose some matching conditions at the
interface.
So assume that the pressure is continuous across all interfaces. Then using
the equation
¯ρ(
¯
U c)
d ˆw
dz
+ ρ ˆw
d
¯
U
dz
= ikˆp,
we see that the left hand seems like the derivative of a product, except the sign
is wrong. So we divide by
1
(
¯
Uc)
2
, and then we can write this as
ik
¯ρ
ˆp
(
¯
U c)
2
=
d
dz
ˆw
¯
U c
.
For a general
c
, we integrate over an infinitesimal distance at the interface. We
assume
ˆp
is continuous, and that
¯ρ
and (
¯
U c
) have bounded discontinuity.
Then the integral of the LHS vanishes in the limit, and so the integral of the
right-hand side must be zero. This gives the matching condition
ˆw
¯
U c
+
= 0.
To obtain a second matching condition, we rewrite the Taylor–Goldstein equa-
tion as a derivative, which allows for the determination of the other matching
condition:
d
dz
(
¯
U c)
d ˆw
dz
ˆw
d
¯
U
dz
g ¯ρ
ρ
0
ˆw
¯
U c

= k
2
(
¯
U c) ˆw
g ¯ρ
ρ
0
d
dz
ˆw
¯
U c
.
Again integrating over an infinitesimal distance over at the interface, we see
that the LHS must be continuous along the interface. So we obtain the second
matching condition.
(
¯
U c)
d ˆw
dz
ˆw
d
¯
U
dz
g ¯ρ
ρ
0
ˆw
¯
U c

+
= 0.
We begin by applying this to a relatively simple profile with constant density,
and whose velocity profile looks like
h
U
For convenience, we scale distances by
h
2
and speeds by
U
2
, and define
˜c
and α by
c =
U
2
˜c, α =
kh
2
.
These quantities
α
and
˜c
(which we will shortly start writing as
c
instead) are
nice dimensionless quantities to work with. After the scaling, the profile can be
described by
z = 1
z = 1
¯
U = 1
¯
U = z
¯
U = 1
III
II
I
ˆw = Ae
α(z1)
ˆw = Be
αz
+ Ce
αz
ˆw = De
α(z+1)
We have also our exponential solutions in each region between the interfaces,
noting that we require our solution to vanish at the far field.
We now apply the matching conditions. Since
¯
U c
is continuous, the first
matching condition just says ˆw has to be continuous. So we get
A = Be
α
+ Ce
α
D = Be
α
+ Ce
α
.
The other matching condition is slightly messier. It says
(
¯
U c)
d ˆw
dz
ˆw
d
¯
U
dz
g ¯ρ
ρ
0
ˆw
¯
U c

+
= 0,
which gives us two equations
(Be
α
+ Ce
α
)(1 c)(α) = (Be
α
Ce
α
)(1 c)(α) (Be
α
+ Ce
α
)
(Be
α
+ Ce
α
)(1 c)(α) = (Be
α
Ce
α
)(1 c)(α) (Be
α
+ Ce
α
).
Simplifying these gives us
(2α(1 c) 1)Be
α
= Ce
α
(2α(1 + c) 1)Ce
α
= Be
α
.
Thus, we find that
(2α 1)
2
4α
2
c
2
= e
4α
,
and hence we have the dispersion relation
c
2
=
(2α 1)
2
e
4α
4α
2
.
We see that this has the possibility of instability. Indeed, we can expand the
numerator to get
c
2
=
(1 4α + 4α
2
) (1 4α + 8α
2
+ O(α
3
))
4α
2
= 1 + O(α).
So for small
α
, we have instability. This is known as Rayleigh instability. On
the other hand, as
α
grows very large, this is stable. We can plot these out in a
graph
k
ω
ω
i
ω
r
0.64
We see that the critical value of
α
is 0
.
64, and the maximally instability point
is α 0.4.
Let’s try to understand where this picture came from. For large
k
, the
wavelength of the oscillations is small, and so it seems reasonable that we can
approximate the model by one where the two interfaces don’t interact. So
consider the case where we only have one interface.
z = 1
¯
U = z
¯
U = 1
II
I
Ae
α(z1)
Be
α(z1)
We can perform the same procedure as before, solving
ˆw
¯
U c
+
= 0.
This gives the condition A = B. The other condition then tell us
(1 c)(α)A = (1 c)αA Ae
α(z1)
.
We can cancel the A, and rearrange this to say
c = 1
1
2α
.
So we see that this interface supports a wave at speed
c
= 1
1
2α
at a speed
lower than
¯
U
. Similarly, if we treated the other interface at isolation, it would
support a wave at c
= 1 +
1
2α
.
Crucially these two waves are individually stable, and when
k
is large, so that
the two interfaces are effectively in isolation, we simply expect two independent
and stable waves. Indeed, in the dispersion relation above, we can drop the
e
4α
term when α is large, and approximate
c = ±
2α 1
2α
= ±
1
1
2α
.
When
k
is small, then we expect the two modes to interact with each other,
which manifests itself as the
e
4α
term. Moreover, the resonance should be the
greatest when the two modes have equal speed, i.e. at
α
1
2
, which is quite
close to actual maximally unstable mode.
Density stratification
We next consider the case where we have density stratification. After scaling,
we can draw our region and solutions as
z = 1
z = 0
z = 1
¯
U = 1
¯
U = z
¯
U = 1
III
IV
II
I
Ae
α(z1)
Be
α(z1)
+ Ce
α(z1)
De
α(z+1)
+ Ee
α(z+1)
F e
α(z+1)
¯ρ = 1
¯ρ = +1
Note that the
¯ρ
is the relative density, since it is the density difference that
matters (fluids cannot have negative density).
Let’s first try to understand how this would behave heuristically. As before,
at the I-II and III-IV interfaces, we have waves with
c
=
±
1
1
2α
. At the
density interface, we previously saw that we can have internal gravity waves
with c
igw
= ±
q
Ri
0
α
, where
Ri
0
=
gh
ρ
0
is the bulk Richardson number (before scaling, it is Ri
0
=
gρh
ρ
0
U
2
).
We expect instability to occur when the frequency of this internal gravity
wave aligns with the frequency of the waves at the velocity interfaces, and this
is given by
1
1
2α
2
=
Ri
0
α
.
It is easy to solve this to get the condition
Ri
0
' α 1.
This is in fact what we find if we solve it exactly.
So let’s try to understand the situation more responsibly now. The procedure
is roughly the same. The requirement that ˆw is continuous gives
A = B + C
Be
α
+ Ce
α
= De
α
+ Ee
α
D + E = F.
If we work things out, then the other matching condition
(
¯
U c)
d ˆw
dz
ˆw
d
¯
U
dz
Ri
0
¯ρ
ˆw
¯
U c

+
= 0
gives
(1 c)(α)A = (1 c)(α)(B C) (B + C)
(1 c)(α)F = (1 c)(α)(D E) (D + E)
(c)(α)(Be
α
Ce
α
) +
Ri
0
(c)
(Be
α
+ Ce
α
)
= (c)(α)(De
α
Ee
α
)
Ri
0
(c)
(De
α
+ Ee
α
).
This defines a 6
×
6 matrix problem, 4th order in
c
. Writing it as
SX
= 0 with
X
= (
A, B, C, D, E, F
)
T
, the existence of non-trivial solutions is given by the
requirement that det S = 0, which one can check (!) is given by
c
4
+ c
2
e
4α
(2α 1)
2
4α
2
Ri
0
α
+
Ri
0
α
e
2α
+ (2α 1)
2α
2
= 0.
This is a biquadratic equation, which we can solve.
We can inspect the case when
Ri
0
is very small. In this case, the dispersion
relation becomes
c
4
+ c
2
e
4α
(2α 1)
2
4α
2
.
Two of these solutions are simply given by
c
= 0, and the other two are just those
we saw from Rayleigh instability. This is saying that when the density difference
is very small, then the internal gravitational waves are roughly non-existent, and
we can happily ignore them.
In general, we have instability for all
Ri
0
! This is Holmboe instability. The
scenario is particularly complicated if we look at small
α
, where we expect the
effects of Rayleigh instability to kick in as well. So let’s fix some 0
< α <
0
.
64,
and see what happens when we increase the Richardson number.
As before, we use dashed lines to denote the imaginary parts and solid lines
to denote the real part. For each
Ri
0
, any combination of imaginary part and
real part gives a solution for
c
, and, except when there are double roots, there
should be four such possible combinations.
Ri
0
c
We can undersatnd this as follows. When
Ri
0
= 0, all we have is Rayleigh
instability, which are the top and bottom curves. There are then two
c
= 0
solutions, as we have seen. As we increase
Ri
0
to a small, non-sero amount, this
mode becomes non-zero, and turns into a genuine unstable mode. As the value
of
Ri
0
increases, the two imaginary curves meet and merge to form a new curve.
The solutions then start to have a non-zero real part, which gives a non-zero
phase speed to our pertrubation.
Note that when
Ri
0
becomes very large, then our modes become unstable
again, though this is not clear from our graph.
Taylor instability
While Holmboe instability is quite interesting, our system was unstable even
without the density stratification. Since we saw that instability is in general trig-
gered by the interaction between two interfaces, it should not be surprising that
even in a Rayleigh stable scenario, if we have two layers of density stratification,
then we can still get instability.
Consider the following flow profile:
z = 1
z = 1
¯
U = z
III
II
I
Ae
α(z1)
Be
αz
+ Ce
αz
De
α(z+1)
¯ρ = R 1
¯ρ = R + 1
As before, continuity in ˆw requires
A = Be
α
+ Ce
α
D = Be
α
+ Ce
α
.
Now there is no discontinuity in vorticity, but the density field has jumps. So
we need to solve the second matching condition. They are
(1 c)(α)(Be
α
+ Ce
α
) +
Ri
0
1 c
(Be
α
+ Ce
α
) = (1 c)(α)(Be
α
Ce
α
)
(1 c)(α)(Be
α
+ Ce
α
) +
Ri
0
1 + c
(Be
α
+ Ce
α
) = (1 c)(α)(Be
α
Ce
α
)
These give us, respectively,
(2α(1 c)
2
Ri
0
)B = Ri
0
Ce
2α
(2α(1 + c)
2
Ri
0
)C = Ri
0
Be
2α
.
So we get the biquadratic equation
c
4
c
2
2 +
Ri
0
α
+
1
Ri
0
2α
2
Ri
2
0
e
4α
4α
2
= 0.
We then apply the quadratic formula to say
c
2
= 1 +
Ri
0
2α
±
r
2Ri
0
α
+
Ri
2
0
e
4α
4α
2
.
So it is possible to have instability with no inflection! Indeed, instability occurs
when c
2
< 0, which is equivalent to requiring
2α
1 + e
2α
< Ri
0
<
2α
1 e
2α
.
Thus, for any
Ri
0
>
0, we can have instability. This is known as Taylor
instability.
Heuristically, the density jumps at the interface have speed
c
±
igw
= ±1
Ri
0
2α
1/2
.
We get equality when
c
= 0, so that
Ri
0
= 2
α
. This is in very good agreement
with what our above analysis gives us.
2 Absolute and convective instabilities
So far, we have only been considering perturbations that are Fourier modes,
(u
0
, p
0
, ρ
0
) = [
ˆ
u(z), ˆp(z), ˆρ(z)]e
i(k·xωt
.
This gives rise to a dispersion relation
D
(
k, ω
) = 0. This is an eigenvalue problem
for
ω
(
k
), and the
k
th mode is unstable if the solution
ω
(
k
) has positive imaginary
part.
When we focus on Fourier modes, they are necessarily non-local. In reality,
perturbations tend to be local. We perturb the fluid at some point, and the
perturbation spreads out. In this case, we might be interested in how the
perturbations spread.
To understand this, we need the notion of group velocity. For simplicity,
suppose we have a sum of two Fourier modes of slightly different wavenumbers
k
0
± k
. The corresponding frequencies are then
ω
0
± ω
, and for small
k
, we
may approximate
ω
= k
ω
k
k=k
0
.
We can then look at how our wave propagates:
η = cos[(k
0
+ k
)x (ω
0
+ ω
)t] + cos[k
0
k
)x (ω
0
ω
)t]
= 2 cos(k
x ω
t) cos(k
0
x ω
0
t)
= 2 cos
k
w
ω
k
k=k
0
t
!!
cos(k
0
x ω
0
t)
Since
k
is small, we know the first term has a long wavelength, and determines
the “overall shape” of the wave.
η
As time evolves, these “packets” propagate with group velocity
c
g
=
ω
k
. This
is also the speed at which the energy in the wave packets propagate.
In general, there are 4 key characteristics of interest for these waves:
The energy in a wave packet propagates at a group velocity.
They can disperse (different wavelengths have different speeds)
They can be advected by a streamwise flow.
They can be unstable, i.e. their amplitude can grow in time and space.
In general, if we have an unstable system, then we expect the perturbation to
grow with time, but also “move away” from the original source of perturbation.
We can consider two possibilities we can have convective instability, where
the perturbation “goes away”; and absolute instability, where the perturbation
doesn’t go away.
We can imagine the evolution of a convective instability as looking like this:
t
whereas an absolute instability would look like this:
t
Note that even in the absolute case, the perturbation may still have non-zero
group velocity. It’s just that the perturbations grow more quickly than the group
velocity.
To make this more precise, we consider the response of the system to an
impulse. We can understand the dispersion relation as saying that in phase space,
a quantity χ (e.g. velocity, pressure, density) must satisfy
D(k, ω) ˜χ(k, ω) = 0,
where
˜χ(k, ω) =
Z
−∞
Z
−∞
χ(x, t)e
i(kxωt)
dx dt.
Indeed, this equation says we can have a non-zero (
k, ω
) mode iff they satisfy
D
(
k, ω
) = 0. The point of writing it this way is that this is now linear in
˜χ
, and
so it applies to any χ, not necessarily a Fourier mode.
Going back to position space, we can think of this as saying
D
i
x
, i
t
χ(x, t) = 0.
This equation allows us to understand how the system responds to some external
forcing F (x, t). We simply replace the above equation by
D
i
x
, i
t
χ(x, t) = F (x, t).
Usually, we want to solve this in Fourier space, so that
D(k, ω) ˜χ(k, ω) =
˜
F (k, ω).
In particular, the Green’s function (or impulse response) is given by the response
to the impulse
F
ξ,τ
(
x, t
) =
δ
(
x ξ
)
δ
(
t τ
). We may wlog assume
ξ
=
τ
= 0,
and just call this
F
. The solution is, by definition, the Green’s function
G
(
x, t
),
satisfying
D
i
x
, i
t
G(x, t) = δ(x)δ(t).
Given the Green’s function, the response to an arbitrary forcing
F
is “just” given
by
χ(x, t) =
Z
G(x ξ, t τ)F (ξ, τ) dξ dτ.
Thus, the Green’s function essentially controls the all behaviour of the system.
With the Green’s function, we may now make some definitions.
Definition (Linear stability). The base flow of a system is linearly stable if
lim
t→∞
G(x, t) = 0
along all rays
x
t
= C.
A flow is unstable if it is not stable.
Definition
(Linearly convectively unstable)
.
An unstable flow is linearly con-
vectively unstable if lim
t→∞
G(x, t) = 0 along the ray
x
t
= 0.
Definition
(Linearly absolutely unstable)
.
An unstable flow is linearly absolutely
unstable if lim
t→∞
G(x, t) 6= 0 along the ray
x
t
= 0.
The first case is what is known as an amplifier , where the instability grows
but travels away at the same time. In the second case, we have an oscillator .
Even for a general
F
, it is easy to solve for
˜χ
in Fourier space. Indeed, we
simply have
˜χ(k, ω) =
˜
F (k, ω)
D(k, ω)
.
To recover χ from this, we use the Fourier inversion formula
χ(x, t) =
1
(2π)
2
Z
L
ω
Z
F
k
˜χ(k, ω)e
i(kxωt)
dk dω.
Note that here we put some general contours for
ω
and
k
instead of integrating
along the real axis, so this is not the genuine Fourier inversion formula. However,
we notice that as we deform our contour, when we pass through a singularity,
we pick up a multiple of
e
i(kxωt)
. Moreover, since the singularities occur when
D
(
k, ω
) = 0, it follows that these extra terms we pick up are in fact solutions to
the homogeneous equation
D
(
i∂
x
, i∂
t
)
χ
= 0. Thus, no matter which contour
we pick, we do get a genuine solution to our problem.
So how should we pick a contour? The key concept is causality the
response must come after the impulse. Thus, if
F
(
x, t
) = 0 for
t <
0, then we
also need
χ
(
x, t
) = 0 in that region. To understand this, we perform only the
temporal part of the Fourier inversion, so that
˜χ(k, t) =
1
2π
Z
L
ω
˜
F (k, ω)
D(k, ω; R)
e
t
dω.
To perform this contour integral, we close our contour either upwards or down-
wards, compute residues, and then take the limit as our semi-circle tends to
infinity. For this to work, it must be the case that the contribution by the
circular part of the contour vanishes in the limit, and this determines whether
we should close upwards or downwards.
If we close upwards, then $\omega$ will have positive imaginary part. So for $e^{-i\omega t}$ not to blow up, we need $t < 0$. Similarly, we close downwards when $t > 0$. Thus, if we want $\chi$ to vanish whenever $t < 0$, we should pick our contour so that it lies above all the singularities, so that it picks up no residue when we close upwards. This determines the choice of contour.
But the key question is which of these are causal. When $t < 0$, we require $\tilde\chi(k, t) = 0$. By Jordan's lemma, for $t < 0$, we should close the contour upwards when performing the $\omega$ integral; when $t > 0$, we close it downwards. Thus, for $\tilde\chi(k, t) = 0$ when $t < 0$, we must pick $L_\omega$ to lie above all singularities of $\tilde\chi(k, \omega)$.
Assume that $D$ has a single simple zero for each $k$. Let $\omega(k)$ be the corresponding value of $\omega$. Then by complex analysis, we obtain
\[ \tilde\chi(k, t) = -i\, \frac{\tilde F[k, \omega(k)]}{\frac{\partial D}{\partial \omega}[k, \omega(k)]}\, e^{-i\omega(k) t}. \]
We finally want to take the inverse Fourier transform with respect to $x$:
\[ \chi(x, t) = -\frac{i}{2\pi} \int_{F_k} \frac{\tilde F[k, \omega(k)]}{\frac{\partial D}{\partial \omega}[k, \omega(k)]}\, e^{i(kx - \omega(k)t)}\,\mathrm{d}k. \]
We are interested in the case $F = \delta(x)\delta(t)$, i.e. $\tilde F(k, \omega) = 1$. So the central question is the evaluation of the integral
\[ G(x, t) = -\frac{i}{2\pi} \int_{F_k} \frac{\exp\big(i(kx - \omega(k)t)\big)}{\frac{\partial D}{\partial \omega}[k, \omega(k)]}\,\mathrm{d}k. \]
Recall that our objective is to determine the behaviour of $G$ as $t \to \infty$ with $V = x/t$ fixed. Since we are interested in the large $t$ behaviour instead of obtaining exact values at finite $t$, we may use what is known as the method of steepest descent.
The method of steepest descent is a very general technique to approximate integrals of the form
\[ H(t) = -\frac{i}{2\pi} \int_{F_k} f(k)\, \exp\big(t\rho(k)\big)\,\mathrm{d}k \]
in the limit $t \to \infty$. In our case, we take
\[ f(k) = \frac{1}{\frac{\partial D}{\partial \omega}[k, \omega(k)]}, \qquad \rho(k) = i\big(kV - \omega(k)\big). \]
In an integral of this form, there are different factors that may affect the limiting behaviour when $t \to \infty$. The first is that as $t$ gets very large, the fast oscillations cause the integral to cancel except where the phase is stationary, i.e. where $\frac{\partial \rho_i}{\partial k} = 0$. On the other hand, since we're taking the exponential of $t\rho$, we'd expect the largest contribution to the integral to come from the $k$ where $\rho_r(k)$ is the greatest.

The idea of the method of steepest descent is to deform the contour so that we integrate along paths of stationary phase, i.e. paths with constant $\rho_i$, so that we don't have to worry about the oscillating phase.
To do so, observe that the Cauchy–Riemann equations tell us
\[ \nabla \rho_r \cdot \nabla \rho_i = 0, \]
where the gradient $\nabla$ is taken with respect to $k_r$ and $k_i$, viewing the real and imaginary parts as separate variables.

Since the gradient of a function is normal to its contours, this tells us the curves of constant $\rho_i$ are those that point in the direction of $\nabla \rho_r$. In other words, the stationary phase curves are exactly the curves of steepest descent of $\rho_r$.
Often, the function $\rho$ has some stationary points, i.e. points $k_*$ satisfying $\rho'(k_*) = 0$. Generically, we expect this to be a second-order zero, and thus $\rho$ looks like $(k - k_*)^2$ near $k_*$. We can plot the contours of $\rho_i$ near $k_*$ as follows:

[diagram: contours of $\rho_i$ near the saddle point $k_*$, with arrows denoting the direction of steepest descent of $\rho_r$]

Note that since the real part satisfies Laplace's equation, such a stationary point must be a saddle, instead of a local maximum or minimum, even if the zero is not second-order.
We now see that if our start and end points lie on opposite sides of the ridge,
i.e. one is below the horizontal line and the other is above, then the only way to
do so while staying on a path of stationary phase is to go through the stationary
point.
Along such a contour, we would expect the greatest contribution to the integral to occur where $\rho_r$ is the greatest, i.e. at $k_*$. We can expand $\rho$ about $k_*$ as
\[ \rho(k) \approx \rho(k_*) + \frac{1}{2} \frac{\partial^2 \rho}{\partial k^2}(k_*)\,(k - k_*)^2. \]
We can then approximate
\[ H(t) \approx -\frac{i}{2\pi} \int_\varepsilon f(k)\, e^{t\rho(k)}\,\mathrm{d}k, \]
where we are just integrating over a tiny portion of our contour near $k_*$. Putting in our series expansion of $\rho$, we can write this as
\[ H(t) \approx -\frac{i}{2\pi}\, f(k_*)\, e^{t\rho(k_*)} \int_\varepsilon \exp\left( \frac{t}{2} \frac{\partial^2 \rho}{\partial k^2}(k_*)\,(k - k_*)^2 \right)\mathrm{d}k. \]
Recall that we picked our path to be the path of steepest descent on both sides of the ridge. So we can parametrize our path by $K$ such that
\[ (iK)^2 = \frac{t}{2} \frac{\partial^2 \rho}{\partial k^2}(k_*)\,(k - k_*)^2, \]
where $K$ is purely real. So our approximation becomes
\[ H(t) \approx \frac{f(k_*)\, e^{t\rho(k_*)}}{\sqrt{2\pi^2 t\, \rho''(k_*)}} \int_{-\varepsilon}^{\varepsilon} e^{-K^2}\,\mathrm{d}K. \]
Since $e^{-K^2}$ falls off so quickly as $K$ gets away from $0$, we may approximate this by an integral over the whole real line, which we know gives $\sqrt{\pi}$. So our final approximation is
\[ H(t) \approx \frac{f(k_*)\, e^{t\rho(k_*)}}{\sqrt{2\pi t\, \rho''(k_*)}}. \]
We then look at the maxima of $\rho_r(k)$ along these paths and see which has the greatest contribution.
Now let's apply this to our situation. Our $\rho$ was given by
\[ \rho(k) = i\big(kV - \omega(k)\big). \]
So $k_*$ is given by solving
\[ \frac{\partial \omega}{\partial k}(k_*) = V. \tag{$*$} \]
Thus, what we have shown is that the greatest contribution to the Green's function along the $V$ direction comes from the modes whose group velocity is $V$! Note that in general, this $k_*$ is complex.

Thus, the conclusion is that given any $V$, we should find the $k_*$ such that ($*$) is satisfied. The temporal growth rate along $x/t = V$ is then
\[ \sigma(V) = \omega_i(k_*) - k_{*,i}\, V. \]
This is the growth rate we would experience if we moved at velocity $V$. However, it is often more useful to consider the absolute growth rate. In this case, we should try to maximize $\omega_i$. Suppose this is achieved at $k = k_{\max}$, possibly complex. Then we have
\[ \frac{\partial \omega_i}{\partial k}(k_{\max}) = 0. \]
But this means that $c_g = \frac{\partial \omega}{\partial k}$ is purely real. Thus, this maximally unstable mode can be realized along some physically existent $V = c_g$.
We can now say
– If $\omega_{i,\max} < 0$, then the flow is linearly stable.
– If $\omega_{i,\max} > 0$, then the flow is linearly unstable. In this case,
  – if $\omega_{0,i} < 0$, then the flow is convectively unstable;
  – if $\omega_{0,i} > 0$, then the flow is absolutely unstable.
Here $\omega_0$ is defined by first solving $\frac{\partial \omega}{\partial k}(k_0) = 0$, thus determining the $k_0$ that leads to a zero group velocity, and then setting $\omega_0 = \omega(k_0)$. These quantities are known as the absolute frequency and absolute growth rate respectively.
Example. We can consider a "model dispersion relation" given by the linear complex Ginzburg–Landau equation
\[ \left( \frac{\partial}{\partial t} + U \frac{\partial}{\partial x} \right) \chi - \mu \chi - (1 + i c_d) \frac{\partial^2}{\partial x^2} \chi = 0. \]
This has advection ($U$), dispersion ($c_d$) and instability ($\mu$). Indeed, if we replace $\frac{\partial}{\partial t} \mapsto -i\omega$ and $\frac{\partial}{\partial x} \mapsto ik$, then we have
\[ -i(\omega - Uk)\chi - \mu \chi + (1 + i c_d) k^2 \chi = 0. \]
This gives
\[ \omega = Uk + c_d k^2 + i(\mu - k^2). \]
We see that we have temporal instability for $|k| < \sqrt{\mu}$, where we assume $\mu > 0$. This has
\[ c_r = \frac{\omega_r}{k} = U + c_d k. \]
On the other hand, if we force $\omega \in \mathbb{R}$, then we have spatial instability. Solving
\[ k^2 (c_d - i) + Uk + (i\mu - \omega) = 0, \]
the quadratic formula gives us two branches
\[ k_\pm = \frac{-U \pm \sqrt{U^2 - 4(i\mu - \omega)(c_d - i)}}{2(c_d - i)}. \]
To understand whether this instability is convective or absolute, we can complete the square to obtain
\[ \omega = \omega_0 + \frac{1}{2} \omega_{kk} (k - k_0)^2, \]
where
\[ \omega_{kk} = 2(c_d - i), \qquad k_0 = \frac{U}{2(i - c_d)}, \qquad \omega_0 = -\frac{c_d U^2}{4(1 + c_d^2)} + i\left( \mu - \frac{U^2}{4(1 + c_d^2)} \right). \]
These $k_0$ and $\omega_0$ are then the absolute wavenumber and absolute frequency.

Note that after completing the square, it is clear that $k_0$ is a double root at $\omega = \omega_0$. Of course, there is nothing special about this example, since $k_0$ was defined to solve $\frac{\partial \omega}{\partial k}(k_0) = 0$!
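To make the example concrete, here is a minimal numerical sketch (not from the lectures; the function names are made up) that evaluates $k_0$ and $\omega_0$ for the linear complex Ginzburg–Landau model and classifies the instability accordingly.

```python
import numpy as np

def absolute_mode(U, c_d, mu):
    """omega(k) = U k + c_d k^2 + i(mu - k^2); return (k0, omega0)."""
    k0 = U / (2 * (1j - c_d))                       # solves d(omega)/dk = 0
    omega0 = U * k0 + (c_d - 1j) * k0**2 + 1j * mu  # omega evaluated at k0
    return k0, omega0

def classify(U, c_d, mu):
    if mu <= 0:                                     # max over real k of omega_i is mu
        return "linearly stable"
    _, omega0 = absolute_mode(U, c_d, mu)
    return "absolutely unstable" if omega0.imag > 0 else "convectively unstable"

# strong advection sweeps the instability away; weak advection lets it pin in place
for U in (3.0, 1.0):
    print(U, classify(U, c_d=1.0, mu=0.5))
```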
I really ought to say something about Bers’ method at this point.
In practice, it is not feasible to apply a $\delta$ function of perturbation and see how it grows. Instead, we can try to understand convective versus absolute instabilities using a periodic and switched-on forcing
\[ F(x, t) = \delta(x)\, H(t)\, e^{-i\omega_f t}, \]
where $H$ is the Heaviside step function. The response in spectral space is then
\[ \tilde\chi(k, \omega) = \frac{\tilde F(k, \omega)}{D(k, \omega)} = \frac{i}{D(k, \omega)(\omega - \omega_f)}. \]
There is a new (simple) pole precisely at $\omega = \omega_f$ on the real axis. We can invert for $t$ to obtain
\[ \tilde\chi(k, t) = \frac{i}{2\pi} \int_{L_\omega} \frac{e^{-i\omega t}}{D(k, \omega)(\omega - \omega_f)}\,\mathrm{d}\omega = \frac{e^{-i\omega_f t}}{D(k, \omega_f)} + \frac{e^{-i\omega(k)t}}{(\omega(k) - \omega_f)\,\frac{\partial D}{\partial \omega}[k, \omega(k)]}. \]
We can then invert for $x$ to obtain
\[ \chi(x, t) = \underbrace{\frac{e^{-i\omega_f t}}{2\pi} \int_{F_k} \frac{e^{ikx}}{D(k, \omega_f)}\,\mathrm{d}k}_{\chi_F(x, t)} + \underbrace{\frac{1}{2\pi} \int_{F_k} \frac{e^{i[kx - \omega(k)t]}}{[\omega(k) - \omega_f]\,\frac{\partial D}{\partial \omega}[k, \omega(k)]}\,\mathrm{d}k}_{\chi_T(x, t)}. \]
The second term is associated with the switch-on transients, and is very similar to what we got in the previous steepest descent analysis.

If the flow is absolutely unstable, then the transients dominate and invade the entire domain. But if the flow is convectively unstable, then this term is swept away, and all that is left is $\chi_F(x, t)$. This gives us a way to distinguish between the two cases.
Note that in the forcing term, we get contributions at the singularities $k_\pm(\omega_f)$. Which one contributes depends on causality. One then checks that the correct result is
\[ \chi_F(x, t) = iH(x)\, \frac{e^{i(k_+(\omega_f)x - \omega_f t)}}{\frac{\partial D}{\partial k}[k_+(\omega_f), \omega_f]} - iH(-x)\, \frac{e^{i(k_-(\omega_f)x - \omega_f t)}}{\frac{\partial D}{\partial k}[k_-(\omega_f), \omega_f]}. \]
Note that $k_\pm(\omega_f)$ may have imaginary parts! If there is some $\omega_f$ such that $k_{+,i}(\omega_f) < 0$, then we will see spatially growing waves in $x > 0$. Similarly, if there exists some $\omega_f$ such that $k_{-,i}(\omega_f) > 0$, then we see spatially growing waves in $x < 0$.
Note that we will see this effect only when we are convectively unstable.
Perhaps it is wise to apply these ideas to an actual fluid dynamics problem. We revisit the broken line shear layer profile, scaled with the velocity jump, but allow a non-zero mean $U_m$:

[Figure: piecewise-linear shear layer with $\bar U = 1 + U_m$ for $z > 1$, $\bar U = z + U_m$ for $|z| < 1$, and $\bar U = -1 + U_m$ for $z < -1$; the perturbation takes the form $A e^{-\alpha(z-1)}$ in the top region, $B e^{-\alpha z} + C e^{\alpha z}$ in the middle region, and $D e^{\alpha(z+1)}$ in the bottom region.]
We do the same interface matching conditions, and after doing the same computations (or waving your hands with Galilean transforms), we get the dispersion relation
\[ 4(\omega - U_m \alpha)^2 = (2\alpha - 1)^2 - e^{-4\alpha}. \]
It is now more natural to scale with $U_m$ rather than $\Delta U/2$, and this involves expressing everything in terms of the velocity ratio $R = \frac{\Delta U}{2U_m}$. Then we can write the dispersion relation as
\[ D(k, \omega; R) = 4(\omega - k)^2 - R^2\left[ (2k - 1)^2 - e^{-4k} \right] = 0. \]
Note that under this scaling, the velocity for $z < -1$ is $u = 1 - R$. In particular, if $R < 1$, then all of the fluid is flowing in the same direction, and we might expect the flow to "carry away" the perturbations, and this is indeed true.
The absolute/convective boundary is given by the frequency at the wavenumber of zero group velocity:
\[ \frac{\partial \omega}{\partial k}(k_0) = 0. \]
This gives us
\[ \omega_0 = k_0 - \frac{R^2}{2}\left[ 2k_0 - 1 + e^{-4k_0} \right]. \]
Plugging this into the dispersion relation, we obtain
\[ R^2\left[ 2k_0 - 1 + e^{-4k_0} \right]^2 - \left[ (2k_0 - 1)^2 - e^{-4k_0} \right] = 0. \]
We solve for $k_0$, which is in general complex, and then solve for $\omega_0$ to see if it leads to $\omega_{0,i} > 0$. This has to be done numerically, and when we do this, we see that the convective/absolute boundary occurs precisely at $R = 1$.
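Since the saddle-point condition has to be solved numerically anyway, here is a hedged sketch of how one might do it (assumptions: scipy is available, and the initial guess lands on the physically relevant pinch point; in practice one should verify Briggs' pinching criterion for the root that is found).

```python
import numpy as np
from scipy.optimize import fsolve

def saddle_condition(kv, R):
    """Equation satisfied by the complex k0 after eliminating omega0 (derived above)."""
    k = kv[0] + 1j * kv[1]
    f = R**2 * (2*k - 1 + np.exp(-4*k))**2 - ((2*k - 1)**2 - np.exp(-4*k))
    return [f.real, f.imag]

def omega0(R, guess=(0.2, 0.2)):
    kr, ki = fsolve(saddle_condition, guess, args=(R,))
    k0 = kr + 1j * ki
    return k0 - 0.5 * R**2 * (2*k0 - 1 + np.exp(-4*k0))

for R in (0.9, 1.0, 1.1):
    print(R, omega0(R).imag)    # expect a sign change near R = 1
```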
Gaster relation
Temporal and spatial instabilities are related close to a marginally stable state. This is a flow at a critical parameter $R_c$ with critical (real) wavenumber and frequency: $D(k_c, \omega_c; R_c) = 0$ with $\omega_{c,i} = k_{c,i} = 0$.
We can Taylor expand the dispersion relation
\[ \omega = \omega_c + \frac{\partial \omega}{\partial k}(k_c; R_c)\,[k - k_c]. \]
We take the imaginary part
\[ \omega_i = \frac{\partial \omega_i}{\partial k_r}(k_c, R_c)\,(k_r - k_c) + \frac{\partial \omega_r}{\partial k_r}(k_c, R_c)\, k_i. \]
For the temporal mode, we have $k_i = 0$, and so
\[ \omega_i^{(T)} = \frac{\partial \omega_i}{\partial k_r}(k_c, R_c)\,(k_r - k_c). \]
For the spatial mode, we have $\omega_i = 0$, and so
\[ 0 = \frac{\partial \omega_i}{\partial k_r}(k_c, R_c)\,[k_r - k_c] + \frac{\partial \omega_r}{\partial k_r}(k_c, R_c)\, k_i^{(S)}. \]
Remembering that $c_g = \frac{\partial \omega_r}{\partial k_r}$, we find that
\[ \omega_i^{(T)} = -c_g\, k_i^{(S)}. \]
This gives us a relation between the growth rates of the temporal mode and the spatial mode when we are near the marginally stable state.

Bizarrely, this is often a good approximation even when we are far from the marginal state.
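As a quick, hypothetical check of the Gaster relation, we can reuse the Ginzburg–Landau dispersion relation from the earlier example, where both the temporal and the spatial branches are available in closed form; the parameter values below are illustrative only.

```python
import numpy as np

U, c_d, mu = 2.0, 0.5, 1.0
k_c = np.sqrt(mu)                      # marginal (real) wavenumber
c_g = U + 2 * c_d * k_c                # group velocity d(omega_r)/dk_r at marginality

k_r = 0.95 * k_c                       # slightly inside the unstable band
omega_T = mu - k_r**2                  # temporal growth rate at real k = k_r

omega_r = U * k_r + c_d * k_r**2       # impose a real forcing frequency
roots = np.roots([c_d - 1j, U, 1j*mu - omega_r])   # spatial branches k(omega_r)
k_S = roots[np.argmin(np.abs(roots - k_r))]        # pick the branch near k_r

print(omega_T, -c_g * k_S.imag)        # Gaster relation: roughly equal near marginality
```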
Global instabilities
So far, we have always been looking at flows that were parallel, i.e. the base flow depends on $z$ alone. However, in real life, flows tend to evolve downstream. Thus, we want to consider base flows $U = U(x, z)$.

Let $\lambda$ be the characteristic wavelength of the perturbation, and $L$ be the characteristic length scale of the change in $U$. If we assume $\varepsilon \equiv \frac{\lambda}{L} \ll 1$, then we may want to perform some local analysis.

To leading order in $\varepsilon$, the evolution is governed by the frozen dispersion relation at each $X = \varepsilon x$. We can then extend notions of stability/convective/absolute to local notions, e.g. a flow is locally convectively unstable if there is some $X$ such that $\omega_{i,\max}(X) > 0$, but $\omega_{0,i}(X) < 0$ for all $X$.
However, we can certainly imagine some complicated interactions between the different regions. For example, a perturbation upstream may be swept downstream by the flow, and then get "stuck" somewhere down there. In general, we can define

Definition (Global stability). A flow is globally stable if $\lim_{t \to \infty} G(x, t) = 0$ for all $x$.

A flow is globally unstable if there is some $x$ such that $\lim_{t \to \infty} G(x, t) \to \infty$.
For a steady spatially developing flow, global modes are
\[ \chi(x, t) = \phi(x)\, e^{-i\omega_G t}. \]
The complex global frequency $\omega_G$ is determined analogously to before using complex integration.

It can be established that $\omega_{G,i} \leq \omega_{0,i,\max}$. This then gives a necessary condition for global instability: there must be a region of local absolute instability within the flow.
Sometimes this is a good predictor, i.e. $R_{Gc} \simeq R_t$. For example, with mixing layers, we have $R_t = 1.315$ while $R_{Gc} = 1.34$. On the other hand, it is sometimes poor. For example, for bluff-body wakes, we have $\mathrm{Re}_t = 25$ while $\mathrm{Re}_{Gc} = 48.5$.
3 Transient growth
3.1 Motivation
So far, our model of stability is quite simple. We linearize our theory, look at the individual perturbation modes, and say the system is unstable if there is exponential growth. In certain circumstances, it works quite well. In others, it is just completely wrong.
There are six billion kilometers of pipes in the United States alone, where the
flow is turbulent. A lot of energy is spent pumping fluids through these pipes,
and turbulence is not helping. So we might think we should try to understand
flow in a pipe mathematically, and see if it gives ways to improve the situation.
Unfortunately, we can prove that flow in a pipe is linearly stable for all
Reynolds numbers. The analysis is not wrong. The flow is indeed linearly stable.
The real problem is that linear stability is not the right thing to consider.
We get similar issues with plane Poiseuille flow, i.e. a pressure driven flow
between horizontal plates. As we know from IB Fluids, the flow profile is a
parabola:
One can analyze this and prove that it is linearly unstable only when
\[ \mathrm{Re} = \frac{U_c d}{\nu} > 5772. \]
However, it is observed to be unstable at much lower $\mathrm{Re}$.
We have an even more extreme issue for plane Couette flow. This is flow between two plates at $z = \pm 1$ (after rescaling) driven at speeds of $\pm 1$ (after rescaling). Thus, the base flow is given by
\[ \bar U = z, \qquad |z| \leq 1. \]
Assuming the fluid is inviscid, the Rayleigh equation then tells us perturbations obey
\[ (\bar U - c)\left( \frac{\mathrm{d}^2}{\mathrm{d}z^2} - k^2 \right)\hat w - \left( \frac{\mathrm{d}^2 \bar U}{\mathrm{d}z^2} \right)\hat w = 0. \]
Since $\bar U = z$, the second derivative term drops out and this becomes
\[ (\bar U - c)\left( \frac{\mathrm{d}^2}{\mathrm{d}z^2} - k^2 \right)\hat w = 0. \]
If we want our solution to be smooth, or even just continuously differentiable, then we need $\left( \frac{\mathrm{d}^2}{\mathrm{d}z^2} - k^2 \right)\hat w = 0$. So the solution is of the form
\[ \hat w = A \sinh k(z + 1) + B \sinh k(z - 1). \]
However, to satisfy the boundary conditions $\hat w(\pm 1) = 0$, we must have $A = B = 0$, i.e. $\hat w = 0$.
Of course, it is not true that there can be no perturbations. Instead, we have
to relax the requirement that the eigenfunction is smooth. We shall allow it
to be non-differentiable at certain points, but still require that it is continuous
(alternatively, we relax differentiability to weak differentiability).
The fundamental assumption that the eigenfunction is smooth must be
relaxed. Let’s instead consider a solution of the form
\[ \hat w_+ = A_+ \sinh k(z - 1) \quad \text{for } z > z_c, \qquad \hat w_- = A_- \sinh k(z + 1) \quad \text{for } z < z_c. \]
If we require the vertical velocity to be continuous at the critical layer, then we must have the matching condition
\[ A_+ \sinh k(z_c - 1) = A_- \sinh k(z_c + 1). \]
This still satisfies the Rayleigh equation if $\bar U = c$ at the critical layer $z = z_c$. Note that $u$ is discontinuous at the critical layer, because incompressibility requires
\[ \frac{\partial w}{\partial z} = -\frac{\partial u}{\partial x} = -iku. \]
So for all
|c|
=
|ω/k| <
1, there is a (marginally stable) mode. The spectrum
is continuous. There is no discrete spectrum. This is quite weird, compared to
what we have previously seen.
But still, we have only found modes with a real $c$, since $\bar U$ is real! Thus, we conclude that inviscid plane Couette flow is stable! Viscosity regularizes the flow, but it turns out that it does not linearly destabilize the flow at any Reynolds number (Romanov, 1973).
Experimentally, and numerically, plane Couette flow is known to exhibit a
rich array of dynamics.
– Up to $\mathrm{Re} \approx 280$, we have laminar flow.
– Up to $\mathrm{Re} \approx 325$, we have transient spots.
– Up to $\mathrm{Re} \approx 415$, we have sustained spots and stripes.
– For $\mathrm{Re} > 415$, we have fully-developed turbulence.
In this chapter, we wish to understand transient growth. This is the case when
small perturbations can grow up to some visible, significant size, and then die
off.
3.2 A toy model
Let’s try to understand transient dynamics in a finite-dimensional setting. Ul-
timately, the existence of transient growth is due to the non-normality of the
operator.
Recall that a matrix $A$ is normal iff $A^\dagger A = A A^\dagger$. Of course, self-adjoint matrices are examples of normal operators. The spectral theorem says a normal operator has a complete basis of orthonormal eigenvectors, which is the situation we understand well. However, if our operator is not normal, then we don't necessarily have a basis of eigenvectors, and even if we do, they need not be orthonormal.
So suppose we are in a 2-dimensional world, and we have two eigenvectors $\Phi_1$ and $\Phi_2$ that are very close to each other:

[diagram: two nearly parallel unit vectors $\Phi_1$ and $\Phi_2$, with their small difference marked in red]

Now suppose we have a small perturbation given by $\boldsymbol{\varepsilon} = (\varepsilon, 0)^T$. While this perturbation is very small in magnitude, if we want to expand it in the basis $\Phi_1$ and $\Phi_2$, we must use coefficients that are themselves quite large. In this case, we might have $\boldsymbol{\varepsilon} = \Phi_2 - \Phi_1$, as indicated in red in the diagram above.
Let's let this evolve in time. Suppose both $\Phi_1$ and $\Phi_2$ are stable modes, but $\Phi_1$ decays much more quickly than $\Phi_2$. Then after some time, the $\Phi_1$ component has died away and the perturbation looks like a whole multiple of $\Phi_2$.

Note that the perturbation grows to reach a finite, large size, until it eventually vanishes again as the $\Phi_2$ component also decays.
Let’s try to put this down more concretely in terms of equations. We shall
consider a linear ODE of the form
\[ \dot{\mathbf{x}} = A\mathbf{x}. \]
We first begin by considering the matrix
\[ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \]
which does not exhibit transient growth. We first check that $A$ is not normal. Indeed,
\[ A A^T = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = A^T A. \]
Note that the matrix $A$ has a repeated eigenvalue of $0$ with a single eigenvector $(1\ 0)^T$.
To solve this system, write the equations more explicitly as
\[ \dot x_1 = x_2, \qquad \dot x_2 = 0. \]
If we impose the initial condition
\[ (x_1, x_2)(0) = (x_{10}, x_{20}), \]
then the solution is
\[ x_1(t) = x_{20} t + x_{10}, \qquad x_2(t) = x_{20}. \]
This exhibits linear, algebraic growth instead of the familiar exponential growth.
Let's imagine we perturb this system slightly. We can think of our previous system as the $\mathrm{Re} = \infty$ approximation, and this small perturbation as the effect of a large but finite Reynolds number. For $\varepsilon > 0$, set
\[ A_\varepsilon = \begin{pmatrix} -\varepsilon & 1 \\ 0 & -2\varepsilon \end{pmatrix}. \]
We then have eigenvalues
\[ \lambda_1 = -\varepsilon, \qquad \lambda_2 = -2\varepsilon, \]
corresponding to eigenvectors
\[ \mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \mathbf{e}_2 = \begin{pmatrix} 1 \\ -\varepsilon \end{pmatrix}. \]
Notice that the system is now stable. However, the eigenvectors are very close to being parallel.
As we previously discussed, if we have an initial perturbation $(0, \varepsilon)^T$, then it is expressed in this basis as $\mathbf{e}_1 - \mathbf{e}_2$. As we evolve this in time, the $\mathbf{e}_2$ component decays more quickly than $\mathbf{e}_1$. So after some time, the $\mathbf{e}_2$ term is mostly gone, and what is left is a finite multiple of $\mathbf{e}_1$, and generically, we expect this to have magnitude larger than $\varepsilon$!
We try to actually solve this. The second row of $\dot{\mathbf{x}} = A_\varepsilon \mathbf{x}$ gives us
\[ \dot x_2 = -2\varepsilon x_2. \]
This is easy to solve to get
\[ x_2 = x_{20}\, e^{-2\varepsilon t}. \]
Plugging this into the first equation, we have
\[ \dot x_1 = -\varepsilon x_1 + x_{20}\, e^{-2\varepsilon t}. \]
We would expect a solution of the form $x_1 = A e^{-\varepsilon t} + B e^{-2\varepsilon t}$, where the first term is the homogeneous solution and the second comes from a particular solution. Plugging this into the equation and applying our initial conditions, we find
\[ x_1 = \left( x_{10} + \frac{x_{20}}{\varepsilon} \right) e^{-\varepsilon t} - \frac{x_{20}}{\varepsilon}\, e^{-2\varepsilon t}. \]
Let us set
\[ y_{10} = x_{10} + \frac{x_{20}}{\varepsilon}, \qquad y_{20} = -\frac{x_{20}}{\varepsilon}. \]
Then the full solution is
\[ \mathbf{x} = y_{10}\, e^{-\varepsilon t}\, \mathbf{e}_1 + y_{20}\, e^{-2\varepsilon t}\, \mathbf{e}_2. \]
For $\varepsilon t \gg 1$, we know that $\mathbf{x} \approx y_{10}\, e^{-\varepsilon t}\, \mathbf{e}_1$. So our solution is eventually an exponentially decaying solution.
But how about early times? Let's consider the magnitude of $\mathbf{x}$. We have
\[ \|\mathbf{x}\|^2 = y_{10}^2\, e^{-2\varepsilon t}\, \mathbf{e}_1 \cdot \mathbf{e}_1 + 2 y_{10} y_{20}\, e^{-3\varepsilon t}\, \mathbf{e}_1 \cdot \mathbf{e}_2 + y_{20}^2\, e^{-4\varepsilon t}\, \mathbf{e}_2 \cdot \mathbf{e}_2 = y_{10}^2\, e^{-2\varepsilon t} + 2 y_{10} y_{20}\, e^{-3\varepsilon t} + (1 + \varepsilon^2)\, y_{20}^2\, e^{-4\varepsilon t}. \]
If y
10
= 0 or y
20
= 0, then this corresponds to pure exponential decay.
Thus, consider the situation where $y_{20} = -a y_{10}$ for $a \neq 0$. Doing some manipulations, we find that
\[ \|\mathbf{x}\|^2 = y_{10}^2\left( 1 - 2a + a^2(1 + \varepsilon^2) \right) + y_{10}^2\left( -2 + 6a - 4a^2(1 + \varepsilon^2) \right)\varepsilon t + O(\varepsilon^2 t^2). \]
Therefore we have initial growth if
\[ 4a^2(1 + \varepsilon^2) - 6a + 2 < 0. \]
Equivalently, if $a_- < a < a_+$ with
\[ a_\pm = \frac{3 \pm \sqrt{1 - 8\varepsilon^2}}{4(1 + \varepsilon^2)}. \]
Expanding in $\varepsilon$, we have
\[ a_+ = 1 - 2\varepsilon^2 + O(\varepsilon^4), \qquad a_- = \frac{1 + \varepsilon^2}{2} + O(\varepsilon^4). \]
These correspond to $x_{20} \simeq \frac{x_{10}}{2\varepsilon}$ and $x_{20} \simeq \varepsilon x_{10}$ respectively, for $\varepsilon \ll 1$. This is interesting, since in the first case, we have $x_{20} \gg x_{10}$, while in the second case, we have the opposite. So this covers a wide range of possible $x_{10}$, $x_{20}$.
What is the best initial condition to start with if we want the largest possible growth? Let's write everything in terms of $a$ for convenience, so that the energy is given by
\[ E = \frac{\mathbf{x} \cdot \mathbf{x}}{2} = \frac{y_{10}^2}{2}\left( e^{-2\varepsilon t} - 2a\, e^{-3\varepsilon t} + a^2(1 + \varepsilon^2)\, e^{-4\varepsilon t} \right). \]
Take the time derivative of this to get
\[ \frac{\mathrm{d}E}{\mathrm{d}t} = -\varepsilon\, \frac{y_{10}^2}{2}\, e^{-2\varepsilon t}\left( 2 - 6(a e^{-\varepsilon t}) + 4(1 + \varepsilon^2)(a e^{-\varepsilon t})^2 \right). \]
Setting $\hat a = a e^{-\varepsilon t}$, we have $\frac{\mathrm{d}E}{\mathrm{d}t} > 0$ iff
\[ 2 - 6\hat a + 4\hat a^2(1 + \varepsilon^2) < 0. \]
When $t = 0$, we have $\hat a = a$, and we saw that there is initial growth if $a_- < a < a_+$. We now see that we continue to have growth as long as $\hat a$ lies in this region. To see when $E$ reaches a maximum, we set $\frac{\mathrm{d}E}{\mathrm{d}t} = 0$, and so we have
\[ (a_- - a e^{-\varepsilon t})(a e^{-\varepsilon t} - a_+) = 0. \]
So a priori, this may happen when $\hat a = a_-$ or $a_+$. However, we know it must occur when $\hat a$ hits $a_-$, since $\hat a$ is decreasing with time. Call the time when this happens $t_{\max}$, which we compute to be given by
\[ \varepsilon t_{\max} = \log \frac{a}{a_-}. \]
Now consider the energy gain
\[ G = \frac{E(t_{\max})}{E(0)} = \frac{a_-^2}{a^2} \cdot \frac{a_-^2(1 + \varepsilon^2) - 2a_- + 1}{a^2(1 + \varepsilon^2) - 2a + 1}. \]
Setting $\frac{\mathrm{d}G}{\mathrm{d}a} = 0$, we find that we need $a = a_+$, and so
\[ \max_a G = \frac{(3a_- - 1)(1 - a_-)}{(3a_+ - 1)(1 - a_+)}. \]
We can try to explicitly compute the value of the maximum energy gain, using our first-order approximations to $a_\pm$:
\[ \max_a G = \frac{\left( \frac{3(1 + \varepsilon^2)}{2} - 1 \right)\left( 1 - \frac{1 + \varepsilon^2}{2} \right)}{\left( 3(1 - 2\varepsilon^2) - 1 \right)\left( 1 - (1 - 2\varepsilon^2) \right)} = \frac{(1 + 3\varepsilon^2)(1 - \varepsilon^2)}{16(1 - 3\varepsilon^2)\varepsilon^2} \approx \frac{1}{16\varepsilon^2}. \]
So we see that we can have very large transient growth for a small $\varepsilon$.
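This prediction is easy to test numerically. The following is a minimal sketch (assuming numpy/scipy; not part of the lectures) that evolves the toy system from the near-optimal initial condition $x_{20} \simeq x_{10}/(2\varepsilon)$ and compares the peak energy gain with $1/(16\varepsilon^2)$.

```python
import numpy as np
from scipy.linalg import expm

eps = 0.01
A = np.array([[-eps, 1.0],
              [0.0, -2.0*eps]])

# near-optimal initial condition from the analysis: x20 ~ x10 / (2 eps)
x0 = np.array([1.0, 1.0/(2.0*eps)])
x0 /= np.linalg.norm(x0)

ts = np.linspace(0.0, 5.0/eps, 2000)
gains = [np.linalg.norm(expm(A*t) @ x0)**2 for t in ts]

print("peak gain    :", max(gains))
print("1/(16 eps^2) :", 1.0/(16.0*eps**2))
```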
How about the case where there is an unstable mode? We can consider a perturbation of the form
\[ A = \begin{pmatrix} \varepsilon_1 & 1 \\ 0 & -\varepsilon_2 \end{pmatrix}, \]
with eigenvectors
\[ \mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \mathbf{e}_2 = \frac{1}{\sqrt{1 + (\varepsilon_1 + \varepsilon_2)^2}} \begin{pmatrix} 1 \\ -(\varepsilon_1 + \varepsilon_2) \end{pmatrix}. \]
We then have one growing mode and another decaying mode. Again, we have two eigenvectors that are very close to being parallel. We can do very similar computations, and see that what this gives us is the possibility of a large initial growth despite the fact that $\varepsilon_1$ is very small. In general, this growth scales as $\frac{1}{1 - \mathbf{e}_1 \cdot \mathbf{e}_2}$.
3.3 A general mathematical framework
Let’s try to put this phenomenon in a general framework, which would be
helpful since the vector spaces we deal with in fluid dynamics are not even
finite-dimensional. Suppose $\mathbf{x}$ evolves under an equation
\[ \dot{\mathbf{x}} = A\mathbf{x}. \]
Given an inner product $\langle \cdot, \cdot \rangle$ on our vector space, we can define the adjoint of $A$ by requiring
\[ \langle \mathbf{x}, A\mathbf{y} \rangle = \langle A^\dagger \mathbf{x}, \mathbf{y} \rangle \]
for all $\mathbf{x}, \mathbf{y}$. To see why we should care about the adjoint, note that in our previous example, the optimal perturbation came from an eigenvector of $A^\dagger$ for $\lambda_1 = \varepsilon_1$, namely $\begin{pmatrix} \varepsilon_1 + \varepsilon_2 \\ 1 \end{pmatrix}$, and we might conjecture that this is a general phenomenon.
For our purposes, we assume $A$ and hence $A^\dagger$ have a basis of eigenvectors. First of all, observe that the eigenvalues of $A^\dagger$ are the complex conjugates of the eigenvalues of $A$. Indeed, let $\mathbf{v}_1, \ldots, \mathbf{v}_n$ be a basis of eigenvectors of $A$ with eigenvalues $\lambda_1, \ldots, \lambda_n$, and $\mathbf{w}_1, \ldots, \mathbf{w}_n$ a basis of eigenvectors of $A^\dagger$ with eigenvalues $\mu_1, \ldots, \mu_n$. Then we have
\[ \lambda_j \langle \mathbf{w}_i, \mathbf{v}_j \rangle = \langle \mathbf{w}_i, A\mathbf{v}_j \rangle = \langle A^\dagger \mathbf{w}_i, \mathbf{v}_j \rangle = \bar\mu_i \langle \mathbf{w}_i, \mathbf{v}_j \rangle. \]
But since the inner product is non-degenerate, for each $i$, it cannot be that $\langle \mathbf{w}_i, \mathbf{v}_j \rangle = 0$ for all $j$. So there must be some $j$ such that $\lambda_j = \bar\mu_i$.
By picking an appropriate basis for each eigenspace, we can arrange the eigenvectors so that $\langle \mathbf{w}_i, \mathbf{v}_j \rangle = 0$ unless $i = j$, and $\|\mathbf{w}_i\| = \|\mathbf{v}_i\| = 1$. This is the biorthogonality property. Crucially, the basis is not orthonormal.
Now suppose we are given an initial condition $\mathbf{x}_0$, and we want to solve the equation $\dot{\mathbf{x}} = A\mathbf{x}$. Note that this is trivial if $\mathbf{x}_0 = \mathbf{v}_j$ for some $j$: the solution is simply
\[ \mathbf{x} = e^{\lambda_j t}\, \mathbf{v}_j. \]
Thus, for a general $\mathbf{x}_0$, we should express $\mathbf{x}_0$ as a linear combination of the $\mathbf{v}_j$'s. If we want to write
\[ \mathbf{x}_0 = \sum_{i=1}^{n} \alpha_i \mathbf{v}_i, \]
then using the biorthogonality condition, we should set
\[ \alpha_i = \frac{\langle \mathbf{x}_0, \mathbf{w}_i \rangle}{\langle \mathbf{v}_i, \mathbf{w}_i \rangle}. \]
Note that since we normalized our eigenvectors so that each eigenvector has
norm 1, the denominator
hv
i
, w
i
i
can be small, hence
α
i
can be large, even if
the norm of
x
0
is quite small, and as we have previously seen, this gives rise to
transient growth if the eigenvalue of
v
i
is also larger than the other eigenvalues.
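As a small, hypothetical sanity check, we can compute these expansion coefficients numerically for the toy matrix $A_\varepsilon$ from the previous section: the coefficients come out of order one even though the perturbation itself has size $\varepsilon$.

```python
import numpy as np

eps = 0.01
A = np.array([[-eps, 1.0], [0.0, -2.0*eps]])

lam, V = np.linalg.eig(A)            # right eigenvectors v_i (columns)
mu, W = np.linalg.eig(A.conj().T)    # eigenvectors w_i of the adjoint
order = [int(np.argmin(np.abs(mu.conj() - l))) for l in lam]
W = W[:, order]                      # pair w_i with v_i by eigenvalue

x0 = np.array([0.0, eps])            # tiny perturbation, |x0| = eps
alpha = [np.vdot(W[:, i], x0) / np.vdot(W[:, i], V[:, i]) for i in range(2)]
print(np.abs(alpha))                 # coefficients of order one
```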
In our toy example, our condition for the existence of transient growth is that
we have two eigenvectors that are very close to each other. In this formulation,
the requirement is that
hv
i
, w
i
i
is very small, i.e.
v
i
and
w
i
are close to being
orthogonal. But these are essentially the same, since by the biorthogonality
conditions,
w
i
is normal to all other eigenvectors of
A
. So if there is some
eigenvector of
A
that is very close to
v
i
, then
w
i
must be very close to being
orthogonal to v
i
.
Now assuming we have transient growth, the natural question to ask is how large this growth is. We can write the solution to the initial value problem as
\[ \mathbf{x}(t) = e^{At}\, \mathbf{x}_0. \]
The maximum gain at time $t$ is given by
\[ G(t) = \max_{\mathbf{x}_0 \neq 0} \frac{\|\mathbf{x}(t)\|^2}{\|\mathbf{x}(0)\|^2} = \max_{\mathbf{x}_0 \neq 0} \frac{\|e^{At}\mathbf{x}_0\|^2}{\|\mathbf{x}_0\|^2}. \]
This is, by definition, the square of the matrix norm of $e^{At}$.
Definition (Matrix norm). Let $B$ be an $n \times n$ matrix. Then the matrix norm is
\[ \|B\| = \max_{\mathbf{v} \neq 0} \frac{\|B\mathbf{v}\|}{\|\mathbf{v}\|}. \]
To understand the matrix norm, we may consider the eigenvalues of the matrix. Order the eigenvalues of $A$ by their real parts, so that $\operatorname{Re}(\lambda_1) \geq \cdots \geq \operatorname{Re}(\lambda_n)$. Then the gain is clearly bounded below by
\[ G(t) \geq e^{2 \operatorname{Re}(\lambda_1) t}, \]
achieved by the associated eigenvector.
If $A$ is normal, then this is the complete answer. We know the eigenvectors form an orthonormal basis, so we can write $A = V \Lambda V^{-1}$ for $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ and $V$ unitary. Then we have
\[ G(t) = \|e^{At}\|^2 = \|V e^{\Lambda t} V^{-1}\|^2 = \|e^{\Lambda t}\|^2 = e^{2 \operatorname{Re}(\lambda_1) t}. \]
But life gets enormously more complicated when matrices are non-normal. As a simple example, the matrix $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ has $1$ as the unique eigenvalue, but applying it to $(0, 1)^T$ results in a vector of length $\sqrt{2}$.
In the non-normal case, it would still be convenient to be able to diagonalize $e^{At}$ in some sense, so that we can read off its norm. To do so, we must relax what we mean by diagonalizing. Instead of finding a $U$ such that $U^\dagger e^{At} U$ is diagonal, we find unitary matrices $U, V$ such that
\[ U^\dagger e^{At} V = \Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_n), \]
where $\sigma_i \in \mathbb{R}$ and $\sigma_1 \geq \cdots \geq \sigma_n \geq 0$. We can always do this. This is known as the singular value decomposition, and the diagonal entries $\sigma_i$ are called the singular values. We then have
\[ G(t) = \|e^{At}\|^2 = \max_{\mathbf{x} \neq 0} \frac{(e^{At}\mathbf{x}, e^{At}\mathbf{x})}{(\mathbf{x}, \mathbf{x})} = \max_{\mathbf{x} \neq 0} \frac{(U\Sigma V^\dagger \mathbf{x}, U\Sigma V^\dagger \mathbf{x})}{(\mathbf{x}, \mathbf{x})} = \max_{\mathbf{x} \neq 0} \frac{(\Sigma V^\dagger \mathbf{x}, \Sigma V^\dagger \mathbf{x})}{(\mathbf{x}, \mathbf{x})} = \max_{\mathbf{y} \neq 0} \frac{(\Sigma \mathbf{y}, \Sigma \mathbf{y})}{(\mathbf{y}, \mathbf{y})} = \sigma_1^2(t). \]
If we have an explicit singular value decomposition, then this tells us the optimal initial condition if we want to maximize $G(t)$, namely the first column of $V$.
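For a finite-dimensional system this recipe is one line of linear algebra. The sketch below (hypothetical, reusing the toy matrix $A_\varepsilon$) reads off $G(t)$ and the optimal initial condition from the SVD of the propagator.

```python
import numpy as np
from scipy.linalg import expm, svd

eps, t = 0.01, 69.0                  # t chosen near the analytic t_max
A = np.array([[-eps, 1.0], [0.0, -2.0*eps]])

U, s, Vh = svd(expm(A*t))            # e^{At} = U diag(s) V^dagger
print("G(t)       =", s[0]**2)       # maximum energy gain at time t
print("optimal x0 =", Vh[0])         # first column of V (first row of V^dagger)
```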
3.4 Orr-Sommerfeld and Squire equations
Let’s now see how this is relevant to our fluid dynamics problems. For this
chapter, we will use the “engineering” coordinate system, so that the
y
direction
is the vertical direction. The
x, y, z
directions are known as the streamwise
direction, wall-normal direction and spanwise direction respectively.
Again suppose we have a base shear flow $U(y)$ subject to some small perturbations $(u, v, w)$. We can write down our equations as
\begin{align*}
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} &= 0 \\
\frac{\partial u}{\partial t} + U \frac{\partial u}{\partial x} + v U' &= -\frac{\partial p}{\partial x} + \frac{1}{\mathrm{Re}} \nabla^2 u \\
\frac{\partial v}{\partial t} + U \frac{\partial v}{\partial x} &= -\frac{\partial p}{\partial y} + \frac{1}{\mathrm{Re}} \nabla^2 v \\
\frac{\partial w}{\partial t} + U \frac{\partial w}{\partial x} &= -\frac{\partial p}{\partial z} + \frac{1}{\mathrm{Re}} \nabla^2 w.
\end{align*}
Again our strategy is to reduce these to a single, higher order equation in $v$. To get rid of the pressure term, we differentiate the second, third and fourth equations with respect to $x$, $y$ and $z$ respectively, add them, and apply incompressibility to obtain
\[ \nabla^2 p = -2 U' \frac{\partial v}{\partial x}. \]
By applying $\nabla^2$ to the third equation, we get an equation for the wall-normal velocity:
\[ \left[ \left( \frac{\partial}{\partial t} + U \frac{\partial}{\partial x} \right) \nabla^2 - U'' \frac{\partial}{\partial x} - \frac{1}{\mathrm{Re}} \nabla^4 \right] v = 0. \]
We would like to impose the boundary conditions $v = \frac{\partial v}{\partial y} = 0$, but together with initial conditions, this is not enough to specify the solution uniquely, as we have a fourth order equation. Thus, we shall require the vorticity to vanish at the boundary as well. The wall-normal vorticity is defined by
\[ \eta = \omega_y = \frac{\partial u}{\partial z} - \frac{\partial w}{\partial x}. \]
By taking $\frac{\partial}{\partial z}$ of the second equation and subtracting $\frac{\partial}{\partial x}$ of the last, we then get the equation
\[ \left[ \frac{\partial}{\partial t} + U \frac{\partial}{\partial x} - \frac{1}{\mathrm{Re}} \nabla^2 \right] \eta = -U' \frac{\partial v}{\partial z}. \]
As before, we decompose our perturbations into Fourier modes:
\[ v(x, y, z, t) = \hat v(y)\, e^{i(\alpha x + \beta z - \omega t)}, \qquad \eta(x, y, z, t) = \hat\eta(y)\, e^{i(\alpha x + \beta z - \omega t)}. \]
For convenience, we set $k^2 = \alpha^2 + \beta^2$ and write $D$ for the $y$ derivative. We then obtain the Orr–Sommerfeld equation and Squire equation with boundary conditions $\hat v = D\hat v = \hat\eta = 0$:
\begin{align*}
\left[ (-i\omega + i\alpha U)(D^2 - k^2) - i\alpha U'' - \frac{1}{\mathrm{Re}}(D^2 - k^2)^2 \right] \hat v &= 0 \\
\left[ (-i\omega + i\alpha U) - \frac{1}{\mathrm{Re}}(D^2 - k^2) \right] \hat\eta &= -i\beta U' \hat v.
\end{align*}
Note that if we set $\mathrm{Re} = \infty$, then the first equation reduces to the Rayleigh equation.
Let’s think a bit more about the Squire equation. Notice that there is an
explicit
ˆv
term on the right. Thus, the equation is forced by the wall-normal
velocity. In general, we can distinguish between two classes of modes:
(i)
Squire modes, which are solutions to the homogeneous problem with
ˆv
= 0;
(ii) Orr-Sommerfeld modes, which are particular integrals for the actual ˆv;
The Squire modes are always damped. Indeed, set $\hat v = 0$, write $\omega = \alpha c$, multiply the Squire equation by $\hat\eta^*$ and integrate across the domain:
\[ c \int_{-1}^{1} \hat\eta^* \hat\eta\,\mathrm{d}y = \int_{-1}^{1} U \hat\eta^* \hat\eta\,\mathrm{d}y - \frac{i}{\alpha \mathrm{Re}} \int_{-1}^{1} \hat\eta^* (k^2 - D^2)\hat\eta\,\mathrm{d}y. \]
Taking the imaginary part and integrating by parts, we obtain
\[ c_i \int_{-1}^{1} |\hat\eta|^2\,\mathrm{d}y = -\frac{1}{\alpha \mathrm{Re}} \int_{-1}^{1} \left( |D\hat\eta|^2 + k^2 |\hat\eta|^2 \right)\mathrm{d}y < 0. \]
So we see that the Squire modes are always stable, and instability in vorticity
comes from the forcing due to the velocity.
It is convenient to express the various operators in compact vector form. Define
\[ \hat{\mathbf{q}} = \begin{pmatrix} \hat v \\ \hat\eta \end{pmatrix}, \qquad M = \begin{pmatrix} k^2 - D^2 & 0 \\ 0 & 1 \end{pmatrix}, \qquad L = \begin{pmatrix} L_{OS} & 0 \\ i\beta U' & L_{SQ} \end{pmatrix}, \]
where
\begin{align*}
L_{OS} &= i\alpha U(k^2 - D^2) + i\alpha U'' + \frac{1}{\mathrm{Re}}(k^2 - D^2)^2 \\
L_{SQ} &= i\alpha U + \frac{1}{\mathrm{Re}}(k^2 - D^2).
\end{align*}
We can then write our equations as
\[ L \hat{\mathbf{q}} = i\omega M \hat{\mathbf{q}}. \]
This form of equation reminds us of what we saw in Sturm–Liouville theory, where we had an eigenvalue equation of the form $Lu = \lambda w u$ for some weight function $w$. Here $M$ is not just a weight function, but a differential operator. However, the principle is the same. First of all, this tells us the correct inner product to use on our space of functions is
\[ \langle \mathbf{p}, \mathbf{q} \rangle = \int \mathbf{p}^\dagger M \mathbf{q}\,\mathrm{d}y. \tag{$\dagger$} \]
We can make the above equation look like an actual eigenvalue problem by writing it as
\[ M^{-1} L \hat{\mathbf{q}} = i\omega\, \hat{\mathbf{q}}. \]
We want to figure out if the operator $M^{-1}L$ is self-adjoint under ($\dagger$), because this tells us something about its eigenvalues and eigenfunctions. So in particular, we should be able to figure out what the adjoint should be. By definition, we have
\[ \langle \mathbf{p}, M^{-1}L \mathbf{q} \rangle = \int \mathbf{p}^\dagger M M^{-1} L \mathbf{q}\,\mathrm{d}y = \int \mathbf{p}^\dagger L \mathbf{q}\,\mathrm{d}y = \int \left( \mathbf{q}^\dagger (L^\dagger \mathbf{p}) \right)^*\,\mathrm{d}y = \int \left( \mathbf{q}^\dagger M (M^{-1} L^\dagger \mathbf{p}) \right)^*\,\mathrm{d}y, \]
where $L^\dagger$ is the adjoint of $L$ under the $L^2$ norm. So the adjoint eigenvalue equation is
\[ L^\dagger \tilde{\mathbf{q}} = -i\omega M \tilde{\mathbf{q}}. \]
Here we introduced a negative sign on the right hand side, which morally comes from the fact that we "took the conjugate" of $i$. Practically, adopting this sign convention makes, for example, the statement of biorthogonality cleaner.
So mathematically, we know we should take the inner product as $\int \mathbf{p}^\dagger M \mathbf{q}\,\mathrm{d}y$. Physically, what does this mean? From incompressibility
\[ \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0, \]
plugging in our series expansions of $v$, $\eta$ etc. gives us
\[ i\alpha \hat u + i\beta \hat w = -D\hat v, \qquad i\beta \hat u - i\alpha \hat w = \hat\eta. \]
So we find that
\[ \hat u = \frac{i}{k^2}(\alpha D\hat v - \beta \hat\eta), \qquad \hat w = \frac{i}{k^2}(\beta D\hat v + \alpha \hat\eta). \]
Thus, we have
\[ \frac{1}{2}\left( |\hat u|^2 + |\hat w|^2 \right) = \frac{1}{2k^2}\left( |D\hat v|^2 + |\hat\eta|^2 \right), \]
and the total energy is
\[ E = \int \frac{1}{2k^2}\left( |D\hat v|^2 + k^2 |\hat v|^2 + |\hat\eta|^2 \right)\mathrm{d}y = \frac{1}{2k^2} \int_{-1}^{1} \begin{pmatrix} \hat v^* & \hat\eta^* \end{pmatrix} \begin{pmatrix} k^2 - D^2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \hat v \\ \hat\eta \end{pmatrix} \mathrm{d}y = \frac{1}{2k^2} \langle \mathbf{q}, \mathbf{q} \rangle. \]
So the inner product the mathematics told us to use is in fact, up to a constant scaling, the energy!
We can now try to discretize this, do an SVD (numerically), and find out what the growth rates are like. However, in the next chapter, we shall see that there is a better way of doing so. Thus, in the remainder of this chapter, we shall look at ways in which transient growth can physically manifest itself.
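For completeness, here is a hedged sketch of the "discretize and take an SVD" recipe. It assumes discretized operators are already available (e.g. from a Chebyshev collocation code, which is not shown), that $A$ generates the time evolution $\mathrm{d}\mathbf{q}/\mathrm{d}t = A\mathbf{q}$, and that the Hermitian positive definite matrix $M$ encodes the energy inner product.

```python
import numpy as np
from scipy.linalg import expm, svd, cholesky

def max_gain(A, M, t):
    """G(t) for dq/dt = A q, with energy norm <q, q>_E = q^H M q (M Hermitian > 0)."""
    F = cholesky(M)                              # M = F^H F
    P = F @ expm(A * t) @ np.linalg.inv(F)       # propagator in energy coordinates
    return svd(P, compute_uv=False)[0] ** 2      # largest singular value squared
```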
Orr’s mechanisms
Suppose we have a simple flow profile that looks like this:

[Figure: a simple shear profile $U(y)$.]

Recall that the $z$-direction vorticity is given by
\[ \omega_3 = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}, \]
and evolves as
\[ \frac{\partial \omega_3}{\partial t} + U \frac{\partial \omega_3}{\partial x} = U'\left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right) + v U'' + \frac{1}{\mathrm{Re}} \nabla^2 \omega_3. \]
Assuming constant shear and $\beta = 0$ at high Reynolds number, we have
\[ \frac{\mathrm{D} \omega_3}{\mathrm{D} t} \simeq 0. \]
Suppose we have some striped patch of vorticity, tilted against the shear. In this stripe, we always have $\omega_3 > 0$. Now over time, due to the shear flow, this evolves into a stripe that is shorter and more upright.

Now by incompressibility, the area of this new region is the same as the area of the old. We also argued that $\omega_3$ does not change. So the total quantity $\int_D \omega_3\,\mathrm{d}A = \int_D \nabla \times \mathbf{u} \cdot \mathrm{d}\mathbf{A}$ does not change. But Stokes' theorem says
\[ \int_D \nabla \times \mathbf{u} \cdot \mathrm{d}\mathbf{A} = \oint_{\partial D} \mathbf{u} \cdot \mathrm{d}\boldsymbol{\ell}. \]
Since the left-hand side didn't change, the same must be true for the right hand side. But since the boundary $\partial D$ decreased in length, this implies $\mathbf{u}$ must have increased in magnitude! This growth is only transient, since after some further time, the vorticity patch gets sheared out into a long thin stripe tilted with the shear, and the perimeter grows again.

Thus, if we have vortices that are initially tilted into the shear, then they get advected by the mean shear. In this process, the perimeter of each vortex patch first gets smaller, then grows again. Since $\oint_{\partial D} \mathbf{u} \cdot \mathrm{d}\boldsymbol{\ell}$ is constant, we know $\mathbf{u}$ grows transiently, and then vanishes again.
Lift up
Another mechanism is lift-up. This involves doing some mathematics. Suppose we instead expand our solutions as
\[ v = \tilde v(y, t)\, e^{i(\alpha x + \beta z)}, \qquad \eta = \tilde\eta(y, t)\, e^{i(\alpha x + \beta z)}. \]
Focusing on the $\alpha = 0$ and $\mathrm{Re} = \infty$ case, the Orr–Sommerfeld and Squire equations are
\[ \frac{\partial}{\partial t} \tilde\eta(y, t) = -i\beta U' \tilde v, \qquad \frac{\partial}{\partial t}(D^2 - k^2)\tilde v = 0. \]
Since we have finite depth, except for a few specific values of $k$, the only solution to the second equation is $\tilde v(y, t) = \tilde v_0(y)$, and then
\[ \tilde\eta = \tilde\eta_0 - i\beta U' \tilde v_0\, t. \]
This is an algebraic instability. The constant $\tilde v(y, t)$ means fluid is constantly being lifted up, and we can attribute this to the effect of the streamwise vorticity $\omega_1$ of streamwise rolls.
We should be a bit careful when we consider the case where the disturbance is localized. In this case, we should consider quantities integrated over all $x$. We use a bar to denote this integral, so that, for example, $\bar v = \int_{-\infty}^{\infty} v\,\mathrm{d}x$. Of course, this makes sense only if the disturbance is local, so that the integral converges. Ultimately, we want to understand the growth in the energy, but it is convenient to first understand $\bar v$, $\bar u$, and the long-forgotten $\bar p$. We have three equations
\begin{align*}
\nabla^2 p &= -2U' \frac{\partial v}{\partial x} \\
\frac{\partial v}{\partial t} + U \frac{\partial v}{\partial x} &= -\frac{\partial p}{\partial y} \\
\frac{\partial u}{\partial t} + U \frac{\partial u}{\partial x} + vU' &= -\frac{\partial p}{\partial x}.
\end{align*}
Note that $U$ does not depend on $x$, and all our (small lettered) variables vanish at infinity since the disturbance is local. Thus, integrating the first equation over all $x$, we get $\nabla^2 \bar p = 0$. So $\bar p = 0$. Integrating the second equation then tells us $\frac{\partial \bar v}{\partial t} = 0$. Finally, plugging this into the integral of the last equation tells us
\[ \bar u = \bar u_0 - \bar v_0 U'\, t. \]
Thus, $\bar u$ grows linearly with time. However, this does not immediately imply that the energy is growing, since the domain is growing as well, and the velocity may be spread over a larger region of space.

Let's suppose $u(x, 0) = 0$ for $|x| > \delta$. Then at time $t$, we would expect $u$ to be non-zero only for $U_{\min} t - \delta < x < U_{\max} t + \delta$. Recall that Cauchy–Schwarz says
\[ \left( \int_{-\infty}^{\infty} f(x) g(x)\,\mathrm{d}x \right)^2 \leq \left( \int_{-\infty}^{\infty} |f(x)|^2\,\mathrm{d}x \right)\left( \int_{-\infty}^{\infty} |g(x)|^2\,\mathrm{d}x \right). \]
Here we can take $f = u$, and
\[ g(x) = \begin{cases} 1 & U_{\min} t - \delta < x < U_{\max} t + \delta \\ 0 & \text{otherwise} \end{cases}. \]
Then applying Cauchy–Schwarz gives us
\[ \bar u^2 \leq \left[ \Delta U\, t + 2\delta \right] \int_{-\infty}^{\infty} u^2\,\mathrm{d}x. \]
So we can bound
\[ E \geq \frac{[\bar u]^2}{2(\Delta U\, t + 2\delta)} \approx \frac{[\bar v_0 U']^2\, t}{2\Delta U} \]
provided $t \gg \frac{2\delta}{\Delta U}$. Therefore the energy grows at least as fast as $t$, but not necessarily as fast as $t^2$.
4 A variational point of view
In this chapter, we are going to learn about a more robust and powerful way
to approach and understand transient growth. Instead of trying to think about
G(t) as a function of t, let us fix some target time T , and just look at G(T ).
For any two times $t_i < t_f$, we can define a propagator function such that
\[ \mathbf{u}(t_f) = \Phi(t_f, t_i)\, \mathbf{u}(t_i) \]
for any solution $\mathbf{u}$ of our equations. In the linear approximation, this propagator is a linear map between appropriate function spaces. Writing $\Phi = \Phi(T, T_0)$, the gain problem is then equivalent to maximizing
\[ G(T, T_0) = \frac{E(T)}{E(T_0)} = \frac{\langle \mathbf{u}_p(T), \mathbf{u}_p(T) \rangle}{\langle \mathbf{u}_p(T_0), \mathbf{u}_p(T_0) \rangle} = \frac{\langle \Phi \mathbf{u}_p(T_0), \Phi \mathbf{u}_p(T_0) \rangle}{\langle \mathbf{u}_p(T_0), \mathbf{u}_p(T_0) \rangle} = \frac{\langle \mathbf{u}_p(T_0), \Phi^\dagger \Phi\, \mathbf{u}_p(T_0) \rangle}{\langle \mathbf{u}_p(T_0), \mathbf{u}_p(T_0) \rangle}. \]
Here the angled brackets denote the natural inner product leading to the energy norm. Note that the operator $\Phi^\dagger \Phi$ is necessarily self-adjoint, and so this $G$ is maximized when $\mathbf{u}_p(T_0)$ is chosen to be the eigenvector of $\Phi^\dagger \Phi$ of maximum eigenvalue.
There is a general method to find the maximum eigenvector of a (self-adjoint) operator such as $\Phi^\dagger \Phi$. We start with a random vector $\mathbf{x}$. Then $(\Phi^\dagger \Phi)^n \mathbf{x} \approx \lambda_1^n \mathbf{v}_1$ as $n \to \infty$, where $\lambda_1$ is the maximum eigenvalue with associated eigenvector $\mathbf{v}_1$. Indeed, if we write $\mathbf{x}$ as a linear combination of eigenvectors, then as we apply $\Phi^\dagger \Phi$ many times, the sum is dominated by the term with the largest eigenvalue.
So if we want to find the mode with maximal transient growth, we only need to be able to compute $\Phi^\dagger \Phi$. The forward propagator $\Phi(T, T_0)$ is something we know how to compute (at least numerically): we simply numerically integrate the Navier–Stokes equations. So we need to understand $\Phi(T, T_0)^\dagger$.
Let $\mathbf{u}(t)$ be a solution to the linear equation
\[ \frac{\mathrm{d}\mathbf{u}}{\mathrm{d}t}(t) = L(t)\, \mathbf{u}(t). \]
Here we may allow $L$ to depend on time. Let $L^\dagger$ be the adjoint, and suppose $\mathbf{v}(t)$ satisfies
\[ \frac{\mathrm{d}\mathbf{v}}{\mathrm{d}t}(t) = -L(t)^\dagger\, \mathbf{v}(t). \]
Then the chain rule tells us
\[ \frac{\mathrm{d}}{\mathrm{d}t} \langle \mathbf{v}(t), \mathbf{u}(t) \rangle = \langle -L(t)^\dagger \mathbf{v}(t), \mathbf{u}(t) \rangle + \langle \mathbf{v}(t), L(t)\mathbf{u}(t) \rangle = 0. \]
So we know that
\[ \langle \mathbf{v}(T_0), \mathbf{u}(T_0) \rangle = \langle \mathbf{v}(T), \mathbf{u}(T) \rangle = \langle \mathbf{v}(T), \Phi(T, T_0)\mathbf{u}(T_0) \rangle. \]
Thus, given a $\mathbf{v}_0$, to compute $\Phi(T, T_0)^\dagger \mathbf{v}_0$, we have to integrate the adjoint equation $\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}t} = -L(t)^\dagger \mathbf{v}$ backwards in time, using the "initial" condition $\mathbf{v}(T) = \mathbf{v}_0$, and then we have
\[ \Phi(T, T_0)^\dagger \mathbf{v}_0 = \mathbf{v}(T_0). \]
What does the adjoint equation look like? For a time-dependent background shear flow, the linearized forward/direct equation for a perturbation $\mathbf{u}_p$ is given by
\[ \frac{\partial \mathbf{u}_p}{\partial t} + (\mathbf{U}(t) \cdot \nabla)\mathbf{u}_p = -\nabla p_p - (\mathbf{u}_p \cdot \nabla)\mathbf{U}(t) + \mathrm{Re}^{-1} \nabla^2 \mathbf{u}_p. \]
The adjoint (linearized) Navier–Stokes equation is then
\[ \frac{\partial \mathbf{u}_a}{\partial t} = -\boldsymbol{\Omega} \times \mathbf{u}_a - \nabla \times (\mathbf{U} \times \mathbf{u}_a) - \nabla p_a - \mathrm{Re}^{-1} \nabla^2 \mathbf{u}_a, \]
where we again have
\[ \nabla \cdot \mathbf{u}_a = 0, \qquad \boldsymbol{\Omega} = \nabla \times \mathbf{U}. \]
This PDE is ill-posed if we wanted to integrate it forwards in time, but that does not concern us, because to compute $\Phi^\dagger$, we have to integrate it backwards in time.
Thus, to find the maximal transient mode, we have to run the direct-adjoint loop: starting from a guess $\mathbf{u}_p(T_0)$, we apply $\Phi$ (integrate forwards) to obtain $\mathbf{u}_p(T)$, then apply $\Phi^\dagger$ (integrate the adjoint backwards) to obtain $\Phi^\dagger\Phi\,\mathbf{u}_p(T_0)$, which we normalize and use as the new guess, and we keep running this until it converges.
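A minimal sketch of this loop, with the forward and adjoint solvers abstracted away as callables (in practice they would wrap a direct numerical simulation; the Euclidean norm stands in for the energy norm), might look as follows.

```python
import numpy as np

def optimal_perturbation(forward, adjoint, q0, n_iter=50, tol=1e-8):
    """forward(q): integrate the direct equations from T0 to T (apply Phi);
       adjoint(q): integrate the adjoint equations backwards (apply Phi^dagger)."""
    q = q0 / np.linalg.norm(q0)
    gain = 0.0
    for _ in range(n_iter):
        qT = forward(q)                          # Phi q
        new_gain = np.linalg.norm(qT)**2         # E(T)/E(T0) for the current guess
        q_new = adjoint(qT)                      # Phi^dagger Phi q
        q = q_new / np.linalg.norm(q_new)
        if abs(new_gain - gain) < tol * max(new_gain, 1.0):
            break
        gain = new_gain
    return gain, q
```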
Using this analysis, we can find some interesting results. For example, in a shear layer flow, 3-dimensional modes with both streamwise and spanwise perturbations are the most unstable. However, in the long run, the Kelvin–Helmholtz modes dominate.
Variational formulation
We can also use variational calculus to find the maximally unstable mode. We think of the calculation as a constrained optimization problem with the following requirements:
(i) For all $T_0 \leq t \leq T$, we have $\frac{\partial \mathbf{q}}{\partial t} = D_t \mathbf{q} = L\mathbf{q}$.
(ii) The initial state is given by $\mathbf{q}(0) = \mathbf{q}_0$.
We will need Lagrange multipliers to impose these constraints, and so the augmented Lagrangian is
\[ G = \frac{\langle \mathbf{q}_T, \mathbf{q}_T \rangle}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle} - \int_0^T \langle \tilde{\mathbf{q}}, (D_t - L)\mathbf{q} \rangle\,\mathrm{d}t + \langle \tilde{\mathbf{q}}_0, \mathbf{q}(0) - \mathbf{q}_0 \rangle. \]
Taking variations with respect to $\tilde{\mathbf{q}}$ and $\tilde{\mathbf{q}}_0$ recovers the evolution equation and the initial condition. Integrating by parts, the integral can be written as
\[ \int_0^T \langle \tilde{\mathbf{q}}, (D_t - L)\mathbf{q} \rangle\,\mathrm{d}t = -\int_0^T \langle \mathbf{q}, (D_t + L^\dagger)\tilde{\mathbf{q}} \rangle\,\mathrm{d}t - \langle \tilde{\mathbf{q}}_0, \mathbf{q}_0 \rangle + \langle \tilde{\mathbf{q}}_T, \mathbf{q}_T \rangle. \]
Now if we take a variation with respect to $\mathbf{q}$, we see that $\tilde{\mathbf{q}}$ has to satisfy
\[ (D_t + L^\dagger)\tilde{\mathbf{q}} = 0. \]
So the Lagrange multiplier within such a variational problem evolves according to the adjoint equation!
Taking appropriate adjoints, we can write
\[ G = \frac{\langle \mathbf{q}_T, \mathbf{q}_T \rangle}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle} + \int_0^T \langle \mathbf{q}, (D_t + L^\dagger)\tilde{\mathbf{q}} \rangle\,\mathrm{d}t + \langle \tilde{\mathbf{q}}_0, \mathbf{q}_0 \rangle - \langle \tilde{\mathbf{q}}_T, \mathbf{q}_T \rangle + \text{boundary terms}. \]
But if we take variations with respect to our initial conditions, then $\frac{\delta G}{\delta \mathbf{q}_0} = 0$ gives
\[ \tilde{\mathbf{q}}_0 = \frac{2\langle \mathbf{q}_T, \mathbf{q}_T \rangle}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle^2}\, \mathbf{q}_0. \]
Similarly, setting $\frac{\delta G}{\delta \mathbf{q}_T} = 0$, we get
\[ \tilde{\mathbf{q}}_T = \frac{2}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle}\, \mathbf{q}_T. \]
Applying $\langle \cdot, \mathbf{q}_0 \rangle$ to the first equation and $\langle \cdot, \mathbf{q}_T \rangle$ to the second, we find that
\[ \langle \tilde{\mathbf{q}}_0, \mathbf{q}_0 \rangle = \langle \tilde{\mathbf{q}}_T, \mathbf{q}_T \rangle. \]
This is the relation we previously saw for the adjoint propagator, and it provides a consistency condition to check whether we have found the optimal solution.
Previously our algorithm used power iteration, which requires us to integrate forwards and backwards many times. However, we now have gradient information for the gain:
\[ \frac{\delta G}{\delta \mathbf{q}_0} = \tilde{\mathbf{q}}_0 - \frac{2\langle \mathbf{q}_T, \mathbf{q}_T \rangle}{\langle \mathbf{q}_0, \mathbf{q}_0 \rangle^2}\, \mathbf{q}_0. \]
This allows us to exploit optimization algorithms (steepest descent, conjugate gradient, etc.), which gives an opportunity for faster convergence.
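Schematically, a gradient-based version of the loop might look like the following sketch (hypothetical; fixed step size, no line search, and the Euclidean inner product standing in for the energy inner product).

```python
import numpy as np

def inner(a, b):
    return float(np.real(np.vdot(a, b)))

def gradient_ascent_gain(forward, adjoint, q0, step=0.1, n_iter=100):
    q0 = q0 / np.sqrt(inner(q0, q0))
    gain = 0.0
    for _ in range(n_iter):
        qT = forward(q0)                                      # Phi q0
        gain = inner(qT, qT) / inner(q0, q0)
        q_tildeT = (2.0 / inner(q0, q0)) * qT                 # terminal condition for q~
        q_tilde0 = adjoint(q_tildeT)                          # adjoint integration backwards
        grad = q_tilde0 - (2.0 * inner(qT, qT) / inner(q0, q0)**2) * q0
        q0 = q0 + step * grad                                 # steepest ascent on the gain
        q0 = q0 / np.sqrt(inner(q0, q0))                      # keep the initial energy fixed
    return gain, q0
```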
There are many ways we can modify this. One way is to modify the inner product. Note that we actually had to use the inner product on three occasions:
(i) the inner product in the objective functional $J$;
(ii) the inner product in the initial normalization/constraint of the state vector;
(iii) the inner product in the definition of the adjoint operator.
We used the same energy norm for all of these, but there is no reason we have to use the same norm for all of them. However, there are strong arguments that an energy norm is natural for (ii). It is a wide open research question as to whether there is an appropriate choice for (iii). On the other hand, variation of inner product (i) has been widely explored.
As an example, consider $p$-norms
\[ J = \left( \frac{1}{V} \int e(\mathbf{x}, T)^p\,\mathrm{d}\Omega \right)^{1/p}, \qquad e(\mathbf{x}, T) = \frac{1}{2}|\mathbf{u}(\mathbf{x}, T)|^2. \]
This has the attraction that for large values of $p$, this is dominated by peak values of $e$, not the average.

When we set $p = 1$, i.e. we use the usual energy norm, then we get a beautiful example of the Orr mechanism, in perfect agreement with what the SVD tells us. For, say, $p = 50$, we get more exotic centre/wall modes.
Non-linear adjoints
In the variational formulation, there is nothing in the world that stops us from
using a non-linear evolution equation! We shall see that this results in slightly
less pleasant formulas, but it can still be done.
Consider plane Couette flow with $\mathrm{Re} = 1000$, and allow arbitrary amplitude in the perturbation:
\[ \mathbf{u}_{\mathrm{tot}} = \mathbf{U} + \mathbf{u}, \qquad \partial_t \mathbf{u} + (\mathbf{u} + \mathbf{U}) \cdot \nabla (\mathbf{u} + \mathbf{U}) = -\nabla p + \mathrm{Re}^{-1} \nabla^2 \mathbf{u}, \]
where $\mathbf{U} = y\hat{\mathbf{x}}$. We can define a variational problem with Lagrangian
\[ \mathcal{L} = \frac{E(T)}{E_0} - [\partial_t \mathbf{u} + N(\mathbf{u}) + \nabla p, \mathbf{v}] - [\nabla \cdot \mathbf{u}, q] - \left( \tfrac{1}{2}\langle \mathbf{u}_0, \mathbf{u}_0 \rangle - E_0 \right)c - \langle \mathbf{u}(0) - \mathbf{u}_0, \mathbf{v}_0 \rangle, \]
where
\[ N(\mathbf{u})_i = U_j \partial_j u_i + u_j \partial_j U_i + u_j \partial_j u_i - \frac{1}{\mathrm{Re}} \partial_j \partial_j u_i, \qquad [\mathbf{v}, \mathbf{u}] = \frac{1}{T} \int_0^T \langle \mathbf{v}, \mathbf{u} \rangle\,\mathrm{d}t. \]
Variations with respect to the direct variable $\mathbf{u}$ can once again define a non-linear adjoint equation
\[ \frac{\delta \mathcal{L}}{\delta \mathbf{u}} = \partial_t \mathbf{v} + N^\dagger(\mathbf{v}, \mathbf{u}) + \nabla q + \left( \frac{\mathbf{u}}{E_0} - \mathbf{v} \right)\Big|_{t = T} + (\mathbf{v} - \mathbf{v}_0)\Big|_{t = 0} = 0, \]
where
\[ N^\dagger(\mathbf{v}, \mathbf{u})_i = \partial_j(u_j v_i) - v_j \partial_i u_j + \partial_j(U_j v_i) - v_j \partial_i U_j + \frac{1}{\mathrm{Re}} \partial_j \partial_j v_i. \]
We also have
\[ \frac{\delta \mathcal{L}}{\delta p} = \nabla \cdot \mathbf{v} = 0, \qquad \frac{\delta \mathcal{L}}{\delta \mathbf{u}_0} = \mathbf{v}_0 - c\mathbf{u}_0 = 0. \]
Note that the equation for the adjoint variable
v
depends on the direct variable
u
, but is linear in
v
. Computationally, this means that we need to remember
our solution to
u
when we do our adjoint loop. If
T
is large, then it may not be
feasible to store all information about
u
across the whole period, as that is too
much data. Instead, the method of checkpointing is used:
(i) Pick "checkpoints" $0 = T_0 < \cdots < T_K = T$.
(ii) When integrating $\mathbf{u}$ forwards, we remember high resolution data for $\mathbf{u}(\mathbf{x}, T_k)$ at each checkpoint.
(iii) When integrating $\mathbf{v}$ backwards in the interval $(T_{k-1}, T_k)$, we use the data remembered at $T_{k-1}$ to re-integrate and obtain detailed information about $\mathbf{u}$ in the interval $(T_{k-1}, T_k)$.
A minimal sketch of this bookkeeping is given below.
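The following is a hypothetical sketch of the checkpointing loop; the time steppers `step_u` and `step_v` are assumed to be supplied by the direct and adjoint solvers, and the number of steps is assumed to divide evenly into the checkpoints.

```python
def adjoint_with_checkpoints(step_u, step_v, u0, vT, n_steps, n_checkpoints):
    """step_u(u): one forward step of the direct solution;
       step_v(v, u): one backward step of the adjoint, which needs u at that time."""
    chunk = n_steps // n_checkpoints
    checkpoints = [u0]
    u = u0
    for _ in range(n_checkpoints):            # forward sweep: store only the checkpoints
        for _ in range(chunk):
            u = step_u(u)
        checkpoints.append(u)

    v = vT
    for k in reversed(range(n_checkpoints)):  # backward sweep, one sub-interval at a time
        us = [checkpoints[k]]                 # re-integrate u across this sub-interval
        for _ in range(chunk - 1):
            us.append(step_u(us[-1]))
        for u in reversed(us):                # march the adjoint backwards through it
            v = step_v(v, u)
    return v
```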
This powerful algorithmic approach allows us to identify the minimal seed of turbulence, i.e. the smallest finite perturbation required to trigger turbulence. Note that this is something that cannot be understood by linear approximations! This is rather useful, because in real life, it allows us to figure out how to modify our system to reduce the chance of turbulence arising.