1 Multivariate calculus
IB Variational Principles
1.3 Legendre transform
The Legendre transform is an important tool in classical dynamics and
thermodynamics. In classical dynamics, it is used to transform between the Lagrangian
and the Hamiltonian. In thermodynamics, it is used to transform between
the energy, Helmholtz free energy and enthalpy. Despite its importance, the
definition is slightly awkward.
Suppose that we have a function $f(x)$, which we'll assume is differentiable.
For some reason, we want to transform it into a function of the conjugate variable
$p = \frac{df}{dx}$ instead. In most applications to physics, this quantity has a particular
physical significance. For example, in classical dynamics, if $L$ is the Lagrangian,
then $p = \frac{\partial L}{\partial \dot{x}}$ is the (conjugate) momentum. $p$ also has a context-independent
geometric interpretation, which we will explore later. For now, we will assume
that $p$ is more interesting than $x$.
Unfortunately, the obvious option $f^*(p) = f(x(p))$ is not the transform we
want. There are various reasons for this, but the major reason is that it is ugly.
It lacks any mathematical elegance, and has almost no nice properties at all.
In particular, we want our $f^*(p)$ to satisfy the property
\[ \frac{df^*}{dp} = x. \]
This says that if $p$ is the conjugate of $x$, then $x$ is the conjugate of $p$. We will
soon see how this is useful in the context of thermodynamics.
The symmetry is better revealed if we write in terms of differentials. The
differential of the function $f$ is
\[ df = \frac{df}{dx}\,dx = p\,dx. \]
So we want our $f^*$ to satisfy
\[ df^* = x\,dp. \]
How can we obtain this? From the product rule, we know that
\[ d(xp) = x\,dp + p\,dx. \]
So if we define $f^* = xp - f$ (more explicitly written as $f^*(p) = x(p)p - f(x(p))$),
then $df^* = d(xp) - df = x\,dp + p\,dx - p\,dx$, and we obtain the desired relation
$df^* = x\,dp$. Alternatively, we can say
\[ \frac{df^*}{dp} = x. \]
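As a quick numerical sanity check of $\frac{df^*}{dp} = x$, here is a minimal sketch. The choice $f(x) = e^x$ is an assumption made purely for illustration (it is not an example from these notes); it is convenient because $p = f'(x) = e^x$ inverts explicitly to $x(p) = \log p$ for $p > 0$. The helper names are hypothetical.

```python
import math

def f(x):
    # Illustrative choice (an assumption, not from the notes): f(x) = exp(x).
    return math.exp(x)

def x_of_p(p):
    # Invert p = f'(x) = exp(x); only valid for p > 0.
    return math.log(p)

def f_star(p):
    # f*(p) = x(p) p - f(x(p))
    x = x_of_p(p)
    return x * p - f(x)

def derivative(g, p, h=1e-6):
    # Central finite difference approximation to g'(p).
    return (g(p + h) - g(p - h)) / (2 * h)

for p in (0.5, 1.0, 3.0):
    # The last two printed columns should agree closely: df*/dp versus x(p).
    print(p, derivative(f_star, p), x_of_p(p))
```

The finite difference is only an approximation, so the two columns agree up to a small numerical error rather than exactly.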
The actual definition we give will not be exactly this. Instead, we define it in
a way that does not assume differentiability. We'll also assume that the function
takes the more general form $\mathbb{R}^n \to \mathbb{R}$.
Definition (Legendre transform). Given a function $f: \mathbb{R}^n \to \mathbb{R}$, its Legendre
transform $f^*$ (the "conjugate" function) is defined by
\[ f^*(\mathbf{p}) = \sup_{\mathbf{x}} (\mathbf{p} \cdot \mathbf{x} - f(\mathbf{x})). \]
The domain of $f^*$ is the set of $\mathbf{p} \in \mathbb{R}^n$ such that the supremum is finite. $\mathbf{p}$ is
known as the conjugate variable.
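This relation can also be written as $f^*(\mathbf{p}) + f(\mathbf{x}) = \mathbf{p} \cdot \mathbf{x}$, where $\mathbf{x} = \mathbf{x}(\mathbf{p})$ is the
value of $\mathbf{x}$ that maximizes $\mathbf{p} \cdot \mathbf{x} - f(\mathbf{x})$.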
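The supremum definition can be explored numerically without any differentiability. The sketch below is an approximation under stated assumptions: it replaces the supremum over all of $\mathbb{R}$ by a maximum over a finite grid on $[-R, R]$ (the function name `legendre_on_grid` and the grid parameters are hypothetical), and it uses $f(x) = |x|$, which is not differentiable at $0$, as an illustrative choice not taken from the notes.

```python
def legendre_on_grid(f, p, radius, n=2001):
    """Approximate sup_x (p*x - f(x)) by a maximum over n evenly
    spaced points x in [-radius, radius] (1D only)."""
    step = 2 * radius / (n - 1)
    xs = (-radius + i * step for i in range(n))
    return max(p * x - f(x) for x in xs)

# f(x) = |x| has a kink at 0, so the derivative-based recipe does not apply,
# but the supremum definition still makes sense.
for p in (0.5, 2.0):
    # Values for growing radius: stable if p is in the domain of f*,
    # growing without bound if it is not.
    print(p, [legendre_on_grid(abs, p, r) for r in (10, 100, 1000)])
```

For $|p| \leq 1$ the values stay put as the radius grows, while for $|p| > 1$ they grow roughly like $(|p| - 1)R$. This is the grid version of the statement that only those $p$ with a finite supremum belong to the domain of $f^*$.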
To show that this is the same as what we were just talking about, note
that, when $f$ is differentiable and the supremum is attained, the supremum of
$\mathbf{p} \cdot \mathbf{x} - f(\mathbf{x})$ is obtained when its derivative is zero, i.e.
$\mathbf{p} = \nabla f(\mathbf{x})$. In particular, in the 1D case, $f^*(p) = px - f(x)$, where $x$ satisfies
$f'(x) = p$. So $p$ is indeed the derivative of $f$ with respect to $x$.
From the definition, we can immediately conclude that
Lemma. $f^*$ is always convex.
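Proof. For $0 \leq t \leq 1$ and $\mathbf{p}, \mathbf{q}$ in the domain of $f^*$,
\begin{align*}
f^*((1 - t)\mathbf{p} + t\mathbf{q}) &= \sup_{\mathbf{x}} \left[(1 - t)\mathbf{p} \cdot \mathbf{x} + t\mathbf{q} \cdot \mathbf{x} - f(\mathbf{x})\right] \\
&= \sup_{\mathbf{x}} \left[(1 - t)(\mathbf{p} \cdot \mathbf{x} - f(\mathbf{x})) + t(\mathbf{q} \cdot \mathbf{x} - f(\mathbf{x}))\right] \\
&\leq (1 - t) \sup_{\mathbf{x}} [\mathbf{p} \cdot \mathbf{x} - f(\mathbf{x})] + t \sup_{\mathbf{x}} [\mathbf{q} \cdot \mathbf{x} - f(\mathbf{x})] \\
&= (1 - t) f^*(\mathbf{p}) + t f^*(\mathbf{q}).
\end{align*}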
Note that we cannot immediately say that $f^*$ is convex, since we have to show
that the domain is convex. But by the above bounds, $f^*((1 - t)\mathbf{p} + t\mathbf{q})$ is bounded
above by the sum of two finite terms, which is finite. So $(1 - t)\mathbf{p} + t\mathbf{q}$ is also in the
domain of $f^*$.
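The lemma does not require $f$ itself to be convex. The following sketch checks the convexity inequality numerically for a non-convex $f$. The double-well function $f(x) = x^4 - x^2$, the grid, and the helper names are all assumptions chosen for illustration, and the supremum is approximated by a maximum over a finite grid.

```python
def f(x):
    # Illustrative non-convex function (a double well); not from the notes.
    return x**4 - x**2

# Finite grid on [-3, 3] standing in for the supremum over all real x.
XS = [-3 + 6 * i / 6000 for i in range(6001)]

def f_star(p):
    # Grid approximation of sup_x (p*x - f(x)).
    return max(p * x - f(x) for x in XS)

def convexity_gap(p, q, t):
    """(1-t) f*(p) + t f*(q) - f*((1-t)p + tq); non-negative when the
    convexity inequality holds at (p, q, t)."""
    mix = (1 - t) * p + t * q
    return (1 - t) * f_star(p) + t * f_star(q) - f_star(mix)

gaps = [
    convexity_gap(p, q, t)
    for p in (-2.0, -0.5, 1.0)
    for q in (0.0, 1.5, 3.0)
    for t in (0.25, 0.5, 0.75)
]
# The smallest gap should be non-negative up to floating-point rounding.
print(min(gaps))
```

On a grid, $f^*$ is a maximum of finitely many functions that are linear in $p$, so the inequality holds for the same reason as in the proof: the maximum of a sum is at most the sum of the maxima.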
This transformation can be given a geometric interpretation. We will only
consider the 1D case, because drawing higher-dimensional graphs is hard. For
any fixed $x$, we draw the tangent line of $f$ at the point $x$, which has slope
$p = f'(x)$. Then $-f^*(p)$ is the intersection between the tangent line and the $y$ axis:
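[Figure: graph of $f(x)$ against $x$ with its tangent line of slope $p$ at the point $x$. The tangent line meets the $y$ axis at $-f^*(p)$, and the height $px$ splits into $f(x)$ and $f^*(p) = px - f(x)$.]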
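The intercept picture can be checked directly: the tangent line at $x_0$ is $y = f(x_0) + p(x - x_0)$ with $p = f'(x_0)$, so its value at $x = 0$ is $f(x_0) - p x_0 = -f^*(p)$. The sketch below does this for $f(x) = x^2$, an illustrative choice with hypothetical helper names; here $f'(x) = 2x$, so $x(p) = p/2$.

```python
def f(x):
    # Illustrative choice: f(x) = x^2, so f'(x) = 2x.
    return x**2

def f_prime(x):
    return 2 * x

def tangent_intercept(x0):
    # Value at x = 0 of the tangent line to f at x0.
    p = f_prime(x0)
    return f(x0) - p * x0

def f_star(p):
    # f*(p) = p x - f(x) evaluated at the x with f'(x) = p, i.e. x = p / 2.
    x = p / 2
    return p * x - f(x)

for x0 in (-1.5, 0.3, 2.0):
    p = f_prime(x0)
    # The two printed values should coincide: intercept versus -f*(p).
    print(x0, tangent_intercept(x0), -f_star(p))
```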
Example.
(i) Let $f(x) = \frac{1}{2}ax^2$ for $a > 0$. Then $p = ax$ at the maximum of $px - f(x)$. So
\[ f^*(p) = px - f(x) = p \cdot \frac{p}{a} - \frac{1}{2}a\left(\frac{p}{a}\right)^2 = \frac{1}{2a}p^2. \]
So the Legendre transform maps a parabola to a parabola.
(ii) $f(v) = -\sqrt{1 - v^2}$ for $|v| < 1$ is a lower semi-circle. We have
\[ p = f'(v) = \frac{v}{\sqrt{1 - v^2}}. \]
So
\[ v = \frac{p}{\sqrt{1 + p^2}} \]
and exists for all $p \in \mathbb{R}$. So
\[ f^*(p) = pv - f(v) = \frac{p^2}{\sqrt{1 + p^2}} + \frac{1}{\sqrt{1 + p^2}} = \sqrt{1 + p^2}. \]
A circle gets mapped to a hyperbola.
(iii) Let $f = cx$ for $c > 0$. This is convex but not strictly convex. Then
$px - f(x) = (p - c)x$. This is unbounded above unless $p = c$. So the domain
of $f^*$ is simply $\{c\}$. One point. So $f^*(c) = 0$. So a line goes to a point.
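The three closed forms above can be compared against a brute-force maximum of $px - f(x)$ over a grid. This is only a sketch under assumptions: the value $a = 3$, the value $c = 2$, the grids, and the helper names `grid` and `sup_on` are all hypothetical choices, and a finite grid can only approximate the supremum.

```python
import math

def grid(lo, hi, n):
    # n evenly spaced points from lo to hi inclusive.
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

def sup_on(xs, f, p):
    # Grid approximation of sup_x (p*x - f(x)).
    return max(p * x - f(x) for x in xs)

a, c = 3.0, 2.0  # illustrative parameter values

def parabola(x):
    return 0.5 * a * x * x

def semicircle(v):
    return -math.sqrt(1.0 - v * v)

def line(x):
    return c * x

xs = grid(-10.0, 10.0, 20001)
vs = grid(-0.9999, 0.9999, 20001)  # stay strictly inside |v| < 1

for p in (-2.0, 0.5, 1.5):
    # (i): grid value versus p^2 / (2a)
    print(p, sup_on(xs, parabola, p), p * p / (2 * a))
    # (ii): grid value versus sqrt(1 + p^2)
    print(p, sup_on(vs, semicircle, p), math.sqrt(1 + p * p))

# (iii): at p = c the maximum is 0 for every radius; at any other p it
# grows with the radius, so p is not in the domain of f*.
for radius in (10.0, 100.0):
    ws = grid(-radius, radius, 2001)
    print(radius, sup_on(ws, line, c), sup_on(ws, line, c + 0.5))
```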
Finally, we prove that applying the Legendre transform twice gives the
original function.
Theorem. If $f$ is convex, differentiable with Legendre transform $f^*$, then
$f^{**} = f$.
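Proof. We have $f^*(\mathbf{p}) = \mathbf{p} \cdot \mathbf{x}(\mathbf{p}) - f(\mathbf{x}(\mathbf{p}))$, where $\mathbf{x}(\mathbf{p})$ satisfies $\mathbf{p} = \nabla f(\mathbf{x}(\mathbf{p}))$.
Differentiating with respect to $\mathbf{p}$ (with $\nabla_i = \partial/\partial p_i$ acting on functions of $\mathbf{p}$, $\nabla_j f = \partial f/\partial x_j$, and summation over the repeated index $j$), we have
\begin{align*}
\nabla_i f^*(\mathbf{p}) &= x_i + p_j \nabla_i x_j(\mathbf{p}) - \nabla_i x_j(\mathbf{p}) \nabla_j f(\mathbf{x}) \\
&= x_i + p_j \nabla_i x_j(\mathbf{p}) - \nabla_i x_j(\mathbf{p}) p_j \\
&= x_i.
\end{align*}
So
\[ \nabla f^*(\mathbf{p}) = \mathbf{x}. \]
This means that the conjugate variable of $\mathbf{p}$ is our original $\mathbf{x}$. So
\begin{align*}
f^{**}(\mathbf{x}) &= (\mathbf{x} \cdot \mathbf{p} - f^*(\mathbf{p}))|_{\mathbf{p} = \mathbf{p}(\mathbf{x})} \\
&= \mathbf{x} \cdot \mathbf{p} - (\mathbf{p} \cdot \mathbf{x} - f(\mathbf{x})) \\
&= f(\mathbf{x}).
\end{align*}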
Note that strict convexity is not required. For example, in our last example
above with the straight line, $f^*(p) = 0$ for $p = c$. So
\[ f^{**}(x) = (xp - f^*(p))|_{p = c} = cx = f(x). \]
However, convexity is required. If $f^{**} = f$ is true, then $f$ must be convex,
since it is a Legendre transform. Hence $f^{**} = f$ cannot be true for non-convex
functions.
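Both halves of this remark can be seen numerically. The sketch below applies a discrete Legendre transform twice, once to a convex function and once to a non-convex one. Everything in it is an assumption made for illustration: the functions $x^2$ and $(x^2 - 1)^2$, the finite grids standing in for the suprema (so the functions are effectively restricted to $[-2, 2]$), and the helper names.

```python
def grid(lo, hi, n):
    # n evenly spaced points from lo to hi inclusive.
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

XS = grid(-2.0, 2.0, 401)     # sample points for x
PS = grid(-25.0, 25.0, 1001)  # sample slopes p (covers f' on [-2, 2])

def conjugate(values, points, slopes):
    """For each slope s, the maximum over the points of s*x - value(x)."""
    return [max(s * x - v for x, v in zip(points, values)) for s in slopes]

def biconjugate_on_grid(f):
    f_values = [f(x) for x in XS]
    star = conjugate(f_values, XS, PS)  # f* sampled on PS
    return conjugate(star, PS, XS)      # f** sampled on XS

def convex(x):
    return x * x

def double_well(x):
    return (x * x - 1.0) ** 2

# Convex case: f** should match f up to discretisation error.
convex_gap = max(
    abs(f2 - convex(x)) for x, f2 in zip(XS, biconjugate_on_grid(convex))
)
print(convex_gap)

# Non-convex case: f** falls strictly below f between the two wells.
well_bb = biconjugate_on_grid(double_well)
middle = len(XS) // 2  # index of x = 0
print(double_well(XS[middle]), well_bb[middle])
```

In the non-convex case the double transform still returns a convex function, which is why it cannot coincide with the double well near $x = 0$.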
Application to thermodynamics
Given a system of a fixed number of particles, the energy of a system is usually
given as a function of entropy and volume:
\[ E = E(S, V). \]
We can think of this as a gas inside a piston with variable volume.
There are two things that can affect the energy: we can push in the piston
and modify the volume. This corresponds to a work done of $-p\,dV$, where $p$ is
the pressure. Alternatively, we can simply heat it up and create a heat change
of $T\,dS$, where $T$ is the temperature. Then we have
\[ dE = T\,dS - p\,dV. \]
Comparing with the chain rule, we have
\[ \frac{\partial E}{\partial S} = T, \quad -\frac{\partial E}{\partial V} = p. \]
However, the entropy is a mysterious quantity no one understands. Instead, we
like temperature, defined as $T = \frac{\partial E}{\partial S}$. Hence we use the (negative) Legendre
transform to obtain the conjugate function, the Helmholtz free energy:
\[ F(T, V) = \inf_S [E(S, V) - TS] = E(S, V) - S\frac{\partial E}{\partial S} = E - ST. \]
Note that the Helmholtz free energy satisfies
\[ dF = -S\,dT - p\,dV. \]
Just as we could recover $T$ and $p$ from $E$ via taking partial derivatives with
respect to $S$ and $V$, we are able to recover $S$ and $p$ from $F$ by taking partial
derivatives with respect to $T$ and $V$. This would not be the case if we simply
defined $F(T, V) = E(S(T, V), V)$.
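To see the relation $dF = -S\,dT - p\,dV$ at work, here is a minimal sketch using a toy energy function. The choice $E(S, V) = S^2/(2V)$ is an assumption picked only because it is convex in $S$ and easy to invert. It is not a physical equation of state and does not come from these notes. The helper names are hypothetical.

```python
def energy(S, V):
    # Toy energy (an assumption, not a physical model): E = S^2 / (2V).
    return S * S / (2.0 * V)

def helmholtz(T, V):
    # F(T, V) = inf_S [E(S, V) - T S]. For this toy E the minimiser is
    # S = T V, which follows from T = dE/dS = S / V.
    S = T * V
    return energy(S, V) - T * S

def partial(g, args, index, h=1e-6):
    # Central finite difference in the argument at position `index`.
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (g(*up) - g(*down)) / (2 * h)

T, V = 1.5, 2.0
S = T * V                                # entropy at this (T, V)
pressure = -partial(energy, (S, V), 1)   # p = -dE/dV at fixed S

# Each pair should agree: -dF/dT with S, and -dF/dV with p.
print(-partial(helmholtz, (T, V), 0), S)
print(-partial(helmholtz, (T, V), 1), pressure)
```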
If we take the Legendre transform with respect to
V
, we get the enthalpy
instead, and if we take the Legendre transform with respect to both, we get the
Gibbs free energy.