Part II Statistical Physics
Based on lectures by H. S. Reall
Notes taken by Dexter Chua
Lent 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
Part IB Quantum Mechanics and “Multiparticle Systems” from Part II Principles of
Quantum Mechanics are essential.
Fundamentals of statistical mechanics
Microcanonical ensemble. Entropy, temperature and pressure. Laws of thermodynamics.
Example of paramagnetism. Boltzmann distribution and canonical ensemble.
Partition function. Free energy. Specific heats. Chemical potential. Grand canonical
ensemble.
Classical gases
Density of states and the classical limit. Ideal gas. Maxwell distribution. Equipartition
of energy. Diatomic gas. Interacting gases. Virial expansion. Van der Waals equation
of state. Basic kinetic theory.
Quantum gases
Density of states. Planck distribution and black body radiation. Debye model of
phonons in solids. Bose–Einstein distribution. Ideal Bose gas and Bose–Einstein
condensation. Fermi-Dirac distribution. Ideal Fermi gas. Pauli paramagnetism. 
Thermodynamics
Thermodynamic temperature scale. Heat and work. Carnot cycle. Applications of laws
of thermodynamics. Thermodynamic potentials. Maxwell relations.
Phase transitions
Liquid-gas transitions. Critical point and critical exponents. Ising model. Mean field
theory. First and second order phase transitions. Symmetries and order parameters. 
Contents
0 Introduction
1 Fundamentals of statistical mechanics
1.1 Microcanonical ensemble
1.2 Pressure, volume and the first law of thermodynamics
1.3 The canonical ensemble
1.4 Helmholtz free energy
1.5 The chemical potential and the grand canonical ensemble
1.6 Extensive and intensive properties
2 Classical gases
2.1 The classical partition function
2.2 Monoatomic ideal gas
2.3 Maxwell distribution
2.4 Diatomic gases
2.5 Interacting gases
3 Quantum gases
3.1 Density of states
3.3 Phonons and the Debye model
3.4 Quantum ideal gas
3.5 Bosons
3.6 Bose–Einstein condensation
3.7 Fermions
3.8 Pauli paramagnetism
4 Classical thermodynamics
4.1 Zeroth and first law
4.2 The second law
4.3 Carnot cycles
4.4 Entropy
4.5 Thermodynamic potentials
4.6 Third law of thermodynamics
5 Phase transitions
5.1 Liquid-gas transition
5.2 Critical point and critical exponents
5.3 The Ising model
5.4 Landau theory
0 Introduction
In all of our previous physics courses, we mostly focused on “microscopic” physics.
For example, we used the Schrödinger equation to describe the mechanics of
a single hydrogen atom. We had to do quite a bit of hard work to figure out
exactly how it behaved. But what if we wanted to study a much larger system?
Let’s say we have a box of hydrogen gas, consisting of $10^{23}$ molecules. If we
tried to solve the Schrödinger equation for this system, then, even numerically,
it would be completely intractable.
So how can we understand such a system? Of course, given these $10^{23}$
molecules, we are not interested in the detailed dynamics of each molecule. We
are only interested in some “macroscopic” phenomena. For example, we might
want to know its pressure and temperature. So what we want to do is to describe
the whole system using just a few “macroscopic” variables, and still capture the
main properties we are interested in.
In the first part of the course, we will approach this subject rather “rigorously”.
We will start from some microscopic laws we know about, and use them to deduce
properties of the macroscopic system. This turns out to be rather successful,
at least if we treat things sufficiently properly. Surprisingly, it even applies to
scenarios where it is absolutely not obvious it should apply!
Historically, this was not how statistical physics was first developed. Instead,
we only had the “macroscopic” properties we tried to understand. Back in
the day, we didn’t even know things were made up of atoms! We will try to
understand statistical phenomena in purely macroscopic and “wordy” terms,
and it turns out we can reproduce the same predictions as before.
Finally, we will turn our attention to something rather different in nature:
phase transitions. As we all know, water turns from liquid to gas as we raise the
temperature. This transition is an example of a phase transition. Of course, this
is still a “macroscopic” phenomenon, fitting in line with the theme of the course.
It doesn’t make sense to talk about a single molecule transitioning from liquid
to gas. Interestingly, when we look at many different systems undergoing phase
transitions, they seem to all behave “in the same way” near the phase transition.
We will figure out why. At least, partially.
Statistical physics is important. As we mentioned, we can use this to make
macroscopic predictions from microscopic laws. Thus, this allows us to put our
microscopic laws to experimental test. It turns out methods of statistical physics
have some far-reaching applications elsewhere. They can be used to study black
holes, or even biology!
1 Fundamentals of statistical mechanics
1.1 Microcanonical ensemble
We begin by considering a rather general system. Suppose we have an isolated
system containing $N$ particles, where $N$ is a Large Number™. The canonical
example to keep in mind is a box of gas detached from reality.
Definition (Microstate). The microstate of a system is the actual (quantum)
state of the system. This gives a complete description of the system.
As one would expect, the microstate is very complicated and infeasible to
describe, especially when we have many particles. In statistical physics, we
observe that many microstates are indistinguishable macroscopically. Thus, we
only take note of some macroscopically interesting quantities, and use these
macroscopic quantities to put a probability distribution on the microstates.
More precisely, we let $\{|n\rangle\}$ be a basis of normalized eigenstates, say
\[
  \hat{H} |n\rangle = E_n |n\rangle.
\]
We let $p(n)$ be the probability that the microstate is $|n\rangle$. Note that this
probability is not the quantum probability we are used to. It is some probability
assigned to reflect our ignorance of the system. Given such probabilities, we can
define the expectation of an operator in the least imaginative way:
Definition (Expectation value). Given a probability distribution $p(n)$ on the
states, the expectation value of an operator $O$ is
\[
  \langle O \rangle = \sum_n p(n) \langle n|O|n\rangle.
\]
If one knows about density operators, we can describe the system as a mixed
state with density operator
\[
  \rho = \sum_n p(n) |n\rangle\langle n|.
\]
There is an equivalent way of looking at this. We can consider an ensemble
consisting of $W \gg 1$ independent copies of our system such that $W p(n)$ many
copies are in the microstate $|n\rangle$. Then the expectation is just the average over
the ensemble.
We shall further assume our system is in equilibrium, i.e. the probability
distribution $p(n)$ does not change in time. So in particular $\langle O \rangle$ is independent of
time. Of course, this does not mean the particles stop moving. The particles are
still whizzing around. It’s just that the statistical distribution does not change.
In this course, we will mostly be talking about equilibrium systems. When we
get out of equilibrium, things become very complicated.
The idea of statistical physics is that we have some partial knowledge about
the system. For example, we might know its total energy. The microstates that
are compatible with this partial knowledge are called accessible. The fundamental
assumption of statistical mechanics is then
An isolated system in equilibrium is equally likely to be in any of the
accessible microstates.
Thus, different probability distributions, or different ensembles, are distinguished
by the partial knowledge we have.
Definition (Microcanonical ensemble). In a microcanonical ensemble, we know
the energy is between $E$ and $E + \delta E$, where $\delta E$ is the accuracy of our measuring
device. The accessible microstates are those with energy $E \leq E_n \leq E + \delta E$. We
let $\Omega(E)$ be the number of such states.
In practice, $\delta E$ is much, much larger than the spacing of energy levels, and so
$\Omega(E) \gg 1$. A priori, it seems like our theory will depend on what the value of
$\delta E$ is, but as we develop the theory, we will see that this doesn’t really matter.
It is crucial here that we are working with a quantum system, so the possible
states are discrete, and it makes sense to count the number of states. We need
to do quite a bit more work if we want to do this classically.
Example. Suppose we have $N = 10^{23}$ particles, and each particle can occupy
two states $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$, which have the same energy $\varepsilon$. Then we always have
$N\varepsilon$ total energy, and we have
\[
  \Omega(N\varepsilon) = 2^{10^{23}}.
\]
This is a fantastically huge, mind-boggling number. This is the kind of number
we are dealing with.
By the fundamental assumption, we can write
\[
  p(n) =
  \begin{cases}
    \frac{1}{\Omega(E)} & \text{if } E \leq E_n \leq E + \delta E\\
    0 & \text{otherwise}
  \end{cases}.
\]
This is the characteristic distribution of the microcanonical ensemble.
It turns out it is not very convenient to work with $\Omega(E)$. In particular, $\Omega(E)$
is not linear in $N$, the number of particles. Instead, it scales as an exponential
of $N$. So we take the logarithm.
Definition (Boltzmann entropy). The (Boltzmann) entropy is defined as
\[
  S(E) = k \log \Omega(E),
\]
where $k = 1.381 \times 10^{-23}\ \mathrm{J\,K^{-1}}$ is Boltzmann’s constant.
This annoying constant $k$ is necessary because when people started doing
thermodynamics, they didn’t know about statistical physics, and picked weird
conventions.
We wrote our expressions as $S(E)$, instead of $S(E, \delta E)$. As promised, the
value of $\delta E$ doesn’t really matter. We know that $\Omega(E)$ will scale approximately
linearly with $\delta E$. So if we, say, double $\delta E$, then $S(E)$ will increase by $k \log 2$,
which is incredibly tiny compared to $S(E) = k \log \Omega(E)$. So it doesn’t matter
which value of $\delta E$ we pick.
Even if you are not so convinced that multiplying $10^{10^{23}}$ by a factor of 2, or
adding $\log 2$ to $10^{23}$, does not really matter, you should be reassured that at the
end, we will rarely talk about $\Omega(E)$ or $S(E)$ itself. Instead, we will often divide
two different $\Omega$’s to get probabilities, or differentiate $S$ to get other interesting
quantities. In these cases, the factors really do not matter.
The second nice property of the entropy is that it is additive. If we have
two non-interacting systems with energies $E^{(1)}, E^{(2)}$, then the total number of
states of the combined system is
\[
  \Omega(E^{(1)}, E^{(2)}) = \Omega_1(E^{(1)}) \Omega_2(E^{(2)}).
\]
So when we take the logarithm, we find
\[
  S(E^{(1)}, E^{(2)}) = S(E^{(1)}) + S(E^{(2)}).
\]
Of course, this is not very interesting, until we bring our systems together and
let them interact with each other.
Interacting systems
Suppose we bring the two systems together, and let them exchange energy. Then
the energy of the individual systems is no longer fixed, and only the total energy
\[
  E_{\mathrm{total}} = E^{(1)} + E^{(2)}
\]
is fixed. Then we find that
\[
  \Omega(E_{\mathrm{total}}) = \sum_{E_i} \Omega_1(E_i) \Omega_2(E_{\mathrm{total}} - E_i),
\]
where we sum over all possible energy levels of the first system. In terms of the
entropy of the system, we have
\[
  \Omega(E_{\mathrm{total}}) = \sum_{E_i} \exp\left(\frac{S_1(E_i)}{k} + \frac{S_2(E_{\mathrm{total}} - E_i)}{k}\right).
\]
We can be a bit more precise with what the sum means. We are not summing
over all eigenstates. Recall that we have previously fixed an accuracy $\delta E$. So
we can imagine dividing the whole energy spectrum into chunks of size $\delta E$, and
here we are summing over the chunks.
We know that $S_{1,2}/k \sim N_{1,2} \sim 10^{23}$, which is a ridiculously large number.
So the sum is overwhelmingly dominated by the term with the largest exponent.
Suppose this is maximized when $E_i = E_*$. Then we have
\[
  S(E_{\mathrm{total}}) = k \log \Omega(E_{\mathrm{total}}) \approx S_1(E_*) + S_2(E_{\mathrm{total}} - E_*).
\]
Again, we are not claiming that only the term coming from $E_*$ has a significant
contribution. Maybe one or two energy levels next to $E_*$ are also very significant,
but taking these into account will only multiply $\Omega(E_{\mathrm{total}})$ by a (relatively) small
constant, and hence contributes a small additive term to $S(E_{\mathrm{total}})$, which can be
neglected.
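The dominance of the largest term can be checked numerically on a toy model. The sketch below is an illustration only, under the assumption that each system is a collection of two-state particles, so that $\Omega_j$ is a binomial coefficient and the "energy" is the number of excited particles; the function names and the particle numbers are hypothetical.

```python
import math

def log_omega(n_particles, n_up):
    """log of the binomial coefficient C(n_particles, n_up), via lgamma."""
    return (math.lgamma(n_particles + 1)
            - math.lgamma(n_up + 1)
            - math.lgamma(n_particles - n_up + 1))

def log_sum_and_max(n1, n2, total_up):
    """Return (log of the full sum over E_i, log of the largest single term)."""
    lo = max(0, total_up - n2)
    hi = min(n1, total_up)
    log_terms = [log_omega(n1, i) + log_omega(n2, total_up - i)
                 for i in range(lo, hi + 1)]
    biggest = max(log_terms)
    # log-sum-exp with the largest term factored out, to avoid overflow
    log_sum = biggest + math.log(sum(math.exp(t - biggest) for t in log_terms))
    return log_sum, biggest

# Toy sizes (assumed): 4000 and 6000 particles sharing 3000 excitations.
log_sum, biggest = log_sum_and_max(4000, 6000, 3000)
print(log_sum, biggest, (log_sum - biggest) / log_sum)
```

Even for these modest sizes, the logarithm of the full sum and the logarithm of its largest term differ by only a few units out of several thousand, and the relative difference shrinks further as the particle numbers grow.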
Now given any $E^{(1)}$, what is the probability that the actual energy of the
first system is $E^{(1)}$? For convenience, we write $E^{(2)} = E_{\mathrm{total}} - E^{(1)}$, the energy
of the second system. Then the probability desired is
\[
  \frac{\Omega_1(E^{(1)}) \Omega_2(E^{(2)})}{\Omega(E_{\mathrm{total}})} = \exp\left(\frac{1}{k}\left(S_1(E^{(1)}) + S_2(E^{(2)}) - S(E_{\mathrm{total}})\right)\right).
\]
Again recall that the numbers at stake are unimaginably huge. So if
$S_1(E^{(1)}) + S_2(E^{(2)})$ is even slightly different from $S(E_{\mathrm{total}})$, then the probability is effectively
zero. And by above, for the two quantities to be close, we need $E^{(1)} = E_*$. So
for all practical purposes, the value of $E^{(1)}$ is fixed at $E_*$.
Now imagine we prepare two systems separately with energies $E^{(1)}$ and $E^{(2)}$
such that $E^{(1)} \neq E_*$, and then bring the systems together. Then we are no longer
in equilibrium. $E^{(1)}$ will change until it takes the value $E_*$, and the entropy of the
system will change from $S_1(E^{(1)}) + S_2(E^{(2)})$ to $S_1(E_*) + S_2(E_{\mathrm{total}} - E_*)$. In
particular, the entropy increases.
Law (Second law of thermodynamics). The entropy of an isolated system
increases (or remains the same) in any physical process. In equilibrium, the
entropy attains its maximum value.
This prediction is verified by virtually all observations of physics.
While our derivation did not show it is impossible to violate the second law
of thermodynamics, it is very very very very very very very very unlikely to be
violated.
Temperature
Having defined entropy, the next interesting thing we can define is the temperature.
We assume that $S$ is a smooth function of $E$. Then we can define the temperature
as follows:
Definition (Temperature). The temperature is defined to be
\[
  \frac{1}{T} = \frac{dS}{dE}.
\]
Why do we call this the temperature? Over the course, we will see that
this quantity we decide to call “temperature” does behave as we would expect
temperature to behave. It is difficult to give further justification of this definition,
because even though we vaguely have some idea what temperature is like in
daily life, those ideas are very far from anything we can concretely write down
or even describe.
One reassuring property we can prove is the following:
Proposition. Two interacting systems in equilibrium have the same temperature.
Proof. Recall that the equilibrium energy $E_*$ is found by maximizing
\[
  S_1(E_i) + S_2(E_{\mathrm{total}} - E_i)
\]
over all possible $E_i$. Thus, at an equilibrium, the derivative of this expression
has to vanish, and the derivative is exactly
\[
  \left.\frac{dS_1}{dE}\right|_{E^{(1)} = E_*} - \left.\frac{dS_2}{dE}\right|_{E^{(2)} = E_{\mathrm{total}} - E_*} = 0.
\]
So we need
\[
  \frac{1}{T_1} = \frac{1}{T_2}.
\]
In other words, we need $T_1 = T_2$.
Now suppose initially, our systems have different temperature. We would
expect energy to flow from the hotter system to the cooler system. This is indeed
the case.
Proposition. Suppose two systems with initial energies $E^{(1)}, E^{(2)}$ and temperatures
$T_1, T_2$ are put into contact. If $T_1 > T_2$, then energy will flow from the
first system to the second.
Proof. Since we are not in equilibrium, there must be some energy transfer from
one system to the other. Suppose after time $\delta t$, the energy changes by
\[
  E^{(1)} \mapsto E^{(1)} + \delta E, \qquad E^{(2)} \mapsto E^{(2)} - \delta E,
\]
keeping the total energy constant. Then the change in entropy is given by
\[
  \delta S = \frac{dS_1}{dE} \delta E^{(1)} + \frac{dS_2}{dE} \delta E^{(2)} = \left(\frac{1}{T_1} - \frac{1}{T_2}\right) \delta E.
\]
By assumption, we know
\[
  \frac{1}{T_1} - \frac{1}{T_2} < 0,
\]
but by the second law of thermodynamics, we know the entropy must increase, i.e. $\delta S > 0$. So we
must have $\delta E < 0$, i.e. energy flows from the first system to the second.
So this notion of temperature agrees with the basic properties of temperature
we expect.
Note that the properties we’ve derived only depend on the fact that $\frac{1}{T}$
is a monotonically decreasing function of $T$. In principle, we could have picked
any monotonically decreasing function of $T$, and set it to $\frac{dS}{dE}$. We will later see
that this definition will agree with the other definitions of temperature we have
previously seen, e.g. via the ideal gas law, and so this is indeed the “right” one.
Heat capacity
As we will keep on doing later, we can take different derivatives to get different
interesting quantities. This time, we are going to get heat capacity. Recall that
$T$ was a function of energy, $T = T(E)$. We will assume that we can invert this
function, at least locally, to get $E$ as a function of $T$.
Definition (Heat capacity). The heat capacity of a system is
\[
  C = \frac{dE}{dT}.
\]
The specific heat capacity is
\[
  \frac{C}{\text{mass of system}}.
\]
The specific heat capacity is a property of the substance that makes up the
system, and not how much stuff there is, as both $C$ and the mass scale linearly
with the size of the system.
This is some quantity we can actually physically measure. We can measure
the temperature with a thermometer, and it is usually not too difficult to see how
much energy we are pumping into a system. Then by measuring the temperature
change, we can figure out the heat capacity.
In doing so, we can indirectly measure the entropy, or at least the changes in
entropy. Note that we have
\[
  \frac{dS}{dT} = \frac{dS}{dE} \frac{dE}{dT} = \frac{C}{T}.
\]
Integrating up, if the temperature changes from $T_1$ to $T_2$, we know
\[
  \Delta S = \int_{T_1}^{T_2} \frac{C(T)}{T}\, dT.
\]
As promised, by measuring heat capacity experimentally, we can measure the
change in entropy.
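As a rough illustration of how measured heat capacity data could be turned into an entropy change, the following sketch integrates $C(T)/T$ numerically. The function name `entropy_change`, the use of Simpson's rule and the constant heat capacity in the usage example are all assumptions made for illustration.

```python
import math

def entropy_change(heat_capacity, t1, t2, steps=10_000):
    """Approximate the integral of C(T)/T from t1 to t2 (composite Simpson rule)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (t2 - t1) / steps
    total = heat_capacity(t1) / t1 + heat_capacity(t2) / t2
    for i in range(1, steps):
        t = t1 + i * h
        weight = 4 if i % 2 else 2
        total += weight * heat_capacity(t) / t
    return total * h / 3

# Hypothetical usage: a constant heat capacity C = 2 (arbitrary units),
# heated from T = 300 to T = 600. The exact answer is C log(T2/T1).
delta_s = entropy_change(lambda t: 2.0, 300.0, 600.0)
print(delta_s, 2.0 * math.log(2.0))
```

For a constant $C$ the integral is just $C \log(T_2/T_1)$, which gives a simple check on the numerics.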
The heat capacity is useful in other ways. Recall that to find the equilibrium
energy $E_*$, a necessary condition was that it satisfies
\[
  \frac{dS_1}{dE} - \frac{dS_2}{dE} = 0.
\]
However, we only know that the solution is an extremum, and not necessarily a
maximum. To figure out if it is the maximum, we take the second derivative.
Note that for a single system, we have
\[
  \frac{d^2 S}{dE^2} = \frac{d}{dE}\left(\frac{1}{T}\right) = -\frac{1}{T^2 C}.
\]
Applying this to two systems, one can check that entropy is maximized at
$E^{(1)} = E_*$ if $C_1, C_2 > 0$. The actual computation is left as an exercise on the
first example sheet.
Let’s look at some actual systems and try to calculate these quantities.
Example. Consider a 2-state system, where we have $N$ non-interacting particles
with fixed positions. Each particle is either in $|{\uparrow}\rangle$ or $|{\downarrow}\rangle$. We can think of these
as spins, for example. These two states have different energies
\[
  E_\uparrow = \varepsilon, \qquad E_\downarrow = 0.
\]
We let $N_\uparrow$ and $N_\downarrow$ be the number of particles in $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$ respectively. Then
the total energy of the system is
\[
  E = \varepsilon N_\uparrow.
\]
We want to calculate this quantity $\Omega(E)$. Here in this very contrived example,
it is convenient to pick $\delta E < \varepsilon$, so that $\Omega(E)$ is just the number of ways of
choosing $N_\uparrow$ particles from $N$. By basic combinatorics, we have
\[
  \Omega(E) = \frac{N!}{N_\uparrow! (N - N_\uparrow)!},
\]
and
\[
  S(E) = k \log\left(\frac{N!}{N_\uparrow! (N - N_\uparrow)!}\right).
\]
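This is not an incredibly useful formula. Since we assumed that $N$ and $N_\uparrow$ are
huge, we can use Stirling’s approximation
\[
  N! = \sqrt{2\pi N}\, N^N e^{-N} \left(1 + O\left(\frac{1}{N}\right)\right).
\]
Then we have
\[
  \log N! = N \log N - N + \frac{1}{2} \log(2\pi N) + O\left(\frac{1}{N}\right).
\]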
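We just use the approximation three times, keeping only the leading terms, to get
\begin{align*}
  S(E) &= k \left(N \log N - N - N_\uparrow \log N_\uparrow + N_\uparrow - (N - N_\uparrow) \log(N - N_\uparrow) + N - N_\uparrow\right)\\
  &= -k \left((N - N_\uparrow) \log\left(\frac{N - N_\uparrow}{N}\right) + N_\uparrow \log\left(\frac{N_\uparrow}{N}\right)\right)\\
  &= -kN \left(\left(1 - \frac{E}{N\varepsilon}\right) \log\left(1 - \frac{E}{N\varepsilon}\right) + \frac{E}{N\varepsilon} \log\left(\frac{E}{N\varepsilon}\right)\right).
\end{align*}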
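This is better, but we can get much more out of it if we plot it:
[Figure: plot of $S(E)$ against $E$, with $0$, $N\varepsilon/2$ and $N\varepsilon$ marked on the $E$-axis and $Nk \log 2$ marked on the $S$-axis.]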
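To get a feel for how good the Stirling form of the entropy is, one can compare it with the exact logarithm of the binomial coefficient. This is only a sketch: the function names are made up, and entropies are measured in units of $k$ with $x = N_\uparrow/N = E/(N\varepsilon)$.

```python
import math

def entropy_exact(n, n_up):
    """S/k = log(N! / (N_up! (N - N_up)!)), computed with lgamma."""
    return math.lgamma(n + 1) - math.lgamma(n_up + 1) - math.lgamma(n - n_up + 1)

def entropy_stirling(n, n_up):
    """S/k from the leading-order Stirling formula, with x = N_up / N."""
    x = n_up / n
    if x in (0.0, 1.0):
        return 0.0  # x log x -> 0 at the end points
    return -n * ((1 - x) * math.log(1 - x) + x * math.log(x))

# Assumed sizes for illustration: a million particles, 30% of them excited.
n, n_up = 10**6, 3 * 10**5
exact, approx = entropy_exact(n, n_up), entropy_stirling(n, n_up)
print(exact, approx, (approx - exact) / exact)
```

The two values differ by a term of order $\log N$, which is negligible next to the leading term of order $N$. At $N_\uparrow = N/2$ the Stirling form gives $N \log 2$, matching the peak marked on the plot.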
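The temperature is
\[
  \frac{1}{T} = \frac{dS}{dE} = \frac{k}{\varepsilon} \log\left(\frac{N\varepsilon}{E} - 1\right),
\]
and we can invert to get
\[
  \frac{N_\uparrow}{N} = \frac{E}{N\varepsilon} = \frac{1}{e^{\varepsilon/kT} + 1}.
\]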
Suppose we get to control the temperature of the system, e.g. if we put it in contact with a
heat reservoir. What happens as we vary our temperature?
As $T \to 0$, we have $N_\uparrow \to 0$. So the states all try to go to the ground state.
As $T \to \infty$, we find $N_\uparrow/N \to \frac{1}{2}$, and $E \to N\varepsilon/2$.
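Both limits are easy to check numerically. The sketch below evaluates $N_\uparrow/N = 1/(e^{\varepsilon/kT} + 1)$ at a few positive temperatures; the function name and the choice of measuring temperature through the dimensionless ratio $kT/\varepsilon$ are assumptions for illustration.

```python
import math

def excited_fraction(kt_over_eps):
    """N_up / N = 1 / (exp(eps/kT) + 1), for positive temperature."""
    x = 1.0 / kt_over_eps
    # exp(-x) / (1 + exp(-x)) is algebraically identical and avoids overflow at large x
    return math.exp(-x) / (1.0 + math.exp(-x))

for t in (0.01, 0.1, 1.0, 10.0, 1e6):
    print(t, excited_fraction(t))
```

The fraction is vanishingly small for $kT \ll \varepsilon$ and approaches $\frac{1}{2}$ from below for $kT \gg \varepsilon$, without ever exceeding it at positive temperature.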
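The second result is a bit weird. As $T \to \infty$, we might expect all things to go to
the maximum energy level, and not just half of them.
To confuse ourselves further, we can plot another graph, for $\frac{1}{T}$ vs $E$. The
graph looks like
[Figure: plot of $\frac{1}{T}$ against $E$, with $0$, $N\varepsilon/2$ and $N\varepsilon$ marked on the $E$-axis.]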
We see that having energy $E > N\varepsilon/2$ corresponds to negative temperature, and to
go from positive temperature to negative temperature, we need to pass through
infinite temperature. So in some sense, negative temperature is “hotter” than
infinite temperature.
What is going on? By definition, negative $T$ means $\Omega(E)$ is a decreasing
function of energy. This is a very unusual situation. In this system, all the
particles are fixed, and have no kinetic energy. Consequently, the possible energy
levels are bounded. If we included kinetic energy into the system, then kinetic
energy can be arbitrarily large. In this case, $\Omega(E)$ is usually an increasing
function of $E$.
Negative $T$ has indeed been observed experimentally. This requires setups
where the kinetic energy is not so important in the range of energies we are
talking about. One particular scenario where this is observed is in nuclear spins
of crystals in magnetic fields. If we have a magnetic field, then naturally, most
of the spins will align with the field. We now suddenly flip the field, and then
most of the spins are anti-aligned, and this can give us a negative temperature
state.
Now we can’t measure negative temperature by sticking a thermometer into
the material and getting a negative answer. Something more interesting that can be
measured is the heat capacity
\[
  C = \frac{dE}{dT} = \frac{N\varepsilon^2}{kT^2} \frac{e^{\varepsilon/kT}}{(e^{\varepsilon/kT} + 1)^2}.
\]
This again exhibits some peculiar properties. We begin by looking at a plot:
[Figure: plot of $C$ against $T$, with the maximum marked at $kT \sim \varepsilon$.]
By looking at the formula, we see that the value of $kT$ at which $C$ is maximal is related to the
microscopic $\varepsilon$. If we know the value of $k$, then we can use the macroscopic
observation of $C$ to deduce something about the microscopic $\varepsilon$.
Note that $C$ is proportional to $N$. As $T \to 0$, we have
\[
  C \propto T^{-2} e^{-\varepsilon/kT},
\]
and this is a function that decreases very rapidly as $T \to 0$, and in fact this is
one of the favorite examples in analysis where all derivatives of the function at 0
vanish. Ultimately, this is due to the energy gap between the ground state and
the first excited state.
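The location of the maximum can be estimated numerically from the formula for $C$. The following is a minimal sketch, assuming we work with the dimensionless temperature $t = kT/\varepsilon$ and the heat capacity per particle in units of $k$; the function names and the brute-force grid scan are illustrative choices.

```python
import math

def heat_capacity_per_particle(t):
    """C/(N k) as a function of t = kT/eps: x^2 e^x / (e^x + 1)^2 with x = 1/t."""
    x = 1.0 / t
    # rewritten in terms of e^{-x} so that low temperatures do not overflow
    return x * x * math.exp(-x) / (1.0 + math.exp(-x)) ** 2

def peak_temperature(lo=0.05, hi=5.0, points=100_000):
    """Locate the maximum of C on a uniform grid of t (brute-force scan)."""
    grid = [lo + (hi - lo) * i / points for i in range(points + 1)]
    return max(grid, key=heat_capacity_per_particle)

t_peak = peak_temperature()
print(t_peak, heat_capacity_per_particle(t_peak))
```

The scan puts the peak at $kT$ somewhat below half of $\varepsilon$, consistent with the statement that the maximum occurs at $kT \sim \varepsilon$, so a measured peak temperature gives an estimate of the level spacing.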
Another peculiarity of this plot is that the heat capacity vanishes at high
temperature, but this is due to the peculiar property of the system at high
temperature. In a general system, we expect the heat capacity to increase with
temperature.
How much of this is actually physical? The answer is “not much”. This is
not surprising, because we didn’t really do much physics in these computations.
For most solids, the contribution to $C$ from spins is swamped by other effects
such as contributions of phonons (quantized vibrations in the solid) or electrons.
In this case, $C(T)$ is monotonic in $T$.
However, there are some very peculiar materials for which we obtain a small
local maximum in $C(T)$ for very small $T$, before increasing monotonically, which
is due to the contributions of spin:
[Figure: plot of $C$ against $T$.]
1.2 Pressure, volume and the first law of thermodynamics
So far, our system only had one single parameter: the energy. Usually, our
systems have other external parameters which can be varied. Recall that our
“standard” model of a statistical system is a box of gas. If we allow ourselves
to move the walls of the box, then the volume of the system may vary. As we
change the volume, the allowed energy eigenstates will change. So now $\Omega$, and
hence $S$, are functions of energy and volume:
\[
  S(E, V) = k \log \Omega(E, V).
\]
We now need to modify our definition of temperature to account for this dependence:
Definition (Temperature). The temperature of a system with variable volume
is
\[
  \frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_V,
\]
with $V$ fixed.
But now we can define a different thermodynamic quantity by taking the
derivative with respect to V .
Definition (Pressure). We define the pressure of a system with variable volume
to be
\[
  p = T \left(\frac{\partial S}{\partial V}\right)_E.
\]
Is this thing we call the “pressure” anything like what we used to think of
as pressure, namely force per unit area? We will soon see that this is indeed the
case.
We begin by deducing some familiar properties of pressure.
Proposition. Consider as before two interacting systems where the total volume
$V = V_1 + V_2$ is fixed but the individual volumes can vary. Then the entropy of
the combined system is maximized when $T_1 = T_2$ and $p_1 = p_2$.
Proof. We have previously seen that we need $T_1 = T_2$. We also want
\[
  \left(\frac{dS}{dV}\right)_E = 0.
\]
So we need
\[
  \left(\frac{dS_1}{dV}\right)_E = \left(\frac{dS_2}{dV}\right)_E.
\]
Since the temperatures are equal, we know that we also need $p_1 = p_2$.
For a single system, we can use the chain rule to write
\[
  dS = \left(\frac{\partial S}{\partial E}\right)_V dE + \left(\frac{\partial S}{\partial V}\right)_E dV.
\]
Then we can use the definitions of temperature and pressure to write
Proposition (First law of thermodynamics).
\[
  dE = T\, dS - p\, dV.
\]
This law relates two infinitesimally close equilibrium states. This is sometimes
called the fundamental thermodynamics relation.
Example. Consider a box with one side a movable piston of area $A$. We apply
a force $F$ to keep the piston in place.
[Figure: a box of gas closed by a piston, with the applied force $F$ and a displacement $dx$ marked.]
What happens if we move the piston for a little bit? If we move through a
distance $dx$, then the volume of the gas has increased by $A\, dx$. We assume $S$ is
constant. Then the first law tells us
\[
  dE = -pA\, dx.
\]
This formula should be very familiar to us. This is just the work done by the
force, and this must be $F = pA$. So our definition of pressure in terms of partial
derivatives reproduces the mechanics definition of force per unit area.
One has to be cautious here. It is not always true that $-p\, dV$ can be equated
with the work done on a system. For this to be true, we require the change to
be reversible, which is a notion we will study more in depth later. For example,
this is not true when there is friction.
In the case of a reversible change, if we equate $-p\, dV$ with the work done,
then there is only one possible thing $T\, dS$ can be: it is the heat supplied to
the system.
It is important to remember that the first law holds for any change. It’s just
that this interpretation does not.
Example. Consider the irreversible change, where we have a “free expansion”
of gas into vacuum. We have a box
[Figure: a box divided by a partition, with gas on one side and vacuum on the other.]
We have a valve in the partition, and as soon as we open up the valve, the gas
flows to the other side of the box.
In this system, no energy has been supplied. So $dE = 0$. However, $dV > 0$,
as volume clearly increased. But there is no work done on or by the gas. So in
this case, $-p\, dV$ is certainly not the work done. Using the first law, we know
that
\[
  T\, dS = p\, dV.
\]
So as the volume increases, the entropy increases as well.
We now revisit the concept of heat capacity. We previously defined it as
$dE/dT$, but now we need to decide what we want to keep fixed. We can keep $V$
fixed, and get
\[
  C_V = \left(\frac{\partial E}{\partial T}\right)_V = T \left(\frac{\partial S}{\partial T}\right)_V.
\]
While this is the obvious generalization of what we previously had, it is not a
very useful quantity. We do not usually do experiments with fixed volume. For
example, if we do a chemistry experiment in a test tube, say, then the volume is
not fixed, as the gas in the test tube is free to go around. Instead, what is fixed
is the pressure. We can analogously define
\[
  C_p = T \left(\frac{\partial S}{\partial T}\right)_p.
\]
Note that we cannot write this as some sort of $\frac{\partial E}{\partial T}$.
1.3 The canonical ensemble
So far, we have been using the microcanonical ensemble. The underlying assumption
is that our system is totally isolated, and we know what the energy of
the system is. However, in real life, this is most likely not the case. Even if we
produce a sealed box of gas, and try to do experiments with it, the system is
not isolated. It can exchange heat with the environment.
On the other hand, there is one thing that is fixed: the temperature.
The box is in thermal equilibrium with the environment. If we assume the
environment is “large”, then we can assume that the environment is not really
affected by the box, and so the box is forced to have the same temperature as
the environment.
Let’s try to study this property. Consider a system $S$ interacting with a
much larger system $R$. We call this $R$ a heat reservoir. Since $R$ is assumed to
be large, the energy of $S$ is negligible compared to that of $R$, and we will assume $R$ always has
a fixed temperature $T$. Then in this set up, the systems can exchange energy
without changing $T$.
As before, we let $|n\rangle$ be a basis of microstates with energy $E_n$. We suppose
we fix a total energy $E_{\mathrm{total}}$, and we want to find the total number of microstates
of the combined system with this total energy. To do so, we fix some state $|n\rangle$
of $S$, and ask how many states of $S + R$ there are for which $S$ is in $|n\rangle$. We then
later sum over all $|n\rangle$.
By definition, we can write this as
\[
  \Omega_R(E_{\mathrm{total}} - E_n) = \exp\left(k^{-1} S_R(E_{\mathrm{total}} - E_n)\right).
\]
By assumption, we know $R$ is a much larger system than $S$. So we only get
significant contributions when $E_n \ll E_{\mathrm{total}}$. In these cases, we can Taylor expand
to write
\[
  \Omega_R(E_{\mathrm{total}} - E_n) = \exp\left(k^{-1} S_R(E_{\mathrm{total}}) - k^{-1} \left(\frac{\partial S_R}{\partial E}\right)_V E_n\right).
\]
But we know what $\frac{\partial S_R}{\partial E}$ is: it is just $T^{-1}$. So we finally get
\[
  \Omega_R(E_{\mathrm{total}} - E_n) = e^{k^{-1} S_R(E_{\mathrm{total}})} e^{-\beta E_n},
\]
where we define
Notation ($\beta$).
\[
  \beta = \frac{1}{kT}.
\]
Note that while we derived this formula under the assumption that $E_n$
is small, it is effectively still valid when $E_n$ is large, because both sides are very
tiny, and even if they are very tiny in different ways, it doesn’t matter when we
add them all up.
Now we can write the total number of microstates of $S + R$ as
\[
  \Omega(E_{\mathrm{total}}) = \sum_n \Omega_R(E_{\mathrm{total}} - E_n) = e^{k^{-1} S_R(E_{\mathrm{total}})} \sum_n e^{-\beta E_n}.
\]
Note that we are summing over all states, not energy.
We now use the fundamental assumption of statistical mechanics that all
states of $S + R$ are equally likely. Then we know the probability that $S$ is in
state $|n\rangle$ is
\[
  p(n) = \frac{\Omega_R(E_{\mathrm{total}} - E_n)}{\Omega(E_{\mathrm{total}})} = \frac{e^{-\beta E_n}}{\sum_k e^{-\beta E_k}}.
\]
This is called the Boltzmann distribution for the canonical ensemble. Note that
at the end, all the details have dropped out apart from the temperature. This
describes the energy distribution of a system with fixed temperature.
Note that if $E_n \gg kT = \frac{1}{\beta}$, then the exponential is small. So only states
with $E_n \lesssim kT$ have significant probability. In particular, as $T \to 0$, we have
$\beta \to \infty$, and so only the ground state can be occupied.
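The behaviour of the Boltzmann distribution at low and high temperature can be seen on a small made-up spectrum. This is a sketch only; the function name and the four equally spaced levels in the usage example are assumptions for illustration.

```python
import math

def boltzmann_probabilities(energies, beta):
    """p(n) = exp(-beta E_n) / sum_k exp(-beta E_k)."""
    e_min = min(energies)
    # subtracting e_min rescales numerator and denominator equally, avoiding underflow
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

levels = [0.0, 1.0, 2.0, 3.0]  # hypothetical energy levels, arbitrary units
for beta in (50.0, 1.0, 1e-9):
    print(beta, boltzmann_probabilities(levels, beta))
```

For large $\beta$ essentially all the probability sits on the ground state, while for $\beta \to 0$ the four states become equally likely.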
We now define an important quantity.
Definition (Partition function). The partition function is
\[
  Z = \sum_n e^{-\beta E_n}.
\]
It turns out most of the things we are interested in can be expressed
in terms of $Z$ and its derivatives. Thus, to understand a general system, what
we will do is to compute the partition function and express it in some familiar
form. Then we can use standard calculus to obtain quantities we are interested
in. To begin with, we have
\[
  p(n) = \frac{e^{-\beta E_n}}{Z}.
\]
Proposition. For two non-interacting systems, we have $Z(\beta) = Z_1(\beta) Z_2(\beta)$.
Proof. Since the systems are not interacting, we have
\[
  Z = \sum_{n, m} e^{-\beta(E^{(1)}_n + E^{(2)}_m)} = \left(\sum_n e^{-\beta E^{(1)}_n}\right) \left(\sum_m e^{-\beta E^{(2)}_m}\right) = Z_1 Z_2.
\]
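The factorization is easy to confirm on a small example. The sketch below uses two made-up spectra and an arbitrary value of $\beta$; all names and numbers are illustrative assumptions.

```python
import math
from itertools import product

def partition_function(energies, beta):
    """Z = sum over states of exp(-beta E)."""
    return sum(math.exp(-beta * e) for e in energies)

levels_1 = [0.0, 1.0, 2.5]   # hypothetical spectrum of system 1
levels_2 = [0.3, 0.7]        # hypothetical spectrum of system 2
beta = 1.3

# Each state of the combined system is a pair (n, m) with energy E1_n + E2_m.
combined = [e1 + e2 for e1, e2 in product(levels_1, levels_2)]
z_combined = partition_function(combined, beta)
z_product = partition_function(levels_1, beta) * partition_function(levels_2, beta)
print(z_combined, z_product)
```

The two numbers agree up to floating-point rounding, since the double sum over pairs of states is exactly the product of the two single sums.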
Note that in general, energy is not fixed, but we can compute the average
value:
\[
  \langle E \rangle = \sum_n p(n) E_n = \sum_n \frac{E_n e^{-\beta E_n}}{Z} = -\frac{\partial}{\partial \beta} \log Z.
\]
This partial derivative is taken with all $E_i$ fixed. Of course, in the real world,
we don’t get to directly change the energy eigenstates and see what happens.
However, they do depend on some “external” parameters, such as the volume $V$,
the magnetic field $B$ etc. So when we take this derivative, we have to keep all
those parameters fixed.
We look at the simple case where $V$ is the only parameter we can vary. Then
$Z = Z(\beta, V)$. We can rewrite the previous formula as
\[
  \langle E \rangle = -\left(\frac{\partial}{\partial \beta} \log Z\right)_V.
\]
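This gives us the average, but we also want to know the variance of $E$. We have
\[
  \Delta E^2 = \langle (E - \langle E \rangle)^2 \rangle = \langle E^2 \rangle - \langle E \rangle^2.
\]
On the first example sheet, we calculate that this is in fact
\[
  \Delta E^2 = \left(\frac{\partial^2}{\partial \beta^2} \log Z\right)_V = -\left(\frac{\partial \langle E \rangle}{\partial \beta}\right)_V.
\]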
We can now convert $\beta$-derivatives to $T$-derivatives using the chain rule. Then
we get
\[
  \Delta E^2 = kT^2 \left(\frac{\partial \langle E \rangle}{\partial T}\right)_V = kT^2 C_V.
\]
From this, we can learn something important. We would expect $\langle E \rangle \sim N$, the
number of particles of the system. But we also know $C_V \sim N$. So
\[
  \frac{\Delta E}{\langle E \rangle} \sim \frac{1}{\sqrt{N}}.
\]
Therefore, the fluctuations are negligible if $N$ is large enough. This is called the
thermodynamic limit $N \to \infty$. In this limit, we can ignore the fluctuations in
energy. So we expect the microcanonical ensemble and the canonical ensemble to
give the same result. And for all practical purposes, $N \sim 10^{23}$ is a large number.
Because of that, we are often going to just write $E$ instead of $\langle E \rangle$.
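The $1/\sqrt{N}$ scaling can be illustrated numerically. The sketch below assumes $N$ independent two-state particles with energies $0$ and $\varepsilon$, for which $\log Z = N \log(1 + e^{-\beta\varepsilon})$, and takes the $\beta$-derivatives by central finite differences; the function names and step size are illustrative choices.

```python
import math

def log_z(beta, n, eps=1.0):
    """log Z for N independent two-state particles: N log(1 + exp(-beta eps))."""
    return n * math.log1p(math.exp(-beta * eps))

def mean_and_spread(beta, n, h=1e-3):
    """<E> = -d(log Z)/d(beta) and Delta E = sqrt(d^2(log Z)/d(beta)^2), by finite differences."""
    up, mid, down = log_z(beta + h, n), log_z(beta, n), log_z(beta - h, n)
    mean = -(up - down) / (2 * h)
    variance = (up - 2 * mid + down) / (h * h)
    return mean, math.sqrt(variance)

for n in (10**2, 10**4, 10**6):
    mean, spread = mean_and_spread(1.0, n)
    print(n, spread / mean, (spread / mean) * math.sqrt(n))
```

The relative fluctuation drops by a factor of 10 each time $N$ grows by a factor of 100, while the last column stays fixed, as expected for $\Delta E / \langle E \rangle \sim 1/\sqrt{N}$.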
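Example. Suppose we had particles with
\[
  E_\uparrow = \varepsilon, \qquad E_\downarrow = 0.
\]
So for one particle, we have
\[
  Z_1 = \sum_n e^{-\beta E_n} = 1 + e^{-\beta\varepsilon} = 2 e^{-\beta\varepsilon/2} \cosh\frac{\beta\varepsilon}{2}.
\]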
If we have $N$ non-interacting particles, then since the partition function is
multiplicative, we have
\[
  Z = Z_1^N = 2^N e^{-\beta\varepsilon N/2} \cosh^N\frac{\beta\varepsilon}{2}.
\]
From the partition function, we can compute
\[
  \langle E \rangle = -\frac{d \log Z}{d\beta} = \frac{N\varepsilon}{2}\left(1 - \tanh\frac{\beta\varepsilon}{2}\right).
\]
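As a quick numerical sanity check, the $\tanh$ form can be compared with the expression $E = N\varepsilon/(e^{\varepsilon/kT} + 1)$ found earlier from the microcanonical ensemble. The function names and the sample values below are assumptions for illustration.

```python
import math

def energy_canonical(beta, n, eps):
    """<E> = (N eps / 2) (1 - tanh(beta eps / 2)), from the partition function."""
    return 0.5 * n * eps * (1.0 - math.tanh(0.5 * beta * eps))

def energy_microcanonical(beta, n, eps):
    """E = N eps / (exp(beta eps) + 1), from inverting 1/T = dS/dE."""
    return n * eps / (math.exp(beta * eps) + 1.0)

for beta in (0.01, 0.5, 1.0, 5.0):
    print(beta, energy_canonical(beta, 1000, 2.0), energy_microcanonical(beta, 1000, 2.0))
```

The two columns coincide, which reflects the identity $1 - \tanh(y) = 2/(e^{2y} + 1)$ with $y = \beta\varepsilon/2$.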
We can check that this agrees with the value we computed with the microcanon-
ical ensemble (where we wrote the result using expo