Part II — Statistical Physics

Based on lectures by H. S. Reall

Notes taken by Dexter Chua

Lent 2017

These notes are not endorsed by the lecturers, and I have modified them (often

significantly) after lectures. They are nowhere near accurate representations of what

was actually lectured, and in particular, all errors are almost surely mine.

Part IB Quantum Mechanics and “Multiparticle Systems” from Part II Principles of Quantum Mechanics are essential.

Fundamentals of statistical mechanics

Microcanonical ensemble. Entropy, temperature and pressure. Laws of thermodynamics. Example of paramagnetism. Boltzmann distribution and canonical ensemble. Partition function. Free energy. Specific heats. Chemical potential. Grand canonical ensemble. [5]

Classical gases

Density of states and the classical limit. Ideal gas. Maxwell distribution. Equipartition of energy. Diatomic gas. Interacting gases. Virial expansion. Van der Waals equation of state. Basic kinetic theory. [3]

Quantum gases

Density of states. Planck distribution and black body radiation. Debye model of

phonons in solids. Bose–Einstein distribution. Ideal Bose gas and Bose–Einstein

condensation. Fermi-Dirac distribution. Ideal Fermi gas. Pauli paramagnetism. [8]

Thermodynamics

Thermodynamic temperature scale. Heat and work. Carnot cycle. Applications of laws of thermodynamics. Thermodynamic potentials. Maxwell relations. [4]

Phase transitions

Liquid-gas transitions. Critical point and critical exponents. Ising model. Mean field

theory. First and second order phase transitions. Symmetries and order parameters. [4]

Contents

0 Introduction

1 Fundamentals of statistical mechanics

1.1 Microcanonical ensemble

1.2 Pressure, volume and the first law of thermodynamics

1.3 The canonical ensemble

1.4 Helmholtz free energy

1.5 The chemical potential and the grand canonical ensemble

1.6 Extensive and intensive properties

2 Classical gases

2.1 The classical partition function

2.2 Monoatomic ideal gas

2.3 Maxwell distribution

2.4 Diatomic gases

2.5 Interacting gases

3 Quantum gases

3.1 Density of states

3.2 Black-body radiation

3.3 Phonons and the Debye model

3.4 Quantum ideal gas

3.5 Bosons

3.6 Bose–Einstein condensation

3.7 Fermions

3.8 Pauli paramagnetism

4 Classical thermodynamics

4.1 Zeroth and first law

4.2 The second law

4.3 Carnot cycles

4.4 Entropy

4.5 Thermodynamic potentials

4.6 Third law of thermodynamics

5 Phase transitions

5.1 Liquid-gas transition

5.2 Critical point and critical exponents

5.3 The Ising model

5.4 Landau theory

0 Introduction

In all of our previous physics courses, we mostly focused on “microscopic” physics. For example, we used the Schrödinger equation to describe the mechanics of a single hydrogen atom. We had to do quite a bit of hard work to figure out exactly how it behaved. But what if we wanted to study a much larger system?

Let’s say we have a box of hydrogen gas, consisting of $\sim 10^{23}$ molecules. If we tried to solve the Schrödinger equation for this system, then, even numerically, it would be completely intractable.

So how can we understand such a system? Of course, given these $\sim 10^{23}$ molecules, we are not interested in the detailed dynamics of each molecule. We are only interested in some “macroscopic” phenomena. For example, we might want to know its pressure and temperature. So what we want to do is to describe the whole system using just a few “macroscopic” variables, and still capture the main properties we are interested in.

In the first part of the course, we will approach this subject rather “rigorously”.

We will start from some microscopic laws we know about, and use them to deduce

properties of the macroscopic system. This turns out to be rather successful,

at least if we treat things sufficiently properly. Surprisingly, it even applies to

scenarios where it is absolutely not obvious it should apply!

Historically, this was not how statistical physics was first developed. Instead, we only had the “macroscopic” properties we tried to understand. Back in those days, we didn’t even know things were made up of atoms! We will try to understand statistical phenomena in purely macroscopic and “wordy” terms, and it turns out we can reproduce the same predictions as before.

Finally, we will turn our attention to something rather different in nature —

phase transitions. As we all know, water turns from liquid to gas as we raise the

temperature. This transition is an example of a phase transition. Of course, this is still a “macroscopic” phenomenon, fitting in line with the theme of the course.

It doesn’t make sense to talk about a single molecule transitioning from liquid

to gas. Interestingly, when we look at many different systems undergoing phase

transitions, they seem to all behave “in the same way” near the phase transition.

We will figure out why. At least, partially.

Statistical physics is important. As we mentioned, we can use this to make

macroscopic predictions from microscopic laws. Thus, this allows us to put our

microscopic laws to experimental test. It turns out methods of statistical physics

have some far-reaching applications elsewhere. It can be used to study black

holes, or even biology!

1 Fundamentals of statistical mechanics

1.1 Microcanonical ensemble

We begin by considering a rather general system. Suppose we have an isolated system containing $N$ particles, where $N$ is a Large Number™. The canonical example to keep in mind is a box of gas detached from reality.

Definition (Microstate). The microstate of a system is the actual (quantum) state of the system. This gives a complete description of the system.

As one would expect, the microstate is very complicated and infeasible to

describe, especially when we have many particles. In statistical physics, we

observe that many microstates are indistinguishable macroscopically. Thus, we

only take note of some macroscopically interesting quantities, and use these

macroscopic quantities to put a probability distribution on the microstates.

More precisely, we let $\{|n\rangle\}$ be a basis of normalized energy eigenstates, say

$$\hat{H} |n\rangle = E_n |n\rangle.$$

We let $p(n)$ be the probability that the microstate is $|n\rangle$. Note that this probability is not the quantum probability we are used to. It is some probability assigned to reflect our ignorance of the system. Given such probabilities, we can define the expectation of an operator in the least imaginative way:

Definition (Expectation value). Given a probability distribution $p(n)$ on the states, the expectation value of an operator $\mathcal{O}$ is

$$\langle \mathcal{O} \rangle = \sum_n p(n) \langle n | \mathcal{O} | n \rangle.$$

If one knows about density operators, we can describe the system as a mixed state with density operator

$$\rho = \sum_n p(n)\, |n\rangle \langle n|.$$

There is an equivalent way of looking at this. We can consider an ensemble consisting of $W \gg 1$ independent copies of our system such that $W p(n)$ many copies are in the microstate $|n\rangle$. Then the expectation is just the average over the ensemble. For most purposes, how we think about this doesn’t really matter.

We shall further assume our system is in equilibrium, i.e. the probability distribution $p(n)$ does not change in time. So in particular $\langle \mathcal{O} \rangle$ is independent of

time. Of course, this does not mean the particles stop moving. The particles are

still whizzing around. It’s just that the statistical distribution does not change.

In this course, we will mostly be talking about equilibrium systems. When we

get out of equilibrium, things become very complicated.

The idea of statistical physics is that we have some partial knowledge about

the system. For example, we might know its total energy. The microstates that

are compatible with this partial knowledge are called accessible. The fundamental

assumption of statistical mechanics is then

An isolated system in equilibrium is equally likely to be in any of the

accessible microstates.

Thus, different probability distributions, or different ensembles, are distinguished by the partial knowledge we have.

Definition (Microcanonical ensemble). In a microcanonical ensemble, we know the energy is between $E$ and $E + \delta E$, where $\delta E$ is the accuracy of our measuring device. The accessible microstates are those with energy $E \le E_n \le E + \delta E$. We let $\Omega(E)$ be the number of such states.

In practice, $\delta E$ is much much larger than the spacing of energy levels, and so $\Omega(E) \gg 1$. A priori, it seems like our theory will depend on what the value of $\delta E$ is, but as we develop the theory, we will see that this doesn’t really matter.

It is crucial here that we are working with a quantum system, so the possible states are discrete, and it makes sense to count the number of states. We need to do quite a bit more work if we want to do this classically.

Example. Suppose we have $N = 10^{23}$ particles, and each particle can occupy two states $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$, which have the same energy $\varepsilon$. Then we always have $N\varepsilon$ total energy, and we have

$$\Omega(N\varepsilon) = 2^{10^{23}}.$$

This is a fantastically huge, mind-boggling number. This is the kind of number we are talking about.
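To get a feel for the size of this number, we can at least compute how many digits it has. A quick illustrative check (all values purely for illustration):

```python
import math

# Omega = 2^N for N = 10^23 is far too large to write out,
# but its base-10 logarithm gives the number of digits.
N = 10**23
log10_Omega = N * math.log10(2)   # log10 of 2^(10^23)
print(log10_Omega)                # roughly 3.0e22, i.e. ~3 x 10^22 digits
```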

By the fundamental assumption, we can write

$$p(n) = \begin{cases} \frac{1}{\Omega(E)} & \text{if } E \le E_n \le E + \delta E\\ 0 & \text{otherwise} \end{cases}.$$

This is the characteristic distribution of the microcanonical ensemble.

It turns out it is not very convenient to work with $\Omega(E)$. In particular, $\Omega(E)$ is not linear in $N$, the number of particles. Instead, it scales as an exponential of $N$. So we take the logarithm.

Definition (Boltzmann entropy). The (Boltzmann) entropy is defined as

$$S(E) = k \log \Omega(E),$$

where $k = 1.381 \times 10^{-23}\,\mathrm{J\,K^{-1}}$ is Boltzmann’s constant.

This annoying constant $k$ is necessary because when people started doing thermodynamics, they didn’t know about statistical physics, and picked weird conventions.

We wrote our expressions as $S(E)$, instead of $S(E, \delta E)$. As promised, the value of $\delta E$ doesn’t really matter. We know that $\Omega(E)$ will scale approximately linearly with $\delta E$. So if we, say, double $\delta E$, then $S(E)$ will increase by $k \log 2$, which is incredibly tiny compared to $S(E) = k \log \Omega(E)$. So it doesn’t matter which value of $\delta E$ we pick.

Even if you are not so convinced that multiplying $10^{10^{23}}$ by a factor of 2 or adding $\log 2$ to $10^{23}$ does not really matter, you should be reassured that at the end, we will rarely talk about $\Omega(E)$ or $S(E)$ itself. Instead, we will often divide two different $\Omega$’s to get probabilities, or differentiate $S$ to get other interesting quantities. In these cases, the factors really do not matter.

The second nice property of the entropy is that it is additive — if we have two non-interacting systems with energies $E^{(1)}, E^{(2)}$, then the total number of states of the combined system is

$$\Omega(E^{(1)}, E^{(2)}) = \Omega_1(E^{(1)})\, \Omega_2(E^{(2)}).$$

So when we take the logarithm, we find

$$S(E^{(1)}, E^{(2)}) = S_1(E^{(1)}) + S_2(E^{(2)}).$$

Of course, this is not very interesting, until we bring our systems together and let them interact with each other.

Interacting systems

Suppose we bring the two systems together, and let them exchange energy. Then the energy of the individual systems is no longer fixed, and only the total energy

$$E_{\text{total}} = E^{(1)} + E^{(2)}$$

is fixed. Then we find that

$$\Omega(E_{\text{total}}) = \sum_{E_i} \Omega_1(E_i)\, \Omega_2(E_{\text{total}} - E_i),$$

where we sum over all possible energy levels of the first system. In terms of the entropy of the system, we have

$$\Omega(E_{\text{total}}) = \sum_{E_i} \exp\left(\frac{S_1(E_i)}{k} + \frac{S_2(E_{\text{total}} - E_i)}{k}\right).$$

We can be a bit more precise with what the sum means. We are not summing over all eigenstates. Recall that we have previously fixed an accuracy $\delta E$. So we can imagine dividing the whole energy spectrum into chunks of size $\delta E$, and here we are summing over the chunks.

We know that $S_{1,2}/k \sim N_{1,2} \sim 10^{23}$, which is a ridiculously large number. So the sum is overwhelmingly dominated by the term with the largest exponent. Suppose this is maximized when $E_i = E_*$. Then we have

$$S(E_{\text{total}}) = k \log \Omega(E_{\text{total}}) \approx S_1(E_*) + S_2(E_{\text{total}} - E_*).$$

Again, we are not claiming that only the factor coming from $E_*$ has significant contribution. Maybe one or two energy levels next to $E_*$ are also very significant, but taking these into account will only multiply $\Omega(E_{\text{total}})$ by a (relatively) small constant, hence contributes a small additive factor to $S(E_{\text{total}})$, which can be neglected.

Now given any $E^{(1)}$, what is the probability that the actual energy of the first system is $E^{(1)}$? For convenience, we write $E^{(2)} = E_{\text{total}} - E^{(1)}$, the energy of the second system. Then the probability desired is

$$\frac{\Omega_1(E^{(1)})\, \Omega_2(E^{(2)})}{\Omega(E_{\text{total}})} = \exp\left(\frac{1}{k}\left(S_1(E^{(1)}) + S_2(E^{(2)}) - S(E_{\text{total}})\right)\right).$$

Again recall that the numbers at stake are unimaginably huge. So if $S_1(E^{(1)}) + S_2(E^{(2)})$ is even slightly different from $S(E_{\text{total}})$, then the probability is effectively zero. And by the above, for the two quantities to be close, we need $E^{(1)} = E_*$. So for all practical purposes, the value of $E^{(1)}$ is fixed to $E_*$.

Now imagine we prepare two systems separately with energies $E^{(1)}$ and $E^{(2)}$ such that $E^{(1)} \neq E_*$, and then bring the systems together. Then we are no longer in equilibrium. $E^{(1)}$ will change until it takes value $E_*$, and the entropy of the system will increase from $S_1(E^{(1)}) + S_2(E^{(2)})$ to $S_1(E_*) + S_2(E_{\text{total}} - E_*)$. In particular, the entropy increases.

Law (Second law of thermodynamics). The entropy of an isolated system increases (or remains the same) in any physical process. In equilibrium, the entropy attains its maximum value.

This prediction is verified by virtually all observations of physics.

While our derivation did not show it is impossible to violate the second law

of thermodynamics, it is very very very very very very very very unlikely to be

violated.

Temperature

Having defined entropy, the next interesting thing we can define is the temperature. We assume that $S$ is a smooth function of $E$. Then we can define the temperature as follows:

Definition (Temperature). The temperature is defined to be

$$\frac{1}{T} = \frac{dS}{dE}.$$

Why do we call this the temperature? Over the course, we will see that

this quantity we decide to call “temperature” does behave as we would expect

temperature to behave. It is difficult to give further justification of this definition,

because even though we vaguely have some idea what temperature is like in

daily life, those ideas are very far from anything we can concretely write down

or even describe.

One reassuring property we can prove is the following:

Proposition. Two interacting systems in equilibrium have the same temperature.

Proof. Recall that the equilibrium energy $E_*$ is found by maximizing

$$S_1(E_i) + S_2(E_{\text{total}} - E_i)$$

over all possible $E_i$. Thus, at an equilibrium, the derivative of this expression has to vanish, and the derivative is exactly

$$\left.\frac{dS_1}{dE}\right|_{E^{(1)} = E_*} - \left.\frac{dS_2}{dE}\right|_{E^{(2)} = E_{\text{total}} - E_*} = 0.$$

So we need

$$\frac{1}{T_1} = \frac{1}{T_2}.$$

In other words, we need $T_1 = T_2$.
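The maximization in this proof can be illustrated numerically. The sketch below is an assumed setup (not from the notes): two two-state systems of sizes $N_1, N_2$ share a fixed total energy, using the two-state entropy formula derived in the example later in this section, in units $k = \varepsilon = 1$. Maximizing $S_1(E) + S_2(E_{\text{total}} - E)$ over a grid should equalize the energy per particle, and hence the temperature, of the two systems:

```python
import math

def S(E, N):
    # entropy of a two-state system of N spins (units k = epsilon = 1)
    x = E / N
    return -N * ((1 - x) * math.log(1 - x) + x * math.log(x))

N1, N2, E_tot = 300.0, 700.0, 400.0          # illustrative sizes and total energy
Es = [E_tot * i / 10000 for i in range(1, 10000)]
Es = [E for E in Es if 0 < E < N1 and 0 < E_tot - E < N2]   # stay in range
E_star = max(Es, key=lambda E: S(E, N1) + S(E_tot - E, N2))

# at the maximum, the energy per spin (hence 1/T) matches for the two systems
print(E_star / N1, (E_tot - E_star) / N2)
```

Since $1/T = \log(N\varepsilon/E - 1)$ in these units depends only on the energy per spin, equal energy fractions mean equal temperatures.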

Now suppose initially, our systems have different temperature. We would

expect energy to flow from the hotter system to the cooler system. This is indeed

the case.

Proposition. Suppose two systems with initial energies $E^{(1)}, E^{(2)}$ and temperatures $T_1, T_2$ are put into contact. If $T_1 > T_2$, then energy will flow from the first system to the second.

Proof. Since we are not in equilibrium, there must be some energy transfer from one system to the other. Suppose after time $\delta t$, the energy changes by

$$E^{(1)} \mapsto E^{(1)} + \delta E, \quad E^{(2)} \mapsto E^{(2)} - \delta E,$$

keeping the total energy constant. Then the change in entropy is given by

$$\delta S = \frac{dS_1}{dE}\, \delta E^{(1)} + \frac{dS_2}{dE}\, \delta E^{(2)} = \left(\frac{1}{T_1} - \frac{1}{T_2}\right) \delta E.$$

By assumption, we know

$$\frac{1}{T_1} - \frac{1}{T_2} < 0,$$

but by the second law of thermodynamics, we know $\delta S$ must be positive. So we must have $\delta E < 0$, i.e. energy flows from the first system to the second.

So this notion of temperature agrees with the basic properties of temperature we expect.

Note that these properties we’ve derived only depend on the fact that $\frac{1}{T}$ is a monotonically decreasing function of $T$. In principle, we could have picked any monotonically decreasing function of $T$, and set it to $\frac{dS}{dE}$. We will later see that this definition agrees with the other definitions of temperature we have previously seen, e.g. via the ideal gas law, and so this is indeed the “right” one.

Heat capacity

As we will keep on doing later, we can take different derivatives to get different interesting quantities. This time, we are going to get heat capacity. Recall that $T$ was a function of energy, $T = T(E)$. We will assume that we can invert this function, at least locally, to get $E$ as a function of $T$.

Definition (Heat capacity). The heat capacity of a system is

$$C = \frac{dE}{dT}.$$

The specific heat capacity is

$$\frac{C}{\text{mass of system}}.$$

The specific heat capacity is a property of the substance that makes up the system, and not of how much stuff there is, as both $C$ and the mass scale linearly with the size of the system.

This is some quantity we can actually physically measure. We can measure

the temperature with a thermometer, and it is usually not too difficult to see how

much energy we are pumping into a system. Then by measuring the temperature

change, we can figure out the heat capacity.

In doing so, we can indirectly measure the entropy, or at least the changes in entropy. Note that we have

$$\frac{dS}{dT} = \frac{dS}{dE} \frac{dE}{dT} = \frac{C}{T}.$$

Integrating up, if the temperature changes from $T_1$ to $T_2$, we know

$$\Delta S = \int_{T_1}^{T_2} \frac{C(T)}{T}\, dT.$$
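As a sanity check of this integral, note that for a constant heat capacity it gives $\Delta S = C \log(T_2/T_1)$. A crude midpoint-rule integration confirms this (the values $C = 2$, $T_1 = 1$, $T_2 = 3$ are purely illustrative):

```python
import math

# numerically integrate C/T from T1 to T2 for constant C,
# and compare against the closed form C * log(T2/T1)
C, T1, T2, n = 2.0, 1.0, 3.0, 100000
h = (T2 - T1) / n
dS = sum(C / (T1 + (i + 0.5) * h) * h for i in range(n))   # midpoint rule

print(dS, C * math.log(T2 / T1))
```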

As promised, by measuring heat capacity experimentally, we can measure the

change in entropy.

The heat capacity is useful in other ways. Recall that to find the equilibrium energy $E_*$, a necessary condition was that it satisfies

$$\frac{dS_1}{dE} - \frac{dS_2}{dE} = 0.$$

However, we only know that the solution is an extremum, and not necessarily a maximum. To figure out if it is the maximum, we take the second derivative. Note that for a single system, we have

$$\frac{d^2 S}{dE^2} = \frac{d}{dE}\left(\frac{1}{T}\right) = -\frac{1}{T^2 C}.$$

Applying this to two systems, one can check that entropy is maximized at $E^{(1)} = E_*$ if $C_1, C_2 > 0$. The actual computation is left as an exercise on the first example sheet.

Let’s look at some actual systems and try to calculate these quantities.

Example. Consider a 2-state system, where we have $N$ non-interacting particles with fixed positions. Each particle is either in $|{\uparrow}\rangle$ or $|{\downarrow}\rangle$. We can think of these as spins, for example. These two states have different energies

$$E_\uparrow = \varepsilon, \quad E_\downarrow = 0.$$

We let $N_\uparrow$ and $N_\downarrow$ be the number of particles in $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$ respectively. Then the total energy of the system is

$$E = \varepsilon N_\uparrow.$$

We want to calculate this quantity $\Omega(E)$. Here in this very contrived example, it is convenient to pick $\delta E < \varepsilon$, so that $\Omega(E)$ is just the number of ways of choosing $N_\uparrow$ particles from $N$. By basic combinatorics, we have

$$\Omega(E) = \frac{N!}{N_\uparrow!\, (N - N_\uparrow)!},$$

and

$$S(E) = k \log \frac{N!}{N_\uparrow!\, (N - N_\uparrow)!}.$$

This is not an incredibly useful formula. Since we assumed that $N$ and $N_\uparrow$ are huge, we can use Stirling’s approximation

$$N! = \sqrt{2\pi N}\, N^N e^{-N} \left(1 + O\left(\frac{1}{N}\right)\right).$$

Then we have

$$\log N! = N \log N - N + \frac{1}{2} \log(2\pi N) + O\left(\frac{1}{N}\right).$$
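We can check numerically how good this approximation is; even for a modest $N$, the error is only $O(1/N)$ (a quick illustrative check, using $N = 100$):

```python
import math

# Compare log(N!) with Stirling's approximation
# N log N - N + (1/2) log(2 pi N); the error should be about 1/(12N).
N = 100
exact = math.lgamma(N + 1)   # log(N!) via the log-gamma function
stirling = N * math.log(N) - N + 0.5 * math.log(2 * math.pi * N)

error = exact - stirling
print(error)                 # small and positive, roughly 1/(12*100)
```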

We just use the approximation three times to get

$$\begin{aligned}
S(E) &= k \left(N \log N - N - N_\uparrow \log N_\uparrow + N_\uparrow - (N - N_\uparrow) \log(N - N_\uparrow) + N - N_\uparrow\right)\\
&= -k \left((N - N_\uparrow) \log \frac{N - N_\uparrow}{N} + N_\uparrow \log \frac{N_\uparrow}{N}\right)\\
&= -kN \left(\left(1 - \frac{E}{N\varepsilon}\right) \log\left(1 - \frac{E}{N\varepsilon}\right) + \frac{E}{N\varepsilon} \log \frac{E}{N\varepsilon}\right).
\end{aligned}$$

This is better, but we can get much more out of it if we plot it:

[Plot: $S(E)$ against $E$, vanishing at $E = 0$ and $E = N\varepsilon$, with maximum value $Nk \log 2$ at $E = N\varepsilon/2$.]

The temperature is

$$\frac{1}{T} = \frac{dS}{dE} = \frac{k}{\varepsilon} \log\left(\frac{N\varepsilon}{E} - 1\right),$$

and we can invert to get

$$\frac{N_\uparrow}{N} = \frac{E}{N\varepsilon} = \frac{1}{e^{\varepsilon/kT} + 1}.$$
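As a quick consistency check, substituting $E/N\varepsilon = 1/(e^{\varepsilon/kT} + 1)$ back into the expression for $1/T$ should recover the same temperature. A one-line verification in units $k = \varepsilon = 1$ (the value of $T$ is arbitrary):

```python
import math

# pick a temperature, compute the occupation fraction, and invert back
T = 0.7
x = 1.0 / (math.exp(1.0 / T) + 1.0)    # E / (N epsilon) at temperature T
one_over_T = math.log(1.0 / x - 1.0)   # (k/eps) log(N eps / E - 1)

print(one_over_T, 1.0 / T)
```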

Suppose we get to control the temperature of the system, e.g. if we put it in contact with a heat reservoir. What happens as we vary our temperature?

– As $T \to 0$, we have $N_\uparrow \to 0$. So the states all try to go to the ground state.

– As $T \to \infty$, we find $N_\uparrow/N \to \frac{1}{2}$, and $E \to N\varepsilon/2$.

The second result is a bit weird. As $T \to \infty$, we might expect all things to go to the maximum energy level, and not just half of them.

To confuse ourselves further, we can plot another graph, of $\frac{1}{T}$ against $E$. The graph looks like

[Plot: $1/T$ against $E$, positive for $E < N\varepsilon/2$, zero at $E = N\varepsilon/2$, and negative for $E > N\varepsilon/2$.]

We see that having energy $> N\varepsilon/2$ corresponds to negative temperature, and to go from positive temperature to negative temperature, we need to pass through infinite temperature. So in some sense, negative temperature is “hotter” than infinite temperature.

What is going on? By definition, negative $T$ means $\Omega(E)$ is a decreasing function of energy. This is a very unusual situation. In this system, all the particles are fixed, and have no kinetic energy. Consequently, the possible energy levels are bounded. If we included kinetic energy in the system, then the kinetic energy could be arbitrarily large. In this case, $\Omega(E)$ is usually an increasing function of $E$.

Negative $T$ has indeed been observed experimentally. This requires setups where the kinetic energy is not so important in the range of energies we are talking about. One particular scenario where this is observed is in nuclear spins of crystals in magnetic fields. If we have a magnetic field, then naturally, most of the spins will align with the field. We now suddenly flip the field, and then most of the spins are anti-aligned, and this can give us a negative temperature state.

Now we can’t measure negative temperature by sticking a thermometer into the material and getting a negative answer. Something that can be interestingly measured is the heat capacity

$$C = \frac{dE}{dT} = \frac{N\varepsilon^2}{kT^2} \frac{e^{\varepsilon/kT}}{(e^{\varepsilon/kT} + 1)^2}.$$
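We can locate the maximum of this curve numerically, in units $N = \varepsilon = k = 1$ (a rough grid scan, purely illustrative); the peak indeed sits at $kT \sim \varepsilon$:

```python
import math

# heat capacity of the two-state system, units N = epsilon = k = 1
def C(T):
    b = 1.0 / T   # beta * epsilon
    return b * b * math.exp(b) / (math.exp(b) + 1.0) ** 2

Ts = [0.01 * i for i in range(1, 301)]   # scan kT from 0.01 to 3
T_peak = max(Ts, key=C)
print(T_peak)                            # of order 1, i.e. kT ~ epsilon
```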

This again exhibits some peculiar properties. We begin by looking at a plot:

[Plot: $C$ against $T$, rising from zero, peaking around $kT \sim \varepsilon$, then decaying back to zero.]

By looking at the formula, we see that the location of the maximum, $kT$, is related to the microscopic $\varepsilon$. If we know the value of $k$, then we can use the macroscopic observation of $C$ to deduce something about the microscopic $\varepsilon$.

Note that $C$ is proportional to $N$. As $T \to 0$, we have

$$C \propto T^{-2} e^{-\varepsilon/kT},$$

and this is a function that decreases very rapidly as $T \to 0$; in fact, this is one of the favorite examples in analysis where all derivatives of the function at 0 vanish. Ultimately, this is due to the energy gap between the ground state and the first excited state.

Another peculiarity of this plot is that the heat capacity vanishes at high

temperature, but this is due to the peculiar property of the system at high

temperature. In a general system, we expect the heat capacity to increase with

temperature.

How much of this is actually physical? The answer is “not much”. This is

not surprising, because we didn’t really do much physics in these computations.

For most solids, the contribution to

C

from spins is swamped by other effects

such as contributions of phonons (quantized vibrations in the solid) or electrons.

In this case, C(T ) is monotonic in T .

However, there are some very peculiar materials for which we obtain a small local maximum in $C(T)$ for very small $T$, before increasing monotonically, which is due to the contributions of spin:

[Plot: $C$ against $T$, with a small bump at low $T$ on top of a monotonically increasing curve.]

1.2 Pressure, volume and the first law of thermodynamics

So far, our system only had one single parameter — the energy. Usually, our systems have other external parameters which can be varied. Recall that our “standard” model of a statistical system is a box of gas. If we allow ourselves to move the walls of the box, then the volume of the system may vary. As we change the volume, the allowed energy eigenstates will change. So now $\Omega$, and hence $S$, are functions of energy and volume:

$$S(E, V) = k \log \Omega(E, V).$$

We now need to modify our definition of temperature to account for this dependence:

Definition (Temperature). The temperature of a system with variable volume is

$$\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_V,$$

with $V$ fixed.

But now we can define a different thermodynamic quantity by taking the derivative with respect to $V$.

Definition (Pressure). We define the pressure of a system with variable volume to be

$$p = T \left(\frac{\partial S}{\partial V}\right)_E.$$

Is this thing we call the “pressure” anything like what we used to think of as pressure, namely force per unit area? We will soon see that this is indeed the case.

We begin by deducing some familiar properties of pressure.

Proposition. Consider as before two interacting systems where the total volume $V = V_1 + V_2$ is fixed, but the individual volumes can vary. Then the entropy of the combined system is maximized when $T_1 = T_2$ and $p_1 = p_2$.

Proof. We have previously seen that we need $T_1 = T_2$. We also want

$$\left(\frac{dS}{dV}\right)_E = 0.$$

So we need

$$\left(\frac{dS_1}{dV}\right)_E = \left(\frac{dS_2}{dV}\right)_E.$$

Since the temperatures are equal, we know that we also need $p_1 = p_2$.

For a single system, we can use the chain rule to write

$$dS = \left(\frac{\partial S}{\partial E}\right)_V dE + \left(\frac{\partial S}{\partial V}\right)_E dV.$$

Then we can use the definitions of temperature and pressure to write

Proposition (First law of thermodynamics).

$$dE = T\, dS - p\, dV.$$

This law relates two infinitesimally close equilibrium states. This is sometimes called the fundamental thermodynamic relation.

Example. Consider a box with one side a movable piston of area $A$. We apply a force $F$ to keep the piston in place.

[Diagram: a box of gas with a movable piston on one side, held in place by a force $F$, displaced through a distance $dx$.]

What happens if we move the piston a little bit? If we move it through a distance $dx$, then the volume of the gas increases by $A\, dx$. We assume $S$ is constant. Then the first law tells us

$$dE = -pA\, dx.$$

This formula should be very familiar to us. This is just the work done by the force, so we must have $F = pA$. So our definition of pressure in terms of partial derivatives reproduces the mechanics definition of force per unit area.

One has to be cautious here. It is not always true that $-p\, dV$ can be equated with the work done on a system. For this to be true, we require the change to be reversible, which is a notion we will study in more depth later. For example, this is not true when there is friction.

In the case of a reversible change, if we equate $-p\, dV$ with the work done, then there is only one possible thing $T\, dS$ can be — it is the heat supplied to the system.

It is important to remember that the first law holds for any change. It’s just that this interpretation does not.

Example. Consider the irreversible change, where we have a “free expansion” of gas into vacuum. We have a box

[Diagram: a box divided by a partition, with gas on one side and vacuum on the other.]

We have a valve in the partition, and as soon as we open up the valve, the gas flows to the other side of the box.

In this system, no energy has been supplied. So $dE = 0$. However, $dV > 0$, as the volume clearly increased. But there is no work done on or by the gas. So in this case, $-p\, dV$ is certainly not the work done. Using the first law, we know that

$$T\, dS = p\, dV.$$

So as the volume increases, the entropy increases as well.

We now revisit the concept of heat capacity. We previously defined it as $dE/dT$, but now we need to decide what we want to keep fixed. We can keep $V$ fixed, and get

$$C_V = \left(\frac{\partial E}{\partial T}\right)_V = T \left(\frac{\partial S}{\partial T}\right)_V.$$

While this is the obvious generalization of what we previously had, it is not a very useful quantity. We do not usually do experiments with fixed volume. For example, if we do a chemistry experiment in a test tube, say, then the volume is not fixed, as the gas in the test tube is free to go around. Instead, what is fixed is the pressure. We can analogously define

$$C_p = T \left(\frac{\partial S}{\partial T}\right)_p.$$

Note that we cannot write this as some sort of $\frac{\partial E}{\partial T}$.

1.3 The canonical ensemble

So far, we have been using the microcanonical ensemble. The underlying assumption is that our system is totally isolated, and we know what the energy of the system is. However, in real life, this is most likely not the case. Even if we produce a sealed box of gas, and try to do experiments with it, the system is not isolated. It can exchange heat with the environment.

On the other hand, there is one thing that is fixed — the temperature.

The box is in thermal equilibrium with the environment. If we assume the

environment is “large”, then we can assume that the environment is not really

affected by the box, and so the box is forced to have the same temperature as

the environment.

Let’s try to study this property. Consider a system $S$ interacting with a much larger system $R$. We call this $R$ a heat reservoir. Since $R$ is assumed to be large, the energy of $S$ is negligible compared to that of $R$, and we will assume $R$ always has a fixed temperature $T$. Then in this setup, the systems can exchange energy without changing $T$.

As before, we let $|n\rangle$ be a basis of microstates with energy $E_n$. We suppose we fix a total energy $E_{\text{total}}$, and we want to find the total number of microstates of the combined system with this total energy. To do so, we fix some state $|n\rangle$ of $S$, and ask how many states of $S + R$ there are for which $S$ is in $|n\rangle$. We then later sum over all $|n\rangle$.

By definition, we can write this as

$$\Omega_R(E_{\text{total}} - E_n) = \exp\left(k^{-1} S_R(E_{\text{total}} - E_n)\right).$$

By assumption, we know $R$ is a much larger system than $S$. So we only get significant contributions when $E_n \ll E_{\text{total}}$. In these cases, we can Taylor expand to write

$$\Omega_R(E_{\text{total}} - E_n) = \exp\left(k^{-1} S_R(E_{\text{total}}) - k^{-1} \left(\frac{\partial S_R}{\partial E}\right)_V E_n\right).$$

But we know what $\left(\frac{\partial S_R}{\partial E}\right)_V$ is — it is just $T^{-1}$. So we finally get

$$\Omega_R(E_{\text{total}} - E_n) = e^{k^{-1} S_R(E_{\text{total}})} e^{-\beta E_n},$$

where we define

Notation ($\beta$).

$$\beta = \frac{1}{kT}.$$

Note that while we derived this formula under the assumption that $E_n$ is small, it is effectively still valid when $E_n$ is large, because both sides are very tiny, and even if they are very tiny in different ways, it doesn’t matter when we sum over all states.

Now we can write the total number of microstates of $S + R$ as

$$\Omega(E_{\text{total}}) = \sum_n \Omega_R(E_{\text{total}} - E_n) = e^{k^{-1} S_R(E_{\text{total}})} \sum_n e^{-\beta E_n}.$$

Note that we are summing over all states, not energy levels.

We now use the fundamental assumption of statistical mechanics that all states of $S + R$ are equally likely. Then we know the probability that $S$ is in state $|n\rangle$ is

$$p(n) = \frac{\Omega_R(E_{\text{total}} - E_n)}{\Omega(E_{\text{total}})} = \frac{e^{-\beta E_n}}{\sum_k e^{-\beta E_k}}.$$

This is called the Boltzmann distribution for the canonical ensemble. Note that at the end, all the details have dropped out apart from the temperature. This describes the energy distribution of a system with fixed temperature.

Note that if $E_n \gg kT = \frac{1}{\beta}$, then the exponential is small. So only states with $E_n \sim kT$ have significant probability. In particular, as $T \to 0$, we have $\beta \to \infty$, and so only the ground state can be occupied.

We now define an important quantity.

Definition (Partition function). The partition function is

$$Z = \sum_n e^{-\beta E_n}.$$

It turns out most of the quantities we are interested in can be expressed in terms of $Z$ and its derivatives. Thus, to understand a general system, what we will do is to compute the partition function and express it in some familiar form. Then we can use standard calculus to obtain quantities we are interested in. To begin with, we have

$$p(n) = \frac{e^{-\beta E_n}}{Z}.$$
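As a small illustration of these formulas, here are the Boltzmann probabilities for a hypothetical three-level spectrum (all values purely illustrative, with $\beta = 1$):

```python
import math

# Boltzmann probabilities p(n) = exp(-beta E_n) / Z for a toy spectrum
beta = 1.0
energies = [0.0, 1.0, 2.0]
Z = sum(math.exp(-beta * E) for E in energies)     # partition function
p = [math.exp(-beta * E) / Z for E in energies]

print(p, sum(p))   # probabilities decrease with energy and sum to 1
```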

Proposition. For two non-interacting systems, we have $Z(\beta) = Z_1(\beta) Z_2(\beta)$.

Proof. Since the systems are not interacting, we have

$$Z = \sum_{n,m} e^{-\beta(E_n^{(1)} + E_m^{(2)})} = \left(\sum_n e^{-\beta E_n^{(1)}}\right) \left(\sum_m e^{-\beta E_m^{(2)}}\right) = Z_1 Z_2.$$
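A quick numerical check of this proposition, with two hypothetical small spectra (illustrative values only): the combined system's spectrum consists of all sums $E_n^{(1)} + E_m^{(2)}$, and its partition function should factorize.

```python
import math

beta = 0.5
E1 = [0.0, 1.0]          # toy spectrum of system 1
E2 = [0.0, 0.3, 2.0]     # toy spectrum of system 2

Z1 = sum(math.exp(-beta * E) for E in E1)
Z2 = sum(math.exp(-beta * E) for E in E2)
Z_combined = sum(math.exp(-beta * (a + b)) for a in E1 for b in E2)

print(Z_combined, Z1 * Z2)   # equal up to rounding
```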

Note that in general, the energy is not fixed, but we can compute the average value:

$$\langle E \rangle = \sum_n p(n) E_n = \sum_n \frac{E_n e^{-\beta E_n}}{Z} = -\frac{\partial}{\partial \beta} \log Z.$$
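We can verify this identity by comparing the direct average against a finite-difference derivative of $\log Z$ (a hypothetical spectrum, illustrative only):

```python
import math

energies = [0.0, 0.5, 1.3, 2.0]   # toy spectrum
beta = 1.0

def logZ(b):
    return math.log(sum(math.exp(-b * E) for E in energies))

Z = math.exp(logZ(beta))
E_avg = sum(E * math.exp(-beta * E) for E in energies) / Z   # direct average

h = 1e-6
E_fd = -(logZ(beta + h) - logZ(beta - h)) / (2 * h)          # -d(log Z)/d(beta)
print(E_avg, E_fd)
```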

This partial derivative is taken with all $E_i$ fixed. Of course, in the real world, we don’t get to directly change the energy eigenstates and see what happens. However, they do depend on some “external” parameters, such as the volume $V$, the magnetic field $B$, etc. So when we take this derivative, we have to keep all those parameters fixed.

We look at the simple case where $V$ is the only parameter we can vary. Then $Z = Z(\beta, V)$. We can rewrite the previous formula as

$$\langle E \rangle = -\left(\frac{\partial}{\partial \beta} \log Z\right)_V.$$

This gives us the average, but we also want to know the variance of $E$. We have

$$\Delta E^2 = \langle (E - \langle E \rangle)^2 \rangle = \langle E^2 \rangle - \langle E \rangle^2.$$

On the first example sheet, we calculate that this is in fact

$$\Delta E^2 = \left(\frac{\partial^2}{\partial \beta^2} \log Z\right)_V = -\left(\frac{\partial \langle E \rangle}{\partial \beta}\right)_V.$$
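Since the derivation is left to the example sheet, here is a numerical spot-check that $\langle E^2 \rangle - \langle E \rangle^2$ agrees with $\partial^2 (\log Z)/\partial \beta^2$, using a finite difference on a hypothetical spectrum (illustrative values only):

```python
import math

energies = [0.0, 0.7, 1.1, 3.0]   # toy spectrum
beta = 0.8

def logZ(b):
    return math.log(sum(math.exp(-b * E) for E in energies))

Z = math.exp(logZ(beta))
E1 = sum(E * math.exp(-beta * E) for E in energies) / Z
E2 = sum(E * E * math.exp(-beta * E) for E in energies) / Z
var_direct = E2 - E1 * E1

h = 1e-4   # second central difference for d^2(log Z)/d(beta)^2
var_fd = (logZ(beta + h) - 2 * logZ(beta) + logZ(beta - h)) / (h * h)
print(var_direct, var_fd)
```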

We can now convert $\beta$-derivatives to $T$-derivatives using the chain rule. Then we get

$$\Delta E^2 = kT^2 \left(\frac{\partial \langle E \rangle}{\partial T}\right)_V = kT^2 C_V.$$

From this, we can learn something important. We would expect $\langle E \rangle \sim N$, the number of particles of the system. But we also know $C_V \sim N$. So

$$\frac{\Delta E}{\langle E \rangle} \sim \frac{1}{\sqrt{N}}.$$

Therefore, the fluctuations are negligible if $N$ is large enough. This is called the thermodynamic limit $N \to \infty$. In this limit, we can ignore the fluctuations in energy. So we expect the microcanonical ensemble and the canonical ensemble to give the same results. And for all practical purposes, $N \sim 10^{23}$ is a large number.

Because of that, we are often going to just write $E$ instead of $\langle E \rangle$.

Example. Suppose we had particles with

$$E_\uparrow = \varepsilon, \quad E_\downarrow = 0.$$

So for one particle, we have

$$Z_1 = \sum_n e^{-\beta E_n} = 1 + e^{-\beta\varepsilon} = 2 e^{-\beta\varepsilon/2} \cosh \frac{\beta\varepsilon}{2}.$$

If we have $N$ non-interacting such systems, then since the partition function is multiplicative, we have

$$Z = Z_1^N = 2^N e^{-\beta\varepsilon N/2} \cosh^N \frac{\beta\varepsilon}{2}.$$

From the partition function, we can compute

$$\langle E \rangle = -\frac{d \log Z}{d\beta} = \frac{N\varepsilon}{2} \left(1 - \tanh \frac{\beta\varepsilon}{2}\right).$$

We can check that this agrees with the value we computed with the microcanonical ensemble (where we wrote the result using expo
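Indeed, a one-line numerical check (units $N = \varepsilon = 1$, arbitrary $\beta$) confirms that the canonical $\langle E \rangle = \frac{N\varepsilon}{2}(1 - \tanh\frac{\beta\varepsilon}{2})$ coincides with the microcanonical result $E = N\varepsilon/(e^{\varepsilon/kT} + 1)$ found earlier:

```python
import math

# compare the two expressions for the average energy per particle
beta = 1.7   # arbitrary inverse temperature
canonical = 0.5 * (1.0 - math.tanh(beta / 2.0))
microcanonical = 1.0 / (math.exp(beta) + 1.0)

print(canonical, microcanonical)   # identical up to rounding
```

The agreement is an algebraic identity: $\frac{1}{2}(1 - \tanh\frac{x}{2}) = \frac{1}{e^x + 1}$.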