1 Fundamentals of statistical mechanics

II Statistical Physics

1.3 The canonical ensemble

So far, we have been using the microcanonical ensemble. The underlying as-

sumption is that our system is totally isolated, and we know what the energy of

the system is. However, in real life, this is most likely not the case. Even if we

produce a sealed box of gas, and try to do experiments with it, the system is

not isolated. It can exchange heat with the environment.

On the other hand, there is one thing that is fixed — the temperature.

The box is in thermal equilibrium with the environment. If we assume the

environment is “large”, then we can assume that the environment is not really

affected by the box, and so the box is forced to have the same temperature as

the environment.

Let’s try to study this property. Consider a system $S$ interacting with a much larger system $R$. We call this $R$ a heat reservoir. Since $R$ is assumed to be large, the energy of $S$ is negligible compared to that of $R$, and we will assume $R$ always has a fixed temperature $T$. Then in this set up, the systems can exchange energy without changing $T$.

As before, we let $|n\rangle$ be a basis of microstates with energy $E_n$. We suppose we fix a total energy $E_{\mathrm{total}}$, and we want to find the total number of microstates of the combined system with this total energy. To do so, we fix some state $|n\rangle$ of $S$, and ask how many states of $S + R$ there are for which $S$ is in $|n\rangle$. We then later sum over all $|n\rangle$.

By definition, we can write this as
\[
  \Omega_R(E_{\mathrm{total}} - E_n) = \exp\left(k^{-1} S_R(E_{\mathrm{total}} - E_n)\right).
\]

By assumption, we know $R$ is a much larger system than $S$. So we only get significant contributions when $E_n \ll E_{\mathrm{total}}$. In these cases, we can Taylor expand to write
\[
  \Omega_R(E_{\mathrm{total}} - E_n) = \exp\left(k^{-1} S_R(E_{\mathrm{total}}) - k^{-1} \left(\frac{\partial S_R}{\partial E}\right)_V E_n\right).
\]

But we know that $\left(\frac{\partial S_R}{\partial E}\right)_V$ is just $T^{-1}$. So we finally get
\[
  \Omega_R(E_{\mathrm{total}} - E_n) = e^{k^{-1} S_R(E_{\mathrm{total}})} e^{-\beta E_n},
\]
where we define

Notation ($\beta$).
\[
  \beta = \frac{1}{kT}.
\]

Note that while we derived this formula under the assumption that $E_n$ is small, it is effectively still valid when $E_n$ is large, because then both sides are very tiny, and even if they are tiny in different ways, the error is negligible when we sum over all states.

Now we can write the total number of microstates of $S + R$ as
\[
  \Omega(E_{\mathrm{total}}) = \sum_n \Omega_R(E_{\mathrm{total}} - E_n) = e^{k^{-1} S_R(E_{\mathrm{total}})} \sum_n e^{-\beta E_n}.
\]
Note that we are summing over all states, not energies.

We now use the fundamental assumption of statistical mechanics that all states of $S + R$ are equally likely. Then we know the probability that $S$ is in state $|n\rangle$ is
\[
  p(n) = \frac{\Omega_R(E_{\mathrm{total}} - E_n)}{\Omega(E_{\mathrm{total}})} = \frac{e^{-\beta E_n}}{\sum_k e^{-\beta E_k}}.
\]

This is called the Boltzmann distribution for the canonical ensemble. Note that at the end, all the details of the reservoir have dropped out apart from the temperature. This describes the energy distribution of a system with fixed temperature.

Note that if $E_n \gg kT = \frac{1}{\beta}$, then the exponential is small. So only states with $E_n \sim kT$ have significant probability. In particular, as $T \to 0$, we have $\beta \to \infty$, and so only the ground state can be occupied.
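This concentration onto the ground state is easy to see numerically. A small illustrative sketch (the three energy levels and the choice of units $k = 1$ are arbitrary, not from the notes):

```python
import math

def boltzmann_probs(energies, T, k=1.0):
    """Boltzmann distribution p(n) = exp(-E_n / kT) / Z for the given levels."""
    beta = 1.0 / (k * T)
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

# Three hypothetical levels; as T -> 0, all probability moves to E = 0.
levels = [0.0, 1.0, 2.0]
for T in [10.0, 1.0, 0.1]:
    print(T, [round(p, 4) for p in boltzmann_probs(levels, T)])
```

At $T = 10$ the three states are nearly equally likely; at $T = 0.1$ essentially all the probability sits on the ground state.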

We now define an important quantity.

Definition (Partition function). The partition function is
\[
  Z = \sum_n e^{-\beta E_n}.
\]

It turns out most of the quantities we are interested in can be expressed in terms of $Z$ and its derivatives. Thus, to understand a general system, what we will do is to compute the partition function and express it in some familiar form. Then we can use standard calculus to obtain the quantities we are interested in. To begin with, we have
\[
  p(n) = \frac{e^{-\beta E_n}}{Z}.
\]

Proposition. For two non-interacting systems, we have $Z(\beta) = Z_1(\beta) Z_2(\beta)$.

Proof. Since the systems are not interacting, we have
\[
  Z = \sum_{n,m} e^{-\beta(E_n^{(1)} + E_m^{(2)})} = \left(\sum_n e^{-\beta E_n^{(1)}}\right)\left(\sum_m e^{-\beta E_m^{(2)}}\right) = Z_1 Z_2.
\]
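This multiplicativity can be checked numerically: form all pairs of states, add their energies, and compare. A sketch (the levels are made up for illustration):

```python
import math
from itertools import product

def Z(energies, beta):
    """Partition function Z = sum_n exp(-beta * E_n)."""
    return sum(math.exp(-beta * E) for E in energies)

beta = 0.7
E1 = [0.0, 1.0, 3.0]   # illustrative levels of system 1
E2 = [0.5, 2.0]        # illustrative levels of system 2

# A state of the combined non-interacting system is a pair of states,
# and its energy is the sum of the two energies.
Z_combined = Z([a + b for a, b in product(E1, E2)], beta)
print(Z_combined, Z(E1, beta) * Z(E2, beta))  # the two agree
```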

Note that in general, energy is not fixed, but we can compute the average value:
\[
  \langle E \rangle = \sum_n p(n) E_n = \sum_n \frac{E_n e^{-\beta E_n}}{Z} = -\frac{\partial}{\partial \beta} \log Z.
\]

This partial derivative is taken with all $E_i$ fixed. Of course, in the real world, we don’t get to directly change the energy eigenstates and see what happens. However, they do depend on some “external” parameters, such as the volume $V$, the magnetic field $B$ etc. So when we take this derivative, we have to keep all those parameters fixed.

We look at the simple case where $V$ is the only parameter we can vary. Then $Z = Z(\beta, V)$. We can rewrite the previous formula as
\[
  \langle E \rangle = -\left(\frac{\partial}{\partial \beta} \log Z\right)_V.
\]
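The identity $\langle E \rangle = -\partial(\log Z)/\partial\beta$ is easy to sanity-check with a central finite difference. A sketch with arbitrary illustrative levels:

```python
import math

def log_Z(energies, beta):
    return math.log(sum(math.exp(-beta * E) for E in energies))

def avg_E(energies, beta):
    """Direct ensemble average <E> = sum_n E_n e^{-beta E_n} / Z."""
    Z = sum(math.exp(-beta * E) for E in energies)
    return sum(E * math.exp(-beta * E) for E in energies) / Z

levels = [0.0, 0.5, 1.5, 4.0]   # illustrative energy levels
beta, h = 1.3, 1e-6

# -d(log Z)/dbeta via central difference should match the direct average.
deriv = -(log_Z(levels, beta + h) - log_Z(levels, beta - h)) / (2 * h)
print(deriv, avg_E(levels, beta))
```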

This gives us the average, but we also want to know the variance of $E$. We have
\[
  \Delta E^2 = \langle (E - \langle E \rangle)^2 \rangle = \langle E^2 \rangle - \langle E \rangle^2.
\]

On the first example sheet, we calculate that this is in fact
\[
  \Delta E^2 = \left(\frac{\partial^2}{\partial \beta^2} \log Z\right)_V = -\left(\frac{\partial \langle E \rangle}{\partial \beta}\right)_V.
\]

We can now convert $\beta$-derivatives to $T$-derivatives using the chain rule. Then we get
\[
  \Delta E^2 = kT^2 \left(\frac{\partial \langle E \rangle}{\partial T}\right)_V = kT^2 C_V.
\]

From this, we can learn something important. We would expect $\langle E \rangle \sim N$, the number of particles of the system. But we also know $C_V \sim N$. So
\[
  \frac{\Delta E}{\langle E \rangle} \sim \frac{1}{\sqrt{N}}.
\]
Therefore, the fluctuations are negligible if $N$ is large enough. This is called the thermodynamic limit $N \to \infty$. In this limit, we can ignore the fluctuations in energy. So we expect the microcanonical ensemble and the canonical ensemble to give the same result. And for all practical purposes, $N \sim 10^{23}$ is a large number. Because of that, we are often going to just write $E$ instead of $\langle E \rangle$.
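The $1/\sqrt{N}$ scaling can be made concrete for $N$ non-interacting copies of a subsystem: $\log Z = N \log Z_1$, so both $\langle E \rangle$ and $\Delta E^2$ scale as $N$. A sketch with an arbitrary single-particle spectrum:

```python
import math

def moments(energies, beta):
    """Return (<E>, Delta E^2) for a single subsystem."""
    Z = sum(math.exp(-beta * E) for E in energies)
    E1 = sum(E * math.exp(-beta * E) for E in energies) / Z
    E2 = sum(E * E * math.exp(-beta * E) for E in energies) / Z
    return E1, E2 - E1 * E1

levels = [0.0, 1.0, 2.5]   # illustrative single-particle levels
beta = 0.8
e1, var1 = moments(levels, beta)

# For N independent copies, <E> = N e1 and Delta E^2 = N var1,
# so the relative fluctuation is (sqrt(var1) / e1) / sqrt(N).
for N in [10, 1000, 10**6]:
    print(N, math.sqrt(N * var1) / (N * e1))
```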

Example. Suppose we had particles with
\[
  E_\uparrow = \varepsilon, \quad E_\downarrow = 0.
\]
So for one particle, we have
\[
  Z_1 = \sum_n e^{-\beta E_n} = 1 + e^{-\beta \varepsilon} = 2 e^{-\beta \varepsilon/2} \cosh\frac{\beta \varepsilon}{2}.
\]

If we have $N$ non-interacting systems, then since the partition function is multiplicative, we have
\[
  Z = Z_1^N = 2^N e^{-\beta \varepsilon N/2} \cosh^N \frac{\beta \varepsilon}{2}.
\]

From the partition function, we can compute
\[
  \langle E \rangle = -\frac{\mathrm{d} \log Z}{\mathrm{d} \beta} = \frac{N\varepsilon}{2}\left(1 - \tanh\frac{\beta \varepsilon}{2}\right).
\]

We can check that this agrees with the value we computed with the microcanon-

ical ensemble (where we wrote the result using exponentials directly), but the

calculation is much easier.
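The closed form above can be checked against the direct ensemble average. A sketch (the choice $\varepsilon = 1$ and units $k = 1$ are illustrative):

```python
import math

def avg_E_direct(N, eps, beta):
    """Direct average for one two-state particle (levels 0 and eps), times N."""
    Z1 = 1.0 + math.exp(-beta * eps)
    return N * eps * math.exp(-beta * eps) / Z1

def avg_E_closed(N, eps, beta):
    """Closed form <E> = (N eps / 2)(1 - tanh(beta eps / 2))."""
    return N * eps / 2.0 * (1.0 - math.tanh(beta * eps / 2.0))

N, eps = 100, 1.0
for beta in [0.1, 1.0, 10.0]:
    print(beta, avg_E_direct(N, eps, beta), avg_E_closed(N, eps, beta))
```

At high temperature (small $\beta$) the average approaches $N\varepsilon/2$, and at low temperature it drops to zero, as the formula predicts.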

Entropy

When we first began our journey to statistical physics, the starting point of everything was the entropy. When we derived the canonical ensemble, we used the entropy of everything, including that of the reservoir. However, we are not really interested in the reservoir, so we need to come up with an alternative definition of the entropy.

We can motivate our new definition as follows. We use our previous picture of an ensemble. We have $W \gg 1$ many worlds, and our probability distribution says there are $W p(n)$ copies of the world living in state $|n\rangle$. We can ask what is the number of ways of picking a state for each copy of the world so as to reach this distribution.

We apply the Boltzmann definition of entropy to this counting:
\[
  S = k \log \Omega.
\]
This time, $\Omega$ is given by
\[
  \Omega = \frac{W!}{\prod_n (W p(n))!}.
\]

We can use Stirling’s approximation, and find that
\[
  S_{\mathrm{ensemble}} = -kW \sum_n p(n) \log p(n).
\]

This suggests that we should define the entropy of a single copy as follows:

Definition (Gibbs entropy). The Gibbs entropy of a probability distribution $p(n)$ is
\[
  S = -k \sum_n p(n) \log p(n).
\]
If the density operator is given by
\[
  \hat{\rho} = \sum_n p(n) |n\rangle\langle n|,
\]
then we have
\[
  S = -k \operatorname{Tr}(\hat{\rho} \log \hat{\rho}).
\]

We now check that this definition makes sense, in that when we have a microcanonical ensemble, we do get what we expect.

Example. In the microcanonical ensemble, we have
\[
  p(n) =
  \begin{cases}
    \frac{1}{\Omega(E)} & E \le E_n \le E + \delta E \\
    0 & \text{otherwise}
  \end{cases}
\]

Then we have
\[
  S = -k \sum_{n: E \le E_n \le E + \delta E} \frac{1}{\Omega(E)} \log \frac{1}{\Omega(E)} = -k \Omega(E) \cdot \frac{1}{\Omega(E)} \log \frac{1}{\Omega(E)} = k \log \Omega(E).
\]

So the Gibbs entropy reduces to the Boltzmann entropy.

How about the canonical ensemble?

Example. In the canonical ensemble, we have
\[
  p(n) = \frac{e^{-\beta E_n}}{Z}.
\]

Plugging this into the definition, we find that
\[
  S = -k \sum_n p(n) \log \frac{e^{-\beta E_n}}{Z} = -k \sum_n p(n) (-\beta E_n - \log Z) = k\beta \langle E \rangle + k \log Z,
\]
using the fact that $\sum_n p(n) = 1$.

Using the formula for the expected energy, we find that this is in fact
\[
  S = k \left(\frac{\partial}{\partial T}(T \log Z)\right)_V.
\]

So again, if we want to compute the entropy, it suffices to find a nice closed

form of Z.
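The identity $S = k\beta \langle E \rangle + k \log Z$ can be verified directly against the Gibbs definition. A sketch in units with $k = 1$, using arbitrary illustrative levels:

```python
import math

def canonical_entropy_direct(energies, beta):
    """Gibbs entropy S = -sum_n p(n) log p(n) of the Boltzmann distribution (k = 1)."""
    Z = sum(math.exp(-beta * E) for E in energies)
    ps = [math.exp(-beta * E) / Z for E in energies]
    return -sum(p * math.log(p) for p in ps)

def canonical_entropy_formula(energies, beta):
    """S = beta <E> + log Z (k = 1)."""
    Z = sum(math.exp(-beta * E) for E in energies)
    avg_E = sum(E * math.exp(-beta * E) for E in energies) / Z
    return beta * avg_E + math.log(Z)

levels = [0.0, 0.7, 1.9, 3.0]   # illustrative
beta = 1.1
print(canonical_entropy_direct(levels, beta), canonical_entropy_formula(levels, beta))
```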

Maximizing entropy

It turns out we can reach the canonical ensemble in a different way. The second

law of thermodynamics suggests we should always seek to maximize entropy. Now

if we take the optimization problem of “maximizing entropy”, what probability

distribution will we end up with?

The answer depends on what constraints we put on the optimization problem.

We can try to maximize

S

Gibbs

over all probability distributions such that

p

(

n

) = 0 unless

E ≤ E

n

≤ E

+

δE

. Of course, we also have the constraint

P

p(n) = 1. Then we can use a Lagrange multiplier α and extremize

k

−1

S

Gibbs

+ α

X

n

p(n) − 1

!

,

Differentiating with respect to p(n) and solving, we get

p(n) = e

α−1

.

In particular, this is independent of $n$. So all microstates with energy in this range are equally likely, and this gives the microcanonical ensemble.

What about the canonical ensemble? It turns out this is obtained by maximizing the entropy over all $p(n)$ such that $\langle E \rangle$ is fixed. The computation is equally straightforward, and is done on the first example sheet.
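Without doing the Lagrange computation here, the result can be seen numerically. For three states with energies $0, 1, 2$ and fixed $\langle E \rangle$, the two constraints leave one free parameter; scanning it shows the entropy maximizer has Boltzmann form, for which equally spaced levels give $p(1)^2 = p(0)\,p(2)$. A sketch (the levels and target $\langle E \rangle$ are chosen arbitrarily):

```python
import math

def entropy(ps):
    """Gibbs entropy (k = 1) of a probability distribution."""
    return -sum(p * math.log(p) for p in ps if p > 0)

E_bar = 0.8   # fixed average energy (illustrative)

def dist(p2):
    """Distribution over E = 0, 1, 2 with sum p = 1 and <E> = E_bar, given p(2)."""
    p1 = E_bar - 2 * p2
    p0 = 1 - p1 - p2
    return [p0, p1, p2]

# Scan the one remaining degree of freedom and pick the entropy maximizer.
best = max((k / 10000 for k in range(1, 4000)), key=lambda x: entropy(dist(x)))
p0, p1, p2 = dist(best)

# For a Boltzmann distribution p(n) ~ e^{-beta n}, p1^2 / (p0 p2) = 1.
print(p0, p1, p2, p1 * p1 / (p0 * p2))
```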