Part III Quantum Field Theory
Based on lectures by B. Allanach
Notes taken by Dexter Chua
Michaelmas 2016
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
Quantum Field Theory is the language in which modern particle physics is formulated.
It represents the marriage of quantum mechanics with special relativity and provides the
mathematical framework in which to describe the interactions of elementary particles.
This first Quantum Field Theory course introduces the basic types of fields which play
an important role in high energy physics: scalar, spinor (Dirac), and vector (gauge)
fields. The relativistic invariance and symmetry properties of these fields are discussed
using Lagrangian language and Noether’s theorem.
The quantisation of the basic non-interacting free fields is firstly developed using the
Hamiltonian and canonical methods in terms of operators which create and annihilate
particles and anti-particles. The associated Fock space of quantum physical states is
explained together with ideas about how particles propagate in spacetime and their
statistics. How these fields interact with a classical electromagnetic field is described.
Interactions are described using perturbation theory and Feynman diagrams. This
is first illustrated for theories with a purely scalar field interaction, and then for
couplings between scalar fields and fermions. Finally Quantum Electrodynamics, the
theory of interacting photons, electrons and positrons, is introduced and elementary
scattering processes are computed.
Pre-requisites
You will need to be comfortable with the Lagrangian and Hamiltonian formulations of
classical mechanics and with special relativity. You will also need to have taken an
advanced course on quantum mechanics.
Contents
0 Introduction
1 Classical field theory
1.1 Classical fields
1.2 Lorentz invariance
1.3 Symmetries and Noether’s theorem for field theories
1.4 Hamiltonian mechanics
2 Free field theory
2.1 Review of simple harmonic oscillator
2.2 The quantum field
2.3 Real scalar fields
2.4 Complex scalar fields
2.5 The Heisenberg picture
2.6 Propagators
3 Interacting fields
3.1 Interaction Lagrangians
3.2 Interaction picture
3.3 Wick’s theorem
3.4 Feynman diagrams
3.5 Amplitudes
3.6 Correlation functions and vacuum bubbles
4 Spinors
4.1 The Lorentz group and the Lorentz algebra
4.2 The Clifford algebra and the spin representation
4.3 Properties of the spin representation
4.4 The Dirac equation
4.5 Chiral/Weyl spinors and $\gamma^5$
4.6 Parity operator
4.7 Solutions to Dirac’s equation
4.8 Symmetries and currents
5 Quantizing the Dirac field
5.1 Fermion quantization
5.2 Yukawa theory
5.3 Feynman rules
6 Quantum electrodynamics
6.1 Classical electrodynamics
6.2 Quantization of the electromagnetic field
6.3 Coupling to matter in classical field theory
6.4 Quantization of interactions
6.5 Computations and diagrams
0 Introduction
The idea of quantum mechanics is that photons and electrons behave similarly.
We can make a photon interfere with itself in double-slit experiments, and
similarly an electron can interfere with itself. However, as we know, light is a
ripple in an electromagnetic field. So photons should arise from the quantization
of the electromagnetic field. If electrons are like photons, should we then have
an electron field? The answer is yes!
Quantum field theory is a quantization of a classical field. Recall that in
quantum mechanics, we promote degrees of freedom to operators. Basic degrees
of freedom of a quantum field theory are operator-valued functions of spacetime.
Since there are infinitely many points in spacetime, there is an infinite number
of degrees of freedom. This infinity will come back and bite us as we try to
develop quantum field theory.
Quantum field theory describes creation and annihilation of particles. The
interactions are governed by several basic principles: locality, symmetry and
renormalization group flow. What the renormalization group flow describes is
the decoupling of low and high energy processes.
Why quantum field theory?
It appears that all particles of the same type are indistinguishable, e.g. all
electrons are the same. It is difficult to justify why this is the case if each particle
is considered individually, but if we view all electrons as excitations of the same
field, this is (almost) automatic.
Secondly, if we want to combine special relativity and quantum mechanics,
then the number of particles is not conserved. Indeed, consider a particle trapped
in a box of size $L$. By the Heisenberg uncertainty principle, we have
$\Delta p \gtrsim \hbar/L$. We choose a particle with small rest mass so that
$m \ll E$. Then we have
\[ \Delta E = \Delta p \cdot c \gtrsim \frac{\hbar c}{L}. \]
When $\Delta E \gtrsim 2mc^2$, then we can pop a particle-antiparticle pair out of the
vacuum. So when $L \lesssim \frac{\hbar}{2mc}$, we can't say for sure that there is only one particle.
We say $\lambda = \hbar/(mc)$ is the "Compton wavelength", the minimum distance
at which it makes sense to localize a particle. This is also the scale at which
quantum effects kick in.
This argument is somewhat circular, since we just assumed that if we have
enough energy, then particle-antiparticle pairs would just pop into existence.
This is in fact something we can prove in quantum field theory.
To reconcile quantum mechanics and special relativity, we can try to write a
relativistic version of Schrödinger's equation for a single particle, but something
goes wrong. Either the energy is unbounded from below, or we end up with some
causality violation. This is bad. These are all fixed by quantum field theory by
the introduction of particle creation and annihilation.
What is quantum field theory good for?
Quantum field theory is used in (non-relativistic) condensed matter systems.
It describes simple phenomena such as phonons, superconductivity, and the
fractional quantum Hall effect.
Quantum field theory is also used in high energy physics. The standard model
of particle physics consists of electromagnetism (quantum electrodynamics),
quantum chromodynamics and the weak forces. The standard model is tested
to very high precision by experiments, sometimes up to 1 part in $10^{10}$. So it is
good. While there are many attempts to go beyond the standard model, e.g.
Grand Unified Theories, they are mostly also quantum field theories.
In cosmology, quantum field theory is used to explain the density perturbations.
In quantum gravity, string theory is also primarily a quantum field
theory in some aspects. It is even used in pure mathematics, with applications
in topology and geometry.
History of quantum field theory
In the 1930’s, the basics of quantum field theory were laid down by Jordan,
Pauli, Heisenberg, Dirac, Weisskopf etc. They encountered all sorts of infinities,
which scared them. Back then, these sorts of infinities seemed impossible to
work with.
Fortunately, in the 1940’s, renormalization and quantum electrodynamics
were invented by Tomonaga, Schwinger, Feynman, Dyson, which managed to
deal with the infinities. It was a sloppy process, and there was no understanding
of why we can subtract infinities and get a sensible finite result. Yet, they
managed to make experimental predictions, which were subsequently verified by
actual experiments.
In the 1960’s, quantum field theory fell out of favour as new particles such
as mesons and baryons were discovered. But in the 1970’s, it had a golden
age when the renormalization group was developed by Kadanoff and Wilson,
which was really when the infinities became understood. At the same time, the
standard model was invented, and a connection between quantum field theory
and geometry was developed.
Units and scales
We are going to do a lot of computations in the course, which are reasonably
long. We do not want to have loads of $\hbar$ and $c$'s all over the place when we do
the calculations. So we pick convenient units so that they all disappear.
Nature presents us with three fundamental dimensionful constants that are
relevant to us:
(i) The speed of light $c$ with dimensions $LT^{-1}$;
(ii) Planck's constant $\hbar$ with dimensions $L^2MT^{-1}$;
(iii) The gravitational constant $G$ with dimensions $L^3M^{-1}T^{-2}$.
We see that these dimensions are independent. So we define units such that
$c = \hbar = 1$. So we can express everything in terms of a mass, or an energy, as we
now have $E = m$. For example, instead of $\lambda = \hbar/(mc)$, we just write
$\lambda = m^{-1}$.
We will work with electron volts (eV). To convert back to the conventional SI
units, we must insert the relevant powers of $c$ and $\hbar$. For example, for a mass of
$m_e = 10^6$ eV, we have $\lambda_e = \hbar c/(m_e c^2) \approx 2 \times 10^{-13}$ m,
using $\hbar c \approx 197$ MeV fm.
After getting rid of all factors of $\hbar$ and $c$, if a quantity $X$ has a mass dimension
$d$, we write $[X] = d$. For example, we have $[G] = -2$, since we have
\[ G = \frac{\hbar c}{M_p^2} = \frac{1}{M_p^2}, \]
where $M_p \sim 10^{19}$ GeV is the Planck scale.
1 Classical field theory
Before we get into quantum nonsense, we will first start by understanding
classical fields.
1.1 Classical fields
Definition (Field). A field $\phi$ is a physical quantity defined at every point of
spacetime $(\mathbf{x}, t)$. We write the value of the field at $(\mathbf{x}, t)$ as $\phi(\mathbf{x}, t)$.
The “physical quantities” can be real or complex scalars, but later we will
see that we might have to consider even more complicated stuff.
In classical point mechanics, we have a finite number of generalized coordinates
$q_a(t)$. In field theory, we are interested in the dynamics of $\phi_a(\mathbf{x}, t)$, where
$a$ and $\mathbf{x}$ are both labels. Here the $a$ labels the different fields we can have, and
$(\mathbf{x}, t)$ labels a spacetime coordinate. Note that here position has been relegated
from a dynamical variable (i.e. one of the $q_a$) to a mere label.
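Example. The electric field $E_i(\mathbf{x}, t)$ and magnetic field $B_i(\mathbf{x}, t)$, for $i = 1, 2, 3$,
are examples of fields. These six fields can in fact be derived from 4 fields
$A_\mu(\mathbf{x}, t)$, for $\mu = 0, 1, 2, 3$, where
\[ E_i = \frac{\partial A_i}{\partial t} - \frac{\partial A_0}{\partial x_i}, \quad B_i = \frac{1}{2} \varepsilon_{ijk} \frac{\partial A_k}{\partial x_j}. \]
Often, we write $A^\mu = (\phi, \mathbf{A})$.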
Just like in classical dynamics, we would specify the dynamics of a field
through a Lagrangian. For particles, the Lagrangian specifies a number at each
time $t$. Since a field is a quantity at each point in space, we now have a Lagrangian
density, which gives a number at each point in spacetime $(t, \mathbf{x})$.
Definition (Lagrangian density). Given a field $\phi(\mathbf{x}, t)$, a Lagrangian density is
a function $\mathcal{L}(\phi, \partial_\mu \phi)$ of $\phi$ and its derivative.
Note that the Lagrangian density treats space and time symmetrically. However,
if we already have a favorite time axis, we can look at the "total" Lagrangian
at each time, and obtain what is known as the Lagrangian.
Definition (Lagrangian). Given a Lagrangian density, the Lagrangian is defined
by
\[ L = \int d^3\mathbf{x}\, \mathcal{L}(\phi, \partial_\mu \phi). \]
For most of the course, we will only care about the Lagrangian density, and
call it the Lagrangian.
Definition (Action). Given a Lagrangian and a time interval $[t_1, t_2]$, the action
is defined by
\[ S = \int_{t_1}^{t_2} dt\, L(t) = \int d^4x\, \mathcal{L}. \]
In general, we would want the units to satisfy $[S] = 0$. Since we have
$[d^4x] = -4$, we must have $[\mathcal{L}] = 4$.
The equations of motion are, as usual, given by the principle of least action.
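Definition (Principle of least action). The equation of motion of a Lagrangian
system is given by the principle of least action: we vary the field slightly,
keeping values at the boundary fixed, and require the first-order change
$\delta S = 0$.
For a small perturbation $\phi_a \mapsto \phi_a + \delta\phi_a$, we can compute
\begin{align*}
\delta S &= \int d^4x \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} \delta\phi_a + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \delta(\partial_\mu \phi_a) \right] \\
&= \int d^4x \left\{ \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) \right] \delta\phi_a + \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \delta\phi_a \right) \right\}.
\end{align*}
We see that the last term, being a total derivative, vanishes for any variation
$\delta\phi_a$ that decays at spatial infinity and obeys
$\delta\phi_a(\mathbf{x}, t_1) = \delta\phi_a(\mathbf{x}, t_2) = 0$.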
Requiring $\delta S = 0$ for all such variations gives the following:
Proposition (Euler-Lagrange equations). The equations of motion for a field
are given by the Euler-Lagrange equations:
\[ \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right) - \frac{\partial \mathcal{L}}{\partial \phi_a} = 0. \]
We can begin by considering the simplest field one can imagine: the
Klein–Gordon field. We will later see that this is a "free field", and "particles"
don't interact with each other.
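Example. The Klein–Gordon equation for a real scalar field $\phi(\mathbf{x}, t)$ is given by
the Lagrangian
\[ \mathcal{L} = \frac{1}{2} \partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2} m^2 \phi^2 = \frac{1}{2} \dot{\phi}^2 - \frac{1}{2} (\nabla\phi)^2 - \frac{1}{2} m^2 \phi^2. \]
We can view this Lagrangian as saying $\mathcal{L} = T - V$, where
\[ T = \frac{1}{2} \dot{\phi}^2 \]
is the kinetic energy, and
\[ V = \frac{1}{2} (\nabla\phi)^2 + \frac{1}{2} m^2 \phi^2 \]
is the potential energy.
To find the Euler-Lagrange equation, we can compute
\[ \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} = \partial^\mu \phi = (\dot{\phi}, -\nabla\phi) \]
and
\[ \frac{\partial \mathcal{L}}{\partial \phi} = -m^2 \phi. \]
So the Euler-Lagrange equation says
\[ \partial_\mu \partial^\mu \phi + m^2 \phi = 0. \]
More explicitly, this says
\[ \ddot{\phi} - \nabla^2 \phi + m^2 \phi = 0. \]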
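We could generalize this and add more terms to the Lagrangian. An obvious
generalization would be
\[ \mathcal{L} = \frac{1}{2} \partial_\mu \phi\, \partial^\mu \phi - V(\phi), \]
where $V(\phi)$ is an arbitrary potential function. Then we similarly obtain the
equation
\[ \partial_\mu \partial^\mu \phi + \frac{\partial V}{\partial \phi} = 0. \]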
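As an illustration of the general-potential equation of motion, here is a short worked case. The quartic potential, the coupling $\lambda$ and its $1/4!$ normalisation are assumptions chosen for this sketch and are not taken from the lectures.

```latex
\begin{align*}
V(\phi) &= \tfrac{1}{2} m^2 \phi^2 + \frac{\lambda}{4!} \phi^4, \\
\frac{\partial V}{\partial \phi} &= m^2 \phi + \frac{\lambda}{3!} \phi^3, \\
\partial_\mu \partial^\mu \phi + m^2 \phi + \frac{\lambda}{3!} \phi^3 &= 0.
\end{align*}
```

Setting $\lambda = 0$ recovers the Klein–Gordon equation, while the cubic term makes the equation non-linear, so solutions can no longer simply be superposed.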
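Example. Maxwell's equations in vacuum are given by the Lagrangian
\[ \mathcal{L} = -\frac{1}{2} (\partial_\mu A_\nu)(\partial^\mu A^\nu) + \frac{1}{2} (\partial_\mu A^\mu)^2. \]
To find out the Euler-Lagrange equations, we need to figure out the derivatives
with respect to each component of the $A$ field. We obtain
\[ \frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} = -\partial^\mu A^\nu + (\partial_\rho A^\rho) \eta^{\mu\nu}. \]
So we obtain
\[ \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} \right) = -\partial_\mu \partial^\mu A^\nu + \partial^\nu (\partial_\rho A^\rho) = -\partial_\mu (\partial^\mu A^\nu - \partial^\nu A^\mu). \]
We write
\[ F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu. \]
So we are left with
\[ \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} \right) = -\partial_\mu F^{\mu\nu}. \]
It is an exercise to check that these Euler-Lagrange equations reproduce
\[ \partial_i E_i = 0, \quad \dot{E}_i = \varepsilon_{ijk} \partial_j B_k. \]
Using our $F^{\mu\nu}$, we can rewrite the Lagrangian as
\[ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu}. \]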
How did we come up with these Lagrangians? In general, it is guided by two
principles: one is symmetry, and the other is renormalizability. We will discuss
symmetries shortly, and renormalizability will be covered in the III Advanced
Quantum Field Theory course.
In all of these examples, the Lagrangian is local. In other words, the terms
don't couple $\phi(\mathbf{x}, t)$ to $\phi(\mathbf{y}, t)$ if $\mathbf{x} \neq \mathbf{y}$.
Example. An example of a non-local Lagrangian would be
\[ \int d^3\mathbf{x} \int d^3\mathbf{y}\; \phi(\mathbf{x}) \phi(\mathbf{x} - \mathbf{y}). \]
A priori, there is no reason for this, but it happens that nature seems to be
local. So we shall only consider local Lagrangians.
Note that locality does not mean that the Lagrangian at a point only depends
on the value at the point. Indeed, it also depends on the derivatives at $\mathbf{x}$. So we
can view this as saying the value of $\mathcal{L}$ at $\mathbf{x}$ only depends on the value of $\phi$ at an
infinitesimal neighbourhood of $\mathbf{x}$ (formally, the jet at $\mathbf{x}$).
1.2 Lorentz invariance
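If we wish to construct relativistic field theories such that $\mathbf{x}$ and $t$ are on an
equal footing, the Lagrangian should be invariant under Lorentz transformations
$x^\mu \mapsto x'^\mu = \Lambda^\mu{}_\nu x^\nu$, where
\[ \Lambda^\mu{}_\sigma \eta^{\sigma\tau} \Lambda^\nu{}_\tau = \eta^{\mu\nu}, \]
and $\eta^{\mu\nu}$ is the Minkowski metric given by
\[ \eta^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \]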
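Example. The transformation
\[ \Lambda^\mu{}_\sigma = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cos\theta & -\sin\theta \\ 0 & 0 & \sin\theta & \cos\theta \end{pmatrix} \]
describes a rotation by an angle $\theta$ around the $x$-axis.
Example. The transformation
\[ \Lambda^\mu{}_\sigma = \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]
describes a boost by $v$ along the $x$-axis.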
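The defining condition on $\Lambda$ can be checked numerically for the rotation and boost above. The following is a minimal sketch, assuming NumPy is available and taking the metric signature $(+,-,-,-)$ with $c = 1$; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

# Minkowski metric with signature (+, -, -, -).
ETA = np.diag([1.0, -1.0, -1.0, -1.0])

def rotation_about_x(theta):
    """Rotation by angle theta about the x-axis (mixes the y and z components)."""
    c, s = np.cos(theta), np.sin(theta)
    lam = np.eye(4)
    lam[2:, 2:] = [[c, -s], [s, c]]
    return lam

def boost_along_x(v):
    """Boost with velocity v along the x-axis, in units with c = 1 and |v| < 1."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    lam = np.eye(4)
    lam[:2, :2] = [[gamma, -gamma * v], [-gamma * v, gamma]]
    return lam

def is_lorentz(lam, tol=1e-12):
    """Check the defining condition Lambda eta Lambda^T = eta."""
    return np.allclose(lam @ ETA @ lam.T, ETA, atol=tol)

if __name__ == "__main__":
    print(is_lorentz(rotation_about_x(0.3)))
    print(is_lorentz(boost_along_x(0.6)))
    # Products of Lorentz transformations are again Lorentz transformations.
    print(is_lorentz(rotation_about_x(0.3) @ boost_along_x(0.6)))
```

In matrix form the index expression $\Lambda^\mu{}_\sigma \eta^{\sigma\tau} \Lambda^\nu{}_\tau$ is just $\Lambda \eta \Lambda^T$, which is what `is_lorentz` compares against $\eta$.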
The Lorentz transformations form a Lie group under matrix multiplication
(see III Symmetries, Fields and Particles).
The Lorentz transformations have a representation on the fields. For a scalar
field, this is given by
\[ \phi(x) \mapsto \phi'(x) = \phi(\Lambda^{-1} x), \]
where the indices are suppressed. This is an active transformation: say $x_0$
is the point at which, say, the field is a maximum. Then after applying the
Lorentz transformation, the position of the new maximum is $\Lambda x_0$. The field
itself actually moved.
Alternatively, we can use passive transformations, where we just relabel the
points. In this case, we have
\[ \phi(x) \mapsto \phi(\Lambda(x)). \]
However, this doesn't really matter, since if $\Lambda$ is a Lorentz transformation, then
so is $\Lambda^{-1}$. So being invariant under active transformations is the same as being
invariant under passive transformations.
A Lorentz invariant theory should have equations of motion such that if $\phi(x)$
is a solution, then so is $\phi(\Lambda^{-1} x)$. This can be achieved by requiring that the
action $S$ is invariant under Lorentz transformations.
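Example. In the Klein–Gordon field, we have
\[ \mathcal{L} = \frac{1}{2} \partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2} m^2 \phi^2. \]
The Lorentz transformation is given by
\[ \phi(x) \mapsto \phi'(x) = \phi(\Lambda^{-1} x) = \phi(y), \]
where
\[ y^\mu = (\Lambda^{-1})^\mu{}_\nu x^\nu. \]
We then check that
\begin{align*}
\partial_\mu \phi(x) &\mapsto \frac{\partial}{\partial x^\mu} (\phi(\Lambda^{-1} x)) \\
&= \frac{\partial}{\partial x^\mu} (\phi(y)) \\
&= \frac{\partial y^\nu}{\partial x^\mu} \frac{\partial}{\partial y^\nu} (\phi(y)) \\
&= (\Lambda^{-1})^\nu{}_\mu (\partial_\nu \phi)(y).
\end{align*}
Since $\Lambda^{-1}$ is a Lorentz transformation, we have
\[ \partial_\mu \phi\, \partial^\mu \phi = \partial_\mu \phi'\, \partial^\mu \phi'. \]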
In general, as long as we write everything in terms of tensors, we get Lorentz
invariant theories.
Symmetries play an important role in QFT. Different kinds of symmetries
include Lorentz symmetries, gauge symmetries, global symmetries and super-
symmetries (SUSY).
Example. Protons and neutrons are made up of quarks. Each type of quark
comes in three colours, which are called red, blue and green (these are arbitrary
names). If we swap around red and blue everywhere in the universe, then the
laws don't change. This is known as a global symmetry.
However, in light of relativity, swapping around red and blue everywhere in
the universe might be a bit absurd, since the universe is so big. What if we only
do it locally? If we make the change differently at different points, the equations
don't a priori remain invariant, unless we introduce a gauge boson. More of this
will be explored in the AQFT course.
1.3 Symmetries and Noether’s theorem for field theories
As in the case of classical dynamics, we get a Noether’s theorem that tells us
symmetries of the Lagrangian give us conserved quantities. However, we have
to be careful here. If we want to treat space and time equally, saying that a
quantity “does not change in time” is bad. Instead, what we have is a conserved
current, which is a 4-vector. Then given any choice of spacetime frame, we can
integrate this conserved current over all space at each time (with respect to the
frame), and this quantity will be time-invariant.
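Theorem (Noether's theorem). Every continuous symmetry of $\mathcal{L}$ gives rise to
a conserved current $j^\mu(x)$ such that the equation of motion implies that
\[ \partial_\mu j^\mu = 0. \]
More explicitly, this gives
\[ \partial_0 j^0 + \nabla \cdot \mathbf{j} = 0. \]
A conserved current gives rise to a conserved charge
\[ Q = \int_{\mathbb{R}^3} j^0\, d^3\mathbf{x}, \]
since
\begin{align*}
\frac{dQ}{dt} &= \int_{\mathbb{R}^3} \frac{dj^0}{dt}\, d^3\mathbf{x} \\
&= -\int_{\mathbb{R}^3} \nabla \cdot \mathbf{j}\, d^3\mathbf{x} \\
&= 0,
\end{align*}
assuming that $j^i \to 0$ as $|\mathbf{x}| \to \infty$.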
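Proof. Consider making an arbitrary transformation of the field $\phi_a \mapsto \phi_a + \delta\phi_a$.
We then have
\begin{align*}
\delta\mathcal{L} &= \frac{\partial \mathcal{L}}{\partial \phi_a} \delta\phi_a + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \delta(\partial_\mu \phi_a) \\
&= \left[ \frac{\partial \mathcal{L}}{\partial \phi_a} - \partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \right] \delta\phi_a + \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \delta\phi_a \right).
\end{align*}
When the equations of motion are satisfied, we know the first term always
vanishes. So we are left with
\[ \delta\mathcal{L} = \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \delta\phi_a \right). \]
If the specific transformation $\delta\phi_a = X_a$ we are considering is a symmetry, then
$\delta\mathcal{L} = 0$ (this is the definition of a symmetry). In this case, we can define a
conserved current by
\[ j^\mu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} X_a, \]
and by the equations above, this is actually conserved.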
We can have a slight generalization where we relax the condition for a
symmetry and still get a conserved current. We say that $X_a$ is a symmetry if
$\delta\mathcal{L} = \partial_\mu F^\mu(\phi)$ for some $F^\mu(\phi)$, i.e. a total derivative. Replaying the calculations,
we get
\[ j^\mu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} X_a - F^\mu. \]
Example (Space-time invariance). Recall that in classical dynamics, spatial
invariance implies the conservation of momentum, and invariance with respect to time
translation implies the conservation of energy. We'll see something similar in
field theory. Consider $x^\mu \mapsto x^\mu - \varepsilon^\mu$. Then we obtain
\[ \phi_a(x) \mapsto \phi_a(x) + \varepsilon^\nu \partial_\nu \phi_a(x). \]
A Lagrangian that has no explicit $x^\mu$ dependence transforms as
\[ \mathcal{L}(x) \mapsto \mathcal{L}(x) + \varepsilon^\nu \partial_\nu \mathcal{L}(x), \]
giving rise to 4 currents, one for each $\nu = 0, 1, 2, 3$. We have
\[ (j^\mu)_\nu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)} \partial_\nu \phi_a - \delta^\mu_\nu \mathcal{L}. \]
This particular current is called $T^\mu{}_\nu$, the energy-momentum tensor. This satisfies
\[ \partial_\mu T^\mu{}_\nu = 0. \]
We obtain conserved quantities, namely the energy
\[ E = \int d^3\mathbf{x}\; T^{00}, \]
and the total momentum
\[ P^i = \int d^3\mathbf{x}\; T^{0i}. \]
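Example. Consider the Klein–Gordon field, with
\[ \mathcal{L} = \frac{1}{2} \partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2} m^2 \phi^2. \]
We then obtain
\[ T^{\mu\nu} = \partial^\mu \phi\, \partial^\nu \phi - \eta^{\mu\nu} \mathcal{L}. \]
So we have
\[ E = \int d^3\mathbf{x} \left( \frac{1}{2} \dot{\phi}^2 + \frac{1}{2} (\nabla\phi)^2 + \frac{1}{2} m^2 \phi^2 \right). \]
The momentum is given by
\[ P^i = \int d^3\mathbf{x}\; \dot{\phi}\, \partial^i \phi. \]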
In this example, $T^{\mu\nu}$ comes out symmetric in $\mu$ and $\nu$. In general, it would
not be, but we can always massage it into a symmetric form by defining
\[ \sigma^{\mu\nu} = T^{\mu\nu} + \partial_\rho \Gamma^{\rho\mu\nu} \]
with $\Gamma^{\rho\mu\nu}$ a tensor antisymmetric in $\rho$ and $\mu$. Then we have
\[ \partial_\mu \partial_\rho \Gamma^{\rho\mu\nu} = 0. \]
So this $\sigma^{\mu\nu}$ is also conserved.
A symmetric energy-momentum tensor of this form is actually useful, and is
found on the RHS of Einstein's field equation.
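Returning to the Klein–Gordon energy-momentum tensor $T^{\mu\nu} = \partial^\mu \phi\, \partial^\nu \phi - \eta^{\mu\nu} \mathcal{L}$, its conservation can be checked directly rather than inferred from Noether's theorem. The following worked check is a sketch that uses only the Klein–Gordon Lagrangian and equation of motion from the earlier examples.

```latex
\begin{align*}
\partial_\mu T^{\mu\nu}
  &= (\partial_\mu \partial^\mu \phi)\, \partial^\nu \phi
     + \partial^\mu \phi\, \partial_\mu \partial^\nu \phi
     - \partial^\nu \mathcal{L}, \\
\partial^\nu \mathcal{L}
  &= \partial_\mu \phi\, \partial^\nu \partial^\mu \phi - m^2 \phi\, \partial^\nu \phi, \\
\partial_\mu T^{\mu\nu}
  &= (\partial_\mu \partial^\mu \phi + m^2 \phi)\, \partial^\nu \phi = 0,
\end{align*}
```

where the last equality holds only on solutions of $\partial_\mu \partial^\mu \phi + m^2 \phi = 0$. This makes explicit that the current is conserved on-shell, not identically.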
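Example (Internal symmetries). Consider a complex scalar field
\[ \psi(x) = \frac{1}{\sqrt{2}} (\phi_1(x) + i\phi_2(x)), \]
where $\phi_1$ and $\phi_2$ are real scalar fields. We put
\[ \mathcal{L} = \partial_\mu \psi^* \partial^\mu \psi - V(\psi^* \psi), \]
where
\[ V(\psi^* \psi) = m^2 \psi^* \psi + \frac{\lambda}{2} (\psi^* \psi)^2 + \cdots \]
is some potential term.
To find the equations of motion, if we do all the complex analysis required,
we will figure that we will obtain the same equations as the real case if we treat
$\psi$ and $\psi^*$ as independent variables. In this case, we obtain
\[ \partial_\mu \partial^\mu \psi + m^2 \psi + \lambda (\psi^* \psi) \psi + \cdots = 0 \]
and its complex conjugate. The $\mathcal{L}$ has a symmetry given by $\psi \mapsto e^{i\alpha} \psi$.
Infinitesimally, we have $\delta\psi = i\alpha\psi$, and $\delta\psi^* = -i\alpha\psi^*$.
This gives a current
\[ j^\mu = i(\partial^\mu \psi^*) \psi - i(\partial^\mu \psi) \psi^*. \]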
We will later see that associated charges of this type have an interpretation of
electric charge (or particle number, e.g. baryon number or lepton number).
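As a check on the U(1) current $j^\mu = i(\partial^\mu \psi^*) \psi - i(\partial^\mu \psi) \psi^*$, one can verify its conservation directly from the equations of motion. In this sketch, $V'$ denotes the derivative of $V$ with respect to its argument $\psi^* \psi$; that shorthand is introduced here for illustration and is not notation from the lectures.

```latex
\begin{align*}
\partial_\mu \partial^\mu \psi &= -V'(\psi^* \psi)\, \psi, \qquad
\partial_\mu \partial^\mu \psi^* = -V'(\psi^* \psi)\, \psi^*, \\
\partial_\mu j^\mu
  &= i (\partial_\mu \partial^\mu \psi^*) \psi + i\, \partial^\mu \psi^* \partial_\mu \psi
     - i (\partial_\mu \partial^\mu \psi) \psi^* - i\, \partial^\mu \psi\, \partial_\mu \psi^* \\
  &= i \left[ (\partial_\mu \partial^\mu \psi^*) \psi - (\partial_\mu \partial^\mu \psi) \psi^* \right] \\
  &= i \left[ -V' \psi^* \psi + V' \psi\, \psi^* \right] = 0.
\end{align*}
```

The cross terms cancel identically, and the remaining terms cancel because the potential depends on the fields only through the combination $\psi^* \psi$, which is exactly what the phase symmetry requires.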
Note that this symmetry is an abelian symmetry, since it is a symmetry under
the action of the abelian group U(1). There is a generalization to a non-abelian
case.
Example (Non-abelian internal symmetries). Suppose we have a theory with
many fields, with the Lagrangian given by
\[ \mathcal{L} = \frac{1}{2} \sum_{a=1}^N \partial_\mu \phi_a \partial^\mu \phi_a - \frac{1}{2} \sum_{a=1}^N m^2 \phi_a^2 - g \left( \sum_{a=1}^N \phi_a^2 \right)^2. \]
This theory is invariant under the bigger symmetry group $G = SO(N)$. If we
view the fields as components of complex fields, then we have a symmetry under
$U(N/2)$ or even $SU(N/2)$. For example, the symmetry group $SU(3)$ gives the
8-fold way.
Example. There is a nice trick to determine the conserved current when our
infinitesimal transformation is given by $\delta\phi = \alpha\phi$ for some real constant $\alpha$.
Consider the case where we have an arbitrary perturbation $\alpha = \alpha(x)$. In this
case, $\mathcal{L}$ is no longer invariant, but we know that whatever formula we manage
to come up with for $\delta\mathcal{L}$, it has to vanish when $\alpha$ is constant. Assuming that we only
have first-order derivatives, the change in Lagrangian must be of the form
\[ \delta\mathcal{L} = (\partial_\mu \alpha(x))\, h^\mu(\phi) \]
for some $h^\mu$. We claim that $h^\mu$ is the conserved current. Indeed, we have
\[ \delta S = \int d^4x\; \delta\mathcal{L} = -\int d^4x\; \alpha(x)\, \partial_\mu h^\mu, \]
using integration by parts. We know that if the equations of motion are satisfied,
then this vanishes for any $\alpha(x)$ as long as it vanishes at infinity (or the boundary).
So we must have
\[ \partial_\mu h^\mu = 0. \]
1.4 Hamiltonian mechanics
We can now talk about the Hamiltonian formulation. This can be done for field
theories as well. We define
Definition (Conjugate momentum). Given a Lagrangian system for a field $\phi$,
we define the conjugate momentum by
\[ \pi(\mathbf{x}) = \frac{\partial \mathcal{L}}{\partial \dot{\phi}}. \]
This is not to be confused with the total momentum $P^i$.
Definition (Hamiltonian density). The Hamiltonian density is given by
\[ \mathcal{H} = \pi(\mathbf{x}) \dot{\phi}(\mathbf{x}) - \mathcal{L}(\mathbf{x}), \]
where we replace all occurrences of $\dot{\phi}(\mathbf{x})$ by expressing it in terms of $\pi(\mathbf{x})$.
Example. Suppose we have a field Lagrangian of the form
\[ \mathcal{L} = \frac{1}{2} \dot{\phi}^2 - \frac{1}{2} (\nabla\phi)^2 - V(\phi). \]
Then we can compute that
\[ \pi = \dot{\phi}. \]
So we can easily find
\[ \mathcal{H} = \frac{1}{2} \pi^2 + \frac{1}{2} (\nabla\phi)^2 + V(\phi). \]
Definition (Hamiltonian). The Hamiltonian of a Hamiltonian system is
\[ H = \int d^3\mathbf{x}\; \mathcal{H}. \]
This agrees with the field energy we computed using Noether's theorem.
Definition (Hamilton's equations). Hamilton's equations are
\[ \dot{\phi} = \frac{\partial H}{\partial \pi}, \quad \dot{\pi} = -\frac{\partial H}{\partial \phi}. \]
These give us the equations of motion of $\phi$.
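To see Hamilton's equations at work, take the example Hamiltonian density $\mathcal{H} = \frac{1}{2} \pi^2 + \frac{1}{2} (\nabla\phi)^2 + V(\phi)$. The sketch below assumes the derivatives are read as functional derivatives of $H = \int d^3\mathbf{x}\, \mathcal{H}$, so that the gradient term is integrated by parts with vanishing boundary terms; that reading is an assumption made for this illustration.

```latex
\begin{align*}
\dot{\phi} &= \frac{\delta H}{\delta \pi} = \pi, \\
\dot{\pi} &= -\frac{\delta H}{\delta \phi} = \nabla^2 \phi - \frac{\partial V}{\partial \phi}, \\
\ddot{\phi} - \nabla^2 \phi + \frac{\partial V}{\partial \phi} &= 0.
\end{align*}
```

The last line follows by substituting the first equation into the second, and it agrees with the Euler-Lagrange equation $\partial_\mu \partial^\mu \phi + \partial V / \partial \phi = 0$ obtained from the Lagrangian.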
There is an obvious problem with this, that the Hamiltonian formulation is
not manifestly Lorentz invariant. However, we know it actually is because we
derived it as an equivalent formulation of a Lorentz invariant theory. So we are
safe, so far. We will later have to be more careful if we want to quantize the
theories from the Hamiltonian formalism.
2 Free field theory
So far, we’ve just been talking about classical field theory. We now want to
quantize this, and actually do quantum field theory.
2.1 Review of simple harmonic oscillator
Michael Peskin once famously said “Physics is the subset of human experience
that can be reduced to coupled harmonic oscillators”. Thus, to understand
quantum mechanics, it is important to understand how the quantum harmonic
oscillator works.
Classically, the simple harmonic oscillator is given by the Hamiltonian
\[ H = \frac{1}{2} p^2 + \frac{1}{2} \omega^2 q^2, \]
where $p$ is the momentum and $q$ is the position. To obtain the corresponding
quantum system, canonical quantization tells us that we should promote the $p$
and $q$ into complex "operators" $\hat{p}, \hat{q}$, and use the same formula for the Hamiltonian,
namely we now have
\[ \hat{H} = \frac{1}{2} \hat{p}^2 + \frac{1}{2} \omega^2 \hat{q}^2. \]
In the classical system, the quantities $p$ and $q$ used to satisfy the Poisson brackets
\[ \{q, p\} = 1. \]
After promoting to operators, they satisfy the commutation relation
\[ [\hat{q}, \hat{p}] = i. \]
We will soon stop writing the hats because we are lazy.
There are a few things to take note of.
(i) We said $p$ and $q$ are "operators", but did not say what they actually
operate on! Instead, what we are going to do is to analyze these operators
formally, and after some careful analysis, we show that there is a space the
operators naturally act on, and then take that as our state space. This is,
in general, how we are going to do quantum field theory (except we tend
to replace the word "careful" with "sloppy").
During the analysis, we will suppose there are indeed some states our
operators act on, and then try to figure out what properties the states
must have.
(ii) The process of canonical quantization depends not only on the classical
system itself, but also on how we decide to present our system. There is no
immediate reason why if we pick different coordinates for our classical
system, the resulting quantum system would be equivalent. In particular,
before quantization, all the terms commute, but after quantization that is
no longer true. So how we decide to order the terms in the Hamiltonian
matters.
Later, we will come up with the notion of normal ordering. From then
onwards, we can have a (slightly) more consistent way of quantizing
operators.
After we've done this, the time evolution of states is governed by the
Schrödinger equation:
\[ i \frac{d}{dt} |\psi\rangle = H |\psi\rangle. \]
In practice, instead of trying to solve this thing, we want to find eigenstates $|E\rangle$
such that
\[ H |E\rangle = E |E\rangle. \]
If such states are found, then defining
\[ |\psi\rangle = e^{-iEt} |E\rangle \]
would give a nice, stable solution to the Schrödinger equation.
The trick is to notice that in the classical case, we can factorize the Hamiltonian as
\[ H = \omega \left( \sqrt{\frac{\omega}{2}}\, q + \frac{i}{\sqrt{2\omega}}\, p \right) \left( \sqrt{\frac{\omega}{2}}\, q - \frac{i}{\sqrt{2\omega}}\, p \right). \]
Now $H$ is a product of two terms that are complex conjugates to each other,
which, in operator terms, means they are adjoints. So we have the benefit
that we only have to deal with a single complex object $\sqrt{\frac{\omega}{2}}\, q + \frac{i}{\sqrt{2\omega}}\, p$ (and its
conjugate), rather than two unrelated real objects. Also, products are nicer than
sums. (If this justification for why this is a good idea doesn't sound convincing,
just suppose that we had the idea of doing this via divine inspiration, and it
turns out to work well.)
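We now do the same factorization in the quantum case. We would not
expect the result to be exactly the above, since that working relies on $p$ and $q$
commuting. However, we can still try and define the operators
\[ a = \frac{i}{\sqrt{2\omega}}\, p + \sqrt{\frac{\omega}{2}}\, q, \quad a^\dagger = \frac{-i}{\sqrt{2\omega}}\, p + \sqrt{\frac{\omega}{2}}\, q. \]
These are known as creation and annihilation operators for reasons that will
become clear soon.
We can invert these to find
\[ q = \frac{1}{\sqrt{2\omega}} (a + a^\dagger), \quad p = -i \sqrt{\frac{\omega}{2}} (a - a^\dagger). \]
We can substitute these equations into the commutator relation to obtain
\[ [a, a^\dagger] = 1. \]
Putting them into the Hamiltonian, we obtain
\[ H = \frac{1}{2} \omega (a a^\dagger + a^\dagger a) = \omega \left( a^\dagger a + \frac{1}{2} [a, a^\dagger] \right) = \omega \left( a^\dagger a + \frac{1}{2} \right). \]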
We can now compute
\[ [H, a^\dagger] = \omega a^\dagger, \quad [H, a] = -\omega a. \]
These ensure that $a, a^\dagger$ take us between energy eigenstates: if
\[ H |E\rangle = E |E\rangle, \]
then
\[ H a^\dagger |E\rangle = (a^\dagger H + [H, a^\dagger]) |E\rangle = (E + \omega)\, a^\dagger |E\rangle. \]
Similarly, we have
\[ H a |E\rangle = (E - \omega)\, a |E\rangle. \]
So assuming we have some energy eigenstate $|E\rangle$, these operators give us loads
more with eigenvalues
\[ \cdots, E - 2\omega, E - \omega, E, E + \omega, E + 2\omega, \cdots. \]
If the energy is bounded below, then there must be a ground state $|0\rangle$ satisfying
$a |0\rangle = 0$. Then the other "excited states" would come from repeated applications
of $a^\dagger$, labelled by
\[ |n\rangle = (a^\dagger)^n |0\rangle, \]
with
\[ H |n\rangle = \left( n + \frac{1}{2} \right) \omega |n\rangle. \]
Note that we were lazy and ignored normalization, so we have $\langle n | n \rangle \neq 1$.
One important feature is that the ground state energy is non-zero. Indeed,
we have
\[ H |0\rangle = \omega \left( a^\dagger a + \frac{1}{2} \right) |0\rangle = \frac{\omega}{2} |0\rangle. \]
Notice that we managed to figure out what the eigenvalues of $H$ must be, without
having a particular state space (assuming the state space is non-trivial). Now we
know what is the appropriate space to work in. The right space is the Hilbert
space generated by the orthonormal basis
\[ \{ |0\rangle, |1\rangle, |2\rangle, \cdots \}. \]
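The ladder-operator algebra can be checked numerically on a truncated state space. This is a minimal sketch, assuming NumPy and the normalised convention $a |n\rangle = \sqrt{n}\, |n-1\rangle$ (the text's unnormalised states differ only by rescaling the basis vectors); the function names and the cut-off `n_levels` are hypothetical choices for this illustration.

```python
import numpy as np

def ladder_operators(n_levels):
    """Annihilation and creation operators on the first n_levels number states.

    Uses the normalised convention a|n> = sqrt(n)|n-1>.
    """
    a = np.diag(np.sqrt(np.arange(1, n_levels)), k=1)
    return a, a.conj().T

def oscillator_hamiltonian(omega, n_levels):
    """H = p^2/2 + omega^2 q^2/2, with q and p written in terms of a and a^dagger."""
    a, a_dag = ladder_operators(n_levels)
    q = (a + a_dag) / np.sqrt(2.0 * omega)
    p = -1j * np.sqrt(omega / 2.0) * (a - a_dag)
    return 0.5 * (p @ p) + 0.5 * omega**2 * (q @ q)

if __name__ == "__main__":
    omega, n_levels = 2.0, 8
    H = oscillator_hamiltonian(omega, n_levels)
    energies = np.real(np.diag(H))
    # The highest level is distorted by the truncation, so it is excluded.
    expected = omega * (np.arange(n_levels - 1) + 0.5)
    print(np.allclose(energies[:-1], expected))
```

Truncating to finitely many levels necessarily breaks $[a, a^\dagger] = 1$ in the last diagonal entry, which is why the comparison leaves out the top level. Everything below the cut-off reproduces the spectrum $(n + \frac{1}{2}) \omega$.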
2.2 The quantum field
We are now going to use canonical quantization to promote our classical fields
to quantum fields. We will first deal with the case of a real scalar field.
Definition (Real scalar quantum field). A (real, scalar) quantum field is an
operator-valued function of space $\phi$, with conjugate momentum $\pi$, satisfying the
commutation relations
\[ [\phi(\mathbf{x}), \phi(\mathbf{y})] = 0 = [\pi(\mathbf{x}), \pi(\mathbf{y})] \]
and
\[ [\phi(\mathbf{x}), \pi(\mathbf{y})] = i\delta^3(\mathbf{x} - \mathbf{y}). \]
In the case where we have many fields labelled by $a \in I$, the commutation relations
are
\[ [\phi_a(\mathbf{x}), \phi_b(\mathbf{y})] = 0 = [\pi^a(\mathbf{x}), \pi^b(\mathbf{y})] \]
and
\[ [\phi_a(\mathbf{x}), \pi^b(\mathbf{y})] = i\delta^3(\mathbf{x} - \mathbf{y})\, \delta_a^b. \]
The evolution of states is again given by the Schrödinger equation.
Definition (Schrödinger equation). The Schrödinger equation says
\[ i \frac{d}{dt} |\psi\rangle = H |\psi\rangle. \]
However, we will, as before, usually not care and just look for eigenvalues of
$H$.
As in the case of the harmonic oscillator, our plan is to rewrite the field in
terms of creation and annihilation operators. Note that in quantum mechanics, it
is always possible to write the position and momentum in terms of some creation
and annihilation operators for any system. It’s just that if the system is not
a simple harmonic oscillator, these operators do not necessarily have the nice
properties we want them to have. So we are just going to express everything in
creation and annihilation operators nevertheless, and see what happens.
2.3 Real scalar fields
We start with simple problems. We look at free theories, where the Lagrangian
is quadratic in the field, so that the equation of motion is linear. We will see that
the whole field then decomposes into many independent harmonic oscillators.
Before we jump into the quantum case, we first look at what happens in a
classical free field.
Example. The simplest free theory is the classic Klein–Gordon theory for a
real scalar field $\phi(\mathbf{x}, t)$. The equations of motion are
\[ \partial_\mu \partial^\mu \phi + m^2 \phi = 0. \]
To see why this is free, we take the Fourier transform so that
\[ \phi(\mathbf{x}, t) = \int \frac{d^3\mathbf{p}}{(2\pi)^3}\, e^{i\mathbf{p} \cdot \mathbf{x}}\, \tilde{\phi}(\mathbf{p}, t). \]
We substitute this into the Klein–Gordon equation to obtain
\[ \left( \frac{\partial^2}{\partial t^2} + (\mathbf{p}^2 + m^2) \right) \tilde{\phi}(\mathbf{p}, t) = 0. \]
This is just the usual equation for a simple harmonic oscillator for each $\mathbf{p}$,
independently, with frequency $\omega_{\mathbf{p}} = \sqrt{\mathbf{p}^2 + m^2}$. So the solution to the classical
Klein–Gordon equation is a superposition of simple harmonic oscillators, each
vibrating at a different frequency (and a different amplitude).
For completeness, we will note that the Hamiltonian density for this field is
given by
\[ \mathcal{H} = \frac{1}{2} (\pi^2 + (\nabla\phi)^2 + m^2 \phi^2). \]
So to quantize the Klein–Gordon field, we just have to quantize this infinite
number of harmonic oscillators!
We are going to do this in two steps. First, we write our quantum fields
φ
(
x
)
and π(x) in terms of their Fourier transforms
φ(x) =
Z
d
3
p
(2π)
3
e
ip·x
˜
φ(p)
π(x) =
Z
d
3
p
(2π)
3
e
ip·x
˜π(p)
Confusingly,
π
represents both the conjugate momentum and the mathematical
constant, but it should be clear from the context.
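If we believe in our classical analogy, then the operators $\tilde{\phi}(\mathbf{p})$ and $\tilde{\pi}(\mathbf{p})$ should represent the position and momentum of quantum harmonic oscillators. So we further write them as
\begin{align*}
\phi(\mathbf{x}) &= \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\mathbf{p}}}} \left(a_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{x}} + a_{\mathbf{p}}^\dagger e^{-i\mathbf{p}\cdot\mathbf{x}}\right),\\
\pi(\mathbf{x}) &= \int \frac{d^3 p}{(2\pi)^3} (-i)\sqrt{\frac{\omega_{\mathbf{p}}}{2}} \left(a_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{x}} - a_{\mathbf{p}}^\dagger e^{-i\mathbf{p}\cdot\mathbf{x}}\right),
\end{align*}
where we have $\omega_{\mathbf{p}}^2 = \mathbf{p}^2 + m^2$.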
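Note that despite what we said above, we are multiplying $a_{\mathbf{p}}^\dagger$ by $e^{-i\mathbf{p}\cdot\mathbf{x}}$, and not $e^{i\mathbf{p}\cdot\mathbf{x}}$. This is so that $\phi(\mathbf{x})$ will be manifestly a real quantity.
We are now going to find the commutation relations for the $a_{\mathbf{p}}$ and $a_{\mathbf{p}}^\dagger$.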
Throughout the upcoming computations, we will frequently make use of the
following result:
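Proposition. We have
\[ \int \frac{d^3 p}{(2\pi)^3}\, e^{-i\mathbf{p}\cdot\mathbf{x}} = \delta^3(\mathbf{x}). \]
Proposition. The canonical commutation relations of $\phi$, $\pi$, namely
\begin{align*}
[\phi(\mathbf{x}), \phi(\mathbf{y})] &= 0,\\
[\pi(\mathbf{x}), \pi(\mathbf{y})] &= 0,\\
[\phi(\mathbf{x}), \pi(\mathbf{y})] &= i\delta^3(\mathbf{x} - \mathbf{y}),
\end{align*}
are equivalent to
\begin{align*}
[a_{\mathbf{p}}, a_{\mathbf{q}}] &= 0,\\
[a_{\mathbf{p}}^\dagger, a_{\mathbf{q}}^\dagger] &= 0,\\
[a_{\mathbf{p}}, a_{\mathbf{q}}^\dagger] &= (2\pi)^3 \delta^3(\mathbf{p} - \mathbf{q}).
\end{align*}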
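Proof. We will only prove one small part of the equivalence, as the others are similarly tedious and boring computations, and you are not going to read it anyway. We will use the commutation relations for the $a_{\mathbf{p}}$ to obtain the commutation relations for $\phi$ and $\pi$. We can compute
\begin{align*}
[\phi(\mathbf{x}), \pi(\mathbf{y})]
&= \int \frac{d^3 p\, d^3 q}{(2\pi)^6} \frac{(-i)}{2} \sqrt{\frac{\omega_{\mathbf{q}}}{\omega_{\mathbf{p}}}} \left(-[a_{\mathbf{p}}, a_{\mathbf{q}}^\dagger]\, e^{i\mathbf{p}\cdot\mathbf{x} - i\mathbf{q}\cdot\mathbf{y}} + [a_{\mathbf{p}}^\dagger, a_{\mathbf{q}}]\, e^{-i\mathbf{p}\cdot\mathbf{x} + i\mathbf{q}\cdot\mathbf{y}}\right)\\
&= \int \frac{d^3 p\, d^3 q}{(2\pi)^6} \frac{(-i)}{2} \sqrt{\frac{\omega_{\mathbf{q}}}{\omega_{\mathbf{p}}}}\, (2\pi)^3 \left(-\delta^3(\mathbf{p} - \mathbf{q})\, e^{i\mathbf{p}\cdot\mathbf{x} - i\mathbf{q}\cdot\mathbf{y}} - \delta^3(\mathbf{q} - \mathbf{p})\, e^{-i\mathbf{p}\cdot\mathbf{x} + i\mathbf{q}\cdot\mathbf{y}}\right)\\
&= \frac{(-i)}{2} \int \frac{d^3 p}{(2\pi)^3} \left(-e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})} - e^{i\mathbf{p}\cdot(\mathbf{y} - \mathbf{x})}\right)\\
&= i\delta^3(\mathbf{x} - \mathbf{y}).
\end{align*}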
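Note that to prove the inverse direction, we have to invert the relation between $\phi(\mathbf{x})$, $\pi(\mathbf{x})$ and $a_{\mathbf{p}}$, $a_{\mathbf{p}}^\dagger$, and express $a_{\mathbf{p}}$ and $a_{\mathbf{p}}^\dagger$ in terms of $\phi$ and $\pi$ by using
\begin{align*}
\int d^3 x\, \phi(\mathbf{x})\, e^{i\mathbf{p}\cdot\mathbf{x}} &= \frac{1}{\sqrt{2\omega_{\mathbf{p}}}} \left(a_{-\mathbf{p}} + a_{\mathbf{p}}^\dagger\right),\\
\int d^3 x\, \pi(\mathbf{x})\, e^{i\mathbf{p}\cdot\mathbf{x}} &= (-i)\sqrt{\frac{\omega_{\mathbf{p}}}{2}} \left(a_{-\mathbf{p}} - a_{\mathbf{p}}^\dagger\right).
\end{align*}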
So our creation and annihilation operators do satisfy commutation relations
similar to the case of a simple harmonic oscillator.
The next thing to do is to express $H$ in terms of $a_{\mathbf{p}}$ and $a_{\mathbf{p}}^\dagger$. Before we plunge into the horrendous calculations that you are probably going to skip, it is a good idea to stop and think what we are going to expect. If we have enough faith, then we should expect that we are going to end up with infinitely many decoupled harmonic oscillators. In other words, we would have
\[ H = \int \frac{d^3 p}{(2\pi)^3}\, (\text{a harmonic oscillator of frequency } \omega_{\mathbf{p}}). \]
But if this were true, then we would have a serious problem. Recall that each harmonic oscillator has a non-zero ground state energy, and this ground state energy increases with frequency. Now that we have infinitely many harmonic oscillators of arbitrarily high frequency, the ground state energy would add up to infinity! What’s worse is that when we derived the ground state energy for the harmonic oscillator, the energy is $\frac{1}{2}\omega[a, a^\dagger]$. Now our $[a_{\mathbf{p}}, a_{\mathbf{q}}^\dagger]$ is $(2\pi)^3\delta^3(\mathbf{p} - \mathbf{q})$, which is infinite at $\mathbf{q} = \mathbf{p}$. So our ground state energy is an infinite sum of infinities! This is so bad.
These problems indeed will occur. We will discuss how we are going to avoid
them later on, after we make ourselves actually compute the Hamiltonian.
As in the classical case, we have
\[ H = \frac{1}{2}\int d^3 x\, \left(\pi^2 + (\nabla\phi)^2 + m^2\phi^2\right). \]
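For the sake of sanity, we will evaluate this piece by piece. We have
\begin{align*}
\int d^3 x\, \pi^2
&= -\int \frac{d^3 x\, d^3 p\, d^3 q}{(2\pi)^6} \frac{\sqrt{\omega_{\mathbf{p}}\omega_{\mathbf{q}}}}{2} \left(a_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{x}} - a_{\mathbf{p}}^\dagger e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\left(a_{\mathbf{q}} e^{i\mathbf{q}\cdot\mathbf{x}} - a_{\mathbf{q}}^\dagger e^{-i\mathbf{q}\cdot\mathbf{x}}\right)\\
&= -\int \frac{d^3 x\, d^3 p\, d^3 q}{(2\pi)^6} \frac{\sqrt{\omega_{\mathbf{p}}\omega_{\mathbf{q}}}}{2} \left(a_{\mathbf{p}} a_{\mathbf{q}} e^{i(\mathbf{p}+\mathbf{q})\cdot\mathbf{x}} - a_{\mathbf{p}}^\dagger a_{\mathbf{q}} e^{i(\mathbf{q}-\mathbf{p})\cdot\mathbf{x}} - a_{\mathbf{p}} a_{\mathbf{q}}^\dagger e^{i(\mathbf{p}-\mathbf{q})\cdot\mathbf{x}} + a_{\mathbf{p}}^\dagger a_{\mathbf{q}}^\dagger e^{-i(\mathbf{p}+\mathbf{q})\cdot\mathbf{x}}\right)\\
&= -\int \frac{d^3 p\, d^3 q}{(2\pi)^3} \frac{\sqrt{\omega_{\mathbf{p}}\omega_{\mathbf{q}}}}{2} \left(a_{\mathbf{p}} a_{\mathbf{q}} \delta^3(\mathbf{p}+\mathbf{q}) - a_{\mathbf{p}}^\dagger a_{\mathbf{q}} \delta^3(\mathbf{p}-\mathbf{q}) - a_{\mathbf{p}} a_{\mathbf{q}}^\dagger \delta^3(\mathbf{p}-\mathbf{q}) + a_{\mathbf{p}}^\dagger a_{\mathbf{q}}^\dagger \delta^3(\mathbf{p}+\mathbf{q})\right)\\
&= \int \frac{d^3 p}{(2\pi)^3} \frac{\omega_{\mathbf{p}}}{2} \left((a_{\mathbf{p}}^\dagger a_{\mathbf{p}} + a_{\mathbf{p}} a_{\mathbf{p}}^\dagger) - (a_{\mathbf{p}} a_{-\mathbf{p}} + a_{\mathbf{p}}^\dagger a_{-\mathbf{p}}^\dagger)\right).
\end{align*}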
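That was tedious. We similarly compute
\begin{align*}
\int d^3 x\, (\nabla\phi)^2
&= \int \frac{d^3 x\, d^3 p\, d^3 q}{(2\pi)^6} \frac{1}{2\sqrt{\omega_{\mathbf{p}}\omega_{\mathbf{q}}}} \left(i\mathbf{p}\, a_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{x}} - i\mathbf{p}\, a_{\mathbf{p}}^\dagger e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\cdot\left(i\mathbf{q}\, a_{\mathbf{q}} e^{i\mathbf{q}\cdot\mathbf{x}} - i\mathbf{q}\, a_{\mathbf{q}}^\dagger e^{-i\mathbf{q}\cdot\mathbf{x}}\right)\\
&= \int \frac{d^3 p}{(2\pi)^3} \frac{\mathbf{p}^2}{2\omega_{\mathbf{p}}} \left((a_{\mathbf{p}}^\dagger a_{\mathbf{p}} + a_{\mathbf{p}} a_{\mathbf{p}}^\dagger) + (a_{\mathbf{p}} a_{-\mathbf{p}} + a_{\mathbf{p}}^\dagger a_{-\mathbf{p}}^\dagger)\right),\\
\int d^3 x\, m^2\phi^2
&= \int \frac{d^3 p}{(2\pi)^3} \frac{m^2}{2\omega_{\mathbf{p}}} \left((a_{\mathbf{p}}^\dagger a_{\mathbf{p}} + a_{\mathbf{p}} a_{\mathbf{p}}^\dagger) + (a_{\mathbf{p}} a_{-\mathbf{p}} + a_{\mathbf{p}}^\dagger a_{-\mathbf{p}}^\dagger)\right).
\end{align*}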
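Putting all these together, we have
\begin{align*}
H &= \frac{1}{2}\int d^3 x\, \left(\pi^2 + (\nabla\phi)^2 + m^2\phi^2\right)\\
&= \frac{1}{4}\int \frac{d^3 p}{(2\pi)^3} \left[\left(-\omega_{\mathbf{p}} + \frac{\mathbf{p}^2}{\omega_{\mathbf{p}}} + \frac{m^2}{\omega_{\mathbf{p}}}\right)(a_{\mathbf{p}} a_{-\mathbf{p}} + a_{\mathbf{p}}^\dagger a_{-\mathbf{p}}^\dagger) + \left(\omega_{\mathbf{p}} + \frac{\mathbf{p}^2}{\omega_{\mathbf{p}}} + \frac{m^2}{\omega_{\mathbf{p}}}\right)(a_{\mathbf{p}} a_{\mathbf{p}}^\dagger + a_{\mathbf{p}}^\dagger a_{\mathbf{p}})\right].
\end{align*}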
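Now the first term vanishes, since we have $\omega_{\mathbf{p}}^2 = \mathbf{p}^2 + m^2$. So we are left with
\begin{align*}
H &= \frac{1}{4}\int \frac{d^3 p}{(2\pi)^3} \frac{1}{\omega_{\mathbf{p}}} (\omega_{\mathbf{p}}^2 + \mathbf{p}^2 + m^2)(a_{\mathbf{p}} a_{\mathbf{p}}^\dagger + a_{\mathbf{p}}^\dagger a_{\mathbf{p}})\\
&= \frac{1}{2}\int \frac{d^3 p}{(2\pi)^3}\, \omega_{\mathbf{p}} (a_{\mathbf{p}} a_{\mathbf{p}}^\dagger + a_{\mathbf{p}}^\dagger a_{\mathbf{p}})\\
&= \int \frac{d^3 p}{(2\pi)^3}\, \omega_{\mathbf{p}} \left(a_{\mathbf{p}}^\dagger a_{\mathbf{p}} + \frac{1}{2}[a_{\mathbf{p}}, a_{\mathbf{p}}^\dagger]\right)\\
&= \int \frac{d^3 p}{(2\pi)^3}\, \omega_{\mathbf{p}} \left(a_{\mathbf{p}}^\dagger a_{\mathbf{p}} + \frac{1}{2}(2\pi)^3\delta^3(\mathbf{0})\right).
\end{align*}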
Note that if we cover the $\int \frac{d^3 p}{(2\pi)^3}$, then the final three lines are exactly what we got for a simple harmonic oscillator of frequency $\omega_{\mathbf{p}} = \sqrt{\mathbf{p}^2 + m^2}$.
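To make the single-oscillator comparison concrete, here is a small sketch that builds the ladder operator of one mode as a matrix on a truncated number basis and checks that $\frac{\omega}{2}(a a^\dagger + a^\dagger a)$ has diagonal entries $\omega(n + \frac{1}{2})$. The truncation to `n_levels` states and all helper names are assumptions made for illustration. The highest level is spoiled by the truncation and should be ignored.

```python
import math


def annihilation_matrix(n_levels):
    """Matrix of a in the basis |0>, ..., |n_levels - 1>, with a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * n_levels for _ in range(n_levels)]
    for n in range(1, n_levels):
        a[n - 1][n] = math.sqrt(n)
    return a


def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def transpose(x):
    """Transpose; for a real matrix this is also the Hermitian conjugate."""
    return [list(row) for row in zip(*x)]


def symmetric_hamiltonian_diagonal(omega, n_levels):
    """Diagonal of (omega / 2)(a a^dagger + a^dagger a) on the truncated basis."""
    a = annihilation_matrix(n_levels)
    a_dagger = transpose(a)
    a_a_dagger = matmul(a, a_dagger)
    a_dagger_a = matmul(a_dagger, a)
    return [0.5 * omega * (a_a_dagger[n][n] + a_dagger_a[n][n])
            for n in range(n_levels)]


if __name__ == "__main__":
    # All but the last entry should read omega * (n + 1/2).
    print(symmetric_hamiltonian_diagonal(2.0, 6))
```

The constant $\frac{1}{2}\omega$ per mode is the piece that becomes $\frac{1}{2}\omega_{\mathbf{p}}(2\pi)^3\delta^3(\mathbf{0})$ in the field theory.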
Following the simple harmonic oscillator, we postulate that we have a vacuum state $|0\rangle$ such that
\[ a_{\mathbf{p}} |0\rangle = 0 \]
for all $\mathbf{p}$.
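When $H$ acts on this, the $a_{\mathbf{p}}^\dagger a_{\mathbf{p}}$ terms all vanish. So the energy of this ground state comes from the second term only, and we have
\[ H |0\rangle = \frac{1}{2}\int \frac{d^3 p}{(2\pi)^3}\, \omega_{\mathbf{p}}\, (2\pi)^3\delta^3(\mathbf{0})\, |0\rangle = \infty\, |0\rangle, \]
since all $\omega_{\mathbf{p}}$ are non-negative.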
Quantum field theory is always full of these infinities! But they tell us
something important. Often, they tell us that we are asking a stupid question.
Let’s take a minute to explore this infinity and see what it means. The infinity is a problem, since what we want to do in the end is to compute some actual probabilities in real life, and infinities aren’t exactly easy to calculate with.
Let’s analyze the infinities one by one. The first thing we want to tackle is the $\delta^3(\mathbf{0})$. Recall that our $\delta^3$ can be thought of as
\[ (2\pi)^3\delta^3(\mathbf{p}) = \int d^3 x\, e^{i\mathbf{p}\cdot\mathbf{x}}. \]
When we evaluate this at $\mathbf{p} = \mathbf{0}$, we are then integrating the number 1 over all space, and thus the result is infinite! You might say, duh, space is so big. If there is some energy everywhere, then of course the total energy is infinite. This problem is known as infrared divergence. So the idea is to look at the energy density, i.e. the energy per unit volume. Heuristically, since the $(2\pi)^3\delta^3(\mathbf{0})$ is just measuring the “volume” of the universe, we would get rid of it by simply throwing away the factor of $(2\pi)^3\delta^3(\mathbf{0})$.
If we want to make this a bit more sophisticated, we would enclose our universe in a box, and thus we can replace the $(2\pi)^3\delta^3(\mathbf{0})$ with the volume $V$. This trick is known as an infrared cutoff. Then we have
\[ E_0 = \frac{E}{V} = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{2}\omega_{\mathbf{p}}. \]
We can then safely take the limit as $V \to \infty$, and forget about this $\delta^3(\mathbf{0})$ problem.
This is still infinite, since $\omega_{\mathbf{p}}$ grows without bound as $|\mathbf{p}| \to \infty$. In other words, the ground state energies for each simple harmonic oscillator add up to infinity.
These are high frequency divergences at short distances, and are called ultraviolet divergences. Fortunately, our quantum field theorists are humble beings and believe their theories are wrong! This will be a recurring theme: we will only assume that our theories are low-energy approximations of the real world. While this might seem pessimistic, it is practically a sensible thing to do, since our experiments can only access low-energy properties of the universe, so what we really see is the low-energy approximation of the real theory.
Under this assumption, we would want to cut off the integral at high momentum in some way. In other words, we just arbitrarily put a bound on the integral over $\mathbf{p}$, instead of integrating over all possible $\mathbf{p}$. While the cut-off point is arbitrary, it doesn’t really matter. In (non-gravitational) physics, we only care about energy differences, and picking a different cut-off point would just add a constant energy to everything. Alternatively, we can also do something slightly more sophisticated to avoid this arbitrariness, as we will see in the example of the Casimir effect soon.
Even more straightforwardly, if we just care about energy differences, we can just forget about the infinite term, and write
\[ H = \int \frac{d^3 p}{(2\pi)^3}\, \omega_{\mathbf{p}}\, a_{\mathbf{p}}^\dagger a_{\mathbf{p}}. \]
Then we have
\[ H |0\rangle = 0. \]
While all this cancelling of infinities sounds bonkers, the theory actually fits experimental data very well. So we have to live with it.
The difference between this $H$ and the previous one with infinities is an ordering ambiguity in going from the classical to the quantum theory. Recall that we did the quantization by replacing the terms in the classical Hamiltonian with operators. However, terms in the classical Hamiltonian commute, while their quantum counterparts do not. So if we write the classical Hamiltonian in a different way, we get a different quantized theory. Indeed, if we initially wrote
\[ H = \frac{1}{2}(\omega q - ip)(\omega q + ip) \]
for the classical Hamiltonian of a single harmonic oscillator, and then did the quantization, we would have obtained
\[ H = \omega a^\dagger a. \]
It is convenient to have the following notion:
Definition (Normal order). Given a string of operators
\[ \phi_1(x_1) \cdots \phi_n(x_n), \]
the normal order is what you obtain when you put all the annihilation operators to the right of (i.e. acting before) all the creation operators. This is written as
\[ {:}\phi_1(x_1) \cdots \phi_n(x_n){:}. \]
So, for example,
\[ {:}H{:} = \int \frac{d^3 p}{(2\pi)^3}\, \omega_{\mathbf{p}}\, a_{\mathbf{p}}^\dagger a_{\mathbf{p}}. \]
In the future, we will assume that when we quantize our theory, we do so in a
way that the resulting operators are in normal order.
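A minimal sketch of what normal ordering does to a product of ladder operators of a single bosonic mode is given below. The string labels `"a"` and `"adag"` and the function name are hypothetical conventions chosen for this illustration. Commutator terms are simply dropped, which is exactly what the colons instruct us to do.

```python
def normal_order(operators):
    """Normal-order a product of single-mode bosonic ladder operators.

    `operators` lists the factors from left to right, each being the string
    "a" (annihilation) or "adag" (creation). The result has every "adag" to
    the left of every "a". No commutator terms are generated: they are
    discarded by definition of the normal order.
    """
    creators = [op for op in operators if op == "adag"]
    annihilators = [op for op in operators if op == "a"]
    if len(creators) + len(annihilators) != len(operators):
        raise ValueError("each operator must be 'a' or 'adag'")
    return creators + annihilators


if __name__ == "__main__":
    # :a adag: = adag a, so :(omega/2)(a adag + adag a): = omega adag a.
    print(normal_order(["a", "adag"]))
    print(normal_order(["a", "adag", "a", "adag"]))
```

Since bosonic operators pick up no sign when reordered, sorting is all there is to it; for fermions one would also have to track a sign for each swap.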
Applications: The Casimir effect
Notice that we happily set $E_0 = 0$, claiming that only energy differences are measured. But there exists a situation where differences in the vacuum fluctuations themselves can be measured, when we have two separated regions of vacuum. In this example, we will also demonstrate how the infrared and ultraviolet cutoffs can be achieved. We are going to enclose the world in a box of length $L$, and then do the ultraviolet cutoff in a way parametrized by a constant $a$. We then do some computations under these cutoffs, derive an answer, and then take the limit as $L \to \infty$ and $a \to 0$. If we asked the right question, then the answer will tend towards a finite limit as we take these limits.
To regulate the infrared divergences, we put the universe in a box again. We make the $x$ direction periodic, with a large period $L$. So this is a one-dimensional box. We impose periodic boundary conditions
\[ \phi(x, y, z) = \phi(x + L, y, z). \]
We are now going to put two reflecting plates in the box at some distance $d \ll L$ apart. The plates impose $\phi(x) = 0$ on the plates.
[Figure: two parallel plates a distance $d$ apart inside the periodic box of length $L$.]
The presence of the plates means that the momentum of the field between them is quantized:
\[ \mathbf{p} = \left(\frac{\pi n}{d}, p_y, p_z\right) \]
for $n \in \mathbb{Z}^+$. For a massless scalar field, the energy per unit area between the plates is
\[ E(d) = \sum_{n=1}^\infty \int \frac{dp_y\, dp_z}{(2\pi)^2} \frac{1}{2}\sqrt{\left(\frac{\pi n}{d}\right)^2 + p_y^2 + p_z^2}. \]
The energy outside the plates is then $E(L - d)$. The total energy is then
\[ E = E(d) + E(L - d). \]
This energy (at least naively) depends on $d$. So there is a force between the plates! This is the Casimir effect, predicted in 1948, and observed in 1958. In the lab, this was done with the EM field, and the plates impose the boundary conditions.
Note that as before, we had to neglect modes with $|\mathbf{p}|$ too high. More precisely, we pick some distance scale $a \ll d$, and ignore modes where $|\mathbf{p}| \gg a^{-1}$. This is known as the ultraviolet cut-off. This is reasonable, since for high momentum modulus, we would break through the plates. Then we have
\[ E(d) = A \sum_n \int \frac{dp_y\, dp_z}{(2\pi)^2} \frac{1}{2} |\mathbf{p}|\, e^{-a|\mathbf{p}|}. \]
Note that as we set $a \to 0$, we get the previous expression.
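Since we are scared by this integral, we consider the special case where we live in the 1 + 1 dimensional world. Then this becomes
\begin{align*}
E_{1+1}(d) &= \frac{\pi}{2d} \sum_{n=1}^\infty n\, e^{-an\pi/d}\\
&= -\frac{1}{2} \frac{d}{da} \left(\sum_n e^{-an\pi/d}\right)\\
&= -\frac{1}{2} \frac{d}{da} \frac{1}{1 - e^{-a\pi/d}}\\
&= \frac{\pi}{2d} \frac{e^{-a\pi/d}}{(1 - e^{-a\pi/d})^2}\\
&= \frac{d}{2\pi a^2} - \frac{\pi}{24 d} + O(a^2).
\end{align*}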
Our total energy is
\[ E = E(d) + E(L - d) = \frac{L}{2\pi a^2} - \frac{\pi}{24}\left(\frac{1}{d} + \frac{1}{L - d}\right) + O(a^2). \]
As $a \to 0$, this is still infinite, but the infinite term does not depend on $d$. The force itself is just
\[ \frac{\partial E}{\partial d} = \frac{\pi}{24 d^2} - \frac{\pi}{24 (L - d)^2} + O(a^2), \]
which is finite as $a \to 0$ and $L \to \infty$, where it tends to $\pi/(24 d^2)$. So as we remove both the infrared and UV cutoffs, we still get a sensible finite force.
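As a rough numerical illustration of the cancellation, the following sketch differentiates the regulated total energy $E(d) + E(L - d)$ in 1 + 1 dimensions by a central difference and compares the result with $\pi/(24 d^2)$. The step size `h`, the sample values of $L$ and $a$, and the function names are assumptions made for this example.

```python
import math


def energy_1plus1(d, a):
    """Regulated energy (pi / 2d) exp(-x) / (1 - exp(-x))^2 with x = a pi / d."""
    x = a * math.pi / d
    one_minus_exp = -math.expm1(-x)  # avoids cancellation when x is tiny
    return (math.pi / (2.0 * d)) * math.exp(-x) / one_minus_exp ** 2


def total_energy(d, box_length, a):
    """Energy between the plates plus energy in the rest of the periodic box."""
    return energy_1plus1(d, a) + energy_1plus1(box_length - d, a)


def energy_derivative(d, box_length, a, h=1e-3):
    """Central-difference estimate of dE/dd at fixed box length and cutoff."""
    upper = total_energy(d + h, box_length, a)
    lower = total_energy(d - h, box_length, a)
    return (upper - lower) / (2.0 * h)


if __name__ == "__main__":
    d, box_length, a = 1.0, 100.0, 0.01
    print(energy_derivative(d, box_length, a), math.pi / (24.0 * d * d))
```

Each of the two energies is of order $L/a^2$, yet their $d$-dependence nearly cancels, leaving a derivative of order one that is insensitive to the cutoffs.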
In 3 + 1 dimensions, if we were to do the more complicated integral, it turns out we get
\[ \frac{1}{A} \frac{\partial E}{\partial d} = \frac{\pi^2}{480 d^4}. \]
The actual Casimir effect for electromagnetic fields is double this, due to the two polarization states of the photon.
Recovering particles
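We called the operators $a_{\mathbf{p}}^\dagger$ and $a_{\mathbf{p}}$ the creation and annihilation operators. Let us verify that they actually create and annihilate things! Recall that the (normal ordered) Hamiltonian is given by
\[ H = \int \frac{d^3 q}{(2\pi)^3}\, \omega_{\mathbf{q}}\, a_{\mathbf{q}}^\dagger a_{\mathbf{q}}. \]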
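Then we can compute
\begin{align*}
[H, a_{\mathbf{p}}^\dagger] &= \int \frac{d^3 q}{(2\pi)^3}\, \omega_{\mathbf{q}}\, [a_{\mathbf{q}}^\dagger a_{\mathbf{q}}, a_{\mathbf{p}}^\dagger]\\
&= \int \frac{d^3 q}{(2\pi)^3}\, \omega_{\mathbf{q}}\, a_{\mathbf{q}}^\dagger\, (2\pi)^3\delta^3(\mathbf{p} - \mathbf{q})\\
&= \omega_{\mathbf{p}}\, a_{\mathbf{p}}^\dagger.
\end{align*}
Similarly, we obtain
\[ [H, a_{\mathbf{p}}] = -\omega_{\mathbf{p}}\, a_{\mathbf{p}}, \]
which means (like SHM) we can construct energy eigenstates by acting with $a_{\mathbf{p}}^\dagger$. We let
\[ |\mathbf{p}\rangle = a_{\mathbf{p}}^\dagger |0\rangle. \]
Then we have
\[ H |\mathbf{p}\rangle = \omega_{\mathbf{p}} |\mathbf{p}\rangle, \]
where the eigenvalue satisfies
\[ \omega_{\mathbf{p}}^2 = \mathbf{p}^2 + m^2. \]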
But from special relativity, we also know that the energy of a particle of mass $m$ and momentum $\mathbf{p}$ is given by
\[ E_{\mathbf{p}}^2 = \mathbf{p}^2 + m^2. \]
So we interpret $|\mathbf{p}\rangle$ as the momentum eigenstate of a particle of mass $m$ and momentum $\mathbf{p}$. And we identify $m$ with the mass of the quantized particle. From now on, we will write $E_{\mathbf{p}}$ instead of $\omega_{\mathbf{p}}$.
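Let’s check this interpretation. After normal ordering, we have
\[ \mathbf{P} = -\int \pi(\mathbf{x}) \nabla\phi(\mathbf{x})\, d^3 x = \int \frac{d^3 p}{(2\pi)^3}\, \mathbf{p}\, a_{\mathbf{p}}^\dagger a_{\mathbf{p}}. \]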
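So we have
\begin{align*}
\mathbf{P} |\mathbf{p}\rangle &= \int \frac{d^3 q}{(2\pi)^3}\, \mathbf{q}\, a_{\mathbf{q}}^\dagger a_{\mathbf{q}} (a_{\mathbf{p}}^\dagger |0\rangle)\\
&= \int \frac{d^3 q}{(2\pi)^3}\, \mathbf{q}\, a_{\mathbf{q}}^\dagger \left(a_{\mathbf{p}}^\dagger a_{\mathbf{q}} + (2\pi)^3\delta^3(\mathbf{p} - \mathbf{q})\right) |0\rangle\\
&= \mathbf{p}\, a_{\mathbf{p}}^\dagger |0\rangle\\
&= \mathbf{p} |\mathbf{p}\rangle.
\end{align*}
So the state has total momentum $\mathbf{p}$.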
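What about multi-particle states? We just have to act with more $a_{\mathbf{p}}^\dagger$’s. We have the $n$-particle state
\[ |\mathbf{p}_1, \mathbf{p}_2, \cdots, \mathbf{p}_n\rangle = a_{\mathbf{p}_1}^\dagger a_{\mathbf{p}_2}^\dagger \cdots a_{\mathbf{p}_n}^\dagger |0\rangle. \]
Note that $|\mathbf{p}, \mathbf{q}\rangle = |\mathbf{q}, \mathbf{p}\rangle$ for any $\mathbf{p}, \mathbf{q}$, since the creation operators commute. So the states are symmetric under interchange of any two particles, i.e. the particles are bosons.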
Now we can tell what our state space is. It is given by the span of states of the form
\[ |0\rangle,\quad a_{\mathbf{p}}^\dagger |0\rangle,\quad a_{\mathbf{p}}^\dagger a_{\mathbf{q}}^\dagger |0\rangle,\quad \cdots \]
This is known as the Fock space. As in the case of SHM, there is also an operator which counts the number of particles. It is given by
\[ N = \int \frac{d^3 p}{(2\pi)^3}\, a_{\mathbf{p}}^\dagger a_{\mathbf{p}}. \]
So we have
\[ N |\mathbf{p}_1, \cdots, \mathbf{p}_n\rangle = n\, |\mathbf{p}_1, \cdots, \mathbf{p}_n\rangle. \]
It is easy to compute that
[N, H] = 0
So the particle number is conserved in the free theory. This is not true in general!
Usually, when particles are allowed to interact, the interactions may create or
destroy particles. It is only because we are in a free theory that we have particle
number conservation.
Although we are calling these states “particles”, they aren’t localized: they’re momentum eigenstates. Theoretically, we can create a localized state via a Fourier transform:
\[ |\mathbf{x}\rangle = \int \frac{d^3 p}{(2\pi)^3}\, e^{-i\mathbf{p}\cdot\mathbf{x}}\, |\mathbf{p}\rangle. \]
More generally, we can create a wave-packet, and insert $\psi(\mathbf{p})$ to get
\[ |\psi\rangle = \int \frac{d^3 p}{(2\pi)^3}\, e^{-i\mathbf{p}\cdot\mathbf{x}}\, \psi(\mathbf{p})\, |\mathbf{p}\rangle. \]
Then this wave-packet can be partially localized both in space and in momentum. For example, we can take $\psi$ to be the Gaussian
\[ \psi(\mathbf{p}) = e^{-\mathbf{p}^2/2m^2}. \]
Note now that neither $|\mathbf{x}\rangle$ nor $|\psi\rangle$ is an $H$-eigenstate, just like in non-relativistic quantum mechanics.
Relativistic normalization
As we did our quantum field theory, we have completely forgotten about making our theory Lorentz invariant. Now let’s try to recover some. We had a vacuum state $|0\rangle$, which we can reasonably assume to be Lorentz invariant. We then produced some 1-particle states $|\mathbf{p}\rangle = a_{\mathbf{p}}^\dagger |0\rangle$. This is certainly not a “scalar” quantity, so it would probably transform as we change coordinates.
However, we can still impose compatibility with relativity, by requiring that its norm is Lorentz invariant. We have chosen $|0\rangle$ such that
\[ \langle 0|0\rangle = 1. \]
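We can compute
\[ \langle\mathbf{p}|\mathbf{q}\rangle = \langle 0| a_{\mathbf{p}} a_{\mathbf{q}}^\dagger |0\rangle = \langle 0| a_{\mathbf{q}}^\dagger a_{\mathbf{p}} + (2\pi)^3\delta^3(\mathbf{p} - \mathbf{q}) |0\rangle = (2\pi)^3\delta^3(\mathbf{p} - \mathbf{q}). \]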
There is absolutely no reason to believe that this is a Lorentz invariant quantity,
as p and q are 3-vectors. Indeed, it isn’t.
As the resulting quantity is a scalar, we might hope that we can come up with new states
\[ |p\rangle = A_p |\mathbf{p}\rangle \]
so that
\[ \langle p|q\rangle = (2\pi)^3 A_p^* A_q\, \delta^3(\mathbf{p} - \mathbf{q}) \]
is a Lorentz invariant quantity. Note that we write non-bold characters for “relativistic” things, and bold characters for “non-relativistic” things. Thus, we think of the $p$ in $|p\rangle$ as the 4-vector $p$, and the $\mathbf{p}$ in $|\mathbf{p}\rangle$ as the 3-vector $\mathbf{p}$.
The trick to figuring out the right normalization is by looking at an object
we know is Lorentz invariant, and factoring it into small pieces. If we know that
all but one of the pieces are Lorentz invariant, then the remaining piece must be
Lorentz invariant as well.
Our first example would be the following:
Proposition. The expression
\[ \int \frac{d^3 p}{2E_{\mathbf{p}}} \]
is Lorentz-invariant, where
\[ E_{\mathbf{p}}^2 = \mathbf{p}^2 + m^2 \]
for some fixed $m$.
Proof. We know $\int d^4 p$ certainly is Lorentz invariant, and
\[ m^2 = p_\mu p^\mu = p^2 = p_0^2 - \mathbf{p}^2 \]
is also a Lorentz-invariant quantity. So for any $m$, the expression
\[ \int d^4 p\, \delta(p_0^2 - \mathbf{p}^2 - m^2) \]
is also Lorentz invariant. Writing
\[ E_{\mathbf{p}}^2 = p_0^2 = \mathbf{p}^2 + m^2, \]
integrating over $p_0$ in the integral gives
\[ \int \frac{d^3 p}{2p_0} = \int \frac{d^3 p}{2E_{\mathbf{p}}}, \]
and this is Lorentz invariant.
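The invariance of the measure can be checked numerically in a simplified setting. The sketch below works in 1 + 1 dimensions (an assumption made to keep the example short), boosts an on-shell momentum with velocity $v$, and tests that $dp'/E' = dp/E$. The function names and the finite-difference step are illustrative.

```python
import math


def energy(p, m):
    """On-shell energy E_p = sqrt(p^2 + m^2)."""
    return math.sqrt(p * p + m * m)


def boost(p, m, v):
    """Boost the on-shell 2-momentum (E_p, p) with velocity v (c = 1).

    Returns the pair (E', p').
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    e = energy(p, m)
    return gamma * (e + v * p), gamma * (p + v * e)


def measure_ratio(p, m, v, h=1e-6):
    """Ratio (dp' / E') / (dp / E), with dp'/dp from a central difference."""
    _, p_plus = boost(p + h, m, v)
    _, p_minus = boost(p - h, m, v)
    dp_prime_dp = (p_plus - p_minus) / (2.0 * h)
    e_prime, _ = boost(p, m, v)
    return dp_prime_dp * energy(p, m) / e_prime


if __name__ == "__main__":
    # The ratio should be 1 up to finite-difference error.
    print(measure_ratio(0.7, 1.0, 0.6))
```

Analytically, $dp'/dp = \gamma(1 + vp/E) = E'/E$, which is the same statement in one line.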
Thus, we have
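Proposition. The expression
\[ 2E_{\mathbf{p}}\, \delta^3(\mathbf{p} - \mathbf{q}) \]
is Lorentz invariant.
Proof. We have
\[ \int \frac{d^3 p}{2E_{\mathbf{p}}} \cdot \left(2E_{\mathbf{p}}\, \delta^3(\mathbf{p} - \mathbf{q})\right) = 1. \]
Since the RHS is Lorentz invariant, and the measure is Lorentz invariant, we know $2E_{\mathbf{p}}\, \delta^3(\mathbf{p} - \mathbf{q})$ must be Lorentz invariant.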
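From this, we learn that the correctly normalized states are
\[ |p\rangle = \sqrt{2E_{\mathbf{p}}}\, |\mathbf{p}\rangle = \sqrt{2E_{\mathbf{p}}}\, a_{\mathbf{p}}^\dagger |0\rangle. \]
These new states satisfy
\[ \langle p|q\rangle = (2\pi)^3 (2E_{\mathbf{p}})\, \delta^3(\mathbf{p} - \mathbf{q}), \]
which is Lorentz invariant.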
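We can now define relativistically normalized creation operators by
\[ a^\dagger(p) = \sqrt{2E_{\mathbf{p}}}\, a_{\mathbf{p}}^\dagger. \]
Then we can write our field as
\[ \phi(\mathbf{x}) = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \left(a(p)\, e^{i\mathbf{p}\cdot\mathbf{x}} + a^\dagger(p)\, e^{-i\mathbf{p}\cdot\mathbf{x}}\right). \]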
2.4 Complex scalar fields
What happens when we want to talk about a complex scalar field? A classical free complex scalar field would have Lagrangian
\[ \mathcal{L} = \partial_\mu \psi^* \partial^\mu \psi - \mu^2 \psi^* \psi. \]
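Again, we want to write the quantized $\psi$ as an integral of annihilation and creation operators indexed by momentum. In the case of a real scalar field, we had the expression
\[ \phi(\mathbf{x}) = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left(a_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{x}} + a_{\mathbf{p}}^\dagger e^{-i\mathbf{p}\cdot\mathbf{x}}\right). \]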
We needed the “coefficients” of $e^{i\mathbf{p}\cdot\mathbf{x}}$ and those of $e^{-i\mathbf{p}\cdot\mathbf{x}}$ to be $a_{\mathbf{p}}$ and its conjugate $a_{\mathbf{p}}^\dagger$ so that the resulting $\phi$ would be real. Now we know that our $\psi$ is a complex quantity, so there is no reason to assert that the coefficients are conjugates of each other. Thus, we take the more general decomposition
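\begin{align*}
\psi(\mathbf{x}) &= \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left(b_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{x}} + c_{\mathbf{p}}^\dagger e^{-i\mathbf{p}\cdot\mathbf{x}}\right),\\
\psi^\dagger(\mathbf{x}) &= \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left(b_{\mathbf{p}}^\dagger e^{-i\mathbf{p}\cdot\mathbf{x}} + c_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{x}}\right).
\end{align*}
Then the conjugate momentum is
\begin{align*}
\pi(\mathbf{x}) &= \int \frac{d^3 p}{(2\pi)^3}\, i\sqrt{\frac{E_{\mathbf{p}}}{2}} \left(b_{\mathbf{p}}^\dagger e^{-i\mathbf{p}\cdot\mathbf{x}} - c_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{x}}\right),\\
\pi^\dagger(\mathbf{x}) &= \int \frac{d^3 p}{(2\pi)^3}\, (-i)\sqrt{\frac{E_{\mathbf{p}}}{2}} \left(b_{\mathbf{p}} e^{i\mathbf{p}\cdot\mathbf{x}} - c_{\mathbf{p}}^\dagger e^{-i\mathbf{p}\cdot\mathbf{x}}\right).
\end{align*}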
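The commutation relations are
\[ [\psi(\mathbf{x}), \pi(\mathbf{y})] = [\psi^\dagger(\mathbf{x}), \pi^\dagger(\mathbf{y})] = i\delta^3(\mathbf{x} - \mathbf{y}), \]
with all other commutators zero.
Similar tedious computations show that
\[ [b_{\mathbf{p}}, b_{\mathbf{q}}^\dagger] = [c_{\mathbf{p}}, c_{\mathbf{q}}^\dagger] = (2\pi)^3\delta^3(\mathbf{p} - \mathbf{q}), \]
with all other commutators zero.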
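As before, we can find that the number operators
\[ N_c = \int \frac{d^3 p}{(2\pi)^3}\, c_{\mathbf{p}}^\dagger c_{\mathbf{p}}, \quad N_b = \int \frac{d^3 p}{(2\pi)^3}\, b_{\mathbf{p}}^\dagger b_{\mathbf{p}} \]
are conserved.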
But in the classical case, we had an extra conserved charge
\[ Q = i\int d^3 x\, (\dot{\psi}^* \psi - \psi^* \dot{\psi}) = i\int d^3 x\, (\pi\psi - \psi^* \pi^*). \]
Again by tedious computations, we can replace the $\pi$ and $\psi$ with their operator analogues, expand everything in terms of $c_{\mathbf{p}}$ and $b_{\mathbf{p}}$, throw away the pieces of infinity by requiring normal ordering, and then obtain
\[ Q = \int \frac{d^3 p}{(2\pi)^3} \left(c_{\mathbf{p}}^\dagger c_{\mathbf{p}} - b_{\mathbf{p}}^\dagger b_{\mathbf{p}}\right) = N_c - N_b. \]
Fortunately, after quantization, this quantity is still conserved, i.e. we have
[Q, H] = 0.
This is not a big deal, since $N_c$ and $N_b$ are separately conserved. However, in the interacting theory, we will find that $N_c$ and $N_b$ are not separately conserved, but $Q$ still is.
We can think of $c$ and $b$ particles as particle and anti-particle, and $Q$ computes the number of particles minus the number of antiparticles. Looking back, in the case of a real scalar field, we essentially had a system with $c = b$. So the particle is equal to its anti-particle.
2.5 The Heisenberg picture
We have tried to make our theory Lorentz invariant, but it’s currently in a terrible state. Indeed, the time evolution happens in the states, and the operators depend on space only. Fortunately, from IID Principles of Quantum Mechanics, we know that there is a way to encode the time evolution in the operators instead, via the Heisenberg picture.
Recall that the equation of motion is given by the Schrödinger equation
\[ i\frac{d}{dt} |\psi(t)\rangle = H |\psi(t)\rangle. \]
We can write the solution formally as
\[ |\psi(t)\rangle = e^{-iHt} |\psi(0)\rangle. \]
Thus we can think of $e^{-iHt}$ as the “time evolution operator” that sends $|\psi(0)\rangle$ forward in time by $t$.
Now if we are given an initial state $|\psi(0)\rangle$, and we want to know what an operator $O_S$ does to it at time $t$, we can first apply $e^{-iHt}$ to let $|\psi(0)\rangle$ evolve to time $t$, then apply the operator $O_S$, then pull the result back to time 0. So in total, we obtain the operator
\[ O_H(t) = e^{iHt}\, O_S\, e^{-iHt}. \]
To evaluate this expression, it is often convenient to note the following result:
Proposition. Let $A$ and $B$ be operators. Then
\[ e^A B e^{-A} = B + [A, B] + \frac{1}{2!}[A, [A, B]] + \frac{1}{3!}[A, [A, [A, B]]] + \cdots. \]
In particular, if $[A, B] = cB$ for some constant $c$, then we have
\[ e^A B e^{-A} = e^c B. \]
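Proof. For $\lambda$ a real variable, note that
\begin{align*}
\frac{d}{d\lambda}\left(e^{\lambda A} B e^{-\lambda A}\right)
&= \lim_{\varepsilon \to 0} \frac{e^{(\lambda + \varepsilon)A} B e^{-(\lambda + \varepsilon)A} - e^{\lambda A} B e^{-\lambda A}}{\varepsilon}\\
&= \lim_{\varepsilon \to 0} e^{\lambda A}\, \frac{e^{\varepsilon A} B e^{-\varepsilon A} - B}{\varepsilon}\, e^{-\lambda A}\\
&= \lim_{\varepsilon \to 0} e^{\lambda A}\, \frac{(1 + \varepsilon A) B (1 - \varepsilon A) - B + o(\varepsilon)}{\varepsilon}\, e^{-\lambda A}\\
&= \lim_{\varepsilon \to 0} e^{\lambda A}\, \frac{\varepsilon(AB - BA) + o(\varepsilon)}{\varepsilon}\, e^{-\lambda A}\\
&= e^{\lambda A} [A, B] e^{-\lambda A}.
\end{align*}
So by induction, we have
\[ \frac{d^n}{d\lambda^n}\left(e^{\lambda A} B e^{-\lambda A}\right) = e^{\lambda A} [A, [A, \cdots [A, B] \cdots]] e^{-\lambda A}. \]
Evaluating these at $\lambda = 0$, we obtain a power series representation
\[ e^{\lambda A} B e^{-\lambda A} = B + \lambda [A, B] + \frac{\lambda^2}{2}[A, [A, B]] + \cdots. \]
Putting $\lambda = 1$ then gives the desired result.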
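The proposition can be spot-checked on small matrices. The sketch below compares $e^A B e^{-A}$ with the truncated nested-commutator series for 2×2 real matrices. The matrix exponential is computed from its Taylor series, which is adequate only for the small matrices used here, and all helper names and the number of terms are assumptions of this illustration.

```python
def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y, scale=1.0):
    """Entrywise x + scale * y."""
    n = len(x)
    return [[x[i][j] + scale * y[i][j] for j in range(n)] for i in range(n)]


def commutator(x, y):
    """[x, y] = xy - yx."""
    return add(matmul(x, y), matmul(y, x), scale=-1.0)


def expm_series(x, terms=40):
    """exp(x) from the Taylor series, truncated after `terms` terms."""
    n = len(x)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, x)]
        result = add(result, term)
    return result


def conjugate(a, b):
    """Left-hand side e^A B e^{-A}."""
    minus_a = [[-v for v in row] for row in a]
    return matmul(matmul(expm_series(a), b), expm_series(minus_a))


def nested_commutator_series(a, b, terms=40):
    """Right-hand side B + [A,B] + [A,[A,B]]/2! + ..., truncated."""
    result = [row[:] for row in b]
    nested = [row[:] for row in b]
    for k in range(1, terms):
        # After this step, nested equals the k-fold commutator divided by k!.
        nested = [[v / k for v in row] for row in commutator(a, nested)]
        result = add(result, nested)
    return result


if __name__ == "__main__":
    a = [[0.4, 0.0], [0.0, -0.1]]
    b = [[0.0, 1.0], [0.0, 0.0]]  # here [A, B] = 0.5 B
    print(conjugate(a, b), nested_commutator_series(a, b))
```

In the diagonal example $[A, B] = cB$ with $c = 0.5$, so both sides reduce to $e^{0.5} B$, which is the special case used for the ladder operators.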
In the Heisenberg picture, one can readily verify that the commutation relations for our operators $\phi(\mathbf{x}, t)$ and $\pi(\mathbf{x}, t)$ now become equal time commutation relations
\[ [\phi(\mathbf{x}, t), \phi(\mathbf{y}, t)] = [\pi(\mathbf{x}, t), \pi(\mathbf{y}, t)] = 0, \quad [\phi(\mathbf{x}, t), \pi(\mathbf{y}, t)] = i\delta^3(\mathbf{x} - \mathbf{y}). \]
For an arbitrary operator $O_H$, we can compute
\begin{align*}
\frac{dO_H}{dt} &= \frac{d}{dt}\left(e^{iHt} O_S e^{-iHt}\right)\\
&= iH e^{iHt} O_S e^{-iHt} + e^{iHt} O_S \left(-iH e^{-iHt}\right)\\
&= i[H, O_H],
\end{align*}
where we used the fact that the Hamiltonian commutes with any function of
itself. This gives the time evolution for the operators.
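Recall that in quantum mechanics, when we transformed into the Heisenberg picture, a miracle occurred, and the equations of motion of the operators became the usual classical equations of motion. This will happen again. For $O_S = \phi(\mathbf{x})$, we have
\begin{align*}
\dot{\phi}(\mathbf{x}, t) &= i[H, \phi(\mathbf{x}, t)]\\
&= i\int d^3 y\, \frac{1}{2}\left[\pi^2(\mathbf{y}, t) + (\nabla_y \phi(\mathbf{y}, t))^2 + m^2\phi^2(\mathbf{y}, t),\, \phi(\mathbf{x}, t)\right]\\
&= i\int d^3 y\, \frac{1}{2}\left[\pi^2(\mathbf{y}, t),\, \phi(\mathbf{x}, t)\right]\\
&= i\int d^3 y\, \frac{1}{2}\left(\pi(\mathbf{y}, t)[\pi(\mathbf{y}, t), \phi(\mathbf{x}, t)] + [\pi(\mathbf{y}, t), \phi(\mathbf{x}, t)]\pi(\mathbf{y}, t)\right)\\
&= i\int d^3 y\, \pi(\mathbf{y}, t)\left(-i\delta^3(\mathbf{x} - \mathbf{y})\right)\\
&= \pi(\mathbf{x}, t).
\end{align*}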
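Similarly, we have
\begin{align*}
\dot{\pi}(\mathbf{x}, t) &= i[H, \pi(\mathbf{x}, t)]\\
&= \frac{i}{2}\int d^3 y \left(\left[(\nabla_y \phi(\mathbf{y}, t))^2, \pi(\mathbf{x}, t)\right] + m^2\left[\phi^2(\mathbf{y}, t), \pi(\mathbf{x}, t)\right]\right).
\end{align*}
Now notice that $\nabla_y$ does not interact with $\pi(\mathbf{x}, t)$. So this becomes
\[ = \int d^3 y \left(-(\nabla_y \phi(\mathbf{y}, t) \cdot \nabla_y \delta^3(\mathbf{x} - \mathbf{y})) - m^2\phi(\mathbf{y}, t)\, \delta^3(\mathbf{x} - \mathbf{y})\right). \]
Integrating the first term by parts, we obtain
\begin{align*}
&= \int d^3 y \left((\nabla_y^2 \phi(\mathbf{y}, t))\, \delta^3(\mathbf{x} - \mathbf{y}) - m^2\phi(\mathbf{y}, t)\, \delta^3(\mathbf{x} - \mathbf{y})\right)\\
&= \nabla^2\phi(\mathbf{x}, t) - m^2\phi(\mathbf{x}, t).
\end{align*}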
Finally, noting that $\dot{\pi} = \ddot{\phi}$ and rearranging, we obtain
\[ \partial_\mu \partial^\mu \phi(\mathbf{x}, t) + m^2\phi(\mathbf{x}, t) = 0. \]
This is just the usual Klein–Gordon equation!
It is interesting to note that this final equation of motion is Lorentz invariant, but our original
\[ \frac{d\phi}{dt} = i[H, \phi] \]
is not. Indeed, $\frac{d\phi}{dt}$ itself singles out time, and to define $H$ we also need to single out a preferred time direction. However, this statement tells us essentially that $H$ generates time evolution, which is true in any frame, for the appropriate definition of $H$ and $t$ in that frame.
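What happens to the creation and annihilation operators? We use our previous magic formula to find that
\begin{align*}
e^{iHt}\, a_{\mathbf{p}}\, e^{-iHt} &= e^{-iE_{\mathbf{p}} t}\, a_{\mathbf{p}},\\
e^{iHt}\, a_{\mathbf{p}}^\dagger\, e^{-iHt} &= e^{iE_{\mathbf{p}} t}\, a_{\mathbf{p}}^\dagger.
\end{align*}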
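These extra factors of $e^{\pm iE_{\mathbf{p}} t}$ might seem really annoying to carry around, but they are not! They merge perfectly into the rest of our formulas relativistically. Indeed, we have
\begin{align*}
\phi(x) \equiv \phi(\mathbf{x}, t) &= \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left(a_{\mathbf{p}}\, e^{-iE_{\mathbf{p}} t} e^{i\mathbf{p}\cdot\mathbf{x}} + a_{\mathbf{p}}^\dagger\, e^{iE_{\mathbf{p}} t} e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\\
&= \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left(a_{\mathbf{p}}\, e^{-ip\cdot x} + a_{\mathbf{p}}^\dagger\, e^{ip\cdot x}\right),
\end{align*}
where now $p \cdot x$ is the inner product of the 4-vectors $p$ and $x$! If we use our relativistically normalized
\[ a^\dagger(p) = \sqrt{2E_{\mathbf{p}}}\, a_{\mathbf{p}}^\dagger \]
instead, then the equation becomes
\[ \phi(x) = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \left(a(p)\, e^{-ip\cdot x} + a(p)^\dagger\, e^{ip\cdot x}\right), \]
which is made up of Lorentz invariant quantities only!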
Unfortunately, one should note that we are not yet completely Lorentz-
invariant, as our commutation relations are equal time commutation relations.
However, this is the best we can do, at least in this course.
Causality
Since we are doing relativistic things, it is important to ask ourselves if our theory is causal. If two points $x, y$ in spacetime are space-like separated, then a measurement of the field at $x$ should not affect the measurement of the field at $y$. So measuring $\phi(x)$ after $\phi(y)$ should give the same thing as measuring $\phi(y)$ after $\phi(x)$. So we must have $[\phi(x), \phi(y)] = 0$.
Definition (Causal theory). A theory is causal if for any space-like separated points $x, y$, and any two fields $\phi, \psi$, we have
\[ [\phi(x), \psi(y)] = 0. \]
Does our theory satisfy causality? For convenience, we will write
\[ \Delta(x - y) = [\phi(x), \phi(y)]. \]
We now use the familiar trick that our $\phi$ is Lorentz-invariant, so we can pick a convenient frame to do the computations. Suppose $x$ and $y$ are space-like separated. Then there is some frame where they are of the form
\[ x = (\mathbf{x}, t), \quad y = (\mathbf{y}, t) \]
for the same $t$. Then our equal time commutation relations tell us that
\[ \Delta(x - y) = [\phi(\mathbf{x}, t), \phi(\mathbf{y}, t)] = 0. \]
So our theory is indeed causal!
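What happens for time-like separation? We can derive the more general formula
\begin{align*}
\Delta(x - y) &= [\phi(x), \phi(y)]\\
&= \int \frac{d^3 p\, d^3 q}{(2\pi)^6} \frac{1}{\sqrt{4E_{\mathbf{p}} E_{\mathbf{q}}}} \Big([a_{\mathbf{p}}, a_{\mathbf{q}}]\, e^{i(-p\cdot x - q\cdot y)} + [a_{\mathbf{p}}, a_{\mathbf{q}}^\dagger]\, e^{i(-p\cdot x + q\cdot y)}\\
&\qquad\qquad + [a_{\mathbf{p}}^\dagger, a_{\mathbf{q}}]\, e^{i(p\cdot x - q\cdot y)} + [a_{\mathbf{p}}^\dagger, a_{\mathbf{q}}^\dagger]\, e^{i(p\cdot x + q\cdot y)}\Big)\\
&= \int \frac{d^3 p\, d^3 q}{(2\pi)^6} \frac{1}{\sqrt{4E_{\mathbf{p}} E_{\mathbf{q}}}}\, (2\pi)^3\delta^3(\mathbf{p} - \mathbf{q}) \left(e^{i(-p\cdot x + q\cdot y)} - e^{i(p\cdot x - q\cdot y)}\right)\\
&= \int \frac{d^3 p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}} \left(e^{ip\cdot(y - x)} - e^{-ip\cdot(y - x)}\right).
\end{align*}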
This doesn’t vanish for time-like separation, which we may wlog take to be of the form $x = (\mathbf{x}, 0)$, $y = (\mathbf{x}, t)$, since we have
\[ [\phi(\mathbf{x}, 0), \phi(\mathbf{x}, t)] \propto e^{-imt} - e^{imt}. \]
2.6 Propagators
We now introduce the very important idea of a propagator. Suppose we are living in the vacuum. We want to know the probability of a particle appearing at $x$ and then subsequently being destroyed at $y$. To quantify this, we introduce the propagator.
Definition (Propagator). The propagator of a real scalar field $\phi$ is defined to be
\[ D(x - y) = \langle 0| \phi(x)\phi(y) |0\rangle. \]
When we study interactions later, we will see that the propagator indeed
tells us the probability of a particle “propagating” from x to y.
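It is not difficult to compute the value of the propagator. We have
\[ \langle 0| \phi(x)\phi(y) |0\rangle = \int \frac{d^3 p}{(2\pi)^3} \frac{d^3 p'}{(2\pi)^3} \frac{1}{\sqrt{4E_{\mathbf{p}} E_{\mathbf{p}'}}} \langle 0| a_{\mathbf{p}} a_{\mathbf{p}'}^\dagger |0\rangle\, e^{-ip\cdot x + ip'\cdot y}, \]
with all other terms in $\phi(x)\phi(y)$ vanishing because they annihilate the vacuum. We now use the fact that
\[ \langle 0| a_{\mathbf{p}} a_{\mathbf{p}'}^\dagger |0\rangle = \langle 0| [a_{\mathbf{p}}, a_{\mathbf{p}'}^\dagger] |0\rangle = (2\pi)^3\delta^3(\mathbf{p} - \mathbf{p}'), \]
since $a_{\mathbf{p}'}^\dagger a_{\mathbf{p}} |0\rangle = 0$. So we have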
Proposition.
\[ D(x - y) = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}}\, e^{-ip\cdot(x - y)}. \]
The propagator relates to the $\Delta(x - y)$ we previously defined by the following simple relationship:
Proposition. We have
\[ \Delta(x - y) = D(x - y) - D(y - x). \]
Proof.
\[ \Delta(x - y) = [\phi(x), \phi(y)] = \langle 0| [\phi(x), \phi(y)] |0\rangle = D(x - y) - D(y - x), \]
where the second equality follows as $[\phi(x), \phi(y)]$ is just an ordinary function.
For a space-like separation $(x - y)^2 < 0$, one can show that it decays as
\[ D(x - y) \sim e^{-m|\mathbf{x} - \mathbf{y}|}. \]
This is non-zero! So the field “leaks” out of the light cone a bit. However, since there is no Lorentz-invariant way to order the events, if a particle can travel in a spacelike direction $x \to y$, it can just as easily travel in the other direction. So we would expect $D(x - y) = D(y - x)$, and in a measurement, both amplitudes cancel. This indeed agrees with our previous computation that $\Delta(x - y) = 0$.
Feynman propagator
As we will later see, it turns out that what actually is useful is not the above
propagator, but the Feynman propagator:
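Definition (Feynman propagator). The Feynman propagator is
\[ \Delta_F(x - y) = \langle 0| T\phi(x)\phi(y) |0\rangle = \begin{cases} \langle 0| \phi(x)\phi(y) |0\rangle & x^0 > y^0\\ \langle 0| \phi(y)\phi(x) |0\rangle & y^0 > x^0 \end{cases} \]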
Note that it seems like this is not a Lorentz-invariant quantity, because we are directly comparing $x^0$ and $y^0$. However, this is actually not a problem, since when $x$ and $y$ are space-like separated, then $\phi(x)$ and $\phi(y)$ commute, and the two quantities agree.
Proposition. We have
\[ \Delta_F(x - y) = \int \frac{d^4 p}{(2\pi)^4} \frac{i}{p^2 - m^2}\, e^{-ip\cdot(x - y)}. \]
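This expression is a priori ill-defined since for each $\mathbf{p}$, the integrand over $p^0$ has a pole whenever $(p^0)^2 = \mathbf{p}^2 + m^2$. So we need a prescription for avoiding this. We replace this with a complex contour integral with contour given by
[Figure: contour in the complex $p^0$ plane running along the real axis, dipping below the pole at $p^0 = -E_{\mathbf{p}}$ and passing above the pole at $p^0 = +E_{\mathbf{p}}$.]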
Proof. To compare with our previous computations of $D(x - y)$, we evaluate the $p^0$ integral for each $\mathbf{p}$. Writing
\[ \frac{1}{p^2 - m^2} = \frac{1}{(p^0)^2 - E_{\mathbf{p}}^2} = \frac{1}{(p^0 - E_{\mathbf{p}})(p^0 + E_{\mathbf{p}})}, \]
we see that the residue of the pole at $p^0 = \pm E_{\mathbf{p}}$ is $\pm\frac{1}{2E_{\mathbf{p}}}$.
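When $x^0 > y^0$, we close the contour in the lower half plane $p^0 \to -i\infty$, so $e^{-ip^0(x^0 - y^0)} \to e^{-\infty} = 0$. Then $\int dp^0$ picks up the residue at $p^0 = E_{\mathbf{p}}$, with a factor of $-2\pi i$ since the contour is traversed clockwise. So the Feynman propagator is
\begin{align*}
\Delta_F(x - y) &= \int \frac{d^3 p}{(2\pi)^4} \frac{(-2\pi i)}{2E_{\mathbf{p}}}\, i\, e^{-iE_{\mathbf{p}}(x^0 - y^0) + i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})}\\
&= \int \frac{d^3 p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}}\, e^{-ip\cdot(x - y)}\\
&= D(x - y).
\end{align*}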
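When $x^0 < y^0$, we close the contour in the upper-half plane. Then we have
\begin{align*}
\Delta_F(x - y) &= \int \frac{d^3 p}{(2\pi)^4} \frac{(-2\pi i)}{2E_{\mathbf{p}}}\, i\, e^{iE_{\mathbf{p}}(x^0 - y^0) + i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y})}\\
&= \int \frac{d^3 p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}}\, e^{-iE_{\mathbf{p}}(y^0 - x^0) - i\mathbf{p}\cdot(\mathbf{y} - \mathbf{x})}.
\end{align*}
We again use the trick of flipping the sign of $\mathbf{p}$ to obtain
\[ = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{2E_{\mathbf{p}}}\, e^{-ip\cdot(y - x)} = D(y - x). \]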
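Usually, instead of specifying the contour, we write
\[ \Delta_F(x - y) = \int \frac{d^4 p}{(2\pi)^4} \frac{i\, e^{-ip\cdot(x - y)}}{p^2 - m^2 + i\varepsilon}, \]
where $\varepsilon$ is taken to be small, or infinitesimal. Then the poles are now located at
[Figure: the pole near $p^0 = -E_{\mathbf{p}}$ is shifted slightly above the real axis, and the pole near $p^0 = +E_{\mathbf{p}}$ slightly below it.]
So integrating along the real axis would give us the contour we had above. This is known as the “$i\varepsilon$-prescription”.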
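The propagator is in fact the Green’s function of the Klein–Gordon operator:
\begin{align*}
(\partial_t^2 - \nabla^2 + m^2)\Delta_F(x - y) &= \int \frac{d^4 p}{(2\pi)^4} \frac{i}{p^2 - m^2}\, (-p^2 + m^2)\, e^{-ip\cdot(x - y)}\\
&= -i\int \frac{d^4 p}{(2\pi)^4}\, e^{-ip\cdot(x - y)}\\
&= -i\delta^4(x - y).
\end{align*}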
Other propagators
For some purposes, it’s useful to pick other contours, e.g. the retarded Green’s function defined as follows:
[Figure: contour in the complex $p^0$ plane passing above both poles at $p^0 = \pm E_{\mathbf{p}}$.]
This can be given in terms of operators by
Definition (Retarded Green’s function). The retarded Green’s function is given by
\[ \Delta_R(x - y) = \begin{cases} [\phi(x), \phi(y)] & x^0 > y^0\\ 0 & y^0 > x^0 \end{cases} \]
This is useful if we have some initial field configuration, and we want to see how it evolves in the presence of some source. This solves the “inhomogeneous Klein–Gordon equation”, i.e. an equation of the form
\[ \partial_\mu \partial^\mu \phi(x) + m^2\phi(x) = J(x), \]
where $J(x)$ is some background function.
One also defines the advanced Green’s function which vanishes when $y^0 < x^0$ instead. This is useful if we know the end-point of a field configuration and want to know where it came from.
However, in general, the Feynman propagator is the most useful in quantum
field theory, and we will only have to use this.
3 Interacting fields
Free theories are special: we can determine the exact spectrum, but nothing interacts. These facts are related. Free theories have at most quadratic terms in the Lagrangian. So the equations of motion are linear. So we get exact quantization, and we get multi-particle states with no interactions by superposition.
3.1 Interaction Lagrangians
How do we introduce field interactions? The short answer is to throw more
things into the Lagrangian to act as the “potential”.
We start with the case where there is only one real scalar field, and the field interacts with itself. Then the general form of the Lagrangian can be given by
\[ \mathcal{L} = \frac{1}{2}\partial_\mu\phi\, \partial^\mu\phi - \frac{1}{2}m^2\phi^2 - \sum_{n=3}^\infty \frac{\lambda_n}{n!}\phi^n. \]
These $\lambda_n$ are known as the coupling constants.
It is almost impossible to work with such a Lagrangian directly, so we want to apply perturbation theory to it. To do so, we need to make sure our interactions are “small”. The obvious thing to require would be that $\lambda_n \ll 1$. However, this makes sense only if the $\lambda_n$ are dimensionless. So let’s try to figure out what the dimensions are.
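Recall that we have
\[ [S] = 0. \]
Since we know $S = \int d^4 x\, \mathcal{L}$, and $[d^4 x] = -4$, we have
\[ [\mathcal{L}] = 4. \]
We also know that
\[ [\partial_\mu] = 1. \]
So the $\partial_\mu\phi\, \partial^\mu\phi$ term tells us that
\[ [\phi] = 1. \]
From these, we deduce that we must have
\[ [\lambda_n] = 4 - n. \]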
So $\lambda_n$ isn’t always dimensionless. What we can do is to compare it with some energy scale $E$. For example, if we are doing collisions in the LHC, then the energy of the particles would be our energy scale. If we have picked such an energy scale, then looking at $\lambda_n E^{n-4}$ would give us a dimensionless parameter since $[E] = 1$.
We can separate this into three cases:
(i) $n = 3$: here $E^{n-4} = E^{-1}$ decreases with $E$. So $\lambda_3 E^{-1}$ would be small at
high energies, and large at low energies.
Such perturbations are called relevant perturbations, since they matter at low
energies. In a relativistic theory, we always have $E > m$. So we can always make the
perturbation small by picking $\lambda_3 \ll m$ (at least when we are studying
general theory; we cannot go and change the coupling constant in nature
so that our maths works out better).
(ii) $n = 4$: here the dimensionless parameter is just $\lambda_4$ itself. So this is small
whenever $\lambda_4 \ll 1$. This is known as a marginal perturbation.
(iii) $n > 4$: this has dimensionless parameter $\lambda_n E^{n-4}$, which is an increasing
function of $E$. So this is small at low energies, and large at high energies.
These operators are called irrelevant perturbations.
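The three cases can be sketched in a few lines of Python. This is only an illustration of the dimensional analysis above in four spacetime dimensions; the function names are made up for this sketch and are not part of the notes.

```python
def coupling_mass_dimension(n):
    """Mass dimension [lambda_n] = 4 - n of the phi^n coupling (4 dimensions)."""
    return 4 - n


def classify(n):
    """Name of the perturbation, following cases (i)-(iii) above."""
    dim = coupling_mass_dimension(n)
    if dim > 0:
        return "relevant"
    if dim == 0:
        return "marginal"
    return "irrelevant"


def dimensionless_coupling(lam, n, energy):
    """The combination lambda_n * E^(n-4) that perturbation theory needs small."""
    return lam * energy ** (n - 4)


for n in (3, 4, 5, 6):
    print(n, coupling_mass_dimension(n), classify(n))
```

For instance, `dimensionless_coupling(lam, 3, energy)` shrinks as `energy` grows, while `dimensionless_coupling(lam, 6, energy)` grows like the square of the energy.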
While we call them “irrelevant”, they are indeed relevant, as it is typically difficult
to avoid high energies in quantum field theory. Indeed, we have seen that we
can have arbitrarily large vacuum fluctuations. In the Advanced Quantum Field
Theory course, we will see that these theories are “non-renormalizable”.
In this course, we will consider only weakly coupled field theories, that is, ones that
can truly be considered as small perturbations of the free theory at all energies.
We will quickly describe some interaction Lagrangians that we will study in
more detail later on.
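Example ($\phi^4$ theory). Consider the $\phi^4$ theory
\[
  \mathcal{L} = \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4,
\]
where $\lambda \ll 1$. We can already guess the effects of the final term by noting that
here we have
\[
  [H, N] \neq 0.
\]
So particle number is not conserved. Expanding the last term in the Lagrangian
in terms of $a_{\mathbf{p}}, a_{\mathbf{p}}^\dagger$, we get terms involving things like
$a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger$ or
$a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger a_{\mathbf{p}}$ or
$a_{\mathbf{p}}^\dagger a_{\mathbf{p}}^\dagger a_{\mathbf{p}} a_{\mathbf{p}}$, and all other combinations you can think of. These will create or
destroy particles.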
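As a rough illustration of why particle number is not conserved, the following sketch enumerates the operator strings that appear when $(a + a^\dagger)^4$ is multiplied out, and records the net change in particle number for each. It deliberately ignores momentum labels, normal ordering and all numerical prefactors; the helper name is hypothetical.

```python
from collections import Counter
from itertools import product


def net_particle_changes(power=4):
    """Count operator strings in (a + a†)^power by net particle-number change."""
    counts = Counter()
    for word in product(("a", "a†"), repeat=power):
        created = word.count("a†")    # each a† adds one particle
        destroyed = word.count("a")   # each a removes one particle
        counts[created - destroyed] += 1
    return counts


changes = net_particle_changes(4)
for delta in sorted(changes):
    print(f"net change {delta:+d}: {changes[delta]} strings")
```

Only the strings with net change zero could commute with the number operator $N$; the strings with net change $\pm 2$ and $\pm 4$ are the ones responsible for $[H, N] \neq 0$.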
We also have a Lagrangian that involves two fields: a real scalar field and
a complex scalar field.
Example (Scalar Yukawa theory). In the early days, we found things called
pions that seemed to mediate nuclear reactions. At that time, people did not
know that things are made up of quarks. So they modelled these pions in terms
of a scalar field, and we have
\[
  \mathcal{L} = \partial_\mu\psi^*\partial^\mu\psi + \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - M^2\psi^*\psi - \frac{1}{2}m^2\phi^2 - g\psi^*\psi\phi.
\]
Here we have $g \ll m, M$. Here we still have $[Q, H] = 0$, as all terms are at
most quadratic in the $\psi$’s. So the number of $\psi$-particles minus the number of
$\psi$-antiparticles is still conserved. However, there is no particle conservation for
$\phi$.
A word of warning: the potential
\[
  V = M^2\psi^*\psi + \frac{1}{2}m^2\phi^2 + g\psi^*\psi\phi
\]
has a stable local minimum when the fields are zero, but it is unbounded below
for large $-g\phi$. So we can’t push the theory too far.
3.2 Interaction picture
How do we actually study interacting fields? We can imagine that the Hamiltonian
is split into two parts, one of which is the “free field” part $H_0$, and the
other part is the “interaction” part $H_{\mathrm{int}}$. For example, in the $\phi^4$ theory, we can
split it up as
\[
  H = \int \mathrm{d}^3x \left( \underbrace{\frac{1}{2}\left(\pi^2 + (\nabla\phi)^2 + m^2\phi^2\right)}_{H_0} + \underbrace{\frac{\lambda}{4!}\phi^4}_{H_{\mathrm{int}}} \right).
\]
The idea is that the interaction picture mixes the Schrödinger picture and the
Heisenberg picture, where the simple time evolutions remain in the operators,
whereas the complicated interactions live in the states. This is a very useful
trick if we want to deal with small interactions.
To see how this actually works, we revisit what the Schrödinger picture and
the Heisenberg picture are.
In the Schrödinger picture, we have a state $|\psi(t)\rangle$ that evolves in time, and
operators $O_S$ that are fixed. In anticipation of what we are going to do next, we
are going to say that the operators are a function of time. It’s just that they are
constant functions:
\[
  \frac{\mathrm{d}}{\mathrm{d}t} O_S(t) = 0.
\]
The states then evolve according to the Schrödinger equation
\[
  i\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle_S = H|\psi(t)\rangle_S.
\]
Since $H$ is constant in time, this allows us to write the solution to this equation
as
\[
  |\psi(t)\rangle_S = e^{-iHt}|\psi(0)\rangle_S.
\]
Here $e^{-iHt}$ is the time evolution operator, which is a unitary operator. More
generally, we have
\[
  |\psi(t)\rangle_S = e^{-iH(t - t_0)}|\psi(t_0)\rangle_S = U_S(t, t_0)|\psi(t_0)\rangle_S,
\]
where we define
\[
  U_S(t, t_0) = e^{-iH(t - t_0)}.
\]
To obtain information about the system at time $t$, we look at what happens
when we apply $O_S(t)|\psi(t)\rangle_S$.
In the Heisenberg picture, we define a new set of operators, relating to the
Schrödinger picture by
\begin{align*}
  O_H(t) &= e^{iHt} O_S(t) e^{-iHt}\\
  |\psi(t)\rangle_H &= e^{iHt}|\psi(t)\rangle_S.
\end{align*}
In this picture, the states $|\psi(t)\rangle_H$ are actually constant in time, and are just
equal to $|\psi(0)\rangle_S$ all the time. The equations of motion are then
\begin{align*}
  \frac{\mathrm{d}}{\mathrm{d}t} O_H &= i[H, O_H]\\
  \frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle_H &= 0.
\end{align*}
To know about the system at time $t$, we apply
\[
  O_H(t)|\psi(t)\rangle_H = e^{iHt} O_S(t)|\psi(t)\rangle_S.
\]
In this picture, the time evolution operator is just the identity map
\[
  U_H(t, t_0) = \mathrm{id}.
\]
In the interaction picture, we do a mixture of both. We only shift the “free”
part of the Hamiltonian to the operators. We define
\begin{align*}
  O_I(t) &= e^{iH_0t} O_S(t) e^{-iH_0t}\\
  |\psi(t)\rangle_I &= e^{iH_0t}|\psi(t)\rangle_S.
\end{align*}
In this picture, both the operators and the states change in time!
Why is this a good idea? When we studied the Heisenberg picture of the free
theory, we have already figured out what conjugating by $e^{iH_0t}$ does to operators,
and it was really simple! In fact, it made things look nicer. On the other hand,
time evolution of the state is now just generated by the interaction part of the
Hamiltonian, rather than the whole thing, and this is much simpler (or at least
shorter) to deal with:
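Proposition. In the interaction picture, the equations of motion are
\begin{align*}
  \frac{\mathrm{d}}{\mathrm{d}t} O_I &= i[H_0, O_I]\\
  i\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle_I &= H_I|\psi(t)\rangle_I,
\end{align*}
where $H_I$ is defined by
\[
  H_I = (H_{\mathrm{int}})_I = e^{iH_0t}(H_{\mathrm{int}})_S e^{-iH_0t}.
\]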
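Proof. The first part we’ve essentially done before, but we will write it out again
for completeness.
\begin{align*}
  \frac{\mathrm{d}}{\mathrm{d}t}\left(e^{iH_0t} O_S e^{-iH_0t}\right)
  &= \lim_{\varepsilon \to 0} \frac{e^{iH_0(t+\varepsilon)} O_S e^{-iH_0(t+\varepsilon)} - e^{iH_0t} O_S e^{-iH_0t}}{\varepsilon}\\
  &= \lim_{\varepsilon \to 0} e^{iH_0t}\, \frac{e^{iH_0\varepsilon} O_S e^{-iH_0\varepsilon} - O_S}{\varepsilon}\, e^{-iH_0t}\\
  &= \lim_{\varepsilon \to 0} e^{iH_0t}\, \frac{(1 + iH_0\varepsilon) O_S (1 - iH_0\varepsilon) - O_S + o(\varepsilon)}{\varepsilon}\, e^{-iH_0t}\\
  &= \lim_{\varepsilon \to 0} e^{iH_0t}\, \frac{i\varepsilon(H_0 O_S - O_S H_0) + o(\varepsilon)}{\varepsilon}\, e^{-iH_0t}\\
  &= e^{iH_0t}\, i[H_0, O_S]\, e^{-iH_0t}\\
  &= i[H_0, O_I].
\end{align*}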
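For the second part, we start with Schrödinger’s equation
\[
  i\frac{\mathrm{d}|\psi(t)\rangle_S}{\mathrm{d}t} = H_S|\psi(t)\rangle_S.
\]
Putting in our definitions, we have
\[
  i\frac{\mathrm{d}}{\mathrm{d}t}\left(e^{-iH_0t}|\psi(t)\rangle_I\right) = (H_0 + H_{\mathrm{int}})_S\, e^{-iH_0t}|\psi(t)\rangle_I.
\]
Using the product rule and chain rule, we obtain
\[
  H_0 e^{-iH_0t}|\psi(t)\rangle_I + i e^{-iH_0t}\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle_I = (H_0 + H_{\mathrm{int}})_S\, e^{-iH_0t}|\psi(t)\rangle_I.
\]
Rearranging gives us
\[
  i\frac{\mathrm{d}|\psi(t)\rangle_I}{\mathrm{d}t} = e^{iH_0t}(H_{\mathrm{int}})_S e^{-iH_0t}|\psi(t)\rangle_I.
\]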
To obtain information about the state at time $t$, we look at what happens
when we apply
\[
  O_I(t)|\psi(t)\rangle_I = e^{iH_0t} O_S(t)|\psi(t)\rangle_S.
\]
Now what is the time evolution operator $U(t, t_0)$? In the case of the Schrödinger
picture, we had the equation of motion
\[
  i\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle_S = H|\psi(t)\rangle_S.
\]
So we could immediately write down
\[
  |\psi(t)\rangle = e^{-iH(t - t_0)}|\psi(t_0)\rangle.
\]
In this case, we cannot do that, because $H_I$ is now a function of time. The next
best guess would be
\[
  U_I(t, t_0) = \exp\left(-i\int_{t_0}^t H_I(t')\, \mathrm{d}t'\right).
\]
For this to be right, it has to satisfy
\[
  i\frac{\mathrm{d}}{\mathrm{d}t} U_I(t, t_0) = H_I(t) U_I(t, t_0).
\]
We can try to compute the derivative of this. If we take the series expansion of
this, then the second-order term would be
\[
  \frac{(-i)^2}{2}\left(\int_{t_0}^t H_I(t')\, \mathrm{d}t'\right)^2.
\]
For the equation of motion to hold, the derivative should evaluate to $-iH_I$ times
the first order term, i.e.
\[
  -iH_I(t)\,(-i)\int_{t_0}^t H_I(t')\, \mathrm{d}t'.
\]
We can compute its derivative to be
\[
  \frac{(-i)^2}{2} H_I(t)\int_{t_0}^t H_I(t')\, \mathrm{d}t' + \frac{(-i)^2}{2}\int_{t_0}^t H_I(t')\, \mathrm{d}t'\; H_I(t).
\]
This is not equal to what we wanted, because $H_I(t)$ and $H_I(t')$ do not commute
in general.
Yet, this is a good guess, and to get the real answer, we just need to make a
slight tweak:
Proposition (Dyson’s formula). The solution to the Schrödinger equation in
the interaction picture is given by
\[
  U(t, t_0) = T\exp\left(-i\int_{t_0}^t H_I(t')\, \mathrm{d}t'\right),
\]
where $T$ stands for time ordering: operators evaluated at earlier times appear
to the right of operators evaluated at later times when we write out the power
series. More explicitly,
\[
  T\{O_1(t_1) O_2(t_2)\} =
  \begin{cases}
    O_1(t_1) O_2(t_2) & t_1 > t_2\\
    O_2(t_2) O_1(t_1) & t_2 > t_1
  \end{cases}.
\]
We do not specify what happens when $t_1 = t_2$, but it doesn’t matter in our case
since the operators are then equal.
Thus, we have
\begin{align*}
  U(t, t_0) = 1 &- i\int_{t_0}^t \mathrm{d}t'\, H_I(t')\\
  &+ \frac{(-i)^2}{2}\left\{\int_{t_0}^t \mathrm{d}t' \int_{t'}^t \mathrm{d}t''\, H_I(t'') H_I(t') + \int_{t_0}^t \mathrm{d}t' \int_{t_0}^{t'} \mathrm{d}t''\, H_I(t') H_I(t'')\right\} + \cdots.
\end{align*}
We now notice that
\[
  \int_{t_0}^t \mathrm{d}t' \int_{t'}^t \mathrm{d}t''\, H_I(t'') H_I(t') = \int_{t_0}^t \mathrm{d}t'' \int_{t_0}^{t''} \mathrm{d}t'\, H_I(t'') H_I(t'),
\]
since we are integrating over all $t_0 \leq t' \leq t'' \leq t$ on both sides. Swapping $t'$
and $t''$ shows that the two second-order terms are indeed the same, so we have
\[
  U(t, t_0) = 1 - i\int_{t_0}^t \mathrm{d}t'\, H_I(t') + (-i)^2\int_{t_0}^t \mathrm{d}t' \int_{t_0}^{t'} \mathrm{d}t''\, H_I(t') H_I(t'') + \cdots.
\]
Proof. This follows from the last expression above, using the fundamental theorem of
calculus.
In theory, this time-ordered exponential is very difficult to compute. However,
we have, at the beginning, made the assumption that the interactions $H_I$ are
small. So often we can just consider the first one or two terms.
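To see numerically that the time ordering really matters, here is a minimal sketch using a two-level toy system rather than a field theory. The choices $H_0 = \frac{\omega}{2}\sigma_z$ and $H_{\mathrm{int}} = g\sigma_x$, the values of `OMEGA` and `G`, and all function names are assumptions made purely for illustration. The time-ordered exponential is approximated by a product of short-time factors with later times on the left, and is compared with the exact $U_I(t, 0) = e^{iH_0t}e^{-iHt}$ and with the naive guess without time ordering.

```python
import cmath
import math

OMEGA = 1.0  # level splitting in the toy H_0 = (OMEGA/2) sigma_z (assumed)
G = 0.5      # coupling in the toy H_int = G sigma_x (assumed)


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_offdiag(c):
    """exp(-i M) for M = [[0, c], [conj(c), 0]], using M^2 = |c|^2 * identity."""
    r = abs(c)
    if r == 0.0:
        return [[1.0, 0.0], [0.0, 1.0]]
    s = -1j * math.sin(r) / r
    return [[math.cos(r), s * c], [s * c.conjugate(), math.cos(r)]]


def h_i_offdiag(t):
    """Upper-right entry of H_I(t) = e^{i H_0 t} (G sigma_x) e^{-i H_0 t}."""
    return G * cmath.exp(1j * OMEGA * t)


def exact_u(t):
    """U_I(t, 0) = e^{i H_0 t} e^{-i H t}, using H^2 = big^2 * identity."""
    big = math.sqrt((OMEGA / 2) ** 2 + G ** 2)
    c, s = math.cos(big * t), math.sin(big * t) / big
    e_minus_iht = [[c - 1j * s * OMEGA / 2, -1j * s * G],
                   [-1j * s * G, c + 1j * s * OMEGA / 2]]
    phase = cmath.exp(1j * OMEGA * t / 2)
    e_ih0t = [[phase, 0.0], [0.0, phase.conjugate()]]
    return matmul(e_ih0t, e_minus_iht)


def time_ordered_u(t, steps=2000):
    """Product of short-time evolutions, with later times on the left."""
    dt = t / steps
    u = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(steps):
        mid = (k + 0.5) * dt
        u = matmul(exp_offdiag(h_i_offdiag(mid) * dt), u)
    return u


def naive_u(t):
    """exp(-i * integral of H_I) with no time ordering (the 'next best guess')."""
    c = G * (cmath.exp(1j * OMEGA * t) - 1) / (1j * OMEGA)
    return exp_offdiag(c)


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


t_final = 3.0
print("time-ordered vs exact:", max_diff(time_ordered_u(t_final), exact_u(t_final)))
print("naive guess vs exact: ", max_diff(naive_u(t_final), exact_u(t_final)))
```

The first difference shrinks as `steps` is increased, while the second stays of order one, because $H_I(t)$ at different times does not commute.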
Note that when we evaluate the time integral of the Hamiltonian, a convenient
thing happens. Recall that our Hamiltonian is the integral of some operators
over all space. Now that we are also integrating it over time, what we are doing
is essentially integrating over spacetime, which is a rather relativistic thing to do!
Some computations
Now let’s try to do some computations. By doing so, we will very soon realize
that our theory is very annoying to work with, and then we will later come up
with something more pleasant.
We consider the scalar Yukawa theory. This has one complex scalar field and
one real scalar field, with an interaction Hamiltonian of
\[
  H_I = g\psi^\dagger\psi\phi.
\]
Here the $\psi, \phi$ are the “Heisenberg” versions where we conjugated by $e^{iH_0t}$, so we
have $e^{ip\cdot x}$ rather than $e^{i\mathbf{p}\cdot\mathbf{x}}$ in the integral. We will pretend that the $\phi$ particles
are mesons, and that the $\psi$-particles are nucleons.
We now want to ask ourselves the following question: suppose we start
with a meson, and we let it evolve over time. After infinite time, what is the
probability that we end up with a $\psi\bar{\psi}$ pair, where $\bar{\psi}$ is the anti-$\psi$ particle?
[Diagram: an incoming $\phi$ line splitting into outgoing $\psi$ and $\bar{\psi}$ lines.]
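We denote our initial state and “target” final states by
\begin{align*}
  |i\rangle &= \sqrt{2E_{\mathbf{p}}}\, a_{\mathbf{p}}^\dagger|0\rangle\\
  |f\rangle &= \sqrt{4E_{\mathbf{q}_1}E_{\mathbf{q}_2}}\, b_{\mathbf{q}_1}^\dagger c_{\mathbf{q}_2}^\dagger|0\rangle.
\end{align*}
What we now want to find is the quantity
\[
  \langle f|U(\infty, -\infty)|i\rangle.
\]
Definition (S-matrix). The S-matrix is defined as
\[
  S = U(\infty, -\infty).
\]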
Here $S$ stands for “scattering”. So we often just write our desired quantity
as $\langle f|S|i\rangle$ instead. When we evaluate this thing, we will end up with some
numerical expression, which will very obviously not be a probability (it contains
a delta function). We will come back to interpreting this quantity later. For
the moment, we will just try to come up with ways to compute it in a sensible
manner.
Now let’s look at that expression term by term in the Taylor expansion of $U$.
The first term is the identity $1$, and we have a contribution of
\[
  \langle f|1|i\rangle = \langle f|i\rangle = 0
\]
since distinct eigenstates are orthogonal. In the next term, we are integrating
over $\psi^\dagger\psi\phi$. If we expand this out, we will get loads of terms, each of which is a
product of some $b, c, a$ (or their conjugates). But the only term that contributes
would be the term $c_{\mathbf{q}_2}^\dagger b_{\mathbf{q}_1}^\dagger a_{\mathbf{p}}$, so that we exactly destroy the $a_{\mathbf{p}}^\dagger$ particle and
create the $b_{\mathbf{q}_1}^\dagger$ and $c_{\mathbf{q}_2}^\dagger$ particles. The other terms will end up transforming $|i\rangle$ to
something that is not $|f\rangle$ (or perhaps it would kill $|i\rangle$ completely), and then the
inner product with $|f\rangle$ will give 0.
Before we go on to discuss what the higher order terms do, we first actually
compute the contribution of the first order term.
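We want to compute
\[
  \langle f|S|i\rangle = \langle f|\left(1 - i\int \mathrm{d}t\, H_I(t) + \cdots\right)|i\rangle.
\]
We know the $1$ contributes nothing, and we drop the higher order terms to get
\begin{align*}
  -i\langle f|\int \mathrm{d}t\, H_I(t)|i\rangle
  &= -ig\langle f|\int \mathrm{d}^4x\, \psi^\dagger(x)\psi(x)\phi(x)|i\rangle\\
  &= -ig\int \mathrm{d}^4x\, \langle f|\psi^\dagger(x)\psi(x)\phi(x)|i\rangle.
\end{align*}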
We now note that the $|i\rangle$ interacts with the $\phi$ terms only, and the $\langle f|$ interacts
with the $\psi$ terms only, so we can evaluate $\phi(x)|i\rangle$ and $\psi^\dagger(x)\psi(x)|f\rangle$ separately.
Moreover, any leftover $a^\dagger, b^\dagger, c^\dagger$ terms in the expressions will end up killing the
other object, as the $a^\dagger$’s commute with the $b^\dagger, c^\dagger$, and $\langle 0|a^\dagger = (a|0\rangle)^\dagger = 0$.
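We can expand $\phi$ in terms of the creation and annihilation operators $a$ and
$a^\dagger$ to get
\[
  \phi(x)|i\rangle = \int \frac{\mathrm{d}^3q}{(2\pi)^3}\frac{\sqrt{2E_{\mathbf{p}}}}{\sqrt{2E_{\mathbf{q}}}}\left(a_{\mathbf{q}} e^{-iq\cdot x} + a_{\mathbf{q}}^\dagger e^{iq\cdot x}\right) a_{\mathbf{p}}^\dagger|0\rangle.
\]
We notice that nothing happens to the $a_{\mathbf{q}}^\dagger$ terms and they will end up becoming
0 by killing $\langle 0|$. Swapping $a_{\mathbf{q}}$ and $a_{\mathbf{p}}^\dagger$ and inserting the commutator, we obtain
\begin{align*}
  &\int \frac{\mathrm{d}^3q}{(2\pi)^3}\frac{\sqrt{2E_{\mathbf{p}}}}{\sqrt{2E_{\mathbf{q}}}}\left([a_{\mathbf{q}}, a_{\mathbf{p}}^\dagger] + a_{\mathbf{p}}^\dagger a_{\mathbf{q}}\right) e^{-iq\cdot x}|0\rangle\\
  &= \int \frac{\mathrm{d}^3q}{(2\pi)^3}\frac{\sqrt{2E_{\mathbf{p}}}}{\sqrt{2E_{\mathbf{q}}}}(2\pi)^3\delta^3(\mathbf{p} - \mathbf{q})\, e^{-iq\cdot x}|0\rangle\\
  &= e^{-ip\cdot x}|0\rangle.
\end{align*}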
We will use this to describe the creation and destruction of mesons.
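We can similarly look at
\begin{align*}
  &\psi^\dagger(x)\psi(x)|f\rangle\\
  &= \int \frac{\mathrm{d}^3k_1\, \mathrm{d}^3k_2}{(2\pi)^6}\frac{1}{\sqrt{4E_{\mathbf{k}_1}E_{\mathbf{k}_2}}}\left(b_{\mathbf{k}_1}^\dagger e^{ik_1\cdot x} + c_{\mathbf{k}_1} e^{-ik_1\cdot x}\right)\left(b_{\mathbf{k}_2} e^{-ik_2\cdot x} + c_{\mathbf{k}_2}^\dagger e^{ik_2\cdot x}\right)|f\rangle.
\end{align*}
As before, the only interesting term for us is when we have both $c_{\mathbf{k}_1}$ and $b_{\mathbf{k}_2}$
present to “kill” $b_{\mathbf{q}_1}^\dagger c_{\mathbf{q}_2}^\dagger$ in $|f\rangle$. So we look at
\begin{align*}
  &\int \frac{\mathrm{d}^3k_1\, \mathrm{d}^3k_2}{(2\pi)^6}\frac{\sqrt{4E_{\mathbf{q}_1}E_{\mathbf{q}_2}}}{\sqrt{4E_{\mathbf{k}_1}E_{\mathbf{k}_2}}}\, c_{\mathbf{k}_1} b_{\mathbf{k}_2} b_{\mathbf{q}_1}^\dagger c_{\mathbf{q}_2}^\dagger\, e^{-i(k_1 + k_2)\cdot x}|0\rangle\\
  &= \int \frac{\mathrm{d}^3k_1\, \mathrm{d}^3k_2}{(2\pi)^6}(2\pi)^6\delta^3(\mathbf{q}_1 - \mathbf{k}_2)\delta^3(\mathbf{q}_2 - \mathbf{k}_1)\, e^{-i(k_1 + k_2)\cdot x}|0\rangle\\
  &= e^{-i(q_1 + q_2)\cdot x}|0\rangle.
\end{align*}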
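So we have
\begin{align*}
  \langle f|S|i\rangle &\approx -ig\int \mathrm{d}^4x\, \left(e^{-i(q_1 + q_2)\cdot x}|0\rangle\right)^\dagger\left(e^{-ip\cdot x}|0\rangle\right)\\
  &= -ig\int \mathrm{d}^4x\, e^{i(q_1 + q_2 - p)\cdot x}\\
  &= -ig(2\pi)^4\delta^4(q_1 + q_2 - p).
\end{align*}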
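If we look carefully at what we have done, what allowed us to go from the initial
$\phi$ state $|i\rangle = \sqrt{2E_{\mathbf{p}}}\, a_{\mathbf{p}}^\dagger|0\rangle$ to the final $\psi\bar{\psi}$ state
$|f\rangle = \sqrt{4E_{\mathbf{q}_1}E_{\mathbf{q}_2}}\, b_{\mathbf{q}_1}^\dagger c_{\mathbf{q}_2}^\dagger|0\rangle$ was
the presence of the term $c_{\mathbf{q}_2}^\dagger b_{\mathbf{q}_1}^\dagger a_{\mathbf{p}}$ in the $S$-matrix (what we appeared to have
used in the calculation was $c_{\mathbf{q}_2} b_{\mathbf{q}_1}$ instead of their conjugates, but that is because
we were working with the conjugate $\psi^\dagger\psi|f\rangle$ rather than $\langle f|\psi^\dagger\psi$).
If we expand the $S$-matrix to, say, second order, we will, for example, find
a term that looks like $(c^\dagger b^\dagger a)(c b a^\dagger)$. This corresponds to destroying a $\psi\bar{\psi}$ pair
to produce a $\phi$, then destroying that to recover another $\psi\bar{\psi}$ pair, which is an
interaction of the form
[Diagram: an incoming $\psi\bar{\psi}$ pair annihilating into an intermediate $\phi$, which then produces an outgoing $\psi\bar{\psi}$ pair.]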
3.3 Wick’s theorem
Now the above process was still sort-of fine, because we were considering first-
order terms only. If we considered higher-order terms, we would have complicated
expressions involving a mess of creation and annihilation operators, and we would
have to commute them past each other to act on the states.
Our objective is thus to write a time-ordered product $T\phi(x_1)\phi(x_2)\cdots\phi(x_n)$
as a sum of normal-ordered things, since it is easy to read off the action of a
normal-ordered operator on a collection of states. Indeed, we know $\langle f|{:}O{:}|i\rangle$
is non-zero if the creation and annihilation operators in ${:}O{:}$ match up exactly with
those used to create $|f\rangle$ and $|i\rangle$ from the vacuum.
We start with a definition:
Definition (Contraction). The contraction of two fields $\phi, \psi$ is defined to be
\[
  \overbracket{\phi\psi} = T(\phi\psi) - {:}\phi\psi{:}.
\]
More generally, if we have a string of operators, we can contract just some of
them:
\[
  \cdots\overbracket{\phi(x_1)\cdots\phi(x_2)}\cdots,
\]
by replacing the two fields with the contraction.
In general, the contraction will be a c-function, i.e. a number. So where we
decide to put the result of the contraction wouldn’t matter.
We now compute the contraction of our favorite scalar fields:
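Proposition. Let $\phi$ be a real scalar field. Then
\[
  \overbracket{\phi(x_1)\phi(x_2)} = \Delta_F(x_1 - x_2).
\]
Proof. We write $\phi = \phi^+ + \phi^-$, where
\[
  \phi^+(x) = \int \frac{\mathrm{d}^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\, a_{\mathbf{p}} e^{-ip\cdot x},\quad
  \phi^-(x) = \int \frac{\mathrm{d}^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\, a_{\mathbf{p}}^\dagger e^{ip\cdot x}.
\]
Then we get normal order if all the $\phi^-$ appear before the $\phi^+$.
When $x^0 > y^0$, we have
\begin{align*}
  T\phi(x)\phi(y) &= \phi(x)\phi(y)\\
  &= (\phi^+(x) + \phi^-(x))(\phi^+(y) + \phi^-(y))\\
  &= \phi^+(x)\phi^+(y) + \phi^-(x)\phi^+(y) + [\phi^+(x), \phi^-(y)]\\
  &\quad + \phi^-(y)\phi^+(x) + \phi^-(x)\phi^-(y).
\end{align*}
So we have
\[
  T\phi(x)\phi(y) = {:}\phi(x)\phi(y){:} + D(x - y).
\]
By symmetry, for $y^0 > x^0$, we have
\[
  T\phi(x)\phi(y) = {:}\phi(x)\phi(y){:} + D(y - x).
\]
Recalling the definition of the Feynman propagator, we have
\[
  T\phi(x)\phi(y) = {:}\phi(x)\phi(y){:} + \Delta_F(x - y).
\]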
Similarly, we have
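Proposition. For a complex scalar field, we have
\[
  \overbracket{\psi(x)\psi^\dagger(y)} = \Delta_F(x - y) = \overbracket{\psi^\dagger(y)\psi(x)},
\]
whereas
\[
  \overbracket{\psi(x)\psi(y)} = 0 = \overbracket{\psi^\dagger(x)\psi^\dagger(y)}.
\]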
It turns out these computations are all we need to figure out how to turn a
time-ordered product into a sum of normal-ordered products.
Theorem (Wick’s theorem). For any collection of fields $\phi_1 = \phi_1(x_1), \phi_2 = \phi_2(x_2), \cdots$, we have
\[
  T(\phi_1\phi_2\cdots\phi_n) = {:}\phi_1\phi_2\cdots\phi_n{:} + \text{all possible contractions}.
\]
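Example. We have
\[
  T(\phi_1\phi_2\phi_3\phi_4) = {:}\phi_1\phi_2\phi_3\phi_4{:} + \overbracket{\phi_1\phi_2}\,{:}\phi_3\phi_4{:} + \overbracket{\phi_1\phi_3}\,{:}\phi_2\phi_4{:} + \cdots + \overbracket{\phi_1\phi_2}\,\overbracket{\phi_3\phi_4} + \cdots
\]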
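To make “all possible contractions” concrete, the following sketch enumerates every way of pairing up some, none or all of $n$ field labels. It only does the bookkeeping: it does not evaluate propagators or normal-ordered products, and the function name is invented for this illustration.

```python
def contraction_patterns(labels):
    """Yield (pairs, uncontracted) for every way of contracting some of the labels.

    `labels` is a tuple of field labels; `pairs` is a tuple of contracted pairs
    and `uncontracted` is the tuple of labels left inside the normal ordering.
    """
    if not labels:
        yield (), ()
        return
    first, rest = labels[0], labels[1:]
    # Case 1: the first field is left uncontracted.
    for pairs, free in contraction_patterns(rest):
        yield pairs, (first,) + free
    # Case 2: the first field is contracted with one of the later fields.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for pairs, free in contraction_patterns(remaining):
            yield ((first, partner),) + pairs, free


patterns = list(contraction_patterns((1, 2, 3, 4)))
for pairs, free in patterns:
    print("contracted:", pairs, " normal-ordered:", free)
```

For four fields this gives one term with no contraction, six terms with a single contraction and three fully contracted terms, which are the terms abbreviated by the dots in the example above.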
Proof sketch. We will provide a rough proof sketch.
By definition this is true for $n = 2$. Suppose this is true for $\phi_2\cdots\phi_n$, and
now add $\phi_1$ with
\[
  x_1^0 > x_k^0 \text{ for all } k \in \{2, \cdots, n\}.
\]
Then we have
\[
  T(\phi_1\cdots\phi_n) = (\phi_1^+ + \phi_1^-)({:}\phi_2\cdots\phi_n{:} + \text{other contractions}).
\]
The $\phi_1^-$ term stays where it is, as that already gives normal ordering, and the
$\phi_1^+$ term has to make its way past the $\phi_k^-$ operators. So we can write the RHS
as a normal-ordered product. Each time we move past the $\phi_k^-$, we pick up a
contraction $\overbracket{\phi_1\phi_k}$.
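Example. We now consider the more complicated problem of nucleon scattering:
\[
  \psi\psi \to \psi\psi.
\]
So we are considering interactions of the form
[Diagram: two incoming lines with momenta $p_1, p_2$ enter a blob labelled “something happens”, and two outgoing lines with momenta $p_1', p_2'$ leave it.]
Then we have initial and final states
\begin{align*}
  |i\rangle &= \sqrt{4E_{\mathbf{p}_1}E_{\mathbf{p}_2}}\, b_{\mathbf{p}_1}^\dagger b_{\mathbf{p}_2}^\dagger|0\rangle\\
  |f\rangle &= \sqrt{4E_{\mathbf{p}_1'}E_{\mathbf{p}_2'}}\, b_{\mathbf{p}_1'}^\dagger b_{\mathbf{p}_2'}^\dagger|0\rangle.
\end{align*}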
We look at the order $g^2$ term in $\langle f|(S - 1)|i\rangle$. We remove that $1$ as we are not
interested in the case with no scattering, and we look at order $g^2$ because there
is no order $g$ term.
The second order term is given by
\[
  \frac{(-ig)^2}{2}\int \mathrm{d}^4x_1\, \mathrm{d}^4x_2\; T\left[\psi^\dagger(x_1)\psi(x_1)\phi(x_1)\psi^\dagger(x_2)\psi(x_2)\phi(x_2)\right].
\]
Now we can use Wick’s theorem to write the time-ordered product as a sum of
normal-ordered products. The annihilation and creation operators don’t end up
killing the vacuum only if we annihilate two nucleons with two $b_{\mathbf{p}}$ and then create two new
ones with two $b_{\mathbf{p}}^\dagger$.
This only happens in the term
\[
  {:}\psi^\dagger(x_1)\psi(x_1)\psi^\dagger(x_2)\psi(x_2){:}\; \overbracket{\phi(x_1)\phi(x_2)}.
\]
We know the contraction is given by the Feynman propagator $\Delta_F(x_1 - x_2)$. So
we now want to compute
\[
  \langle p_1', p_2'|{:}\psi^\dagger(x_1)\psi(x_1)\psi^\dagger(x_2)\psi(x_2){:}|p_1, p_2\rangle.
\]
The only relevant term in the normal-ordered product is
\[
  \iiiint \frac{\mathrm{d}^3k_1\, \mathrm{d}^3k_2\, \mathrm{d}^3k_3\, \mathrm{d}^3k_4}{(2\pi)^{12}\sqrt{16E_{\mathbf{k}_1}E_{\mathbf{k}_2}E_{\mathbf{k}_3}E_{\mathbf{k}_4}}}\, b_{\mathbf{k}_1}^\dagger b_{\mathbf{k}_2}^\dagger b_{\mathbf{k}_3} b_{\mathbf{k}_4}\, e^{ik_1\cdot x_1 + ik_2\cdot x_2 - ik_3\cdot x_1 - ik_4\cdot x_2}.
\]
Now letting this act on $\langle p_1', p_2'|$ and $|p_1, p_2\rangle$ would then give
\[
  e^{ix_1\cdot(p_1' - p_1) + ix_2\cdot(p_2' - p_2)} + e^{ix_1\cdot(p_2' - p_1) + ix_2\cdot(p_1' - p_2)}, \tag{$*$}
\]
plus what we get by swapping $x_1$ and $x_2$, by a routine calculation (swap the
$b$-operators in the $\psi$ with those that create $|i\rangle, |f\rangle$, insert the delta functions as
commutators, then integrate).
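What we really want is to integrate the Hamiltonian. So we want to integrate
the result of the above computation with respect to all spacetime to obtain
\begin{align*}
  &\frac{(-ig)^2}{2}\int \mathrm{d}^4x_1\, \mathrm{d}^4x_2\, (*)\, \Delta_F(x_1 - x_2)\\
  &= \frac{(-ig)^2}{2}\int \mathrm{d}^4x_1\, \mathrm{d}^4x_2\, (*) \int \frac{\mathrm{d}^4k}{(2\pi)^4}\frac{i e^{ik\cdot(x_1 - x_2)}}{k^2 - m^2 + i\varepsilon}\\
  &= (-ig)^2\int \frac{\mathrm{d}^4k}{(2\pi)^4}\frac{i(2\pi)^8}{k^2 - m^2 + i\varepsilon}\left(\delta^4(p_1' - p_1 + k)\delta^4(p_2' - p_2 - k) + \delta^4(p_2' - p_1 + k)\delta^4(p_1' - p_2 - k)\right)\\
  &= (-ig)^2(2\pi)^4\left(\frac{i}{(p_1 - p_1')^2 - m^2} + \frac{i}{(p_1 - p_2')^2 - m^2}\right)\delta^4(p_1 + p_2 - p_1' - p_2').
\end{align*}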
What we see again is a δ-function that enforces momentum conservation.
3.4 Feynman diagrams
This is better, but still rather tedious. How can we make this better? In
the previous example, we could imagine that the term that contributed to the
integral above represented the interaction where the two $\psi$-particles “exchanged”
a $\phi$-particle. In this process, we destroy the old $\psi$-particles and construct new
ones with new momenta, and the $\phi$-particle is created and then swiftly destroyed.
The fact that the $\phi$-particles were just “intermediate” corresponds to the fact
that we included their contraction in the computation.
We can pictorially represent the interaction as follows:
[Diagram: two incoming $\psi$ lines and two outgoing $\psi$ lines, joined by an internal $\phi$ line.]
The magical insight is that every term given by Wick’s theorem can be interpreted
as a diagram of this sort. Moreover, the term contributes to the process if and
only if it has the right “incoming” and “outgoing” particles. So we can figure
out what terms contribute by drawing the right diagrams.
Moreover, we can do more than just count the diagrams like this. More importantly,
we can read off how much each term contributes from the diagram directly!
This simplifies the computation a lot.
We will not provide a proof that Feynman diagrams do indeed work, as it
would be purely technical and would also involve the difficult work of what it
actually means to be a diagram. However, based on the computations we’ve
done above, one should be confident that at least something of this sort would
work, maybe up to some sign errors.
We begin by specifying what diagrams are “allowed”, and then specify how
we assign numbers to diagrams.
Given an initial state and final state, the possible Feynman diagrams are
specified as follows:
Definition (Feynman diagram). In the scalar Yukawa theory, given an initial
state $|i\rangle$ and final state $|f\rangle$, a Feynman diagram consists of:
- An external line for all particles in the initial and final states. A dashed
  line is used for a $\phi$-particle, and solid lines are used for $\psi/\bar{\psi}$-particles.
  [Diagram: a dashed $\phi$ line and solid $\psi$, $\bar{\psi}$ lines.]
- Each $\psi$-particle comes with an arrow. An initial $\psi$-particle has an incoming
  arrow, and a final $\psi$-particle has an outgoing arrow. The reverse is done
  for $\bar{\psi}$-particles.
  [Diagram: the same external lines, with arrows on the $\psi$ and $\bar{\psi}$ lines.]
- We join the lines together with more lines and vertices so that the only loose
  ends are the initial and final states. The possible vertices correspond to
  the interaction terms in the Lagrangian. For example, the only interaction
  term in the Lagrangian here is $\psi^\dagger\psi\phi$, so the only possible vertex is one
  that joins a $\phi$ line with two $\psi$ lines that point in opposite directions:
  [Diagram: a vertex joining a dashed $\phi$ line to a $\psi$ line and a $\bar{\psi}$ line.]
  Each such vertex represents an interaction, and the fact that the arrows
  match up in all possible interactions ensures that charge is conserved.
- Assign a directed momentum $p$ to each line, i.e. an arrow into or out of
  the diagram to each line.
  [Diagram: the vertex above, with momentum $p$ on the $\phi$ line and $p_1$, $p_2$ on the $\psi$, $\bar{\psi}$ lines.]
  The initial and final particles already have momentum specified in the
  initial and final state, and the internal lines are given “dummy” momenta
  $k_i$ (which we will later integrate over).
Note that there are infinitely many possible diagrams! However, when we lay
down the Feynman rules later, we will see that the more vertices the diagram has,
the less it contributes to the sum. In fact, the $n$-vertex diagrams correspond
to the $n$th order term in the expansion of the $S$-matrix. So most of the time it
suffices to consider “simple” diagrams.
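Example (Nucleon scattering). If we look at $\psi + \psi \to \psi + \psi$, the simplest
diagram is
[Diagram: two $\psi$ lines exchanging a single $\phi$.]
On the other hand, we can also swap the two particles to obtain a diagram of
the form
[Diagram: the same exchange, with the two outgoing $\psi$ lines crossed.]
These are the ones that correspond to second-order terms.
There are also more complicated ones such as
[Diagram: $\psi\psi \to \psi\psi$ with one closed loop.]
This is a 1-loop diagram. We can also have a 2-loop diagram:
[Diagram: $\psi\psi \to \psi\psi$ with two closed loops.]
If we ignore the loops, we say we are looking at the tree level.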
To each such diagram, we associate a number using the Feynman rules.
Definition (Feynman rules). To each Feynman diagram in the interaction, we
write down a term like this:
(i) To each vertex in the Feynman diagram, we write a factor of
\[
  (-ig)(2\pi)^4\delta^4\left(\sum_i k_i\right),
\]
where the sum goes through all lines going into the vertex (and put a
negative sign for those going out).
(ii) For each internal line with momentum $k$, we integrate the product of all
factors above by
\[
  \int \frac{\mathrm{d}^4k}{(2\pi)^4} D(k^2),
\]
where
\begin{align*}
  D(k^2) &= \frac{i}{k^2 - m^2 + i\varepsilon} \text{ for } \phi\\
  D(k^2) &= \frac{i}{k^2 - \mu^2 + i\varepsilon} \text{ for } \psi.
\end{align*}
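Example. We consider the case where $g$ is small. Then only the simple diagrams
with very few vertices matter.
[Diagrams: the two tree-level diagrams for $\psi\psi \to \psi\psi$. In the first, the nucleon entering with $p_1$ leaves with $p_1'$ and the one entering with $p_2$ leaves with $p_2'$, exchanging a $\phi$ of momentum $k$. In the second, the outgoing momenta $p_1'$ and $p_2'$ are swapped.]
As promised by the Feynman rules, the two diagrams give us
\[
  (-ig)^2(2\pi)^8\left(\delta^4(p_1 - p_1' - k)\delta^4(p_2 + k - p_2') + \delta^4(p_1 - p_2' - k)\delta^4(p_2 + k - p_1')\right).
\]
Now integrating gives us the formula
\[
  (-ig)^2\int \frac{\mathrm{d}^4k}{(2\pi)^4}\frac{i(2\pi)^8}{k^2 - m^2 + i\varepsilon}\left(\delta^4(p_1 - p_1' - k)\delta^4(p_2 + k - p_2') + \delta^4(p_1 - p_2' - k)\delta^4(p_2 + k - p_1')\right).
\]
Doing the integral gives us
\[
  (-ig)^2(2\pi)^4\left(\frac{i}{(p_1 - p_1')^2 - m^2} + \frac{i}{(p_1 - p_2')^2 - m^2}\right)\delta^4(p_1 + p_2 - p_1' - p_2'),
\]
which is what we found before.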
There is a nice physical interpretation of the diagrams. We can interpret
the first diagram as saying that the nucleons exchange a meson of momentum
$k = p_1 - p_1' = p_2' - p_2$. This meson doesn’t necessarily satisfy the relativistic
dispersion relation $k^2 = m^2$ (note that $k$ is the 4-momentum). When this
happens, we say it is off-shell, or that it is a virtual meson. Heuristically, it
can’t live long enough for its energy to be measured accurately. In contrast, the
external legs are on-shell, so they satisfy $p_i^2 = \mu^2$.
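The off-shell statement can be checked with explicit numbers. The sketch below sets up elastic $\psi\psi \to \psi\psi$ kinematics in the centre-of-mass frame and computes $k^2$ for the exchanged meson. The masses, the momentum, the scattering angle and the helper names are arbitrary choices made for illustration; the metric signature is taken to be $(+,-,-,-)$.

```python
import math


def minkowski_sq(p):
    """p^2 = E^2 - |p|^2 for a 4-vector (E, px, py, pz)."""
    e, px, py, pz = p
    return e * e - px * px - py * py - pz * pz


def elastic_cm_momenta(mu, p, theta):
    """Centre-of-mass 4-momenta for psi psi -> psi psi with nucleon mass mu."""
    e = math.sqrt(mu * mu + p * p)
    p1 = (e, 0.0, 0.0, p)
    p2 = (e, 0.0, 0.0, -p)
    p1_out = (e, p * math.sin(theta), 0.0, p * math.cos(theta))
    p2_out = (e, -p * math.sin(theta), 0.0, -p * math.cos(theta))
    return p1, p2, p1_out, p2_out


mu, m = 1.0, 0.5  # illustrative nucleon and meson masses
p1, p2, p1_out, p2_out = elastic_cm_momenta(mu, p=2.0, theta=math.pi / 3)

# Momentum carried by the exchanged meson: k = p1 - p1' = p2' - p2.
k = tuple(a - b for a, b in zip(p1, p1_out))
k_alt = tuple(a - b for a, b in zip(p2_out, p2))

print("external leg p1^2 =", minkowski_sq(p1), "(on-shell value mu^2 =", mu * mu, ")")
print("exchanged    k^2  =", minkowski_sq(k), "(on-shell value m^2 =", m * m, ")")
```

In this frame $k^2 = -2|\mathbf{p}|^2(1 - \cos\theta) \leq 0$, so the exchanged meson can never satisfy $k^2 = m^2$ for $m > 0$, while the external legs satisfy $p_i^2 = \mu^2$.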
3.5 Amplitudes
Now the important question is — what do we do with that quantity we computed?
We first define a more useful quantity known as the amplitude:
Definition (Amplitude). The amplitude $A_{f,i}$ of a scattering process from $|i\rangle$ to
$|f\rangle$ is defined by
\[
  \langle f|S - 1|i\rangle = iA_{f,i}(2\pi)^4\delta^4(p_F - p_I),
\]
where $p_F$ is the sum of final state 4-momenta, and $p_I$ is the sum of initial
state 4-momenta. The factor of $i$ sticking out is by convention, to match with
non-relativistic quantum mechanics.
We can now modify our Feynman rules accordingly to compute $A_{f,i}$.
- Draw all possible diagrams with the appropriate external legs and impose
  4-momentum conservation at each vertex.
- Write a factor $(-ig)$ for each vertex.
- For each internal line of momentum $p$ and mass $m$, put in a factor of
  \[
    \frac{1}{p^2 - m^2 + i\varepsilon}.
  \]
- Integrate over all undetermined 4-momentum $k$ flowing in each loop.
We again do some examples of computation.
Example (Nucleon-antinucleon scattering). Consider another tree-level process
\[
  \psi + \bar{\psi} \to \phi + \phi.
\]
We have a diagram of the form
[Diagram: incoming $\psi$ and $\bar{\psi}$ with momenta $p_1$, $p_2$; outgoing $\phi$, $\phi$ with momenta $p_1'$, $p_2'$; an internal $\psi$ line with momentum $k$ joins the two vertices.]
and a similar one with the two $\phi$ particles crossing.
The two diagrams then say
\[
  A = (-ig)^2\left(\frac{1}{(p_1 - p_1')^2 - \mu^2} + \frac{1}{(p_1 - p_2')^2 - \mu^2}\right).
\]
Note that we dropped the $i\varepsilon$ terms in the denominator because the denominator
never vanishes, and we don’t need to integrate over anything because all momenta
in the diagram are determined uniquely by momentum conservation.
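As a numerical sanity check, this sketch evaluates the tree-level amplitude for $\psi + \bar{\psi} \to \phi + \phi$ in the centre-of-mass frame. It assumes the crossed diagram carries internal momentum $p_1 - p_2'$, uses the modified rules above (a factor $(-ig)$ per vertex and $1/(p^2 - \mu^2)$ per internal nucleon line), and picks arbitrary illustrative values for $g$, $\mu$, $m$, the momentum and the angle. The function names are invented for this sketch.

```python
import math


def minkowski_sq(p):
    """p^2 = E^2 - |p|^2 for a 4-vector (E, px, py, pz)."""
    e, px, py, pz = p
    return e * e - px * px - py * py - pz * pz


def combine(a, b, sign):
    """Component-wise a + sign * b for two 4-vectors."""
    return tuple(x + sign * y for x, y in zip(a, b))


def annihilation_amplitude(g, mu, m, p, theta):
    """Tree-level A for psi + psibar -> phi + phi, plus the invariants (s, t, u)."""
    e = math.sqrt(mu * mu + p * p)  # energy of each incoming nucleon
    q = math.sqrt(e * e - m * m)    # momentum of each outgoing meson (needs e > m)
    p1 = (e, 0.0, 0.0, p)
    p2 = (e, 0.0, 0.0, -p)
    p1_out = (e, q * math.sin(theta), 0.0, q * math.cos(theta))
    p2_out = (e, -q * math.sin(theta), 0.0, -q * math.cos(theta))
    s = minkowski_sq(combine(p1, p2, +1))
    t = minkowski_sq(combine(p1, p1_out, -1))
    u = minkowski_sq(combine(p1, p2_out, -1))
    amplitude = (-1j * g) ** 2 * (1 / (t - mu * mu) + 1 / (u - mu * mu))
    return amplitude, (s, t, u)


amp, (s, t, u) = annihilation_amplitude(g=1.0, mu=1.0, m=0.5, p=1.0, theta=math.pi / 4)
print("A =", amp)
print("s + t + u =", s + t + u)
```

Both denominators stay strictly negative for physical kinematics, which is why the $i\varepsilon$ can be dropped, and $s + t + u$ comes out as $2\mu^2 + 2m^2$ as it must for these external masses.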
Example (Meson scattering). Consider
\[
  \phi + \phi \to \phi + \phi.
\]
This is a bit more tricky. There is no tree-level diagram we can draw. The best
we can get is a box diagram like this:
[Diagram: a box of four internal $\psi$ lines with momenta $k$, $k + p_1'$, $k + p_1' - p_1$ and $k - p_2'$; the external $\phi$ lines with momenta $p_1$, $p_2$, $p_1'$, $p_2'$ attach to the four corners.]
We then integrate through all possible $k$.
This particular graph we’ve written down gives
\[
  iA = (-ig)^4\int \frac{\mathrm{d}^4k}{(2\pi)^4}\frac{i^4}{(k^2 - \mu^2 + i\varepsilon)((k + p_1')^2 - \mu^2 + i\varepsilon)((k + p_1' - p_1)^2 - \mu^2 + i\varepsilon)((k - p_2')^2 - \mu^2 + i\varepsilon)}.
\]
For large $k$, this asymptotically tends to $\int \frac{\mathrm{d}^4k}{k^8}$, which fortunately means the
integral converges. However, this is usually not the case. For example, we
might have $\int \frac{\mathrm{d}^4k}{k^4}$, which diverges, or even $\int \frac{\mathrm{d}^4k}{k^2}$. This is when we need to do
renormalization.
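The large-$k$ counting used here can be sketched as a naive power-counting rule: each loop contributes $\mathrm{d}^4k$ and each scalar propagator contributes $k^{-2}$. The function names below are made up for this illustration, and the rule only captures the overall large-$k$ scaling of the integrand; it says nothing about what happens inside sub-diagrams.

```python
def superficial_degree(loops, propagators, dim=4):
    """Naive power of k at large k: d^dim k per loop, k^-2 per scalar propagator."""
    return dim * loops - 2 * propagators


def uv_behaviour(loops, propagators):
    """Classify the large-k behaviour of the integral by naive power counting."""
    degree = superficial_degree(loops, propagators)
    if degree < 0:
        return "convergent"
    if degree == 0:
        return "logarithmically divergent"
    return "power divergent"


# Box diagram: one loop, four propagators, i.e. d^4k / k^8.
print(uv_behaviour(loops=1, propagators=4))
# d^4k / k^4 and d^4k / k^2 respectively.
print(uv_behaviour(loops=1, propagators=2))
print(uv_behaviour(loops=1, propagators=1))
```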
Example (Feynman rules for $\phi^4$ theory). Suppose our Lagrangian has a term
\[
  \frac{\lambda}{4!}\phi^4.
\]
We then have single-vertex interactions
[Diagram: four lines meeting at a single vertex.]
Any such diagram contributes a factor of $(-i\lambda)$. Note that there is no $\frac{1}{4!}$. This
is because if we expand the term
\[
  -\frac{i\lambda}{4!}\langle p_1', p_2'|{:}\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4){:}|p_1, p_2\rangle,
\]
there are $4!$ ways of pairing the $\phi(x_i)$ with the $p_i, p_i'$, so each possible interaction
is counted $4!$ times, and cancels that $4!$ factor.
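The counting argument can be spelled out by brute force. The sketch below lists every assignment of the four field factors to the four external particles and checks that the number of assignments cancels the $\frac{1}{4!}$. The string labels and the value of `lam` are placeholders chosen for illustration.

```python
from itertools import permutations
from math import factorial

fields = ("phi(x1)", "phi(x2)", "phi(x3)", "phi(x4)")
externals = ("p1", "p2", "p1'", "p2'")

# Every way of assigning the four field factors to the four external particles.
pairings = [tuple(zip(perm, externals)) for perm in permutations(fields)]

lam = 0.1  # illustrative value of the coupling
vertex_weight = (lam / factorial(4)) * len(pairings)

print(len(pairings), "pairings; effective vertex weight:", vertex_weight)
```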
For some diagrams, there are extra combinatoric factors, known as symmetry
factors, that one must take into account.
These amplitudes can be further related to experimentally measurable quantities.
We will not go into details here, but in The Standard Model course, we
will see that if we have a single initial state $i$ and a collection of final states $\{f\}$,
then the decay rate of $i$ to $\{f\}$ is given by
\[
  \Gamma_{i \to f} = \frac{1}{2m_i}\int |A_{f,i}|^2\, \mathrm{d}\rho_f,
\]
where
\[
  \mathrm{d}\rho_f = (2\pi)^4\delta^4(p_F - p_I)\prod_r \frac{\mathrm{d}^3p_r}{(2\pi)^3 2p_r^0},
\]
and $r$ runs over all momenta in the final states.
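As an illustration of how this formula is used, consider a parent of mass $m$ at rest decaying into two distinguishable particles of equal mass $M$, with an amplitude that does not depend on the final momenta. Carrying out the $\mathrm{d}\rho_f$ integral in that special case gives $\Gamma = |A|^2 |\mathbf{q}| / (8\pi m^2)$ with $|\mathbf{q}| = \frac{1}{2}\sqrt{m^2 - 4M^2}$; this closed form is worked out here and is not stated in the notes. The sketch applies it with $A = -g$, the value read off from the first-order $\phi \to \psi\bar{\psi}$ computation earlier. The function name and numerical values are assumptions for illustration.

```python
import math


def two_body_decay_rate(amplitude, m_parent, m_daughter):
    """Gamma = |A|^2 |q| / (8 pi m^2) for a momentum-independent amplitude A.

    Assumes a parent at rest decaying into two distinguishable particles of
    equal mass; returns 0.0 when the decay is kinematically forbidden.
    """
    if m_parent <= 2 * m_daughter:
        return 0.0
    q = 0.5 * math.sqrt(m_parent ** 2 - 4 * m_daughter ** 2)
    return abs(amplitude) ** 2 * q / (8 * math.pi * m_parent ** 2)


g = 1.0  # illustrative coupling, so that A = -g at lowest order
print(two_body_decay_rate(-g, m_parent=10.0, m_daughter=3.0))
print(two_body_decay_rate(-g, m_parent=5.0, m_daughter=3.0))
```

The second call returns zero because a meson lighter than $2M$ cannot decay into a nucleon-antinucleon pair.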
3.6 Correlation functions and vacuum bubbles
The $S$-matrix elements are good and physical, because they correspond to
probabilities of certain events happening. However, we are not always interested
in these quantities. For example, we might want to ask about the conductivity of some
quantum-field-theoretic object. To do these computations, quantities known as
correlation functions are useful. However, we are not going to use these objects
in this course, so we are just going to state, rather than justify, many of the
results, and you are free to skip this section.
Before that, we need to study the vacuum. Previously, we have been working
with the vacuum of the free theory $|0\rangle$. This satisfies the boring relation
\[
  H_0|0\rangle = 0.
\]
However, when we introduce an interaction term, this is no longer the vacuum.
Instead we have an interacting vacuum $|\Omega\rangle$, satisfying
\[
  H|\Omega\rangle = 0.
\]
As before, we normalize the vacuum so that
\[
  \langle\Omega|\Omega\rangle = 1.
\]
Concretely, this interaction vacuum can be obtained by starting with a free
vacuum and then letting it evolve for infinite time.
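Lemma. The free vacuum and interacting vacuum are related via
\[
  |\Omega\rangle = \frac{1}{\langle\Omega|0\rangle} U_I(t, -\infty)|0\rangle = \frac{1}{\langle\Omega|0\rangle} U_S(t, -\infty)|0\rangle.
\]
Similarly, we have
\[
  \langle\Omega| = \frac{1}{\langle\Omega|0\rangle}\langle 0|U(\infty, t).
\]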
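Proof. Note that we have the last equality because $U_I(t, -\infty)$ and $U_S(t, -\infty)$
differ by factors of $e^{-iH_0t}$, which acts as the identity on $|0\rangle$.
Consider an arbitrary state $|\Psi\rangle$. We want to show that
\[
  \langle\Psi|U(t, -\infty)|0\rangle = \langle\Psi|\Omega\rangle\langle\Omega|0\rangle.
\]
The trick is to consider a complete set of eigenstates $\{|\Omega\rangle, |x\rangle\}$ for $H$ with energies
$E_x$. Then we have
\[
  U(t, t_0)|x\rangle = e^{iE_x(t_0 - t)}|x\rangle.
\]
Also, we know that
\[
  1 = |\Omega\rangle\langle\Omega| + \int \mathrm{d}x\, |x\rangle\langle x|.
\]
So we have
\begin{align*}
  \langle\Psi|U(t, -\infty)|0\rangle &= \langle\Psi|U(t, -\infty)\left(|\Omega\rangle\langle\Omega| + \int \mathrm{d}x\, |x\rangle\langle x|\right)|0\rangle\\
  &= \langle\Psi|\Omega\rangle\langle\Omega|0\rangle + \lim_{t_0 \to -\infty}\int \mathrm{d}x\, e^{iE_x(t_0 - t)}\langle\Psi|x\rangle\langle x|0\rangle.
\end{align*}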
We now claim that the second term vanishes in the limit. As in most of the
things in the course, we do not have a rigorous justification for this, but it is not
too far-fetched to say so since for any well-behaved (genuine) function $f(x)$,
the Riemann-Lebesgue lemma tells us that for any fixed $a, b \in \mathbb{R}$, we have
\[
  \lim_{\mu \to \infty}\int_a^b f(x) e^{i\mu x}\, \mathrm{d}x = 0.
\]
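The decay promised by the Riemann-Lebesgue lemma is easy to observe numerically. The sketch below estimates $\int_0^1 x^2 e^{i\mu x}\, \mathrm{d}x$ with a composite Simpson rule for increasing $\mu$. The test function $f(x) = x^2$, the interval and the function name are arbitrary choices for illustration, and this is of course not a proof of the claim about the vacuum.

```python
import cmath


def oscillatory_integral(f, a, b, mu, steps=20000):
    """Composite Simpson estimate of the integral of f(x) e^{i mu x} over [a, b].

    `steps` must be even.
    """
    h = (b - a) / steps
    total = f(a) * cmath.exp(1j * mu * a) + f(b) * cmath.exp(1j * mu * b)
    for k in range(1, steps):
        x = a + k * h
        weight = 4 if k % 2 else 2
        total += weight * f(x) * cmath.exp(1j * mu * x)
    return total * h / 3


for mu in (1, 10, 100, 200):
    value = oscillatory_integral(lambda x: x * x, 0.0, 1.0, mu)
    print(mu, abs(value))
```

For this particular $f$, the modulus falls off roughly like $1/\mu$, driven by the endpoint at $x = 1$.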
If this were to hold, then
\[
  \langle\Psi|U(t, -\infty)|0\rangle = \langle\Psi|\Omega\rangle\langle\Omega|0\rangle.
\]
So the result follows. The other direction follows similarly.
Now given that we have this vacuum, we can define the correlation function.
Definition (Correlation/Green’s function). The correlation or Green’s function
is defined as
\[
  G^{(n)}(x_1, \cdots, x_n) = \langle\Omega|T\phi_H(x_1)\cdots\phi_H(x_n)|\Omega\rangle,
\]
where $\phi_H$ denotes the operators in the Heisenberg picture.
How can we compute these things? We claim the following:
Proposition.
\[
  G^{(n)}(x_1, \cdots, x_n) = \frac{\langle 0|T\phi_I(x_1)\cdots\phi_I(x_n) S|0\rangle}{\langle 0|S|0\rangle}.
\]
Proof. We can wlog consider the specific example $t_1 > t_2 > \cdots > t_n$. We start
by working on the denominator. We have
\[
  \langle 0|S|0\rangle = \langle 0|U(\infty, t)U(t, -\infty)|0\rangle = \langle 0|\Omega\rangle\langle\Omega|\Omega\rangle\langle\Omega|0\rangle.
\]
For the numerator, we note that $S$ can be written as
\[
  S = U_I(\infty, t_1)U_I(t_1, t_2)\cdots U_I(t_n, -\infty).
\]
So after time-ordering, we know the numerator of the right hand side is
\begin{align*}
  &\langle 0|U_I(\infty, t_1)\phi_I(x_1)U_I(t_1, t_2)\phi_I(x_2)\cdots\phi_I(x_n)U_I(t_n, -\infty)|0\rangle\\
  &= \langle 0|U_I(\infty, t_1)\phi_H(x_1)\cdots\phi_H(x_n)U_I(t_n, -\infty)|0\rangle\\
  &= \langle 0|\Omega\rangle\langle\Omega|T\phi_H(x_1)\cdots\phi_H(x_n)|\Omega\rangle\langle\Omega|0\rangle.
\end{align*}
So the result follows.
Now what does this quantity tell us? It turns out these have some really nice
physical interpretation. Let’s look at the terms $\langle 0|T\phi_I(x_1)\cdots\phi_I(x_n) S|0\rangle$ and
$\langle 0|S|0\rangle$ individually and see what they tell us.
For simplicity, we will work with the $\phi^4$ theory, so that we only have a single
$\phi$ field, and we will, without risk of confusion, draw all diagrams with solid lines.
Looking at $\langle 0|S|0\rangle$, we are looking at all transitions from $|0\rangle$ to $|0\rangle$. The
Feynman diagrams corresponding to these would look like
[Diagrams: closed diagrams with no external lines, one with 1 vertex and one with 2 vertices.]
These are known as vacuum bubbles. Then $\langle 0|S|0\rangle$ is the sum of the amplitudes
of all these vacuum bubbles.
While this sounds complicated, a miracle occurs. It happens that the different
combinatoric factors piece together nicely so that we have
\[
  \langle 0|S|0\rangle = \exp(\text{all distinct (connected) vacuum bubbles}).
\]
Similarly, magic tells us that
\[
  \langle 0|T\phi_I(x_1)\cdots\phi_I(x_n) S|0\rangle = \left(\sum \text{connected diagrams with $n$ loose ends}\right)\langle 0|S|0\rangle.
\]
So what $G^{(n)}(x_1, \cdots, x_n)$ really tells us is the sum of connected diagrams modulo
these silly vacuum bubbles.
Example. The diagrams that correspond to $G^{(4)}(x_1, \cdots, x_4)$ include things
that look like
[Diagrams: three diagrams, each with the four end points $x_1, x_2, x_3, x_4$, joined up in different ways.]
Note that we define “connected” to mean every line is connected to some of the
end points in some way, rather than everything being connected to everything.
We can come up with analogous Feynman rules to figure out the contribution
of all of these terms.
There is also another way we can think about these Green’s functions. Consider
the theory with a source $J(x)$ added, so that
\[ H = H_0 + H_{\mathrm{int}} - J(x)\phi(x). \]
This $J$ is a fixed background function, called a source in analogy with electromagnetism.
Consider the interaction picture, but now we choose $(H_0 + H_{\mathrm{int}})$ to be the
“free” part, and $-J\phi$ as the “interaction” piece.
Now the “vacuum” we use is not $|0\rangle$ but $|\Omega\rangle$, since this is the “free vacuum”
for our theory. We define
\[ W[J] = \langle\Omega|U_I(-\infty, \infty)|\Omega\rangle. \]
This is then a functional in $J$. We can compute
\begin{align*}
W[J] &= \langle\Omega|U_I(-\infty, \infty)|\Omega\rangle\\
&= \langle\Omega|T\exp\left(-\int d^4x\, J(x)\phi_H(x)\right)|\Omega\rangle\\
&= 1 + \sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\int d^4x_1\cdots d^4x_n\, J(x_1)\cdots J(x_n)\,G^{(n)}(x_1, \cdots, x_n).
\end{align*}
Thus by general variational calculus, we know
\[ G^{(n)}(x_1, \cdots, x_n) = (-1)^n\left.\frac{\delta^n W[J]}{\delta J(x_1)\cdots\delta J(x_n)}\right|_{J=0}. \]
Thus we call $W[J]$ a generating functional for the functions $G^{(n)}(x_1, \cdots, x_n)$.
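As a quick sanity check of the variational formula, the $n = 2$ case can be worked out by hand. This is only a sketch: it takes the expansion coefficients to be $(-1)^n/n!$ and uses that $G^{(2)}$ is symmetric in its arguments (it is a time-ordered product of bosonic fields).

```latex
\begin{align*}
\frac{\delta^2 W[J]}{\delta J(y_1)\,\delta J(y_2)}\bigg|_{J=0}
 &= \frac{(-1)^2}{2!}\int d^4x_1\, d^4x_2\,
    \big[\delta^4(x_1-y_1)\delta^4(x_2-y_2)
       + \delta^4(x_1-y_2)\delta^4(x_2-y_1)\big]\, G^{(2)}(x_1,x_2)\\
 &= \tfrac{1}{2}\big[G^{(2)}(y_1,y_2) + G^{(2)}(y_2,y_1)\big]
  = G^{(2)}(y_1,y_2),
\end{align*}
% Terms with n < 2 are killed by the two derivatives, and terms with n > 2
% still contain a factor of J and vanish at J = 0.
```

Multiplying by $(-1)^2 = 1$ then reproduces $G^{(2)}(y_1, y_2)$, as the general formula predicts.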
4 Spinors
In this chapter, we return to the classical world where things make sense. We
are going to come up with the notion of a spinor, which is a special type of field
that, when quantized, gives us particles with “intrinsic spin”.
4.1 The Lorentz group and the Lorentz algebra
So far, we have only been working with scalar fields. These are pretty boring,
since when we change coordinates, the values of the field remain unchanged. One
can certainly imagine more exciting fields, like the electromagnetic potential.
Ignoring issues of choice of gauge, we can imagine the electromagnetic potential
as a 4-vector $A^\mu$ at each point in space-time. When we change coordinates by a
Lorentz transformation $\Lambda$, the field transforms by
\[ A^\mu(x) \mapsto \Lambda^\mu{}_\nu A^\nu(\Lambda^{-1}x). \]
Note that we had to put a $\Lambda^{-1}$ inside $A^\nu$ because the names of the points have
changed. $\Lambda^{-1}x$ in the new coordinate system labels the same point as $x$ in the
old coordinate system.
In general, we can consider vector-valued fields that transform when we
change coordinates. If we were to construct such a field, then given any Lorentz
transformation $\Lambda$, we need to produce a corresponding transformation $D(\Lambda)$ of
the field. If our field $\phi$ takes values in a vector space $V$ (usually $\mathbb{R}^n$), then this
$D(\Lambda)$ should be a linear map $V \to V$. The transformation rule is then
\[ x \mapsto \Lambda x, \qquad \phi \mapsto D(\Lambda)\phi. \]
We want to be sure this assignment of $D(\Lambda)$ behaves sensibly. For example, if we
apply the Lorentz transformation $\Lambda = 1$, i.e. we do nothing, then $D(1)$ should
not do anything as well. So we must have
\[ D(1) = 1. \]
Now if we do $\Lambda_1$ and then $\Lambda_2$, then our field will transform first by $D(\Lambda_1)$, then
$D(\Lambda_2)$. For the universe to make sense, this had better be equal to $D(\Lambda_2\Lambda_1)$.
So we require
\[ D(\Lambda_1)D(\Lambda_2) = D(\Lambda_1\Lambda_2) \]
for any $\Lambda_1, \Lambda_2$. Mathematically, we call this a representation of the Lorentz
group.
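Definition (Lorentz group). The Lorentz group, denoted $\mathrm{O}(1, 3)$, is the group
of all Lorentz transformations. Explicitly, it is given by
\[ \mathrm{O}(1, 3) = \{\Lambda \in M_{4\times 4} : \Lambda^T\eta\Lambda = \eta\}, \]
where
\[ \eta = \eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1 \end{pmatrix} \]
is the Minkowski metric. Alternatively, $\mathrm{O}(1, 3)$ is the group of all matrices $\Lambda$
such that
\[ \langle\Lambda x, \Lambda y\rangle = \langle x, y\rangle \]
for all $x, y \in \mathbb{R}^{1+3}$, where $\langle x, y\rangle$ denotes the inner product given by the Minkowski
metric
\[ \langle x, y\rangle = x^T\eta y. \]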
Definition (Representation of the Lorentz group). A representation of the
Lorentz group is a vector space $V$ and a linear map $D(\Lambda) : V \to V$ for each
$\Lambda \in \mathrm{O}(1, 3)$ such that
\[ D(1) = 1, \qquad D(\Lambda_1)D(\Lambda_2) = D(\Lambda_1\Lambda_2) \]
for any $\Lambda_1, \Lambda_2 \in \mathrm{O}(1, 3)$.
The space $V$ is called the representation space.
So to construct interesting fields, we need to find interesting representations!
There is one representation that sends each $\Lambda$ to the matrix $\Lambda$ acting on $\mathbb{R}^{1+3}$,
but that is really boring.
To find more representations, we will consider “infinitesimal” Lorentz transformations.
We will find representations of these infinitesimal Lorentz transformations,
and then attempt to patch them up to get a representation of the
Lorentz group. It turns out that we will fail to patch them up, but instead we
will end up with something more interesting!
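We write our Lorentz transformation $\Lambda$ as
\[ \Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \varepsilon\omega^\mu{}_\nu + O(\varepsilon^2). \]
The definition of a Lorentz transformation requires
\[ \Lambda^\mu{}_\sigma\Lambda^\nu{}_\rho\eta^{\sigma\rho} = \eta^{\mu\nu}. \]
Putting it in, we have
\[ (\delta^\mu{}_\sigma + \varepsilon\omega^\mu{}_\sigma)(\delta^\nu{}_\rho + \varepsilon\omega^\nu{}_\rho)\eta^{\rho\sigma} + O(\varepsilon^2) = \eta^{\mu\nu}. \]
So we find that we need
\[ \omega^{\mu\nu} + \omega^{\nu\mu} = 0. \]
So $\omega$ is antisymmetric. Thus we have found that an infinitesimal Lorentz transformation
is an antisymmetric matrix. The space of such matrices is known as the Lie algebra of the
Lorentz group.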
Definition (Lorentz algebra). The Lorentz algebra is
\[ \mathfrak{o}(1, 3) = \{\omega \in M_{4\times 4} : \omega^{\mu\nu} + \omega^{\nu\mu} = 0\}. \]
It is important that when we say that $\omega$ is antisymmetric, we mean exactly
$\omega^{\mu\nu} + \omega^{\nu\mu} = 0$. Usually, when we write out a matrix, we write out the entries of
$\omega^\mu{}_\nu$ instead, and the matrix we see will not in general be antisymmetric, as we
will see in the examples below.
A $4 \times 4$ anti-symmetric matrix has 6 free components. These 6 components
in fact correspond to three infinitesimal rotations and three infinitesimal boosts.
We introduce a basis given by the confusing expression
\[ (M^{\rho\sigma})^\mu{}_\nu = \eta^{\rho\mu}\delta^\sigma{}_\nu - \eta^{\sigma\mu}\delta^\rho{}_\nu. \]
Here the $\rho$ and $\sigma$ count which basis vector (i.e. matrix) we are talking about,
and $\mu, \nu$ are the rows and columns. Usually, we will just refer to the matrix as
$M^{\rho\sigma}$ and not mention the indices $\mu, \nu$ to avoid confusion.
Each basis element will have exactly one pair of non-zero entries. For example,
we have
\[ (M^{01})^\mu{}_\nu = \begin{pmatrix} 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad (M^{12})^\mu{}_\nu = \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}. \]
These generate a boost in the $x^1$ direction and a rotation in the $x^1$-$x^2$ plane
respectively, as we will later see.
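The index gymnastics in the definition of the basis matrices is easy to get wrong, so it can help to let a computer build them. The following is a minimal sketch, assuming the signature $(+, -, -, -)$ used in these notes; the function name `lorentz_generator` is made up for illustration.

```python
import numpy as np

# Minkowski metric with signature (+, -, -, -), as in the text.
eta = np.diag([1.0, -1.0, -1.0, -1.0])

def lorentz_generator(rho, sigma):
    """Return (M^{rho sigma})^mu_nu as a 4x4 array (rows mu, columns nu)."""
    M = np.zeros((4, 4))
    for mu in range(4):
        for nu in range(4):
            M[mu, nu] = eta[rho, mu] * (sigma == nu) - eta[sigma, mu] * (rho == nu)
    return M

M01 = lorentz_generator(0, 1)  # boost generator: symmetric as a matrix
M12 = lorentz_generator(1, 2)  # rotation generator: antisymmetric as a matrix
print(M01)
print(M12)

# With the second index raised, omega^{mu nu} = omega^mu_lambda eta^{lambda nu},
# every generator becomes a genuinely antisymmetric matrix.
all_antisymmetric = all(
    np.allclose((lorentz_generator(r, s) @ eta).T, -(lorentz_generator(r, s) @ eta))
    for r in range(4) for s in range(4)
)
print(all_antisymmetric)
```

Note that `M01` with mixed indices is symmetric, which illustrates the earlier warning: antisymmetry refers to $\omega^{\mu\nu}$, not to the matrix of entries $\omega^\mu{}_\nu$.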
By definition of a basis, we can write all matrices $\omega$ in the Lorentz algebra
as a linear combination of these:
\[ \omega = \frac{1}{2}\Omega_{\rho\sigma}M^{\rho\sigma}. \]
Note that here $M^{\rho\sigma}$ and $M^{\sigma\rho}$ will be the negative of each other, so $\{M^{\rho\sigma}\}$ doesn’t
really form a basis. By symmetry, we will often choose $\Omega$ so that $\Omega_{\rho\sigma} = -\Omega_{\sigma\rho}$ as well.
Now we can talk about representations of the Lorentz algebra. In the case of
representations of the Lorentz group, we required that representations respect
multiplication. In the case of the Lorentz algebra, the right thing to ask to be
preserved is the commutator (cf. III Symmetries, Fields and Particles).
Definition (Representation of Lorentz algebra). A representation of the Lorentz
algebra is a collection of matrices that satisfy the same commutation relations as
the Lorentz algebra.
Formally, this is given by a vector space $V$ and a linear map $R(\omega) : V \to V$
for each $\omega \in \mathfrak{o}(1, 3)$ such that
\[ R(a\omega + b\omega') = aR(\omega) + bR(\omega'), \qquad R([\omega, \omega']) = [R(\omega), R(\omega')] \]
for all $\omega, \omega' \in \mathfrak{o}(1, 3)$ and $a, b \in \mathbb{R}$.
It turns out finding representations of the Lorentz algebra isn’t that bad. We
first note the following magic formula about the basis vectors of the Lorentz
algebra:
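Proposition.
\[ [M^{\rho\sigma}, M^{\tau\nu}] = \eta^{\sigma\tau}M^{\rho\nu} - \eta^{\rho\tau}M^{\sigma\nu} + \eta^{\rho\nu}M^{\sigma\tau} - \eta^{\sigma\nu}M^{\rho\tau}. \]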
This can be proven, painfully, by writing out the matrices.
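The painful part can be delegated to a computer. The sketch below checks the commutation relation for all $4^4$ index combinations by brute force; the helper names `M` and `commutator` are our own.

```python
import itertools
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # signature (+, -, -, -)

def M(rho, sigma):
    """(M^{rho sigma})^mu_nu = eta^{rho mu} delta^sigma_nu - eta^{sigma mu} delta^rho_nu."""
    out = np.zeros((4, 4))
    out[:, sigma] += eta[rho, :]   # eta^{rho mu} in column nu = sigma
    out[:, rho] -= eta[sigma, :]   # eta^{sigma mu} in column nu = rho
    return out

def commutator(A, B):
    return A @ B - B @ A

# Largest deviation between the two sides over all index choices.
max_err = 0.0
for rho, sigma, tau, nu in itertools.product(range(4), repeat=4):
    lhs = commutator(M(rho, sigma), M(tau, nu))
    rhs = (eta[sigma, tau] * M(rho, nu) - eta[rho, tau] * M(sigma, nu)
           + eta[rho, nu] * M(sigma, tau) - eta[sigma, nu] * M(rho, tau))
    max_err = max(max_err, np.abs(lhs - rhs).max())
print(max_err)
```

This is of course a numerical check in one basis rather than a proof, but since the entries are all $0$ or $\pm 1$ there is no rounding to worry about.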
It turns out this is the only thing we need to satisfy:
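Fact. Given any vector space $V$ and collection of linear maps $L^{\rho\sigma} : V \to V$,
they give a representation of the Lorentz algebra if and only if
\[ [L^{\rho\sigma}, L^{\tau\nu}] = \eta^{\sigma\tau}L^{\rho\nu} - \eta^{\rho\tau}L^{\sigma\nu} + \eta^{\rho\nu}L^{\sigma\tau} - \eta^{\sigma\nu}L^{\rho\tau}. \]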
Suppose we have already found a representation of the Lorentz algebra. How
can we find a representation of the Lorentz group? We will need to make use of
the following fact:
Fact. Let $\Lambda$ be a Lorentz transformation that preserves orientation and does not
reverse time. Then we can write $\Lambda = \exp(\omega)$ for some $\omega \in \mathfrak{o}(1, 3)$. In coordinates,
we can find some $\Omega_{\rho\sigma}$ such that
\[ \Lambda = \exp\left(\frac{1}{2}\Omega_{\rho\sigma}M^{\rho\sigma}\right). \tag{$*$} \]
Thus if we have found some representation $R(M^{\rho\sigma})$, we can try to define a
representation of the Lorentz group by
\[ R(\Lambda) = \exp(R(\omega)) = \exp\left(\frac{1}{2}\Omega_{\rho\sigma}R(M^{\rho\sigma})\right). \tag{$\dagger$} \]
Now we see two potential problems with our approach. The first is that we can
only lift a representation of the Lorentz algebra to the parity and time-preserving
Lorentz transformations, and not all of them. Even worse, it might not even be
well-defined for these nice Lorentz transformations. We just know that we can
find some $\Omega_{\rho\sigma}$ such that ($*$) holds. In general, there can be many such $\Omega_{\rho\sigma}$, and
they might not give the same answer when we evaluate ($\dagger$)!
Definition
(Restricted Lorentz group)
.
The restricted Lorentz group consists
of the elements in the Lorentz group that preserve orientation and direction of
time.
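Example. Recall that we had
\[ (M^{12})^\mu{}_\nu = \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}. \]
We pick
\[ \Omega_{12} = -\Omega_{21} = -\phi^3, \]
with all other entries zero. Then we have
\[ \frac{1}{2}\Omega_{\rho\sigma}M^{\rho\sigma} = \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & \phi^3 & 0\\ 0 & -\phi^3 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}. \]
So we know
\[ \Lambda = \exp\left(\frac{1}{2}\Omega_{\rho\sigma}M^{\rho\sigma}\right) = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & \cos\phi^3 & \sin\phi^3 & 0\\ 0 & -\sin\phi^3 & \cos\phi^3 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}, \]
which is a rotation in the $x^1$-$x^2$ plane by $\phi^3$. It is clear that $\phi^3$ is not uniquely
defined by $\Lambda$. Indeed, $\phi^3 + 2n\pi$ for any integer $n$ would give the same element of the
Lorentz group.
More generally, given any rotation parameters $\phi = (\phi^1, \phi^2, \phi^3)$, we obtain a
rotation by setting
\[ \Omega_{ij} = -\varepsilon_{ijk}\phi^k. \]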
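The exponential in this example can also be evaluated numerically. The sketch below uses a small hand-written matrix exponential (`expm`, a scaled Taylor series written here only to keep the example dependency-free; it is not a library routine) and checks both the closed form and the $2\pi$ ambiguity in $\phi^3$.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def expm(A, terms=30):
    """Matrix exponential via scaling-and-squaring of a truncated Taylor series."""
    norm = np.linalg.norm(A)
    n_sq = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2 ** n_sq
    result = np.eye(A.shape[0], dtype=B.dtype)
    term = np.eye(A.shape[0], dtype=B.dtype)
    for k in range(1, terms):
        term = term @ B / k
        result = result + term
    for _ in range(n_sq):
        result = result @ result
    return result

# (M^{12})^mu_nu as given in the text.
M12 = np.zeros((4, 4))
M12[1, 2] = -1.0
M12[2, 1] = 1.0

def rotation(phi3):
    # Omega_12 = -Omega_21 = -phi3, so (1/2) Omega_{rho sigma} M^{rho sigma} = -phi3 * M12.
    return expm(-phi3 * M12)

phi = 0.7
Lam = rotation(phi)
c, s = np.cos(phi), np.sin(phi)
expected = np.array([[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]])
print(np.allclose(Lam, expected))
print(np.allclose(rotation(phi + 2 * np.pi), Lam))  # same Lorentz transformation
print(np.allclose(Lam.T @ eta @ Lam, eta))          # Lam really lies in O(1, 3)
```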
4.2 The Clifford algebra and the spin representation
It turns out there is a useful way of finding a representation of the Lorentz
algebra which gives rise to nice properties of the representation. This is via the
Clifford algebra.
We will make use of the following notation:
Notation (Anticommutator). We write
{A, B} = AB + BA
for the anticommutator of two matrices/linear maps.
Definition (Clifford algebra). The Clifford algebra is the algebra generated by
$\gamma^0, \gamma^1, \gamma^2, \gamma^3$ subject to the relations
\[ \{\gamma^\mu, \gamma^\nu\} = \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\eta^{\mu\nu}1. \]
More explicitly, we have
\[ \gamma^\mu\gamma^\nu = -\gamma^\nu\gamma^\mu \quad\text{for } \mu \neq \nu, \]
and
\[ (\gamma^0)^2 = 1, \qquad (\gamma^i)^2 = -1. \]
A representation of the Clifford algebra is then a collection of matrices (or linear
maps) satisfying the relations listed above.
Note that if we have a representation of the Clifford algebra, we will also
denote the matrices of the representation by $\gamma^\mu$, instead of, say, $R(\gamma^\mu)$.
Example (Chiral representation). The simplest representation (in the sense
that it is easy to write out, rather than in any mathematical sense) is the
4-dimensional representation called the chiral representation. In block matrix
form, this is given by
\[ \gamma^0 = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i\\ -\sigma^i & 0 \end{pmatrix}, \]
where the $\sigma^i$ are the Pauli matrices given by
\[ \sigma^1 = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \qquad \sigma^2 = \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix}, \qquad \sigma^3 = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}, \]
which satisfy
\[ \{\sigma^i, \sigma^j\} = 2\delta^{ij}1. \]
Of course, this is not the only representation. Given any invertible $4 \times 4$ matrix $U$, we
can construct a new representation of the Clifford algebra by transforming by
\[ \gamma^\mu \mapsto U\gamma^\mu U^{-1}. \]
It turns out any 4-dimensional representation of the Clifford algebra comes from
applying some similarity transformation to the chiral representation, but we will
not prove that.
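It is straightforward to confirm numerically that the chiral representation satisfies the Clifford relations, and that conjugating by an invertible matrix preserves them. A minimal sketch follows; the helper `satisfies_clifford` and the particular matrix `U` are chosen here purely for illustration.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Chiral representation in 2x2 block form.
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def satisfies_clifford(gammas):
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} 1 for all mu, nu."""
    return all(
        np.allclose(gammas[m] @ gammas[n] + gammas[n] @ gammas[m], 2 * eta[m, n] * np.eye(4))
        for m in range(4) for n in range(4)
    )

print(satisfies_clifford(gamma))

# An arbitrary invertible U (a small perturbation of the identity) gives
# another representation by conjugation.
rng = np.random.default_rng(0)
U = np.eye(4) + 0.1 * rng.normal(size=(4, 4))
U_inv = np.linalg.inv(U)
gamma_new = [U @ g @ U_inv for g in gamma]
print(satisfies_clifford(gamma_new))
```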
We now claim that every representation of the Clifford algebra gives rise to
a representation of the Lorentz algebra.
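Proposition. Suppose $\gamma^\mu$ is a representation of the Clifford algebra. Then the
matrices given by
\[ S^{\rho\sigma} = \frac{1}{4}[\gamma^\rho, \gamma^\sigma] = \begin{cases} 0 & \rho = \sigma\\ \frac{1}{2}\gamma^\rho\gamma^\sigma & \rho \neq \sigma \end{cases} = \frac{1}{2}\gamma^\rho\gamma^\sigma - \frac{1}{2}\eta^{\rho\sigma} \]
define a representation of the Lorentz algebra.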
This is known as the spin representation.
We will need the following technical result to prove the proposition:
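Lemma.
\[ [S^{\mu\nu}, \gamma^\rho] = \gamma^\mu\eta^{\nu\rho} - \gamma^\nu\eta^{\rho\mu}. \]
Proof.
\begin{align*}
[S^{\mu\nu}, \gamma^\rho] &= \left[\frac{1}{2}\gamma^\mu\gamma^\nu - \frac{1}{2}\eta^{\mu\nu}, \gamma^\rho\right]\\
&= \frac{1}{2}\gamma^\mu\gamma^\nu\gamma^\rho - \frac{1}{2}\gamma^\rho\gamma^\mu\gamma^\nu\\
&= \gamma^\mu\left(\eta^{\nu\rho} - \frac{1}{2}\gamma^\rho\gamma^\nu\right) - \left(\eta^{\mu\rho} - \frac{1}{2}\gamma^\mu\gamma^\rho\right)\gamma^\nu\\
&= \gamma^\mu\eta^{\nu\rho} - \gamma^\nu\eta^{\rho\mu}.
\end{align*}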
Now we can prove our claim.
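Proof of proposition. We have to show that
\[ [S^{\mu\nu}, S^{\rho\sigma}] = \eta^{\nu\rho}S^{\mu\sigma} - \eta^{\mu\rho}S^{\nu\sigma} + \eta^{\mu\sigma}S^{\nu\rho} - \eta^{\nu\sigma}S^{\mu\rho}. \]
The proof involves, again, writing everything out. Using the fact that $\eta^{\rho\sigma}$
commutes with everything, we know
\begin{align*}
[S^{\mu\nu}, S^{\rho\sigma}] &= \frac{1}{2}[S^{\mu\nu}, \gamma^\rho\gamma^\sigma]\\
&= \frac{1}{2}\left([S^{\mu\nu}, \gamma^\rho]\gamma^\sigma + \gamma^\rho[S^{\mu\nu}, \gamma^\sigma]\right)\\
&= \frac{1}{2}\left(\gamma^\mu\eta^{\nu\rho}\gamma^\sigma - \gamma^\nu\eta^{\mu\rho}\gamma^\sigma + \gamma^\rho\gamma^\mu\eta^{\nu\sigma} - \gamma^\rho\gamma^\nu\eta^{\mu\sigma}\right).
\end{align*}
Then the result follows from the fact that
\[ \gamma^\mu\gamma^\sigma = 2S^{\mu\sigma} + \eta^{\mu\sigma}. \]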
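Both the lemma and the proposition can be checked by brute force in the chiral representation. The sketch below is a numerical check in one particular representation, not a substitute for the proof; the variable names are our own.

```python
import itertools
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def comm(A, B):
    return A @ B - B @ A

# S[m][n] = (1/4) [gamma^m, gamma^n]
S = [[0.25 * comm(gamma[m], gamma[n]) for n in range(4)] for m in range(4)]

# Lemma: [S^{mu nu}, gamma^rho] = gamma^mu eta^{nu rho} - gamma^nu eta^{rho mu}
lemma_err = max(
    np.abs(comm(S[m][n], gamma[r]) - (gamma[m] * eta[n, r] - gamma[n] * eta[r, m])).max()
    for m, n, r in itertools.product(range(4), repeat=3)
)

# Proposition: the S^{mu nu} obey the Lorentz algebra.
algebra_err = max(
    np.abs(comm(S[m][n], S[r][s])
           - (eta[n, r] * S[m][s] - eta[m, r] * S[n][s]
              + eta[m, s] * S[n][r] - eta[n, s] * S[m][r])).max()
    for m, n, r, s in itertools.product(range(4), repeat=4)
)
print(lemma_err, algebra_err)
```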
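So we now have a representation of the Lorentz algebra. Does this give us a
representation of the (restricted) Lorentz group? Given any $\Lambda \in \mathrm{SO}^+(1, 3)$, we
can write
\[ \Lambda = \exp\left(\frac{1}{2}\Omega_{\rho\sigma}M^{\rho\sigma}\right). \]
We now try to define
\[ S[\Lambda] = \exp\left(\frac{1}{2}\Omega_{\rho\sigma}S^{\rho\sigma}\right). \]
Is this well-defined?
We can try using an example we have previously computed. Recall that given
any rotation parameter $\phi = (\phi^1, \phi^2, \phi^3)$, we can pick
\[ \Omega_{ij} = -\varepsilon_{ijk}\phi^k \]
to obtain a rotation denoted by $\phi$. What does this give when we exponentiate
with $S^{\rho\sigma}$? We use the chiral representation, so that
\[ S^{ij} = \frac{1}{4}\left[\begin{pmatrix} 0 & \sigma^i\\ -\sigma^i & 0 \end{pmatrix}, \begin{pmatrix} 0 & \sigma^j\\ -\sigma^j & 0 \end{pmatrix}\right] = -\frac{i}{2}\varepsilon^{ijk}\begin{pmatrix} \sigma^k & 0\\ 0 & \sigma^k \end{pmatrix}. \]
Then we obtain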
Proposition. Let $\phi = (\phi^1, \phi^2, \phi^3)$, and define
\[ \Omega_{ij} = -\varepsilon_{ijk}\phi^k. \]
Then in the chiral representation of $S$, writing $\sigma = (\sigma^1, \sigma^2, \sigma^3)$, we have
\[ S[\Lambda] = \exp\left(\frac{1}{2}\Omega_{\rho\sigma}S^{\rho\sigma}\right) = \begin{pmatrix} e^{i\phi\cdot\sigma/2} & 0\\ 0 & e^{i\phi\cdot\sigma/2} \end{pmatrix}. \]
In particular, we can pick $\phi = (0, 0, 2\pi)$. Then the corresponding Lorentz
transformation is the identity, but
\[ S[1] = \begin{pmatrix} e^{i\pi\sigma^3} & 0\\ 0 & e^{i\pi\sigma^3} \end{pmatrix} = -1 \neq 1. \]
So this does not give a well-defined representation of the Lorentz group, because
different ways of representing the element $1$ in the Lorentz group will give
different values of $S[1]$. Before we continue on to discuss this, we take note of
the corresponding formula for boosts.
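The sign flip under a $2\pi$ rotation is easy to see numerically. The sketch below exponentiates the same rotation parameter in both the vector representation ($M^{12}$) and the spin representation ($S^{12}$); `expm` is again a small hand-rolled helper rather than a library call.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def expm(A, terms=30):
    """Matrix exponential via scaling-and-squaring of a truncated Taylor series."""
    norm = np.linalg.norm(A)
    n_sq = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2 ** n_sq
    result = np.eye(A.shape[0], dtype=B.dtype)
    term = np.eye(A.shape[0], dtype=B.dtype)
    for k in range(1, terms):
        term = term @ B / k
        result = result + term
    for _ in range(n_sq):
        result = result @ result
    return result

M12 = np.zeros((4, 4))
M12[1, 2], M12[2, 1] = -1.0, 1.0
S12 = 0.25 * (gamma[1] @ gamma[2] - gamma[2] @ gamma[1])

def vector_rep(phi3):
    # Omega_12 = -Omega_21 = -phi3
    return expm(-phi3 * M12)

def spin_rep(phi3):
    return expm(-phi3 * S12)

print(np.allclose(vector_rep(2 * np.pi), np.eye(4)))  # Lambda is the identity
print(np.allclose(spin_rep(2 * np.pi), -np.eye(4)))   # but S[Lambda] = -1
print(np.allclose(spin_rep(4 * np.pi), np.eye(4)))    # a 4*pi rotation gives +1 again
```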
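Note that we have
\[ S^{0i} = \frac{1}{2}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & \sigma^i\\ -\sigma^i & 0 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} -\sigma^i & 0\\ 0 & \sigma^i \end{pmatrix}. \]
Then the corresponding result is
Proposition. Write $\chi = (\chi_1, \chi_2, \chi_3)$. Then if
\[ \Omega_{0i} = -\Omega_{i0} = -\chi_i, \]
then
\[ \Lambda = \exp\left(\frac{1}{2}\Omega_{\rho\sigma}M^{\rho\sigma}\right) \]
is the boost in the $\chi$ direction, and
\[ S[\Lambda] = \exp\left(\frac{1}{2}\Omega_{\rho\sigma}S^{\rho\sigma}\right) = \begin{pmatrix} e^{\chi\cdot\sigma/2} & 0\\ 0 & e^{-\chi\cdot\sigma/2} \end{pmatrix}. \]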
Now what is this $S[\Lambda]$ we have constructed? To begin with, writing $S[\Lambda]$
is very bad notation, because $S[\Lambda]$ doesn’t only depend on what $\Lambda$ is, but
also on additional information about how we “got to” $\Lambda$, i.e. the values of $\Omega_{\rho\sigma}$. So
going from $1$ to $1$ by rotating by $2\pi$ is different from going from $1$ to $1$ by doing
nothing. However, physicists like to be confusing and this will be the notation
used in the course.
Yet, $S[\Lambda]$ is indeed a representation of something. Each point of this “something”
consists of an element of the Lorentz group, and also information on
how we got there (formally, a homotopy class of paths from the identity to $\Lambda$).
Mathematicians have come up with a fancy name for this, the universal
cover, and it can be naturally given the structure of a Lie group. It turns out
in this case, this universal cover is a double cover, which means that for each
Lorentz transformation $\Lambda$, there are only two non-equivalent ways of getting to
$\Lambda$.
Note that the previous statement depends crucially on what we mean by
ways of getting to $\Lambda$ being “equivalent”. We previously saw that for a rotation,
we can always add $2n\pi$ to the rotation angle to get a different $\Omega_{\rho\sigma}$. However,
it is a fact that whenever we add $4\pi$ to the angle, we will always get back the
same $S[\Lambda]$ for any representation $S$ of the Lorentz algebra. So the only two
non-equivalent ways are the original one, and the original one plus $2\pi$.
(The actual reasoning goes the other way round. We know for geometric reasons that
adding $4\pi$ will give an equivalent path, and thus it must be the case that any
representation must not change when we add $4\pi$. Trying to supply further details
and justification for what is going on would bring us deep into the beautiful
world of algebraic topology.)
We give a name to this group.
Definition (Spin group). The spin group $\mathrm{Spin}(1, 3)$ is the universal cover of
$\mathrm{SO}^+(1, 3)$. This comes with a canonical surjection to $\mathrm{SO}^+(1, 3)$, sending “$\Lambda$ and
a path to $\Lambda$” to $\Lambda$.
As mentioned, physicists like to be confusing, and we will in the future keep
talking about “representation of the Lorentz group”, when we actually mean the
representation of the spin group.
So we have the following maps of (Lie) groups:
\[ \mathrm{Spin}(1, 3) \twoheadrightarrow \mathrm{SO}^+(1, 3) \hookrightarrow \mathrm{O}(1, 3). \]
This diagram pretty much summarizes the obstructions to lifting a representation
of the Lorentz algebra to the Lorentz group. What a representation of the Lorentz
algebra gives us is actually a representation of the group $\mathrm{Spin}(1, 3)$, and we want
to turn this into a representation of $\mathrm{O}(1, 3)$.
If we have a representation $D$ of $\mathrm{O}(1, 3)$, then we can easily produce a
representation of $\mathrm{Spin}(1, 3)$: given an element $M \in \mathrm{Spin}(1, 3)$, we need to
produce some matrix. We simply apply the above map to obtain an element of
$\mathrm{O}(1, 3)$, and then the representation $D$ gives us the matrix we wanted.
However, going the other way round is harder. If we have a representation of
$\mathrm{Spin}(1, 3)$, then for that to give a representation of $\mathrm{SO}^+(1, 3)$, we need to make
sure that different ways of getting to a $\Lambda$ don’t give different matrices in the
representation. If this is true, then we have found ourselves a representation
of $\mathrm{SO}^+(1, 3)$. After that, we then need to decide what we want to do with the
elements of $\mathrm{O}(1, 3)$ not in $\mathrm{SO}^+(1, 3)$, and this again involves some work.
4.3 Properties of the spin representation
We have produced a representation of the Lorentz group (or rather, of the spin
group), which acts on some vector space $V \cong \mathbb{C}^4$. Its elements are known as Dirac spinors.
Definition
(Dirac spinor)
.
A Dirac spinor is a vector in the representation
space of the spin representation. It may also refer to such a vector for each point
in space.
Our ultimate goal is to construct an action that involves a spinor. So we
would want to figure out a way to get a number out of a spinor.
In the case of a 4-vector, we had these things called covectors that lived in the
“dual space”. A covector $\lambda$ can eat up a vector $v$ and spit out a number $\lambda(v)$.
Often, we write the covector as $\lambda_\mu$ and the vector as $v^\mu$, and then $\lambda(v) = \lambda_\mu v^\mu$.
When written out like a matrix, a covector is represented by a “row vector”.
Under a Lorentz transformation, these objects transform as
\[ \lambda \mapsto \lambda\Lambda^{-1}, \qquad v \mapsto \Lambda v. \]
(What do we mean by $\lambda \mapsto \lambda\Lambda^{-1}$? If we think of $\lambda$ as a row vector, then this is
just matrix multiplication. However, we can think of it without picking a basis
as follows: $\lambda\Lambda^{-1}$ is a covector, so it is determined by what it does to a vector
$v$. We then define $(\lambda\Lambda^{-1})(v) = \lambda(\Lambda^{-1}v)$.)
Then the result $\lambda v$ transforms as
\[ \lambda v \mapsto \lambda\Lambda^{-1}\Lambda v = \lambda v. \]
So the resulting number does not change under Lorentz transformations.
(Mathematically, this says a covector lives in the dual representation space of the
Lorentz group.)
Moreover, given a vector, we can turn it into a covector in a canonical way,
by taking the transpose and then inserting some funny negative signs in the
space components.
We want to do the same for spinors. This terminology may or may not be
standard:
Definition (Cospinor). A cospinor is an element in the dual space to the space of
spinors, i.e. a cospinor $X$ is a linear map that takes in a spinor $\psi$ as an argument
and returns a number $X\psi$. A cospinor can be represented as a “row vector” and
transforms under $\Lambda$ as
\[ X \mapsto XS[\Lambda]^{-1}. \]
This is a definition we can always make. The hard part is to produce some
actual cospinors. To figure out how we can do so, it is instructive to figure out
what $S[\Lambda]^{-1}$ is!
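We begin with some computations using the $\gamma^\mu$ matrices.
Proposition. We have
\[ \gamma^0\gamma^\mu\gamma^0 = (\gamma^\mu)^\dagger. \]
Proof. This is true by checking all possible $\mu$.
Proposition.
\[ S[\Lambda]^{-1} = \gamma^0 S[\Lambda]^\dagger\gamma^0, \]
where $S[\Lambda]^\dagger$ denotes the Hermitian conjugate as a matrix (under the usual basis).
Proof. We note that
\[ (S^{\mu\nu})^\dagger = \frac{1}{4}[(\gamma^\nu)^\dagger, (\gamma^\mu)^\dagger] = -\gamma^0\left(\frac{1}{4}[\gamma^\mu, \gamma^\nu]\right)\gamma^0 = -\gamma^0 S^{\mu\nu}\gamma^0. \]
So we have
\[ S[\Lambda]^\dagger = \exp\left(\frac{1}{2}\Omega_{\mu\nu}(S^{\mu\nu})^\dagger\right) = \exp\left(-\frac{1}{2}\gamma^0\Omega_{\mu\nu}S^{\mu\nu}\gamma^0\right) = \gamma^0 S[\Lambda]^{-1}\gamma^0, \]
using the fact that $(\gamma^0)^2 = 1$ and $\exp(-A) = (\exp A)^{-1}$. Multiplying
on both sides by $\gamma^0$ gives the desired formula.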
We now come to our acclaimed result:
Proposition. If $\psi$ is a Dirac spinor, then
\[ \bar\psi = \psi^\dagger\gamma^0 \]
is a cospinor.
Proof. $\bar\psi$ transforms as
\[ \bar\psi \mapsto \psi^\dagger S[\Lambda]^\dagger\gamma^0 = \psi^\dagger\gamma^0(\gamma^0 S[\Lambda]^\dagger\gamma^0) = \bar\psi S[\Lambda]^{-1}. \]
Definition (Dirac adjoint). For any Dirac spinor $\psi$, its Dirac adjoint is given
by
\[ \bar\psi = \psi^\dagger\gamma^0. \]
Thus we immediately get
Corollary. For any spinor $\psi$, the quantity $\bar\psi\psi$ is a scalar, i.e. it doesn’t transform
under a Lorentz transformation.
The next thing we want to do is to construct 4-vectors out of spinors. While
the spinors do have 4 components, they aren’t really related to 4-vectors, and
transform differently. However, we do have that thing called $\gamma^\mu$, and the indexing
by $\mu$ should suggest that $\gamma^\mu$ transforms like a 4-vector. Of course, it is a collection
of matrices and is not actually a 4-vector, just like $\partial_\mu$ behaves like a 4-vector
but isn’t. But it behaves sufficiently like a 4-vector and we can combine it with
other things to get actual 4-vectors.
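Proposition. We have
\[ S[\Lambda]^{-1}\gamma^\mu S[\Lambda] = \Lambda^\mu{}_\nu\gamma^\nu. \]
Proof. We work infinitesimally. So this reduces to
\[ \left(1 - \frac{1}{2}\Omega_{\rho\sigma}S^{\rho\sigma}\right)\gamma^\mu\left(1 + \frac{1}{2}\Omega_{\rho\sigma}S^{\rho\sigma}\right) = \left(1 + \frac{1}{2}\Omega_{\rho\sigma}M^{\rho\sigma}\right)^\mu{}_\nu\gamma^\nu. \]
This becomes
\[ [S^{\rho\sigma}, \gamma^\mu] = -(M^{\rho\sigma})^\mu{}_\nu\gamma^\nu. \]
But we can use the explicit formula for $M$ to compute
\[ -(M^{\rho\sigma})^\mu{}_\nu\gamma^\nu = (\eta^{\sigma\mu}\delta^\rho{}_\nu - \eta^{\rho\mu}\delta^\sigma{}_\nu)\gamma^\nu = \gamma^\rho\eta^{\sigma\mu} - \gamma^\sigma\eta^{\rho\mu}, \]
and we have previously shown this is equal to $[S^{\rho\sigma}, \gamma^\mu]$.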
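The two key identities of this section, $S[\Lambda]^{-1} = \gamma^0 S[\Lambda]^\dagger\gamma^0$ and $S[\Lambda]^{-1}\gamma^\mu S[\Lambda] = \Lambda^\mu{}_\nu\gamma^\nu$, can be tested for a finite, randomly chosen transformation. The following sketch assumes the chiral representation and the conventions above; the random antisymmetric `Omega` and the helper names are illustrative only.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def expm(A, terms=30):
    """Matrix exponential via scaling-and-squaring of a truncated Taylor series."""
    norm = np.linalg.norm(A)
    n_sq = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2 ** n_sq
    result = np.eye(A.shape[0], dtype=B.dtype)
    term = np.eye(A.shape[0], dtype=B.dtype)
    for k in range(1, terms):
        term = term @ B / k
        result = result + term
    for _ in range(n_sq):
        result = result @ result
    return result

def M(rho, sig):
    out = np.zeros((4, 4))
    out[:, sig] += eta[rho, :]
    out[:, rho] -= eta[sig, :]
    return out

def S_gen(rho, sig):
    return 0.25 * (gamma[rho] @ gamma[sig] - gamma[sig] @ gamma[rho])

# A random real antisymmetric Omega_{rho sigma}: a mix of boosts and rotations.
rng = np.random.default_rng(0)
X = rng.normal(scale=0.3, size=(4, 4))
Omega = X - X.T

pairs = [(r, s) for r in range(4) for s in range(4)]
Lam = expm(sum(0.5 * Omega[r, s] * M(r, s) for r, s in pairs))
S = expm(sum(0.5 * Omega[r, s] * S_gen(r, s) for r, s in pairs))
S_inv = np.linalg.inv(S)

adjoint_ok = np.allclose(S_inv, gamma[0] @ S.conj().T @ gamma[0])
vector_ok = all(
    np.allclose(S_inv @ gamma[mu] @ S, sum(Lam[mu, nu] * gamma[nu] for nu in range(4)))
    for mu in range(4)
)

def bar(psi):
    """Dirac adjoint psi^dagger gamma^0 as a row vector."""
    return psi.conj() @ gamma[0]

psi = rng.normal(size=4) + 1j * rng.normal(size=4)
scalar_ok = np.isclose(bar(S @ psi) @ (S @ psi), bar(psi) @ psi)
print(adjoint_ok, vector_ok, scalar_ok)
```

The last check illustrates the corollary that $\bar\psi\psi$ is a scalar, whereas $\psi^\dagger\psi$ would not be invariant under boosts.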
Corollary. The object $\bar\psi\gamma^\mu\psi$ is a Lorentz vector, and $\bar\psi\gamma^\mu\gamma^\nu\psi$ transforms as a
Lorentz tensor.
In $\bar\psi\gamma^\mu\gamma^\nu\psi$, the symmetric part is a Lorentz scalar and is proportional to
$\eta^{\mu\nu}\bar\psi\psi$, and the anti-symmetric part transforms as an antisymmetric Lorentz
tensor, and is proportional to $\bar\psi S^{\mu\nu}\psi$.
4.4 The Dirac equation
Armed with these objects, we can now construct a Lorentz-invariant action. We
will, as before, not provide justification for why we choose this action, but as we
progress we will see some nice properties of it:
Definition (Dirac Lagrangian). The Dirac Lagrangian is given by
\[ \mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi. \]
From this we can get the Dirac equation, which is the equation of motion
obtained by varying $\psi$ and $\bar\psi$ independently. Varying $\bar\psi$, we obtain the equation
Definition (Dirac equation). The Dirac equation is
\[ (i\gamma^\mu\partial_\mu - m)\psi = 0. \]
Note that this is first order in derivatives! This is different from the Klein–Gordon
equation. This is only made possible by the existence of the $\gamma^\mu$ matrices.
If we wanted to write down a first-order equation for a scalar field, there is
nothing to contract $\partial_\mu$ with.
We are often going to meet vectors contracted with $\gamma^\mu$. So we invent a
notation for it:
Notation (Slash notation). We write
\[ A_\mu\gamma^\mu \equiv \slashed{A}. \]
Then the Dirac equation says
\[ (i\slashed{\partial} - m)\psi = 0. \]
Note that the $m$ here means $m1$, for $1$ the identity matrix. Whenever we have a
matrix equation and a number appears, that is what we mean.
Note that the $\gamma^\mu$ matrices are not diagonal. So they mix up different
components of the Dirac spinor. However, magically, it turns out that each
individual component satisfies the Klein–Gordon equation! We know
\[ (i\gamma^\mu\partial_\mu - m)\psi = 0. \]
We now act on the left by another matrix to obtain
\[ (i\gamma^\nu\partial_\nu + m)(i\gamma^\mu\partial_\mu - m)\psi = -(\gamma^\nu\gamma^\mu\partial_\nu\partial_\mu + m^2)\psi = 0. \]
But using the fact that $\partial_\mu\partial_\nu$ commutes, we know that (after some relabelling)
\[ \gamma^\nu\gamma^\mu\partial_\mu\partial_\nu = \frac{1}{2}\{\gamma^\mu, \gamma^\nu\}\partial_\mu\partial_\nu = \partial_\mu\partial^\mu. \]
So this tells us
\[ -(\partial_\mu\partial^\mu + m^2)\psi = 0. \]
Now nothing mixes up different indices, and we know that each component of $\psi$
satisfies the Klein–Gordon equation.
In some sense, the Dirac equation is the “square root” of the Klein–Gordon
equation.
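In momentum space the “square root” statement becomes a simple matrix identity. Acting on a plane wave $e^{-ip\cdot x}$, the operator $i\partial_\mu$ becomes $p_\mu$, so the claim is that $(\slashed{p} + m)(\slashed{p} - m) = (p^2 - m^2)1$. The sketch below checks this in the chiral representation for an arbitrary (off-shell) momentum and for an on-shell one; the numerical values are arbitrary.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def slash(p):
    """gamma^mu p_mu for contravariant components p = (p^0, p^1, p^2, p^3)."""
    return gamma[0] * p[0] - gamma[1] * p[1] - gamma[2] * p[2] - gamma[3] * p[3]

def minkowski_square(p):
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

m = 1.5
p_off = np.array([2.0, 0.3, -0.4, 1.2])  # arbitrary, generally off-shell
product_off = (slash(p_off) + m * np.eye(4)) @ (slash(p_off) - m * np.eye(4))
print(np.allclose(product_off, (minkowski_square(p_off) - m ** 2) * np.eye(4)))

# On-shell momentum: p^0 = sqrt(|p|^2 + m^2), so the product vanishes and any
# solution of the Dirac equation also solves the Klein-Gordon equation.
p_vec = p_off[1:]
p_on = np.concatenate([[np.sqrt(p_vec @ p_vec + m ** 2)], p_vec])
product_on = (slash(p_on) + m * np.eye(4)) @ (slash(p_on) - m * np.eye(4))
print(np.allclose(product_on, np.zeros((4, 4))))
```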
4.5 Chiral/Weyl spinors and γ
5
Recall that if we picked the chiral representation of the Clifford algebra, the
corresponding representation of the spin group is
\[ S[\Lambda] = \begin{cases} \begin{pmatrix} e^{\frac{1}{2}\chi\cdot\sigma} & 0\\ 0 & e^{-\frac{1}{2}\chi\cdot\sigma} \end{pmatrix} & \text{for boosts}\\[2ex] \begin{pmatrix} e^{\frac{i}{2}\phi\cdot\sigma} & 0\\ 0 & e^{\frac{i}{2}\phi\cdot\sigma} \end{pmatrix} & \text{for rotations}. \end{cases} \]
It is pretty clear from the presentation that this is actually just two independent
representations put together, i.e. the representation is reducible. We can then
write our spinor $\psi$ as
\[ \psi = \begin{pmatrix} U_+\\ U_- \end{pmatrix}, \]
where $U_+$ and $U_-$ are 2-complex-component objects. These objects are called
Weyl spinors or chiral spinors.
Definition (Weyl/chiral spinor). A left (right)-handed chiral spinor is a
2-component complex vector $U_+$ (respectively $U_-$) that transforms under the
action of the Lorentz/spin group as follows:
Under a rotation with rotation parameters $\phi$, both of them transform as
\[ U_\pm \mapsto e^{i\phi\cdot\sigma/2}U_\pm. \]
Under a boost $\chi$, they transform as
\[ U_\pm \mapsto e^{\pm\chi\cdot\sigma/2}U_\pm. \]
So these are two two-dimensional representations of the spin group.
We have thus discovered
Proposition. A Dirac spinor is the direct sum of a left-handed chiral spinor
and a right-handed one.
In group theory language, $U_+$ is the $(0, \frac{1}{2})$ representation of the Lorentz
group, and $U_-$ is the $(\frac{1}{2}, 0)$ representation, and $\psi$ is in $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$.
As an element of the representation space, the left-handed part and right-
handed part are indeed completely independent. However, we know that the
evolution of spinors is governed by the Dirac equation. So a natural question to
ask is if the Weyl spinors are coupled in the Dirac Lagrangian.
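Decomposing the Lagrangian in terms of our Weyl spinors, we have
\begin{align*}
\mathcal{L} &= \bar\psi(i\slashed{\partial} - m)\psi\\
&= \begin{pmatrix} U_+^\dagger & U_-^\dagger \end{pmatrix}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}\left[i\begin{pmatrix} 0 & \partial_t + \sigma^i\partial_i\\ \partial_t - \sigma^i\partial_i & 0 \end{pmatrix} - m\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}\right]\begin{pmatrix} U_+\\ U_- \end{pmatrix}\\
&= iU_-^\dagger\sigma^\mu\partial_\mu U_- + iU_+^\dagger\bar\sigma^\mu\partial_\mu U_+ - m(U_+^\dagger U_- + U_-^\dagger U_+),
\end{align*}
where
\[ \sigma^\mu = (1, \sigma), \qquad \bar\sigma^\mu = (1, -\sigma). \]
So the left and right-handed fermions are coupled if and only if the particle is
massive. If the particle is massless, then we have two particles satisfying the
Weyl equation:
Definition (Weyl equation). The Weyl equation is
\[ i\bar\sigma^\mu\partial_\mu U_+ = 0. \]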
This is all good, but we produced these Weyl spinors by noticing that in our
particular chiral basis, the matrices
S
[Λ] looked good. Can we produce a more
intrinsic definition of these Weyl spinors that do not depend on a particular
representation of the Dirac spinors?
The solution is to introduce the magic quantity $\gamma^5$:
Definition ($\gamma^5$).
\[ \gamma^5 = -i\gamma^0\gamma^1\gamma^2\gamma^3. \]
Proposition. We have
\[ \{\gamma^\mu, \gamma^5\} = 0, \qquad (\gamma^5)^2 = 1 \]
for all $\gamma^\mu$, and
\[ [S^{\mu\nu}, \gamma^5] = 0. \]
Since $(\gamma^5)^2 = 1$, we can define projection operators
\[ P_\pm = \frac{1}{2}(1 \pm \gamma^5). \]
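Example. In the chiral representation, we have
\[ \gamma^5 = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}. \]
Then we have
\[ P_+ = \begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix}, \qquad P_- = \begin{pmatrix} 0 & 0\\ 0 & 1 \end{pmatrix}. \]
We can prove, in general, that these are indeed projections:
Proposition.
\[ P_\pm^2 = P_\pm, \qquad P_+P_- = P_-P_+ = 0. \]
Proof. We have
\[ P_\pm^2 = \frac{1}{4}(1 \pm \gamma^5)^2 = \frac{1}{4}(1 + (\gamma^5)^2 \pm 2\gamma^5) = \frac{1}{2}(1 \pm \gamma^5), \]
and
\[ P_+P_- = \frac{1}{4}(1 + \gamma^5)(1 - \gamma^5) = \frac{1}{4}(1 - (\gamma^5)^2) = 0. \]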
We can think of these $P_\pm$ as projection operators to two orthogonal subspaces
of the vector space $V$ of spinors. We claim that these are indeed representations
of the spin group. We define
\[ V_\pm = \{P_\pm\psi : \psi \in V\}. \]
We claim that $S[\Lambda]$ maps $V_\pm$ to itself. To show this, we only have to compute
infinitesimally, i.e. that $S^{\mu\nu}$ maps $V_\pm$ to itself. But it follows immediately from
the fact that $S^{\mu\nu}$ commutes with $\gamma^5$ that
\[ S^{\mu\nu}P_\pm\psi = P_\pm S^{\mu\nu}\psi. \]
We can then define the chiral spinors as
\[ \psi_\pm = P_\pm\psi. \]
It is clear from our previous computation of the $P_\pm$ in the chiral basis that these
agree with what we’ve defined before.
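The stated properties of $\gamma^5$ and of the projectors $P_\pm$ can be verified directly in the chiral representation. This is a minimal sketch, taking $\gamma^5 = -i\gamma^0\gamma^1\gamma^2\gamma^3$ as the sign convention; the variable names are our own.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

gamma5 = -1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
P_plus = 0.5 * (np.eye(4) + gamma5)
P_minus = 0.5 * (np.eye(4) - gamma5)

# gamma^5 is block diagonal diag(1, -1) in the chiral representation.
is_chiral_diag = np.allclose(gamma5, np.diag([1, 1, -1, -1]))

# gamma^5 anticommutes with every gamma^mu and squares to the identity.
anticommutes = all(np.allclose(g @ gamma5 + gamma5 @ g, 0) for g in gamma)
squares_to_one = np.allclose(gamma5 @ gamma5, np.eye(4))

# It commutes with all the generators S^{mu nu} = (1/4)[gamma^mu, gamma^nu].
commutes_with_S = all(
    np.allclose(
        0.25 * (gamma[m] @ gamma[n] - gamma[n] @ gamma[m]) @ gamma5,
        gamma5 @ (0.25 * (gamma[m] @ gamma[n] - gamma[n] @ gamma[m])),
    )
    for m in range(4) for n in range(4)
)

# P_+ and P_- are complementary projectors.
projectors_ok = (np.allclose(P_plus @ P_plus, P_plus)
                 and np.allclose(P_minus @ P_minus, P_minus)
                 and np.allclose(P_plus @ P_minus, 0)
                 and np.allclose(P_plus + P_minus, np.eye(4)))
print(is_chiral_diag, anticommutes, squares_to_one, commutes_with_S, projectors_ok)
```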
4.6 Parity operator
So far, we’ve considered only continuous transformations, i.e. transformations
continuously connected to the identity. These are those that preserve direction
of time and orientation. However, there are two discrete symmetries in the full
Lorentz group, time reversal and parity:
\[ T : (t, \mathbf{x}) \mapsto (-t, \mathbf{x}), \qquad P : (t, \mathbf{x}) \mapsto (t, -\mathbf{x}). \]
Since we defined the spin representation via exponentiating up infinitesimal
transformations, it doesn’t tell us what we are supposed to do for these discrete
symmetries.
However, we do have some clues. Recall that we figured that the $\gamma^\mu$ transformed
like 4-vectors under continuous Lorentz transformations. So we can
postulate that $\gamma^\mu$ also transforms like a 4-vector under these discrete symmetries.
We will only do it for the parity transformation, since it behaves interestingly
for spinors. We will suppose that our parity operator acts on the $\gamma^\mu$ as
\[ P : \gamma^0 \mapsto \gamma^0, \qquad \gamma^i \mapsto -\gamma^i. \]
Because of the Clifford algebra relations, we can write this as
\[ P : \gamma^\mu \mapsto \gamma^0\gamma^\mu\gamma^0. \]
So we see that $P$ is actually conjugating by $\gamma^0$ (note that $(\gamma^0)^{-1} = \gamma^0$), and this
is something we can generalize to everything. Since all the interesting matrices
are generated by multiplying and adding the $\gamma^\mu$ together, all matrices transform
via conjugation by $\gamma^0$. So it is reasonable to assume that $P$ is $\gamma^0$.
Axiom. The parity operator $P$ acts on the spinors as $\gamma^0$.
So in particular, we have
Proposition.
\[ P : \psi \mapsto \gamma^0\psi, \qquad P : \bar\psi \mapsto \bar\psi\gamma^0. \]
Proposition. We have
\[ P : \gamma^5 \mapsto -\gamma^5. \]
Proof. The $\gamma^1, \gamma^2, \gamma^3$ each pick up a negative sign, while $\gamma^0$ does not change.
Now something interesting happens. Since $P$ switches the sign of $\gamma^5$, it
exchanges $P_+$ and $P_-$. So we have
Proposition. We have
\[ P : P_\pm \mapsto P_\mp. \]
In particular, we have
\[ P\psi_\pm = \psi_\mp. \]
As $P$ still acts as right-multiplication-by-$P^{-1}$ on the cospinors, we know
that scalar quantities etc. are still preserved when we act by $P$. However, if we
construct something with $\gamma^5$, then funny things happen, because $\gamma^5$ gains a sign
when we transform by $P$. For example,
\[ P : \bar\psi\gamma^5\psi \mapsto -\bar\psi\gamma^5\psi. \]
Note that here it is important that we view $\gamma^5$ as a fixed matrix that does
not transform, and $P$ only acts on $\bar\psi$ and $\psi$. Otherwise, the quantity does not
change. If we make $P$ act on everything, then (almost) by definition the resulting
quantity would remain unchanged under any transformation. Alternatively, we
can think of $P$ as acting on $\gamma^5$ and leaving other things fixed.
Definition (Pseudoscalar). A pseudoscalar is a number that does not change
under Lorentz boosts and rotations, but changes sign under a parity operator.
Similarly, we can look at what happens when we apply $P$ to $\bar\psi\gamma^5\gamma^\mu\psi$. This
becomes
\[ \bar\psi\gamma^5\gamma^\mu\psi \mapsto \bar\psi\gamma^0\gamma^5\gamma^\mu\gamma^0\psi = \begin{cases} -\bar\psi\gamma^5\gamma^\mu\psi & \mu = 0\\ \bar\psi\gamma^5\gamma^\mu\psi & \mu \neq 0. \end{cases} \]
This is known as an axial vector.
Definition (Axial vector). An axial vector is a quantity that transforms as
a vector under rotations and boosts, but gains an additional sign when transforming
under parity.

Type            Example
Scalar          $\bar\psi\psi$
Vector          $\bar\psi\gamma^\mu\psi$
Tensor          $\bar\psi S^{\mu\nu}\psi$
Pseudoscalar    $\bar\psi\gamma^5\psi$
Axial vector    $\bar\psi\gamma^5\gamma^\mu\psi$

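We can now add extra terms to $\mathcal{L}$ that use $\gamma^5$. These terms will typically break
the parity invariance of the theory. Of course, it doesn’t always break parity
invariance, since we can multiply two pseudoscalars together to get a scalar.
It turns out nature does use $\gamma^5$, and the resulting terms do break parity. The classic example
is a $W$-boson, which is a vector field that couples only to left-handed fermions.
The Lagrangian is given by
\[ \mathcal{L} = \cdots + \frac{g}{2}W_\mu\bar\psi\gamma^\mu(1 - \gamma^5)\psi, \]
where $(1 - \gamma^5)$ acts as the left-handed projection. (This is the form in the common
convention where the left-handed projector is $\frac{1}{2}(1 - \gamma^5)$; with the sign of $\gamma^5$ and the
naming of $U_\pm$ chosen in this chapter, the same role is played by $P_+ = \frac{1}{2}(1 + \gamma^5)$.)
A theory which puts $\psi_\pm$ on an equal footing is known as vector-like. Otherwise,
it is known as chiral.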
4.7 Solutions to Dirac’s equation
Degrees of freedom
If we just look at the definition of a Dirac spinor, it has 4 complex components,
so 8 real degrees of freedom. However, the equation of motion imposes some
restriction on how a Dirac spinor can behave. We can compute the conjugate
momentum of the Dirac spinor to find
\[ \pi_\psi = \frac{\partial\mathcal{L}}{\partial\dot\psi} = i\psi^\dagger. \]
So the phase space is parameterized by $\psi$ and $\psi^\dagger$, as opposed to $\psi$ and $\dot\psi$ in the
usual case. But $\psi^\dagger$ is completely determined by $\psi$! So the phase space really
just has 8 real degrees of freedom, as opposed to 16, and thus the number of
degrees of freedom of $\psi$ itself is just 4 (the number of degrees of freedom is in
general half the number of degrees of freedom of the phase space). We will see
this phenomenon in the solutions we come up with below.
Plane wave solutions
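We want to solve the Dirac equation
\[ (i\slashed{\partial} - m)\psi = 0. \]
We start with the simplest ansatz (“guess”)
\[ \psi = u_{\mathbf{p}}e^{-ip\cdot x}, \]
where $u_{\mathbf{p}}$ is a constant 4-component spinor which, as the notation suggests,
depends on momentum. Putting this into the equation, we get
\[ (\gamma^\mu p_\mu - m)u_{\mathbf{p}} = 0. \]
We write out the LHS to get
\[ \begin{pmatrix} -m & p_\mu\sigma^\mu\\ p_\mu\bar\sigma^\mu & -m \end{pmatrix}u_{\mathbf{p}} = 0. \]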
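Proposition. We have a solution
\[ u_{\mathbf{p}} = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi\\ \sqrt{p\cdot\bar\sigma}\,\xi \end{pmatrix} \]
for any 2-component spinor $\xi$ normalized such that $\xi^\dagger\xi = 1$.
Proof. We write
\[ u_{\mathbf{p}} = \begin{pmatrix} u_1\\ u_2 \end{pmatrix}. \]
Then our equation gives
\begin{align*}
(p\cdot\sigma)u_2 &= mu_1\\
(p\cdot\bar\sigma)u_1 &= mu_2.
\end{align*}
Either of these equations can be derived from the other, since
\[ (p\cdot\sigma)(p\cdot\bar\sigma) = p_0^2 - p_ip_j\sigma^i\sigma^j = p_0^2 - p_ip_j\delta^{ij} = p_\mu p^\mu = m^2. \]
We try the ansatz
\[ u_1 = (p\cdot\sigma)\xi' \]
for a spinor $\xi'$. Then our equation above gives
\[ u_2 = \frac{1}{m}(p\cdot\bar\sigma)(p\cdot\sigma)\xi' = m\xi'. \]
So any vector of the form
\[ u_{\mathbf{p}} = A\begin{pmatrix} (p\cdot\sigma)\xi'\\ m\xi' \end{pmatrix} \]
is a solution, for $A$ a constant. To make this look more symmetric, we choose
\[ A = \frac{1}{m}, \qquad \xi' = \sqrt{p\cdot\bar\sigma}\,\xi, \]
with $\xi$ another spinor. Then we have
\[ u_1 = \frac{1}{m}(p\cdot\sigma)\sqrt{p\cdot\bar\sigma}\,\xi = \sqrt{p\cdot\sigma}\,\xi. \]
This gives our claimed solution.
Note that solving the Dirac equation has reduced the number of dimensions
from 4 to 2, as promised.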
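The proposition involves square roots of $2 \times 2$ matrices, which makes it a good candidate for a numerical check. The sketch below builds $u_{\mathbf{p}}$ for an arbitrarily chosen on-shell momentum and verifies that it is annihilated by the matrix written out above. The helper `herm_sqrt` (square root of a positive Hermitian matrix via its eigendecomposition) and the numerical values are our own choices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def herm_sqrt(A):
    """Square root of a positive semi-definite Hermitian matrix."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T

m = 1.5
p_vec = np.array([0.3, -0.4, 1.2])      # spatial components p^i
E = np.sqrt(p_vec @ p_vec + m ** 2)     # on-shell energy

# p.sigma = p_mu sigma^mu = E - p^i sigma^i,  p.sigmabar = E + p^i sigma^i
p_dot_sigma = E * I2 - sum(pi * s for pi, s in zip(p_vec, sigma))
p_dot_sigmabar = E * I2 + sum(pi * s for pi, s in zip(p_vec, sigma))

xi = np.array([1.0, 0.0], dtype=complex)  # normalized: xi^dagger xi = 1
u = np.concatenate([herm_sqrt(p_dot_sigma) @ xi, herm_sqrt(p_dot_sigmabar) @ xi])

# (gamma^mu p_mu - m) in the chiral representation, in 2x2 blocks.
dirac_op = np.block([[-m * I2, p_dot_sigma], [p_dot_sigmabar, -m * I2]])
residual = dirac_op @ u
print(np.abs(residual).max())

# The identity (p.sigma)(p.sigmabar) = m^2 used in the proof.
print(np.allclose(p_dot_sigma @ p_dot_sigmabar, m ** 2 * I2))
```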
Example. A stationary, massive particle of mass m and p = 0 has
u
p
=
m
ξ
ξ
for any 2-component spinor
ξ
. Under a spacial rotation, we have a transformation
ξ 7→ e
iφ·σ/2
ξ,
which rotates ξ.
Now let’s pick $\boldsymbol{\varphi} = (0, 0, \varphi_3)$, so that
$\boldsymbol{\varphi}\cdot\boldsymbol{\sigma}$ is a multiple of $\sigma^3$. We then pick
\[ \xi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
This is an eigenvector of $\sigma^3$ with positive eigenvalue. So it is invariant
under rotations about the $x^3$ axis (up to a phase). We thus say this particle has
spin up along the $x^3$ axis, and this will indeed give rise to quantum-mechanical
spin as we go further on.
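Now suppose our particle is moving in the $x^3$ direction, so that
\[ p^\mu = (E, 0, 0, p^3). \]
Then the solution to the Dirac equation becomes
\[ u_p = \begin{pmatrix} \sqrt{p\cdot\sigma}\begin{pmatrix}1\\0\end{pmatrix} \\ \sqrt{p\cdot\bar{\sigma}}\begin{pmatrix}1\\0\end{pmatrix} \end{pmatrix} = \begin{pmatrix} \sqrt{E - p^3}\begin{pmatrix}1\\0\end{pmatrix} \\ \sqrt{E + p^3}\begin{pmatrix}1\\0\end{pmatrix} \end{pmatrix}. \]
In the massless limit, we have $E \to p^3$. So this becomes
\[ \sqrt{2E}\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}. \]
For a spin down field, i.e.
\[ \xi = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \]
we have
\[ u_p = \begin{pmatrix} \sqrt{E + p^3}\begin{pmatrix}0\\1\end{pmatrix} \\ \sqrt{E - p^3}\begin{pmatrix}0\\1\end{pmatrix} \end{pmatrix} \to \sqrt{2E}\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}. \]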
Helicity operator
Definition (Helicity operator). The helicity operator is a projection of angular
momentum along the direction of motion
\[ h = \hat{\mathbf{p}}\cdot\mathbf{J} = \frac{1}{2}\hat{p}_i \begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix}. \]
From the above expressions, we know that a massless spin up particle has
$h = +\frac{1}{2}$, while a massless spin down particle has helicity $-\frac{1}{2}$.
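The helicity values can be read off directly by applying $h$ to the two massless spinors found above. The sketch below does this for motion along $+x^3$, where $h = \frac{1}{2}\,\mathrm{diag}(\sigma^3, \sigma^3)$ is diagonal; the helper names and the value of $E$ are illustrative assumptions.

```python
import math

def apply_helicity_z(u):
    """Apply h = (1/2) diag(sigma^3, sigma^3), the helicity operator for
    momentum along +x^3, to a 4-component spinor."""
    diag = (1, -1, 1, -1)  # diagonal entries of diag(sigma^3, sigma^3)
    return [0.5 * d * c for d, c in zip(diag, u)]

def helicity_eigenvalue(u):
    """Return lambda with h u = lambda u, assuming u is an eigenvector of h."""
    hu = apply_helicity_z(u)
    idx = next(i for i, c in enumerate(u) if c != 0)
    return hu[idx] / u[idx]

E = 2.0  # arbitrary energy, for illustration
spin_up = [0, 0, math.sqrt(2 * E), 0]    # massless limit of the spin up solution
spin_down = [0, math.sqrt(2 * E), 0, 0]  # massless limit of the spin down solution

print(helicity_eigenvalue(spin_up), helicity_eigenvalue(spin_down))
```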
Negative frequency solutions
We now try a different ansatz. We try
\[ \psi = v_p e^{ip\cdot x}. \]
These are known as negative frequency solutions. The $u_p$ solutions we’ve found
are known as positive frequency solutions.
The solution of the Dirac equation then becomes
\[ v_p = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta \\ -\sqrt{p\cdot\bar{\sigma}}\,\eta \end{pmatrix} \]
for some 2-component $\eta$ with $\eta^\dagger \eta = 1$.
This is exactly the same as $u_p$, but with a relative minus sign between $v_1$
and $v_2$.
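For completeness, here is a short check that the relative minus sign is exactly what the negative frequency ansatz requires. It uses only the chiral-basis form of $\gamma^\mu$ and the identity $\sqrt{p\cdot\sigma}\sqrt{p\cdot\bar{\sigma}} = m$ from the proof above; the labels $v_1, v_2$ for the upper and lower components are the same as in the text.

```latex
% Substituting \psi = v_p e^{ip\cdot x} into (i\slashed{\partial} - m)\psi = 0
% gives (\gamma^\mu p_\mu + m) v_p = 0, i.e.
\begin{pmatrix} m & p\cdot\sigma \\ p\cdot\bar{\sigma} & m \end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = 0 .
% With v_1 = \sqrt{p\cdot\sigma}\,\eta and v_2 = -\sqrt{p\cdot\bar{\sigma}}\,\eta:
m v_1 + (p\cdot\sigma) v_2
  = m\sqrt{p\cdot\sigma}\,\eta - \sqrt{p\cdot\sigma}\,\bigl(\sqrt{p\cdot\sigma}\sqrt{p\cdot\bar{\sigma}}\bigr)\eta
  = 0,
\qquad
(p\cdot\bar{\sigma}) v_1 + m v_2
  = \sqrt{p\cdot\bar{\sigma}}\,\bigl(\sqrt{p\cdot\bar{\sigma}}\sqrt{p\cdot\sigma}\bigr)\eta - m\sqrt{p\cdot\bar{\sigma}}\,\eta
  = 0 .
```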
A basis
It is useful to introduce a basis given by
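\[ \eta^1 = \xi^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \eta^2 = \xi^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]
We then define
\[ u^s_p = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^s \\ \sqrt{p\cdot\bar{\sigma}}\,\xi^s \end{pmatrix}, \quad v^s_p = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta^s \\ -\sqrt{p\cdot\bar{\sigma}}\,\eta^s \end{pmatrix}. \]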
These then form a basis for the solution space of the Dirac equation.
4.8 Symmetries and currents
We now look at what Noether’s theorem tells us about spinor fields. We will
only briefly state the corresponding results because the calculations are fairly
routine. The only interesting one to note is that the Lagrangian is given by
\[ \mathcal{L} = \bar{\psi}(i\slashed{\partial} - m)\psi, \]
while the equations of motion say
\[ (i\slashed{\partial} - m)\psi = 0. \]
So whenever the equations of motion are satisfied, the Lagrangian vanishes. This
is going to simplify quite a few of our results.
Translation symmetry
As usual, spacetime translation is a symmetry, and the infinitesimal transforma-
tion is given by
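\[ x^\mu \mapsto x^\mu - \varepsilon^\mu, \qquad \psi \mapsto \psi + \varepsilon^\mu \partial_\mu \psi. \]
So we have
\[ \delta\psi = \varepsilon^\mu \partial_\mu \psi, \quad \delta\bar{\psi} = \varepsilon^\mu \partial_\mu \bar{\psi}. \]
Then we find a conserved current
\[ T^{\mu\nu} = i\bar{\psi}\gamma^\mu \partial^\nu \psi - \eta^{\mu\nu}\mathcal{L} = i\bar{\psi}\gamma^\mu \partial^\nu \psi. \]
In particular, we have a conserved energy of
\begin{align*}
E &= \int T^{00}\, d^3x \\
  &= \int d^3x\; i\bar{\psi}\gamma^0 \dot{\psi} \\
  &= \int d^3x\; \bar{\psi}(-i\gamma^i \partial_i + m)\psi,
\end{align*}
where we used the equation of motion in the last line.
Similarly, we have a conserved total momentum
\[ P^i = \int d^3x\; T^{0i} = \int d^3x\; i\psi^\dagger \partial^i \psi. \]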
Lorentz transformations
We can also consider Lorentz transformations. This gives a transformation
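\[ \psi \mapsto S[\Lambda]\psi(x^\mu - \omega^\mu{}_\nu x^\nu). \]
Taylor expanding, we get
\begin{align*}
\delta\psi &= -\omega^\mu{}_\nu x^\nu \partial_\mu \psi + \frac{1}{2}\omega_{\rho\sigma}(S^{\rho\sigma})\psi \\
           &= -\omega^{\mu\nu}\left[ x_\nu \partial_\mu \psi - \frac{1}{2}(S_{\mu\nu})\psi \right].
\end{align*}
Similarly, we have
\[ \delta\bar{\psi} = -\omega^{\mu\nu}\left[ x_\nu \partial_\mu \bar{\psi} + \frac{1}{2}\bar{\psi}(S_{\mu\nu}) \right]. \]
The change in sign is because $\bar{\psi}$ transforms as $S[\Lambda]^{-1}$, and taking
the inverse of an exponential gives us a negative sign.
So we can write the conserved current as
\begin{align*}
j^\mu &= -\omega^{\rho\sigma}\left[ i\bar{\psi}\gamma^\mu (x_\sigma \partial_\rho \psi - S_{\rho\sigma}\psi) \right] + \omega^\mu{}_\nu x^\nu \mathcal{L} \\
      &= -\omega^{\rho\sigma}\left[ i\bar{\psi}\gamma^\mu (x_\sigma \partial_\rho \psi - S_{\rho\sigma}\psi) \right].
\end{align*}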
So if we allow ourselves to pick different $\omega^{\rho\sigma}$, we would have one
conserved current for each choice:
\[ (J^\mu)^{\rho\sigma} = x^\sigma T^{\mu\rho} - x^\rho T^{\mu\sigma} - i\bar{\psi}\gamma^\mu S^{\rho\sigma}\psi. \]
In the case of a scalar field, we only have the first two terms. We will later see
that the extra term will give us spin $\frac{1}{2}$ after quantization. For example, we have
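\[ (J^0)^{ij} = -i\bar{\psi}\gamma^0 S^{ij}\psi = \frac{1}{2}\varepsilon^{ijk}\psi^\dagger \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix}\psi. \]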
This is our angular momentum operator.
Internal vector symmetry
We also have an internal vector symmetry
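\[ \psi \mapsto e^{-i\alpha}\psi. \]
So the infinitesimal transformation is
\[ \delta\psi = -i\alpha\psi. \]
We then obtain
\[ j^\mu_V = \bar{\psi}\gamma^\mu \psi. \]
We can indeed check that
\[ \partial_\mu j^\mu_V = (\partial_\mu \bar{\psi})\gamma^\mu \psi + \bar{\psi}\gamma^\mu (\partial_\mu \psi) = im\bar{\psi}\psi - im\bar{\psi}\psi = 0. \]
The conserved quantity is then
\[ Q = \int d^3x\; j^0_V = \int d^3x\; \bar{\psi}\gamma^0 \psi = \int d^3x\; \psi^\dagger \psi. \]
We will see that this is electric charge/particle number.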
Axial symmetry
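When $m = 0$, we have an extra internal symmetry obtained by rotating left-handed
and right-handed fermions in opposite directions. These symmetries are called
chiral. We consider the transformations
\[ \psi \mapsto e^{i\alpha\gamma^5}\psi, \quad \bar{\psi} \mapsto \bar{\psi}e^{i\alpha\gamma^5}. \]
This gives us an axial current
\[ j^\mu_A = \bar{\psi}\gamma^\mu \gamma^5 \psi. \]
This is an axial vector. We claim that this is conserved only when $m = 0$. We
have
\begin{align*}
\partial_\mu j^\mu_A &= (\partial_\mu \bar{\psi})\gamma^\mu \gamma^5 \psi + \bar{\psi}\gamma^\mu \gamma^5 \partial_\mu \psi \\
                     &= 2im\bar{\psi}\gamma^5 \psi.
\end{align*}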
This time the two terms add rather than subtract. So this vanishes iff m = 0.
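The intermediate step can be spelled out as follows. The only inputs are the equations of motion in the form $\gamma^\mu\partial_\mu\psi = -im\psi$ and $(\partial_\mu\bar{\psi})\gamma^\mu = im\bar{\psi}$, together with the standard fact that $\gamma^5$ anti-commutes with every $\gamma^\mu$.

```latex
\begin{align*}
(\partial_\mu \bar{\psi})\gamma^\mu \gamma^5 \psi &= im\,\bar{\psi}\gamma^5\psi, \\
\bar{\psi}\gamma^\mu \gamma^5 \partial_\mu \psi
  &= -\bar{\psi}\gamma^5 \gamma^\mu \partial_\mu \psi
   = -\bar{\psi}\gamma^5(-im\psi)
   = im\,\bar{\psi}\gamma^5\psi, \\
\partial_\mu j^\mu_A &= 2im\,\bar{\psi}\gamma^5\psi .
\end{align*}
```

The extra sign from moving $\gamma^5$ past $\gamma^\mu$ is what turns the cancellation seen for the vector current into an addition.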
It turns out that this is an anomalous symmetry, in the sense that the
classical axial symmetry does not survive quantization.
5 Quantizing the Dirac field
5.1 Fermion quantization
Quantization
We now start to quantize our spinor field! As it will turn out, this is complicated.
Recall that for a complex scalar field, classically a solution of momentum $p$
can be written as
\[ b_p e^{-ip\cdot x} + c_p^* e^{ip\cdot x} \]
for some constants $b_p, c_p$. Here the first term is the positive frequency solution,
and the second term is the negative frequency solution. To quantize this field,
we then promoted the $b_p, c_p$ to operators, and we had
\[ \phi(x) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_p}} \left( b_p e^{-ip\cdot x} + c_p^\dagger e^{ip\cdot x} \right). \]
Similarly, classically, every positive frequency solution to Dirac’s equation of
momentum $p$ can be written as
\[ (b^1_p u^1_p + b^2_p u^2_p)e^{-ip\cdot x} \]
for some $b^s_p$, and similarly a negative frequency solution can be written as
\[ (c^1_p v^1_p + c^2_p v^2_p)e^{ip\cdot x} \]
for some $c^s_p$. So when we quantize our field, we could promote the $b_p$ and
$c_p$ to operators and obtain
\begin{align*}
\psi(x) &= \sum_{s=1}^2 \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_p}} \left( b^s_p u^s_p e^{-ip\cdot x} + c^{s\dagger}_p v^s_p e^{ip\cdot x} \right) \\
\psi^\dagger(x) &= \sum_{s=1}^2 \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_p}} \left( b^{s\dagger}_p u^{s\dagger}_p e^{ip\cdot x} + c^s_p v^{s\dagger}_p e^{-ip\cdot x} \right).
\end{align*}
In these expressions, $b^s_p$ and $c^s_p$ are operators, and $u^s_p$ and $v^s_p$ are
the elements of the spinor space we have previously found, which are concrete
classical objects.
We can compute the conjugate momentum to be
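\[ \pi = \frac{\partial \mathcal{L}}{\partial \dot{\psi}} = i\bar{\psi}\gamma^0 = i\psi^\dagger. \]
The Hamiltonian density is then given by
\begin{align*}
\mathcal{H} &= \pi\dot{\psi} - \mathcal{L} \\
            &= i\psi^\dagger \dot{\psi} - i\bar{\psi}\gamma^0 \dot{\psi} - i\bar{\psi}\gamma^i \partial_i \psi + m\bar{\psi}\psi \\
            &= \bar{\psi}(-i\gamma^i \partial_i + m)\psi.
\end{align*}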
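What are the appropriate commutation relations to impose on the $\psi$, or,
equivalently, $b^s_p$ and $c^s_p$? The naive approach would be to impose
\[ [\psi_\alpha(\mathbf{x}), \psi_\beta(\mathbf{y})] = [\psi^\dagger_\alpha(\mathbf{x}), \psi^\dagger_\beta(\mathbf{y})] = 0 \]
and
\[ [\psi_\alpha(\mathbf{x}), \psi^\dagger_\beta(\mathbf{y})] = \delta_{\alpha\beta}\delta^3(\mathbf{x} - \mathbf{y}). \]
These are in turn equivalent to requiring
\begin{align*}
[b^r_p, b^{s\dagger}_q] &= (2\pi)^3 \delta^{rs}\delta^3(\mathbf{p} - \mathbf{q}), \\
[c^r_p, c^{s\dagger}_q] &= -(2\pi)^3 \delta^{rs}\delta^3(\mathbf{p} - \mathbf{q}),
\end{align*}
and all others vanishing.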
If we do this, we can do the computations and find that (after normal
ordering) we obtain a Hamiltonian of
\[ H = \int \frac{d^3p}{(2\pi)^3} E_p \left( b^{s\dagger}_p b^s_p - c^{s\dagger}_p c^s_p \right). \]
Note the minus sign before $c^{s\dagger}_p c^s_p$. This is a disaster! While we can
define a vacuum $|0\rangle$ such that $b^s_p|0\rangle = c^s_p|0\rangle = 0$, this isn’t
really a vacuum, because we can keep applying $c^\dagger_p$ to get lower and lower
energies. This is totally bad.
What went wrong? The answer comes from actually looking at particles in
the Real World™. In the case of scalar fields, the commutation relation
\[ [a^\dagger_p, a^\dagger_q] = 0 \]
tells us that
\[ a^\dagger_p a^\dagger_q |0\rangle = |p, q\rangle = |q, p\rangle. \]
This means the particles satisfy Bose statistics, namely swapping two particles
gives the same state.
However, we know that fermions actually satisfy Fermi statistics. Swapping
two particles gives us a negative sign. So what we really need is
\[ b^{r\dagger}_p b^{s\dagger}_q = -b^{s\dagger}_q b^{r\dagger}_p. \]
In other words, instead of setting the commutator zero, we want to set the
anticommutator to zero:
\[ \{b^{r\dagger}_p, b^{s\dagger}_q\} = 0. \]
In general, we are going to replace all commutators with anti-commutators. It
turns out this is what fixes our theory. We require that
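Axiom. The spinor field operators satisfy
\[ \{\psi_\alpha(\mathbf{x}), \psi_\beta(\mathbf{y})\} = \{\psi^\dagger_\alpha(\mathbf{x}), \psi^\dagger_\beta(\mathbf{y})\} = 0, \]
and
\[ \{\psi_\alpha(\mathbf{x}), \psi^\dagger_\beta(\mathbf{y})\} = \delta_{\alpha\beta}\delta^3(\mathbf{x} - \mathbf{y}). \]
Proposition. The anti-commutation relations above are equivalent to
\[ \{c^r_p, c^{s\dagger}_q\} = \{b^r_p, b^{s\dagger}_q\} = (2\pi)^3 \delta^{rs}\delta^3(\mathbf{p} - \mathbf{q}), \]
and all other anti-commutators vanishing.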
Note that from a computational point of view, these anti-commutation
relations are horrible. When we manipulate our operators, we will keep on
introducing negative signs to our expressions, and as we all know, keeping track
of signs is the hardest thing in mathematics. Indeed, these negative signs will
come and haunt us all the time, and we have to insert funny negative signs
everywhere.
If we assume these anti-commutation relations with the same Hamiltonian
as before, then we would find
Proposition.
\[ H = \int \frac{d^3p}{(2\pi)^3} E_p \left( b^{s\dagger}_p b^s_p + c^{s\dagger}_p c^s_p \right). \]
We now define the vacuum as in the bosonic case, where it is annihilated by
the b and c operators:
\[ b^s_p |0\rangle = c^s_p |0\rangle = 0. \]
Although the $b$’s and $c$’s satisfy the anti-commutation relations, the Hamiltonian
satisfies commutator relations
\begin{align*}
[H, b^r_p] &= Hb^r_p - b^r_p H \\
           &= \int \frac{d^3q}{(2\pi)^3} E_q \left( b^{s\dagger}_q b^s_q + c^{s\dagger}_q c^s_q \right) b^r_p - b^r_p \int \frac{d^3q}{(2\pi)^3} E_q \left( b^{s\dagger}_q b^s_q + c^{s\dagger}_q c^s_q \right) \\
           &= -E_p b^r_p.
\end{align*}
Similarly, we have
\[ [H, b^{r\dagger}_p] = E_p b^{r\dagger}_p \]
and the corresponding relations for $c^r_p$ and $c^{r\dagger}_p$.
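The interplay between anti-commutators for the ladder operators and commutators with $H$ can be seen in the simplest toy model: a single fermionic mode, represented by 2×2 matrices on the basis (empty, occupied). This is only a sketch of one mode with a hypothetical energy `E`; the continuum normalization $(2\pi)^3\delta^3$ is replaced by 1.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

# Single fermionic mode in the basis (|0>, |1>): b|1> = |0>, b|0> = 0
b = [[0, 1], [0, 0]]
b_dag = [[0, 0], [1, 0]]

E = 1.5                          # hypothetical mode energy
H = scale(E, matmul(b_dag, b))   # H = E b^dagger b

anticomm = add(matmul(b, b_dag), matmul(b_dag, b))   # {b, b^dagger}
b_squared = matmul(b, b)                             # b b
comm_H_b = sub(matmul(H, b), matmul(b, H))           # [H, b]
comm_H_bdag = sub(matmul(H, b_dag), matmul(b_dag, H))  # [H, b^dagger]

print(anticomm, b_squared, comm_H_b, comm_H_bdag)
```

Here $\{b, b^\dagger\} = 1$ and $b^2 = 0$, while $[H, b] = -Eb$ and $[H, b^\dagger] = Eb^\dagger$, so $b^\dagger$ raises the energy by $E$ even though the basic relations are anti-commutators.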
Heisenberg picture
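As before, we would like to put our Dirac field in the Heisenberg picture. We
have a spinor operator at each point $x^\mu$
\[ \psi(x) = \psi(\mathbf{x}, t). \]
The time evolution of the field is given by
\[ \frac{\partial \psi}{\partial t} = i[H, \psi], \]
which is solved by
\begin{align*}
\psi(x) &= \sum_{s=1}^2 \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_p}} \left( b^s_p u^s_p e^{-ip\cdot x} + c^{s\dagger}_p v^s_p e^{ip\cdot x} \right) \\
\psi^\dagger(x) &= \sum_{s=1}^2 \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_p}} \left( b^{s\dagger}_p u^{s\dagger}_p e^{ip\cdot x} + c^s_p v^{s\dagger}_p e^{-ip\cdot x} \right).
\end{align*}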
We now look at the anti-commutators of the fields in the Heisenberg picture.
We define
Definition.
\[ iS_{\alpha\beta}(x - y) = \{\psi_\alpha(x), \bar{\psi}_\beta(y)\}. \]
Dropping the indices, we write
\[ iS(x - y) = \{\psi(x), \bar{\psi}(y)\}. \]
If we drop the indices, then we have to remember that $S$ is a $4 \times 4$ matrix
with indices $\alpha, \beta$.
Substituting $\psi$ and $\bar{\psi}$ in using the expansion, we obtain
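\begin{align*}
iS(x - y) &= \sum_{s,r} \int \frac{d^3p\, d^3q}{(2\pi)^6} \frac{1}{\sqrt{4E_p E_q}} \Big( \{b^s_p, b^{r\dagger}_q\} u^s_p \bar{u}^r_q e^{-i(p\cdot x - q\cdot y)} + \{c^{s\dagger}_p, c^r_q\} v^s_p \bar{v}^r_q e^{i(p\cdot x - q\cdot y)} \Big) \\
          &= \sum_s \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p} \Big( u^s_p \bar{u}^s_p e^{-ip\cdot(x - y)} + v^s_p \bar{v}^s_p e^{ip\cdot(x - y)} \Big) \\
          &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p} \Big[ (\slashed{p} + m)e^{-ip\cdot(x - y)} + (\slashed{p} - m)e^{ip\cdot(x - y)} \Big].
\end{align*}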
Recall that we had a scalar propagator
\[ D(x - y) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p} e^{-ip\cdot(x - y)}. \]
So we can write the above quantity in terms of the $D(x - y)$ by
\[ iS(x - y) = (i\slashed{\partial}_x + m)(D(x - y) - D(y - x)). \]
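To see that this matches the momentum integral term by term, one can act with $i\slashed{\partial}_x + m$ on each exponential separately. The sketch below uses only $i\partial^x_\mu e^{\mp ip\cdot(x-y)} = \pm p_\mu e^{\mp ip\cdot(x-y)}$.

```latex
\begin{align*}
(i\slashed{\partial}_x + m)\, e^{-ip\cdot(x-y)} &= (\slashed{p} + m)\, e^{-ip\cdot(x-y)}, \\
-(i\slashed{\partial}_x + m)\, e^{+ip\cdot(x-y)} &= -(-\slashed{p} + m)\, e^{ip\cdot(x-y)}
   = (\slashed{p} - m)\, e^{ip\cdot(x-y)} .
\end{align*}
% Integrating both lines against d^3p / ((2\pi)^3 2E_p) reproduces
% (i\slashed{\partial}_x + m)\,(D(x-y) - D(y-x)).
```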
Some comments about this: firstly, for spacelike separations where $(x - y)^2 < 0$,
we had $D(x - y) - D(y - x) = 0$.
For bosonic fields, we made a big deal of this, since it ensured that
$[\phi(x), \phi(y)] = 0$. We then interpreted this as saying the theory was causal.
For spinor fields, we have anti-commutation relations rather than commutation
relations. What does this say about causality? The best we can say is that all
our observables are bilinear (or rather quadratic) in fermions, e.g.
\[ H = \int d^3x\; \bar{\psi}(-i\gamma^i \partial_i + m)\psi. \]
So these observables will commute for spacelike separations.
Propagators
The next thing we need to figure out is the Feynman propagator. By a similar
computation to that above, we can determine the vacuum expectation value
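\begin{align*}
\langle 0|\psi(x)\bar{\psi}(y)|0\rangle &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p} (\slashed{p} + m)e^{-ip\cdot(x - y)} \\
\langle 0|\bar{\psi}(y)\psi(x)|0\rangle &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p} (\slashed{p} - m)e^{ip\cdot(x - y)}.
\end{align*}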
We define the Feynman propagator by
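Definition (Feynman propagator). The Feynman propagator of a spinor field
is the time-ordered product
\[ S_F(x - y) = \langle 0|T\psi_\alpha(x)\bar{\psi}_\beta(y)|0\rangle = \begin{cases} \langle 0|\psi_\alpha(x)\bar{\psi}_\beta(y)|0\rangle & x^0 > y^0 \\ -\langle 0|\bar{\psi}_\beta(y)\psi_\alpha(x)|0\rangle & y^0 > x^0 \end{cases} \]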
Note that we have a funny negative sign! This is necessary for Lorentz
invariance: when $(x - y)^2 < 0$, there is no invariant way to determine
whether one time is bigger than the other. So we need the expressions for the two
cases to agree. In the case of a boson, we had
$\langle 0|\phi(x)\phi(y)|0\rangle = \langle 0|\phi(y)\phi(x)|0\rangle$,
but here we have anti-commutation relations, so we have
\[ \psi(x)\bar{\psi}(y) = -\bar{\psi}(y)\psi(x). \]
So we need to insert the negative sign to ensure that the time-ordered product as
defined is Lorentz invariant.
For normal ordered products, we have the same behaviour. Fermionic operators
anti-commute, so
\[ {:}\psi_1 \psi_2{:} = -{:}\psi_2 \psi_1{:}\,. \]
As before, the Feynman propagator appears in Wick’s theorem as the contraction:
Proposition.
\[ \overbracket{\psi(x)\bar{\psi}(y)} = T(\psi(x)\bar{\psi}(y)) - {:}\psi(x)\bar{\psi}(y){:} = S_F(x - y). \]
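When we apply Wick’s theorem, we need to remember the minus sign. For
example, if $\psi_1$ and $\psi_3$ are contracted inside $\psi_1\psi_2\psi_3\psi_4$, then
moving $\psi_3$ next to $\psi_1$ costs a sign:
\[ {:}\psi_1\psi_2\psi_3\psi_4{:} \big|_{\psi_1,\psi_3\ \text{contracted}} = -{:}\overbracket{\psi_1\psi_3}\psi_2\psi_4{:} = -\overbracket{\psi_1\psi_3}\; {:}\psi_2\psi_4{:}\,. \]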
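The bookkeeping of these signs is just the parity of a permutation: each swap of two adjacent fermionic operators inside a normal-ordered or time-ordered product contributes a factor of $-1$. A minimal sketch (the function name and the labelling of operators by integers are illustrative choices):

```python
def reorder_sign(order):
    """Sign picked up when anti-commuting fermionic operators from increasing
    order into the given order: (-1)^(number of inversions)."""
    n = len(order)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if order[i] > order[j])
    return -1 if inversions % 2 else 1

# psi_1 psi_2 psi_3 psi_4 -> psi_1 psi_3 psi_2 psi_4 needs one adjacent swap
sign_example = reorder_sign((1, 3, 2, 4))
print(sign_example)
```

For the example in the text the sign is $-1$, since bringing $\psi_3$ next to $\psi_1$ takes a single swap past $\psi_2$.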
Again, $S_F$ can be expressed as a 4-momentum integral
\[ S_F(x - y) = i\int \frac{d^4p}{(2\pi)^4} e^{-ip\cdot(x - y)} \frac{\slashed{p} + m}{p^2 - m^2 + i\varepsilon}. \]
As in the case of a real scalar field, it is a Green’s function of Dirac’s equation:
\[ (i\slashed{\partial}_x - m)S_F(x - y) = i\delta^4(x - y). \]
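The Green's function property follows in one line from the 4-momentum integral, using $\slashed{p}\slashed{p} = p^2$. The sketch below ignores the $i\varepsilon$ prescription, which only specifies the contour.

```latex
\begin{align*}
(i\slashed{\partial}_x - m) S_F(x-y)
  &= i\int \frac{d^4p}{(2\pi)^4}\, e^{-ip\cdot(x-y)}\,
     \frac{(\slashed{p} - m)(\slashed{p} + m)}{p^2 - m^2} \\
  &= i\int \frac{d^4p}{(2\pi)^4}\, e^{-ip\cdot(x-y)}\,
     \frac{p^2 - m^2}{p^2 - m^2}
   = i\,\delta^4(x-y) .
\end{align*}
```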
5.2 Yukawa theory
The interactions between a Dirac fermion and a real scalar field are governed by
the Yukawa interaction. The Lagrangian is given by
\[ \mathcal{L} = \frac{1}{2}\partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2}\mu^2 \phi^2 + \bar{\psi}(i\gamma^\mu \partial_\mu - m)\psi - \lambda\phi\bar{\psi}\psi, \]
where $\mu$ is the mass of the scalar and $m$ is the mass of the fermion. This is
the full version of the Yukawa theory. Note that the kinetic term implies that
\[ [\psi] = [\bar{\psi}] = \frac{3}{2}. \]
Since $[\phi] = 1$ and $[\mathcal{L}] = 4$, we know that
\[ [\lambda] = 0. \]
So this is a dimensionless coupling constant, which is good.
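The counting behind these mass dimensions is the usual one in natural units, where the action is dimensionless and $[\partial_\mu] = 1$. Written out as a sketch:

```latex
\begin{align*}
[d^4x] = -4 \ &\Rightarrow\ [\mathcal{L}] = 4, \\
[\partial_\mu\phi\,\partial^\mu\phi] = 2[\phi] + 2 = 4 \ &\Rightarrow\ [\phi] = 1, \\
[\bar{\psi}\,\gamma^\mu\partial_\mu\psi] = 2[\psi] + 1 = 4 \ &\Rightarrow\ [\psi] = \tfrac{3}{2}, \\
[\lambda\phi\bar{\psi}\psi] = [\lambda] + 1 + 3 = 4 \ &\Rightarrow\ [\lambda] = 0 .
\end{align*}
```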
Note that we could have written the interaction term of the Lagrangian as
\[ \mathcal{L}_{\mathrm{int}} = -\lambda\phi\bar{\psi}\gamma^5 \psi. \]
We then get a pseudoscalar Yukawa theory.
We again do painful computations directly to get a feel of how things work,
and then state the Feynman rules for the theory.
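Example. Consider a fermion scattering $\psi\psi \to \psi\psi$.
[Diagram: two incoming fermions $\psi$ with momenta $p$ and $q$ scatter into two outgoing fermions $\psi$ with momenta $p'$ and $q'$.]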
We have initial and final states
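\begin{align*}
|i\rangle &= \sqrt{4E_p E_q}\; b^{s\dagger}_p b^{r\dagger}_q |0\rangle \\
|f\rangle &= \sqrt{4E_{p'} E_{q'}}\; b^{s'\dagger}_{p'} b^{r'\dagger}_{q'} |0\rangle.
\end{align*}
Here we have to be careful about ordering the creation operators, because they
anti-commute, not commute. We then have
\[ \langle f| = \sqrt{4E_{p'} E_{q'}}\; \langle 0| b^{r'}_{q'} b^{s'}_{p'}. \]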
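We can then look at the $O(\lambda^2)$ term in $\langle f|(S - 1)|i\rangle$. We have
\[ \frac{(-i\lambda)^2}{2} \int d^4x_1\, d^4x_2\; T\left[ \bar{\psi}(x_1)\psi(x_1)\phi(x_1)\bar{\psi}(x_2)\psi(x_2)\phi(x_2) \right]. \]
The contribution to scattering comes from the contraction
\[ {:}\bar{\psi}(x_1)\psi(x_1)\bar{\psi}(x_2)\psi(x_2){:}\; \overbracket{\phi(x_1)\phi(x_2)}. \]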
The two $\psi$’s annihilate the initial state, whereas the two $\bar{\psi}$ create
the final state. This is just like the bosonic case, but we have to be careful with
minus signs and spinor indices.
Putting in $|i\rangle$ and ignoring $c$ operators as they don’t contribute, we have
\begin{align*}
&{:}\bar{\psi}(x_1)\psi(x_1)\bar{\psi}(x_2)\psi(x_2){:}\; b^{s\dagger}_p b^{r\dagger}_q |0\rangle \\
&\quad = -\int \frac{d^3k_1\, d^3k_2}{(2\pi)^6\, 2\sqrt{E_{k_1} E_{k_2}}} [\bar{\psi}(x_1) u^m_{k_1}][\bar{\psi}(x_2) u^n_{k_2}]\, e^{-ik_1\cdot x_1 - ik_2\cdot x_2}\; b^m_{k_1} b^n_{k_2} b^{s\dagger}_p b^{r\dagger}_q |0\rangle \\
&\quad = -\frac{1}{2\sqrt{E_p E_q}} \Big( [\bar{\psi}(x_1) u^r_q][\bar{\psi}(x_2) u^s_p]\, e^{-iq\cdot x_1 - ip\cdot x_2} - [\bar{\psi}(x_1) u^s_p][\bar{\psi}(x_2) u^r_q]\, e^{-ip\cdot x_1 - iq\cdot x_2} \Big) |0\rangle,
\end{align*}
where the square brackets show contraction of spinor indices.
The negative sign, which arose from anti-commuting the $b$’s, is crucial. We put
in the left hand side to get
\begin{align*}
&\langle 0| b^{r'}_{q'} b^{s'}_{p'} [\bar{\psi}(x_1) u^r_q][\bar{\psi}(x_2) u^s_p] \\
&\quad = \frac{1}{2\sqrt{E_{p'} E_{q'}}} \Big( [\bar{u}^{s'}_{p'} u^r_q][\bar{u}^{r'}_{q'} u^s_p]\, e^{ip'\cdot x_1 + iq'\cdot x_2} - [\bar{u}^{r'}_{q'} u^r_q][\bar{u}^{s'}_{p'} u^s_p]\, e^{ip'\cdot x_2 + iq'\cdot x_1} \Big).
\end{align*}
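Putting all of this together, including the initial relativistic normalization of the
initial state and the propagator, we have
\begin{align*}
\langle f|S - 1|i\rangle &= (-i\lambda)^2 \int d^4x_1\, d^4x_2\, \frac{d^4k}{(2\pi)^4} \frac{ie^{ik\cdot(x_1 - x_2)}}{k^2 - \mu^2 + i\varepsilon} \\
&\qquad \Big( [\bar{u}^{s'}_{p'}\cdot u^s_p][\bar{u}^{r'}_{q'}\cdot u^r_q]\, e^{ix_1\cdot(q' - q) + ix_2\cdot(p' - p)} \\
&\qquad\quad - [\bar{u}^{s'}_{p'}\cdot u^r_q][\bar{u}^{r'}_{q'}\cdot u^s_p]\, e^{ix_1\cdot(p' - q) + ix_2\cdot(q' - p)} \Big) \\
&= i(-i\lambda)^2 \int \frac{d^4k\, (2\pi)^4}{k^2 - \mu^2 + i\varepsilon} \\
&\qquad \Big( [\bar{u}^{s'}_{p'}\cdot u^s_p][\bar{u}^{r'}_{q'}\cdot u^r_q]\, \delta^4(q' - q + k)\,\delta^4(p' - p - k) \\
&\qquad\quad - [\bar{u}^{s'}_{p'}\cdot u^r_q][\bar{u}^{r'}_{q'}\cdot u^s_p]\, \delta^4(p' - q + k)\,\delta^4(q' - p - k) \Big).
\end{align*}
So we get
\[ \langle f|S - 1|i\rangle = i\mathcal{A}\,(2\pi)^4 \delta^4(p + q - p' - q'), \]
where
\[ \mathcal{A} = (-i\lambda)^2 \left( \frac{[\bar{u}^{s'}_{p'}\cdot u^s_p][\bar{u}^{r'}_{q'}\cdot u^r_q]}{(p' - p)^2 - \mu^2 + i\varepsilon} - \frac{[\bar{u}^{s'}_{p'}\cdot u^r_q][\bar{u}^{r'}_{q'}\cdot u^s_p]}{(q' - p)^2 - \mu^2 + i\varepsilon} \right). \]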
5.3 Feynman rules
As always, the Feynman rules are much better:
(i) An incoming fermion is given a spinor $u^r_p$, and an outgoing fermion is given
a $\bar{u}^r_p$.
(ii) For an incoming anti-fermion, we put in a $\bar{v}^r_p$, and for an outgoing
anti-fermion we put a $v^r_p$.
(iii) For each vertex we get a factor of $(-i\lambda)$.
(iv) Each internal scalar line gets a factor of
\[ \frac{i}{p^2 - \mu^2 + i\varepsilon}, \]
and each internal fermion line gets a factor of
\[ \frac{i(\slashed{p} + m)}{p^2 - m^2 + i\varepsilon}. \]
(v) The arrows on fermion lines must flow consistently, ensuring fermion
conservation.
(vi) We impose energy-momentum conservation at each vertex, and if we have
a loop, we integrate over all possible momentum.
(vii) We add an extra minus sign for a loop of fermions.
Note that the Feynman propagator is a $4 \times 4$ matrix. The indices are contracted
with each vertex, either with further propagators or external spinors.
We look at computations using Feynman rules.
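Example (Nucleon scattering). For nucleon scattering, we have two diagrams.
[Diagram 1: incoming fermions $(p, s)$ and $(q, r)$ scatter to $(p', s')$ and $(q', r')$, exchanging a scalar of momentum $p - p'$.]
[Diagram 2: the same process with the outgoing fermions swapped, exchanging a scalar of momentum $p - q'$.]
In the second case, the fermions are swapped, so we get a relative minus sign.
So by the Feynman rules, this contributes an amplitude of
\[ \mathcal{A} = (-i\lambda)^2 \left( \frac{[\bar{u}^{s'}_{p'}\cdot u^s_p][\bar{u}^{r'}_{q'}\cdot u^r_q]}{(p' - p)^2 - \mu^2 + i\varepsilon} - \frac{[\bar{u}^{s'}_{p'}\cdot u^r_q][\bar{u}^{r'}_{q'}\cdot u^s_p]}{(q' - p)^2 - \mu^2 + i\varepsilon} \right). \]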
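Example. We look at nucleon to meson scattering
\[ \psi\bar{\psi} \to \phi\phi. \]
We have two diagrams.
[Diagrams: an incoming fermion $(p, s)$ and anti-fermion $(q, r)$ annihilate into two scalars of momenta $p'$ and $q'$; the second diagram has the two outgoing scalars swapped.]
This time, we flipped two bosons, so we are not gaining a negative sign. Then
the Feynman rules give us
\[ \mathcal{A} = (-i\lambda)^2 \left( \frac{\bar{v}^r_q (\slashed{p} - \slashed{p}' + m) u^s_p}{(p - p')^2 - m^2 + i\varepsilon} + \frac{\bar{v}^r_q (\slashed{p} - \slashed{q}' + m) u^s_p}{(p - q')^2 - m^2 + i\varepsilon} \right). \]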
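Example. We can also do nucleon anti-nucleon scattering
\[ \psi\bar{\psi} \to \psi\bar{\psi}. \]
As before, we have two contributions.
[Diagram 1: fermion $(p, s)$ and anti-fermion $(q, r)$ scatter to $(p', s')$ and $(q', r')$, exchanging a scalar of momentum $p - p'$.]
[Diagram 2: the incoming pair annihilates into a scalar of momentum $p + q$, which produces the outgoing pair.]
This time we have an amplitude of
\[ \mathcal{A} = (-i\lambda)^2 \left( -\frac{[\bar{u}^{s'}_{p'}\cdot u^s_p][\bar{v}^r_q\cdot v^{r'}_{q'}]}{(p - p')^2 - \mu^2 + i\varepsilon} + \frac{[\bar{v}^r_q\cdot u^s_p][\bar{u}^{s'}_{p'}\cdot v^{r'}_{q'}]}{(p + q)^2 - \mu^2 + i\varepsilon} \right). \]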
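We have these funny signs that we have to make sure are right. We have initial
and final states
\begin{align*}
|i\rangle &= \sqrt{4E_p E_q}\; b^{s\dagger}_p c^{r\dagger}_q |0\rangle \equiv |p, s; q, r\rangle \\
|f\rangle &= \sqrt{4E_{p'} E_{q'}}\; b^{s'\dagger}_{p'} c^{r'\dagger}_{q'} |0\rangle \equiv |p', s'; q', r'\rangle.
\end{align*}
To check the signs, we work through the computations, but we can ignore all
factors because we only need the final sign. Then we have
\[ \psi \sim b + c^\dagger, \qquad \bar{\psi} \sim b^\dagger + c. \]
So we can check
\begin{align*}
&\langle f|\, {:}\bar{\psi}(x_1)\psi(x_1)\bar{\psi}(x_2)\psi(x_2){:}\; b^{s\dagger}_p c^{r\dagger}_q |0\rangle \\
&\quad \sim \langle f| [\bar{v}^m_{k_1}\psi(x_1)][\bar{\psi}(x_2) u^n_{k_2}]\, c^m_{k_1} b^n_{k_2} b^{s\dagger}_p c^{r\dagger}_q |0\rangle \\
&\quad \sim \langle f| [\bar{v}^r_q \psi(x_1)][\bar{\psi}(x_2) u^s_p] |0\rangle \\
&\quad \sim \langle 0| c^{r'}_{q'} b^{s'}_{p'} c^{m\dagger}_{\ell_1} b^{n\dagger}_{\ell_2} [\bar{v}^r_q\cdot v^m_{\ell_1}][\bar{u}^n_{\ell_2}\cdot u^s_p] |0\rangle \\
&\quad \sim -[\bar{v}^r_q\cdot v^{r'}_{q'}][\bar{u}^{s'}_{p'}\cdot u^s_p],
\end{align*}
where we got the final sign by anti-commuting $c^{m\dagger}_{\ell_1}$ and $b^{s'}_{p'}$
to make the $c$’s and $b$’s get together.
We can follow a similar contraction to get a positive sign for the second
diagram.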
6 Quantum electrodynamics
Finally, we get to the last part of the course, where we try to quantize
electromagnetism. Again, this will not be so straightforward. This time, we will have
to face the fact that the electromagnetic potential $A$ is not uniquely defined,
but can have many gauge transformations. The right way to encode this information
when we quantize the theory is not immediate, and will require some
experimentation.
6.1 Classical electrodynamics
We begin by reviewing what we know about classical electrodynamics. Classically,
the electromagnetic field is specified by an electromagnetic potential $A$, from
which we derive the electromagnetic field strength tensor
\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \]
We can then find the dynamics of a free electromagnetic field by
\[ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu}. \]
The Euler-Lagrange equations give
\[ \partial_\mu F^{\mu\nu} = 0. \]
It happens that $F$ satisfies the mystical Bianchi identity
\[ \partial_\lambda F_{\mu\nu} + \partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} = 0, \]
but we are not going to use this.
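We can express these things in terms of the electromagnetic field. We write
$A^\mu = (\phi, \mathbf{A})$. Then we define
\[ \mathbf{E} = -\nabla\phi - \dot{\mathbf{A}}, \quad \mathbf{B} = \nabla \times \mathbf{A}. \]
Then $F_{\mu\nu}$ can be written as
\[ F_{\mu\nu} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0 \end{pmatrix}. \]
Writing out our previous equations, we find that
\begin{align*}
\nabla\cdot\mathbf{E} &= 0 \\
\nabla\cdot\mathbf{B} &= 0 \\
\dot{\mathbf{B}} &= -\nabla \times \mathbf{E} \\
\dot{\mathbf{E}} &= \nabla \times \mathbf{B}.
\end{align*}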
We notice something wrong. Photons are excitations of the electromagnetic
field. However, the photon only has two polarization states, i.e. two degrees of
freedom, while this vector field $A_\mu$ has four. How can we resolve this?
There are two parts to the resolution. Firstly, note that the $A_0$ field is not
dynamical, as it has no kinetic term in the Lagrangian (the $\partial_0 A_0$ terms
cancel out by symmetry). Thus, if we’re given $A_i$ and $\dot{A}_i$ at some initial
time $t_0$, then $A_0$ is fully determined by $\nabla\cdot\mathbf{E} = 0$, since it says
\[ \nabla\cdot\dot{\mathbf{A}} + \nabla^2 A_0 = 0. \]
This has a solution
\[ A_0(\mathbf{x}) = \int d^3x'\; \frac{\nabla\cdot\dot{\mathbf{A}}(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|}. \]
So $A_0$ is not independent, and we’ve got down to three degrees of freedom.
What’s the remaining one?
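The next thing to note is that the theory is invariant under the transformation
\[ A_\mu(x) \mapsto A_\mu(x) + \partial_\mu \lambda(x). \]
Indeed, the only “observable” thing is $F_{\mu\nu}$, and we can see that it
transforms as
\[ F_{\mu\nu} \mapsto \partial_\mu(A_\nu + \partial_\nu \lambda) - \partial_\nu(A_\mu + \partial_\mu \lambda) = F_{\mu\nu}. \]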
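The invariance of $F_{\mu\nu}$ only relies on partial derivatives commuting, which makes it easy to test numerically. The sketch below uses central finite differences on a made-up potential `A` and gauge function `lam`; both are hypothetical test functions chosen for illustration, and the step size `h` is an arbitrary choice.

```python
import math

def grad(f, x, mu, h=1e-3):
    """Central-difference approximation to the partial derivative of f
    with respect to the coordinate x^mu at the point x."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    return (f(xp) - f(xm)) / (2 * h)

def field_strength(A, x):
    """F_{mu nu} = d_mu A_nu - d_nu A_mu at x, for A a list of four functions."""
    return [[grad(A[nu], x, mu) - grad(A[mu], x, nu) for nu in range(4)]
            for mu in range(4)]

def gauge_transform(A, lam):
    """Return A_mu + d_mu lam as a new list of four functions."""
    return [lambda x, mu=mu: A[mu](x) + grad(lam, x, mu) for mu in range(4)]

# Hypothetical smooth test potential and gauge function
A = [
    lambda x: math.sin(x[1]) * x[0],
    lambda x: x[2] ** 2 + x[0] * x[3],
    lambda x: math.cos(x[0] + x[3]),
    lambda x: x[1] * x[2],
]
lam = lambda x: math.cos(x[0]) * x[1] * x[3] + x[2] ** 3

x0 = [0.1, 0.2, 0.3, 0.4]
F = field_strength(A, x0)
F_gauged = field_strength(gauge_transform(A, lam), x0)
max_diff = max(abs(F[m][n] - F_gauged[m][n]) for m in range(4) for n in range(4))
print(max_diff)  # zero up to rounding error
```

The two field strengths agree up to rounding error even though the potentials themselves differ, because the discrete derivative operators commute just like the continuous ones.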
It looks like we now have an infinite number of symmetries.
This is a different kind of symmetry. Previously, we had symmetries acting
at all points in the universe in the same way, say $\psi \mapsto e^{i\alpha}\psi$ for
some $\alpha \in \mathbb{R}$. This gave rise to conservation laws by Noether’s theorem.
Now do we get infinitely many symmetries from Noether’s theorem? The
answer is no. The local symmetries we’re considering now have a different
interpretation. Rather than taking a physical state to another physical state,
they are really a redundancy in our description. These are known as local or
gauge symmetries. Seeing this, we might be tempted to try to formulate the
theory only in terms of gauge invariant objects like $\mathbf{E}$ and $\mathbf{B}$, but it
turns out this doesn’t work. To describe nature, it appears that we have to introduce
quantities that we cannot measure.
Now to work with our theory, it is common to fix a gauge. The idea is that
we specify some additional rules we want $A_\mu$ to satisfy, and hopefully this will
make sure there is a unique $A_\mu$ we can pick for each equivalence class. Picking
the right gauge for the right problem can help us a lot. This is somewhat like
picking a coordinate system. For example, in a system with spherical symmetry,
using spherical polar coordinates will help us a lot.
There are two gauges we would be interested in.
Definition (Lorenz gauge). The Lorenz gauge is specified by
\[ \partial_\mu A^\mu = 0. \]
Note that this is Lorenz, not Lorentz!
To make sure this is a valid gauge, we need to make sure that each electromagnetic
potential has a representation satisfying this condition.
Suppose we start with $A'_\mu$ such that $\partial_\mu A'^\mu = f$. If we introduce a
gauge transformation
\[ A_\mu = A'_\mu + \partial_\mu \lambda, \]
then we have
\[ \partial_\mu A^\mu = \partial_\mu \partial^\mu \lambda + f. \]
So we need to find a $\lambda$ such that
\[ \partial_\mu \partial^\mu \lambda = -f. \]
By general PDE theory, such a $\lambda$ exists. So we are safe.
However, it turns out this requirement does not pick a unique representation
in the gauge orbit. We are free to make a further gauge transformation with any
$\lambda$ satisfying
\[ \partial_\mu \partial^\mu \lambda = 0, \]
which has non-trivial solutions (e.g. $\lambda(x) = x^0$).
Another important gauge is the Coulomb gauge:
Definition (Coulomb gauge). The Coulomb gauge requires
\[ \nabla\cdot\mathbf{A} = 0. \]
Of course, this is not a Lorentz-invariant condition.
Similar to the previous computations, we know that this is a good gauge.
Looking at the integral we’ve found for $A_0$ previously, namely
\[ A_0 = \int d^3x'\; \frac{\nabla\cdot\dot{\mathbf{A}}(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|}, \]
we find that $A_0 = 0$ all the time. Note that this happens only because we do
not have matter.
Here it is easy to see the physical degrees of freedom: the three components
in $\mathbf{A}$ satisfy a single constraint $\nabla\cdot\mathbf{A} = 0$, leaving behind two
physical degrees of freedom, which gives the desired two polarization states.
6.2 Quantization of the electromagnetic field
We now try to quantize the field, using the Lorenz gauge. The particles we
create in this theory would be photons, namely quanta of light. Things will go
wrong really soon.
We first try to compute the conjugate momentum $\pi^\mu$ of the vector field $A_\mu$.
We have
\[ \pi^0 = \frac{\partial \mathcal{L}}{\partial \dot{A}_0} = 0. \]
This is slightly worrying, because we would later want to impose the commutation
relation
\[ [A_0(\mathbf{x}), \pi^0(\mathbf{y})] = i\delta^3(\mathbf{x} - \mathbf{y}), \]
but this is clearly not possible if $\pi^0$ vanishes identically!
We need to try something else. Note that under the Lorenz gauge
$\partial_\mu A^\mu = 0$, the equations of motion tell us
\[ \partial_\mu \partial^\mu A^\nu = 0. \]
The trick is to construct a Lagrangian where this is actually the equation of
motion, and then later impose $\partial_\mu A^\mu = 0$ after quantization.
This is not too hard. We can pick the Lagrangian as
\[ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - \frac{1}{2}(\partial_\mu A^\mu)^2. \]
We can work out the equations of motion of this, and we get
\[ \partial_\mu F^{\mu\nu} + \partial^\nu(\partial_\mu A^\mu) = 0. \]
Writing out the definition of $F^{\mu\nu}$, one of the terms cancels with the second
bit, and we get
\[ \partial_\mu \partial^\mu A^\nu = 0. \]
We are now going to work with this Lagrangian, and only later impose
$\partial_\mu A^\mu = 0$ at the operator level.
More generally, we can use a Lagrangian
\[ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - \frac{1}{2\alpha}(\partial_\mu A^\mu)^2, \]
where $\alpha$ is a fixed constant. Confusingly, the choice of $\alpha$ is also known
as a gauge. We picked $\alpha = 1$, which is called the Feynman gauge. If we take
$\alpha \to 0$, we obtain the Landau gauge.
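This new theory has no gauge symmetry, and both $A_0$ and $\mathbf{A}$ are dynamical
fields. We can find the conjugate momenta as
\begin{align*}
\pi^0 &= \frac{\partial \mathcal{L}}{\partial \dot{A}_0} = -\partial_\mu A^\mu \\
\pi^i &= \frac{\partial \mathcal{L}}{\partial \dot{A}_i} = \partial^i A^0 - \dot{A}^i.
\end{align*}
We now apply the usual commutation relations
\begin{align*}
[A_\mu(\mathbf{x}), A_\nu(\mathbf{y})] &= [\pi^\mu(\mathbf{x}), \pi^\nu(\mathbf{y})] = 0 \\
[A_\mu(\mathbf{x}), \pi^\nu(\mathbf{y})] &= i\delta^3(\mathbf{x} - \mathbf{y})\,\delta^\nu_\mu.
\end{align*}
Equivalently, the last commutation relation is
\[ [A_\mu(\mathbf{x}), \pi_\nu(\mathbf{y})] = i\delta^3(\mathbf{x} - \mathbf{y})\,\eta_{\mu\nu}. \]
As before, in the Heisenberg picture, we get equal time commutation relations
\[ [A_\mu(\mathbf{x}, t), \dot{A}_\nu(\mathbf{y}, t)] = -i\eta_{\mu\nu}\delta^3(\mathbf{x} - \mathbf{y}). \]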
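As before, we can write our operators in terms of creation and annihilation
operators:
\begin{align*}
A_\mu(x) &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2|\mathbf{p}|}} \sum_{\lambda=0}^3 \mathcal{E}^{(\lambda)}_\mu(\mathbf{p}) \left[ a^\lambda_p e^{-ip\cdot x} + a^{\lambda\dagger}_p e^{ip\cdot x} \right], \\
\pi^\nu(x) &= \int \frac{d^3p}{(2\pi)^3}\, i\sqrt{\frac{|\mathbf{p}|}{2}} \sum_{\lambda=0}^3 (\mathcal{E}^{(\lambda)}(\mathbf{p}))^\nu \left[ a^\lambda_p e^{-ip\cdot x} - a^{\lambda\dagger}_p e^{ip\cdot x} \right],
\end{align*}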
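where for each $\mathbf{p}$, the vectors
$\{\mathcal{E}^{(0)}(\mathbf{p}), \mathcal{E}^{(1)}(\mathbf{p}), \mathcal{E}^{(2)}(\mathbf{p}), \mathcal{E}^{(3)}(\mathbf{p})\}$
form a basis of $\mathbb{R}^{3,1}$. The exact choice doesn’t really matter much, but we
will suppose we pick a basis with the following properties:
(i) $\mathcal{E}^{(0)}(\mathbf{p})$ will be a timelike vector, while the others are spacelike.
(ii) Viewing $\mathbf{p}$ as the direction of motion, we will pick $\mathcal{E}^{(3)}(\mathbf{p})$
to be a longitudinal polarization, while $\mathcal{E}^{(1),(2)}(\mathbf{p})$ are transverse to
the direction of motion, i.e. we require
\[ \mathcal{E}^{(1),(2)}_\mu(\mathbf{p})\cdot p^\mu = 0. \]
(iii) The normalization of the vectors is given by
\[ \mathcal{E}^{(\lambda)}_\mu (\mathcal{E}^{(\lambda')})^\mu = \eta^{\lambda\lambda'}. \]
We can explicitly write down a choice of such basis vectors. When
$p \propto (1, 0, 0, 1)$, then we choose
\[ \mathcal{E}^{(0)}_\mu(\mathbf{p}) = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad \mathcal{E}^{(1)}_\mu(\mathbf{p}) = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathcal{E}^{(2)}_\mu(\mathbf{p}) = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad \mathcal{E}^{(3)}_\mu(\mathbf{p}) = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}. \]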
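Properties (ii) and (iii) can be checked directly for this explicit choice. The sketch below assumes the mostly-minus metric $\eta = \mathrm{diag}(1, -1, -1, -1)$ and treats the listed columns as lower-index components; the variable names are illustrative.

```python
ETA = [1, -1, -1, -1]  # diagonal of the Minkowski metric, signature (+, -, -, -)

# Lower-index components of the four polarization vectors for p ~ (1, 0, 0, 1)
EPS = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

def minkowski_dot(a, b):
    """Inner product a_mu b^mu of two vectors given by lower-index components."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))

def contract(lower, upper):
    """Plain contraction a_mu b^mu of lower-index with upper-index components."""
    return sum(x * y for x, y in zip(lower, upper))

# Gram matrix of the basis: should equal eta^{lambda lambda'}
gram = [[minkowski_dot(EPS[l], EPS[lp]) for lp in range(4)] for l in range(4)]

# Transversality of lambda = 1, 2 with respect to p^mu = (1, 0, 0, 1)
p_upper = [1, 0, 0, 1]
transverse = [contract(EPS[l], p_upper) for l in (1, 2)]

print(gram, transverse)
```

The Gram matrix has one positive and three negative diagonal entries, which is the origin of the sign that later appears in the commutator of $a^0_p$ with its conjugate.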
Now for a general $p$, we pick a basis where $p \propto (1, 0, 0, 1)$, which is always
possible, and then we define $\mathcal{E}(\mathbf{p})$ as above in that basis. This gives us
a Lorentz-invariant choice of basis vectors satisfying the desired properties.
One can do the tedious computations required to find the commutation
relations for the creation and annihilation operators:
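Theorem.
\[ [a^\lambda_p, a^{\lambda'}_q] = [a^{\lambda\dagger}_p, a^{\lambda'\dagger}_q] = 0 \]
and
\[ [a^\lambda_p, a^{\lambda'\dagger}_q] = -\eta^{\lambda\lambda'}(2\pi)^3 \delta^3(\mathbf{p} - \mathbf{q}). \]
Notice the strange negative sign!
We again define a vacuum $|0\rangle$ such that
\[ a^\lambda_p |0\rangle = 0 \]
for $\lambda = 0, 1, 2, 3$, and we can create one-particle states
\[ |p, \lambda\rangle = a^{\lambda\dagger}_p |0\rangle. \]
This makes sense for $\lambda = 1, 2, 3$, but for $\lambda = 0$, we have states with
negative norm:
\begin{align*}
\langle p, 0|q, 0\rangle &\equiv \langle 0| a^0_p a^{0\dagger}_q |0\rangle \\
                         &= -(2\pi)^3 \delta^3(\mathbf{p} - \mathbf{q}).
\end{align*}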
A Hilbert space with a negative norm means that we have negative probabilities.
This doesn’t make sense.
Here is where the Lorenz gauge comes in. By imposing $\partial_\mu A^\mu = 0$, we are
going to get rid of bad things. But how do we implement this constraint? We
can try implementing it in a number of ways. We will start off in the obvious
way, which turns out to be too strong, and we will keep on weakening it until it
works.
If we just asked for $\partial_\mu A^\mu = 0$, for $A_\mu$ the operator, then this
doesn’t work, because
\[ \pi^0 = -\partial_\mu A^\mu, \]
and if this vanishes, then the commutation conditions cannot possibly be obeyed.
Instead, we can try to impose this on the Hilbert space rather than on the
operators. After all, that’s where the trouble lies. Maybe we can try to split the
Hilbert space up into good states and bad states, and then just look at the good
states only?
How do we define the good, physical states? Maybe we can impose
\[ \partial^\mu A_\mu |\psi\rangle = 0 \]
on all physical (“good”) states, but it turns out even this condition is a bit too
strong, as the vacuum will not be physical! To see this, we decompose
\[ A_\mu(x) = A^+_\mu(x) + A^-_\mu(x), \]
where $A^+_\mu$ has the annihilation operators and $A^-_\mu$ has the creation operators.
Explicitly, we have
\begin{align*}
A^+_\mu(x) &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2|\mathbf{p}|}} \mathcal{E}^{(\lambda)}_\mu a^\lambda_p e^{-ip\cdot x} \\
A^-_\mu(x) &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2|\mathbf{p}|}} \mathcal{E}^{(\lambda)}_\mu a^{\lambda\dagger}_p e^{ip\cdot x},
\end{align*}
where summation over $\lambda$ is implicit. Then we have
\[ \partial^\mu A^+_\mu |0\rangle = 0, \]
but we have
\[ \partial^\mu A^-_\mu |0\rangle \neq 0. \]
So not even the vacuum is physical! This is very bad.
Our final attempt at weakening this is to ask the physical states to satisfy
\[ \partial^\mu A^+_\mu(x) |\psi\rangle = 0. \]
This ensures that
\[ \langle\psi| \partial^\mu A_\mu |\psi\rangle = 0, \]
as $\partial^\mu A^+_\mu$ will kill the right hand side and $\partial^\mu A^-_\mu$ will kill
the left. So $\partial^\mu A_\mu$ has vanishing matrix elements between physical states.
This is known as the Gupta-Bleuler condition. The linearity of this condition
ensures that the physical states span a vector space $\mathcal{H}_{\mathrm{phys}}$.
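What does $\mathcal{H}_{\mathrm{phys}}$ look like? To understand this better, we write out
what $\partial^\mu A^+_\mu$ is. We have
\begin{align*}
\partial^\mu A^+_\mu &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2|\mathbf{p}|}} \mathcal{E}^{(\lambda)}_\mu a^\lambda_p (-ip^\mu) e^{-ip\cdot x} \\
                     &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2|\mathbf{p}|}}\, i(a^3_p - a^0_p) e^{-ip\cdot x},
\end{align*}
using the properties of our particular choice of the $\mathcal{E}_\mu$. Thus the condition
is equivalently
\[ (a^3_p - a^0_p) |\psi\rangle = 0. \]
This means that it is okay to have timelike or longitudinal photons, as long as
they come in pairs of the same momentum!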
By restricting to these physical states, we have gotten rid of the negative
norm states. However, we still have the problem of certain non-zero states having
zero norm. Consider a state
\[ |\phi\rangle = a^{0\dagger}_p a^{3\dagger}_p |0\rangle. \]
This is an allowed state, since it has exactly one timelike and one longitudinal
photon, each of momentum $p$. This has zero norm, as the norm contributions
from the $a^0_p$ part cancel with those from the $a^3_p$ part. However, this state is
non-zero! We wouldn’t want this to happen if we want a positive definite inner
product.
The solution is to declare that if two states differ only in the longitudinal
and timelike photons, then we consider them physically equivalent, or gauge
equivalent! In other words, we are quotienting the state space out by these
zero-norm states.
Of course, if we want to do this, we need to go through everything we do and
make sure that our physical observables do not change when we add or remove
these silly zero-norm states. Fortunately, this is indeed the case, and we are
happy.
Before we move on, we note the value of the Feynman propagator:
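Theorem. The Feynman propagator for the electromagnetic field, under a
general gauge $\alpha$, is
\[ \langle 0|T A_\mu(x) A_\nu(y)|0\rangle = \int \frac{d^4p}{(2\pi)^4} \frac{-i}{p^2 + i\varepsilon} \left( \eta_{\mu\nu} + (\alpha - 1)\frac{p_\mu p_\nu}{p^2} \right) e^{-ip\cdot(x - y)}. \]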
6.3 Coupling to matter in classical field theory
We now move on to couple our EM field with matter. We first do it in the clear
and sensible universe of classical field theory. We will tackle two cases: in the
first case, we do coupling with fermions. In the second, we couple with a mere
complex scalar field. It should be clear how one can generalize these further
to other fields.
Suppose the resulting coupled Lagrangian looked like
\[ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - A_\mu j^\mu, \]
plus some kinetic and self-interaction terms for the field we are coupling with.
Then the equations of motion give us
\[ \partial_\mu F^{\mu\nu} = j^\nu. \]
Since $F_{\mu\nu}$ is anti-symmetric, we know we must have
\[ 0 = \partial_\mu \partial_\nu F^{\mu\nu} = \partial_\nu j^\nu. \]
So we know $j^\mu$ is a conserved current. So to couple the EM field to matter, we
need to find some conserved current.
Coupling with fermions
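Suppose we had a spinor field $\psi$ with Lagrangian
\[ \mathcal{L} = \bar{\psi}(i\slashed{\partial} - m)\psi. \]
This has an internal symmetry
\begin{align*}
\psi &\mapsto e^{-i\alpha}\psi \\
\bar{\psi} &\mapsto e^{i\alpha}\bar{\psi},
\end{align*}
and this gives rise to a conserved current
\[ j^\mu = \bar{\psi}\gamma^\mu \psi. \]
So let’s try
\[ \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + \bar{\psi}(i\slashed{\partial} - m)\psi - e\bar{\psi}\gamma^\mu A_\mu \psi, \]
where $e$ is some coupling factor.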
Have we lost gauge invariance now that we have an extra term? If we just
substituted $A_\mu$ for $A_\mu + \partial_\mu \lambda$, then obviously the Lagrangian
changes. However, something deep happens. When we couple the two terms, we don’t
just add a factor into the Lagrangian. We have secretly introduced a new gauge
symmetry to the fermion field, and now when we take gauge transformations, we have
to transform both fields at the same time.
To see this better, we rewrite the Lagrangian as
\[
  \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + \bar{\psi}(i\slashed{D} - m)\psi,
\]
where $D$ is the covariant derivative given by
\[
  D_\mu \psi = \partial_\mu \psi + ieA_\mu \psi.
\]
We now claim that L is invariant under the simultaneous transformations
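\[
  A_\mu \mapsto A_\mu + \partial_\mu \lambda(x), \quad \psi \mapsto e^{-ie\lambda(x)}\psi.
\]
To check that this is indeed invariant, we only have to check the $\bar{\psi}\slashed{D}\psi$ term.
We look at how $D_\mu \psi$ transforms. We have
\begin{align*}
  D_\mu \psi &= \partial_\mu \psi + ieA_\mu \psi \\
  &\mapsto \partial_\mu (e^{-ie\lambda(x)}\psi) + ie(A_\mu + \partial_\mu \lambda(x)) e^{-ie\lambda(x)}\psi \\
  &= e^{-ie\lambda(x)} D_\mu \psi.
\end{align*}
So we have
\[
  \bar{\psi}\slashed{D}\psi \mapsto \bar{\psi}\slashed{D}\psi,
\]
i.e. this is gauge invariant.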
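The last equality in the transformation of $D_\mu \psi$ hides one application of the product rule. As a sketch, assuming the transformations $A_\mu \mapsto A_\mu + \partial_\mu \lambda$ and $\psi \mapsto e^{-ie\lambda}\psi$, the two pieces are:

```latex
\begin{align*}
  \partial_\mu \big( e^{-ie\lambda}\psi \big)
    &= e^{-ie\lambda} \big( \partial_\mu \psi - ie(\partial_\mu \lambda)\psi \big), \\
  ie(A_\mu + \partial_\mu \lambda)\, e^{-ie\lambda}\psi
    &= e^{-ie\lambda} \big( ieA_\mu \psi + ie(\partial_\mu \lambda)\psi \big).
\end{align*}
% Adding the two lines, the (\partial_\mu \lambda) terms cancel:
\[
  D_\mu \psi \mapsto e^{-ie\lambda} \big( \partial_\mu \psi + ieA_\mu \psi \big)
  = e^{-ie\lambda} D_\mu \psi .
\]
```

Since $\bar{\psi} \mapsto e^{+ie\lambda}\bar{\psi}$, the two phases cancel in $\bar{\psi}\slashed{D}\psi$. The shift of $A_\mu$ is exactly what is needed to absorb the derivative of the position-dependent phase.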
So as before, we can use the gauge freedom to remove unphysical states after
quantization. The coupling constant $e$ has the interpretation of electric charge,
as we can see from the equation of motion
\[
  \partial_\mu F^{\mu\nu} = e j^\nu.
\]
In electromagnetism, $j^0$ is the charge density, but after quantization, we have
\begin{align*}
  Q &= e \int d^3 x\, \bar{\psi}\gamma^0\psi \\
  &= e \int \frac{d^3 p}{(2\pi)^3} \left( b_p^{s\dagger} b_p^s - c_p^{s\dagger} c_p^s \right) \\
  &= e \times (\text{number of electrons} - \text{number of anti-electrons}),
\end{align*}
with an implicit sum over the spin $s$. So this is the total charge of the electrons,
where anti-electrons have the opposite charge!
For QED, we usually write $e$ in terms of the fine structure constant
\[
  \alpha = \frac{e^2}{4\pi} \approx \frac{1}{137}
\]
for an electron.
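To get a feel for the size of the coupling, one can invert this relation numerically. The snippet below is a minimal sketch in natural units; the function name `coupling_from_alpha` and the constant `ALPHA` are illustrative choices and do not come from the notes.

```python
import math

# Approximate fine structure constant quoted in the text (assumption: 1/137 exactly).
ALPHA = 1 / 137


def coupling_from_alpha(alpha):
    """Invert alpha = e^2 / (4 pi) for the dimensionless coupling e (natural units)."""
    return math.sqrt(4 * math.pi * alpha)


if __name__ == "__main__":
    e = coupling_from_alpha(ALPHA)
    print(f"e = {e:.4f}")
```

The result is roughly 0.3. Each extra pair of vertices in a diagram costs a factor of $e^2 = 4\pi\alpha$, which is why an expansion in powers of the coupling is a sensible thing to attempt in QED.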
Coupling with complex scalars
Now let’s try to couple with a scalar field. For a real scalar field, there is no
suitable conserved current to couple $A_\mu$ to. For a complex scalar field $\phi$, we can
use the current coming from $\phi \mapsto e^{i\alpha}\phi$, namely
\[
  j_\mu = (\partial_\mu \phi)^* \phi - \phi^* \partial_\mu \phi.
\]
We try the obvious thing with an interaction term
\[
  \mathcal{L}_{\mathrm{int}} = i[(\partial_\mu \phi)^* \phi - \phi^* \partial_\mu \phi] A^\mu.
\]
This doesn’t work. The problem is that we have introduced a new term $j_\mu A^\mu$ to
the Lagrangian. If we compute the conserved current due to $\phi \mapsto e^{i\alpha}\phi$ under
the new Lagrangian, it turns out we will obtain a different $j_\mu$, which means in
this theory, we are not really coupling to the conserved current, and we find
ourselves going in circles.
To solve this problem, we can try random things and eventually come up
with the extra term we need to add to the system to make it consistent, and
then be perpetually puzzled about why that worked. Alternatively, we can also
learn from what we did in the previous example. What we did, at the end, was
to invent a new covariant derivative:
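\[
  D_\mu \phi = \partial_\mu \phi + ieA_\mu \phi,
\]
and then replace all occurrences of $\partial_\mu$ with $D_\mu$. Under a simultaneous gauge
transformation
\[
  A_\mu \mapsto A_\mu + \partial_\mu \lambda(x), \quad \phi \mapsto e^{-ie\lambda(x)}\phi,
\]
the covariant derivative transforms as
\[
  D_\mu \phi \mapsto e^{-ie\lambda(x)} D_\mu \phi,
\]
as before. So we can construct a gauge invariant Lagrangian by
\[
  \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + (D_\mu \phi)^\dagger (D^\mu \phi) - m^2 \phi^* \phi.
\]
The current has the same form as before, up to an overall factor and with
covariant derivatives in place of partial ones:
\[
  j_\mu = i[(D_\mu \phi)^\dagger \phi - \phi^\dagger D_\mu \phi].
\]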
In general, for any field $\varphi$ taking values in a complex vector space, we have a
U(1) gauge symmetry
\[
  \varphi \mapsto e^{-ie\lambda(x)}\varphi.
\]
Then we can couple with the EM field by replacing
\[
  \partial_\mu \varphi \mapsto D_\mu \varphi = \partial_\mu \varphi + ieA_\mu \varphi.
\]
This process of just taking the old Lagrangian and then replacing partial derivatives
with covariant derivatives is known as minimal coupling. More details on
how this works can be found in the Symmetries, Fields and Particles course.
6.4 Quantization of interactions
Let’s work out some quantum amplitudes for light interacting with electrons.
We again have the Lagrangian
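\[
  \mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + \bar{\psi}(i\slashed{D} - m)\psi,
\]
where $D_\mu = \partial_\mu + ieA_\mu$.
For a change, we work in the Coulomb gauge $\nabla \cdot \mathbf{A} = 0$. So the equation of
motion for $A_0$ is
\[
  -\nabla^2 A_0 = e\bar{\psi}\gamma^0\psi = e j_0.
\]
This equation has a solution
\[
  A_0(x) = e \int d^3 x'\, \frac{j_0(\mathbf{x}', t)}{4\pi|\mathbf{x} - \mathbf{x}'|}.
\]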
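In the Coulomb gauge, we can rewrite the Maxwell part (i.e. the $F_{\mu\nu}F^{\mu\nu}$ part) of
the Lagrangian (not density!) as
\begin{align*}
  L_A &= \int d^3 x\, \frac{1}{2}(\mathbf{E}^2 - \mathbf{B}^2) \\
  &= \int d^3 x \left( \frac{1}{2}(\dot{\mathbf{A}} + \nabla A_0)^2 - \frac{1}{2}\mathbf{B}^2 \right) \\
  &= \int d^3 x \left( \frac{1}{2}\dot{\mathbf{A}}^2 + \frac{1}{2}(\nabla A_0)^2 - \frac{1}{2}\mathbf{B}^2 \right),
\end{align*}
where the gauge condition means that the cross-term vanishes by integration by
parts.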
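Integrating by parts and substituting our $A_0$, we get that
\[
  L_A = \int d^3 x \left( \frac{1}{2}\dot{\mathbf{A}}^2 + \frac{e^2}{2} \int d^3 x'\, \frac{j_0(\mathbf{x}) j_0(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} - \frac{1}{2}\mathbf{B}^2 \right).
\]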
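The two integrations by parts can be written out explicitly. This is a sketch under two assumptions: the fields fall off fast enough at spatial infinity for boundary terms to vanish, and the sign convention $-\nabla^2 A_0 = e j_0$ is the one matching the solution for $A_0$ quoted earlier.

```latex
\begin{align*}
  % Cross-term: vanishes by the Coulomb gauge condition \nabla \cdot A = 0.
  \int d^3 x\, \dot{\mathbf{A}} \cdot \nabla A_0
    &= -\int d^3 x\, (\nabla \cdot \dot{\mathbf{A}})\, A_0 = 0, \\
  % (\nabla A_0)^2 term: trade the gradient for the source j_0.
  \int d^3 x\, \frac{1}{2} (\nabla A_0)^2
    &= -\frac{1}{2} \int d^3 x\, A_0 \nabla^2 A_0
     = \frac{e}{2} \int d^3 x\, A_0(\mathbf{x})\, j_0(\mathbf{x}) \\
    &= \frac{e^2}{2} \int d^3 x\, d^3 x'\,
       \frac{j_0(\mathbf{x})\, j_0(\mathbf{x}')}{4\pi |\mathbf{x} - \mathbf{x}'|}.
\end{align*}
```

The last line is just the Coulomb energy of the charge distribution $e j_0$, which is where the non-local term comes from.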
This is weird, because we now have a non-local term in the Lagrangian. This
term arises as an artifact of working in Coulomb gauge. This doesn’t appear in
Lorenz gauge.
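Let’s now compute the Hamiltonian. We will use capital Pi ($\Pi$) for the
conjugate momentum of the electromagnetic potential $\mathbf{A}$, and lower case pi ($\pi$)
for the conjugate momentum of the spinor field. We have
\[
  \boldsymbol{\Pi} = \frac{\partial L}{\partial \dot{\mathbf{A}}} = \dot{\mathbf{A}}, \quad
  \pi_\psi = \frac{\partial L}{\partial \dot{\psi}} = i\psi^\dagger.
\]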
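Putting all these into the Hamiltonian, we get
\[
  H = \int d^3 x \left( \frac{1}{2}\dot{\mathbf{A}}^2 + \frac{1}{2}\mathbf{B}^2 + \bar{\psi}(-i\gamma^i \partial_i + m)\psi - e\,\mathbf{j} \cdot \mathbf{A} + \frac{e^2}{2} \int d^3 x'\, \frac{j_0(\mathbf{x}) j_0(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} \right),
\]
where
\[
  \mathbf{j} = \bar{\psi}\boldsymbol{\gamma}\psi, \quad j_0 = \bar{\psi}\gamma^0\psi.
\]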
After doing the tedious computations, one finds that the Feynman rules are as
follows:
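(i) The photons are denoted as squiggly lines:
[Diagram: squiggly line]
Each line comes with an index $i$ that tells us which component of $\mathbf{A}$ we
are talking about. Each internal line gives us a factor of
\[
  D^{\mathrm{tr}}_{ij} = \frac{i}{p^2 + i\varepsilon} \left( \delta_{ij} - \frac{p_i p_j}{|\mathbf{p}|^2} \right),
\]
while for each external line,
[Diagram: external squiggly line labelled $\mathcal{E}^{(i)}$]
we simply write down a polarization vector $\mathcal{E}^{(i)}$ corresponding to the
polarization of the particle.
(ii) The $e\,\mathbf{j} \cdot \mathbf{A}$ term gives us an interaction of the form
[Diagram: interaction vertex]
This contributes a factor of $-ie\gamma^i$, where $i$ is the index of the squiggly line.
(iii) The non-local term of the Lagrangian gives us instantaneous non-local
interactions, denoted by dashed lines:
[Diagram: dashed line joining the points $x$ and $y$]
The contribution of this factor, in position space, is given by
\[
  \frac{i(e\gamma^0)^2 \delta(x^0 - y^0)}{4\pi|\mathbf{x} - \mathbf{y}|}.
\]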
Whoa. What is this last term saying? Recall that when we previously derived
our Feynman rules, we first obtained some terms from Wick’s theorem that
contained things like $e^{ip\cdot x}$, and then integrated over $x$ to get rid of these terms,
so that the resulting formula only included momentum things. What we’ve
shown here in the last rule is what we have before integrating over $x$ and $y$. We
now want to try to rewrite things so that we can get something nicer.
We note that this piece comes from the $A_0$ term. So one possible strategy
is to make it into a $D_{00}$ piece of the photon propagator. So we treat this as
a photon indexed by $\mu = 0$. We then rewrite the above rules and replace all
indices $i$ with $\mu$ and let it range over $0, 1, 2, 3$.
To do so, we note that
\[
  \frac{\delta(x^0 - y^0)}{4\pi|\mathbf{x} - \mathbf{y}|} = \int \frac{d^4 p}{(2\pi)^4} \frac{e^{ip\cdot(x - y)}}{|\mathbf{p}|^2}.
\]
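This Fourier identity can be checked by splitting the $p^0$ integral from the spatial one. The following is a sketch, writing $t = x^0 - y^0$ and $\mathbf{r} = \mathbf{x} - \mathbf{y}$ (both are shorthand introduced here), and using the standard integral $\int_0^\infty \frac{\sin u}{u}\, du = \frac{\pi}{2}$:

```latex
\begin{align*}
  \int \frac{d^4 p}{(2\pi)^4} \frac{e^{ip\cdot(x - y)}}{|\mathbf{p}|^2}
    &= \left( \int \frac{dp^0}{2\pi}\, e^{ip^0 t} \right)
       \left( \int \frac{d^3 p}{(2\pi)^3}
              \frac{e^{-i\mathbf{p}\cdot\mathbf{r}}}{|\mathbf{p}|^2} \right), \\
  \int \frac{dp^0}{2\pi}\, e^{ip^0 t} &= \delta(t), \\
  \int \frac{d^3 p}{(2\pi)^3} \frac{e^{-i\mathbf{p}\cdot\mathbf{r}}}{|\mathbf{p}|^2}
    &= \frac{1}{(2\pi)^2} \int_0^\infty dp \int_{-1}^{1} d(\cos\theta)\,
       e^{-ip|\mathbf{r}|\cos\theta}
     = \frac{1}{2\pi^2 |\mathbf{r}|} \int_0^\infty \frac{\sin u}{u}\, du
     = \frac{1}{4\pi |\mathbf{r}|}.
\end{align*}
```

The integrand does not depend on $p^0$ at all, which is why the interaction is instantaneous: the $p^0$ integral produces the delta function in time.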
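We now define the $\gamma$-propagator
\[
  D_{\mu\nu}(p) =
  \begin{cases}
    \dfrac{i}{p^2 + i\varepsilon} \left( \delta_{\mu\nu} - \dfrac{p_\mu p_\nu}{|\mathbf{p}|^2} \right) & \mu, \nu \neq 0 \\
    \dfrac{i}{|\mathbf{p}|^2} & \mu = \nu = 0 \\
    0 & \text{otherwise}
  \end{cases}
\]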
We now have the following Feynman rules:
(i) The photons are denoted as squiggly lines:
[Diagram: squiggly line]
Each internal line comes with an index $\mu$ ranging over $0, 1, 2, 3$ that tells
us which component of $A$ we are talking about, and gives us a factor of
$D_{\mu\nu}$, while for each external line,
[Diagram: external squiggly line labelled $\mathcal{E}^{(\mu)}$]
we simply write down a polarization vector $\mathcal{E}^{(\mu)}$ corresponding to the
polarization of the particle.
(ii) The $e\,\mathbf{j} \cdot \mathbf{A}$ term gives us an interaction of the form
[Diagram: interaction vertex]
This contributes a factor of $-ie\gamma^\mu$, where $\mu$ is the index of the squiggly line.
Now the final thing to deal with is the annoying $D_{\mu\nu}$ formula. We claim that
it can always be replaced by
\[
  D_{\mu\nu}(p) = -i\frac{\eta_{\mu\nu}}{p^2},
\]
i.e. in all contractions involving $D_{\mu\nu}$ we care about, contracting with $D_{\mu\nu}$ gives
the same result as contracting with $-i\frac{\eta_{\mu\nu}}{p^2}$. This can be proved in full generality
using momentum conservation, but we will only look at some particular cases.
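Example. Consider the process
\[
  e^- e^- \to e^- e^-.
\]
We look at one particular diagram
[Diagram: electron lines $(p, s) \to (p', s')$ and $(q, r) \to (q', r')$ joined by a photon line, with vertex indices $\mu$ and $\nu$]
We have two vertices, which contribute to a term of
\[
  -e^2 [\bar{u}(p')\gamma^\mu u(p)]\, D_{\mu\nu}(k)\, [\bar{u}(q')\gamma^\nu u(q)],
\]
where
\[
  k = p - p' = -q + q'.
\]
We show that in this case, we can replace $D_{\mu\nu}(k)$ by
\[
  D_{\mu\nu}(k) = \frac{-i\eta_{\mu\nu}}{k^2}.
\]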
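The proof follows from current conservation. Recall that $u(p)$ satisfies
\[
  (\slashed{p} - m)u(p) = 0,
\]
and similarly $\bar{u}(p')(\slashed{p}' - m) = 0$.
We define the spinor combinations
\[
  \alpha^\mu = \bar{u}(p')\gamma^\mu u(p), \quad \beta^\mu = \bar{u}(q')\gamma^\mu u(q).
\]
What we have is that
\[
  k_\mu \alpha^\mu = \bar{u}(p')(\slashed{p} - \slashed{p}')u(p) = \bar{u}(p')(m - m)u(p) = 0.
\]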
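Similarly, $k_\mu \beta^\mu$ is also zero. Writing bold symbols for the spatial parts of the 4-vectors, our Feynman diagram is given by
\[
  \alpha^\mu D_{\mu\nu} \beta^\nu = i \left( \frac{\boldsymbol{\alpha} \cdot \boldsymbol{\beta}}{k^2} - \frac{(\boldsymbol{\alpha} \cdot \mathbf{k})(\boldsymbol{\beta} \cdot \mathbf{k})}{|\mathbf{k}|^2 k^2} + \frac{\alpha^0 \beta^0}{|\mathbf{k}|^2} \right).
\]
But we know that $\alpha^\mu k_\mu = \beta^\mu k_\mu = 0$, i.e. $\boldsymbol{\alpha} \cdot \mathbf{k} = k_0 \alpha^0$ and $\boldsymbol{\beta} \cdot \mathbf{k} = k_0 \beta^0$. So this is equal to
\begin{align*}
  & i \left( \frac{\boldsymbol{\alpha} \cdot \boldsymbol{\beta}}{k^2} - \frac{k_0^2\, \alpha^0 \beta^0}{|\mathbf{k}|^2 k^2} + \frac{\alpha^0 \beta^0}{|\mathbf{k}|^2} \right) \\
  &= i \left( \frac{\boldsymbol{\alpha} \cdot \boldsymbol{\beta}}{k^2} - \frac{1}{|\mathbf{k}|^2 k^2}(k_0^2 - k^2)\, \alpha^0 \beta^0 \right) \\
  &= i \left( \frac{\boldsymbol{\alpha} \cdot \boldsymbol{\beta}}{k^2} - \frac{|\mathbf{k}|^2}{|\mathbf{k}|^2 k^2}\, \alpha^0 \beta^0 \right) \\
  &= -i\frac{\alpha \cdot \beta}{k^2} \\
  &= \alpha^\mu \left( \frac{-i\eta_{\mu\nu}}{k^2} \right) \beta^\nu.
\end{align*}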
What really is giving us this simple form is current conservation.
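The algebra in this example can be checked numerically. The sketch below does not use actual Dirac spinors: it assumes only that $\alpha^\mu$ and $\beta^\mu$ are complex 4-vectors with $k_\mu \alpha^\mu = k_\mu \beta^\mu = 0$, drops the $i\varepsilon$ (so it needs $k^2 \neq 0$), and uses the signature $(+, -, -, -)$. All function names are illustrative.

```python
import random


def minkowski_dot(a, b):
    """Bilinear product a_mu b^mu with signature (+, -, -, -); no complex conjugation."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def spatial_dot(a, b):
    """Euclidean product of the spatial parts of two 4-vectors."""
    return sum(x * y for x, y in zip(a[1:], b[1:]))


def conserved_current(k, rng):
    """Random complex 4-vector alpha^mu obeying k_mu alpha^mu = 0 (requires k^0 != 0)."""
    spatial = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]
    alpha = [0j] + spatial
    # k_mu alpha^mu = k^0 alpha^0 - k.alpha = 0 fixes the time component.
    alpha[0] = spatial_dot(k, alpha) / k[0]
    return alpha


def coulomb_contraction(alpha, beta, k):
    """alpha^mu D_{mu nu} beta^nu with the Coulomb-gauge propagator (transverse + D_00)."""
    k2 = minkowski_dot(k, k)
    k3 = spatial_dot(k, k)
    transverse = (
        spatial_dot(alpha, beta)
        - spatial_dot(alpha, k) * spatial_dot(beta, k) / k3
    ) / k2
    instantaneous = alpha[0] * beta[0] / k3
    return 1j * (transverse + instantaneous)


def covariant_contraction(alpha, beta, k):
    """alpha^mu (-i eta_{mu nu} / k^2) beta^nu."""
    return -1j * minkowski_dot(alpha, beta) / minkowski_dot(k, k)


if __name__ == "__main__":
    rng = random.Random(0)
    k = [0.7, 0.3, -1.1, 0.5]  # arbitrary momentum transfer with k^2 != 0
    alpha = conserved_current(k, rng)
    beta = conserved_current(k, rng)
    diff = coulomb_contraction(alpha, beta, k) - covariant_contraction(alpha, beta, k)
    print(abs(diff))  # should vanish up to floating-point rounding
```

If `conserved_current` is replaced by an arbitrary random vector, the two contractions no longer agree in general, which is a quick way to see that the conservation condition is doing all the work.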
In general, in Lorenz gauge, we have
\[
  D_{\mu\nu} = -\frac{i}{p^2} \left( \eta_{\mu\nu} + (\alpha - 1) \frac{p_\mu p_\nu}{p^2} \right),
\]
and the second term cancels in all physical processes, for similar reasons.
Charged scalars
We quickly go through the Feynman rules for charged complex scalar fields. We
will not use these for anything. The Lagrangian coming from minimal coupling
is
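\[
  \mathcal{L} = (D_\mu \psi)^\dagger D^\mu \psi - \frac{1}{4} F_{\mu\nu} F^{\mu\nu}.
\]
We can expand the first term to get
\[
  (D_\mu \psi)^\dagger D^\mu \psi = \partial_\mu \psi^\dagger \partial^\mu \psi - ieA_\mu (\psi^\dagger \partial^\mu \psi - \psi \partial^\mu \psi^\dagger) + e^2 A_\mu A^\mu \psi^\dagger \psi.
\]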
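The expansion is a direct multiplication. As a sketch, assuming $D_\mu \psi = \partial_\mu \psi + ieA_\mu \psi$ with $A_\mu$ real, so that $(D_\mu \psi)^\dagger = \partial_\mu \psi^\dagger - ieA_\mu \psi^\dagger$:

```latex
\begin{align*}
  (D_\mu \psi)^\dagger D^\mu \psi
    &= (\partial_\mu \psi^\dagger - ieA_\mu \psi^\dagger)
       (\partial^\mu \psi + ieA^\mu \psi) \\
    &= \partial_\mu \psi^\dagger \partial^\mu \psi
       + ieA^\mu\, \psi\, \partial_\mu \psi^\dagger
       - ieA_\mu\, \psi^\dagger \partial^\mu \psi
       + (-ie)(ie) A_\mu A^\mu \psi^\dagger \psi \\
    &= \partial_\mu \psi^\dagger \partial^\mu \psi
       - ieA_\mu (\psi^\dagger \partial^\mu \psi - \psi\, \partial^\mu \psi^\dagger)
       + e^2 A_\mu A^\mu \psi^\dagger \psi .
\end{align*}
```

The term linear in $A_\mu$ gives the three-point vertex and the quadratic term gives the four-point vertex. The latter has no analogue in the fermion case, because the Dirac Lagrangian is first order in derivatives.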
The Feynman rules for these are:
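(i) The first interaction term gives us a possible vertex
[Diagram: one photon attached to a scalar line with momenta $p$ and $q$]
This contributes a factor of $-ie(p + q)_\mu$.
(ii) The $A_\mu A^\mu \psi^\dagger \psi$ term gives diagrams of the form
[Diagram: two photons attached to a scalar line with momenta $p$ and $q$]
This contributes a factor of $2ie^2 \eta_{\mu\nu}$.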
6.5 Computations and diagrams
We now do some examples. Here we will not explain where the positive/negative
signs come from, since it involves some tedious work involving going through
Wick’s theorem, if you are not smart enough to just “see” them.
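Example. Consider the process
\[
  e^- e^- \to e^- e^-.
\]
We again can consider the two diagrams
[Diagrams: photon exchange with $(p, s) \to (p', s')$ and $(q, r) \to (q', r')$, and the same diagram with the two outgoing electrons exchanged; the photon vertices are labelled $\mu$ and $\nu$]
We will do the first diagram slowly. The top vertex gives us a factor of
\[
  -ie[\bar{u}^{s'}_{p'} \gamma^\mu u^s_p].
\]
The bottom vertex gives us
\[
  -ie[\bar{u}^{r'}_{q'} \gamma^\nu u^r_q].
\]
The middle squiggly line gives us
\[
  -\frac{i\eta_{\mu\nu}}{(p' - p)^2}.
\]
So putting all these together, the first diagram gives us a term of
\[
  -i(-ie)^2 \left( \frac{[\bar{u}^{s'}_{p'} \gamma^\mu u^s_p][\bar{u}^{r'}_{q'} \gamma_\mu u^r_q]}{(p' - p)^2} \right).
\]
Similarly, the second diagram gives, with a relative minus sign from exchanging the two identical fermions,
\[
  i(-ie)^2 \left( \frac{[\bar{u}^{s'}_{p'} \gamma^\mu u^r_q][\bar{u}^{r'}_{q'} \gamma_\mu u^s_p]}{(p' - q)^2} \right).
\]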
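Example. Consider the process
\[
  e^+ e^- \to \gamma\gamma.
\]
We have diagrams of the form
[Diagram: incoming fermion lines $(p, s)$ and $(q, r)$, outgoing photons labelled $\mathcal{E}_\mu, p'$ and $\mathcal{E}_\nu, q'$]
This diagram gives us
\[
  i(-ie)^2 \left( \frac{[\bar{v}^r_q \gamma^\nu (\slashed{p} - \slashed{p}' + m) \gamma^\mu u^s_p]}{(p - p')^2 - m^2} \right) \mathcal{E}_\mu(p')\, \mathcal{E}_\nu(q').
\]
Usually, since the mass of an electron is so tiny relative to our energy scales, we
simply ignore it.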
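Example (Bhabha scattering). This is definitely not named after an elephant.
We want to consider the scattering process
\[
  e^+ e^- \to e^+ e^-
\]
with diagrams
[Diagrams: two diagrams with external legs $(p, s)$, $(q, r)$, $(p', s')$, $(q', r')$, one with photon momentum $p - p'$ and one with photon momentum $p + q$]
These contribute
\[
  -i(-ie)^2 \left( -\frac{[\bar{u}^{s'}_{p'} \gamma^\mu u^s_p][\bar{v}^r_q \gamma_\mu v^{r'}_{q'}]}{(p - p')^2} + \frac{[\bar{v}^r_q \gamma^\mu u^s_p][\bar{u}^{s'}_{p'} \gamma_\mu v^{r'}_{q'}]}{(p + q)^2} \right).
\]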
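Example (Compton scattering). Consider scattering of the form
\[
  \gamma e^- \to \gamma e^-.
\]
We have diagrams
[Diagrams: two diagrams with external legs $u_q$, $\bar{u}_{q'}$, $\mathcal{E}_\mu(p)$ and $\mathcal{E}_\nu(p')$; the internal fermion line carries momentum $p + q$ in the first and $q - p'$ in the second]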
Example. Consider the $\gamma\gamma \to \gamma\gamma$ scattering process. We have a loop diagram
[Diagram: loop diagram with four external photons]
Naively, we might expect the contribution to be proportional to
\[
  \int \frac{d^4 k}{k^4}.
\]
This integral itself diverges, but it turns out that if we do this explicitly, the
gauge symmetry causes things to cancel out and renders the diagram finite.
Example (Muon scattering). We could also put muons into the picture. These
behave like electrons, but are actually different things. We can consider scattering
of the form
\[
  e^- \mu^- \to e^- \mu^-.
\]
This has a diagram
[Diagram: an electron line and a muon line exchanging a photon]
We don’t get the diagram obtained by swapping two particles because electrons
are not muons.
We can also get interactions of the form
\[
  e^+ e^- \to \mu^+ \mu^-
\]
by
[Diagram: $e^-$ and $e^+$ annihilating into a photon, which produces $\mu^-$ and $\mu^+$]