Part III The Standard Model
Based on lectures by C. E. Thomas
Notes taken by Dexter Chua
Lent 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
The Standard Model of particle physics is, by far, the most successful application
of quantum field theory (QFT). At the time of writing, it accurately describes all
experimental measurements involving strong, weak, and electromagnetic interactions.
The course aims to demonstrate how this model, a QFT with gauge group
$SU(3) \times SU(2) \times U(1)$ and fermion fields for the leptons and quarks, is realised in nature. It is
intended to complement the more general Advanced QFT course.
We begin by defining the Standard Model in terms of its local (gauge) and global
symmetries and its elementary particle content (spin-half leptons and quarks, and
spin-one gauge bosons). The parity $P$, charge-conjugation $C$ and time-reversal $T$
transformation properties of the theory are investigated. These need not be symmetries
manifest in nature; e.g. only left-handed particles feel the weak force and so it violates
parity symmetry. We show how $CP$ violation becomes possible when there are three
generations of particles and describe its consequences.
Ideas of spontaneous symmetry breaking are applied to discuss the Higgs Mechanism
and why the weakness of the weak force is due to the spontaneous breaking of the
$SU(2) \times U(1)$ gauge symmetry. Recent measurements of what appear to be Higgs boson
decays will be presented.
We show how to obtain cross sections and decay rates from the matrix element squared
of a process. These can be computed for various scattering and decay processes in
the electroweak sector using perturbation theory because the couplings are small. We
touch upon the topic of neutrino masses and oscillations, an important window to
physics beyond the Standard Model.
The strong interaction is described by quantum chromodynamics (QCD), the non-abelian
gauge theory of the (unbroken) $SU(3)$ gauge symmetry. At low energies quarks
are confined and form bound states called hadrons. The coupling constant decreases
as the energy scale increases, to the point where perturbation theory can be used. As
an example we consider electron-positron annihilation to final state hadrons at high
energies. Time permitting, we will discuss nonperturbative approaches to QCD. For
example, the framework of effective field theories can be used to make progress in the
limits of very small and very large quark masses.
Both very high-energy experiments and very precise experiments are currently striving
to observe effects that cannot be described by the Standard Model alone. If time
permits, we comment on how the Standard Model is treated as an effective field theory
to accommodate (so far hypothetical) effects beyond the Standard Model.
Pre-requisites
It is necessary to have attended the Quantum Field Theory and the Symmetries, Fields
and Particles courses, or to be familiar with the material covered in them. It would
be advantageous to attend the Advanced QFT course during the same term as this
course, or to study renormalisation and non-abelian gauge fixing.
Contents
0 Introduction
1 Overview
2 Chiral and gauge symmetries
2.1 Chiral symmetry
2.2 Gauge symmetry
3 Discrete symmetries
3.1 Symmetry operators
3.2 Parity
3.3 Charge conjugation
3.4 Time reversal
3.5 S-matrix
3.6 CPT theorem
3.7 Baryogenesis
4 Spontaneous symmetry breaking
4.1 Discrete symmetry
4.2 Continuous symmetry
4.3 General case
4.4 Goldstone’s theorem
4.5 The Higgs mechanism
4.6 Non-abelian gauge theories
5 Electroweak theory
5.1 Electroweak gauge theory
5.2 Coupling to leptons
5.3 Quarks
5.4 Neutrino oscillation and mass
5.5 Summary of electroweak theory
6 Weak decays
6.1 Effective Lagrangians
6.2 Decay rates and cross sections
6.3 Muon decay
6.4 Pion decay
6.5 $K^0$-$\bar{K}^0$ mixing
7 Quantum chromodynamics (QCD)
7.1 QCD Lagrangian
7.2 Renormalization
7.3 $e^+ e^- \to$ hadrons
7.4 Deep inelastic scattering
0 Introduction
In the Michaelmas Quantum Field Theory course, we studied the general theory
of quantum field theory. At the end of the course, we had a glimpse of quantum
electrodynamics (QED), which was an example of a quantum field theory that
described electromagnetic interactions. QED was a massively successful theory
that explained experimental phenomena involving electromagnetism to a very
high accuracy.
But of course, that is far from being a “complete” description of the universe.
It does not tell us what “matter” is actually made up of, nor does it account for all sorts of other
interactions that exist in the universe. For example, the atomic nucleus contains
positively-charged protons and neutral neutrons. From a purely electromagnetic
point of view, they should immediately blow apart. So there must be some force
that holds the particles together.
Similarly, in experiments, we observe that certain particles undergo decay.
For example, muons may decay into electrons and neutrinos. QED doesn’t
explain these phenomena either.
As time progressed, physicists managed to put together all sorts of experi-
mental data, and came up with the Standard Model. This is the best description
of the universe we have, but it is lacking in many aspects. Most spectacularly,
it does not explain gravity at all. There are also some details not yet fully
sorted out, such as the nature of neutrinos. In this course, our objective is to
understand the standard model.
Perhaps to the disappointment of many readers, it will take us a while before
we manage to get to the actual standard model. Instead, during the first half of
the course, we are going to discuss some general theory regarding symmetries.
These are crucial to the development of the theory, as symmetry concerns impose
a lot of restriction on what our theories can be. More importantly, the “forces”
in the standard model can be (almost) completely described by the gauge group
we have, which is $SU(3) \times SU(2) \times U(1)$. Making sense of these ideas is crucial
to understanding how the standard model works.
With the machinery in place, the description of the Standard Model itself is
pretty short. After describing the Standard Model, we will do various
computations with it, and make some predictions. Such predictions were of course
important: they allowed us to verify that our theory is correct! Experimental
data also helps us determine the values of the constants that appear in the
theory, and our computations will help indicate how this is possible.
Historically, the standard model was of course discovered experimentally,
and a lot of the constructions and choices are motivated only by experimental
reasons. Since we are theorists, we are not going to motivate our choices by
experiments, but we will just write them down.
1 Overview
We begin with a quick overview of the things that exist in the standard model.
This description will mostly be words, and the actual theory will have to come
quite some time later.
Forces are mediated by spin 1 gauge bosons. These include
The electromagnetic field (EM ), which is mediated by the photon.
This is described by quantum electrodynamics (QED);
The weak interaction, which is mediated by the $W^\pm$ and $Z$ bosons; and
The strong interaction, which is mediated by gluons $g$. This is
described by quantum chromodynamics (QCD).
While the electromagnetic field and weak interaction seem very different,
we will see that at high energies, they merge together, and can be described
by a single gauge group.
Matter is described by spin $\frac{1}{2}$ fermions. These are described by Dirac
spinors. Roughly, we can classify them into 4 “types”, and each type comes
in 3 generations, which will be denoted G1, G2, G3 in the following table,
which lists which forces they interact with, alongside their charge.
Type              G1       G2        G3         Charge   EM   Weak   Strong
Charged leptons   $e$      $\mu$     $\tau$     $-1$     ✓    ✓      ✗
Neutrinos         $\nu_e$  $\nu_\mu$ $\nu_\tau$ $0$      ✗    ✓      ✗
Positive quarks   $u$      $c$       $t$        $+2/3$   ✓    ✓      ✓
Negative quarks   $d$      $s$       $b$        $-1/3$   ✓    ✓      ✓
The first two types are known as leptons, while the latter two are known
as quarks. We do not know why there are three generations. Note that a
particle interacts with the electromagnetic field iff it has non-zero charge,
since the charge by definition measures the strength of the interaction with
the electromagnetic field.
There is the Higgs boson, which has spin 0. This is responsible for giving
mass to the $W^\pm$, $Z$ bosons and fermions. This was discovered only in 2012
at CERN, and subsequently more of its properties have been measured, e.g. its
spin.
As one would expect from the name, the gauge bosons are manifestations of
local gauge symmetries. The gauge group in the Standard Model is
\[
  SU(3)_C \times SU(2)_L \times U(1)_Y.
\]
We can talk a bit more about each component. The subscripts indicate what
things the group is responsible for. The subscript on $SU(3)_C$ means “colour”,
and this is responsible for the strong force. We will not talk about this much,
because it is complicated.
The remaining $SU(2)_L \times U(1)_Y$ bit collectively gives the electroweak interaction.
This is a unified description of electromagnetism and the weak force.
These are a bit funny. The $SU(2)_L$ is a chiral interaction. It only couples to
left-handed particles, hence the $L$. The $U(1)_Y$ is something we haven’t heard of
(probably), namely the hypercharge, which is conventionally denoted $Y$. Note
that while electromagnetism also has a U(1) gauge group, it is different from
this U(1) we see.
Types of symmetry
One key principle guiding our study of the standard model is symmetry. Sym-
metries can manifest themselves in a number of ways.
(i) We can have an intact symmetry, or exact symmetry. In other words,
this is an actual symmetry. For example, $U(1)_{EM}$ and $SU(3)_C$ are exact
symmetries in the standard model.
(ii) Symmetries can be broken by an anomaly. This is a symmetry that exists
in the classical theory, but goes away when we quantize. Examples include
global axial symmetry for massless spinor fields in the standard model.
(iii) Symmetry is explicitly broken by some terms in the Lagrangian. This is
not a symmetry, but if those annoying terms are small (intentionally left
vague), then we have an approximate symmetry, and it may also be useful
to consider these.
For example, in the standard model, the up and down quarks are very
close in mass, but not exactly the same. This gives rise to the (global) isospin
symmetry.
(iv) The symmetry is respected by the Lagrangian $\mathcal{L}$, but not by the vacuum.
This is a “hidden symmetry”.
(a) We can have a spontaneously broken symmetry: we have a vacuum
expectation value for one or more scalar fields, e.g. the breaking of
$SU(2)_L \times U(1)_Y$ into $U(1)_{EM}$.
(b) Even without scalar fields, we can get dynamical symmetry breaking
from quantum effects. An example of this in the standard model is
the $SU(2)_L \times SU(2)_R$ global symmetry in the strong interaction.
One can argue that (i) is the only case where we actually have a symmetry, but
the others are useful to consider as well, and we will study them.
2 Chiral and gauge symmetries
We begin with the discussion of chiral and gauge symmetries. These concepts
should all be familiar from the Michaelmas QFT course. But it is beneficial
to do a bit of review here. While doing so, we will set straight our sign and
notational conventions, and we will also highlight the important parts in the
theory.
As always, we will work in natural units $c = \hbar = 1$, and the sign convention
is $(+, -, -, -)$.
2.1 Chiral symmetry
Chiral symmetry is something that manifests itself when we have spinors. Since
all matter fields are spinors, this is clearly important.
The notion of chirality is something that exists classically, and we shall begin
by working classically. A Dirac spinor is, in particular, a (4-component) spinor
field. As usual, the Dirac matrices $\gamma^\mu$ are $4 \times 4$ matrices satisfying the Clifford
algebra relations
\[
  \{\gamma^\mu, \gamma^\nu\} = 2 g^{\mu\nu} I,
\]
where $g^{\mu\nu}$ is the Minkowski metric. For any operator $A_\mu$, we write
\[
  \slashed{A} = \gamma^\mu A_\mu.
\]
Then a Dirac fermion is defined to be a spinor field satisfying the Dirac equation
\[
  (i\slashed{\partial} - m)\psi = 0.
\]
We define
\[
  \gamma^5 = +i\gamma^0\gamma^1\gamma^2\gamma^3,
\]
which satisfies
\[
  (\gamma^5)^2 = I, \quad \{\gamma^5, \gamma^\mu\} = 0.
\]
One can do a lot of stuff without choosing a particular basis/representation for
the $\gamma$-matrices, and the physics we get out must be the same regardless of which
representation we choose, but sometimes it is convenient to pick some particular
representation to work with. We’ll generally use the chiral representation (or
Weyl representation), with
\[
  \gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
  \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}, \quad
  \gamma^5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},
\]
where the $\sigma^i \in \mathrm{Mat}_2(\mathbb{C})$ are the Pauli matrices.
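The relations above can be checked numerically in the chiral representation. The following is a minimal sketch using only plain Python lists; the helper names (`block`, `matmul`, `close`, `clifford_holds`, and so on) are illustrative choices, not notation from the course.

```python
# Pauli matrices and the 2x2 identity / zero blocks.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I4 = block(I2, Z2, Z2, I2)
METRIC = [1, -1, -1, -1]  # diagonal of g^{mu nu} in the (+, -, -, -) convention

# Chiral (Weyl) representation as given in the text.
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]
# gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3
GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def clifford_holds():
    """Check {gamma^mu, gamma^nu} = 2 g^{mu nu} I for all mu, nu."""
    for mu in range(4):
        for nu in range(4):
            g = METRIC[mu] if mu == nu else 0
            if not close(anticommutator(GAMMA[mu], GAMMA[nu]), scale(2 * g, I4)):
                return False
    return True

print(clifford_holds())
print(close(GAMMA5, block(scale(-1, I2), Z2, Z2, I2)))
```

The second check confirms that the product formula for $\gamma^5$ reproduces the block-diagonal matrix quoted above, which is a useful guard against sign slips in the $\gamma^i$.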
Chirality
$\gamma^5$ is a particularly interesting matrix to consider. We saw that it satisfies
$(\gamma^5)^2 = 1$. So we know that $\gamma^5$ is diagonalizable, and the eigenvalues of $\gamma^5$ are
either $+1$ or $-1$.
Definition (Chirality). A Dirac fermion $\psi$ is right-handed if $\gamma^5\psi = \psi$, and
left-handed if $\gamma^5\psi = -\psi$.
A left- or right-handed fermion is said to have definite chirality.
In general, a fermion need not have definite chirality. However, as we know
from linear algebra, the spinor space is a direct sum of the eigenspaces of $\gamma^5$. So
given any spinor $\psi$, we can write it as
\[
  \psi = \psi_L + \psi_R,
\]
where $\psi_L$ is left-handed, and $\psi_R$ is right-handed.
It is not hard to find these $\psi_L$ and $\psi_R$. We define the projection operators
\[
  P_R = \frac{1}{2}(1 + \gamma^5), \quad P_L = \frac{1}{2}(1 - \gamma^5).
\]
It is a direct computation to show that
\[
  \gamma^5 P_L = -P_L, \quad \gamma^5 P_R = P_R, \quad (P_{R,L})^2 = P_{R,L}, \quad P_L + P_R = I.
\]
Thus, we can write
\[
  \psi = (P_L + P_R)\psi = (P_L\psi) + (P_R\psi),
\]
and thus we have
Notation.
\[
  \psi_L = P_L\psi, \quad \psi_R = P_R\psi.
\]
Moreover, we notice that
\[
  P_L P_R = P_R P_L = 0.
\]
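This implies
Lemma. If $\psi_L, \phi_L$ are left-handed and $\psi_R, \phi_R$ are right-handed, then
\[
  \bar\psi_L \phi_L = \bar\psi_R \phi_R = 0.
\]
Proof. We only do the left-handed case.
\[
\begin{aligned}
  \bar\psi_L \phi_L &= \psi_L^\dagger \gamma^0 \phi_L \\
  &= (P_L\psi_L)^\dagger \gamma^0 (P_L\phi_L) \\
  &= \psi_L^\dagger P_L \gamma^0 P_L \phi_L \\
  &= \psi_L^\dagger P_L P_R \gamma^0 \phi_L \\
  &= 0,
\end{aligned}
\]
using the fact that $\{\gamma^5, \gamma^0\} = 0$.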
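As a sanity check of the lemma, one can project two arbitrary spinors and evaluate the bilinears directly. This is a small sketch in plain Python, assuming the chiral representation for $\gamma^0$ and $\gamma^5$; the test spinors and helper names (`bar_product`, `combine`) are made up for illustration.

```python
# gamma^0 and gamma^5 in the chiral representation, written out as 4x4 matrices.
GAMMA0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA5 = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def combine(c1, a, c2, b):
    """Return the matrix c1 * a + c2 * b."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

P_L = combine(0.5, I4, -0.5, GAMMA5)  # (1 - gamma^5) / 2
P_R = combine(0.5, I4, 0.5, GAMMA5)   # (1 + gamma^5) / 2

def bar_product(psi, phi):
    """Return psi-bar phi = psi^dagger gamma^0 phi."""
    g0_phi = matvec(GAMMA0, phi)
    return sum(complex(x).conjugate() * y for x, y in zip(psi, g0_phi))

# Arbitrary test spinors (illustrative values only).
psi = [1 + 2j, -0.5j, 3, 0.25 - 1j]
phi = [0.5, 2 - 1j, -1j, 4]
psi_L, psi_R = matvec(P_L, psi), matvec(P_R, psi)
phi_L, phi_R = matvec(P_L, phi), matvec(P_R, phi)

print(abs(bar_product(psi_L, phi_L)))  # same-chirality bilinear
print(abs(bar_product(psi_R, phi_R)))  # same-chirality bilinear
print(bar_product(psi_L, phi_R))       # mixed-chirality bilinear
```

Only the mixed-chirality combination survives, which is the structure the mass term relies on later in this section.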
The projection operators look very simple in the chiral representation. They
simply look like
\[
  P_L = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \quad
  P_R = \begin{pmatrix} 0 & 0 \\ 0 & I \end{pmatrix}.
\]
Now we have produced these $\psi_L$ and $\psi_R$, but what are they? In particular, are
they themselves Dirac spinors? We notice that the anti-commutation relations
give
\[
  \slashed{\partial}\gamma^5\psi = -\gamma^5\slashed{\partial}\psi.
\]
Thus, $\gamma^5$ does not commute with the Dirac operator $(i\slashed{\partial} - m)$, and one can check
that in general, $\psi_L$ and $\psi_R$ do not satisfy the Dirac equation. But there is one
exception. If the spinor is massless, so $m = 0$, then the Dirac equation becomes
\[
  \slashed{\partial}\psi = 0.
\]
If this holds, then we also have $\slashed{\partial}\gamma^5\psi = -\gamma^5\slashed{\partial}\psi = 0$. In other words, $\gamma^5\psi$ is still
a Dirac spinor, and hence
Proposition. If $\psi$ is a massless Dirac spinor, then so are $\psi_L$ and $\psi_R$.
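More generally, if the mass is non-zero, then the Lagrangian is given by
\[
  \mathcal{L} = \bar\psi(i\slashed{\partial} - m)\psi
  = \bar\psi_L i\slashed{\partial}\psi_L + \bar\psi_R i\slashed{\partial}\psi_R
  - m(\bar\psi_L\psi_R + \bar\psi_R\psi_L).
\]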
In general, it “makes sense” to treat $\psi_L$ and $\psi_R$ as separate objects only if the
spinor field is massless. This is crucial. As mentioned in the overview, the weak
interaction only couples to left-handed fermions. For this to “make sense”, the
fermions must be massless! But the electrons and other fermions we know and love
are not massless. Thus, their masses cannot
have come from a direct mass term in the Lagrangian. They must obtain them via
some other mechanism, namely the Higgs mechanism.
To see an example of how the mass makes such a huge difference, by staring
at the Lagrangian, we notice that if the fermion is massless, then we have a
$U(1)_L \times U(1)_R$ global symmetry: under an element $(\alpha_L, \alpha_R) \in U(1)_L \times U(1)_R$,
the fermion transforms as
\[
  \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} \mapsto
  \begin{pmatrix} e^{i\alpha_L}\psi_L \\ e^{i\alpha_R}\psi_R \end{pmatrix}.
\]
The adjoint field transforms as
\[
  \begin{pmatrix} \bar\psi_L \\ \bar\psi_R \end{pmatrix} \mapsto
  \begin{pmatrix} e^{-i\alpha_L}\bar\psi_L \\ e^{-i\alpha_R}\bar\psi_R \end{pmatrix},
\]
and we see that the Lagrangian is invariant. However, if we had a massive
particle, then we would have the cross terms in the Lagrangian. The only way
for the Lagrangian to remain invariant is if $\alpha_L = \alpha_R$, and the symmetry has
reduced to a single U(1) symmetry.
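The different behaviour of the kinetic and mass terms can be seen on constant spinors. In the sketch below, `chi` stands in for $\partial_\mu\psi$ (which transforms like $\psi$ when the phases are constant), so $\bar\psi\gamma^\mu\chi$ has the chiral structure of the kinetic term and $\bar\psi\chi$ that of the mass term. The spinor values, angles and function names are illustrative assumptions, and the chiral representation is assumed so that the upper two components are left-handed.

```python
import cmath

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(m):
    return [[-x for x in row] for row in m]

# gamma^0, gamma^1, gamma^2, gamma^3 in the chiral representation.
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in SIGMA]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def bilinear(psi, chi, mu=None):
    """Return psi-bar chi if mu is None, else psi-bar gamma^mu chi."""
    v = chi if mu is None else matvec(GAMMA[mu], chi)
    v = matvec(GAMMA[0], v)
    return sum(complex(x).conjugate() * y for x, y in zip(psi, v))

def chiral_rotate(psi, alpha_l, alpha_r):
    """Multiply the left-handed (upper) components by exp(i alpha_l)
    and the right-handed (lower) components by exp(i alpha_r)."""
    pl, pr = cmath.exp(1j * alpha_l), cmath.exp(1j * alpha_r)
    return [pl * psi[0], pl * psi[1], pr * psi[2], pr * psi[3]]

psi = [1, 2j, 3, -1]
chi = [0.5j, 1, -2, 1 + 1j]   # plays the role of a derivative of psi

def change(alpha_l, alpha_r, mu=None):
    """Absolute change of the bilinear under the chiral rotation."""
    before = bilinear(psi, chi, mu)
    after = bilinear(chiral_rotate(psi, alpha_l, alpha_r),
                     chiral_rotate(chi, alpha_l, alpha_r), mu)
    return abs(after - before)

print([change(0.3, 0.7, mu) for mu in range(4)])  # kinetic-type terms
print(change(0.3, 0.7))                           # mass-type term, unequal phases
print(change(0.5, 0.5))                           # mass-type term, equal phases
```

The kinetic-type bilinears are unchanged for independent phases, while the mass-type bilinear is only unchanged when the two phases agree.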
Quantization of Dirac field
Another important difference the mass makes is the notion of helicity. This is
a quantum phenomenon, so we need to describe the quantum theory of Dirac
fields. When we quantize the Dirac field, we can decompose it as
\[
  \psi = \sum_{s, p} \left[ b^s(p) u^s(p) e^{-ip\cdot x} + d^{s\dagger}(p) v^s(p) e^{ip\cdot x} \right].
\]
We explain these things term by term:
$s$ is the spin, and takes values $s = \pm\frac{1}{2}$.
The summation over all $p$ is actually an integral
\[
  \sum_p = \int \frac{d^3 p}{(2\pi)^3 (2E_p)}.
\]
$b^\dagger$ and $d^\dagger$ are operators that create positive and negative frequency particles
respectively. We use relativistic normalization, where the states
\[
  |p\rangle = b^\dagger(p) |0\rangle
\]
satisfy
\[
  \langle p | q \rangle = (2\pi)^3 2E_p \delta^{(3)}(\mathbf{p} - \mathbf{q}).
\]
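The $u^s(p)$ and $v^s(p)$ form a basis of the solution space to the (classical)
Dirac equation, so that
\[
  u^s(p) e^{-ip\cdot x}, \quad v^s(p) e^{ip\cdot x}
\]
are solutions for any $p$ and $s$. In the chiral representation, we can write
them as
\[
  u^s(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^s \\ \sqrt{p\cdot\bar\sigma}\,\xi^s \end{pmatrix}, \quad
  v^s(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta^s \\ -\sqrt{p\cdot\bar\sigma}\,\eta^s \end{pmatrix},
\]
where as usual
\[
  \sigma^\mu = (I, \sigma^i), \quad \bar\sigma^\mu = (I, -\sigma^i),
\]
and $\{\xi^{\pm\frac{1}{2}}\}$ and $\{\eta^{\pm\frac{1}{2}}\}$ are bases for $\mathbb{R}^2$.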
We can define a quantum operator corresponding to the chirality, known as
helicity.
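Definition (Helicity). We define the helicity to be the projection of the angular
momentum onto the direction of the linear momentum:
\[
  h = \mathbf{J}\cdot\hat{\mathbf{p}} = \mathbf{S}\cdot\hat{\mathbf{p}},
\]
where
\[
  \mathbf{J} = -i\mathbf{r}\times\nabla + \mathbf{S}
\]
is the total angular momentum, and $\mathbf{S}$ is the spin operator given by
\[
  S_i = \frac{i}{4}\varepsilon_{ijk}\gamma^j\gamma^k
  = \frac{1}{2}\begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix}.
\]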
The main claim about helicity is that for a massless spinor, it reduces to the
chirality, in the following sense:
Proposition. If we have a massless spinor $u$, then
\[
  hu(p) = \frac{\gamma^5}{2}u(p).
\]
Proof. Note that if we have a massless particle, then we have
\[
  \slashed{p}u = 0,
\]
since quantumly, $p$ is just given by differentiation. We write this out explicitly
to see
\[
  \gamma^\mu p_\mu u^s = (\gamma^0 p^0 - \boldsymbol{\gamma}\cdot\mathbf{p})u = 0.
\]
Multiplying it by $\gamma^5\gamma^0/p^0$ gives
\[
  \gamma^5 u(p) = \gamma^5\gamma^0\gamma^i\frac{p^i}{p^0}u(p).
\]
Again since the particle is massless, we know
\[
  (p^0)^2 - \mathbf{p}\cdot\mathbf{p} = 0.
\]
So $\hat{\mathbf{p}} = \mathbf{p}/p^0$. Also, by direct computation, we find that
\[
  \gamma^5\gamma^0\gamma^i = 2S^i.
\]
So it follows that
\[
  \gamma^5 u(p) = 2hu(p).
\]
In particular, we have
\[
  hu_{L,R} = \frac{\gamma^5}{2}u_{L,R} = \mp\frac{1}{2}u_{L,R}.
\]
So $u_{L,R}$ has helicity $\mp\frac{1}{2}$.
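The two ingredients of the proof can be checked numerically: the matrix identity $\gamma^5\gamma^0\gamma^i = 2S^i$, and the helicity of a massless spinor moving along the $z$-axis. The sketch below assumes the chiral representation and the spinor formula $u = (\sqrt{p\cdot\sigma}\,\xi, \sqrt{p\cdot\bar\sigma}\,\xi)$ evaluated at $p = (E, 0, 0, E)$; the energy value and helper names are illustrative.

```python
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def vec_close(u, v, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(u, v))

GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]
GAMMA5 = block(scale(-1, I2), Z2, Z2, I2)

def spin(i):
    """S_i = (i/4) eps_{ijk} gamma^j gamma^k for i = 1, 2, 3."""
    j, k = i % 3 + 1, (i + 1) % 3 + 1  # (i, j, k) is a cyclic permutation of (1, 2, 3)
    comm = add(matmul(GAMMA[j], GAMMA[k]), scale(-1, matmul(GAMMA[k], GAMMA[j])))
    return scale(0.25j, comm)

identity_ok = all(
    close(matmul(matmul(GAMMA5, GAMMA[0]), GAMMA[i]), scale(2, spin(i)))
    for i in (1, 2, 3)
)

# Massless spinors with momentum along +z: p = (E, 0, 0, E).
E = 2.0
a = math.sqrt(2 * E)
u_right = [0, 0, a, 0]   # from xi = (1, 0)
u_left = [0, a, 0, 0]    # from xi = (0, 1)

# p-slash = gamma^0 p^0 - gamma^3 p^3, and helicity h = S_3 for motion along +z.
p_slash = add(scale(E, GAMMA[0]), scale(-E, GAMMA[3]))
h = spin(3)

print(identity_ok)
print(matvec(h, u_right), matvec(h, u_left))
```

With these conventions `u_right` comes out with helicity $+\frac{1}{2}$ and `u_left` with $-\frac{1}{2}$, matching the $\mp\frac{1}{2}$ assignment for $u_{L,R}$.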
This is another crucial observation. Helicity is the spin in the direction of the
momentum, and spin is what has to be conserved. On the other hand, chirality is
what determines whether it interacts with the weak force. For massless particles,
these two notions coincide. Thus, the spin of the particles participating in weak
interactions is constrained by the fact that the weak force only couples to left-
handed particles. Consequently, spin conservation will forbid certain interactions
from happening.
However, once our particles have mass, then helicity is different from chirality.
In general, helicity is still quite closely related to chirality, at least when the mass is
small. So the interactions that were previously forbidden are now rather unlikely
to happen, but are actually possible.
2.2 Gauge symmetry
Another important aspect of the Standard Model is the notion of a gauge
symmetry. Classically, the Dirac equation has the gauge symmetry
\[
  \psi(x) \mapsto e^{i\alpha}\psi(x)
\]
for any constant $\alpha$, i.e. this transformation leaves all observable physics unchanged.
However, if we allow $\alpha$ to vary with $x$, then unsurprisingly, the kinetic
term in the Dirac Lagrangian is no longer invariant. In particular, it transforms
as
\[
  \bar\psi i\slashed{\partial}\psi \mapsto \bar\psi i\slashed{\partial}\psi - \bar\psi\gamma^\mu\psi\,\partial_\mu\alpha(x).
\]
To fix this problem, we introduce a gauge covariant derivative $D_\mu$ that transforms
as
\[
  D_\mu\psi(x) \mapsto e^{i\alpha(x)} D_\mu\psi(x).
\]
Then we find that $\bar\psi i\slashed{D}\psi$ transforms as
\[
  \bar\psi i\slashed{D}\psi \mapsto \bar\psi i\slashed{D}\psi.
\]
So if we replace every $\partial_\mu$ with $D_\mu$ in the Lagrangian, then we obtain a gauge
invariant theory.
To do this, we introduce a gauge field $A_\mu(x)$, and then define
\[
  D_\mu\psi(x) = (\partial_\mu + igA_\mu)\psi(x).
\]
We then assert that under a gauge transformation $\alpha(x)$, the gauge field $A_\mu$
transforms as
\[
  A_\mu \mapsto A_\mu - \frac{1}{g}\partial_\mu\alpha(x).
\]
It is then a routine exercise to check that $D_\mu$ transforms as claimed.
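For completeness, the routine exercise can be written out in a few lines. Here $\psi' = e^{i\alpha(x)}\psi$ and $A'_\mu = A_\mu - \frac{1}{g}\partial_\mu\alpha$ denote the transformed fields (the primes are notation introduced only for this check):

```latex
\[
\begin{aligned}
  D'_\mu\psi' &= \left(\partial_\mu + igA_\mu - i\,\partial_\mu\alpha\right)\left(e^{i\alpha}\psi\right) \\
  &= e^{i\alpha}\left(\partial_\mu\psi + i(\partial_\mu\alpha)\psi + igA_\mu\psi - i(\partial_\mu\alpha)\psi\right) \\
  &= e^{i\alpha}\left(\partial_\mu + igA_\mu\right)\psi = e^{i\alpha}D_\mu\psi.
\end{aligned}
\]
```

The term produced by differentiating the phase is cancelled exactly by the shift of $A_\mu$, which is why the factor $\frac{1}{g}$ appears in the transformation rule.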
If we want to think of $A_\mu(x)$ as some physical field, then it should have a
kinetic term. The canonical choice is
\[
  \mathcal{L}_G = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu},
\]
where
\[
  F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = \frac{1}{ig}[D_\mu, D_\nu].
\]
We call this a U(1) gauge theory, because $e^{i\alpha}$ is an element of U(1). Officially,
$A_\mu$ is an element of the Lie algebra $\mathfrak{u}(1)$, but it is isomorphic to $\mathbb{R}$, so we did
not bother to make this distinction.
What we have here is a rather simple overview of how gauge theory works.
In reality, we find the weak field couples only with left-handed fields, and we
have to modify the construction accordingly.
Moreover, once we step out of the world of electromagnetism, we have to
work with a more complicated gauge group. In particular, the gauge group will
be non-abelian. Thus, the Lie algebra $\mathfrak{g}$ has a non-trivial bracket, and it turns
out the right general formulation should include some brackets in $F_{\mu\nu}$ and the
transformation rule for $A_\mu$. We will leave these for a later time.
3 Discrete symmetries
We are familiar with the fact that physics is invariant under Lorentz transfor-
mations and translations. These were relatively easy to understand, because
they are “continuous symmetries”. It is possible to “deform” any such transfor-
mation continuously (and even smoothly) to the identity transformation, and
thus to understand these transformations, it often suffices to understand them
“infinitesimally”.
There is also a “trivial” reason why these are easy to understand: the
“types” of fields we have, namely vector fields, scalar fields etc. are defined by
how they transform under change of coordinates. Consequently, by definition,
we know how vector fields transform under Lorentz transformations.
In this chapter, we are going to study discrete symmetries. These cannot be
understood by such means, and we need to do a bit more work to understand
them. It is important to note that these discrete “symmetries” aren’t actually
symmetries of the universe. Physics is not invariant under these transformations.
However, it is still important to understand them, and in particular understand
how they fail to be symmetries.
We can briefly summarize the three discrete symmetries we are interested in
as follows:
Parity (P): $(t, \mathbf{x}) \mapsto (t, -\mathbf{x})$
Time-reversal (T): $(t, \mathbf{x}) \mapsto (-t, \mathbf{x})$
Charge conjugation (C): This sends particles to anti-particles and vice
versa.
Of course, we can also perform combinations of these. For example, CP corre-
sponds to first applying the parity transformation, and then applying charge
conjugation. It turns out none of these are symmetries of the universe. Even
worse, any combination of two transformations is not a symmetry. We will
discuss these violations later on as we develop our theory.
Fortunately, the combination of all three, namely CPT, is a symmetry of the
universe. This is not (just) an experimental observation. It is possible to prove
(in some sense) that any (sensible) quantum field theory must be invariant under
CPT, and this is known as the CPT theorem.
Nevertheless, for the purposes of this chapter, we will assume that C, P, T
are indeed symmetries, and try to derive some consequences assuming this were
the case.
The above description of P and T tells us how the universe transforms
under the transformations, and the description of the charge conjugation is just
some vague words. The goal of this chapter is to figure out what exactly these
transformations do to our fields, and, ultimately, what they do to the $S$-matrix.
Before we begin, it is convenient to rephrase the definition of P and T as
follows. A general Poincaré transformation can be written as a map
\[
  x^\mu \mapsto x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu.
\]
A proper Lorentz transform has $\det\Lambda = +1$. The transforms given by parity and
time reversal are improper transformations, and are given by
Definition (Parity transform). The parity transform is given by
\[
  \Lambda^\mu{}_\nu = P^\mu{}_\nu =
  \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.
\]
Definition (Time reversal transform). The time reversal transform is given by
\[
  T^\mu{}_\nu =
  \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]
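A quick numerical look at these two matrices confirms that they are Lorentz transformations (they preserve the metric) but improper ones (determinant $-1$). This is a minimal sketch; the helper names and the sample point are illustrative, and `det_diagonal` is only valid because both matrices are diagonal.

```python
def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

METRIC = diag([1, -1, -1, -1])          # (+, -, -, -) convention
PARITY = diag([1, -1, -1, -1])          # flips the spatial coordinates
TIME_REVERSAL = diag([-1, 1, 1, 1])     # flips the time coordinate

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det_diagonal(m):
    """Determinant of a diagonal matrix: the product of its diagonal entries."""
    result = 1
    for i in range(len(m)):
        result *= m[i][i]
    return result

def preserves_metric(lam):
    """Check the Lorentz condition Lambda^T g Lambda = g."""
    return matmul(matmul(transpose(lam), METRIC), lam) == METRIC

def apply(lam, x):
    return [sum(lam[i][k] * x[k] for k in range(4)) for i in range(4)]

x = [2.0, 1.0, -3.0, 0.5]   # a sample point (t, x, y, z)
print(apply(PARITY, x))
print(apply(TIME_REVERSAL, x))
print(det_diagonal(PARITY), det_diagonal(TIME_REVERSAL))
print(preserves_metric(PARITY), preserves_metric(TIME_REVERSAL))
```

Note that the product of the two matrices is minus the identity, which has determinant $+1$ but still cannot be continuously connected to the identity because it reverses the direction of time.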
3.1 Symmetry operators
How do these transformations of the universe relate to our quantum-mechanical
theories? In quantum mechanics, we have a state space $\mathcal{H}$, and the fields are
operators $\phi(t, \mathbf{x}): \mathcal{H} \to \mathcal{H}$ for each $(t, \mathbf{x}) \in \mathbb{R}^{1,3}$. Unfortunately, we do not have
a general theory of how we can turn a classical theory into a quantum one, so
we need to make some (hopefully natural) assumptions about how this process
behaves.
The first assumption is absolutely crucial to get ourselves going.
Assumption. Any classical transformation of the universe (e.g. P or T) gives
rise to some function $f: \mathcal{H} \to \mathcal{H}$.
In general $f$ need not be a linear map. However, it turns out if the transformation
is in fact a symmetry of the theory, then Wigner’s theorem forces a lot
of structure on $f$.
Roughly, Wigner’s theorem says the following: if a function $f: \mathcal{H} \to \mathcal{H}$ is
a symmetry, then it is linear and unitary, or anti-linear and anti-unitary.
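Definition (Linear and anti-linear map). Let $\mathcal{H}$ be a Hilbert space. A function
$f: \mathcal{H} \to \mathcal{H}$ is linear if
\[
  f(\alpha\Phi + \beta\Psi) = \alpha f(\Phi) + \beta f(\Psi)
\]
for all $\alpha, \beta \in \mathbb{C}$ and $\Phi, \Psi \in \mathcal{H}$. A map is anti-linear if
\[
  f(\alpha\Phi + \beta\Psi) = \alpha^* f(\Phi) + \beta^* f(\Psi).
\]
Definition (Unitary and anti-unitary map). Let $\mathcal{H}$ be a Hilbert space, and
$f: \mathcal{H} \to \mathcal{H}$ a linear map. Then $f$ is unitary if
\[
  \langle f\Phi, f\Psi \rangle = \langle \Phi, \Psi \rangle
\]
for all $\Phi, \Psi \in \mathcal{H}$.
If $f: \mathcal{H} \to \mathcal{H}$ is anti-linear, then it is anti-unitary if
\[
  \langle f\Phi, f\Psi \rangle = \langle \Phi, \Psi \rangle^*.
\]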
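To make the two definitions concrete, here is a small sketch on $\mathbb{C}^2$. Complex conjugation of the components is used as the standard example of an anti-linear, anti-unitary map, and an arbitrarily chosen unitary matrix as the linear, unitary one. The vectors, coefficients and function names are illustrative, and the inner product is taken to be conjugate-linear in its first argument.

```python
def inner(phi, psi):
    """Inner product <phi, psi>, conjugate-linear in the first slot."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def lin_comb(a, u, b, v):
    return [a * x + b * y for x, y in zip(u, v)]

def conjugation(v):
    """Component-wise complex conjugation: anti-linear and anti-unitary."""
    return [complex(x).conjugate() for x in v]

def unitary(v):
    """Multiplication by the unitary matrix [[0, i], [1, 0]]."""
    return [1j * v[1], v[0]]

phi = [1 + 2j, -1j]
psi = [0.5, 3 - 1j]
alpha, beta = 2 - 1j, 0.5j

# Anti-linearity: K(alpha phi + beta psi) = alpha* K(phi) + beta* K(psi).
lhs = conjugation(lin_comb(alpha, phi, beta, psi))
rhs = lin_comb(alpha.conjugate(), conjugation(phi),
               beta.conjugate(), conjugation(psi))
antilinear_ok = all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs))

# Anti-unitarity: <K phi, K psi> = <phi, psi>*.
antiunitary_ok = abs(inner(conjugation(phi), conjugation(psi))
                     - inner(phi, psi).conjugate()) < 1e-12

# Unitarity: <U phi, U psi> = <phi, psi>.
unitary_ok = abs(inner(unitary(phi), unitary(psi)) - inner(phi, psi)) < 1e-12

print(antilinear_ok, antiunitary_ok, unitary_ok)
```

Both maps preserve $|\langle\Phi, \Psi\rangle|$, which is the only quantity that the physical notion of symmetry is sensitive to.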
But to state the theorem precisely, we need to be a bit more careful. In
quantum mechanics, we consider two (normalized) states to be equivalent if they
differ by a phase. Thus, we need the following extra assumption on our function
$f$:
Assumption. In the previous assumption, we further assume that if $\Phi, \Psi \in \mathcal{H}$
differ by a phase, then $f\Phi$ and $f\Psi$ differ by a phase.
Now what does it mean for something to “preserve physics”? In quantum
mechanics, the physical properties are obtained by taking inner products. So
we want to require f to preserve inner products. But this is not quite what we
want, because states are only defined up to a phase. So we should only care
about inner products defined up to a phase. Note that this in particular implies
f preserves normalization.
Thus, we can state Wigner’s theorem as follows:
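Theorem (Wigner’s theorem). Let $\mathcal{H}$ be a Hilbert space, and $f: \mathcal{H} \to \mathcal{H}$ be a
bijection such that
If $\Phi, \Psi \in \mathcal{H}$ differ by a phase, then $f\Phi$ and $f\Psi$ differ by a phase.
For any $\Phi, \Psi \in \mathcal{H}$, we have
\[
  |\langle f\Phi, f\Psi \rangle| = |\langle \Phi, \Psi \rangle|.
\]
Then there exists a map $W: \mathcal{H} \to \mathcal{H}$ that is either linear and unitary, or
anti-linear and anti-unitary, such that for all $\Phi \in \mathcal{H}$, we have that $W\Phi$ and $f\Phi$
differ by a phase.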
For all practical purposes, this means we can assume the transformations C,
P and T are given by unitary or anti-unitary operators.
We will write $\hat{C}$, $\hat{P}$ and $\hat{T}$ for the (anti-)unitary transformations induced on
the Hilbert space by the C, P, T transformations. We also write $W(\Lambda, a)$ for the
induced transformation from the (not necessarily proper) Poincaré transformation
\[
  x^\mu \mapsto x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu.
\]
We want to understand which are unitary and which are anti-unitary. We will
assume that these $W(\Lambda, a)$ “compose properly”, i.e. that
\[
  W(\Lambda_2, a_2) W(\Lambda_1, a_1) = W(\Lambda_2\Lambda_1, \Lambda_2 a_1 + a_2).
\]
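The right-hand side is just the composition law of the Poincaré transformations themselves, which can be checked on coordinates. The sketch below applies two transformations in turn to a sample point and compares with the single composed transformation; it also shows that conjugating a boost along $x$ by the parity matrix reverses the boost. The rapidity, translations and function names are illustrative assumptions.

```python
import math

PARITY = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def boost_x(rapidity):
    """Lorentz boost along the x-axis with the given rapidity."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def apply(transform, x):
    """Apply (Lambda, a): x -> Lambda x + a."""
    lam, a = transform
    return [y + b for y, b in zip(matvec(lam, x), a)]

def compose(second, first):
    """(Lambda_2, a_2) after (Lambda_1, a_1) = (Lambda_2 Lambda_1, Lambda_2 a_1 + a_2)."""
    lam2, a2 = second
    lam1, a1 = first
    return (matmul(lam2, lam1), [y + b for y, b in zip(matvec(lam2, a1), a2)])

def vec_close(u, v, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(u, v))

def mat_close(a, b, tol=1e-12):
    return all(vec_close(ra, rb, tol) for ra, rb in zip(a, b))

first = (boost_x(0.4), [1.0, 0.0, -2.0, 0.5])
second = (PARITY, [0.0, 1.0, 1.0, -1.0])
x = [1.0, 2.0, -0.5, 3.0]

step_by_step = apply(second, apply(first, x))
all_at_once = apply(compose(second, first), x)
print(vec_close(step_by_step, all_at_once))

# P Lambda P^{-1} for a boost along x is the boost with opposite rapidity (P^{-1} = P).
conjugated = matmul(matmul(PARITY, boost_x(0.4)), PARITY)
print(mat_close(conjugated, boost_x(-0.4)))
```

The conjugation check is the coordinate-level counterpart of the relation $\hat{P} W \hat{P}^{-1} = W(P\Lambda P^{-1}, Pa)$ used in the argument that follows.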
Moreover, we assume that for infinitesimal transformations
\[
  \Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu, \quad a^\mu = \varepsilon^\mu,
\]
where $\omega$ and $\varepsilon$ are small parameters, we can expand
\[
  W = W(\Lambda, a) = W(I + \omega, \varepsilon) = 1 + \frac{i}{2}\omega_{\mu\nu}J^{\mu\nu} + i\varepsilon_\mu P^\mu,
\]
where $J^{\mu\nu}$ are the operators generating rotations and boosts, and $P^\mu$ are the
operators generating translations. In particular, $P^0$ is the Hamiltonian.
Of course, we cannot write parity and time reversal in this form, because
they are discrete symmetries, but we can look at what happens when we combine
these transformations with infinitesimal ones.
By assumption, we have
\[
  \hat{P} = W(P, 0), \quad \hat{T} = W(T, 0).
\]
Then from the composition rule, we expect
\[
  \hat{P} W \hat{P}^{-1} = W(P\Lambda P^{-1}, Pa), \quad
  \hat{T} W \hat{T}^{-1} = W(T\Lambda T^{-1}, Ta).
\]
Inserting expansions for $W$ in terms of $\omega$ and $\varepsilon$ on both sides, and comparing
coefficients of $\varepsilon_0$, we find
\[
  \hat{P}\, iH\, \hat{P}^{-1} = iH, \quad
  \hat{T}\, iH\, \hat{T}^{-1} = -iH.
\]
So $iH$ and $\hat{P}$ commute, but $iH$ and $\hat{T}$ anti-commute.
To proceed further, we need to make the following assumption, which is a
natural one to make if we believe P and T are symmetries.
Assumption. The transformations $\hat{P}$ and $\hat{T}$ send an energy eigenstate of energy
$E$ to an energy eigenstate of energy $E$.
From this, it is easy to figure out whether the maps $\hat{P}$ and $\hat{T}$ should be
unitary or anti-unitary.
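Indeed, consider any normalized energy eigenstate $\Psi$ with energy $E \neq 0$.
Then by definition, we have
\[
  \langle \Psi, iH\Psi \rangle = iE.
\]
Then since $\hat{P}\Psi$ is also an energy eigenstate of energy $E$, we know
\[
  iE = \langle \hat{P}\Psi, iH\hat{P}\Psi \rangle = \langle \hat{P}\Psi, \hat{P}\, iH\Psi \rangle.
\]
In other words, we have
\[
  \langle \hat{P}\Psi, \hat{P}\, iH\Psi \rangle = \langle \Psi, iH\Psi \rangle = iE.
\]
So $\hat{P}$ must be unitary.
On the other hand, we have
\[
  iE = \langle \hat{T}\Psi, iH\hat{T}\Psi \rangle = -\langle \hat{T}\Psi, \hat{T}\, iH\Psi \rangle.
\]
In other words, we obtain
\[
  \langle \hat{T}\Psi, \hat{T}\, iH\Psi \rangle = -\langle \Psi, iH\Psi \rangle = \langle \Psi, iH\Psi \rangle^* = -iE.
\]
Thus, it follows that $\hat{T}$ must be anti-unitary. It is a fact that $\hat{C}$ is linear and
unitary.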
Note that these derivations rely crucially on the fact that we know the
operators must either be unitary or anti-unitary, and this allows us to just check
one inner product to determine which is the case, rather than all.
So far, we have been discussing how these operators act on the elements of
the state space. However, we are ultimately interested in understanding how
the fields transform under these transformations. From linear algebra, we know
that an endomorphism $W: \mathcal{H} \to \mathcal{H}$ on a vector space canonically induces a
transformation on the space of operators, by sending $\phi$ to $W\phi W^{-1}$. Thus, what
we want to figure out is how $W\phi W^{-1}$ relates to $\phi$.
3.2 Parity
Our objective is to figure out how
\[
  \hat{P} = W(P, 0)
\]
acts on our different quantum fields. For convenience, we will write
\[
  x^\mu \mapsto x_P^\mu = (x^0, -\mathbf{x}), \quad
  p^\mu \mapsto p_P^\mu = (p^0, -\mathbf{p}).
\]
As before, we will need to make some assumptions about how $\hat{P}$ behaves.
Suppose our field has creation operator $a^\dagger(p)$. Then we might expect
\[
  |p\rangle \mapsto \eta_a^* |p_P\rangle
\]
for some complex phase $\eta_a^*$. We can alternatively write this as
\[
  \hat{P} a^\dagger(p) |0\rangle = \eta_a^* a^\dagger(p_P) |0\rangle.
\]
We assume the vacuum is parity-invariant, i.e. $\hat{P}|0\rangle = |0\rangle$. So we can write
this as
\[
  \hat{P} a^\dagger(p) \hat{P}^{-1} |0\rangle = \eta_a^* a^\dagger(p_P) |0\rangle.
\]
Thus, it is natural to make the following assumption:
Assumption. Let $a^\dagger(p)$ be the creation operator of any field. Then $a^\dagger(p)$
transforms as
\[
  \hat{P} a^\dagger(p) \hat{P}^{-1} = \eta^* a^\dagger(p_P)
\]
for some $\eta^*$. Taking conjugates, since $\hat{P}$ is unitary, this implies
\[
  \hat{P} a(p) \hat{P}^{-1} = \eta a(p_P).
\]
We will also need to assume the following:
Assumption. Let $\phi$ be any field. Then $\hat{P}\phi(x)\hat{P}^{-1}$ is a multiple of $\phi(x_P)$, where
“a multiple” can be multiplication by a linear map in the case where $\phi$ has more
than 1 component.
Scalar fields
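Consider a complex scalar field
\[
  \phi(x) = \sum_p \left[ a(p) e^{-ip\cdot x} + c(p)^\dagger e^{+ip\cdot x} \right],
\]
where $a(p)$ is an annihilation operator for a particle and $c(p)^\dagger$ is a creation
operator for the anti-particles. Then we have
\[
\begin{aligned}
  \hat{P}\phi(x)\hat{P}^{-1}
  &= \sum_p \left[ \hat{P} a(p) \hat{P}^{-1} e^{-ip\cdot x} + \hat{P} c^\dagger(p) \hat{P}^{-1} e^{+ip\cdot x} \right] \\
  &= \sum_p \left[ \eta_a a(p_P) e^{-ip\cdot x} + \eta_c^* c^\dagger(p_P) e^{+ip\cdot x} \right].
\end{aligned}
\]
Since we are integrating over all $p$, we can relabel $p_P \leftrightarrow p$, and then get
\[
  \hat{P}\phi(x)\hat{P}^{-1} = \sum_p \left[ \eta_a a(p) e^{-ip_P\cdot x} + \eta_c^* c^\dagger(p) e^{+ip_P\cdot x} \right].
\]
We now note that $x\cdot p_P = x_P\cdot p$ by inspection. So we have
\[
  \hat{P}\phi(x)\hat{P}^{-1} = \sum_p \left[ \eta_a a(p) e^{-ip\cdot x_P} + \eta_c^* c^\dagger(p) e^{+ip\cdot x_P} \right].
\]
By assumption, this is proportional to $\phi(x_P)$. So we must have $\eta_a = \eta_c^* \equiv \eta_P$.
Then we just get
\[
  \hat{P}\phi(x)\hat{P}^{-1} = \eta_P \phi(x_P).
\]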
Definition (Intrinsic parity). The intrinsic parity of a field $\phi$ is the number
$\eta_P \in \mathbb{C}$ such that
\[
  \hat{P}\phi(x)\hat{P}^{-1} = \eta_P \phi(x_P).
\]
For real scalar fields, we have $a = c$, and so $\eta_a = \eta_c$, and so $\eta_a = \eta_P = \eta_P^*$.
So $\eta_P = \pm 1$.
Definition (Scalar and pseudoscalar fields). A real scalar field is called a
scalar field (confusingly) if the intrinsic parity is $+1$. Otherwise, it is called a
pseudoscalar field.
Note that under our assumptions, we have $\hat{P}^2 = I$ by the composition rule
of the $W(\Lambda, a)$. Hence, it follows that we always have $\eta_P = \pm 1$. However, in
more sophisticated treatments of the theory, the composition rule need not hold.
The above analysis still holds, but for a complex scalar field, we need not have
$\eta_P = \pm 1$.
Vector fields
Similarly, if we have a vector field $V^\mu(x)$, then we can write
\[ V^\mu(x) = \sum_{p,\lambda} E^\mu(\lambda, p) a^\lambda(p) e^{-ip\cdot x} + E^{\mu *}(\lambda, p) c^{\dagger\lambda}(p) e^{+ip\cdot x}, \]
where $E^\mu(\lambda, p)$ are some polarization vectors.
Using similar computations, we find
\[ \hat{P} V^\mu(x) \hat{P}^{-1} = \sum_{p,\lambda} E^\mu(\lambda, p_P) a^\lambda(p) e^{-ip\cdot x_P} \eta_a + E^{\mu *}(\lambda, p_P) c^{\dagger\lambda}(p) e^{+ip\cdot x_P} \eta_c^*. \]
This time, we have to deal with the $E^\mu(\lambda, p_P)$ term. Using explicit expressions for $E^\mu$, we have
\[ E^\mu(\lambda, p_P) = -P^\mu{}_\nu E^\nu(\lambda, p). \]
So we find that
\[ \hat{P} V^\mu(x) \hat{P}^{-1} = -\eta_P P^\mu{}_\nu V^\nu(x_P), \]
where for the same reasons as before, we have $\eta_P = \eta_a = \eta_c^*$.
Definition (Vector and axial vector fields). Vector fields are vector fields with $\eta_P = -1$. Otherwise, they are axial vector fields.
Dirac fields
We finally move on to the case of Dirac fields, which is the most complicated.
Fortunately, it is still not too bad.
As before, we obtain
\[ \hat{P}\psi(x)\hat{P}^{-1} = \sum_{p,s} \eta_b b^s(p) u^s(p_P) e^{-ip\cdot x_P} + \eta_d^* d^{s\dagger}(p) v^s(p_P) e^{+ip\cdot x_P}. \]
We use that
\[ u^s(p_P) = \gamma^0 u^s(p), \quad v^s(p_P) = -\gamma^0 v^s(p), \]
which we can verify using Lorentz boosts. Then we find
\[ \hat{P}\psi(x)\hat{P}^{-1} = \gamma^0 \sum_{p,s} \eta_b b^s(p) u^s(p) e^{-ip\cdot x_P} - \eta_d^* d^{s\dagger}(p) v^s(p) e^{+ip\cdot x_P}. \]
So again, we require that
\[ \eta_b = -\eta_d^* \equiv \eta_P. \]
We can see this minus sign as saying particles and anti-particles have opposite intrinsic parity.
Unexcitingly, we end up with
\[ \hat{P}\psi(x)\hat{P}^{-1} = \eta_P \gamma^0 \psi(x_P). \]
Similarly, we have
\[ \hat{P}\bar\psi(x)\hat{P}^{-1} = \eta_P^* \bar\psi(x_P)\gamma^0. \]
Since $\gamma^0$ anti-commutes with $\gamma^5$, it follows that we have
\[ \hat{P}\psi_L(x)\hat{P}^{-1} = \eta_P \gamma^0 \psi_R(x_P). \]
So the parity operator exchanges left-handed and right-handed fermions.
Fermion bilinears
We can now determine how various fermion bilinears transform. For example, we have
\[ \bar\psi(x)\psi(x) \mapsto \bar\psi(x_P)\psi(x_P). \]
So it transforms as a scalar. On the other hand, we have
\[ \bar\psi(x)\gamma^5\psi(x) \mapsto -\bar\psi(x_P)\gamma^5\psi(x_P), \]
and so this transforms as a pseudoscalar. We also have
\[ \bar\psi(x)\gamma^\mu\psi(x) \mapsto P^\mu{}_\nu \bar\psi(x_P)\gamma^\nu\psi(x_P), \]
and so this transforms as a vector. Finally, we have
\[ \bar\psi(x)\gamma^5\gamma^\mu\psi(x) \mapsto -P^\mu{}_\nu \bar\psi(x_P)\gamma^5\gamma^\nu\psi(x_P). \]
So this transforms as an axial vector.
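As a check on the signs, the pseudoscalar and vector cases can be worked out explicitly. This is a sketch that assumes only the rules $\hat{P}\psi(x)\hat{P}^{-1} = \eta_P\gamma^0\psi(x_P)$ and $\hat{P}\bar\psi(x)\hat{P}^{-1} = \eta_P^*\bar\psi(x_P)\gamma^0$, together with $|\eta_P|^2 = 1$, $(\gamma^0)^2 = 1$ and the anti-commutation relations of the gamma matrices.

```latex
\begin{align*}
\hat{P}\,\bar\psi(x)\gamma^5\psi(x)\,\hat{P}^{-1}
  &= |\eta_P|^2\, \bar\psi(x_P)\,\gamma^0\gamma^5\gamma^0\,\psi(x_P)
   = -\bar\psi(x_P)\gamma^5\psi(x_P),\\
\hat{P}\,\bar\psi(x)\gamma^\mu\psi(x)\,\hat{P}^{-1}
  &= |\eta_P|^2\, \bar\psi(x_P)\,\gamma^0\gamma^\mu\gamma^0\,\psi(x_P)
   = P^\mu{}_\nu\, \bar\psi(x_P)\gamma^\nu\psi(x_P),
\end{align*}
% using \gamma^0\gamma^5\gamma^0 = -\gamma^5, \gamma^0\gamma^0\gamma^0 = \gamma^0
% and \gamma^0\gamma^i\gamma^0 = -\gamma^i.
```

The axial vector combines both effects, which is where its extra minus sign relative to the vector comes from.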
3.3 Charge conjugation
We will do similar manipulations, and figure out how different fields transform
under charge conjugation. Unlike parity, this is not a spacetime symmetry. It
transforms particles to anti-particles, and vice versa. This is a unitary operator $\hat{C}$, and we make the following assumption:
Assumption. If $a$ is an annihilation operator of a particle, and $c$ is the annihilation operator of the anti-particle, then we have
\[ \hat{C} a(p) \hat{C}^{-1} = \eta c(p) \]
for some $\eta$.
As before, this is motivated by the requirement that $\hat{C}|p\rangle = \eta^* |\bar{p}\rangle$. We will also assume the following:
Assumption. Let $\phi$ be any field. Then $\hat{C}\phi(x)\hat{C}^{-1}$ is a multiple of the conjugate of $\phi$, where “a multiple” can be multiplication by a linear map in the case where $\phi$ has more than 1 component. Here the interpretation of “the conjugate” depends on the kind of field:
• If $\phi$ is a bosonic field, then the conjugate of $\phi$ is $\phi^* = (\phi^\dagger)^T$. Of course, if this is a scalar field, then the conjugate is just $\phi^\dagger$.
• If $\phi$ is a spinor field, then the conjugate of $\phi$ is $\bar\phi^T$.
Scalar and vector fields
Scalar and vector fields behave in a manner very similar to that of parity operators. We simply have
\[ \hat{C}\phi\hat{C}^{-1} = \eta_c \phi^\dagger, \quad \hat{C}\phi^\dagger\hat{C}^{-1} = \eta_c^* \phi \]
for some $\eta_c$. In the case of a real field, we have $\phi^\dagger = \phi$. So we must have $\eta_c = \pm 1$, which is known as the intrinsic c-parity of the field. For complex fields, we can introduce a global phase change of the field so as to set $\eta_c = 1$.
This has some physical significance. For example, the photon field transforms like
\[ \hat{C} A_\mu(x) \hat{C}^{-1} = -A_\mu(x). \]
Experimentally, we see that $\pi^0$ only decays to 2 photons, but not 1 or 3. Therefore, assuming that c-parity is conserved, we infer that
\[ \eta_c^{\pi^0} = (-1)^2 = +1. \]
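The counting behind this inference can be made explicit. The sketch below assumes that each photon contributes a factor $-1$ (from $\hat{C}A_\mu\hat{C}^{-1} = -A_\mu$) and that the vacuum is invariant under $\hat{C}$.

```latex
\begin{align*}
\hat{C}\,|n\gamma\rangle &= (-1)^n\,|n\gamma\rangle,\\
\eta_c^{\pi^0} = +1 \;&\Longrightarrow\; \pi^0 \to 2\gamma \text{ allowed}, \qquad
\pi^0 \to 3\gamma \text{ forbidden, since } (-1)^3 = -1 \neq \eta_c^{\pi^0}.
\end{align*}
```

So the absence of the three-photon mode is what one expects if c-parity is conserved in the decay.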
Dirac fields
Dirac fields are more complicated. As before, we can compute
\[ \hat{C}\psi(x)\hat{C}^{-1} = \eta_c \sum_{p,s} d^s(p) u^s(p) e^{-ip\cdot x} + b^{s\dagger}(p) v^s(p) e^{+ip\cdot x}. \]
We compare this with
\[ \bar\psi^T(x) = \sum_{p,s} b^{s\dagger}(p) \bar{u}^{sT}(p) e^{+ip\cdot x} + d^s(p) \bar{v}^{sT}(p) e^{-ip\cdot x}. \]
We thus see that we have to relate $\bar{u}^{sT}$ and $v^s$ somehow; and $\bar{v}^{sT}$ with $u^s$.
Recall that we constructed $u^s$ and $v^s$ in terms of elements $\xi^s, \eta^s \in \mathbb{R}^2$. At that time, we did not specify what $\xi$ and $\eta$ are, or how they are related, so we cannot expect $u$ and $v$ to have any relation at all. So we need to make a specific choice of $\xi$ and $\eta$. We will choose them so that
\[ \eta^s = i\sigma^2 \xi^{s*}. \]
Now consider the matrix
\[ C = -i\gamma^0\gamma^2 = \begin{pmatrix} i\sigma^2 & 0 \\ 0 & -i\sigma^2 \end{pmatrix}. \]
The reason we care about this is the following:
Proposition.
\[ v^s(p) = C\bar{u}^{sT}(p), \quad u^s(p) = C\bar{v}^{sT}(p). \]
From this, we infer that
Proposition.
\[ \psi^c(x) \equiv \hat{C}\psi(x)\hat{C}^{-1} = \eta_c C\bar\psi^T(x), \]
\[ \bar\psi^c(x) \equiv \hat{C}\bar\psi(x)\hat{C}^{-1} = \eta_c^* \psi^T(x) C = -\eta_c^* \psi^T(x) C^{-1}. \]
It is convenient to note the following additional properties of the matrix $C$:
Proposition.
\[ (C\gamma^\mu)^T = C\gamma^\mu, \quad C = -C^T = -C^\dagger = -C^{-1}, \]
\[ (\gamma^\mu)^T = -C\gamma^\mu C^{-1}, \quad (\gamma^5)^T = +C\gamma^5 C^{-1}. \]
Note that if $\psi(x)$ satisfies the Dirac equation, then so does $\psi^c(x)$.
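Two of these properties follow in a line from the Clifford algebra alone. The sketch below assumes $C = -i\gamma^0\gamma^2$, $(\gamma^0)^2 = 1$, $(\gamma^2)^2 = -1$, $\{\gamma^0, \gamma^2\} = 0$, and that $\gamma^0$ is Hermitian while $\gamma^2$ is anti-Hermitian.

```latex
\begin{align*}
C^2 &= (-i)^2\,\gamma^0\gamma^2\gamma^0\gamma^2
     = +\gamma^0\gamma^0\gamma^2\gamma^2 = -1
     \;\Longrightarrow\; C^{-1} = -C,\\
C^\dagger &= i\,(\gamma^2)^\dagger(\gamma^0)^\dagger
     = -i\,\gamma^2\gamma^0 = i\,\gamma^0\gamma^2 = -C.
\end{align*}
```

The remaining identities need the explicit transposes of the gamma matrices in a chosen representation.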
Apart from Dirac fermions, there are also things called Majorana fermions. They have $b^s(p) = d^s(p)$. This means that the particle is its own antiparticle, and for these, we have
\[ \psi^c(x) = \psi(x). \]
These fermions have to be neutral. A natural question to ask is whether these exist. The only type of neutral fermions in the standard model is neutrinos, and we don’t really know what neutrinos are. In particular, it is possible that they are Majorana fermions. Experimentally, if they are indeed Majorana fermions, then we would be able to observe neutrino-less double $\beta$ decay, but current experiments have neither observed it nor ruled it out.
Fermion bilinears
We can look at how fermion bilinears change. For example,
\[ j^\mu = \bar\psi(x)\gamma^\mu\psi(x). \]
Then we have
\[ \hat{C} j^\mu(x) \hat{C}^{-1} = \hat{C}\bar\psi\hat{C}^{-1} \gamma^\mu \hat{C}\psi\hat{C}^{-1} = -\eta_c^* \psi^T C^{-1}\gamma^\mu C \bar\psi^T \eta_c. \]
We now notice that the $\eta_c^*$ and $\eta_c$ cancel each other. Also, since this is a scalar, we can take the transpose of the whole thing, but we will pick up a minus sign because fermions anti-commute:
\[ = \bar\psi (C^{-1}\gamma^\mu C)^T \psi = \bar\psi C^T \gamma^{\mu T} (C^{-1})^T \psi = -\bar\psi\gamma^\mu\psi = -j^\mu(x). \]
Therefore $A_\mu(x) j^\mu(x)$ is invariant under $\hat{C}$. Similarly,
\[ \bar\psi\gamma^\mu\gamma^5\psi \mapsto +\bar\psi\gamma^\mu\gamma^5\psi. \]
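The sign for the axial current follows from the same steps as for $j^\mu$. The sketch below assumes the properties $C^{-1}\gamma^\mu C = -(\gamma^\mu)^T$ and $C^{-1}\gamma^5 C = (\gamma^5)^T$, which are equivalent to the ones listed for $C$ once $C^{-1} = -C$ is used, and that $\gamma^5$ anti-commutes with $\gamma^\mu$.

```latex
\begin{align*}
\hat{C}\,\bar\psi\gamma^\mu\gamma^5\psi\,\hat{C}^{-1}
  &= \bar\psi\,\big(C^{-1}\gamma^\mu\gamma^5 C\big)^T\psi
   = \bar\psi\,\big[(C^{-1}\gamma^\mu C)(C^{-1}\gamma^5 C)\big]^T\psi\\
  &= \bar\psi\,\big[-(\gamma^\mu)^T(\gamma^5)^T\big]^T\psi
   = -\bar\psi\,\gamma^5\gamma^\mu\,\psi
   = +\bar\psi\,\gamma^\mu\gamma^5\,\psi.
\end{align*}
```

The extra $\gamma^5$ contributes no sign of its own under conjugation by $C$, but reordering it past $\gamma^\mu$ flips the overall sign relative to the vector current.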
3.4 Time reversal
Finally, we get to time reversal. This is more messy. Under T, we have
\[ x^\mu = (x^0, x^i) \mapsto x_T^\mu = (-x^0, x^i). \]
Momentum transforms in the opposite way, with
\[ p^\mu = (p^0, p^i) \mapsto p_T^\mu = (p^0, -p^i). \]
Theories that are invariant under $\hat{T}$ look the same when we run them backwards.
For example, Newton’s laws are time reversal invariant. We will also see that
the electromagnetic and strong interactions are time reversal invariant, but the
weak interaction is not.
Here our theory begins to get more messy.
Assumption. For any field $\phi$, we have
\[ \hat{T}\phi(x)\hat{T}^{-1} \propto \phi(x_T). \]
Assumption. For bosonic fields with creator $a^\dagger$, we have
\[ \hat{T} a(p) \hat{T}^{-1} = \eta a(p_T) \]
for some $\eta$. For Dirac fields with creator $b^{s\dagger}$, we have
\[ \hat{T} b^s(p) \hat{T}^{-1} = \eta (-1)^{\frac{1}{2} - s} b^{-s}(p_T) \]
for some $\eta$.
Why that complicated rule for Dirac fields? We first justify why we want to swap spins when we perform time reversal. This is just the observation that when we reverse time, we spin in the opposite direction. The factor of $(-1)^{\frac{1}{2} - s}$ tells us spinors of different spins transform differently.
Boson field
Again, bosonic fields are easy. The creation and annihilation operators transform as
\[ \hat{T} a(p) \hat{T}^{-1} = \eta_T a(p_T), \quad \hat{T} c^\dagger(p) \hat{T}^{-1} = \eta_T c^\dagger(p_T). \]
Note that the relative phases are fixed using the same argument as for $\hat{P}$ and $\hat{C}$. When we do the derivations, it is very important to keep in mind that $\hat{T}$ is anti-linear.
Then we have
\[ \hat{T}\phi(x)\hat{T}^{-1} = \eta_T \phi(x_T). \]
Dirac fields
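As mentioned, for Dirac fields, the annihilation and creation operators transform as
\[ \hat{T} b^s(p) \hat{T}^{-1} = \eta_T (-1)^{\frac{1}{2} - s} b^{-s}(p_T), \]
\[ \hat{T} d^{s\dagger}(p) \hat{T}^{-1} = \eta_T (-1)^{\frac{1}{2} - s} d^{-s\dagger}(p_T). \]
It can be shown that
\[ (-1)^{\frac{1}{2} - s} u^{-s*}(p_T) = -B u^s(p), \quad (-1)^{\frac{1}{2} - s} v^{-s*}(p_T) = -B v^s(p), \]
where
\[ B = \gamma^5 C = \begin{pmatrix} -i\sigma^2 & 0 \\ 0 & -i\sigma^2 \end{pmatrix} \]
in the chiral representation.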
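Then, we have
\[ \hat{T}\psi(x)\hat{T}^{-1} = \eta_T \sum_{p,s} (-1)^{\frac{1}{2} - s} \left[ b^{-s}(p_T) u^{s*}(p) e^{+ip\cdot x} + d^{-s\dagger}(p_T) v^{s*}(p) e^{-ip\cdot x} \right]. \]
Doing the standard manipulations, we find that
\[ \hat{T}\psi(x)\hat{T}^{-1} = \eta_T \sum_{p,s} (-1)^{\frac{1}{2} - s + 1} \left[ b^s(p) u^{-s*}(p_T) e^{-ip\cdot x_T} + d^{s\dagger}(p) v^{-s*}(p_T) e^{+ip\cdot x_T} \right] \]
\[ = +\eta_T \sum_{p,s} b^s(p) B u^s(p) e^{-ip\cdot x_T} + d^{s\dagger}(p) B v^s(p) e^{+ip\cdot x_T} \]
\[ = \eta_T B \psi(x_T). \]
Similarly,
\[ \hat{T}\bar\psi(x)\hat{T}^{-1} = \eta_T^* \bar\psi(x_T) B^{-1}. \]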
Fermion bilinears
Similarly, we have
\[ \bar\psi(x)\psi(x) \mapsto \bar\psi(x_T)\psi(x_T), \]
\[ \bar\psi(x)\gamma^\mu\psi(x) \mapsto -T^\mu{}_\nu \bar\psi(x_T)\gamma^\nu\psi(x_T). \]
This uses the fact that
\[ B^{-1}\gamma^{\mu *}B = -T^\mu{}_\nu \gamma^\nu. \]
So we see that in the second case, the 0 component, i.e. charge density, is unchanged, while the spatial component, i.e. current density, gets a negative sign. This makes physical sense.
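Written out in components, the statement about the current reads as follows. This is a sketch assuming $T^\mu{}_\nu = \mathrm{diag}(-1, +1, +1, +1)$, i.e. the matrix that sends $x^\mu$ to $x_T^\mu$, and writing $j^\mu = \bar\psi\gamma^\mu\psi$.

```latex
\begin{align*}
\hat{T}\, j^0(x)\, \hat{T}^{-1} &= -T^0{}_0\, j^0(x_T) = +j^0(x_T),\\
\hat{T}\, j^i(x)\, \hat{T}^{-1} &= -T^i{}_i\, j^i(x_T) = -j^i(x_T)
  \quad (\text{no sum over } i).
\end{align*}
```

Charges stay where they are, while their velocities, and hence the currents, reverse when time runs backwards.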
3.5 S-matrix
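We consider how S-matrices transform. Recall that $S$ is defined by
\[ \langle p_1, p_2, \cdots | S | k_A, k_B, \cdots \rangle = {}_{\mathrm{out}}\langle p_1, p_2, \cdots | k_A, k_B, \cdots \rangle_{\mathrm{in}} = \lim_{T\to\infty} \langle p_1, p_2, \cdots | e^{-iH \cdot 2T} | k_A, k_B, \cdots \rangle. \]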
We can write
\[ S = T\exp\left( -i\int_{-\infty}^{\infty} V(t)\, dt \right), \]
where $T$ denotes the time-ordered integral, and
\[ V(t) = -\int d^3x\, \mathcal{L}_I(x), \]
and $\mathcal{L}_I(x)$ is the interaction part of the Lagrangian.
Example. In QED, we have
\[ \mathcal{L}_I = -e\bar\psi(x)\gamma^\mu A_\mu(x)\psi(x). \]
We can draw a table of how things transform:

                      P                      C                    T
  $\mathcal{L}_I(x)$  $\mathcal{L}_I(x_P)$   $\mathcal{L}_I(x)$   $\mathcal{L}_I(x_T)$
  $V(t)$              $V(t)$                 $V(t)$               $V(-t)$
  $S$                 $S$                    $S$                  ??
A bit more care is needed to figure out how the $S$ matrix transforms when we reverse time, as we have a time-ordering in the integral. To figure this out, we explicitly write out the time-ordered integral. We have
\[ S = \sum_{n=0}^\infty (-i)^n \int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{t_1} dt_2 \cdots \int_{-\infty}^{t_{n-1}} dt_n\, V(t_1)V(t_2)\cdots V(t_n). \]
Then we have
\[ S_T = \hat{T} S \hat{T}^{-1} = \sum_n (+i)^n \int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{t_1} dt_2 \cdots \int_{-\infty}^{t_{n-1}} dt_n\, V(-t_1)V(-t_2)\cdots V(-t_n). \]
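We now put $\tau_i = -t_{n+1-i}$, and then we can write this as
\[ = \sum_n (+i)^n \int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{-\tau_n} dt_2 \cdots \int_{-\infty}^{-\tau_2} dt_n\, V(\tau_n)V(\tau_{n-1})\cdots V(\tau_1) \]
\[ = \sum_n (+i)^n \int_{-\infty}^{\infty} d\tau_n \int_{\tau_n}^{\infty} d\tau_{n-1} \cdots \int_{\tau_2}^{\infty} d\tau_1\, V(\tau_n)V(\tau_{n-1})\cdots V(\tau_1). \]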
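We now notice that
\[ \int_{-\infty}^{\infty} d\tau_n \int_{\tau_n}^{\infty} d\tau_{n-1} = \int_{-\infty}^{\infty} d\tau_{n-1} \int_{-\infty}^{\tau_{n-1}} d\tau_n. \]
We can see this by drawing the picture
[Figure: the $(\tau_n, \tau_{n-1})$ plane with the line $\tau_{n-1} = \tau_n$.]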
So we find that
\[ S_T = \sum_n (+i)^n \int_{-\infty}^{\infty} d\tau_1 \int_{-\infty}^{\tau_1} d\tau_2 \cdots \int_{-\infty}^{\tau_{n-1}} d\tau_n\, V(\tau_n)V(\tau_{n-1})\cdots V(\tau_1). \]
We can then see that this is equal to $S^\dagger$.
What does this tell us? Consider states $|\eta\rangle$ and $|\xi\rangle$ with
\[ |\eta_T\rangle = \hat{T}|\eta\rangle, \quad |\xi_T\rangle = \hat{T}|\xi\rangle. \]
The Dirac bra-ket notation isn’t very useful when we have anti-linear operators. So we will write inner products explicitly. We have
\[ (\eta_T, S\xi_T) = (\hat{T}\eta, S\hat{T}\xi) = (\hat{T}\eta, S_T^\dagger \hat{T}\xi) = (\hat{T}\eta, \hat{T}S^\dagger\xi) = (\eta, S^\dagger\xi)^* = (\xi, S\eta), \]
where we used the fact that $\hat{T}$ is anti-unitary. So the conclusion is
\[ \langle\eta_T| S |\xi_T\rangle = \langle\xi| S |\eta\rangle. \]
So if
\[ \hat{T}\mathcal{L}_I(x)\hat{T}^{-1} = \mathcal{L}_I(x_T), \]
then the $S$-matrix elements are equal for time-reversed processes, where the initial and final states are swapped.
3.6 CPT theorem
Theorem (CPT theorem). Any Lorentz invariant Lagrangian with a Hermitian Hamiltonian should be invariant under the product of P, C and T.
We will not prove this. For details, one can read Streater and Wightman,
PCT, spin and statistics and all that (1989).
All observations we make suggest that CPT is respected in nature. This
means a particle propagating forwards in time cannot be distinguished from the
antiparticle propagating backwards in time.
3.7 Baryogenesis
In the universe, we observe that there is much more matter than anti-matter. Baryogenesis is the generation of this asymmetry in the universe. Sakharov came up with three conditions that are necessary for this to happen.
(i) Baryon number violation (or leptogenesis, i.e. lepton number asymmetry), i.e. some process $X \to Y + B$ that generates an excess baryon.
(ii) Non-equilibrium. Otherwise, the rate of $Y + B \to X$ is the same as the rate of $X \to Y + B$.
(iii) C and CP violation. We need C violation, or else the rate of $X \to Y + B$ and the rate of $\bar{X} \to \bar{Y} + \bar{B}$ would be the same, and the effects cancel each other out.
We similarly need CP violation, or else the combined rate of $X \to nq_L$ and $X \to nq_R$, where $q_{L,R}$ are left and right handed quarks, is equal to the combined rate of $\bar{X} \to n\bar{q}_L$ and $\bar{X} \to n\bar{q}_R$, and this will wash out our excess.
4 Spontaneous symmetry breaking
In this chapter, we are going to look at spontaneous symmetry breaking. The
general setting is as follows: our Lagrangian $\mathcal{L}$ enjoys some symmetry. However,
the potential has multiple minima. Usually, in perturbation theory, we imagine
we are sitting near an equilibrium point of the system (the “vacuum”), and then
look at what happens to small perturbations near the equilibrium.
When there are multiple minima, we have to arbitrarily pick a minimum
to be our vacuum, and then do perturbation around it. In many cases, this
choice of minimum is not invariant under our symmetries. Thus, even though
the theory itself is symmetric, the symmetry is lost once we pick a vacuum. It
turns out interesting things happen when this happens.
4.1 Discrete symmetry
We begin with a toy example, namely that of a discrete symmetry. Consider a real scalar field $\phi(x)$ with a symmetric potential $V(\phi)$, so that
\[ V(-\phi) = V(\phi). \]
This gives a discrete $\mathbb{Z}/2\mathbb{Z}$ symmetry $\phi \leftrightarrow -\phi$.
We will consider the case of a $\phi^4$ theory, with Lagrangian
\[ \mathcal{L} = \frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \left( \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4}\phi^4 \right) \]
for some $\lambda$.
We want the potential to $\to \infty$ as $\phi \to \infty$, so we necessarily require $\lambda > 0$. However, since the $\phi^4$ term dominates for large $\phi$, we are now free to pick the sign of $m^2$, and still get a sensible theory.
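Usually, this theory has $m^2 > 0$, and thus $V(\phi)$ has a minimum at $\phi = 0$:
[Figure: plot of $V(\phi)$ against $\phi$.]
However, we could imagine a scenario where $m^2 < 0$, where we have “imaginary mass”. In this case, the potential looks like
[Figure: plot of $V(\phi)$ against $\phi$, with the point $\phi = v$ marked.]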
To understand this potential better, we complete the square, and write it as
\[ V(\phi) = \frac{\lambda}{4}(\phi^2 - v^2)^2 + \text{constant}, \]
where
\[ v = \sqrt{-\frac{m^2}{\lambda}}. \]
We see that now $\phi = 0$ becomes a local maximum, and there are two (global) minima at $\phi = \pm v$. In particular, $\phi$ has acquired a non-zero vacuum expectation value (VEV).
We shall wlog consider small excitations around $\phi = v$. Let’s write
\[ \phi(x) = v + f(x). \]
Then we can write the Lagrangian as
\[ \mathcal{L} = \frac{1}{2}\partial_\mu f\,\partial^\mu f - \lambda\left( v^2 f^2 + v f^3 + \frac{1}{4}f^4 \right), \]
plus some constants. Therefore $f$ is a scalar field with mass
\[ m_f^2 = 2v^2\lambda. \]
This $\mathcal{L}$ is not invariant under $f \to -f$. The symmetry of the original Lagrangian has been broken by the VEV of $\phi$.
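The expansion behind this Lagrangian is short enough to write out. The sketch below starts from the completed-square form $V(\phi) = \frac{\lambda}{4}(\phi^2 - v^2)^2$ with $v^2 = -m^2/\lambda$, substitutes $\phi = v + f$, and drops the constant.

```latex
\begin{align*}
V(v + f) &= \frac{\lambda}{4}\big((v+f)^2 - v^2\big)^2
          = \frac{\lambda}{4}\big(2vf + f^2\big)^2
          = \lambda\Big(v^2 f^2 + v f^3 + \tfrac{1}{4} f^4\Big),\\
\tfrac{1}{2} m_f^2 f^2 &= \lambda v^2 f^2
  \;\Longrightarrow\; m_f^2 = 2\lambda v^2 = -2m^2 > 0.
\end{align*}
```

The cubic term $\lambda v f^3$ is odd in $f$, which is exactly what spoils the $f \to -f$ symmetry.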
4.2 Continuous symmetry
We can consider a slight generalization of the above scenario. Consider an $N$-component real scalar field $\phi = (\phi_1, \phi_2, \cdots, \phi_N)^T$, with Lagrangian
\[ \mathcal{L} = \frac{1}{2}(\partial^\mu\phi)\cdot(\partial_\mu\phi) - V(\phi), \]
where
\[ V(\phi) = \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4}\phi^4. \]
As before, we will require that λ > 0.
This is a theory that is invariant under global $O(N)$ transforms of $\phi$. Again, if $m^2 > 0$, then $\phi = 0$ is a global minimum, and we do not have spontaneous symmetry breaking. Thus, we consider the case $m^2 < 0$.
In this case, we can write
\[ V(\phi) = \frac{\lambda}{4}(\phi^2 - v^2)^2 + \text{constant}, \]
where
\[ v^2 = -\frac{m^2}{\lambda} > 0. \]
So any $\phi$ with $\phi^2 = v^2$ gives a global minimum of the system, and so we have a continuum of vacua. We call this a Sombrero potential (or wine bottle potential).
[Figure: plot of $V(\phi)$ over the $(\phi_1, \phi_2)$ plane.]
Without loss of generality, we may choose the vacuum to be
\[ \phi_0 = (0, 0, \cdots, 0, v)^T. \]
We can then consider small fluctuations about this:
\[ \phi(x) = (\pi_1(x), \pi_2(x), \cdots, \pi_{N-1}(x), v + \sigma(x))^T. \]
We now have a $\pi$ field with $N - 1$ components, plus a 1-component $\sigma$ field.
We rewrite the Lagrangian in terms of the $\pi$ and $\sigma$ fields as
\[ \mathcal{L} = \frac{1}{2}(\partial^\mu\pi)\cdot(\partial_\mu\pi) + \frac{1}{2}(\partial^\mu\sigma)(\partial_\mu\sigma) - V(\pi, \sigma), \]
where
\[ V(\pi, \sigma) = \frac{1}{2}m_\sigma^2\sigma^2 + \lambda v(\sigma^2 + \pi^2)\sigma + \frac{\lambda}{4}(\sigma^2 + \pi^2)^2. \]
Again, we have dropped a constant term.
In this Lagrangian, we see that the $\sigma$ field has mass
\[ m_\sigma = \sqrt{2\lambda v^2}, \]
but the $\pi$ fields are massless. We can understand this as follows: the $\sigma$ field corresponds to the radial direction in the potential, and this is locally quadratic, and hence has a mass term. However, the $\pi$ fields correspond to the excitations in the azimuthal directions, which are flat.
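The form of $V(\pi, \sigma)$ can be checked directly. The sketch below substitutes $\phi = (\pi_1, \cdots, \pi_{N-1}, v + \sigma)^T$ into $V = \frac{\lambda}{4}(\phi^2 - v^2)^2$, writing $\pi^2 = \pi\cdot\pi$ and dropping the constant.

```latex
\begin{align*}
\phi^2 - v^2 &= \pi^2 + (v + \sigma)^2 - v^2 = 2v\sigma + \sigma^2 + \pi^2,\\
V(\pi, \sigma) &= \frac{\lambda}{4}\big(2v\sigma + \sigma^2 + \pi^2\big)^2
  = \lambda v^2\sigma^2 + \lambda v\,(\sigma^2 + \pi^2)\,\sigma
    + \frac{\lambda}{4}(\sigma^2 + \pi^2)^2.
\end{align*}
```

The only quadratic term is $\lambda v^2\sigma^2 = \frac{1}{2}m_\sigma^2\sigma^2$, so $m_\sigma^2 = 2\lambda v^2$, and there is no $\pi^2$ term at quadratic order.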
4.3 General case
What we just saw is a completely general phenomenon. Suppose we have an $N$-dimensional field $\phi = (\phi_1, \cdots, \phi_N)$, and suppose our theory has a Lie group of symmetries $G$. We will assume that the action of $G$ preserves the potential and kinetic terms individually, and not just the Lagrangian as a whole, so that $G$ will send a vacuum to another vacuum.
Suppose there is more than one choice of vacuum. We write
\[ \Phi_0 = \{\phi_0 : V(\phi_0) = V_{\min}\} \]
for the set of all vacua. Now we pick a favorite vacuum $\phi_0 \in \Phi_0$, and we want to look at the elements of $G$ that fix this vacuum. We write
\[ H = \operatorname{stab}(\phi_0) = \{h \in G : h\phi_0 = \phi_0\}. \]
This is the invariant subgroup, or stabilizer subgroup of $\phi_0$, and is the symmetry we are left with after the spontaneous symmetry breaking.
We will make some further simplifying assumptions. We will suppose $G$ acts transitively on $\Phi_0$, i.e. given any two vacua $\phi_0, \phi_0'$, we can find some $g \in G$ such that
\[ \phi_0' = g\phi_0. \]
This ensures that any two vacua are “the same”, so which $\phi_0$ we pick doesn’t really matter. Indeed, given two such vacua and $g$ relating them, it is a straightforward computation to check that
\[ \operatorname{stab}(\phi_0') = g \operatorname{stab}(\phi_0) g^{-1}. \]
So any two choices of vacuum will result in conjugate, and in particular isomorphic, subgroups $H$.
It is a curious, and also unimportant, observation that given a choice of vacuum $\phi_0$, we have a canonical bijection
\[ G/H \to \Phi_0, \quad g \mapsto g\phi_0. \]
So we can identify $G/H$ and $\Phi_0$. This is known as the orbit-stabilizer theorem.
We now try to understand what happens when we try to do perturbation theory around our choice of vacuum. At $\phi = \phi_0 + \delta\phi$, we can as usual write
\[ V(\phi_0 + \delta\phi) - V(\phi_0) = \frac{1}{2}\delta\phi_r \frac{\partial^2 V}{\partial\phi_r \partial\phi_s}\delta\phi_s + O(\delta\phi^3). \]
This quadratic $\partial^2 V$ term is now acting as a mass. We call it the mass matrix
\[ M^2_{rs} = \frac{\partial^2 V}{\partial\phi_r \partial\phi_s}. \]
Note that we are being sloppy with where our indices go, because $(M^2)_{rs}$ is ugly. It doesn’t really matter much, since $\phi$ takes values in $\mathbb{R}^N$ (or $\mathbb{C}^n$), which is a Euclidean space.
In our previous example, we saw that there were certain modes that went massless. We now want to see if the same happens. To do so, we pick a basis $\{t^a\}$ of $\mathfrak{h}$ (the Lie algebra of $H$), and then extend it to a basis $\{t^a, \theta^a\}$ of $\mathfrak{g}$.
We consider a variation
\[ \delta\phi = \varepsilon\theta^a\phi_0 \]
around $\phi_0$.
Note that when we write $\theta^a\phi_0$, we mean the resulting infinitesimal change of $\phi_0$ when $\theta^a$ acts on the field. For a general Lie algebra element, this may be zero. However, we know that $\theta^a$ does not belong to $\mathfrak{h}$, so it does not fix $\phi_0$. Thus (assuming sensible non-degeneracy conditions) this implies that $\theta^a\phi_0$ is non-zero.
At this point, the uncareful reader might be tempted to say that since $G$ is a symmetry of $V$, we must have $V(\phi_0 + \delta\phi) = V(\phi_0)$, and thus
\[ (\theta^a\phi)_r M^2_{rs} (\theta^a\phi)_s = 0, \]
and so we have found zero eigenvectors of $M_{rs}$. But this is wrong. For example, if we take
\[ M = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \theta^a\phi = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \]
then our previous equation holds, but $\theta^a\phi$ is not a zero eigenvector. We need to be a bit more careful.
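The counterexample is easy to verify by hand. The sketch below takes $M = \mathrm{diag}(1, -1)$ (with the minus sign, which is what makes the quadratic form vanish) and writes $w = (1, 1)^T$ for the vector $\theta^a\phi$.

```latex
\begin{align*}
w^T M w &= \begin{pmatrix} 1 & 1 \end{pmatrix}
           \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
           \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 1 - 1 = 0,\\
M w &= \begin{pmatrix} 1 \\ -1 \end{pmatrix} \neq 0.
\end{align*}
```

So a vanishing quadratic form only shows that $w$ is a null direction, not that it is annihilated by $M$; this is why the argument needs the stronger first-derivative identity.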
Instead, we note that by definition of $G$-invariance, for any field value $\phi$ and a corresponding variation
\[ \phi \mapsto \phi + \varepsilon\theta^a\phi, \]
we must have
\[ V(\phi + \delta\phi) = V(\phi), \]
since this is what it means for $G$ to be a symmetry.
Expanding to first order, this says
\[ V(\phi + \delta\phi) - V(\phi) = \varepsilon(\theta^a\phi)_r \frac{\partial V}{\partial\phi_r} = 0. \]
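Now the trick is to take a further derivative of this. We obtain
\[ 0 = \frac{\partial}{\partial\phi_s}\left( (\theta^a\phi)_r \frac{\partial V}{\partial\phi_r} \right) = \left( \frac{\partial}{\partial\phi_s}(\theta^a\phi)_r \right)\frac{\partial V}{\partial\phi_r} + (\theta^a\phi)_r M^2_{rs}. \]
Now at a vacuum $\phi = \phi_0$, the first term vanishes. So we must have
\[ (\theta^a\phi_0)_r M^2_{rs} = 0. \]
By assumption, $\theta^a\phi_0 \neq 0$. So we have found a zero eigenvector of $M^2_{rs}$. Recall that we had $\dim G - \dim H$ of these $\theta^a$ in total. So we have found $\dim G - \dim H$ zero eigenvectors in total.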
We can do a small sanity check. What if we replaced $\theta^a$ with $t^a$? The same derivations would still follow through, and we deduce that
\[ (t^a\phi_0)_r M^2_{rs} = 0. \]
But since $t^a$ is an unbroken symmetry, we actually just have $t^a\phi_0 = 0$, and this really doesn’t say anything interesting.
What about the other eigenvalues? They could be zero for some other reason, and there is no reason to believe one way or the other. However, generally, in the scenarios we meet, they are usually massive. So after spontaneous symmetry breaking, we end up with $\dim G - \dim H$ massless modes, and $N - (\dim G - \dim H)$ massive ones.
This is Goldstone’s theorem in the classical case.
Example. In our previous $O(N)$ model, we had $G = O(N)$, and $H = O(N - 1)$. These have dimensions
\[ \dim G = \frac{N(N-1)}{2}, \quad \dim H = \frac{(N-1)(N-2)}{2}. \]
Then we have
\[ \Phi_0 \cong S^{N-1}, \]
the $(N-1)$-dimensional sphere. So we expect to have
\[ \frac{N-1}{2}(N - (N-2)) = N - 1 \]
massless modes, which are the $\pi$ fields we found. We also found $N - (N-1) = 1$ massive mode, namely the $\sigma$ field.
Example. In the first discrete case, we had $G = \mathbb{Z}/2\mathbb{Z}$ and $H = \{e\}$. These are (rather degenerate) 0-dimensional Lie groups. So we expect $0 - 0 = 0$ massless modes, and $1 - 0 = 1$ massive modes, which is what we found.
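For a concrete instance of the $O(N)$ counting, one can take $N = 3$. This is only an illustrative special case of the general formula, not an additional result.

```latex
\begin{align*}
G = O(3):\ \dim G = \frac{3\cdot 2}{2} = 3, \qquad
H = O(2):\ \dim H = \frac{2\cdot 1}{2} = 1, \qquad
\Phi_0 \cong S^2,\\
\#\text{massless} = \dim G - \dim H = 2 = \dim S^2, \qquad
\#\text{massive} = N - 2 = 1.
\end{align*}
```

The two massless modes are the two tangent directions to the sphere of vacua, and the single massive mode is the radial direction.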
4.4 Goldstone’s theorem
We now consider the quantum version of Goldstone’s theorem. We hope to get
something similar.
Again, suppose our Lagrangian has a symmetry group $G$, which is spontaneously broken into $H \leq G$ after picking a preferred vacuum $|0\rangle$. Again, we take a Lie algebra basis $\{t_a, \theta_a\}$ of $\mathfrak{g}$, where $\{t_a\}$ is a basis for $\mathfrak{h}$. Note that we will assume that $a$ runs from $1, \cdots, \dim G$, and we use the first $\dim H$ labels to refer to the $t_a$, and the remaining to label the $\theta_a$, so that the actual basis is
\[ \{t_1, t_2, \cdots, t_{\dim H}, \theta_{\dim H + 1}, \cdots, \theta_{\dim G}\}. \]
For aesthetic reasons, this time we put the indices on the generators below.
Now recall, say from AQFT, that these symmetries of the Lagrangian give rise to conserved currents $j_a^\mu(x)$. These in turn give us conserved charges
\[ Q_a = \int d^3x\, j_a^0(x). \]
Moreover, under the action of the $t_a$ and $\theta_a$, the corresponding change in $\phi$ is given by
\[ \delta\phi = -[Q_a, \phi]. \]
The fact that the $\theta_a$ break the symmetry of $|0\rangle$ tells us that, for $a > \dim H$,
\[ \langle 0|[Q_a, \phi(0)]|0\rangle \neq 0. \]
Note that the choice of evaluating $\phi$ at 0 is completely arbitrary, but we have to pick somewhere, and 0 is easy to write. Writing out the definition of $Q_a$, we know that
\[ \int d^3x\, \langle 0|[j_a^0(x), \phi(0)]|0\rangle \neq 0. \]
More generally, since the label 0 is arbitrary by Lorentz invariance, we are really looking at the relation
\[ \langle 0|[j_a^\mu(x), \phi(0)]|0\rangle \neq 0, \]
and writing it in this more Lorentz covariant way makes it easier to work with. The goal is to deduce the existence of massless states from this non-vanishing.
For convenience, we will write this expression as
\[ X_a^\mu = \langle 0|[j_a^\mu(x), \phi(0)]|0\rangle. \]
We treat the two terms in the commutator separately. The first term is
\[ \langle 0|j_a^\mu(x)\phi(0)|0\rangle = \sum_n \langle 0|j_a^\mu(x)|n\rangle\langle n|\phi(0)|0\rangle. \quad (*) \]
We write $p_n$ for the momentum of $|n\rangle$. We now note the operator equation
\[ j_a^\mu(x) = e^{i\hat{p}\cdot x} j_a^\mu(0) e^{-i\hat{p}\cdot x}. \]
So we know
\[ \langle 0|j_a^\mu(x)|n\rangle = \langle 0|j_a^\mu(0)|n\rangle e^{-ip_n\cdot x}. \]
We can use this to do some Fourier transform magic. By direct verification, we can see that $(*)$ is equivalent to
\[ i\int \frac{d^4k}{(2\pi)^3} \rho_a^\mu(k) e^{-ik\cdot x}, \]
where
\[ i\rho_a^\mu(k) = (2\pi)^3 \sum_n \delta^4(k - p_n) \langle 0|j_a^\mu(0)|n\rangle\langle n|\phi(0)|0\rangle. \]
Similarly, we define
\[ i\tilde\rho_a^\mu(k) = (2\pi)^3 \sum_n \delta^4(k - p_n) \langle 0|\phi(0)|n\rangle\langle n|j_a^\mu(0)|0\rangle. \]
Then we can write
\[ X_a^\mu = i\int \frac{d^4k}{(2\pi)^3} \left( \rho_a^\mu(k) e^{-ik\cdot x} - \tilde\rho_a^\mu(k) e^{+ik\cdot x} \right). \]
This is called the Källén-Lehmann spectral representation.
We now claim that $\rho_a^\mu(k)$ and $\tilde\rho_a^\mu(k)$ can be written in a particularly nice form. This involves a few observations (when I say $\rho_a^\mu$, I actually mean $\rho_a^\mu$ and $\tilde\rho_a^\mu$):
• $\rho_a^\mu$ depends Lorentz covariantly on $k$. So it “must” be a scalar multiple of $k^\mu$.
• $\rho_a^\mu(k)$ must vanish unless $k^0 > 0$, since $k^0 \leq 0$ is non-physical.
• By Lorentz invariance, the magnitude of $\rho_a^\mu(k)$ can depend only on the length (squared) $k^2$ of $k$.
Under these assumptions, we can write $\rho_a^\mu$ and $\tilde\rho_a^\mu$ as
\[ \rho_a^\mu(k) = k^\mu \Theta(k^0)\rho_a(k^2), \]
\[ \tilde\rho_a^\mu(k) = k^\mu \Theta(k^0)\tilde\rho_a(k^2), \]
where $\Theta$ is the Heaviside theta function
\[ \Theta(x) = \begin{cases} 1 & x > 0 \\ 0 & x \leq 0 \end{cases}. \]
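So we can write
\[ X_a^\mu = i\int \frac{d^4k}{(2\pi)^3} k^\mu\Theta(k^0) \left( \rho_a(k^2) e^{-ik\cdot x} - \tilde\rho_a(k^2) e^{+ik\cdot x} \right). \]
We now use the (nasty?) trick of hiding the $k^\mu$ by replacing it with a derivative:
\[ X_a^\mu = -\partial^\mu \int \frac{d^4k}{(2\pi)^3} \Theta(k^0) \left( \rho_a(k^2) e^{-ik\cdot x} + \tilde\rho_a(k^2) e^{ik\cdot x} \right). \]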
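The derivative trick rests on one elementary identity. The sketch below assumes $\partial^\mu = \partial/\partial x_\mu$ acting on plane waves, which is the only ingredient needed to see why the relative sign between the two terms changes.

```latex
\begin{align*}
\partial^\mu e^{\mp ik\cdot x} &= \mp i k^\mu\, e^{\mp ik\cdot x},\\
i k^\mu \big(\rho_a\, e^{-ik\cdot x} - \tilde\rho_a\, e^{+ik\cdot x}\big)
  &= -\partial^\mu \big(\rho_a\, e^{-ik\cdot x} + \tilde\rho_a\, e^{+ik\cdot x}\big).
\end{align*}
```

Since $\rho_a(k^2)$ and $\Theta(k^0)$ do not depend on $x$, the derivative can be pulled outside the $k$ integral.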
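Now we might find the expression inside the integral a bit familiar. Recall that the propagator is given by
\[ D(x, \sigma) = \langle 0|\phi(x)\phi(0)|0\rangle = \int \frac{d^4k}{(2\pi)^3} \Theta(k^0)\delta(k^2 - \sigma) e^{-ik\cdot x}, \]
where $\sigma$ is the square of the mass of the field. Using the really silly fact that
\[ \rho_a(k^2) = \int d\sigma\, \rho_a(\sigma)\delta(k^2 - \sigma), \]
we find that
\[ X_a^\mu = -\partial^\mu \int d\sigma\, \left( \rho_a(\sigma)D(x, \sigma) + \tilde\rho_a(\sigma)D(-x, \sigma) \right). \]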
Now we have to recall more properties of $D$. For spacelike $x$, we have $D(x, \sigma) = D(-x, \sigma)$. Therefore, requiring $X_a^\mu$ to vanish for spacelike $x$ by causality, we see that we must have
\[ \rho_a(\sigma) = -\tilde\rho_a(\sigma). \]
Therefore we can write
\[ X_a^\mu = -\partial^\mu \int d\sigma\, \rho_a(\sigma)\, i\Delta(x, \sigma), \quad (\dagger) \]
where
\[ i\Delta(x, \sigma) = D(x, \sigma) - D(-x, \sigma) = \int \frac{d^4k}{(2\pi)^3} \delta(k^2 - \sigma)\varepsilon(k^0) e^{-ik\cdot x}, \]
and
\[ \varepsilon(k^0) = \begin{cases} +1 & k^0 > 0 \\ -1 & k^0 < 0 \end{cases}. \]
This $\Delta$ is again a different sort of propagator.
We now use current conservation, which tells us
\[ \partial_\mu j_a^\mu = 0. \]
So we must have
\[ \partial_\mu X_a^\mu = -\partial^2 \int d\sigma\, \rho_a(\sigma)\, i\Delta(x, \sigma) = 0. \]
On the other hand, by inspection, we see that $\Delta$ satisfies the Klein-Gordon equation
\[ (\partial^2 + \sigma)\Delta = 0. \]
So in $(\dagger)$, we can replace $-\partial^2\Delta$ with $\sigma\Delta$. So we find that
\[ \partial_\mu X_a^\mu = \int d\sigma\, \sigma\rho_a(\sigma)\, i\Delta(x, \sigma) = 0. \]
This is true for all $x$. In particular, it is true for timelike $x$ where $\Delta$ is non-zero. So we must have
\[ \sigma\rho_a(\sigma) = 0. \]
But we also know that $\rho_a(\sigma)$ cannot be identically zero, because $X_a^\mu$ is not. So the only possible explanation is that
\[ \rho_a(\sigma) = N_a\delta(\sigma), \]
where $N_a$ is a dimensionful non-zero constant.
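The step from $\sigma\rho_a(\sigma) = 0$ to a delta function is a statement about distributions. The sketch below treats $\rho_a$ as a distribution acting on an arbitrary smooth test function $g(\sigma)$, which is an assumption about the setting rather than something spelled out in the argument above.

```latex
\begin{align*}
\int d\sigma\; \sigma\, N_a\,\delta(\sigma)\, g(\sigma) &= N_a \cdot 0 \cdot g(0) = 0
  && \text{so } N_a\delta(\sigma) \text{ solves } \sigma\rho_a = 0,\\
\sigma\,\delta'(\sigma) &= -\delta(\sigma) \neq 0
  && \text{so derivatives of } \delta \text{ are not solutions}.
\end{align*}
```

Any solution must vanish away from $\sigma = 0$, and among distributions supported at the origin only multiples of $\delta(\sigma)$ survive multiplication by $\sigma$, which is why $\rho_a(\sigma) = N_a\delta(\sigma)$ is the only non-trivial option.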
Now we retrieve our definitions of $\rho_a$. Recall that they are defined by
\[ i\rho_a^\mu(k) = (2\pi)^3 \sum_n \delta^4(k - p_n) \langle 0|j_a^\mu(0)|n\rangle\langle n|\phi(0)|0\rangle, \]
\[ \rho_a^\mu(k) = k^\mu\Theta(k^0)\rho_a(k^2). \]
So the fact that $\rho_a(\sigma) = N_a\delta(\sigma)$ implies that there must be some states, which we shall call $|B(p)\rangle$, of momentum $p$, such that $p^2 = 0$, and
\[ \langle 0|j_a^\mu(0)|B(p)\rangle \neq 0, \quad \langle B(p)|\phi(0)|0\rangle \neq 0. \]
The condition that $p^2 = 0$ tells us these particles are massless, which are those massless modes we were looking for! These are the Goldstone bosons.
We can write the values of the above as
\[ \langle 0|j_a^\mu(0)|B(p)\rangle = iF_a^B p^\mu, \quad \langle B(p)|\phi(0)|0\rangle = Z^B, \]
whose form we deduced by Lorentz in/covariance. By dimensional analysis, we know $F_a^B$ is a dimension 1 constant, and $Z^B$ is a dimensionless constant.
From these formulas, we note that since $\phi(0)|0\rangle$ is rotationally invariant, we deduce that $|B(p)\rangle$ also is. So we deduce that these are in fact spin 0 particles.
Finally, we end with some computations that relate these different numbers we obtained. We will leave the computational details for the reader, and just outline what we should do. Using our formula for $\rho$, we find that
\[ \langle 0|[j_a^\mu(x), \phi(0)]|0\rangle = -\partial^\mu \int N_a\delta(\sigma)\, i\Delta(x, \sigma)\, d\sigma = -iN_a\partial^\mu\Delta(x, 0). \]
Integrating over space, we find
\[ \langle 0|[Q_a, \phi(0)]|0\rangle = -iN_a \int d^3x\, \partial^0\Delta(x, 0) = iN_a. \]
Then we have
\[ \langle 0|t_a\phi(0)|0\rangle = i\langle 0|[Q_a, \phi(0)]|0\rangle = i \cdot iN_a = -N_a. \]
So this number $N_a$ sort-of measures how much the symmetry is broken, and we once again see that it is the breaking of symmetry that gives rise to a non-zero $N_a$, hence the massless bosons.
We also recall that we had
\[ ik^\mu\Theta(k^0)N_a\delta(k^2) = \sum_B \int \frac{d^3p}{2|\mathbf{p}|} \delta^4(k - p) \langle 0|j_a^\mu(0)|B(p)\rangle\langle B(p)|\phi(0)|0\rangle. \]
On the RHS, we can plug in the names we gave for the matrix elements, and on the left, we can write it as an integral
\[ \int \frac{d^3p}{2|\mathbf{p}|} \delta^4(k - p)\, ik^\mu N_a = \int \frac{d^3p}{2|\mathbf{p}|} \delta^4(k - p)\, ip^\mu \sum_B F_a^B Z^B. \]
So we must have
\[ N_a = \sum_B F_a^B Z^B. \]
We can repeat this process for each independent symmetry-breaking $\theta_a \in \mathfrak{g} \setminus \mathfrak{h}$, and obtain a Goldstone boson this way. So all together, at least superficially, we find
\[ n = \dim G - \dim H \]
many Goldstone bosons.
It is important to figure out the assumptions we used to derive the result. We mentioned “Lorentz invariance” many times in the proof, so we certainly need to assume our theory is Lorentz invariant. Another perhaps subtle point is that we also needed states to have a positive definite norm. It turns out in the case of gauge theory, these do not necessarily hold, and we need to work with them separately.
4.5 The Higgs mechanism
Recall that we had a few conditions for our previous theorem to hold. In the case
of gauge theories, these typically don’t hold. For example, in QED, imposing
a Lorentz invariant gauge condition (e.g. the Lorenz gauge) gives us negative
norm states. On the other hand, if we fix the gauge condition so that we don’t
have negative norm states, this breaks Lorentz invariance. What happens in this
case is known as the Higgs mechanism.
Let’s consider the case of scalar electrodynamics. This involves two fields:
• A complex scalar field $\phi(x) \in \mathbb{C}$.
• A 4-vector field $A(x) \in \mathbb{R}^{1,3}$.
As usual the components of $A(x)$ are denoted $A_\mu(x)$. From this we define the electromagnetic field tensor
\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \]
and we have a covariant derivative
\[ D_\mu = \partial_\mu + iqA_\mu. \]
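As usual, the Lagrangian is given by
\[ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + (D_\mu\phi)^\dagger(D^\mu\phi) - V(|\phi|^2) \]
for some potential $V$.
A U(1) gauge transformation is specified by some $\alpha(x) \in \mathbb{R}$. Then the fields transform as
\[ \phi(x) \mapsto e^{i\alpha(x)}\phi(x), \]
\[ A_\mu(x) \mapsto A_\mu(x) - \frac{1}{q}\partial_\mu\alpha(x). \]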
We will consider a $\phi^4$ theory, so that the potential is
\[ V(|\phi|^2) = \mu^2|\phi|^2 + \lambda|\phi|^4. \]
As usual, we require $\lambda > 0$, and if $\mu^2 > 0$, then this is boring with a unique vacuum at $\phi = 0$. In this case, $A_\mu$ is massless and $\phi$ is massive.
If instead $\mu^2 < 0$, then this is the interesting case. We have minima at
\[ |\phi_0|^2 = -\frac{\mu^2}{2\lambda} \equiv \frac{v^2}{2}. \]
Without loss of generality, we expand around a real $\phi_0$, and write
\[ \phi(x) = \frac{1}{\sqrt{2}} e^{i\theta(x)/v}(v + \eta(x)), \]
where $\eta, \theta$ are real fields.
Now we notice that $\theta$ isn’t a “genuine” field (despite being real). Our theory is invariant under gauge transformations. Thus, by picking
\[ \alpha(x) = -\frac{1}{v}\theta(x), \]
we can get rid of the $\theta(x)$ term, and be left with
\[ \phi(x) = \frac{1}{\sqrt{2}}(v + \eta(x)). \]
This is called the unitary gauge. Of course, once we have made this choice, we no longer have the gauge freedom, but on the other hand everything else going on becomes much clearer.
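In this gauge, the Lagrangian can be written as
\[ \mathcal{L} = \frac{1}{2}(\partial_\mu\eta\,\partial^\mu\eta + 2\mu^2\eta^2) - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{q^2v^2}{2}A_\mu A^\mu + \mathcal{L}_{\mathrm{int}}, \]
where $\mathcal{L}_{\mathrm{int}}$ is the interaction piece that involves more than two fields.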
We can now just read off what is going on in here.
• The $\eta$ field is massive with mass
\[ m_\eta^2 = -2\mu^2 = 2\lambda v^2 > 0. \]
• The photon now gains a mass
\[ m_A^2 = q^2v^2. \]
• What would be the Goldstone boson, namely the $\theta$ field, has been “eaten” to become the longitudinal polarization of $A_\mu$, and $A_\mu$ has gained a degree of freedom (or rather, the gauge non-degree-of-freedom became a genuine degree of freedom).
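The photon mass term and the degree-of-freedom bookkeeping can both be checked in a few lines. The sketch below assumes the unitary-gauge form $\phi = \frac{1}{\sqrt{2}}(v + \eta)$ with $\eta$ and $A_\mu$ real, and $D_\mu = \partial_\mu + iqA_\mu$.

```latex
\begin{align*}
D_\mu\phi &= \frac{1}{\sqrt{2}}\big(\partial_\mu\eta + iqA_\mu (v + \eta)\big),\\
(D_\mu\phi)^\dagger (D^\mu\phi)
  &= \frac{1}{2}\partial_\mu\eta\,\partial^\mu\eta
   + \frac{q^2}{2} A_\mu A^\mu (v + \eta)^2\\
  &= \frac{1}{2}\partial_\mu\eta\,\partial^\mu\eta
   + \frac{q^2 v^2}{2} A_\mu A^\mu
   + q^2 v\, A_\mu A^\mu \eta
   + \frac{q^2}{2} A_\mu A^\mu \eta^2 .
\end{align*}
% Degrees of freedom: before, 2 (complex scalar) + 2 (massless photon) = 4;
% after, 1 (real scalar \eta) + 3 (massive vector) = 4.
```

The term $\frac{q^2v^2}{2}A_\mu A^\mu$ is the mass term with $m_A^2 = q^2v^2$, and the total number of degrees of freedom is unchanged, consistent with the $\theta$ field having been absorbed into $A_\mu$.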
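One can check that the interaction piece becomes
\[ \mathcal{L}_{\mathrm{int}} = \frac{q^2}{2}A_\mu A^\mu\eta^2 + qm_A A_\mu A^\mu\eta - \frac{\lambda}{4}\eta^4 - m_\eta\sqrt{\frac{\lambda}{2}}\eta^3. \]
So we have interaction vertices of the form $AA\eta\eta$, $AA\eta$, $\eta^4$ and $\eta^3$.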
This η field is the “Higgs boson” in our toy theory.
4.6 Non-abelian gauge theories
We’ll not actually talk about spontaneous symmetry breaking in non-abelian gauge theories.
That is the story of the next chapter, where we start studying electroweak theory.
Instead, we’ll just briefly set up the general framework of non-abelian gauge
theory.
In general, we have a gauge group $G$, which is a compact Lie group with Lie algebra $\mathfrak{g}$. We also have a representation of $G$ on $\mathbb{C}^n$, which we will assume is unitary, i.e. each element of $G$ is represented by a unitary matrix. Our field $\psi(x) \in \mathbb{C}^n$ takes values in this representation.
A gauge transformation is specified by giving a $g(x) \in G$ for each point $x$ in the universe, and then the field transforms as
\[ \psi(x) \mapsto g(x)\psi(x). \]
Alternatively, we can represent this transformation infinitesimally, by producing some $t(x) \in \mathfrak{g}$, and then the transformation is specified by
\[ \psi(x) \mapsto \exp(it(x))\psi(x). \]
Associated to our gauge theory is a gauge field $A_\mu(x) \in \mathfrak{g}$ (i.e. we have an element of $\mathfrak{g}$ for each $\mu$ and $x$), which transforms under an infinitesimal transformation $t(x) \in \mathfrak{g}$ as
\[ A_\mu(x) \mapsto A_\mu(x) - \frac{1}{g}\partial_\mu t(x) + i[t, A_\mu]. \]
The gauge covariant derivative is again given by
\[ D_\mu = \partial_\mu + igA_\mu, \]
where all fields are, of course, implicitly evaluated at $x$. As before, we can define
\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + ig[A_\mu, A_\nu] \in \mathfrak{g}. \]
Alternatively, we can write this as
\[ [D_\mu, D_\nu] = igF_{\mu\nu}. \]
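The relation between the commutator of covariant derivatives and the field strength can be verified by acting on a test field. The sketch below assumes $D_\mu = \partial_\mu + igA_\mu$ with matrix-valued $A_\mu$, and uses the fact that the $\partial_\mu\partial_\nu$ terms and the terms with a single derivative acting on $\psi$ cancel in the antisymmetrization.

```latex
\begin{align*}
[D_\mu, D_\nu]\psi
  &= \big[\partial_\mu + igA_\mu,\ \partial_\nu + igA_\nu\big]\psi\\
  &= ig\big(\partial_\mu A_\nu - \partial_\nu A_\mu\big)\psi
     + (ig)^2 [A_\mu, A_\nu]\psi\\
  &= ig\Big(\partial_\mu A_\nu - \partial_\nu A_\mu + ig[A_\mu, A_\nu]\Big)\psi .
\end{align*}
```

Reading off the bracket fixes the commutator term in $F_{\mu\nu}$ once the convention for $D_\mu$ is chosen; in the abelian case the commutator vanishes and the electromagnetic field tensor is recovered.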
We will later on work in coordinates. We can pick a basis of $\mathfrak{g}$, say $t^1, \cdots, t^n$. Then we can write an arbitrary element of $\mathfrak{g}$ as $\theta^a(x) t^a$. In this basis, we write the components of the gauge field as $A^a_\mu$, and similarly the field strength is denoted $F^a_{\mu\nu}$. We can define the structure constants $f^{abc}$ by
\[ [t^a, t^b] = f^{abc} t^c. \]
Using this notation, the gauge part of the Lagrangian $\mathcal{L}$ is
\[ \mathcal{L}_g = -\frac{1}{4} F^a_{\mu\nu} F^{a\mu\nu} = -\frac{1}{2} \mathrm{Tr}(F_{\mu\nu} F^{\mu\nu}). \]
We now move on to look at some actual theories.
5 Electroweak theory
We can now start discussing the standard model. In this chapter, we will account
for everything in the theory apart from the strong force. In particular, we will
talk about the gauge (electroweak) and Higgs part of the theory, as well as the
matter content.
The description of the theory will be entirely in the classical world. It is only when we do computations that we quantize the theory and compute $S$-matrices.
5.1 Electroweak gauge theory
We start by understanding the gauge and Higgs part of the theory. As mentioned, the gauge group is
\[ G = SU(2)_L \times U(1)_Y. \]
We will see that this is broken by the Higgs mechanism.
It is convenient to pick a basis for $\mathfrak{su}(2)$, which we will denote by
\[ \tau^a = \frac{\sigma^a}{2}. \]
Note that these are not genuinely elements of $\mathfrak{su}(2)$. We need to multiply these by $i$ to actually get members of $\mathfrak{su}(2)$ (thus, they form a complex basis for the complexification of $\mathfrak{su}(2)$). As we will later see, these act on fields as $e^{i\tau^a}$ instead of $e^{\tau^a}$. Under this basis, the structure constants are $f^{abc} = \varepsilon^{abc}$.
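As a quick numerical check of the statement above, the following sketch builds $\tau^a = \sigma^a/2$ and compares the commutators with $\varepsilon^{abc}$. Because the $\tau^a$ are Hermitian rather than anti-Hermitian, the commutator carries the extra factor of $i$ mentioned in the text, i.e. $[\tau^a, \tau^b] = i\varepsilon^{abc}\tau^c$. The helper names `levi_civita` and `commutator` are illustrative choices, not notation from the notes.

```python
import numpy as np

# Pauli matrices; tau^a = sigma^a / 2 as in the text.
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
tau = sigma / 2


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2


def commutator(x, y):
    return x @ y - y @ x


# Largest deviation of [tau^a, tau^b] from i * eps^{abc} tau^c.
max_err = 0.0
for a in range(3):
    for b in range(3):
        expected = sum(1j * levi_civita(a, b, c) * tau[c] for c in range(3))
        err = np.abs(commutator(tau[a], tau[b]) - expected).max()
        max_err = max(max_err, err)

print(max_err)
```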
We start by describing how our gauge group acts on the Higgs field.
Definition (Higgs field). The Higgs field $\phi$ is a complex scalar field with two components, $\phi(x) \in \mathbb{C}^2$. The $SU(2)$ action is given by the fundamental representation on $\mathbb{C}^2$, and the hypercharge is $Y = \frac{1}{2}$.
Explicitly, an (infinitesimal) gauge transformation can be represented by elements $\alpha^a(x), \beta(x) \in \mathbb{R}$, corresponding to the elements $\alpha^a(x)\tau^a \in \mathfrak{su}(2)$ and $\beta(x) \in \mathfrak{u}(1) \cong \mathbb{R}$. Then the Higgs field transforms as
\[ \phi(x) \mapsto e^{i\alpha^a(x)\tau^a} e^{i\frac{1}{2}\beta(x)} \phi(x), \]
where the $\frac{1}{2}$ factor of $\beta(x)$ comes from the hypercharge being $\frac{1}{2}$.
Note that when we say $\phi$ is a scalar field, we mean it transforms trivially via Lorentz transformations. It still takes values in the vector space $\mathbb{C}^2$.
The gauge fields corresponding to $SU(2)$ and $U(1)$ are denoted $W^a_\mu$ and $B_\mu$, where again $a$ runs through $a = 1, 2, 3$. The covariant derivative associated to these gauge fields is
\[ D_\mu = \partial_\mu + igW^a_\mu \tau^a + \frac{1}{2} ig' B_\mu \]
for some coupling constants $g$ and $g'$.
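The part of the Lagrangian relating the gauge and Higgs field is
\[ \mathcal{L}_{\mathrm{gauge}} = -\frac{1}{2} \mathrm{Tr}(F^W_{\mu\nu} F^{W,\mu\nu}) - \frac{1}{4} F^B_{\mu\nu} F^{B,\mu\nu} + (D_\mu\phi)^\dagger (D^\mu\phi) - \mu^2|\phi|^2 - \lambda|\phi|^4, \]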
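where the field strengths of $W$ and $B$ respectively are given by
\[ F^{W,a}_{\mu\nu} = \partial_\mu W^a_\nu - \partial_\nu W^a_\mu - g\varepsilon^{abc} W^b_\mu W^c_\nu, \qquad F^B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu. \]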
In the case $\mu^2 < 0$, the Higgs field acquires a VEV, and we wlog shall choose the vacuum to be
\[ \phi_0 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix}, \]
where $\mu^2 = -\lambda v^2 < 0$.
As in the case of $U(1)$ symmetry breaking, the gauge term gives us new things with mass. After spending hours working out the computations, we find that $(D_\mu\phi)^\dagger(D^\mu\phi)$ contains mass terms
\[ \frac{1}{2} \frac{v^2}{4} \left( g^2 (W^1)^2 + g^2 (W^2)^2 + (-gW^3 + g'B)^2 \right). \]
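This suggests we define the following new fields:
\[ W^\pm_\mu = \frac{1}{\sqrt{2}} (W^1_\mu \mp iW^2_\mu), \qquad \begin{pmatrix} Z^0_\mu \\ A_\mu \end{pmatrix} = \begin{pmatrix} \cos\theta_W & -\sin\theta_W \\ \sin\theta_W & \cos\theta_W \end{pmatrix} \begin{pmatrix} W^3_\mu \\ B_\mu \end{pmatrix}, \]
where we pick $\theta_W$ such that
\[ \cos\theta_W = \frac{g}{\sqrt{g^2 + g'^2}}, \qquad \sin\theta_W = \frac{g'}{\sqrt{g^2 + g'^2}}. \]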
This $\theta_W$ is called the Weinberg angle. Then the mass term above becomes
\[ \frac{1}{2} \left( \frac{v^2 g^2}{4} \left( |W^+|^2 + |W^-|^2 \right) + \frac{v^2 (g^2 + g'^2)}{4} Z^2 \right). \]
Thus, our particles have masses
\[ m_W = \frac{vg}{2}, \qquad m_Z = \frac{v}{2}\sqrt{g^2 + g'^2}, \qquad m_\gamma = 0, \]
where $\gamma$ is the $A_\mu$ particle, which is the photon. Thus, our original $SU(2) \times U(1)_Y$ breaks down to a $U(1)_{EM}$ symmetry. In terms of the Weinberg angle,
\[ m_W = m_Z \cos\theta_W. \]
Thus, through the Higgs mechanism, we find that the $W^\pm$ and $Z$ bosons gain mass, but the photon does not. This agrees with what we find experimentally. In practice, we find that the masses are
\[ m_W \approx 80\ \mathrm{GeV}, \qquad m_Z \approx 91\ \mathrm{GeV}, \qquad m_\gamma < 10^{-18}\ \mathrm{GeV}. \]
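The diagonalization above can be checked numerically. The sketch below assumes illustrative values $v = 246$ GeV, $g = 0.65$ and $g' = 0.35$ (these numbers are assumptions chosen to land near the measured masses, not values given in the notes). It writes the $(W^3, B)$ mass term as a $2 \times 2$ matrix, rotates it by the Weinberg angle, and confirms that one eigenvalue is $m_Z^2$ and the other vanishes.

```python
import numpy as np

# Illustrative inputs (assumptions, not taken from the notes).
v = 246.0        # Higgs VEV in GeV
g = 0.65         # SU(2) coupling
g_prime = 0.35   # U(1)_Y coupling

# (1/2)(v^2/4)(-g W^3 + g' B)^2 written as (1/2) (W^3, B) M2 (W^3, B)^T.
M2 = (v**2 / 4) * np.array([[g**2, -g * g_prime],
                            [-g * g_prime, g_prime**2]])

norm = np.hypot(g, g_prime)
cos_w, sin_w = g / norm, g_prime / norm

# (Z, A)^T = rotation @ (W^3, B)^T, as in the definition of the new fields.
rotation = np.array([[cos_w, -sin_w],
                     [sin_w, cos_w]])

# Mass-squared matrix in the (Z, A) basis; expected to be diag(m_Z^2, 0).
M2_rotated = rotation @ M2 @ rotation.T

m_W = v * g / 2
m_Z = v * norm / 2

print(f"m_W = {m_W:.1f} GeV, m_Z = {m_Z:.1f} GeV")
print(np.round(M2_rotated, 6))
```

With these inputs the masses come out close to the measured 80 GeV and 91 GeV, and the photon direction is exactly massless at this (tree) level.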
Also, the Higgs boson gets mass, as we saw previously. It is given by
\[ m_H = \sqrt{-2\mu^2} = \sqrt{2\lambda v^2}. \]
Note that the Higgs mass depends on the constant $\lambda$, which we haven’t seen anywhere else so far. So we can’t tell $m_H$ from what we know about $W$ and $Z$. Thus, until the Higgs was discovered recently, we didn’t know about the mass of the Higgs boson. We now know that
\[ m_H \approx 125\ \mathrm{GeV}. \]
Note that we didn’t write out all terms in the Lagrangian, but as we did before, we are going to get $W^\pm$, $Z$-Higgs and Higgs-Higgs interactions.
One might find it a bit pointless to define the $W^+$ and $W^-$ fields as we did, as the kinetic terms looked just as good with $W^1$ and $W^2$. The advantage of this new definition is that $W^+_\mu$ is now the complex conjugate of $W^-_\mu$, so we can instead view the $W$ bosons as given by a single complex vector field, and when we quantize, $W^+_\mu$ will be the anti-particle of $W^-_\mu$.
5.2 Coupling to leptons
We now look at how gauge bosons couple to matter. In general, for a particle with hypercharge $Y$, the covariant derivative is
\[ D_\mu = \partial_\mu + igW^a_\mu \tau^a + ig' Y B_\mu. \]
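In terms of the $W^\pm$ and $Z$ bosons, we can write this as
\[ D_\mu = \partial_\mu + \frac{ig}{\sqrt{2}} (W^+_\mu \tau^+ + W^-_\mu \tau^-) + \frac{igZ_\mu}{\cos\theta_W} (\cos^2\theta_W\, \tau^3 - \sin^2\theta_W\, Y) + ig\sin\theta_W A_\mu (\tau^3 + Y), \]
where
\[ \tau^\pm = \tau^1 \pm i\tau^2. \]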
By analogy, we interpret the final term $ig\sin\theta_W A_\mu(\tau^3 + Y)$ as the usual coupling with the photon. So we can identify the (magnitude of the) electron charge as
\[ e = g\sin\theta_W, \]
while the $U(1)_{EM}$ charge matrix is
\[ Q = U(1)_{EM} = \tau^3 + Y. \]
If we want to replace all occurrences of the hypercharge $Y$ with $Q$, then we need to note that
\[ \cos^2\theta_W\, \tau^3 - \sin^2\theta_W\, Y = \tau^3 - \sin^2\theta_W\, Q. \]
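The rewriting of the covariant derivative in terms of $Z_\mu$ and $A_\mu$ is easy to get wrong by a sign, so here is a small numerical sanity check. It is only a sketch: the couplings and field values are arbitrary test numbers (assumptions), and the fields are treated as plain real numbers at a single point. It checks that $gW^3\tau^3 + g'YB$ equals the $Z$ and photon terms, using both forms of the $Z$ coupling.

```python
import numpy as np

tau3 = np.diag([0.5, -0.5])
identity = np.eye(2)


def neutral_terms_agree(g, g_prime, Y, W3, B):
    """Compare g W^3 tau^3 + g' Y B with its rewriting in terms of Z and A."""
    norm = np.hypot(g, g_prime)
    cos_w, sin_w = g / norm, g_prime / norm

    # Field redefinition from the text.
    Z = cos_w * W3 - sin_w * B
    A = sin_w * W3 + cos_w * B

    Q = tau3 + Y * identity
    original = g * W3 * tau3 + g_prime * Y * B * identity

    z_coupling_with_Y = cos_w**2 * tau3 - sin_w**2 * Y * identity
    z_coupling_with_Q = tau3 - sin_w**2 * Q
    rewritten = (g * Z / cos_w) * z_coupling_with_Y + g * sin_w * A * Q

    return (np.allclose(original, rewritten)
            and np.allclose(z_coupling_with_Y, z_coupling_with_Q))


# Arbitrary test values; Y = 1/2, -1/2, 1/6 are hypercharges of SU(2) doublets.
checks = [neutral_terms_agree(0.65, 0.35, Y, W3=1.3, B=-0.7)
          for Y in (0.5, -0.5, 1 / 6)]
print(all(checks))
```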
We now introduce the electron field. The electron field is given by a spinor field $e(x)$. We will decompose it into left and right components
\[ e(x) = e_L(x) + e_R(x). \]
There is also a neutrino field $\nu_{eL}(x)$. We will assume that neutrinos are massless, and there are only left-handed neutrinos. We know this is not entirely true, because of neutrino oscillation, but the mass is very tiny, and this is a very good approximation.
To come up with an actual electroweak theory, we need to specify a representation of $SU(2) \times U(1)$. It is convenient to group our particles by handedness:
\[ R(x) = e_R(x), \qquad L(x) = \begin{pmatrix} \nu_{eL}(x) \\ e_L(x) \end{pmatrix}. \]
Here $R(x)$ is a single spinor, while $L(x)$ consists of a pair, or ($SU(2)$) doublet of spinors.
We first look at $R(x)$. Experimentally, we find that $W^\pm$ only couples to left-handed leptons. So $R(x)$ will have the trivial representation of $SU(2)$. We also know that electrons have charge $Q = -1$. Since $\tau^3$ acts trivially on $R$, we must have
\[ Q = Y = -1 \]
for the right-handed leptons.
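The left-handed particles are more complicated. We will assert that $L$ has the “fundamental” representation of $SU(2)$, by which we mean given a matrix
\[ g = \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix} \in SU(2), \]
it acts on $L$ as
\[ gL = \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix} \begin{pmatrix} \nu_{eL} \\ e_L \end{pmatrix} = \begin{pmatrix} a\nu_{eL} + be_L \\ -\bar{b}\nu_{eL} + \bar{a}e_L \end{pmatrix}. \]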
We know the electron has charge $-1$ and the neutrino has charge $0$. So we need
\[ Q = \begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}. \]
Since $Q = \tau^3 + Y$, we must have $Y = -\frac{1}{2}$.
Using this notation, the gauge part of the lepton Lagrangian can be written concisely as
\[ \mathcal{L}^{\mathrm{EW}}_{\mathrm{lepton}} = \bar{L} i\slashed{D} L + \bar{R} i\slashed{D} R. \]
Note that $\bar{L}$ means we take the transpose of the matrix $L$, viewed as a matrix with two components, and then take the conjugate of each spinor individually, so that
\[ \bar{L} = \begin{pmatrix} \bar{\nu}_{eL}(x) & \bar{e}_L(x) \end{pmatrix}. \]
How about the mass? Recall that it makes sense to deal with left and right-handed fermions separately only if the fermion is massless. A mass term
\[ m_e(\bar{e}_L e_R + \bar{e}_R e_L) \]
would be very bad, because our $SU(2)$ action mixes $e_L$ with $\nu_{eL}$.
But we do know that the electron has mass. The answer is that the mass is again granted by the spontaneous symmetry breaking due to the Higgs field. Working in unitary gauge, we can write the Higgs field as
\[ \phi(x) = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h(x) \end{pmatrix}, \]
with $v, h(x) \in \mathbb{R}$.
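The lepton-Higgs interaction is given by
\[ \mathcal{L}_{\mathrm{lept},\phi} = -\sqrt{2}\lambda_e (\bar{L}\phi R + \bar{R}\phi^\dagger L), \]
where $\lambda_e$ is the Yukawa coupling. It is helpful to make it more explicit what we mean by this expression. We have
\[ \bar{L}\phi R = \begin{pmatrix} \bar{\nu}_{eL}(x) & \bar{e}_L(x) \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h(x) \end{pmatrix} e_R(x) = \frac{1}{\sqrt{2}} (v + h(x))\, \bar{e}_L(x) e_R(x). \]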
Similarly, we can write out the second term and obtain
\[ \mathcal{L}_{\mathrm{lept},\phi} = -\lambda_e (v + h)(\bar{e}_L e_R + \bar{e}_R e_L) = -m_e \bar{e}e - \lambda_e h \bar{e}e, \]
where
\[ m_e = \lambda_e v. \]
We see that while the interaction term is a priori gauge invariant, once we have spontaneously broken the symmetry, we have magically obtained a mass term $m_e$. The second term in the Lagrangian is the Higgs-fermion coupling. We see that this is proportional to $m_e$. So massive things couple more strongly.
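We now return to the fermion-gauge boson interactions. Expanding $i\slashed{D}$, each gauge coupling picks up a sign from $i \cdot ig = -g$, and we can write the result as
\[ \mathcal{L}^{\mathrm{EW,int}}_{\mathrm{lept}} = -\frac{g}{2\sqrt{2}} (J^\mu W^+_\mu + J^{\mu\dagger} W^-_\mu) - eJ^\mu_{\mathrm{EM}} A_\mu - \frac{g}{2\cos\theta_W} J^\mu_n Z_\mu, \]
where
\[ J^\mu_{\mathrm{EM}} = -\bar{e}\gamma^\mu e, \qquad J^\mu = \bar{\nu}_{eL}\gamma^\mu (1 - \gamma^5) e, \]
\[ J^\mu_n = \frac{1}{2} \left( \bar{\nu}_{eL}\gamma^\mu (1 - \gamma^5)\nu_{eL} - \bar{e}\gamma^\mu (1 - \gamma^5 - 4\sin^2\theta_W) e \right). \]
Note that $J^\mu_{\mathrm{EM}}$ has a negative sign because the electron has negative charge. These currents are known as the EM current, charged weak current and neutral weak current respectively.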
There is one thing we haven’t mentioned so far. We only talked about electrons and neutrinos, but the standard model has three generations of leptons. There are the muon ($\mu$) and tau ($\tau$), and corresponding left-handed neutrinos. With these in mind, we introduce
\[ L^1 = \begin{pmatrix} \nu_{eL} \\ e_L \end{pmatrix}, \quad L^2 = \begin{pmatrix} \nu_{\mu L} \\ \mu_L \end{pmatrix}, \quad L^3 = \begin{pmatrix} \nu_{\tau L} \\ \tau_L \end{pmatrix}, \]
\[ R^1 = e_R, \quad R^2 = \mu_R, \quad R^3 = \tau_R. \]
These couple and interact in exactly the same way as the electrons, but have heavier mass. It is straightforward to add these into the $\mathcal{L}^{\mathrm{EW,int}}_{\mathrm{lept}}$ term. What is more interesting is the Higgs interaction, which is given by
\[ \mathcal{L}_{\mathrm{lept},\phi} = -\sqrt{2} \left( \lambda^{ij} \bar{L}^i \phi R^j + (\lambda^\dagger)^{ij} \bar{R}^i \phi^\dagger L^j \right), \]
where $i$ and $j$ run over the different generations. What’s new now is that we have the matrices $\lambda \in M_3(\mathbb{C})$. These are not predicted by the standard model.
This $\lambda$ is just a general matrix, and there is no reason to expect it to be diagonal. However, in some sense, it is always diagonalizable. The key insight is that we contract the two indices of $\lambda$ with two different kinds of things. It is a general linear algebra fact that for any matrix $\lambda$ at all, we can find some unitary matrices $U$ and $S$ such that
\[ \lambda = U\Lambda S^\dagger, \]
where $\Lambda$ is a diagonal matrix with real entries. We can then transform our fields by
\[ L^i \mapsto U^{ij} L^j, \qquad R^i \mapsto S^{ij} R^j, \]
and this diagonalizes $\mathcal{L}_{\mathrm{lept},\phi}$.
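The “general linear algebra fact” used here is the singular value decomposition. As a hedged illustration, the sketch below draws a random complex $3 \times 3$ matrix (a stand-in for the unknown Yukawa matrix, not physical data) and uses `numpy.linalg.svd` to produce unitary $U$, $S$ and a real diagonal $\Lambda$ with $\lambda = U\Lambda S^\dagger$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A generic complex 3x3 matrix standing in for the Yukawa matrix (illustrative only).
lam = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

# numpy returns lam = U @ diag(singular_values) @ Vh with U and Vh unitary.
U, singular_values, Vh = np.linalg.svd(lam)
S = Vh.conj().T                      # so that lam = U @ Lambda @ S^dagger
Lambda = np.diag(singular_values)    # real, non-negative diagonal entries

reconstructed = U @ Lambda @ S.conj().T

# After L -> U L and R -> S R, the bilinear L-bar lam R becomes L-bar (U^dagger lam S) R.
diagonalised = U.conj().T @ lam @ S

print(np.round(np.abs(diagonalised), 6))
```

The printed matrix is diagonal up to rounding, with the singular values (proportional to the lepton masses in this picture) on the diagonal.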
It is also clear that this leaves $\mathcal{L}^{\mathrm{EW}}_{\mathrm{lept}}$ invariant, especially if we look at the expression before symmetry breaking. So the mass eigenstates are the same as the “weak eigenstates”. This tells us that after diagonalizing, there is no mixing between the different generations.
This is important. The same trick does not work for quarks (or for leptons, if neutrinos have mass), and mixing does occur in these cases.
5.3 Quarks
We now move on to study quarks. There are 6 flavours of quarks, coming in three generations:
Charge   First generation   Second generation   Third generation
+2/3     Up (u)             Charm (c)           Top (t)
-1/3     Down (d)           Strange (s)         Bottom (b)
each of which is a spinor.
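The right-handed fields have trivial $SU(2)$ representations, and are thus $SU(2)$ singlets. We write them as
\[ u_R = \begin{pmatrix} u_R & c_R & t_R \end{pmatrix}, \]
which have $Y = Q = +\frac{2}{3}$, and
\[ d_R = \begin{pmatrix} d_R & s_R & b_R \end{pmatrix}, \]
which have $Y = Q = -\frac{1}{3}$.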
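The left-handed fields are in $SU(2)$ doublets
\[ Q^i_L = \begin{pmatrix} u^i_L \\ d^i_L \end{pmatrix} = \left( \begin{pmatrix} u \\ d \end{pmatrix}_L \ \begin{pmatrix} c \\ s \end{pmatrix}_L \ \begin{pmatrix} t \\ b \end{pmatrix}_L \right), \]
and these all have $Y = \frac{1}{6}$. Here $i = 1, 2, 3$ labels generations.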
The electroweak part of the Lagrangian is again straightforward, given by
\[ \mathcal{L}^{\mathrm{EW}}_{\mathrm{quark}} = \bar{Q}_L i\slashed{D} Q_L + \bar{u}_R i\slashed{D} u_R + \bar{d}_R i\slashed{D} d_R. \]
It takes a bit more work to couple these things with $\phi$. We want to do so in a gauge invariant way. To match up the $SU(2)$ part, we need a term that looks like
\[ \bar{Q}^i_L \phi, \]
as both $Q_L$ and $\phi$ have the fundamental representation. Now to have an invariant $U(1)$ theory, the hypercharges have to add to zero, as under a $U(1)$ gauge transformation $\beta(x)$, the term transforms as $e^{i\sum_i Y_i \beta(x)}$. We see that the term
\[ \bar{Q}^i_L \phi d^i_R \]
works well, since the hypercharges are $-\frac{1}{6} + \frac{1}{2} - \frac{1}{3} = 0$. However, coupling $\bar{Q}^i_L$ with $u_R$ is more problematic. $\bar{Q}^i_L$ has hypercharge $-\frac{1}{6}$ and $u_R$ has hypercharge $+\frac{2}{3}$. So we need another term of hypercharge $-\frac{1}{2}$. To do so, we introduce the charge-conjugated $\phi$, defined by
\[ (\phi^c)^\alpha = \varepsilon^{\alpha\beta} \phi^{*\beta}. \]
One can check that this transforms with the fundamental $SU(2)$ representation, and has $Y = -\frac{1}{2}$. Inserting generic coupling coefficients $\lambda^{ij}_{u,d}$, we write the Lagrangian as
\[ \mathcal{L}_{\mathrm{quark},\phi} = -\sqrt{2} \left( \lambda^{ij}_d \bar{Q}^i_L \phi d^j_R + \lambda^{ij}_u \bar{Q}^i_L \phi^c u^j_R + \text{h.c.} \right). \]
Here h.c. means “hermitian conjugate”.
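The hypercharge bookkeeping above can be sketched in a few lines using exact fractions. The dictionary keys and the helper `total_hypercharge` are illustrative names (assumptions); the hypercharge values are the ones quoted in the text, with a barred field contributing minus its hypercharge.

```python
from fractions import Fraction as F

# Hypercharges quoted in the text.
Y = {
    "Q_L": F(1, 6),
    "u_R": F(2, 3),
    "d_R": F(-1, 3),
    "L": F(-1, 2),
    "e_R": F(-1),
    "phi": F(1, 2),
}
Y["phi_c"] = -Y["phi"]   # the charge-conjugated Higgs has opposite hypercharge


def total_hypercharge(barred, *others):
    """Net hypercharge of a term bar(barred) * others...; must vanish for invariance."""
    return -Y[barred] + sum(Y[name] for name in others)


terms = {
    "Qbar phi d_R": total_hypercharge("Q_L", "phi", "d_R"),
    "Qbar phi_c u_R": total_hypercharge("Q_L", "phi_c", "u_R"),
    "Lbar phi e_R": total_hypercharge("L", "phi", "e_R"),
    "Qbar phi u_R (not allowed)": total_hypercharge("Q_L", "phi", "u_R"),
}

for name, value in terms.items():
    print(f"{name}: {value}")
```

Only the last term has non-zero net hypercharge, which is why $\phi^c$ is needed to give mass to the up-type quarks.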
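Let’s try to diagonalize this in the same way we diagonalized leptons. We again work in unitary gauge, so that
\[ \phi(x) = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h(x) \end{pmatrix}. \]
Then the above Lagrangian can be written as
\[ \mathcal{L}_{\mathrm{quark},\phi} = -\left( \lambda^{ij}_d \bar{d}^i_L (v + h) d^j_R + \lambda^{ij}_u \bar{u}^i_L (v + h) u^j_R + \text{h.c.} \right). \]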
We now write
\[ \lambda_u = U_u \Lambda_u S_u^\dagger, \qquad \lambda_d = U_d \Lambda_d S_d^\dagger, \]
where $\Lambda_{u,d}$ are diagonal and $S_{u,d}$, $U_{u,d}$ are unitary. We can then transform the fields in a similar way:
\[ u_L \mapsto U_u u_L, \quad d_L \mapsto U_d d_L, \quad u_R \mapsto S_u u_R, \quad d_R \mapsto S_d d_R, \]
and then it is a routine check that this diagonalizes $\mathcal{L}_{\mathrm{quark},\phi}$. In particular, the mass term looks like
\[ -\sum_i \left( m^i_d \bar{d}^i_L d^i_R + m^i_u \bar{u}^i_L u^i_R + \text{h.c.} \right), \]
where
\[ m^i_q = v\Lambda^{ii}_q. \]
How does this affect our electroweak interactions? The $\bar{u}_R i\slashed{D} u_R$ and $\bar{d}_R i\slashed{D} d_R$ terms are unaffected, but the two components of $\bar{Q}_L$ now transform differently, by $U_u$ and $U_d$ respectively. In particular, the $W^\pm_\mu$ piece given by $\bar{Q}_L i\slashed{D} Q_L$ is not invariant. That piece can be explicitly written out as
\[ \frac{g}{2\sqrt{2}} J^{\pm\mu} W^\pm_\mu, \]
where
\[ J^{\mu+} = \bar{u}^i_L \gamma^\mu d^i_L. \]
Under the basis transformation, this becomes
\[ \bar{u}^i_L \gamma^\mu (U_u^\dagger U_d)^{ij} d^j_L. \]
This is not going to be diagonal in general. This leads to inter-generational quark couplings. In other words, we have discovered that the mass eigenstates are (in general) not equal to the weak eigenstates.
The mixing is dictated by the Cabibbo–Kobayashi–Maskawa matrix (CKM matrix)
\[ V_{\mathrm{CKM}} = U_u^\dagger U_d. \]
Explicitly, we can write
\[ V_{\mathrm{CKM}} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}, \]
where the subscripts indicate which two quarks each entry mixes. So far, this matrix is not predicted by any theory, and is manually plugged into the model.
However, the entries aren’t completely arbitrary. Being the product of two unitary matrices, we know $V_{\mathrm{CKM}}$ is a unitary matrix. Moreover, the entries are only uniquely defined up to some choice of phases, and this further cuts down the number of degrees of freedom.
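To see concretely why $V_{\mathrm{CKM}}$ is unitary but generically not diagonal, here is a minimal sketch. It assumes two independent random complex matrices as stand-ins for $\lambda_u$ and $\lambda_d$ (not physical Yukawa data), takes the left unitary factor of each from a singular value decomposition, and forms $U_u^\dagger U_d$.

```python
import numpy as np

rng = np.random.default_rng(1)


def random_yukawa():
    """Random complex 3x3 matrix standing in for a Yukawa matrix (illustrative)."""
    return rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))


lam_u, lam_d = random_yukawa(), random_yukawa()

# Left unitary factors in lam = U Lambda S^dagger.
U_u, _, _ = np.linalg.svd(lam_u)
U_d, _, _ = np.linalg.svd(lam_d)

V_ckm = U_u.conj().T @ U_d

unitarity_defect = np.abs(V_ckm @ V_ckm.conj().T - np.eye(3)).max()
off_diagonal_size = np.abs(V_ckm - np.diag(np.diag(V_ckm))).max()

print(unitarity_defect, off_diagonal_size)
```

The unitarity defect is at the level of rounding error, while the off-diagonal entries are of order one for generic inputs. The observed CKM matrix being close to the identity is therefore a feature of the measured Yukawa couplings, not of the construction.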
If we only had two generations, then we get what is known as Cabibbo mixing. A general $2 \times 2$ unitary matrix has 4 real parameters: one angle and three phases. However, redefining each of the 4 quark fields with a global $U(1)$ transformation, we can remove three relative phases (if we change all of them by the same phase, then nothing happens). We can then write this matrix as
\[ V = \begin{pmatrix} \cos\theta_c & \sin\theta_c \\ -\sin\theta_c & \cos\theta_c \end{pmatrix}, \]
where $\theta_c$ is the Cabibbo angle. Experimentally, we have $\sin\theta_c \approx 0.22$. It turns out the reality of this matrix implies that CP is conserved, which is left as an exercise on the example sheet.
In this case, we explicitly have
\[ J^\mu = \cos\theta_c\, \bar{u}_L \gamma^\mu d_L + \sin\theta_c\, \bar{u}_L \gamma^\mu s_L - \sin\theta_c\, \bar{c}_L \gamma^\mu d_L + \cos\theta_c\, \bar{c}_L \gamma^\mu s_L. \]
If the angle is 0, then we have no mixing between the generations.
With three generations, there are nine parameters. We can think of this as 3 (Euler) angles and 6 phases. As before, we can re-define some of the quark fields to get rid of five relative phases. We can then write $V_{\mathrm{CKM}}$ in terms of three angles and 1 phase. In general, this $V_{\mathrm{CKM}}$ is not real, and this gives us CP violation. Of course, if it happens that the remaining phase vanishes, so that $V_{\mathrm{CKM}}$ is real, then we do not have CP violation. Experimentally, we find that this is not the case. By the CPT theorem, we thus deduce that T violation happens as well.
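The counting in the last two paragraphs generalizes to $n$ generations, and can be sketched as a small function. The name `mixing_parameters` is an illustrative choice; the counting itself follows the argument in the text (an $n \times n$ unitary matrix has $n^2$ real parameters, of which $n(n-1)/2$ are angles, and $2n - 1$ relative phases can be absorbed into the $2n$ quark fields).

```python
def mixing_parameters(n):
    """Return (angles, physical phases) of the quark mixing matrix for n generations."""
    total = n * n                    # real parameters of an n x n unitary matrix
    angles = n * (n - 1) // 2        # parameters of a real orthogonal matrix
    phases = total - angles          # the rest are phases
    removable = 2 * n - 1            # relative phases of the 2n quark fields
    return angles, phases - removable


for n in (2, 3, 4):
    angles, phases = mixing_parameters(n)
    print(f"{n} generations: {angles} angle(s), {phases} physical phase(s)")
```

For two generations no phase survives, which matches the real Cabibbo matrix; three is the smallest number of generations with a CP-violating phase.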
5.4 Neutrino oscillation and mass
Since around 2000, we know that the mass eigenstates and weak eigenstates for neutrinos are not equivalent, as neutrinos were found to change from one flavour to another. This implies there is some mixing between different generations of leptons. The analogous mixing matrix is the Pontecorvo–Maki–Nakagawa–Sakata matrix, written $U_{\mathrm{PMNS}}$.
As of today, we do not really understand what neutrinos actually are, as
neutrinos don’t interact much, and we don’t have enough experimental data. If
neutrinos are Dirac fermions, then they behave just like quarks, and we expect
CP violation.
However, there is another possibility. Since neutrinos do not have a charge,
it is possible that they are their own anti-particles. In other words, they are
Majorana fermions. It turns out this implies that we cannot get rid of that many
phases, and we are left with 3 angles and 3 phases. Again, we get CP violation
in general.
We consider these cases briefly in turn.
Dirac fermions
If they are Dirac fermions, then we must also get some right-handed neutrinos, which we write as
\[ N^i = \nu^i_R = (\nu_{eR}, \nu_{\mu R}, \nu_{\tau R}). \]
Then we modify the lepton-Higgs Lagrangian to say
\[ \mathcal{L}_{\mathrm{lept},\phi} = -\sqrt{2} \left( \lambda^{ij} \bar{L}^i \phi R^j + \lambda^{ij}_\nu \bar{L}^i \phi^c N^j + \text{h.c.} \right). \]
This is exactly like for quarks. As in quarks, we obtain a mass term
\[ -\sum_i m^i_\nu (\bar{\nu}^i_R \nu^i_L + \bar{\nu}^i_L \nu^i_R). \]
Majorana neutrinos
If neutrinos are their own anti-particles, then, in the language we had at the beginning of the course, we have
\[ d^s(p) = b^s(p). \]
Then $\nu(x) = \nu_L(x) + \nu_R(x)$ must satisfy
\[ \nu^c(x) = C\bar{\nu}^T(x) = \nu(x). \]
Then we see that we must have
\[ \nu_R(x) = \nu^c_L(x), \]
and vice versa. So the right-handed neutrino field is not independent of the left-handed field. In this case, the mass term would look like
\[ -\frac{1}{2} \sum_i m^i_\nu (\bar{\nu}^{ic}_L \nu^i_L + \bar{\nu}^i_L \nu^{ic}_L). \]
As in the case of leptons, postulating a mass term directly like this breaks gauge invariance. Again, we solve this problem by coupling with the Higgs field. It takes some work to find a working gauge coupling, and it turns out the simplest thing that works is
\[ \mathcal{L}_{L,\phi} = -\frac{Y^{ij}}{M} (L^{iT} \phi^c) C (\phi^{cT} L^j) + \text{h.c.} \]
This is weird, because it is a dimension 5 operator, and a dimension 5 operator is non-renormalizable. This is actually okay, as long as we think of the standard model as an effective field theory, describing physics at some low energy scale.
5.5 Summary of electroweak theory
We can do a quick summary of the electroweak theory. We start with the picture
before spontaneous symmetry breaking.
The gauge group of this theory is $SU(2)_L \times U(1)_Y$, with gauge fields $W_\mu \in \mathfrak{su}(2)$ and $B_\mu \in \mathfrak{u}(1)$. The coupling of $U(1)$ with the particles is specified by a hypercharge $Y$, and the $SU(2)$ couplings will always be trivial or fundamental. The covariant derivative is given by
\[ D_\mu = \partial_\mu + igW^a_\mu \tau^a + ig' Y B_\mu, \]
where $\tau^a$ are the canonical generators of $\mathfrak{su}(2)$. The field strengths are given by
\[ F^{W,a}_{\mu\nu} = \partial_\mu W^a_\nu - \partial_\nu W^a_\mu - g\varepsilon^{abc} W^b_\mu W^c_\nu, \qquad F^B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu. \]
The theory contains the scalar Higgs field $\phi \in \mathbb{C}^2$, which has hypercharge $Y = \frac{1}{2}$ and the fundamental $SU(2)$ representation. We also have three generations of matter, given by
Type                  G1    G2    G3    Q      Y_L    Y_R
Positive quarks       u     c     t     +2/3   +1/6   +2/3
Negative quarks       d     s     b     -1/3   +1/6   -1/3
Leptons (e, µ, τ)     e     µ     τ     -1     -1/2   -1
Leptons (neutrinos)   ν_e   ν_µ   ν_τ   0      -1/2   ???
Here G1, G2, G3 are the three generations, $Q$ is the charge, $Y_L$ is the hypercharge of the left-handed version, and $Y_R$ is the hypercharge of the right-handed version. Each of these matter fields is a spinor field.
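The table can be cross-checked against $Q = \tau^3 + Y$. The sketch below assumes the usual assignment of $\tau^3$ eigenvalues: $+\frac{1}{2}$ for the upper component of a left-handed doublet, $-\frac{1}{2}$ for the lower component, and $0$ for right-handed singlets. The field labels are illustrative names.

```python
from fractions import Fraction as F

# (field, tau^3 eigenvalue, hypercharge Y, charge Q from the table)
fields = [
    ("u_L",  F(1, 2),  F(1, 6),  F(2, 3)),
    ("d_L",  F(-1, 2), F(1, 6),  F(-1, 3)),
    ("nu_L", F(1, 2),  F(-1, 2), F(0)),
    ("e_L",  F(-1, 2), F(-1, 2), F(-1)),
    ("u_R",  F(0),     F(2, 3),  F(2, 3)),
    ("d_R",  F(0),     F(-1, 3), F(-1, 3)),
    ("e_R",  F(0),     F(-1),    F(-1)),
]

# Any field for which tau^3 + Y fails to reproduce Q would be listed here.
mismatches = [name for name, t3, y, q in fields if t3 + y != q]
print(mismatches)
```

An empty list confirms that every row of the table is consistent with the charge matrix defined earlier.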
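From now on, we describe the theory in the case of a massless neutrino, because we don’t really know what the neutrinos are. We group the matter fields as
\[ L = \left( \begin{pmatrix} \nu_{eL} \\ e_L \end{pmatrix} \ \begin{pmatrix} \nu_{\mu L} \\ \mu_L \end{pmatrix} \ \begin{pmatrix} \nu_{\tau L} \\ \tau_L \end{pmatrix} \right), \qquad R = \begin{pmatrix} e_R & \mu_R & \tau_R \end{pmatrix}, \]
\[ u_R = \begin{pmatrix} u_R & c_R & t_R \end{pmatrix}, \qquad d_R = \begin{pmatrix} d_R & s_R & b_R \end{pmatrix}, \]
\[ Q_L = \left( \begin{pmatrix} u_L \\ d_L \end{pmatrix} \ \begin{pmatrix} c_L \\ s_L \end{pmatrix} \ \begin{pmatrix} t_L \\ b_L \end{pmatrix} \right). \]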
The Lagrangian has several components:
- The kinetic term of the gauge field is given by
\[ \mathcal{L}_{\mathrm{gauge}} = -\frac{1}{2} \mathrm{Tr}(F^W_{\mu\nu} F^{W,\mu\nu}) - \frac{1}{4} F^B_{\mu\nu} F^{B,\mu\nu}. \]
- The Higgs field couples with the gauge fields by
\[ \mathcal{L}_{\mathrm{Higgs}} = (D_\mu\phi)^\dagger (D^\mu\phi) - \mu^2|\phi|^2 - \lambda|\phi|^4. \]
After spontaneous symmetry breaking, this gives rise to massive $W^\pm$, $Z$ and Higgs bosons. This also gives us $W^\pm$, $Z$-Higgs interactions, as well as Higgs-Higgs interactions.
- The leptons interact with the Higgs field by
\[ \mathcal{L}_{\mathrm{lept},\phi} = -\sqrt{2} (\lambda^{ij} \bar{L}^i \phi R^j + \text{h.c.}). \]
This gives us lepton masses and lepton-Higgs interactions. Of course, this piece has to be modified after we figure out what neutrinos actually are.
- The gauge coupling of the leptons induces
\[ \mathcal{L}^{\mathrm{EW}}_{\mathrm{lept}} = \bar{L} i\slashed{D} L + \bar{R} i\slashed{D} R, \]
with an implicit sum over all generations. This gives us lepton interactions with $W^\pm$, $Z$, $\gamma$. Once we introduce neutrino masses, the mixing is described by the PMNS matrix, and gives us neutrino oscillations and (possibly) CP violation.
- Higgs-quark interactions are given by
\[ \mathcal{L}_{\mathrm{quark},\phi} = -\sqrt{2} (\lambda^{ij}_d \bar{Q}^i_L \phi d^j_R + \lambda^{ij}_u \bar{Q}^i_L \phi^c u^j_R + \text{h.c.}), \]
which gives rise to quark masses.
- Finally, the gauge coupling of the quarks is given by
\[ \mathcal{L}^{\mathrm{EW}}_{\mathrm{quark}} = \bar{Q}_L i\slashed{D} Q_L + \bar{u}_R i\slashed{D} u_R + \bar{d}_R i\slashed{D} d_R, \]
which gives us quark interactions with $W^\pm$, $Z$, $\gamma$. The interactions are described by the CKM matrix. This gives us quark flavour violation and CP violation.
We now have all of the standard model that involves the electroweak part
and the matter. Apart from QCD, which we will describe quite a bit later, we’ve
got everything in the standard model.
6 Weak decays
We have mostly been describing (the electroweak part of) the standard model rather theoretically. Can we actually use it to make predictions? This is what we will do in this chapter. We will work out decay rates for certain processes, and see how they compare with experiments. One particular point of interest in our computations is to figure out how CP violation manifests itself in decay rates.
6.1 Effective Lagrangians
We’ll only consider processes where the energies and momenta are much less than $m_W$, $m_Z$. In this case, we can use an effective field theory. We will not yet discuss formally what effective field theories are, and first see how this works in practice.
In our case, what we’ll get is the Fermi weak Lagrangian. This Lagrangian
in fact predates the Standard Model, and it was only later on that we discovered
the Fermi weak Lagrangian is an effective Lagrangian of what we now know of
as electroweak theory.
Recall that the weak interaction part of the Lagrangian is
\[ \mathcal{L}_W = -\frac{g}{2\sqrt{2}} (J^\mu W^+_\mu + J^{\mu\dagger} W^-_\mu) - \frac{g}{2\cos\theta_W} J^\mu_n Z_\mu. \]
Our general goal is to compute the $S$-matrix
\[ S = T\exp\left( i\int d^4x\, \mathcal{L}_W(x) \right). \]
As before, $T$ denotes time-ordering. The strategy is to Taylor expand this in $g$. Ultimately, we will be interested in computing
\[ \langle f | S | i \rangle \]
for some initial and final states $|i\rangle$ and $|f\rangle$. Since we are in the low energy regime, we will only attempt to compute these quantities when $|i\rangle$ and $|f\rangle$ do not contain $W^\pm$ or $Z$ bosons. This allows us to drop terms in the Taylor expansion having free $W^\pm$ or $Z$ components.
We can explicitly Taylor expand this, keeping the previous sentence in mind.
How can we possibly get rid of the
W
±
and
Z
terms in the Taylor expansion?
If we think about Wick’s theorem, we know that when we take the time-ordered
product of several operators, we sum over all possible contractions of the fields,
and contraction practically means we replace two operators by the Feynman
propagator of that field.
Thus, if we want to end up with no $W^\pm$ or $Z$ term, we need to contract all the $W^\pm$ and $Z$ fields together. This in particular requires an even number of $W^\pm$ and $Z$ terms. So we know that there is no $O(g)$ term left, and the first non-trivial term is $O(g^2)$. We write the propagators as
\[ D^W_{\mu\nu}(x - x') = \langle T\, W^-_\mu(x) W^+_\nu(x') \rangle, \qquad D^Z_{\mu\nu}(x - x') = \langle T\, Z_\mu(x) Z_\nu(x') \rangle. \]
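Thus, the first interesting term is the $g^2$ one. For initial and final states $|i\rangle$ and $|f\rangle$, we have
\[ \langle f | S | i \rangle = \langle f | T \left\{ 1 - \frac{g^2}{8} \int d^4x\, d^4x' \left[ J^{\mu\dagger}(x) D^W_{\mu\nu}(x - x') J^\nu(x') + \frac{1}{\cos^2\theta_W} J^{\mu\dagger}_n(x) D^Z_{\mu\nu}(x - x') J^\nu_n(x') \right] + O(g^4) \right\} | i \rangle. \]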
As always, we work in momentum space. We define the Fourier transformed propagator $\tilde{D}^{Z,W}_{\mu\nu}(p)$ by
\[ D^{Z,W}_{\mu\nu}(x - y) = \int \frac{d^4p}{(2\pi)^4} e^{-ip\cdot(x - y)} \tilde{D}^{Z,W}_{\mu\nu}(p), \]
and we will later compute to find that $\tilde{D}_{\mu\nu}$ is
\[ \tilde{D}^{Z,W}_{\mu\nu}(p) = \frac{i}{p^2 - m^2_{Z,W} + i\varepsilon} \left( -g_{\mu\nu} + \frac{p_\mu p_\nu}{m^2_{Z,W}} \right). \]
Here $g_{\mu\nu}$ is the metric of Minkowski space. We will put aside the computation of the propagator for the moment, and discuss consequences.
At low energies, e.g. the case of quarks and leptons (except for the top quark), the momentum scales involved satisfy $p^2 \ll m^2_{Z,W}$. In this case, we can approximate the propagators by ignoring all the terms involving $p$. So we have
\[ \tilde{D}^{Z,W}_{\mu\nu}(p) \approx \frac{ig_{\mu\nu}}{m^2_{Z,W}}. \]
Plugging this into the Fourier transform, we have
\[ D^{Z,W}_{\mu\nu}(x - y) \approx \frac{ig_{\mu\nu}}{m^2_{Z,W}} \delta^{(4)}(x - y). \]
What we see is that we can describe this interaction by a contact interaction, i.e. a four-fermion interaction. Note that if we did not make the approximation $p \approx 0$, then our propagator would not have the $\delta^{(4)}(x - y)$, and hence the effective action would be non-local.
Thus, the second term in the $S$-matrix expansion becomes
\[ -\int d^4x\, \frac{ig^2}{8m_W^2} \left( J^{\mu\dagger}(x) J_\mu(x) + \frac{m_W^2}{m_Z^2 \cos^2\theta_W} J^{\mu\dagger}_n(x) J_{n\mu}(x) \right). \]
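We want to define the effective Lagrangian $\mathcal{L}^{\mathrm{eff}}_W$ to be the Lagrangian not involving $W^\pm$, $Z$ such that for “low energy states”, we have
\[ \langle f | S | i \rangle = \langle f | T\exp\left( i\int d^4x\, \mathcal{L}^{\mathrm{eff}}_W \right) | i \rangle = \langle f | T\left( 1 + i\int d^4x\, \mathcal{L}^{\mathrm{eff}}_W + \cdots \right) | i \rangle. \]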
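Based on our previous computations, we find that up to tree level, we can write
\[ i\mathcal{L}^{\mathrm{eff}}_W(x) \equiv -\frac{iG_F}{\sqrt{2}} \left( J^{\mu\dagger} J_\mu(x) + \rho J^{\mu\dagger}_n J_{n\mu}(x) \right), \]
where, again up to tree level,
\[ \frac{G_F}{\sqrt{2}} = \frac{g^2}{8m_W^2}, \qquad \rho = \frac{m_W^2}{m_Z^2 \cos^2\theta_W}. \]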
Recall that when we first studied electroweak theory, we found a relation $m_W = m_Z\cos\theta_W$. So, up to tree level, we have $\rho = 1$. When we look at higher levels, we get quantum corrections, and we can write
\[ \rho = 1 + \Delta\rho. \]
This value is sensitive to physics “beyond the Standard Model”, as the other stuff can contribute to the loops. Experimentally, we find
\[ \Delta\rho \approx 0.008. \]
We can now do our usual computations, but with the effective Lagrangian rather than the usual Lagrangian. This is the Fermi theory of weak interaction, which predates the standard model and the idea of $W^\pm$ and $Z$ bosons. The $\frac{1}{m_W^2}$ in $G_F$ in some sense indicates that Fermi theory breaks down at energy scales near $m_W$, as we would expect.
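Combining $G_F/\sqrt{2} = g^2/(8m_W^2)$ with the tree-level relation $m_W = vg/2$ from the electroweak chapter, the coupling $g$ cancels and $G_F = 1/(\sqrt{2}v^2)$. The sketch below evaluates this, assuming $v \approx 246.22$ GeV as input (an assumed value, not one quoted in the notes); the value of $g$ is arbitrary since it drops out.

```python
import math

v = 246.22   # Higgs VEV in GeV (assumed input)
g = 0.65     # SU(2) coupling (assumed; it cancels in G_F)

m_W = v * g / 2
G_F = math.sqrt(2) * g**2 / (8 * m_W**2)   # from G_F / sqrt(2) = g^2 / (8 m_W^2)
G_F_from_v = 1 / (math.sqrt(2) * v**2)     # same quantity with g eliminated

print(f"G_F = {G_F:.4e} GeV^-2")
```

The result is of order $10^{-5}\ \mathrm{GeV}^{-2}$, which shows that measuring the Fermi constant at low energies fixes the Higgs VEV $v$ without any direct knowledge of $g$ or $m_W$.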
It is interesting to note that the mass dimension of $G_F$ is $-2$. This is to compensate for the dimension 6 operator $J^{\mu\dagger} J_\mu$. This means our theory is non-renormalizable. This is, of course, not a problem, because we do not think this is a theory that should be valid up to arbitrarily high energy scales. To derive this Lagrangian, we’ve assumed our energy scales are $\ll m_W, m_Z$.
Computation of propagators
We previously just wrote down the values of the $W^\pm$ and $Z$-propagators. We will now explicitly do the computations of the $Z$ propagator. The computation for $W^\pm$ is similar. We will gloss over subtleties involving non-abelian gauge theory here, ignoring problems involving ghosts etc. We’ll work in the so-called $R_\xi$-gauge.
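In this case, we explicitly write the free $Z$-Lagrangian as
\[ \mathcal{L}^{\mathrm{free}}_Z = -\frac{1}{4} (\partial_\mu Z_\nu - \partial_\nu Z_\mu)(\partial^\mu Z^\nu - \partial^\nu Z^\mu) + \frac{1}{2} m_Z^2 Z_\mu Z^\mu. \]
To find the propagator, we introduce an external current $j^\mu$ coupled to $Z_\mu$. So the new Lagrangian is
\[ \mathcal{L} = \mathcal{L}^{\mathrm{free}}_Z + j^\mu(x) Z_\mu(x). \]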
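Through some routine computations, we see that the Euler–Lagrange equations give us
\[ \partial^2 Z^\rho - \partial^\rho\, \partial\cdot Z + m_Z^2 Z^\rho = -j^\rho. \qquad (*) \]
We need to solve this. We take the $\partial_\rho$ of this, which gives
\[ \partial^2\, \partial\cdot Z - \partial^2\, \partial\cdot Z + m_Z^2\, \partial\cdot Z = -\partial\cdot j. \]
So we obtain
\[ m_Z^2\, \partial\cdot Z = -\partial\cdot j. \]
Putting this back into $(*)$ and rearranging, we get
\[ (\partial^2 + m_Z^2) Z_\mu = -\left( g_{\mu\nu} + \frac{\partial_\mu \partial_\nu}{m_Z^2} \right) j^\nu. \]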
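We can write the solution as an integral over this current. We write the solution as
\[ Z_\mu(x) = i\int d^4y\, D^Z_{\mu\nu}(x - y) j^\nu(y), \qquad (\dagger) \]
and further write
\[ D^Z_{\mu\nu}(x - y) = \int \frac{d^4p}{(2\pi)^4} e^{-ip\cdot(x - y)} \tilde{D}^Z_{\mu\nu}(p). \]
Then by applying $(\partial^2 + m_Z^2)$ to $(\dagger)$, we find that we must have
\[ \tilde{D}^Z_{\mu\nu}(p) = \frac{-i}{p^2 - m_Z^2 + i\varepsilon} \left( g_{\mu\nu} - \frac{p_\mu p_\nu}{m_Z^2} \right). \]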
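As a numerical sanity check of this propagator, we can verify in momentum space that it inverts the equation of motion. The sketch assumes the convention $\partial_\mu \to -ip_\mu$ implied by the Fourier transform above, drops the $i\varepsilon$ (the momentum is taken off-shell), and uses arbitrary test values for $p^\mu$ and $j^\mu$. With these conventions the equation of motion reads $[(m_Z^2 - p^2) g_{\rho\nu} + p_\rho p_\nu] Z^\nu = -j_\rho$.

```python
import numpy as np

metric = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski metric g_{mu nu}
m_Z = 91.0                                   # GeV (illustrative)

p_up = np.array([10.0, 1.0, -2.0, 3.0])      # arbitrary off-shell momentum p^mu
p_low = metric @ p_up                        # p_mu
p2 = p_up @ p_low                            # p^2

# Momentum-space operator of the equation of motion, both indices down.
K_low = (m_Z**2 - p2) * metric + np.outer(p_low, p_low)

# Propagator from the text, both indices down (i*epsilon dropped off-shell).
D_low = (-1j / (p2 - m_Z**2)) * (metric - np.outer(p_low, p_low) / m_Z**2)

j_up = np.array([0.3, -1.1, 0.7, 2.0])       # arbitrary source j^nu

Z_low = 1j * D_low @ j_up                    # Z_mu = i D_{mu nu} j^nu
Z_up = metric @ Z_low                        # raise the index

lhs = K_low @ Z_up                           # K_{rho nu} Z^nu
rhs = -(metric @ j_up)                       # -j_rho

print(np.abs(lhs - rhs).max())
```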
6.2 Decay rates and cross sections
We now have an effective Lagrangian. What can we do with it? In this section, we will consider two kinds of experiments:
- We leave a particle alone, and see how frequently it decays, and into what.
- We crash a lot of particles into each other, and count the number of scattering events produced.
The relevant quantities are decay rates and cross sections respectively. We will look at both of these in turn, but we will mostly be interested in computing decay rates only.
Decay rate
Definition (Decay rate). Let $X$ be a particle. The decay rate $\Gamma_X$ is the rate of decay of $X$ in its rest frame. In other words, if we have a sample of $X$, then this is the number of decays of $X$ per unit time divided by the number of $X$ present.
The lifetime of $X$ is
\[ \tau \equiv \frac{1}{\Gamma_X}. \]
We can write
\[ \Gamma_X = \sum_{f_i} \Gamma_{X \to f_i}, \]
where $\Gamma_{X \to f_i}$ is the partial decay rate to the final state $f_i$.
Note that the $\sum_{f_i}$ is usually a complicated mixture of genuine sums and integrals.
Often, we are only interested in how frequently it decays into a particular particle, instead of just the total decay rate. Thus, for each final state $|f_i\rangle$, we want to compute $\Gamma_{X \to f_i}$, and then sum over all final states we are interested in.
As before, we can compute the $S$-matrix elements, defined by
\[ \langle f | S | i \rangle. \]
We will take $|i\rangle = |X\rangle$. As before, recall that there is a term $1$ of $S$ that corresponds to nothing happening, and we are not interested in that. If $|f\rangle \neq |i\rangle$, then this term does not contribute.
It turns out the interesting quantity to extract out of the $S$-matrix is given by the invariant amplitude:
Definition (Invariant amplitude). We define the invariant amplitude $M$ by
\[ \langle f | S - 1 | i \rangle = (2\pi)^4 \delta^{(4)}(p_f - p_i)\, iM_{fi}. \]
When actually computing this, we make use of the following convenient fact:
Proposition. Up to tree order, and a phase, we have
\[ M_{fi} = \langle f | \mathcal{L}(0) | i \rangle, \]
where $\mathcal{L}$ is the Lagrangian. We usually omit the $(0)$.
Proof sketch. Up to tree order, we have
\[ \langle f | S - 1 | i \rangle = i\int d^4x\, \langle f | \mathcal{L}(x) | i \rangle. \]
We write $\mathcal{L}$ in momentum space. Then the only $x$-dependence in the $\langle f | \mathcal{L}(x) | i \rangle$ factor is a factor of $e^{i(p_f - p_i)\cdot x}$. Integrating over $x$ introduces a factor of $(2\pi)^4 \delta^{(4)}(p_f - p_i)$. Thus, we must have had, up to tree order,
\[ \langle f | \mathcal{L}(x) | i \rangle = M_{fi}\, e^{i(p_f - p_i)\cdot x}. \]
So evaluating this at $x = 0$ gives the desired result.
How does this quantity enter the picture? If we were to do this naively, we would expect the probability of a transition $i \to f$ to be
\[ P(i \to f) = \frac{|\langle f | S - 1 | i \rangle|^2}{\langle f | f \rangle \langle i | i \rangle}. \]
It is not hard to see that we will very soon have a lot of $\delta$ functions appearing in this expression, and these are in general bad. As we saw in QFT, these $\delta$ functions came from the fact that the universe is infinite. So what we do is that we work with a finite spacetime, and then later take appropriate limits.
We suppose the universe has volume $V$, and we also work over a finite temporal extent $T$. Then we replace
\[ (2\pi)^4 \delta^{(4)}(0) \mapsto VT, \qquad (2\pi)^3 \delta^{(3)}(0) \mapsto V. \]
Recall that with infinite volume, we found the normalization of our states to be
\[ \langle i | i \rangle = (2\pi)^3\, 2p^0_i\, \delta^{(3)}(0). \]
In finite volume, we can replace this with
\[ \langle i | i \rangle = 2p^0_i V. \]
Similarly, we have
\[ \langle f | f \rangle = \prod_r (2p^0_r V), \]
where $r$ runs through all final state labels. In the $S$-matrix, we have
\[ |\langle f | S - 1 | i \rangle|^2 = \left( (2\pi)^4 \delta^{(4)}(p_f - p_i) \right)^2 |M_{fi}|^2. \]
We don’t really have a $\delta(0)$ here, but we note that for any $x$, we have
\[ (\delta(x))^2 = \delta(x)\delta(0). \]
The trick here is to replace only one of the $\delta(0)$ with $VT$, and leave the other one alone. Of course, we don’t really have a $\delta(p_f - p_i)$ function in finite volume, but when we later take the limit, it will become a $\delta$ function again.
Noting that $p^0_i = m_i$ since we are in the rest frame, we find
\[ P(i \to f) = \frac{1}{2m_i V} |M_{fi}|^2 (2\pi)^4 \delta^{(4)}\left( p_i - \sum_r p_r \right) VT \prod_r \frac{1}{2p^0_r V}. \]
We see that two of our $V$’s cancel, but there are a lot left.
The next thing to realize is that it is absurd to ask what is the probability of decaying into exactly $f$. Instead, we have a range of final states. We will abuse notation and use $f$ to denote both the set of all final states we are interested in, and members of this set. We claim that the right expression is
\[ \Gamma_{i \to f} = \frac{1}{T} \int P(i \to f) \prod_r \frac{V}{(2\pi)^3}\, d^3p_r. \]
The obvious question to ask is why we are integrating against $\frac{V}{(2\pi)^3} d^3p_r$, and not, say, simply $d^3p_r$. The answer is that neither option is quite right. Since we have finite volume, as we are familiar from introductory QM, the momentum eigenstates should be discretized. Thus, what we really want to do is to do an honest sum over all possible values of $p_r$.
But of course, we don’t like sums, and since we are going to take the limit $V \to \infty$ anyway, we replace the sum with an integral, and take into account the density of the momentum eigenstates, which is exactly $\frac{V}{(2\pi)^3}$.
We now introduce a measure
\[
  d\rho_f = (2\pi)^4 \delta^{(4)}\Big(p_i - \sum_r p_r\Big) \prod_r \frac{d^3 p_r}{(2\pi)^3\, 2p_r^0},
\]
and then we can concisely write
\[
  \Gamma_{i \to f} = \frac{1}{2m_i} \int |M_{fi}|^2\, d\rho_f.
\]
Note that when we actually do computations, we need to manually pick what
range of momenta we want to integrate over.
Cross sections
We now quickly look at another way we can do experiments. We imagine we set two beams running towards each other with velocities $v_a$ and $v_b$ respectively. We let the particle densities be $\rho_a$ and $\rho_b$.
The idea is to count the number of scattering events, $n$. We will compute this relative to the incident flux
\[
  F = |v_a - v_b| \rho_a,
\]
which is the number of incoming particles per unit area per unit time.
Definition (Cross section). The cross section is defined by
\[
  n = F\sigma.
\]
Given these, the total number of scattering events is
\[
  N = n \rho_b V = F \sigma \rho_b V = |v_a - v_b| \rho_a \rho_b V \sigma.
\]
This is now a more symmetric-looking expression.
Note that in this case, we genuinely have a finite volume, because we have to pack all our particles together in order to make them collide. Since we are boring, we suppose we actually only have one particle of each type. Then we have
\[
  \rho_a = \rho_b = \frac{1}{V}.
\]
In this case, we have
\[
  \sigma = \frac{V}{|v_a - v_b|} N.
\]
We can do computations similar to what we did last time. This time there are a few differences. Last time, the initial state only had one particle, but now we have two. Thus, if we go back and look at our computations, we see that we will gain an extra factor of $\frac{1}{V}$ in the frequency of interactions. Also, since we are no longer in the rest frame, we replace the masses of the particles with their energies. Then the number of interactions is
\[
  N = \int \frac{1}{(2E_a)(2E_b) V} |M_{fi}|^2\, d\rho_f.
\]
We are often interested in knowing the number of interactions sending us to each particular momentum (range) individually, instead of just knowing how many particles we get. For example, we might be interested in which directions the final particles are moving in. So we are interested in
\[
  d\sigma = \frac{V}{|v_a - v_b|}\, dN = \frac{1}{|v_a - v_b|\, 4E_a E_b} |M_{fi}|^2\, d\rho_f.
\]
Experimentalists will find it useful to know that cross sections are usually measured in units called "barns". This is defined by
\[
  1 \text{ barn} = 10^{-28}\ \mathrm{m}^2.
\]
6.3 Muon decay
We now look at our first decay process, the muon decay:
\[
  \mu^- \to e^- \bar{\nu}_e \nu_\mu.
\]
This is in fact the only decay channel of the muon.
[Figure: Feynman diagram for $\mu^-(p) \to e^-(k)\, \bar{\nu}_e(q)\, \nu_\mu(q')$.]
We will make the simplifying assumption that neutrinos are massless.
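The relevant bit of $\mathcal{L}_W^{\mathrm{eff}}$ is
\[
  -\frac{G_F}{\sqrt{2}} J^{\alpha\dagger} J_\alpha,
\]
where the weak current is
\[
  J^\alpha = \bar{\nu}_e \gamma^\alpha (1 - \gamma^5) e + \bar{\nu}_\mu \gamma^\alpha (1 - \gamma^5) \mu + \bar{\nu}_\tau \gamma^\alpha (1 - \gamma^5) \tau.
\]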
We see it is the interaction of the first two terms that will render this decay
possible.
To make sure our weak field approximation is valid, we need to make sure we live in sufficiently low energy scales. The most massive particle involved is
\[
  m_\mu = 105.658\,371\,5(35) \text{ MeV}.
\]
On the other hand, the mass of the weak boson is
\[
  m_W = 80.385(15) \text{ GeV},
\]
which is much bigger. So the weak field approximation should be valid.
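We can now compute
\[
  M = \langle e^-(k)\, \bar{\nu}_e(q)\, \nu_\mu(q') | \mathcal{L}_W^{\mathrm{eff}} | \mu^-(p) \rangle.
\]
Note that we left out all the spin indices to avoid overwhelming notation. We see that the first term in $J^\alpha$ is relevant to the electron bit, while the second is relevant to the muon bit. We can then write this as
\begin{align*}
  M &= -\frac{G_F}{\sqrt{2}} \langle e^-(k)\, \bar{\nu}_e(q) | \bar{e} \gamma^\alpha (1 - \gamma^5) \nu_e | 0 \rangle \langle \nu_\mu(q') | \bar{\nu}_\mu \gamma_\alpha (1 - \gamma^5) \mu | \mu^-(p) \rangle \\
  &= -\frac{G_F}{\sqrt{2}}\, \bar{u}_e(k) \gamma^\alpha (1 - \gamma^5) v_{\nu_e}(q)\; \bar{u}_{\nu_\mu}(q') \gamma_\alpha (1 - \gamma^5) u_\mu(p).
\end{align*}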
Before we plunge through more computations, we look at what we are interested
in and what we are not.
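At this point, we are not interested in the final state spins. Therefore, we want to sum over the final state spins. We also don't know the initial spin of $\mu^-$. So we average over the initial states. For reasons that will become clear later, we will write the desired amplitude as
\begin{align*}
  \frac{1}{2} \sum_{\text{spins}} |M|^2 &= \frac{1}{2} \sum_{\text{spins}} M M^* \\
  &= \frac{1}{2} \frac{G_F^2}{2} \sum_{\text{spins}} \left(\bar{u}_e(k) \gamma^\alpha (1 - \gamma^5) v_{\nu_e}(q)\; \bar{u}_{\nu_\mu}(q') \gamma_\alpha (1 - \gamma^5) u_\mu(p)\right) \\
  &\qquad \times \left(\bar{u}_\mu(p) \gamma_\beta (1 - \gamma^5) u_{\nu_\mu}(q')\; \bar{v}_{\nu_e}(q) \gamma^\beta (1 - \gamma^5) u_e(k)\right) \\
  &= \frac{1}{2} \frac{G_F^2}{2} \sum_{\text{spins}} \left(\bar{u}_e(k) \gamma^\alpha (1 - \gamma^5) v_{\nu_e}(q)\; \bar{v}_{\nu_e}(q) \gamma^\beta (1 - \gamma^5) u_e(k)\right) \\
  &\qquad \times \left(\bar{u}_{\nu_\mu}(q') \gamma_\alpha (1 - \gamma^5) u_\mu(p)\; \bar{u}_\mu(p) \gamma_\beta (1 - \gamma^5) u_{\nu_\mu}(q')\right).
\end{align*}
We write this as
\[
  \frac{G_F^2}{4} S_1^{\alpha\beta} S_{2\alpha\beta},
\]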
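where
\begin{align*}
  S_1^{\alpha\beta} &= \sum_{\text{spins}} \bar{u}_e(k) \gamma^\alpha (1 - \gamma^5) v_{\nu_e}(q)\; \bar{v}_{\nu_e}(q) \gamma^\beta (1 - \gamma^5) u_e(k) \\
  S_{2\alpha\beta} &= \sum_{\text{spins}} \bar{u}_{\nu_\mu}(q') \gamma_\alpha (1 - \gamma^5) u_\mu(p)\; \bar{u}_\mu(p) \gamma_\beta (1 - \gamma^5) u_{\nu_\mu}(q').
\end{align*}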
To actually compute this, we recall the spinor identities
\[
  \sum_{\text{spins}} u(p) \bar{u}(p) = \slashed{p} + m, \quad \sum_{\text{spins}} v(p) \bar{v}(p) = \slashed{p} - m.
\]
In our expression for, say, $S_1$, the $v_{\nu_e} \bar{v}_{\nu_e}$ is already in the right form to apply these identities, but $\bar{u}_e$ and $u_e$ are not. Here we do a slightly sneaky thing. We notice that for each fixed $\alpha, \beta$, the quantity $S_1^{\alpha\beta}$ is a scalar. So we trivially have $S_1^{\alpha\beta} = \operatorname{Tr}(S_1^{\alpha\beta})$. We now use the cyclicity of trace, which says $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$. This applies even if $A$ and $B$ are not square, by the same proof. Then noting further that the trace is linear, we get
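\begin{align*}
  S_1^{\alpha\beta} &= \operatorname{Tr}(S_1^{\alpha\beta}) \\
  &= \sum_{\text{spins}} \operatorname{Tr}\left[\bar{u}_e(k) \gamma^\alpha (1 - \gamma^5) v_{\nu_e}(q)\, \bar{v}_{\nu_e}(q) \gamma^\beta (1 - \gamma^5) u_e(k)\right] \\
  &= \sum_{\text{spins}} \operatorname{Tr}\left[u_e(k) \bar{u}_e(k) \gamma^\alpha (1 - \gamma^5) v_{\nu_e}(q)\, \bar{v}_{\nu_e}(q) \gamma^\beta (1 - \gamma^5)\right] \\
  &= \operatorname{Tr}\left[(\slashed{k} + m_e) \gamma^\alpha (1 - \gamma^5) \slashed{q} \gamma^\beta (1 - \gamma^5)\right].
\end{align*}
Similarly, we find
\[
  S_{2\alpha\beta} = \operatorname{Tr}\left[\slashed{q}' \gamma_\alpha (1 - \gamma^5) (\slashed{p} + m_\mu) \gamma_\beta (1 - \gamma^5)\right].
\]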
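To evaluate these traces, we use the trace identities
\begin{align*}
  \operatorname{Tr}(\gamma^{\mu_1} \cdots \gamma^{\mu_n}) &= 0 \text{ if $n$ is odd} \\
  \operatorname{Tr}(\gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma) &= 4(g^{\mu\nu} g^{\rho\sigma} - g^{\mu\rho} g^{\nu\sigma} + g^{\mu\sigma} g^{\nu\rho}) \\
  \operatorname{Tr}(\gamma^5 \gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma) &= 4i \varepsilon^{\mu\nu\rho\sigma}.
\end{align*}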
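The first two trace identities can be checked by brute force. The following is a minimal numerical sketch, assuming the Dirac representation of the gamma matrices and the metric signature $(+,-,-,-)$; the helper names `trace_product` and `four_gamma_formula` are made up for this illustration. The $\gamma^5$ identity is left out, since its overall sign depends on the convention chosen for $\varepsilon^{\mu\nu\rho\sigma}$.

```python
import itertools
import numpy as np

# Pauli matrices
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Gamma matrices in the Dirac representation (an assumed choice;
# the traces themselves do not depend on the representation).
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# Metric g^{mu nu} with signature (+, -, -, -)
g = np.diag([1.0, -1.0, -1.0, -1.0])


def trace_product(indices):
    """Trace of the product gamma^{mu_1} ... gamma^{mu_n}."""
    product = np.eye(4, dtype=complex)
    for mu in indices:
        product = product @ gamma[mu]
    return np.trace(product)


def four_gamma_formula(m, n, r, s):
    """Right-hand side of the four-gamma trace identity."""
    return 4 * (g[m, n] * g[r, s] - g[m, r] * g[n, s] + g[m, s] * g[n, r])


# Largest deviation from the four-gamma identity over all index choices
max_err_four = max(
    abs(trace_product(idx) - four_gamma_formula(*idx))
    for idx in itertools.product(range(4), repeat=4)
)
# Largest trace of a product of three gamma matrices (should vanish)
max_odd_trace = max(
    abs(trace_product(idx)) for idx in itertools.product(range(4), repeat=3)
)
print(max_err_four, max_odd_trace)
```

Both printed numbers should be zero up to floating-point rounding.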
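This gives the rather scary expressions
\begin{align*}
  S_1^{\alpha\beta} &= 8\left(k^\alpha q^\beta + k^\beta q^\alpha - (k \cdot q) g^{\alpha\beta} - i \varepsilon^{\alpha\beta\mu\rho} k_\mu q_\rho\right) \\
  S_{2\alpha\beta} &= 8\left(q'_\alpha p_\beta + q'_\beta p_\alpha - (q' \cdot p) g_{\alpha\beta} - i \varepsilon_{\alpha\beta\mu\rho} q'^\mu p^\rho\right),
\end{align*}
but once we actually contract these two objects, we get the really pleasant result
\[
  \frac{1}{2} \sum |M|^2 = 64 G_F^2 (p \cdot q)(k \cdot q').
\]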
It is instructive to study this expression in particular cases. Consider the case where $e^-$ and $\nu_\mu$ go out along the $+z$ direction, and $\bar{\nu}_e$ along $-z$. Then we have
\[
  k \cdot q' = \sqrt{m_e^2 + k_z^2}\; q'_z - k_z q'_z.
\]
As $m_e \to 0$, this tends to $0$.
This is indeed something we should expect even without doing computations. We know the weak interaction couples to left-handed particles and right-handed anti-particles. If $m_e = 0$, then we saw that helicity and chirality are the same. Thus, the helicity of $\bar{\nu}_e$ must be opposite to that of $\nu_\mu$ and $e^-$, and since $\bar{\nu}_e$ also moves in the opposite direction, the three spins are aligned:
[Figure: momenta and spins (each $\frac{1}{2}$) of $\bar{\nu}_e$, $\nu_\mu$ and $e^-$ along the $z$-axis.]
So they all point in the same direction, and the total spin would be $\frac{3}{2}$. But the initial total angular momentum is just the spin $\frac{1}{2}$. So this violates conservation of angular momentum.
But if $m_e \neq 0$, then the left-handed and right-handed components of the electron are coupled, and helicity and chirality do not coincide. So it is possible that we obtain a right-handed electron instead, and this gives conserved angular momentum. We call this helicity suppression, and we will see many more examples of this later on.
It is important to note that here we are only analyzing decays where the final momenta point in these particular directions. If $m_e = 0$, we can still have decays in other directions.
There is another interesting thing we can consider. In this same set up, if $m_e \neq 0$, but neutrinos are massless, then we can only possibly decay to left-handed neutrinos. So the only possible assignment of spins is this:
[Figure: the allowed assignment of momenta and spins for $\bar{\nu}_e$, $\nu_\mu$ and $e^-$.]
Under parity transform, the momenta reverse, but spins don't. If parity were a symmetry, we would expect to see the same behaviour in the transformed configuration.
[Figure: the parity-transformed configuration, with momenta reversed and spins unchanged.]
But (at least in the limit of massless neutrinos) this isn't allowed, because weak interactions don't couple to right-handed neutrinos. So we know that weak decays violate P.
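We now return to finish up our computations. The decay rate is given by
\begin{align*}
  \Gamma &= \frac{1}{2m_\mu} \int \frac{d^3 k}{(2\pi)^3\, 2k^0} \int \frac{d^3 q}{(2\pi)^3\, 2q^0} \int \frac{d^3 q'}{(2\pi)^3\, 2q'^0} \\
  &\qquad \times (2\pi)^4 \delta^{(4)}(p - k - q - q')\; \frac{1}{2} \sum |M|^2.
\end{align*}
Using our expression for $\frac{1}{2} \sum |M|^2$, we find
\[
  \Gamma = \frac{G_F^2}{8\pi^5 m_\mu} \int \frac{d^3 k\, d^3 q\, d^3 q'}{k^0 q^0 q'^0}\, \delta^{(4)}(p - k - q - q') (p \cdot q)(k \cdot q').
\]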
To evaluate this integral, there is a useful trick.
For convenience, we write $Q = p - k$, and we consider
\[
  I_{\mu\nu}(Q) = \int \frac{d^3 q}{q^0} \frac{d^3 q'}{q'^0}\, \delta^{(4)}(Q - q - q')\, q_\mu q'_\nu.
\]
By Lorentz symmetry arguments, this must be of the form
\[
  I_{\mu\nu}(Q) = a(Q^2)\, Q_\mu Q_\nu + b(Q^2)\, g_{\mu\nu} Q^2,
\]
where $a, b: \mathbb{R} \to \mathbb{R}$ are some scalar functions.
Now consider
\[
  g^{\mu\nu} I_{\mu\nu} = \int \frac{d^3 q}{q^0} \frac{d^3 q'}{q'^0}\, \delta^{(4)}(Q - q - q')\, q \cdot q' = (a + 4b) Q^2.
\]
But we also know that
\[
  (q + q')^2 = q^2 + q'^2 + 2 q \cdot q' = 2 q \cdot q'
\]
because neutrinos are massless. On the other hand, by momentum conservation, we know
\[
  q + q' = Q.
\]
So we know
\[
  q \cdot q' = \frac{1}{2} Q^2.
\]
So we find that
\[
  a + 4b = \frac{I}{2}, \tag{1}
\]
where
\[
  I = \int \frac{d^3 q}{q^0} \int \frac{d^3 q'}{q'^0}\, \delta^{(4)}(Q - q - q').
\]
We can consider something else. We have
\begin{align*}
  Q^\mu Q^\nu I_{\mu\nu} &= a(Q^2) Q^4 + b(Q^2) Q^4 \\
  &= \int \frac{d^3 q}{q^0} \int \frac{d^3 q'}{q'^0}\, \delta^{(4)}(Q - q - q') (q \cdot Q)(q' \cdot Q).
\end{align*}
Using the masslessness of neutrinos and momentum conservation again, we find that
\[
  (q \cdot Q)(q' \cdot Q) = (q \cdot q')(q \cdot q') = \frac{1}{4} Q^4.
\]
So we find
\[
  a + b = \frac{I}{4}. \tag{2}
\]
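It remains to evaluate $I$, and to do so, we can just evaluate it in the frame where $Q = (\sigma, \mathbf{0})$ for some $\sigma$. Now note that since $q^2 = q'^2 = 0$, we must have
\[
  q^0 = |\mathbf{q}|.
\]
So we have
\begin{align*}
  I &= \int \frac{d^3 q}{|\mathbf{q}|} \int \frac{d^3 q'}{|\mathbf{q}'|}\, \delta(\sigma - |\mathbf{q}| - |\mathbf{q}'|) \prod_{i = 1}^3 \delta(q_i + q'_i) \\
  &= \int \frac{d^3 q}{|\mathbf{q}|^2}\, \delta(\sigma - 2|\mathbf{q}|) \\
  &= 4\pi \int_0^\infty d|\mathbf{q}|\; \delta(\sigma - 2|\mathbf{q}|) \\
  &= 2\pi.
\end{align*}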
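To make the next step explicit, here is a short worked solution of the two linear constraints (1) and (2) for $a$ and $b$, using only the value $I = 2\pi$ found above. Subtracting (2) from (1) gives
\begin{align*}
  3b &= \frac{I}{2} - \frac{I}{4} = \frac{I}{4}
  &&\Longrightarrow\quad b = \frac{I}{12} = \frac{\pi}{6}, \\
  a &= \frac{I}{4} - b = \frac{I}{6}
  &&\Longrightarrow\quad a = \frac{\pi}{3}.
\end{align*}
Hence
\[
  I_{\mu\nu}(Q) = \frac{\pi}{6}\left(2 Q_\mu Q_\nu + g_{\mu\nu} Q^2\right),
  \qquad
  p^\mu k^\nu I_{\mu\nu}(Q) = \frac{\pi}{6}\left(2 (p \cdot Q)(k \cdot Q) + (p \cdot k)\, Q^2\right),
\]
and multiplying the prefactor $\frac{G_F^2}{8\pi^5 m_\mu}$ by $\frac{\pi}{6}$ gives $\frac{G_F^2}{48 \pi^4 m_\mu} = \frac{G_F^2}{3 m_\mu (2\pi)^4}$.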
So we find that
\[
  \Gamma = \frac{G_F^2}{3 m_\mu (2\pi)^4} \int \frac{d^3 k}{k^0} \left(2\, p \cdot (p - k)\; k \cdot (p - k) + (p \cdot k)(p - k)^2\right).
\]
Recall that we are working in the rest frame of $\mu$. So we know that
\[
  p \cdot k = m_\mu E, \quad p \cdot p = m_\mu^2, \quad k \cdot k = m_e^2,
\]
where $E = k^0$. Note that we have
\[
  \frac{m_e}{m_\mu} \approx 0.0048 \ll 1.
\]
So to make our lives easier, it is reasonable to assume $m_e = 0$. In this case, $|\mathbf{k}| = E$, and then
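\begin{align*}
  \Gamma &= \frac{G_F^2}{(2\pi)^4\, 3 m_\mu} \int \frac{d^3 k}{E} \left(2 m_\mu^2\, m_\mu E - 2 (m_\mu E)^2 - 2 (m_\mu E)^2 + m_\mu E\, m_\mu^2\right) \\
  &= \frac{G_F^2 m_\mu}{(2\pi)^4\, 3} \int d^3 k\; (3 m_\mu - 4E) \\
  &= \frac{4\pi G_F^2 m_\mu}{(2\pi)^4\, 3} \int dE\; E^2 (3 m_\mu - 4E).
\end{align*}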
We now need to figure out what we want to integrate over. When $e^-$ is at rest, then $E_{\min} = 0$. The maximum energy is obtained when $\nu_\mu, \bar{\nu}_e$ are in the same direction and opposite to $e^-$. In this case, energy conservation gives
\[
  E + (E_{\bar{\nu}_e} + E_{\nu_\mu}) = m_\mu.
\]
By momentum conservation (all three final-state particles being treated as massless), we also have
\[
  E - (E_{\bar{\nu}_e} + E_{\nu_\mu}) = 0.
\]
So we find
\[
  E_{\max} = \frac{m_\mu}{2}.
\]
Thus, we can put our limits into the integral, and find that
\[
  \Gamma = \frac{G_F^2 m_\mu^5}{192 \pi^3}.
\]
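For completeness, the remaining elementary integral can be written out as follows, using only the limits $0 \leq E \leq m_\mu/2$ and the prefactor stated above:
\begin{align*}
  \int_0^{m_\mu/2} E^2 (3 m_\mu - 4E)\, dE
  &= \Big[ m_\mu E^3 - E^4 \Big]_0^{m_\mu/2}
  = \frac{m_\mu^4}{8} - \frac{m_\mu^4}{16}
  = \frac{m_\mu^4}{16}, \\
  \Gamma &= \frac{4\pi G_F^2 m_\mu}{3 (2\pi)^4} \cdot \frac{m_\mu^4}{16}
  = \frac{G_F^2 m_\mu^5}{3 \cdot 64 \pi^3}
  = \frac{G_F^2 m_\mu^5}{192 \pi^3}.
\end{align*}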
As we mentioned at the beginning of the chapter, this is the only decay channel of the muon. From experiments, we can measure the lifetime of the muon as
\[
  \tau_\mu = 2.1970 \times 10^{-6} \text{ s}.
\]
This tells us that
\[
  G_F = 1.164 \times 10^{-5} \text{ GeV}^{-2}.
\]
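The conversion from lifetime to coupling can be sketched numerically. This is only an illustration of the tree-level formula $\Gamma = G_F^2 m_\mu^5 / (192\pi^3)$ with $\Gamma = \hbar/\tau_\mu$; the input values of $\hbar$ and of the lifetime ($\tau_\mu \approx 2.197 \times 10^{-6}$ s) are assumptions supplied here, and the function name `fermi_constant` is made up for the example.

```python
import math

# Assumed inputs (not taken from the derivation itself)
HBAR_GEV_S = 6.582119569e-25   # reduced Planck constant in GeV * s
M_MU_GEV = 0.1056583715        # muon mass in GeV
TAU_MU_S = 2.197e-6            # muon lifetime in seconds


def fermi_constant(tau_s, mass_gev):
    """Invert Gamma = G_F^2 m^5 / (192 pi^3) for G_F, in GeV^-2."""
    gamma_gev = HBAR_GEV_S / tau_s          # decay rate in natural units
    return math.sqrt(192 * math.pi**3 * gamma_gev / mass_gev**5)


G_F = fermi_constant(TAU_MU_S, M_MU_GEV)
print(f"G_F = {G_F:.4e} GeV^-2")
```

With these inputs the result should land close to the value of $G_F$ quoted above; loop corrections and the electron mass are ignored throughout.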
Of course, this isn't exactly right, because we ignored all loop corrections (and approximated $m_e = 0$). But this is reasonably good, because those effects are very small, at an order of $10^{-6}$ as large. Of course, if we want to do more accurate and possibly beyond standard model physics, we need to do better than this.
Experimentally, $G_F$ is consistent with what we find in the $\tau \to e \bar{\nu}_e \nu_\tau$ and $\mu \to e \bar{\nu}_e \nu_\mu$ decays. This is some good evidence for lepton universality, i.e. they have different masses, but they couple in the same way.
6.4 Pion decay
We are now going to look at a slightly different example. We are going to study decays of pions, which are made up of a quark and anti-quark pair. We will in particular look at the decay
\[
  \pi^-(\bar{u} d) \to e^- \bar{\nu}_e.
\]
[Figure: Feynman diagram for $\pi^-(p) \to e^-(k)\, \bar{\nu}_e(q)$.]
This is actually quite hard to do from first principles, because $\pi^-$ is made up of quarks, and quark dynamics is dictated by QCD, which we don't know about yet. However, these quarks are not free to move around, and are strongly bound together inside the pion. So the trick is to hide all the things that happen on the QCD side in a single constant $F_\pi$, without trying to figure out, from our theory, what $F_\pi$ actually is.
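The relevant currents are
\begin{align*}
  J^\alpha_{\mathrm{lept}} &= \bar{\nu}_e \gamma^\alpha (1 - \gamma^5) e \\
  J^\alpha_{\mathrm{had}} &= \bar{u} \gamma^\alpha (1 - \gamma^5)(V_{ud}\, d + V_{us}\, s + V_{ub}\, b) \equiv V^\alpha_{\mathrm{had}} - A^\alpha_{\mathrm{had}},
\end{align*}
where the $V^\alpha_{\mathrm{had}}$ contains the $\gamma^\alpha$ bit, while $A^\alpha_{\mathrm{had}}$ contains the $\gamma^\alpha \gamma^5$ bit.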
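Then the amplitude we want to compute is
\begin{align*}
  M &= \langle e^-(k)\, \bar{\nu}_e(q) | \mathcal{L}_W^{\mathrm{eff}} | \pi^-(p) \rangle \\
  &= -\frac{G_F}{\sqrt{2}} \langle e^-(k)\, \bar{\nu}_e(q) | J^\dagger_{\alpha, \mathrm{lept}} | 0 \rangle \langle 0 | J^\alpha_{\mathrm{had}} | \pi^-(p) \rangle \\
  &= -\frac{G_F}{\sqrt{2}} \langle e^-(k)\, \bar{\nu}_e(q) | \bar{e} \gamma_\alpha (1 - \gamma^5) \nu_e | 0 \rangle \langle 0 | J^\alpha_{\mathrm{had}} | \pi^-(p) \rangle \\
  &= -\frac{G_F}{\sqrt{2}}\, \bar{u}_e(k) \gamma_\alpha (1 - \gamma^5) v_{\nu_e}(q)\; \langle 0 | V^\alpha_{\mathrm{had}} - A^\alpha_{\mathrm{had}} | \pi^-(p) \rangle.
\end{align*}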
We now note that $V^\alpha_{\mathrm{had}}$ does not contribute. This requires knowing something about QCD and $\pi^-$. It is known that QCD is a P-invariant theory, and experimentally, we find that $\pi^-$ has spin $0$ and odd parity. In other words, under P, it transforms as a pseudoscalar. Thus, the expression
\[
  \langle 0 | \bar{u} \gamma^\alpha d | \pi^-(p) \rangle
\]
transforms as an axial vector. But since the only physical variable involved is $p^\alpha$, the only quantities we can construct are multiples of $p^\alpha$, which are vectors. So this must vanish. By a similar chain of arguments, the remaining part of the QCD part must be of the form
\[
  \langle 0 | \bar{u} \gamma^\alpha \gamma^5 d | \pi^-(p) \rangle = i \sqrt{2} F_\pi p^\alpha
\]
for some constant $F_\pi$. Then we have
\[
  M = i G_F F_\pi V_{ud}\; \bar{u}_e(k)\, \slashed{p}\, (1 - \gamma^5) v_{\nu_e}(q).
\]
To simplify this, we use momentum conservation $p = k + q$, and some spinor identities
\[
  \bar{u}_e(k) \slashed{k} = \bar{u}_e(k) m_e, \quad \slashed{q}\, v_{\nu_e}(q) = 0.
\]
Then we find that
\[
  M = i G_F F_\pi V_{ud}\, m_e\; \bar{u}_e(k) (1 - \gamma^5) v_{\nu_e}(q).
\]
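Doing a manipulation similar to last time's, and noting that
\begin{align*}
  (1 - \gamma^5) \gamma^\mu (1 + \gamma^5) &= 2 (1 - \gamma^5) \gamma^\mu \\
  \operatorname{Tr}(\slashed{k} \slashed{q}) &= 4 k \cdot q \\
  \operatorname{Tr}(\gamma^5 \slashed{k} \slashed{q}) &= 0,
\end{align*}
we find
\begin{align*}
  \sum_{\text{spins } e, \bar{\nu}_e} |M|^2 &= \sum_{\text{spins}} |G_F F_\pi m_e V_{ud}|^2 \left[\bar{u}_e(k) (1 - \gamma^5) v_{\nu_e}(q)\, \bar{v}_{\nu_e}(q) (1 + \gamma^5) u_e(k)\right] \\
  &= 2 |G_F F_\pi m_e V_{ud}|^2 \operatorname{Tr}\left[(\slashed{k} + m_e)(1 - \gamma^5) \slashed{q}\right] \\
  &= 8 |G_F F_\pi m_e V_{ud}|^2 (k \cdot q).
\end{align*}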
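This again shows helicity suppression. The spin-0 $\pi^-$ decays to the positive helicity $\bar{\nu}_e$, and hence, since the two leptons come out back-to-back, it must decay to a positive helicity electron by conservation of angular momentum.
[Figure: $\pi^-$ at rest decaying to back-to-back $e^-$ and $\bar{\nu}_e$, each with spin $\frac{1}{2}$.]
But if $m_e = 0$, then this has right-handed chirality, and so this decay is forbidden.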
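We can now compute an actual decay rate. We note that since we are working in the rest frame of $\pi^-$, we have $\mathbf{k} + \mathbf{q} = 0$; and since the neutrino is massless, we have $q^0 = |\mathbf{q}| = |\mathbf{k}|$. Finally, writing $E = k^0$ for the energy of $e^-$, we obtain
\begin{align*}
  \Gamma_{\pi \to e \bar{\nu}_e} &= \frac{1}{2 m_\pi} \int \frac{d^3 k}{(2\pi)^3\, 2k^0} \int \frac{d^3 q}{(2\pi)^3\, 2q^0}\, (2\pi)^4 \delta^{(4)}(p - k - q)\; 8 |G_F F_\pi m_e V_{ud}|^2 (k \cdot q) \\
  &= \frac{|G_F F_\pi m_e V_{ud}|^2}{4 m_\pi \pi^2} \int \frac{d^3 k}{E |\mathbf{k}|}\, \delta(m_\pi - E - |\mathbf{k}|)\, (E |\mathbf{k}| + |\mathbf{k}|^2).
\end{align*}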
To simplify this further, we use the property
\[
  \delta(f(k)) = \sum_i \frac{\delta(k - k_0^i)}{|f'(k_0^i)|},
\]
where $k_0^i$ runs over all roots of $f$. In our case, we have a unique root
\[
  k_0 = \frac{m_\pi^2 - m_e^2}{2 m_\pi},
\]
and the corresponding derivative is
\[
  |f'(k_0)| = 1 + \frac{k_0}{E}.
\]
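Then we get
\begin{align*}
  \Gamma_{\pi \to e \bar{\nu}_e} &= \frac{|G_F F_\pi m_e V_{ud}|^2}{4 \pi^2 m_\pi} \int \frac{4\pi k^2\, dk}{E}\, \frac{E + k}{1 + k_0/E}\, \delta(k - k_0) \\
  &= \frac{|G_F F_\pi V_{ud}|^2}{4\pi}\, m_e^2 m_\pi \left(1 - \frac{m_e^2}{m_\pi^2}\right)^2.
\end{align*}
Note that if we set $m_e \to 0$, then this vanishes. This time, it is the whole decay rate, and not just some particular decay channel.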
Let's try to match this with experiment. Instead of looking at the actual lifetime, we compare it with some other possible decay rates. The expression for $\Gamma_{\pi \to \mu \bar{\nu}_\mu}$ is exactly the same, with $m_e$ replaced with $m_\mu$. So the ratio
\[
  \frac{\Gamma_{\pi \to e \bar{\nu}_e}}{\Gamma_{\pi \to \mu \bar{\nu}_\mu}} = \frac{m_e^2}{m_\mu^2} \left(\frac{m_\pi^2 - m_e^2}{m_\pi^2 - m_\mu^2}\right)^2 \approx 1.28 \times 10^{-4}.
\]
Here all the decay constants cancel out. So we can actually compare this with experiment just by knowing the electron, muon and pion masses. When we actually do experiments, we find $1.230(4) \times 10^{-4}$. This is pretty good agreement, but not agreement to within experimental error. Of course, this is not unexpected, because we didn't include the quantum loop effects in our calculations.
Another thing we can see is that the ratio is very small, on the order of $10^{-4}$. This we can understand from helicity suppression, because $m_\mu \gg m_e$.
Note that we were able to get away without knowing how QCD works!
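The tree-level ratio of the two leptonic pion decay rates derived above is easy to evaluate. The following is a small sketch; the mass values are assumed inputs (in MeV) and the function name `leptonic_ratio` is invented for the illustration.

```python
# Assumed input masses in MeV
M_E = 0.51099895       # electron
M_MU = 105.6583715     # muon
M_PI = 139.57039       # charged pion


def leptonic_ratio(m_light, m_heavy, m_pi):
    """Tree-level Gamma(pi -> light nu) / Gamma(pi -> heavy nu).

    Implements (m_l^2 / m_h^2) * ((m_pi^2 - m_l^2) / (m_pi^2 - m_h^2))^2;
    F_pi, G_F and V_ud cancel in the ratio.
    """
    mass_factor = (m_light / m_heavy) ** 2
    phase_space = ((m_pi**2 - m_light**2) / (m_pi**2 - m_heavy**2)) ** 2
    return mass_factor * phase_space


ratio = leptonic_ratio(M_E, M_MU, M_PI)
print(f"Gamma(e) / Gamma(mu) = {ratio:.3e}")
```

The first factor is the helicity suppression $m_e^2/m_\mu^2 \approx 2.3 \times 10^{-5}$; the second factor, from phase space, partially compensates for it.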
6.5 $K^0$-$\bar{K}^0$ mixing
We now move on to consider our final example of weak decays, and look at $K^0$-$\bar{K}^0$ mixing. We will only do this rather qualitatively, and look at the effects of CP violation.
Kaons contain a strange quark/antiquark. There are four "flavour eigenstates" given by
\[
  K^0(\bar{s} d), \quad \bar{K}^0(\bar{d} s), \quad K^+(\bar{s} u), \quad K^-(\bar{u} s).
\]
These are the lightest kaons, and they have spin $J = 0$ and negative parity. We can concisely write this information as $J^P = 0^-$. These are pseudoscalars.
We want to understand how these things transform under CP. Parity transformation doesn't change the particle contents, and charge conjugation swaps particles and anti-particles. Thus, we would expect CP to send $K^0$ to $\bar{K}^0$, and vice versa. For kaons at rest, we can pick the relative phases such that
\begin{align*}
  \hat{C}\hat{P} |K^0\rangle &= -|\bar{K}^0\rangle \\
  \hat{C}\hat{P} |\bar{K}^0\rangle &= -|K^0\rangle.
\end{align*}
So the CP eigenstates are just
\[
  |K^0_\pm\rangle = \frac{1}{\sqrt{2}} \left(|K^0\rangle \mp |\bar{K}^0\rangle\right).
\]
Then we have
\[
  \hat{C}\hat{P} |K^0_\pm\rangle = \pm |K^0_\pm\rangle.
\]
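Let's consider the two possible decays $K^0 \to \pi^0 \pi^0$ and $K^0 \to \pi^+ \pi^-$. This requires converting the strange antiquark into an up antiquark by emitting a $W$.
[Figure: tree-level diagrams for $K^0 \to \pi^- \pi^+$ and $K^0 \to \pi^0 \pi^0$, in which $\bar{s} \to \bar{u}$ with the emitted $W$ producing $u \bar{d}$.]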
From the conservation of angular momentum, the total angular momentum of $\pi\pi$ is zero. Since they are also spinless, the orbital angular momentum is $L = 0$.
When we apply CP to the final states, the relative phases of $\pi^+$ and $\pi^-$ (or $\pi^0$ and $\pi^0$) cancel each other out. So we have
\[
  \hat{C}\hat{P} |\pi^+ \pi^-\rangle = (-1)^L |\pi^+ \pi^-\rangle = |\pi^+ \pi^-\rangle.
\]
Similarly, we have
\[
  \hat{C}\hat{P} |\pi^0 \pi^0\rangle = |\pi^0 \pi^0\rangle.
\]
Therefore $\pi\pi$ is always a CP eigenstate with eigenvalue $+1$.
What does this tell us about the possible decays? We know that CP is conserved by the strong and electromagnetic interactions. If it were conserved by the weak interaction as well, then there is a restriction on what can happen. We know that $K^0_+ \to \pi\pi$ is allowed, because both sides have CP eigenvalue $+1$, but $K^0_- \to \pi\pi$ is not. So $K^0_+$ is "short-lived", and $K^0_-$ is "long-lived". Of course, $K^0_-$ will still decay, but must do so via more elaborate channels, e.g. $K^0_- \to \pi\pi\pi$.
Does this agree with experiments? When we actually look at kaons, we find two neutral kaons, which we shall call $K^0_S$ and $K^0_L$. As the subscripts suggest, $K^0_S$ has a "short" lifetime of $\tau \approx 9 \times 10^{-11}$ s, while $K^0_L$ has a "long" lifetime of $\tau \approx 5 \times 10^{-8}$ s.
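But does this actually mean CP is not violated? Not necessarily. For us to be correct, we want to make sure $K^0_L$ never decays to $\pi\pi$. We define the quantities
\[
  \eta_{+-} = \frac{|\langle \pi^+ \pi^- | H | K^0_L \rangle|}{|\langle \pi^+ \pi^- | H | K^0_S \rangle|}, \quad \eta_{00} = \frac{|\langle \pi^0 \pi^0 | H | K^0_L \rangle|}{|\langle \pi^0 \pi^0 | H | K^0_S \rangle|}.
\]
Experimentally, when we measure these things, we have
\[
  \eta_{+-} \approx \eta_{00} \approx 2.2 \times 10^{-3} \neq 0.
\]
So $K^0_L$ does decay into $\pi\pi$.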
If we think about what is going on here, there are two ways CP can be violated:
- Direct CP violation of $s \to u$ due to a phase in $V_{\mathrm{CKM}}$.
- Indirect CP violation due to $K^0 \to \bar{K}^0$ or vice-versa, then decaying.
Of course, ultimately, the "indirect violation" is still due to phases in the CKM matrix, but the second is more "higher level".
It turns out in this particular process, it is the indirect CP violation that is mainly responsible, and the dominant contributions are "box diagrams", where the change in strangeness is $\Delta S = 2$.
[Figure: box diagrams for $K^0 \leftrightarrow \bar{K}^0$, with two $W$ exchanges and internal $u, c, t$ quarks.]
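Given our experimental results, we know that $K^0_S$ and $K^0_L$ aren't quite $K^0_+$ and $K^0_-$ themselves, but have some corrections. We can write them as
\begin{align*}
  |K^0_S\rangle &= \frac{1}{\sqrt{1 + |\varepsilon_1|^2}} \left(|K^0_+\rangle + \varepsilon_1 |K^0_-\rangle\right) \approx |K^0_+\rangle \\
  |K^0_L\rangle &= \frac{1}{\sqrt{1 + |\varepsilon_2|^2}} \left(|K^0_-\rangle + \varepsilon_2 |K^0_+\rangle\right) \approx |K^0_-\rangle,
\end{align*}
where $\varepsilon_1, \varepsilon_2 \in \mathbb{C}$ are some small complex numbers. This way, very occasionally, $K^0_L$ can decay as $K^0_+$.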
We assume that we just have two state mixing, and ignore details of the strong interaction. Then as time progresses, we can write
\begin{align*}
  |K_S(t)\rangle &= a_S(t) |K^0\rangle + b_S(t) |\bar{K}^0\rangle \\
  |K_L(t)\rangle &= a_L(t) |K^0\rangle + b_L(t) |\bar{K}^0\rangle
\end{align*}
for some (complex) functions $a_S, b_S, a_L, b_L$. Recall that Schrödinger's equation says
\[
  i \frac{d}{dt} |\psi(t)\rangle = H |\psi(t)\rangle.
\]
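Thus, we can write
\[
  i \frac{d}{dt} \begin{pmatrix} a \\ b \end{pmatrix} = R \begin{pmatrix} a \\ b \end{pmatrix},
\]
where
\[
  R = \begin{pmatrix}
    \langle K^0 | H' | K^0 \rangle & \langle K^0 | H' | \bar{K}^0 \rangle \\
    \langle \bar{K}^0 | H' | K^0 \rangle & \langle \bar{K}^0 | H' | \bar{K}^0 \rangle
  \end{pmatrix}
\]
and $H'$ is the next-to-leading order weak Hamiltonian. Because kaons decay in finite time, we know $R$ is not Hermitian. By general linear algebra, we can always write it in the form
\[
  R = M - \frac{i}{2} \Gamma,
\]
where $M$ and $\Gamma$ are Hermitian. We call $M$ the mass matrix, and $\Gamma$ the decay matrix.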
We are not going to actually compute $R$, but we are going to use the known symmetries to put some constraints on $R$. We will consider the action of $\Theta = \hat{C}\hat{P}\hat{T}$. The CPT theorem says observables should be invariant under conjugation by $\Theta$. So if $A$ is Hermitian, then $\Theta A \Theta^{-1} = A$. Now our $H'$ is not actually Hermitian, but as above, we can write it as
\[
  H' = A + iB,
\]
where $A$ and $B$ are Hermitian. Now noting that $\Theta$ is anti-unitary, we have
\[
  \Theta H' \Theta^{-1} = A - iB = H'^\dagger.
\]
In the rest frame of a kaon, we know $\hat{T} |K^0\rangle = |K^0\rangle$, and similarly for $\bar{K}^0$. So we have
\[
  \Theta |\bar{K}^0\rangle = -|K^0\rangle, \quad \Theta |K^0\rangle = -|\bar{K}^0\rangle.
\]
Since we are going to involve time reversal, we stop using bra-ket notation for the moment. We have
\begin{align*}
  R_{11} = (K^0, H' K^0) &= (\Theta^{-1} \Theta K^0, H' \Theta^{-1} \Theta K^0) = (\bar{K}^0, H'^\dagger \bar{K}^0)^* \\
  &= (H' \bar{K}^0, \bar{K}^0)^* = (\bar{K}^0, H' \bar{K}^0) = R_{22}.
\end{align*}
Now if $\hat{T}$ was a good symmetry (i.e. $\hat{C}\hat{P}$ is good as well), then a similar computation shows that
\[
  R_{12} = R_{21}.
\]
We can show that we in fact have
\[
  \varepsilon_1 = \varepsilon_2 = \varepsilon = \frac{\sqrt{R_{21}} - \sqrt{R_{12}}}{\sqrt{R_{12}} + \sqrt{R_{21}}}.
\]
So if CP is conserved, then $R_{12} = R_{21}$, and therefore $\varepsilon_1 = \varepsilon_2 = \varepsilon = 0$.
Thus, we see that if we want to have mixing, then we must have $\varepsilon_1, \varepsilon_2 \neq 0$. So we need $R_{12} \neq R_{21}$. In other words, we must have CP violation!
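The link between $R_{12} \neq R_{21}$ and a non-zero admixture can be illustrated numerically. The sketch below is not the computation from the lectures: the matrix entries are made-up numbers, and `cp_impurity` is a hypothetical helper. It diagonalizes a $2 \times 2$ matrix $R$ with $R_{11} = R_{22}$ in the $(K^0, \bar{K}^0)$ basis and reports, for each eigenvector, the magnitude of the smaller CP component relative to the larger one, i.e. $|\varepsilon|$. It uses only the fact that the CP eigenstates are the combinations $(K^0 \pm \bar{K}^0)/\sqrt{2}$, so it does not depend on the phase convention.

```python
import numpy as np


def cp_impurity(R):
    """Return |epsilon| for each eigenvector of R.

    R is a 2x2 complex matrix in the (K0, K0bar) basis. For an eigenvector
    (a, b), the two CP components are proportional to (a - b) and (a + b);
    |epsilon| is the ratio of the smaller to the larger magnitude.
    """
    _, vecs = np.linalg.eig(R)
    impurities = []
    for a, b in vecs.T:  # rows of vecs.T are the eigenvectors
        small, large = sorted([abs(a - b), abs(a + b)])
        impurities.append(small / large)
    return impurities


# Illustrative numbers only: diagonal entries equal, as required by CPT.
diag = 1.0 - 0.5j
R_cp_conserving = np.array([[diag, 0.2 + 0.10j],
                            [0.2 + 0.10j, diag]])
R_cp_violating = np.array([[diag, 0.2 + 0.10j],
                           [0.2 + 0.12j, diag]])

print(cp_impurity(R_cp_conserving))  # both entries vanish up to rounding
print(cp_impurity(R_cp_violating))   # both entries are equal and non-zero
```

In the second case the two eigenvectors have the same impurity, matching the statement $\varepsilon_1 = \varepsilon_2$, and its magnitude agrees with $|\sqrt{R_{12}} - \sqrt{R_{21}}| / |\sqrt{R_{12}} + \sqrt{R_{21}}|$.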
One can also show that
\[
  \eta_{+-} = \varepsilon + \varepsilon', \quad \eta_{00} = \varepsilon - 2\varepsilon',
\]
where $\varepsilon'$ measures the direct source of CP violation. By looking at these two modes of decay, we can figure out the values of $\varepsilon$ and $\varepsilon'$. Experimentally, we find
\[
  |\varepsilon| = (2.228 \pm 0.011) \times 10^{-3},
\]
and
\[
  \left|\frac{\varepsilon'}{\varepsilon}\right| = (1.66 \pm 0.23) \times 10^{-3}.
\]
As claimed, it is the indirect CP violation that is dominant, and the direct one is suppressed by a factor of $10^{-3}$.
Other decays can be used to probe $K^0_{L,S}$. For example, we can look at semi-leptonic decays. We can check that
\[
  K^0 \to \pi^- e^+ \nu_e
\]
is possible, while
\[
  K^0 \to \pi^+ e^- \bar{\nu}_e
\]
is not. $\bar{K}^0$ has the opposite phenomenon. To show these, we just have to try to write down diagrams for these events.
Now if CP is conserved, we'd expect the decay rates
\[
  \Gamma(K^0_{L,S} \to \pi^- e^+ \nu_e) = \Gamma(K^0_{L,S} \to \pi^+ e^- \bar{\nu}_e),
\]
since we expect $K_{L,S}$ to consist of the same amount of $K^0$ and $\bar{K}^0$.
We define
\[
  A_L = \frac{\Gamma(K^0_L \to \pi^- e^+ \nu_e) - \Gamma(K^0_L \to \pi^+ e^- \bar{\nu}_e)}{\Gamma(K^0_L \to \pi^- e^+ \nu_e) + \Gamma(K^0_L \to \pi^+ e^- \bar{\nu}_e)}.
\]
If this is non-zero, then we have evidence for CP violation. Experimentally, we find
\[
  A_L = (3.32 \pm 0.06) \times 10^{-3} \approx 2 \operatorname{Re}(\varepsilon).
\]
This is small, but certainly significantly non-zero.
7 Quantum chromodynamics (QCD)
In the early days of particle physics, we didn't really know what we were doing. So we just smashed particles into each other and saw what happened. Our initial particle accelerators weren't very good, so we mostly observed some low energy particles.
Of course, we found electrons, but the more interesting discoveries were in the hadrons. We had protons and neutrons, $n$ and $p$, as well as pions $\pi^+$, $\pi^-$ and $\pi^0$. We found that $n$ and $p$ behaved rather similarly, with similar interaction properties and masses. On the other hand, the three pions behaved like each other as well. Of course, they had different charges, so this is not a genuine symmetry.
Nevertheless, we decided to assign numbers to these things, called isospin. We say $n$ and $p$ have isospin $I = \frac{1}{2}$, while the pions have isospin $I = 1$. The idea was that if a particle has spin $\frac{1}{2}$, then it has two independent spin states; if it has spin $1$, then it has three independent spin states. Thus, we view $n, p$ as different "spin states" of the same object, and similarly $\pi^\pm, \pi^0$ are the three "spin states" of the same object. Of course, isospin has nothing to do with actual spin.
As in the case of spin, we have spin projections $I_3$. So for example, $p$ has $I_3 = +\frac{1}{2}$ and $n$ has $I_3 = -\frac{1}{2}$. Similarly, $\pi^+$, $\pi^0$ and $\pi^-$ have $I_3 = +1, 0, -1$ respectively.
Mathematically, we can think of these particles as living in representations of $\mathfrak{su}(2)$. Each "group" $\{n, p\}$ or $\{\pi^+, \pi^0, \pi^-\}$ corresponded to a representation of $\mathfrak{su}(2)$, and the isospin labelled the representation they belonged to. The eigenvectors corresponded to the individual particle states, and the isospin projection $I_3$ referred to this eigenvalue.
That might have seemed like a stretch to invoke representation theory. We then built better particle accelerators, and found more particles. These new particles were quite strange, so we assigned a number called strangeness to measure how strange they are. Four of these particles behaved quite like the pions, and we called them kaons. Physicists then got bored and plotted out these particles according to isospin and strangeness:
[Figure: the mesons $\pi^+$, $\pi^-$, $\pi^0$, $\eta$, $K^+$, $K^-$, $K^0$, $\bar{K}^0$ plotted against isospin projection $I_3$ and strangeness $S$.]
Remarkably, the diagonal lines join together particles of the same charge! Something must be going on here. It turns out if we include these "strange" particles into the picture, then instead of a representation of $\mathfrak{su}(2)$, we now have representations of $\mathfrak{su}(3)$. Indeed, this just looks like a weight diagram of $\mathfrak{su}(3)$.
Ultimately, we figured that things are made out of quarks. We now know that there are 6 quarks, but that's too many for us to handle. The last three quarks are very heavy. They weren't very good at forming hadrons, and their large mass means the particles they form no longer "look alike". So we only focus on the first three.
At first, we only discovered things made up of up quarks and down quarks. We can think of these quarks as living in the fundamental representation $V_1$ of $\mathfrak{su}(2)$, with
\[
  u = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad d = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]
These are eigenvectors of the Cartan generator $H$, with weights $+\frac{1}{2}$ and $-\frac{1}{2}$ (using the "physicist's" way of numbering). The idea is that physics is approximately invariant under the action of $\mathfrak{su}(2)$ that mixes $u$ and $d$. Thus, different hadrons made out of $u$ and $d$ might look alike. Nowadays, we know that the QCD part of the Lagrangian is exactly invariant under the $\mathrm{SU}(2)$ action, while the other parts are not.
The anti-quarks lived in the anti-fundamental representation (which is also the fundamental representation). A meson is made of a quark and an anti-quark. So they live in the tensor product
\[
  V_1 \otimes V_1 = V_0 \oplus V_2.
\]
The $V_2$ was the pions we found previously. Similarly, the protons and neutrons consist of three quarks, and live in
\[
  V_1 \otimes V_1 \otimes V_1 = (V_0 \oplus V_2) \otimes V_1 = V_1 \oplus V_1 \oplus V_3.
\]
One of the $V_1$'s contains the protons and neutrons.
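The "strange" hadrons contain what is known as the strange quark, $s$. This is significantly more massive than the $u$ and $d$ quarks, but is not too far off, so we still get a reasonable approximate symmetry. This time, we have three quarks, and they fall into an $\mathfrak{su}(3)$ representation,
\[
  u = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad d = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad s = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.
\]
This is the fundamental $\mathbf{3}$ representation, while the anti-quarks live in the anti-fundamental $\bar{\mathbf{3}}$. These decompose as
\begin{align*}
  \mathbf{3} \otimes \bar{\mathbf{3}} &= \mathbf{1} \oplus \mathbf{8} \\
  \mathbf{3} \otimes \mathbf{3} \otimes \mathbf{3} &= \mathbf{1} \oplus \mathbf{8} \oplus \mathbf{8} \oplus \mathbf{10}.
\end{align*}
The quantum numbers correspond to the weights of the eigenvectors, and hence when we plot the particles according to quantum numbers, they fall in such a nice lattice.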
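A quick sanity check on these decompositions is to count dimensions on both sides. The snippet below is only that, a dimension count and not a proof of the decompositions; it assumes the labelling in which the $\mathfrak{su}(2)$ representation $V_n$ has dimension $n + 1$ and the $\mathfrak{su}(3)$ representations are named by their dimensions. The dictionary name `decompositions` is made up for the example.

```python
from math import prod


def su2_dim(n):
    """Dimension of the su(2) representation V_n (highest weight n)."""
    return n + 1


# Each entry: (dimensions of tensor factors, dimensions of direct summands)
decompositions = {
    "V1 x V1 = V0 + V2": ([su2_dim(1)] * 2, [su2_dim(0), su2_dim(2)]),
    "V1 x V1 x V1 = V1 + V1 + V3": (
        [su2_dim(1)] * 3,
        [su2_dim(1), su2_dim(1), su2_dim(3)],
    ),
    "3 x 3bar = 1 + 8": ([3, 3], [1, 8]),
    "3 x 3 x 3 = 1 + 8 + 8 + 10": ([3, 3, 3], [1, 8, 8, 10]),
}

for name, (factors, summands) in decompositions.items():
    # The product of the factor dimensions must equal the sum of the pieces.
    print(name, prod(factors) == sum(summands))
```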
There are a few mysteries left to solve. Experimentally, we found a baryon $\Delta^{++} = uuu$ with spin $\frac{3}{2}$. The wavefunction appears to be symmetric, but this would violate Fermi statistics. This caused theorists to go and scratch their heads again and see what they can do to understand this. Also, we couldn't explain why we only had particles coming from $\mathbf{3} \otimes \bar{\mathbf{3}}$ and $\mathbf{3} \otimes \mathbf{3} \otimes \mathbf{3}$, and nothing else.
The resolution is that we need an extra quantum number. This quantum number is called colour. This resolved the problem of Fermi statistics, but also, we postulated that any bound state must have no "net colour". Effectively, this meant we needed to have groups of three quarks or three anti-quarks, or a quark-antiquark pair. This leads to the idea of confinement. This principle predicted the $\Omega^-$ baryon $sss$ with spin $\frac{3}{2}$, which was subsequently observed.
Nowadays, we understand this is all due to an $\mathrm{SU}(3)$ gauge symmetry, which is not the $\mathrm{SU}(3)$ we encountered just now. This is what we are going to study in this chapter.
7.1 QCD Lagrangian
The modern description of the strong interaction of quarks is quantum chromodynamics, QCD. This is a gauge theory with an $\mathrm{SU}(3)_C$ gauge group. The strong force is mediated by gauge bosons known as gluons. This gauge symmetry is exact, and the gluons are massless.
In QCD, each flavour of quark comes in three "copies" of different colour. It is conventional to call these colours red, green and blue, even though they have nothing to do with actual colours. For a flavour $f$, we can write these as $q_f^{\mathrm{red}}$, $q_f^{\mathrm{green}}$ and $q_f^{\mathrm{blue}}$. We can put these into a triplet:
\[
  q_f = \begin{pmatrix} q_f^{\mathrm{red}} \\ q_f^{\mathrm{green}} \\ q_f^{\mathrm{blue}} \end{pmatrix}.
\]
Then QCD says this has an $\mathrm{SU}(3)$ gauge symmetry, where the triplet transforms under the fundamental representation. Since this symmetry is exact, quarks of all three colours behave exactly the same, and except when we are actually doing QCD, it doesn't matter that there are three colours.
We do this for each quark individually, and then the QCD Lagrangian is given by
\[
  \mathcal{L}_{\mathrm{QCD}} = -\frac{1}{4} F^{a\mu\nu} F^a_{\mu\nu} + \sum_f \bar{q}_f (i \slashed{D} - m_f) q_f,
\]
where, as usual,
\begin{align*}
  D_\mu &= \partial_\mu + i g A^a_\mu T^a \\
  F^a_{\mu\nu} &= \partial_\mu A^a_\nu - \partial_\nu A^a_\mu - g f^{abc} A^b_\mu A^c_\nu.
\end{align*}
Here $T^a$ for $a = 1, \cdots, 8$ are generators of $\mathfrak{su}(3)$, and, as usual, satisfy
\[
  [T^a, T^b] = i f^{abc} T^c.
\]
One possible choice of generators is
\[
  T^a = \frac{1}{2} \lambda^a,
\]
where the $\lambda^a$ are the Gell-Mann matrices. The fact that we have 8 independent generators means we have 8 gluons.
One thing that is very different about QCD is that it has interactions between gauge bosons. If we expand the Lagrangian, and think about the tree level interactions that take place, we naturally have interactions that look like
[Figure: quark-quark-gluon vertex.]
but we also have three and four-gluon interactions
[Figure: three-gluon and four-gluon vertices.]
Mathematically, this is due to the non-abelian nature of $\mathrm{SU}(3)$, and physically, we can think of this as saying the gluons themselves have colour charge.
7.2 Renormalization
We now spend some time talking about renormalization of QCD. We didn’t
talk about renormalization when we did electroweak theory, because the effect
is much less pronounced in that case. Renormalization is treated much more
thoroughly in the Advanced Quantum Field Theory course, as well as Statistical
Field Theory. Thus, we will just briefly mention the key ideas and results for
QCD.
QCD has a coupling constant, which we shall call $g$. The idea of renormalization is that this coupling constant should depend on the energy scale $\mu$ we are working with. We write this as $g(\mu)$. However, this dependence on $\mu$ is not arbitrary. The physics we obtain should not depend on the renormalization point $\mu$ we chose. This imposes some restrictions on how $g(\mu)$ depends on $\mu$, and this allows us to define and compute the quantity
\[
  \beta(g(\mu)) = \mu \frac{d}{d\mu} g(\mu).
\]
The $\beta$-function for non-abelian gauge theories typically looks like
\[
  \beta(g) = -\frac{\beta_0 g^3}{16\pi^2} + O(g^5)
\]
for some $\beta_0$. For an $\mathrm{SU}(N)$ gauge theory coupled to fermions $\{f\}$ (both left- and right-handed), up to one-loop order, we have
\[
  \beta_0 = \frac{11}{3} N - \frac{4}{3} \sum_f T_f,
\]
where $T_f$ is the Dynkin index of the representation of the fermion $f$. For the fundamental representation, which is all we are going to care about, we have $T_f = \frac{1}{2}$.
In our model of QCD, we have 6 quarks. So
\[
  \beta_0 = 11 - 4 = 7.
\]
So we find that the $\beta$-function is always negative!
This isn't actually quite it. The number of "active" quarks depends on the energy scale. At energies $\ll m_{\mathrm{top}} \approx 173$ GeV, the top quark is no longer active, and $n_f = 5$. At energies $\sim 100$ MeV, we are left with three quarks, and then $n_f = 3$. Matching the $\beta$ functions between these regimes requires a bit of care, and we will not go into that. But in any case, the $\beta$ function is always negative.
Often, we are not interested in the constant $g$ itself, but the strong coupling
\[
  \alpha_S = \frac{g^2}{4\pi}.
\]
It is an easy application of the chain rule to find that, to lowest order,
\[
  \mu \frac{d\alpha_S}{d\mu} = -\frac{\beta_0}{2\pi} \alpha_S^2.
\]
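We now integrate this equation, and see what we get. We have
\[
  \int_{\alpha_S(\mu_0)}^{\alpha_S(\mu)} \frac{d\alpha_S}{\alpha_S^2} = -\frac{\beta_0}{2\pi} \int_{\mu_0}^\mu \frac{d\mu}{\mu}.
\]
So we find
\[
  \alpha_S(\mu) = \frac{2\pi}{\beta_0} \frac{1}{\log(\mu/\mu_0) + \frac{2\pi}{\beta_0 \alpha_S(\mu_0)}}.
\]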
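There is an energy scale where $\alpha_S$ diverges, which we shall call $\Lambda_{\mathrm{QCD}}$. This is given by
\[
  \log \Lambda_{\mathrm{QCD}} = \log \mu_0 - \frac{2\pi}{\beta_0 \alpha_S(\mu_0)}.
\]
In terms of $\Lambda_{\mathrm{QCD}}$, we can write $\alpha_S(\mu)$ as
\[
  \alpha_S(\mu) = \frac{2\pi}{\beta_0 \log(\mu/\Lambda_{\mathrm{QCD}})}.
\]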
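The one-loop running can be sketched in a few lines. This is a rough illustration of the formulas above, not a precision tool: it ignores flavour thresholds and higher loops, the function names are invented for the example, and the reference value $\alpha_S \approx 0.118$ at $\mu_0 \approx 91.19$ GeV is an assumed input rather than something derived here.

```python
import math

T_FUNDAMENTAL = 0.5  # Dynkin index of the fundamental representation


def beta0(n_flavours, n_colours=3):
    """One-loop coefficient beta_0 = (11/3) N - (4/3) * sum_f T_f."""
    return (11.0 * n_colours - 4.0 * n_flavours * T_FUNDAMENTAL) / 3.0


def lambda_qcd(alpha_ref, mu_ref, n_flavours):
    """Scale at which the one-loop coupling diverges (same units as mu_ref)."""
    return mu_ref * math.exp(-2.0 * math.pi / (beta0(n_flavours) * alpha_ref))


def alpha_s(mu, lam, n_flavours):
    """One-loop alpha_S(mu) = 2 pi / (beta_0 log(mu / Lambda)).

    Only meaningful for mu well above lam, where the coupling is small.
    """
    return 2.0 * math.pi / (beta0(n_flavours) * math.log(mu / lam))


# Assumed reference point: alpha_S of about 0.118 at about 91.19 GeV, n_f = 5.
lam5 = lambda_qcd(0.118, 91.19, n_flavours=5)
for mu in (10.0, 91.19, 1000.0):
    print(f"mu = {mu:8.2f} GeV   alpha_S = {alpha_s(mu, lam5, 5):.4f}")
```

The printed values decrease as $\mu$ grows, which is the asymptotic freedom discussed next; `beta0(6)` reproduces the value $7$ quoted above.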
Note that in the way we defined it, $\beta_0$ is positive. So we see that $\alpha_S$ decreases with increasing $\mu$. This is called asymptotic freedom. Thus, the divergence occurs at low $\mu$. This is rather different from, say, QED, which is the other way round.
Another important point is that we haven't included any mass term yet, and so we do not have a natural "energy scale" given by the masses. Thus, $\mathcal{L}_{\mathrm{QCD}}$ is scale invariant, but quantization has led to a characteristic scale $\Lambda_{\mathrm{QCD}}$. This is called dimensional transmutation.
What is this scale? This depends on what regularization and renormalization
scheme we are using, and tends to be
Λ
QCD
200-500MeV.
We can think of this as approximately the scale of the border between perturbative
and non-perturbative physics. Note that non-perturbative means we are in low
energies, because that is when the coupling is strong!
Of course, we have to be careful, because these results were obtained by only
looking up to one-loop, and so we cannot expect it to make sense at low energy
scales.
7.3 $e^+ e^- \to$ hadrons
Doing QCD computations is very hard™. Partly, this is due to the problem of
confinement. Confinement in QCD means it is impossible to observe free quarks.
When we collide quarks together, we can potentially produce single quarks or
anti-quarks. Then because of confinement, jets of quarks, anti-quarks and gluons
would be produced and combine to form colour-singlet states. This process is
known as hadronization.
Confinement and hadronization are not very well understood, and these
happen in the non-perturbative regime of QCD. We will not attempt to try to
understand it. Thus, to do computations, we will first ignore hadronization,
which admittedly isn’t a very good idea. We then try to parametrize the
hadronization part, and then see if we can go anywhere.
Practically, our experiments often happen at very high energy scales. At
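these energy scales, $\alpha_S$ is small, and we can expect perturbation theory to work.
We now begin by ignoring hadronization, and try to compute the amplitudes
for the interaction
\[
  e^+ e^- \to q \bar{q}.
\]
The leading process is
[Feynman diagram: $e^-(p_1)$ and $e^+(p_2)$ annihilate into a virtual photon $\gamma$, which produces a quark $q(k_1)$ and an antiquark $\bar{q}(k_2)$.]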
We let
\[
  q = k_1 + k_2 = p_1 + p_2,
\]
and $Q$ be the quark charge. Repeating computations as before, and neglecting
fermion masses, we find that
\[
  \mathcal{M} = (-ie)^2 Q\, \bar{u}_q(k_1) \gamma^\mu v_q(k_2)\,
    \frac{-i g_{\mu\nu}}{q^2}\, \bar{v}_e(p_2) \gamma^\nu u_e(p_1).
\]
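We average over initial spins and sum over final states. Then we have
\begin{align*}
  \frac{1}{4} \sum_{\mathrm{spins}} |\mathcal{M}|^2
    &= \frac{e^4 Q^2}{4 q^4}
       \mathrm{Tr}(\slashed{k}_1 \gamma^\mu \slashed{k}_2 \gamma^\nu)\,
       \mathrm{Tr}(\slashed{p}_1 \gamma_\mu \slashed{p}_2 \gamma_\nu) \\
    &= \frac{8 e^4 Q^2}{q^4}
       \Big[ (p_1 \cdot k_1)(p_2 \cdot k_2) + (p_1 \cdot k_2)(p_2 \cdot k_1) \Big] \\
    &= e^4 Q^2 (1 + \cos^2\theta),
\end{align*}
where $\theta$ is the angle between $\mathbf{k}_1$ and $\mathbf{p}_1$, and we are working in the COM frame
of the $e^+$ and $e^-$.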
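The last step, from dot products to $1 + \cos^2\theta$, can be checked with explicit massless four-vectors in the COM frame. This is a minimal sketch; the helper names are hypothetical, and the result is quoted in units of $e^4 Q^2$.

```python
import math


def minkowski_dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def spin_averaged_factor(energy, theta):
    """(1/4) sum |M|^2 divided by e^4 Q^2, from the dot-product formula.

    Assumes massless fermions in the COM frame: the e- travels along +z,
    and the quark leaves at angle theta to it, each with the given energy.
    """
    p1 = (energy, 0.0, 0.0, energy)
    p2 = (energy, 0.0, 0.0, -energy)
    k1 = (energy, energy * math.sin(theta), 0.0, energy * math.cos(theta))
    k2 = (energy, -energy * math.sin(theta), 0.0, -energy * math.cos(theta))
    q = tuple(x + y for x, y in zip(p1, p2))
    q_squared = minkowski_dot(q, q)
    return 8.0 / q_squared**2 * (
        minkowski_dot(p1, k1) * minkowski_dot(p2, k2)
        + minkowski_dot(p1, k2) * minkowski_dot(p2, k1)
    )


for theta in (0.0, 0.5, 1.0, math.pi / 2):
    print(theta, spin_averaged_factor(7.0, theta), 1.0 + math.cos(theta) ** 2)
```

The beam energy drops out, as it must for a dimensionless quantity built from massless momenta.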
Now what we actually want to do is to work out the cross section. We have
\[
  d\sigma = \frac{1}{|v_1 - v_2|} \frac{1}{4 p_1^0 p_2^0}
    \frac{d^3 k_1}{(2\pi)^3 2 k_1^0} \frac{d^3 k_2}{(2\pi)^3 2 k_2^0}
    (2\pi)^4 \delta^{(4)}(q - k_1 - k_2)
    \times \frac{1}{4} \sum_{\mathrm{spins}} |\mathcal{M}|^2.
\]
We first take care of the $|v_1 - v_2|$ factor. Since $m = 0$ in our approximation,
they travel at the speed of light, and we have $|v_1 - v_2| = 2$. Also, we note that
$k_1^0 = k_2^0 \equiv |\mathbf{k}|$ due to working in the center of mass frame, and likewise $p_1^0 = p_2^0 = |\mathbf{k}|$.
Using these, and plugging in our expression for $\frac{1}{4}\sum |\mathcal{M}|^2$, we have
\[
  d\sigma = \frac{e^4 Q^2}{2} \cdot \frac{1}{16} \cdot
    \frac{d^3 k_1\, d^3 k_2}{(2\pi)^2 |\mathbf{k}|^4}
    \delta^{(4)}(q - k_1 - k_2) (1 + \cos^2\theta).
\]
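This is all, officially, inside an integral, and if we are only interested in what
directions things fly out, we can integrate over the momentum part. We let $d\Omega$
denote the solid angle, and then
\[
  d^3 k_1 = |\mathbf{k}|^2\, d|\mathbf{k}|\, d\Omega.
\]
Then, integrating out some delta functions, and using $|\mathbf{k}|^4 = q^4/16$ on the
support of the remaining one, we obtain
\begin{align*}
  \frac{d\sigma}{d\Omega}
    &= \int d|\mathbf{k}|\, |\mathbf{k}|^2\, \frac{e^4 Q^2}{8\pi^2 q^4}\,
       \frac{1}{2} \delta\!\left( \frac{\sqrt{q^2}}{2} - |\mathbf{k}| \right)
       (1 + \cos^2\theta) \\
    &= \frac{\alpha^2 Q^2}{4 q^2} (1 + \cos^2\theta),
\end{align*}
where, as usual
\[
  \alpha = \frac{e^2}{4\pi}.
\]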
We can integrate over all solid angle to obtain
\[
  \sigma(e^+ e^- \to q \bar{q}) = \frac{4\pi\alpha^2}{3 q^2} Q^2.
\]
We can compare this to $e^+ e^- \to \mu^+ \mu^-$, which is exactly the same, except we
put $Q^2 = 1$, since the muon has unit charge.
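The solid-angle integral is $\int (1 + \cos^2\theta)\, d\Omega = 16\pi/3$, which turns $\alpha^2 Q^2 / 4q^2$ into $4\pi\alpha^2 Q^2 / 3q^2$. A small numerical sketch of this step follows. The value of $\alpha$ and the conversion factor $(\hbar c)^2 \approx 0.3894 \times 10^6\ \mathrm{nb\,GeV^2}$ are standard inputs assumed here, not taken from the notes, and the function names are hypothetical.

```python
import math

ALPHA_EM = 1 / 137.036        # fine-structure constant, low-energy value (assumption)
GEV2_TO_NB = 0.3893794e6      # (hbar c)^2 in nb GeV^2 (assumption)


def angular_integral(n_steps=100_000):
    """Midpoint-rule integral of (1 + cos^2 theta) over the full solid angle."""
    d_cos = 2.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        cos_theta = -1.0 + (i + 0.5) * d_cos
        total += (1.0 + cos_theta * cos_theta) * d_cos
    # The azimuthal integral contributes a factor of 2 pi.
    return 2.0 * math.pi * total


def sigma_qqbar(q_squared_gev2, charge, alpha=ALPHA_EM):
    """Total cross section 4 pi alpha^2 Q^2 / (3 q^2), in GeV^-2."""
    return 4.0 * math.pi * alpha**2 * charge**2 / (3.0 * q_squared_gev2)


print(angular_integral(), 16.0 * math.pi / 3.0)
# Muon pair production (Q^2 = 1) at q^2 = 1 GeV^2, converted to nanobarns.
print(sigma_qqbar(1.0, 1.0) * GEV2_TO_NB)
```

Colour is not included here; a single quark flavour would pick up an extra factor of $N_c$.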
But this isn’t it. We want to include the effects of hadronization. Thus, we
want to consider the annihilation of $e^+ e^-$ into any possible hadronic final state. For
any final state $X$, the invariant amplitude is given by
\[
  \mathcal{M}_X = \frac{e^2}{q^2} \langle X | J_h^\mu | 0 \rangle\,
    \bar{v}_e(p_2) \gamma_\mu u_e(p_1),
\]
where
\[
  J_h^\mu = \sum_f Q_f\, \bar{q}_f \gamma^\mu q_f,
\]
and $Q_f$ is the quark charge. Then the total cross section is
\[
  \sigma(e^+ e^- \to \text{hadrons}) = \frac{1}{8 p_1^0 p_2^0}
    \sum_X \frac{1}{4} \sum_{\mathrm{spins}}
    (2\pi)^4 \delta^{(4)}(q - p_X) |\mathcal{M}_X|^2.
\]
We can’t compute the hadronic bit perturbatively, because it is non-perturbative
physics. So we are going to parameterize it in some way. We introduce a hadronic
spectral density
\[
  \rho_h^{\mu\nu}(q) = (2\pi)^3 \sum_{X, p_X} \delta^{(4)}(q - p_X)\,
    \langle 0 | J_h^\mu | X \rangle \langle X | J_h^\nu | 0 \rangle.
\]
We now do some dodgy maths. By current conservation, we have
\[
  q_\mu \rho_h^{\mu\nu} = 0.
\]
Also, we know $X$ has positive energy. Then Lorentz covariance forces $\rho_h^{\mu\nu}$ to take
the form
\[
  \rho_h^{\mu\nu}(q) = (-g^{\mu\nu} q^2 + q^\mu q^\nu)\, \Theta(q^0)\, \rho_h(q^2).
\]
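We can then plug this into the cross-section formula, and doing annoying
computations, we find
\begin{align*}
  \sigma &= \frac{1}{8 p_1^0 p_2^0} \frac{(2\pi) e^4}{4 q^4}\,
    4 (p_{1\mu} p_{2\nu} - p_1 \cdot p_2\, g_{\mu\nu} + p_{1\nu} p_{2\mu})
    (-g^{\mu\nu} q^2 + q^\mu q^\nu)\, \rho_h(q^2) \\
  &= \frac{16 \pi^3 \alpha^2}{q^2} \rho_h(q^2).
\end{align*}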
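The “annoying computations” can be sketched as follows, assuming massless leptons (so $p_1^2 = p_2^2 = 0$) and working in the COM frame:

```latex
\begin{gather*}
q^2 = (p_1 + p_2)^2 = 2\, p_1 \cdot p_2, \qquad
p_1 \cdot q = p_2 \cdot q = \tfrac{1}{2} q^2, \qquad
p_1^0 p_2^0 = \tfrac{1}{4} q^2, \\
(p_{1\mu} p_{2\nu} - p_1 \cdot p_2\, g_{\mu\nu} + p_{1\nu} p_{2\mu})
(-g^{\mu\nu} q^2 + q^\mu q^\nu)
  = q^2\, p_1 \cdot p_2 + 2 (p_1 \cdot q)(p_2 \cdot q) = q^4, \\
\sigma = \frac{1}{2 q^2} \cdot \frac{2\pi e^4}{4 q^4} \cdot 4 q^4\, \rho_h(q^2)
  = \frac{\pi e^4}{q^2}\, \rho_h(q^2)
  = \frac{16 \pi^3 \alpha^2}{q^2}\, \rho_h(q^2),
\end{gather*}
```

where the last equality uses $e^2 = 4\pi\alpha$.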
Of course, we can’t compute this $\rho_h$ directly. However, if we are lazy, we can
consider only quark-antiquark final states $X$. It turns out this is a reasonably
good approximation. Then we obtain something similar to what we had at the
beginning of the section. We will be less lazy and include the quark masses this
time. Then we have
\begin{multline*}
  \rho_h^{\mu\nu}(q^2) = N_c \sum_f Q_f^2
    \int \frac{d^3 k_1}{(2\pi)^3 2 k_1^0}
    \int \frac{d^3 k_2}{(2\pi)^3 2 k_2^0}
    (2\pi)^3 \delta^{(4)}(q - k_1 - k_2) \\
  \times \mathrm{Tr}\big[ (\slashed{k}_1 + m_f) \gamma^\mu (\slashed{k}_2 - m_f) \gamma^\nu \big]
    \Big|_{k_1^2 = k_2^2 = m_f^2},
\end{multline*}
where $N_c$ is the number of colours and $m_f$ is the mass of the quark $q_f$. We consider
the quantity
\[
  I^{\mu\nu} = \int \frac{d^3 k_1}{k_1^0} \int \frac{d^3 k_2}{k_2^0}\,
    \delta^{(4)}(q - k_1 - k_2)\, k_1^\mu k_2^\nu
    \Big|_{k_1^2 = k_2^2 = m_f^2}.
\]
We can argue that we can write
\[
  I^{\mu\nu} = A(q^2)\, q^\mu q^\nu + B(q^2)\, g^{\mu\nu}.
\]
We contract this with $g_{\mu\nu}$ and $q_\mu q_\nu$ (separately) to obtain equations for $A, B$.
We also use
\[
  q^2 = (k_1 + k_2)^2 = 2 m_f^2 + 2 k_1 \cdot k_2.
\]
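As a sketch of how the two contractions determine $A$ and $B$, write $I_0$ for the same integral without the factor $k_1^\mu k_2^\nu$ (the symbol $I_0$ is introduced here purely for illustration). On the mass shell we have $k_1 \cdot k_2 = \frac{1}{2}(q^2 - 2m_f^2)$ and $q \cdot k_1 = q \cdot k_2 = \frac{1}{2} q^2$, so

```latex
\begin{align*}
g_{\mu\nu} I^{\mu\nu} &= A q^2 + 4B = \tfrac{1}{2} (q^2 - 2 m_f^2)\, I_0, \\
q_\mu q_\nu I^{\mu\nu} &= A q^4 + B q^2 = \tfrac{1}{4} q^4\, I_0, \\
\Longrightarrow \quad
A &= \frac{q^2 + 2 m_f^2}{6 q^2}\, I_0, \qquad
B = \frac{q^2 - 4 m_f^2}{12}\, I_0 .
\end{align*}
```

The factor $(q^2 + 2m_f^2)/q^2$ in $A$ is the one that survives in $\rho_h(q^2)$, while the square-root threshold factor comes from the two-body phase-space integral $I_0$.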
We then find that
\[
  \rho_h(q^2) = \frac{N_c}{12\pi^2} \sum_f Q_f^2\, \Theta(q^2 - 4 m_f^2)
    \left( 1 - \frac{4 m_f^2}{q^2} \right)^{1/2}
    \frac{q^2 + 2 m_f^2}{q^2}.
\]
That’s it! We can now plug this into the equation we had for the cross-section.
It’s still rather messy. If all $m_f \to 0$, then this simplifies very nicely, and we find
that
\[
  \rho_h(q^2) = \frac{N_c}{12\pi^2} \sum_f Q_f^2.
\]
Then after some hard work, we find that
\[
  \sigma_{\mathrm{LO}}(e^+ e^- \to \text{hadrons})
    = N_c \frac{4\pi\alpha^2}{3 q^2} \sum_f Q_f^2,
\]
where LO denotes “leading order”.
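An experimentally interesting quantity is the following ratio:
\[
  R = \frac{\sigma(e^+ e^- \to \text{hadrons})}{\sigma(e^+ e^- \to \mu^+ \mu^-)}.
\]
Then we find that
\[
  R_{\mathrm{LO}} = N_c \sum_f Q_f^2 =
  \begin{cases}
    \frac{2}{3} N_c & \text{when $u, d, s$ are active} \\
    \frac{10}{9} N_c & \text{when $u, d, s, c$ are active} \\
    \frac{11}{9} N_c & \text{when $u, d, s, c, b$ are active}
  \end{cases}
\]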
In particular, we expect “jumps” as we go between the quark masses. Of course,
it is not going to be a sharp jump, but some continuous transition.
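Both the plateau values of $R_{\mathrm{LO}}$ and the smooth turn-on at each threshold can be sketched numerically. Dividing $\sigma = 16\pi^3\alpha^2 \rho_h / q^2$ by the massless muon cross section gives $R = 12\pi^2 \rho_h(q^2)$, which is what `r_with_thresholds` evaluates. The quark masses below are rough illustrative values assumed for this example, not numbers from the notes, and the function names are hypothetical.

```python
import math
from fractions import Fraction

QUARK_CHARGES = {
    "u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3),
    "c": Fraction(2, 3), "b": Fraction(-1, 3), "t": Fraction(2, 3),
}

# Rough quark masses in GeV (illustrative assumptions only).
ILLUSTRATIVE_MASSES = {"u": 0.002, "d": 0.005, "s": 0.095, "c": 1.27, "b": 4.18}


def r_leading_order(active_flavours, n_colours=3):
    """R_LO = N_c * sum of squared charges over the active flavours."""
    return n_colours * sum(QUARK_CHARGES[f] ** 2 for f in active_flavours)


def r_with_thresholds(q_squared, masses=ILLUSTRATIVE_MASSES, n_colours=3):
    """R = 12 pi^2 rho_h(q^2), using the massive tree-level rho_h."""
    total = 0.0
    for flavour, mass in masses.items():
        if q_squared > 4.0 * mass**2:  # the Theta(q^2 - 4 m_f^2) factor
            ratio = mass**2 / q_squared
            total += (float(QUARK_CHARGES[flavour] ** 2)
                      * math.sqrt(1.0 - 4.0 * ratio) * (1.0 + 2.0 * ratio))
    return n_colours * total


print(r_leading_order("uds"), r_leading_order("udsc"), r_leading_order("udscb"))
for sqrt_s in (2.0, 3.0, 5.0, 9.0, 12.0, 30.0):
    print(sqrt_s, r_with_thresholds(sqrt_s**2))
```

This tree-level picture omits resonances and QCD corrections, so it only illustrates the step structure rather than the measured curve.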
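We’ve been working with tree level diagrams so far. The one-loop diagrams
are UV finite but have IR divergences, where the loop momenta $\to 0$. The
diagrams include
[Feynman diagrams: three one-loop corrections to $e^+ e^- \to q \bar{q}$ via a photon $\gamma$.]
However, it turns out the IR divergence is cancelled by tree level $e^+ e^- \to q \bar{q} g$
processes such as
[Feynman diagram: $e^+ e^- \to q \bar{q} g$ via a photon $\gamma$, with a gluon $g$ in the final state.]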
7.4 Deep inelastic scattering
In this chapter, we are going to take an electron, accelerate it to really high
speeds, and then smash it into a proton.
If we do this at low energies, then the proton appears pointlike. This is
Rutherford and Mott scattering we know and love from A-levels Physics. If we
increase the energy a bit, then the wavelength of the electron decreases, and
now the scattering would be sensitive to charge distributions within the proton.
But this is still elastic scattering. After the interactions, the proton remains a
proton and the electron remains an electron.
What we are interested in is inelastic scattering. At very high energies, what
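tends to happen is that the proton breaks up into a lot of hadrons $X$. We can
depict this interaction as follows:
[Feynman diagram: an electron $e^-$ with incoming momentum $p$ scatters through an angle $\theta$ to outgoing momentum $p'$, exchanging a photon $\gamma$ with the hadron $H$, which breaks up into the hadronic final state $X$.]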
This led to the idea that hadrons are made up of partons. When we first
studied this, we thought these partons are weakly interacting, but nowadays, we
know this is due to asymptotic freedom.
Let’s try to understand this scattering. The final state $X$ can be very
complicated in general, and we have no interest in this part. We are mostly
interested in the difference in momentum,
\[
  q = p - p',
\]
as well as the scattering angle $\theta$. We will denote the mass of the initial hadron
$H$ by $M$, and we shall treat the electron as being massless.
It is conventional to define
\[
  Q^2 \equiv -q^2 = 2 p \cdot p' = 2 E E' (1 - \cos\theta) \geq 0,
\]
where $E = p^0$ and $E' = p'^0$, since we assumed electrons are massless. We also let
\[
  \nu = p_H \cdot q.
\]
The final state has invariant mass at least that of the hadron, so
$p_X^2 = (p_H + q)^2 \geq M^2$. Expanding, $(p_H + q)^2 = M^2 + 2\nu - Q^2$, so this implies
\[
  Q^2 \leq 2\nu.
\]
For simplicity, we are going to consider the scattering in the rest frame of the
hadron. In this case, we simply have
\[
  \nu = M (E - E').
\]
We can again compute the amplitude, which is confusingly also called $\mathcal{M}$:
\[
  \mathcal{M} = (-ie)^2\, \bar{u}_e(p') \gamma^\mu u_e(p)\,
    \frac{-i g_{\mu\nu}}{q^2}\, \langle X | J_h^\nu | H(p_H) \rangle.
\]
Then we can write down the differential cross-section
\[
  d\sigma = \frac{1}{4 M E |v_e - v_H|}
    \frac{d^3 p'}{(2\pi)^3 2 p'^0}
    \sum_{X, p_X} (2\pi)^4 \delta^{(4)}(q + p_H - p_X)\,
    \frac{1}{2} \sum_{\mathrm{spins}} |\mathcal{M}|^2.
\]
Note that in the rest frame of the hadron, we simply have $|v_e - v_H| = 1$.
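We can’t actually compute this, since it involves non-perturbative physics. So we again have to
parametrize this. We can write
\[
  \frac{1}{2} \sum_{\mathrm{spins}} |\mathcal{M}|^2 = \frac{e^4}{2 q^4} L_{\mu\nu}\,
    \langle H(p_H) | J_h^\mu | X \rangle \langle X | J_h^\nu | H(p_H) \rangle,
\]
where
\[
  L_{\mu\nu} = \mathrm{Tr}(\slashed{p} \gamma_\mu \slashed{p}' \gamma_\nu)
    = 4 (p_\mu p'_\nu - g_{\mu\nu}\, p \cdot p' + p_\nu p'_\mu).
\]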
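We define another tensor
\[
  W_H^{\mu\nu} = \frac{1}{4\pi} \sum_X (2\pi)^4 \delta^{(4)}(q + p_H - p_X)\,
    \langle H | J_h^\mu | X \rangle \langle X | J_h^\nu | H \rangle.
\]
Note that this $\sum_X$ should also include the sum over the initial state spins. Then
we have
\[
  E' \frac{d\sigma}{d^3 p'} = \frac{1}{8 M E (2\pi)^3} \cdot 4\pi \cdot
    \frac{e^4}{2 q^4} L_{\mu\nu} W_H^{\mu\nu}.
\]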
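We now use our constraints on $W_H^{\mu\nu}$ such as Lorentz covariance, current conservation
and parity, and argue that $W_H^{\mu\nu}$ can be written in the form
\begin{multline*}
  W_H^{\mu\nu} = \left( -g^{\mu\nu} + \frac{q^\mu q^\nu}{q^2} \right) W_1(\nu, Q^2) \\
  + \left( p_H^\mu - \frac{p_H \cdot q}{q^2} q^\mu \right)
    \left( p_H^\nu - \frac{p_H \cdot q}{q^2} q^\nu \right) W_2(\nu, Q^2).
\end{multline*}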
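Now, using
\[
  q^\mu L_{\mu\nu} = q^\nu L_{\mu\nu} = 0,
\]
we have
\begin{align*}
  L_{\mu\nu} W_H^{\mu\nu}
    &= 4 (-2 p \cdot p' + 4 p \cdot p') W_1
     + 4 (2\, p \cdot p_H\; p' \cdot p_H - p_H^2\; p \cdot p') W_2 \\
    &= 4 Q^2 W_1 + 2 M^2 (4 E E' - Q^2) W_2.
\end{align*}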
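This contraction can be checked numerically by building both tensors from explicit four-vectors in the hadron rest frame and comparing with the closed form. This is a minimal sketch with hypothetical helper names; the values of $W_1$ and $W_2$ are arbitrary numbers, since the identity is purely algebraic.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g, signature (+, -, -, -)


def dot(a, b):
    """Minkowski product of two four-vectors with upper indices."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def g_upper(m, n):
    return METRIC[m] if m == n else 0.0


def lepton_tensor(p, p_out):
    """L^{mu nu} = 4 (p^mu p'^nu - g^{mu nu} p.p' + p^nu p'^mu)."""
    pp = dot(p, p_out)
    return [[4.0 * (p[m] * p_out[n] - g_upper(m, n) * pp + p[n] * p_out[m])
             for n in range(4)] for m in range(4)]


def hadron_tensor(p_h, q, w1, w2):
    """W^{mu nu} in the current-conserving form with scalars W_1, W_2."""
    q2 = dot(q, q)
    nu = dot(p_h, q)
    t = [p_h[m] - nu / q2 * q[m] for m in range(4)]
    return [[(-g_upper(m, n) + q[m] * q[n] / q2) * w1 + t[m] * t[n] * w2
             for n in range(4)] for m in range(4)]


def contract(a, b):
    """a_{mu nu} b^{mu nu}: lower both indices of a with the diagonal metric."""
    return sum(METRIC[m] * METRIC[n] * a[m][n] * b[m][n]
               for m in range(4) for n in range(4))


def check_contraction(e_in, e_out, theta, mass, w1, w2):
    """Return (numerical contraction, closed-form expression)."""
    p = (e_in, 0.0, 0.0, e_in)
    p_out = (e_out, e_out * math.sin(theta), 0.0, e_out * math.cos(theta))
    p_h = (mass, 0.0, 0.0, 0.0)
    q = tuple(a - b for a, b in zip(p, p_out))
    numeric = contract(lepton_tensor(p, p_out), hadron_tensor(p_h, q, w1, w2))
    big_q2 = -dot(q, q)
    closed = 4.0 * big_q2 * w1 + 2.0 * mass**2 * (4.0 * e_in * e_out - big_q2) * w2
    return numeric, closed


print(check_contraction(10.0, 4.0, 0.3, 0.938, 1.3, 0.4))
```

The $q^\mu q^\nu$ pieces of $W_H^{\mu\nu}$ drop out of the numerical contraction, which is the statement $q^\mu L_{\mu\nu} = 0$ in action.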
We now want to examine what happens as we take the energy $E \to \infty$. In
this case, for a generic collision, we have $Q^2 \to \infty$, which necessarily implies
$\nu \to \infty$. To understand how this behaves, it is helpful to introduce dimensionless
quantities
\[
  x = \frac{Q^2}{2\nu}, \quad y = \frac{\nu}{p_H \cdot p},
\]
known as the Bjorken $x$ and the inelasticity respectively. We can interpret $y$
as the fractional energy loss of the electron. Then it is not difficult to see that
$0 \leq x, y \leq 1$. So these are indeed bounded quantities. In the rest frame of the
hadron, we further have
\[
  y = \frac{\nu}{M E} = \frac{E - E'}{E}.
\]
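To make the kinematic variables concrete, here is a small sketch that computes $Q^2$, $\nu$, $x$ and $y$ in the hadron rest frame from the beam energy, the scattered energy and the scattering angle. The class and function names are hypothetical, and the sample numbers (a 10 GeV electron on a proton of mass 0.938 GeV) are chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DisKinematics:
    q_squared: float  # Q^2 = -q^2
    nu: float         # p_H . q
    x: float          # Bjorken x
    y: float          # inelasticity


def dis_kinematics(beam_energy, scattered_energy, theta, hadron_mass):
    """Kinematic invariants in the hadron rest frame, massless electron."""
    q_squared = 2.0 * beam_energy * scattered_energy * (1.0 - math.cos(theta))
    nu = hadron_mass * (beam_energy - scattered_energy)
    x = q_squared / (2.0 * nu)
    y = nu / (hadron_mass * beam_energy)
    return DisKinematics(q_squared, nu, x, y)


kin = dis_kinematics(beam_energy=10.0, scattered_energy=4.0, theta=0.3, hadron_mass=0.938)
print(kin)
# Invariant mass squared of the hadronic final state X.
print(0.938**2 + 2.0 * kin.nu - kin.q_squared)
```

Not every choice of $(E, E', \theta)$ is physical: combinations that give $x > 1$ would require $p_X^2 < M^2$, so they cannot occur.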
This allows us to write $L_{\mu\nu} W_H^{\mu\nu}$ as
\[
  L_{\mu\nu} W_H^{\mu\nu} \approx 8 E M \left( x y W_1 + \frac{(1 - y)}{y} \nu W_2 \right),
\]
where we dropped the $-2 M^2 Q^2 W_2$ term, which is of lower order.
To understand the cross section, we need to simplify $d^3 p'$. We integrate out
the angular $\varphi$ coordinate to obtain
\[
  d^3 p' \mapsto 2\pi E'^2\, d(\cos\theta)\, dE'.
\]
We also note that by definition of $Q, x, y$, we have
\begin{align*}
  dx = d\!\left( \frac{Q^2}{2\nu} \right) &= -\frac{E E'}{\nu}\, d\cos\theta + (\cdots)\, dE' \\
  dy &= -\frac{dE'}{E}.
\end{align*}
Since $(dE')^2 = 0$, the $d^3 p'$ part becomes
\[
  d^3 p' \mapsto \pi E'\, dQ^2\, dy = 2\pi E' \nu\, dx\, dy.
\]
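The change of variables can be made explicit. In the hadron rest frame $y$ depends only on $E'$, so the Jacobian is triangular. As a sketch, using $x = Q^2/2\nu$, $Q^2 = 2EE'(1 - \cos\theta)$ and $\nu = M(E - E')$:

```latex
\begin{align*}
\left| \frac{\partial(x, y)}{\partial(\cos\theta, E')} \right|
  &= \left| \frac{\partial x}{\partial \cos\theta} \right|
     \left| \frac{\partial y}{\partial E'} \right|
   = \frac{E E'}{\nu} \cdot \frac{1}{E}
   = \frac{E'}{\nu}, \\
2\pi E'^2\, d(\cos\theta)\, dE'
  &= 2\pi E'^2 \cdot \frac{\nu}{E'}\, dx\, dy
   = 2\pi E' \nu\, dx\, dy .
\end{align*}
```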
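Then we have
\begin{align*}
  \frac{d\sigma}{dx\, dy}
    &= \frac{1}{8 (2\pi)^2} \frac{1}{E M} \frac{e^4}{q^4}\, 2\pi\nu \cdot
       8 E M \left( x y W_1 + \frac{(1 - y)}{y} \nu W_2 \right) \\
    &= \frac{8\pi\alpha^2 M E}{Q^4} \left( x y^2 F_1 + (1 - y) F_2 \right),
\end{align*}
where
\[
  F_1 \equiv W_1, \quad F_2 \equiv \nu W_2.
\]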
By varying $x$ and $y$ in experiments, we can figure out the values of $F_1$ and $F_2$.
Moreover, we expect if we do other sorts of experiments that also involve hadrons,
then the same $F_1$ and $F_2$ will appear. So if we do other sorts of experiments and
measure the same $F_1$ and $F_2$, we can increase our confidence that our theory is
correct.
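Without doing more experiments, can we try to figure out something about
$F_1$ and $F_2$? We make a simplifying assumption that the electron interacts with
only a single constituent of the hadron:
[Feynman diagram: the electron $e^-$ scatters from momentum $p$ to $p'$ through an angle $\theta$, emitting a photon of momentum $q$ that is absorbed by a single constituent of momentum $k$ inside the hadron $H$; the struck constituent leaves with momentum $k + q$, and the rest of the hadron forms the state $X'$.]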
We further suppose that the EM interaction is unaffected by strong interactions.
This is known as factorization. This leading order model we have
constructed is known as the parton model. This was the model used before we
believed in QCD. Nowadays, since we do have QCD, we know these “partons” are
actually quarks, and we can use QCD to make some more accurate predictions.
We let $f$ range over all partons. Then we can break up the sum $\sum_X$ as
\[
  \sum_X = \sum_{X'} \sum_f \frac{1}{(2\pi)^3}
    \int d^4 k\; \Theta(k^0)\, \delta(k^2) \sum_{\mathrm{spins}},
\]
where we put $\delta(k^2)$ because we assume that partons are massless.
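To save time (and avoid unpleasantness), we are not going to go through the
details of the calculations. The result is that
\[
  W_H^{\mu\nu} = \sum_f \int d^4 k\; \mathrm{Tr}\left(
    W_f^{\mu\nu}\, \Gamma_{H,f}(p_H, k)
    + \bar{W}_f^{\mu\nu}\, \bar{\Gamma}_{H,f}(p_H, k) \right),
\]
where
\begin{align*}
  W_f^{\mu\nu} = \bar{W}_f^{\mu\nu}
    &= \frac{1}{2} Q_f^2\, \gamma^\mu (\slashed{k} + \slashed{q}) \gamma^\nu\,
       \delta((k + q)^2) \\
  \Gamma_{H,f}(p_H, k)_{\beta\alpha}
    &= \sum_{X'} \delta^{(4)}(p_H - k - p'_X)\,
       \langle H(p_H) | \bar{q}_f | X' \rangle \langle X' | q_f | H(p_H) \rangle,
\end{align*}
where $\alpha, \beta$ are spinor indices.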
In the deep inelastic scattering limit, putting everything together, we find
\begin{align*}
  F_1(x, Q^2) &\sim \frac{1}{2} \sum_f Q_f^2\, [q_f(x) + \bar{q}_f(x)] \\
  F_2(x, Q^2) &\sim 2 x F_1,
\end{align*}
for some functions $q_f(x)$, $\bar{q}_f(x)$. These functions are known as the parton
distribution functions (PDFs). Roughly, $q_f(x)$ is the distribution of partons of
type $f$ carrying a fraction $x$ of the hadron's longitudinal momentum.
The very simple relation between $F_2$ and $F_1$ is called the Callan–Gross
relation, which suggests the partons are spin $\frac{1}{2}$. This relation between $F_2$ and
$F_1$ is certainly something we can test in experiments, and indeed it is observed.
We also see the Bjorken scaling phenomenon: $F_1$ and $F_2$ are independent of
$Q^2$. This boils down to the fact that we are scattering with point-like particles.
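To see how the parton model feeds into the cross section, the sketch below builds $F_1$ and $F_2$ from parton distributions and evaluates $d\sigma/dx\,dy$. The distributions in `TOY_PARTONS` are made-up shapes chosen purely for illustration and are not real PDFs; the value of $\alpha$ and all function names are likewise assumptions of this example.

```python
import math

ALPHA_EM = 1 / 137.036  # fine-structure constant (assumption)

# Hypothetical toy content: flavour -> (charge, q_f(x), qbar_f(x)). Not real PDFs.
TOY_PARTONS = {
    "u": (2.0 / 3.0, lambda x: 2.0 * (1.0 - x) ** 3, lambda x: 0.0),
    "d": (-1.0 / 3.0, lambda x: 1.0 * (1.0 - x) ** 3, lambda x: 0.0),
}


def structure_functions(x, partons=TOY_PARTONS):
    """Leading-order parton model: F_1 from the PDFs, F_2 = 2 x F_1."""
    f1 = 0.5 * sum(charge**2 * (q(x) + qbar(x))
                   for charge, q, qbar in partons.values())
    f2 = 2.0 * x * f1  # Callan-Gross relation
    return f1, f2


def d_sigma_dx_dy(x, y, hadron_mass, beam_energy, f1, f2, alpha=ALPHA_EM):
    """8 pi alpha^2 M E / Q^4 * (x y^2 F_1 + (1 - y) F_2), in natural units."""
    q_squared = 2.0 * x * y * hadron_mass * beam_energy  # Q^2 = 2 nu x, nu = y M E
    prefactor = 8.0 * math.pi * alpha**2 * hadron_mass * beam_energy / q_squared**2
    return prefactor * (x * y**2 * f1 + (1.0 - y) * f2)


f1, f2 = structure_functions(0.3)
print(f1, f2, d_sigma_dx_dy(0.3, 0.5, 0.938, 100.0, f1, f2))
```

With the Callan–Gross relation imposed, the bracket collapses to $(1 - y + \frac{1}{2} y^2) F_2$, so the $y$-dependence of the cross section at fixed $x$ is what tests the spin-$\frac{1}{2}$ nature of the partons.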