6 Non-abelian gauge theory
III Advanced Quantum Field Theory
6.4 Faddeev–Popov ghosts
To understand how to do this, we will first consider a particular finite-dimensional
example. Suppose we have a field $(x, y) : \{\mathrm{pt}\} \to \mathbb{R}^2$ on a zero-dimensional
universe and an action $S[x, y]$. For simplicity, we will often ignore the existence
of the origin in $\mathbb{R}^2$.
The partition function we are interested in is
\[
  \int_{\mathbb{R}^2} dx\, dy\; e^{-S[x,y]}.
\]
Suppose the action is rotationally invariant. Then we can write the integral as
\[
  \int_{\mathbb{R}^2} dx\, dy\; e^{-S[x,y]}
  = \int_0^{2\pi} d\theta \int_0^\infty r\, dr\; e^{-S[r]}
  = 2\pi \int_0^\infty r\, dr\; e^{-S[r]}.
\]
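As a quick numerical sanity check of this polar-coordinate identity, the following sketch compares a brute-force Cartesian sum with $2\pi$ times the radial integral. The choice $S = r^2/2$ is an assumption made only for this illustration (the text leaves $S$ general), and the function names are hypothetical. For this choice both sides should come out close to $2\pi$.

```python
import math

def action(r):
    # Illustrative rotationally invariant action (an assumption): S = r^2 / 2.
    return 0.5 * r * r

def cartesian_integral(half_width=6.0, n=240):
    # Midpoint rule for the integral of exp(-S) over the square [-L, L]^2.
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            total += math.exp(-action(math.hypot(x, y)))
    return total * h * h

def radial_integral(r_max=8.0, n=4000):
    # Midpoint rule for the integral of r * exp(-S(r)) over [0, r_max].
    h = r_max / n
    return sum((k + 0.5) * h * math.exp(-action((k + 0.5) * h)) for k in range(n)) * h

lhs = cartesian_integral()
rhs = 2.0 * math.pi * radial_integral()
print(lhs, rhs)  # both should be close to 2*pi
```

The cutoffs `half_width` and `r_max` are chosen so that the neglected Gaussian tails are far below the discretisation error.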
We can try to formulate this result in more abstract terms. Our space $\mathbb{R}^2$ of
fields has an action of the group $SO(2)$ by rotation. The quotient/orbit space of
this action is
\[
  \frac{\mathbb{R}^2 \setminus \{0\}}{SO(2)} \cong \mathbb{R}_{>0}.
\]
What we have done is replace the integral over the whole of $\mathbb{R}^2$, namely the
$(x, y)$ integral, with an integral over the orbit space $\mathbb{R}_{>0}$, namely
the $r$ integral. There are two particularly important things to notice:
– The measure on $\mathbb{R}_{>0}$ is not the “obvious” measure $dr$, but $r\, dr$. In general,
we have to do some work to figure out what the correct measure is.
– We had a factor of $2\pi$ sticking out at the front, which corresponds to the
“volume” of the group $SO(2)$.
In general, how do we figure out what the quotient space $\mathbb{R}_{>0}$ is, and how do we
find the right measure to integrate against? The idea, as always, is to “pick a gauge”.
We do so via a gauge fixing function. We specify a function $f : \mathbb{R}^2 \to \mathbb{R}$, and
then our gauge condition will be $f(x) = 0$. In other words, the “space of gauge
orbits” will be
\[
  C = \{x \in \mathbb{R}^2 : f(x) = 0\}.
\]
[Figure: the curve $C$ in the plane, labelled $f(x) = 0$.]
For this to work out well, we need the following two conditions:
(i) For each $x \in \mathbb{R}^2$, there exists some $R \in SO(2)$ such that $f(Rx) = 0$.
(ii) $f$ is non-degenerate. Technically, we require that for any $x$ such that
$f(x) = 0$, we have
\[
  \Delta_f(x) = \left.\frac{\partial}{\partial\theta} f(R_\theta(x))\right|_{\theta=0} \neq 0,
\]
where $R_\theta$ is rotation by $\theta$.
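To make the non-degeneracy condition concrete, here is a minimal sketch that estimates $\Delta_f$ by a central finite difference. The helper names, the step size and the two sample functions are assumptions introduced for illustration; the sign convention chosen for $R_\theta$ is also an assumption, and only $|\Delta_f|$ is insensitive to it. The first function has the $x$-axis as its zero set and is non-degenerate away from the origin; the second has a whole orbit (the unit circle) as its zero set, so it is as degenerate as possible.

```python
import math

def rotate(theta, x, y):
    # Rotation convention (an assumption): y -> y cos(theta) - x sin(theta).
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + y * s, -x * s + y * c)

def delta_f(f, x, y, h=1e-6):
    # Central-difference estimate of d/dtheta f(R_theta(x, y)) at theta = 0.
    return (f(*rotate(h, x, y)) - f(*rotate(-h, x, y))) / (2.0 * h)

def f_axis(x, y):
    # Zero set: the x-axis.
    return y

def f_circle(x, y):
    # Zero set: the unit circle, which is an entire gauge orbit.
    return x * x + y * y - 1.0

d_axis = delta_f(f_axis, 2.0, 0.0)      # analytically -x = -2 in this convention
d_circle = delta_f(f_circle, 1.0, 0.0)  # analytically 0, so f_circle is degenerate
print(d_axis, d_circle)
```

Any rotationally invariant $f$ fails condition (ii) in the same way, since $f(R_\theta x)$ is then independent of $\theta$.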
The first condition is an obvious requirement: our function $f$ does pick out a
representative for each gauge orbit. The second condition is technical, but also
crucial. We want the curve to pick out a unique point in each gauge orbit. This
prevents a gauge orbit from looking like
[Figure: a dashed circle (a gauge orbit) crossing a curved line.]
where the dashed circle intersects the curved line three times. For any curve that
looks like this, we would have a vanishing $\Delta_f(x)$ at the turning points of the
curve. The non-degeneracy condition forces the curve to always move radially
outwards, so that we pick out a good gauge representative.
It is important to note that the non-degeneracy condition in general does not
guarantee that each gauge orbit has a unique representative. In fact, it forces
each gauge orbit to have at least two representatives. Indeed, if we consider a
simple gauge fixing function $f(x, y) = x$, then this is non-degenerate, but the
zero set looks like
[Figure: the zero set $x = 0$, a straight line through the origin that meets each circular orbit twice.]
It is an easy exercise with the intermediate value theorem to show that there
must be at least two representatives in each gauge orbit (one will need to use
the non-degeneracy condition).
This is not a particularly huge problem, since we are just double counting
each gauge orbit, and we know how to divide by 2. Let’s stick with this and
move on.
To integrate over the gauge orbit, it is natural to try the integral
\[
  \int_{\mathbb{R}^2} dx\, dy\; \delta(f(x))\, e^{-S(x,y)}.
\]
Then the δ-function restricts us to the curve C, known as the gauge slice.
However, this has a problem. We would like our result not to depend on
how we choose our gauge. But this integral depends not only on the curve $C$;
it in fact depends on $f$ as well!
To see this, we can simply replace $f$ by $cf$ for some non-zero constant $c \in \mathbb{R}$. Then
\[
  \delta(f(x)) \mapsto \delta(cf(x)) = \frac{1}{|c|}\, \delta(f(x)).
\]
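The scaling rule for the $\delta$-function used here follows from a change of variables. As a short check, take a test function $g$ and a constant $c \neq 0$ (the symbols $g$, $u$ and $v$ are introduced only for this illustration) and substitute $v = cu$:

```latex
\[
  \int_{-\infty}^{\infty} du\; \delta(cu)\, g(u)
  = \int_{-\infty}^{\infty} \frac{dv}{|c|}\; \delta(v)\, g\!\left(\frac{v}{c}\right)
  = \frac{1}{|c|}\, g(0).
\]
```

The absolute value appears because for $c < 0$ the substitution reverses the limits of integration.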
So our integral changes. It turns out the trick is to include the factor of $\Delta_f$ we
previously defined. Consider the new integral
\[
  \int_{\mathbb{R}^2} dx\, dy\; \delta(f(x))\, |\Delta_f(x)|\, e^{-S(x)}. \tag{$*$}
\]
To analyze how this depends on $f$, we pretend that the zero set $C$ of $f$ actually
looks like this:
[Figure: a zero set with a single branch, meeting each gauge orbit once.]
rather than
[Figure: a zero set with two branches, meeting each gauge orbit twice.]
Of course, we know the former is impossible, and the zero set must look like the
latter. However, the value of the integral $(*)$ depends only on how $f$ behaves
locally near the zero set, and so we may analyze each “branch” separately,
pretending it looks like the former. This will make it much easier to say what
we want to say.
Theorem. The integral (∗) is independent of the choice of f and C.
Proof. We first note that if we replace $f(x)$ by $c(r) f(x)$ for some $c > 0$, then we
have
\[
  \delta(cf) = \frac{1}{|c|}\, \delta(f), \qquad |\Delta_{cf}(x)| = c(r)\, |\Delta_f(x)|,
\]
and so the integral does not change.
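The reason the factor $c(r)$ passes through the $\theta$-derivative is that rotations preserve the radius. Spelled out as a short sketch, writing $|R_\theta x| = r$ for the rotation-invariant radius:

```latex
\[
  \Delta_{cf}(x)
  = \left.\frac{\partial}{\partial\theta}
      \Big[ c\big(|R_\theta x|\big)\, f(R_\theta x) \Big]\right|_{\theta=0}
  = c(r) \left.\frac{\partial}{\partial\theta} f(R_\theta x)\right|_{\theta=0}
  = c(r)\, \Delta_f(x),
\]
\[
  \delta(cf)\, |\Delta_{cf}|
  = \frac{1}{c(r)}\, \delta(f) \cdot c(r)\, |\Delta_f|
  = \delta(f)\, |\Delta_f| .
\]
```

The two factors of $c(r)$ cancel pointwise, so the integrand itself is unchanged and not just the integral.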
Next, suppose we replace $f$ with some $\tilde f$, but they have the same zero set.
Now notice that $\delta(f)$ and $|\Delta_f|$ depend only on the first-order behaviour of $f$ at
$C$. In particular, they depend only on $\frac{\partial f}{\partial\theta}$ on $C$. So for all practical purposes,
changing $f$ to $\tilde f$ is equivalent to multiplying $f$ by the ratio of their derivatives.
So changing the function $f$ while keeping $C$ fixed does not affect the value of $(*)$.
Finally, suppose we have two arbitrary $f$ and $\tilde f$, with potentially different
zero sets. Now for each value of $r$, we pick a rotation $R_{\theta(r)} \in SO(2)$ such that
\[
  \tilde f(x) \propto f(R_{\theta(r)} x).
\]
By the previous part, we can rescale $f$ or $\tilde f$, and assume we in fact have equality.
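We let $x' = R_{\theta(r)} x$. Now since the action only depends on the radius, it
in particular is invariant under the action of $R_{\theta(r)}$. The measure $dx\, dy$ is also
invariant, which is particularly clear if we write it as $d\theta\, r\, dr$ instead. Then we
have
\[
\begin{aligned}
  \int_{\mathbb{R}^2} dx\, dy\; \delta(f(x))\, |\Delta_f(x)|\, e^{-S(x)}
  &= \int_{\mathbb{R}^2} dx'\, dy'\; \delta(f(x'))\, |\Delta_f(x')|\, e^{-S(x')} \\
  &= \int_{\mathbb{R}^2} dx'\, dy'\; \delta(\tilde f(x))\, |\Delta_{\tilde f}(x)|\, e^{-S(x')} \\
  &= \int_{\mathbb{R}^2} dx\, dy\; \delta(\tilde f(x))\, |\Delta_{\tilde f}(x)|\, e^{-S(x)}.
\end{aligned}
\]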
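The independence of $(*)$ from the gauge fixing function can also be probed numerically. The sketch below is only an illustration under several assumptions: the action is taken to be $S = (x^2 + y^2)/2$, the gauge fixing functions are linear, $f(x, y) = ax + by$, and the $\delta$-function is replaced by a narrow Gaussian of width `EPS`. All names are hypothetical. With the factor $|\Delta_f|$ every choice should give a value close to $2$; without it, the value changes with $f$.

```python
import math

EPS = 0.05  # width of the Gaussian standing in for the delta function

def smeared_delta(u):
    # Narrow normalised Gaussian; tends to delta(u) as EPS -> 0.
    return math.exp(-0.5 * (u / EPS) ** 2) / (EPS * math.sqrt(2.0 * math.pi))

def gauge_fixed_integral(a, b, with_jacobian=True, half_width=5.0, n=400):
    # Midpoint rule for the integral of delta(f) |Delta_f| exp(-S) over the plane,
    # with f(x, y) = a*x + b*y and the illustrative action S = (x^2 + y^2) / 2.
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            # For linear f, d/dtheta f(R_theta x) at theta = 0 is +/-(b*x - a*y);
            # the sign depends on the rotation convention and drops out of |.|.
            jac = abs(b * x - a * y) if with_jacobian else 1.0
            total += smeared_delta(a * x + b * y) * jac * math.exp(-0.5 * (x * x + y * y))
    return total * h * h

results = {}
for a, b in [(0.0, 1.0), (0.0, 2.0), (1.0, 1.0)]:
    results[(a, b)] = (gauge_fixed_integral(a, b),
                       gauge_fixed_integral(a, b, with_jacobian=False))
    print((a, b), results[(a, b)])
```

The three choices cover the unscaled $x$-axis gauge, a rescaled $f$ with the same zero set, and a rotated zero set, matching the three steps of the proof. The small residual deviation from $2$ comes from the finite width `EPS`.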
Example. We choose $C$ to be the $x$-axis with $f(x, y) = y$. Then under a rotation
\[
  f(x, y) = y \mapsto y \cos\theta - x \sin\theta,
\]
we have
\[
  \Delta_f(x, y) = -x.
\]
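So we have
\[
\begin{aligned}
  \int_{\mathbb{R}^2} dx\, dy\; \delta(f)\, |\Delta_f(x, y)|\, e^{-S(x,y)}
  &= \int_{\mathbb{R}^2} dx\, dy\; \delta(y)\, |x|\, e^{-S(x,y)} \\
  &= \int_{-\infty}^{\infty} dx\; |x|\, e^{-S(x,0)} \\
  &= 2 \int_0^\infty d|x|\; |x|\, e^{-S(|x|)} \\
  &= 2 \int_0^\infty r\, dr\; e^{-S(r)}.
\end{aligned}
\]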
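For a concrete instance of this computation, one can take the illustrative action $S(r) = r^2/2$ (an assumption; the text leaves $S$ general), for which every integral can be done in closed form:

```latex
\[
  \int_{-\infty}^{\infty} dx\; |x|\, e^{-x^2/2}
  = 2 \int_0^\infty r\, e^{-r^2/2}\, dr
  = 2 \Big[ -e^{-r^2/2} \Big]_0^\infty
  = 2,
  \qquad
  \int_0^\infty r\, e^{-r^2/2}\, dr = 1 .
\]
```

So in this case the gauge-fixed integral is exactly twice the radial integral, as the general calculation predicts.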
So this gives us back the original integral
\[
  \int_0^\infty r\, dr\; e^{-S(r)}
\]
over the space of gauge orbits that we wanted, except for the factor of 2. As we mentioned
previously, this is because our gauge fixing condition actually specifies two points
on each gauge orbit, instead of one. This is known as the Gribov ambiguity.
When we do perturbation theory later on, we will not be sensitive to this
global factor, because in perturbation theory, we only try to understand the
curve locally, and the choice of gauge is locally unique.
The advantage of looking at the integral
\[
  \int_{\mathbb{R}^2} dx\, dy\; \delta(f)\, |\Delta_f|\, e^{-S(x,y)}
\]
is that it only refers to functions and measures on the full space $\mathbb{R}^2$, which we
understand well.
More generally, suppose we have a (well-understood) space $X$ with a measure
$d\mu$. We then have a Lie group $G$ acting on $X$. Suppose locally (near the identity),
we can parametrize elements of $G$ by parameters $\theta^a$ for $a = 1, \cdots, \dim G$. We
write $R_\theta$ for the corresponding element of $G$ (technically, we are passing on to
the Lie algebra level).
To do gauge fixing, we now need many gauge fixing functions, say $f^a$, again
with $a = 1, \cdots, \dim G$. We then let
\[
  \Delta_f = \det \left( \left.\frac{\partial f^a(R_\theta x)}{\partial \theta^b}\right|_{\theta=0} \right).
\]
This is known as the Faddeev–Popov determinant.
Then if we have a function $e^{-S[x]}$ that is invariant under the action of $G$,
then to integrate over the gauge orbits, we integrate
\[
  \int_X d\mu\; |\Delta_f| \prod_{a=1}^{\dim G} \delta(f^a(x))\; e^{-S[x]}.
\]
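As a small worked instance of this general recipe with a different group, here is a sketch that is not taken from the text: let $X = \mathbb{R}^2$ with $d\mu = dx\, dy$, let $G = \mathbb{R}$ act by translations $R_\theta(x, y) = (x + \theta, y)$, and let the action depend on $y$ only. The function $g$ below is an arbitrary smooth function introduced for illustration.

```latex
\[
  f(x, y) = x - g(y), \qquad
  \Delta_f = \left.\frac{\partial}{\partial\theta}\big( x + \theta - g(y) \big)\right|_{\theta=0} = 1,
\]
\[
  \int_{\mathbb{R}^2} dx\, dy\; |\Delta_f|\, \delta\big(x - g(y)\big)\, e^{-S(y)}
  = \int_{-\infty}^{\infty} dy\; e^{-S(y)} .
\]
```

The result does not depend on $g$, and here each orbit (a horizontal line) meets the gauge slice exactly once, so no double counting arises. The ungauged integral would diverge, reflecting the infinite "volume" of the translation group.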
Now in Yang–Mills, our spaces and groups are no longer finite-dimensional, and
nothing makes sense. Nevertheless, we can manipulate expressions formally. Suppose we
have some gauge fixing condition $f$. Then the expression we want is
\[
  Z = \int_{\mathcal{A}/\mathcal{G}} D\mu\; e^{-S_{YM}}
    = \int_{\mathcal{A}} DA\; \delta[f]\, |\Delta_f(A)|\, e^{-S_{YM}[A]}.
\]
Suppose the gauge group is $G$, with Lie algebra $\mathfrak{g}$. We will assume the gauge
fixing condition is pointwise, i.e. we have functions $f^a : \mathfrak{g} \to \mathfrak{g}$, and the gauge
fixing condition is
\[
  f(A(x)) = 0 \quad \text{for all } x \in M.
\]
Then writing $n = \dim \mathfrak{g}$, we can write
\[
  \delta[f] = \prod_{x \in M} \delta^{(n)}(f(A(x))).
\]
We don’t really know what to do with these formal expressions. Our strategy is
to write this integral in terms of more “usual” path integrals, and then we can
use our usual tools of perturbation theory to evaluate the integral.
We first take care of the $\delta[f]$ term. We introduce a new bosonic field
$h \in \Omega^0_M(\mathfrak{g})$, and then we can write
\[
  \delta[f] = \int Dh\; \exp\left( -\int i h^a(x) f^a(A(x))\, d^d x \right).
\]
This $h$ acts as a “Lagrange multiplier” that enforces the condition $f(A(x)) = 0$,
and we can justify this by comparing to the familiar result that, up to factors of $2\pi$,
\[
  \int e^{i p \cdot x}\, dp = \delta(x).
\]
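For comparison, the finite-dimensional version of this Lagrange-multiplier representation, with the factors of $2\pi$ restored, can be written as follows (a sketch; $h^a$ here is an ordinary vector in $\mathbb{R}^n$ rather than a field):

```latex
\[
  \delta^{(n)}(f)
  = \int_{\mathbb{R}^n} \frac{d^n h}{(2\pi)^n}\; \exp\left( -\, i\, h^a f^a \right).
\]
```

The sign in the exponent is immaterial, since it can be flipped by $h \mapsto -h$. In the path integral one such factor appears for every point of $M$, and the constants $(2\pi)^{-n}$ are formally absorbed into the definition of the measure $Dh$.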
To take care of the determinant term, we want to have
\[
  \Delta_f = \det \frac{\delta f^a[A^\lambda(x)]}{\delta \lambda^b(y)},
\]
where $\lambda^a(y)$ are our gauge parameters, and $A^\lambda$ is the gauge-transformed field.
Now recall that for a finite-dimensional $n \times n$ matrix $M$, we have
\[
  \det(M) = \int d^n c\; d^n \bar c\; e^{\bar c M c},
\]
where $c, \bar c$ are $n$-dimensional fermionic (anticommuting) variables. Thus, in our case, we can write the
Faddeev–Popov determinant as the path integral
\[
  \Delta_f = \int Dc\, D\bar c\; \exp\left( \int_{M \times M} d^d x\, d^d y\;
    \bar c_a(x)\, \frac{\delta f^a(A^\lambda(x))}{\delta \lambda^b(y)}\, c^b(y) \right),
\]
where $c, \bar c$ are fermionic scalars, again valued in $\mathfrak{g}$ under the adjoint action. Since
we assumed that $f^a$ is local, i.e. $f^a(A)(x)$ is a function of $A(x)$ and its derivative
at $x$ only, we can simply write this as
\[
  \Delta_f = \int Dc\, D\bar c\; \exp\left( \int_M d^d x\;
    \bar c_a(x)\, \frac{\delta f^a(A^\lambda)}{\delta \lambda^b}(x)\, c^b(x) \right).
\]
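The finite-dimensional identity $\det(M) = \int d^n c\, d^n \bar c\, e^{\bar c M c}$ that underlies this construction can be checked with a tiny Grassmann-algebra implementation. The sketch below rests on stated assumptions: algebra elements are stored as dictionaries from sorted tuples of generator indices to coefficients, and the Berezin integral is normalised to be the coefficient of $\bar c_1 c_1 \cdots \bar c_n c_n$. Other ordering conventions for the measure can change the overall sign. All function names are hypothetical.

```python
def gmul(a, b):
    # Product of two Grassmann-algebra elements. Each element is a dict mapping
    # a sorted tuple of distinct generator indices to a numeric coefficient.
    out = {}
    for ma, ca in a.items():
        for mb, cb in b.items():
            if set(ma) & set(mb):
                continue  # a repeated generator squares to zero
            merged = ma + mb
            # Sign of the permutation that sorts the concatenated monomial.
            inversions = sum(1 for i in range(len(merged))
                             for j in range(i + 1, len(merged))
                             if merged[i] > merged[j])
            key = tuple(sorted(merged))
            sign = -1.0 if inversions % 2 else 1.0
            out[key] = out.get(key, 0.0) + sign * ca * cb
    return out

def berezin_det(M):
    # Generator 2*i stands for cbar_i, generator 2*i + 1 stands for c_i.
    n = len(M)
    bilinear = {}
    for i in range(n):
        for j in range(n):
            for key, v in gmul({(2 * i,): 1.0}, {(2 * j + 1,): 1.0}).items():
                bilinear[key] = bilinear.get(key, 0.0) + M[i][j] * v
    # exp(cbar M c) as a power series; it terminates at order n.
    result = {(): 1.0}
    power = {(): 1.0}
    factorial = 1.0
    for k in range(1, n + 1):
        power = gmul(power, bilinear)
        factorial *= k
        for key, v in power.items():
            result[key] = result.get(key, 0.0) + v / factorial
    # Berezin integral = coefficient of cbar_1 c_1 ... cbar_n c_n.
    return result.get(tuple(range(2 * n)), 0.0)

M = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
print(berezin_det(M))  # the ordinary determinant of this matrix is 18
```

Only the top-degree term of the exponential survives the integration, and the anticommutation signs collected in `gmul` reproduce the signs of the permutations in the Leibniz formula for the determinant.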
The fermionic fields $c$ and $\bar c$ are known as ghosts and anti-ghosts respectively.
We might find these a bit strange, since they are spin 0 fermionic fields, which
violates the spin-statistics theorem. Indeed, if we do canonical quantization with this,
then we find that we do not get a Hilbert space, as we get states with negative
norm! Fortunately, there is a subspace of gauge invariant states that do not
involve the ghosts, and the inner product is positive definite in this subspace.
When we focus on these states, and on operators that do not involve ghosts,
then we are left with a good, unitary theory. These $c$ and $\bar c$ are not “genuine”
fields; they are just there to get rid of the extra fields we do not want. The
“physically meaningful” theory is supposed to happen in $\mathcal{A}/\mathcal{G}$, where no ghosts
exist.
With all factors included, the full action is given by
\[
  S[A, \bar c, c, h] = \int d^d x \left(
    \frac{1}{4 g_{YM}^2} F^a_{\mu\nu} F^{a,\mu\nu}
    + i h^a f^a(A)
    - \bar c_a\, \frac{\delta f^a(A^\lambda)}{\delta \lambda^b}\, c^b \right),
\]
and the path integral is given by
\[
  Z = \int DA\, Dc\, D\bar c\, Dh\; \exp\left( -S[A, \bar c, c, h] \right).
\]
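Example. We often pick Lorenz gauge $f^a(A) = \partial^\mu A^a_\mu$. Under a gauge
transformation, we have $A \mapsto A^\lambda = A + \nabla\lambda$. More explicitly, we have
\[
  (A^\lambda)^a_\mu = A^a_\mu + \partial_\mu \lambda^a + f^a{}_{bc} A^b_\mu \lambda^c,
\]
where $f^a{}_{bc}$ are the structure constants of $\mathfrak{g}$ (not to be confused with the gauge fixing functions $f^a$).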
So the matrix appearing in the Faddeev–Popov determinant is
\[
  \frac{\delta f^a(A^\lambda)}{\delta \lambda^b} = \partial^\mu \nabla_\mu.
\]
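The intermediate step behind this formula can be spelled out as follows. This is a sketch using the infinitesimal gauge transformation quoted in the example, with $f^a{}_{bc}$ denoting the structure constants and $f^a(A)$ the gauge fixing function:

```latex
\begin{align*}
  f^a(A^\lambda)
  &= \partial^\mu (A^\lambda)^a_\mu
   = \partial^\mu A^a_\mu
     + \partial^\mu \left( \partial_\mu \lambda^a + f^a{}_{bc} A^b_\mu \lambda^c \right)
   = \partial^\mu A^a_\mu + \partial^\mu (\nabla_\mu \lambda)^a, \\
  (\nabla_\mu \lambda)^a
  &= \partial_\mu \lambda^a + f^a{}_{bc} A^b_\mu \lambda^c .
\end{align*}
```

Since $f^a(A^\lambda)$ is linear in $\lambda$ at this order, differentiating with respect to $\lambda^b$ simply strips off $\lambda$ and leaves the operator $\partial^\mu \nabla_\mu$ acting on the adjoint index. The dependence on the gauge field enters only through the structure-constant term in $\nabla_\mu$.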
Thus, the full Yang–Mills action is given by
\[
  S[A, \bar c, c, h] = \int d^d x \left(
    \frac{1}{4 g_{YM}^2} F^a_{\mu\nu} F^{a,\mu\nu}
    + i h^a\, \partial^\mu A^a_\mu
    - \bar c_a\, \partial^\mu (\nabla_\mu c)^a \right).
\]
Why do we have to do this? Why didn’t we have to bother when we did
electrodynamics?
If we did this for an abelian gauge theory, then the structure constants would be
zero. Consequently, the $f^a{}_{bc}$ terms are all absent, so the ghost kinetic operator
does not involve the gauge field. The path integral over the ghosts is then
completely independent of the gauge field, and so as long as we work in a
fixed gauge, we can ignore the ghosts. However, in a non-abelian gauge theory,
this is not true. We cannot just impose a gauge and get away with it. We need
to put back the Jacobian to take into account the change of variables, and this
is where the ghosts come in.
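The decoupling in the abelian case can be made explicit with a formal sketch. Here the covariant derivative acting on the ghost is taken to reduce to $\partial_\mu$ because the structure constants vanish, and $g$ is a hypothetical label for the abelian coupling:

```latex
\[
  \int Dc\, D\bar c\; \exp\left( \int d^d x\; \bar c\, \partial^\mu \partial_\mu\, c \right)
  = \det\left( \partial^\mu \partial_\mu \right),
\]
\[
  Z = \det\left( \partial^\mu \partial_\mu \right)
      \int DA\, Dh\; \exp\left( - \int d^d x \left[
        \frac{1}{4 g^2} F_{\mu\nu} F^{\mu\nu} + i h\, \partial^\mu A_\mu \right] \right).
\]
```

The determinant is a field-independent constant, so it drops out of all normalised correlation functions. In the non-abelian case the operator is $\partial^\mu \nabla_\mu$, which depends on $A$, so the determinant cannot be pulled out of the $A$ integral.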
The benefit of doing all this work is that it now looks very familiar. It seems
like something we can tackle using Feynman rules with perturbation theory.