Part IV — Topics in Geometric Group Theory
Based on lectures by H. Wilton
Notes taken by Dexter Chua
Michaelmas 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
The subject of geometric group theory is founded on the observation that the algebraic
and algorithmic properties of a discrete group are closely related to the geometric
features of the spaces on which the group acts. This graduate course will provide an
introduction to the basic ideas of the subject.
Suppose Γ is a discrete group of isometries of a metric space
X
. We focus on the
theorems we can prove about Γ by imposing geometric conditions on
X
. These
conditions are motivated by curvature conditions in differential geometry, but apply
to general metric spaces and are much easier to state. First we study the case when
X
is Gromovhyperbolic, which corresponds to negative curvature. Then we study
the case when
X
is CAT(0), which corresponds to nonpositive curvature. In order
for this theory to be useful, we need a rich supply of negatively and nonpositively
curved spaces. We develop the theory of nonpositively curved cube complexes, which
provide many examples of CAT(0) spaces and have been the source of some dramatic
developments in lowdimensional topology over the last twenty years.
Part 1.
We will introduce the basic notions of geometric group theory: Cayley graphs,
quasiisometries, the Schwarz–Milnor Lemma, and the connection with algebraic
topology via presentation complexes. We will discuss the word problem, which
is quantified using the Dehn functions of a group.
Part 2.
We will cover the basic theory of wordhyperbolic groups, including the Morse
lemma, local characterization of quasigeodesics, linear isoperimetric inequality,
finitely presentedness, quasiconvex subgroups etc.
Part 3.
We will cover the basic theory of CAT(0) spaces, working up to the Cartan–
Hadamard theorem and Gromov’s Link Condition. These two results together
enable us to check whether the universal cover of a complex admits a CAT(1)
metric.
Part 4.
We will introduce cube complexes, in which Gromov’s link condition becomes
purely combinatorial. If there is time, we will discuss Haglund–Wise’s special
cube complexes, which combine the good geometric properties of CAT(0) spaces
with some strong algebraic and topological properties.
Prerequisites
Part IB Geometry and Part II Algebraic topology are required.
Contents
1 Cayley graphs and the word metric
1.1 The word metric
1.2 Free groups
1.3 Finitelypresented groups
1.4 The word problem
2 Van Kampen diagrams
3 Bass–Serre theory
3.1 Graphs of spaces
3.2 The Bass–Serre tree
4 Hyperbolic groups
4.1 Definitions and examples
4.2 Quasigeodesics and hyperbolicity
4.3 Dehn functions of hyperbolic groups
5 CAT(0) spaces and groups
5.1 Some basic motivations
5.2 CAT(κ) spaces
5.3 Length metrics
5.4 Alexandrov’s lemma
5.5 Cartan–Hadamard theorem
5.6 Gromov’s link condition
5.7 Cube complexes
5.8 Special cube complexes
1 Cayley graphs and the word metric
1.1 The word metric
There is an unfortunate tendency for people to think of groups as being part
of algebra. Which is, perhaps, reasonable. There are elements and there are
operations on them. These operations satisfy some algebraic laws. But there is
also geometry involved. For example, one of the simplest nonabelian groups
we know is
D
6
, the group of symmetries of a triangle. This is fundamentally a
geometric idea, and often this geometric interpretation can let us prove a lot of
things about D
6
.
In general, the way we find groups in nature is that we have some object,
and we look at its symmetry. In geometric group theory, what we do is that we
want to remember the object the group acts on, and often these objects have
some geometric structure. In fact, we shall see that all groups are symmetries of
some geometric object.
Let Γ be a group, and
S
a generating set for Γ. For the purposes of this
course,
S
will be finite. So in this course, we are mostly going to think about
finitelygenerated groups. Of course, there are nonfinitelygenerated groups out
there in the wild, but we will have to put in some more effort to make our ideas
work.
The simplest geometric idea we can think of is that of distance, i.e. a metric.
So we want to use our generating set S to make Γ into a metric space.
Definition
(Word length)
.
Let Γ be a group and
S
a finite generating set. If
γ ∈ Γ, the word length of γ is
S
(γ) = min{n : γ = s
±1
1
· · · s
±1
n
for some s
i
∈ S}.
Definition
(Word metric)
.
Let Γ be a group and
S
a finite generating set. The
word metric on Γ is given by
d
S
(γ
1
, γ
2
) =
s
(γ
−1
1
γ
2
).
If we stare at the definition, we see that the word metric is leftinvariant, i.e.
for any g, γ
1
, γ
2
∈ Γ, we have
d
S
(gγ
1
, gγ
2
) = d
S
(γ
1
, γ
2
).
However, it is usually not the case that the metric is right invariant, i.e.
d
S
(
γ
1
g, γ
2
g
) is not equal to
d
S
(
γ
1
, γ
2
). This is one of the prices we have to
pay for wanting to do geometry.
If we tried to draw our metric space out, it looks pretty unhelpful — it’s just
a bunch of dots. It is not path connected. We can’t really think about it well.
To fix this, we talk about Cayley graphs.
Definition (Cayley graph). The Cayley graph Cay
S
(Γ) is defined as follows:
– V (Cay
S
(Γ)) = Γ
– For each γ ∈ Γ and s ∈ S, we draw an edge from γ to γs.
If we want it to be a directed and labelled graph, we can label each edge by the
generator s.
There is, of course, an embedding of Γ into
Cay
S
(Γ). Moreover, if we
metrize
Cay
S
(Γ) such that each edge has length [0
,
1], then this is an isometric
embedding.
It will be useful for future purposes to note that there is a left action of Γ
on
Cay
S
(Γ) which extends the left action of Γ on itself. This works precisely
because the edges are defined by multiplication on the right, and left and right
multiplication commute.
Example.
Take Γ =
C
2
=
Z/
2
Z
, and take
S
=
{
1
}
. Then the Cayley graph
looks like
0 1
Example. Take Γ = S
3
and S = {(1 2), (1 2 3)}. The Cayley graph is
Example. Take Γ = Z, and S = {1}. Then the Cayley graph looks like
−4 −3 −2 −1 0 1 2 3 4
Example. Take Γ = Z and S = {2, 3}. Then the Cayley graph looks like
Now this seems quite complicated. It’s still the same group Γ =
Z
, but with
a different choice of generating set, it looks very different. It seems like what we
do depends heavily on the choice of the generating set.
Example. If Γ = Z
2
and S = {(1, 0), (0, 1)}, then the Cayley graph is a grid
Example. If Γ = F
2
= ha, bi, S = {a, b}, then the Cayley graph looks like
If one has done algebraic topology, then one might recognize this as the
universal cover of a space. This is not a coincidence, and we will talk about this
later.
Let’s now thing a bit more about the choice of generating set. As one would
expect, the Cayley graph depends a lot on the generating set chosen, but often we
only care about the group itself, and not a groupwithachoiceofgeneratingset.
The key observation is that while the two Cayley graphs of
Z
we drew seem
quite different, if we looked at them from 100 meters away, they look quite
similar — they both look like a long, thin line.
Definition
(Quasiisometry)
.
Let
λ ≥
1 and
c ≥
0. A function between metric
spaces f : X → Y is a (λ, c)quasiisometric embedding if for all x
1
, x
2
∈ X
1
λ
d
X
(x
1
, x
2
) − c ≤ d
Y
(f(x
1
), f(x
2
)) ≤ λd
X
(x,
1
, x
2
) + c
If, in addition, there is a
C
such that for all
y ∈ Y
, there exists
x ∈ X
such that
d
Y
(
y, f
(
x
))
≤ C
, we say
f
is a quasiisometry, and
X
is quasiisometric to
Y
.
We write X '
qi
Y .
We can think of the first condition as saying we are quasiinjective, and the
second as saying we are quasisurjective.
The right way to think about the definition is that the
c
says we don’t care
about what happens at scales less than
c
, and the
λ
is saying we allow ourselves
to stretch distances. Note that if we take
c
= 0, then this is just the notion of
a biLipschitz map. But in general, we don’t even require
f
to be continuous,
since continuity is a rather finegrained property.
Exercise. Check that X '
qi
Y is an equivalence relation.
This is not immediately obvious, because there is no inverse to
f
. In fact, we
need the axiom of choice to prove this result.
Example.
Any bounded metric spaces is quasiisometric to a point. In particu
lar, if Γ is finite, then (Γ, d
S
) '
qi
1.
For this reason, this is not a great point of view for studying finite groups.
On the other hand, for infinite groups, this is really useful.
Example.
For any Γ and
S
, the inclusion (Γ
, d
S
)
→
(
Cay
S
(Γ)
, d
S
) is a quasi
isometry (take λ = 1, c = 0, C =
1
2
).
Of course, the important result is that any two word metrics on the same
group are quasiisomorphic.
Theorem.
For any two finite generating sets
S, S
0
of a group Γ, the identity
map (Γ, d
S
) → (Γ, d
S
0
) is a quasiisometry.
Proof. Pick
λ = max
s∈S
S
0
(s), λ
0
= max
s∈S
0
S
(s),
We then see
S
(γ) ≤ λ
0
S
0
(γ),
S
0
(γ) ≤ λ
s
(γ).
for all γ ∈ Γ. Then the claim follows.
So as long as we are willing to work up to quasiisometry, we have a canonically
defined geometric object associated to each finitelygenerated group.
Our next objective is to be able to state and prove the Schwarz–Milnor
lemma. This is an important theorem in geometric group theory, saying that if
a group Γ acts “nicely” on a metric space
X
, then Γ must be finitely generated,
and in fact (Γ
, d
s
) is quasiisomorphic to
X
. This allows us to produce some
rather concrete geometric realizations of (Γ, d
s
), as we will see in examples.
In order to do that, we must write down a bunch of definitions.
Definition
(Geodesic)
.
Let
X
be a metric space. A geodesic in
X
is an isometric
embedding of a closed interval γ : [a, b] → X.
This is not exactly the same as the notion in differential geometry. For
example, if we have a sphere and two nonantipodal points on the sphere, then
there are many geodesics connecting them in the differential geometry sense,
but only the shortest one is a geodesic in our sense. To recover the differential
geometry notion, we need to insert the words “locally” somewhere, but we shall
not need that.
Definition
(Geodesic metric space)
.
A metric space
X
is called geodesic if every
pair of points x, y ∈ X is joined by a geodesic denoted by [x, y].
Note that the notation [
x, y
] is not entirely honest, since there may be
many geodesics joining two points. However, this notation turns out to work
surprisingly well.
Definition
(Proper metric space)
.
A metric space is proper if closed balls in
X
are compact.
For example, infinitedimensional Hilbert spaces are not proper, but finite
dimensional ones are.
Example.
If Γ =
hSi
, then
Cay
S
(Γ) is geodesic. If
S
is finite, then
Cay
S
(Γ) is
proper.
Example.
Let
M
be a connected Riemannian manifold. Then there is a metric
on M defined by
d(x, y) = inf
α:x→y
length(α),
where we take the infimum over all smooth paths from x to y.
The Hopf–Rinow theorem says if
M
is complete (as a metric space), then the
metric is proper and geodesic. This is great, because completeness is in some
sense a local property, but “proper” and “geodesic” are global properties.
We need two more definitions, before we can state the Schwarz–Milnor lemma.
Definition
(Proper discontinuous action)
.
An action Γ on a topological space
X is proper discontinuous if for every compact set K, the set
{g ∈ Γ : gK ∩ K 6= ∅}
is finite.
Definition
(Cocompact action)
.
An action Γ on a topological space
X
is
cocompact if the quotient Γ \ X is compact.
Lemma
(Schwarz–Milnor lemma)
.
Let
X
be a proper geodesic metric space,
and let Γ act properly discontinuously, cocompactly on X by isometries. Then
(i) Γ is finitelygenerated.
(ii) For any x
0
∈ X, the orbit map
Γ → X
γ 7→ γx
0
is a quasiisometry (Γ, d
s
) '
qi
(X, d).
An easy application is that Γ acting on its Caylay graph satisfies these
conditions. So this reproduces our previous observation that a group is quasi
isometric to its Cayley graph. More interesting examples involve manifolds.
Example.
Let
M
be a closed (i.e. compact without boundary), connected
Riemannian manifold. Then the universal cover
˜
M
is also a complete, connected,
Riemannian manifold. By the Hopf–Rinow theorem, this is proper and geodesic.
Since the metric of
˜
M
is pulled back from
M
, we know
π
1
(
M
) acts on
˜
M
by
isometries. Therefore by the Schwarz–Milnor lemma, we know
π
1
(M) '
qi
˜
M.
Example.
The universal cover of the torus
S
1
× S
1
is
R
2
, and the fundamental
group is Z
2
. So we know Z
2
'
qi
R
2
, which is not surprising.
Example. Let M = Σ
2
, the surface of genus 2.
We then have
π
1
Σ
2
∼
=
ha
1
, b
1
, a
2
, b
2
 [a
1
, b
1
][a
2
, b
2
]i.
On the other hand, the universal cover can be thought of as the hyperbolic plane
H
2
, as we saw in IB Geometry.
So it follows that
π
1
Σ
2
'
qi
H
2
.
This gives us some concrete hold on what the group
π
1
Σ
2
is like, since people
have been thinking about the hyperbolic plane for centuries.
Proof of Schwarz–Milnor lemma.
Let
¯
B
=
¯
B
(
x, R
) be such that Γ
¯
B
=
X
. This
is possible since the quotient is compact.
Let S = {γ ∈ Γ : γ
¯
B ∩
¯
B 6= ∅}. By proper discontinuity, this set is finite.
We let
r = inf
γ
0
6∈S
d(
¯
B, γ
0
¯
B).
If we think about it, we see that in fact
r
is the minimum of this set, and in
particular r > 0.
Finally, let
λ = max
s∈S
d(x
0
, sx
0
).
We will show that
S
generates Γ, and use the word metric given by
S
to show
that Γ is quasiisometric to X.
We first show that Γ =
hSi
. We let
γ ∈
Γ be arbitrary. Let [
x
0
, γx
0
] be a
geodesic from x
0
to γx
0
. Let be such that
( − 1)r ≤ d(x
0
, γx
0
) < r.
Then we can divide the geodesic into
pieces of length about
r
. We can choose
x
1
, . . . x
`−1
, x
`
= γx
0
such that d(x
i−1
, x
i
) < r.
By assumption, we can pick
γ
i
∈
Γ such that
x
i
∈ γ
i
¯
B
, and further we pick
γ
`
= γ, γ
0
= e. Now for each i, we know
d(
¯
B, γ
−1
i−1
γ
i
¯
B) = d(γ
i−1
¯
B, γ
i
¯
B) ≤ d(x
i−1
, x
i
) < r.
So it follows that γ
−1
i−1
γ
i
∈ S. So we have
γ = γ
`
= (γ
−1
0
γ
1
)(γ
−1
1
γ
2
) · · · (γ
−1
`−1
γ
`
) ∈ hSi.
This proves Γ = hSi.
To prove the second part, we simply note that
r − r ≤ d(x
0
γx
0
).
We also saw that
is an upper bound for the word length of
γ
under
S
. So we
have
r
s
(γ) − r ≤ d(x
0
, γx
0
).
On the other hand, by definition of λ, we have
d(x
0
, γx
0
) ≤ λ
s
(γ).
So the orbit map is an orbitembedding, and quasisurjectivity follows from
cocompactness.
1.2 Free groups
If we want to understand a group, then a reasonable strategy now would be to
come up with a wellunderstood (proper geodesic) space for Γ to act (proper
discontinuously, cocompactly) on, and see what we can tell from this quasi
isometry. There is no general algorithm for doing so, but it turns out for certain
groups, there are some “obvious” good choices to make.
Let’s begin by understanding the free group. In some sense, they are the
“simplest” groups, but they also contain some rich structure. Roughly speaking,
a free group on a set
S
is a group that is “freely” generated by
S
. This freeness
is made precise by the following universal property.
Definition
(Free group)
.
Let
S
be a (usually finite) set. A group
F
(
S
) with a
map
S → F
(
S
) is called the free group on
S
if it satisfies the following universal
property: for any set map
S → G
, there is a unique group homomorphism
F (S) → G such that the following diagram commutes:
S F (S)
G
.
Usually, if S = r, we just write F (S) = F
r
.
In fancy categorytheoretic language, this says the functor
F
:
Sets → Grps
is left adjoint to the forgetful functor U : Grps → Sets.
This definition is good for some things, but not good for others. One thing it
does well is it makes it clear that
F
(
S
) is unique up to isomorphism if it exists.
However, it is not immediately clear that
F
(
S
) exists! (unless one applies the
adjoint functor theorem)
Thus, we must concretely construct a group satisfying this property. Apart
from telling us that
F
(
S
) actually exists, a concrete construction lets us actually
do things with the group.
For concreteness, suppose
S
=
{a
1
, . . . , a
r
}
is a finite set. We consider a
graph
X
r
=
.
.
.
where there are r “petals”. This is known as the rose with r petals.
We claim that its fundamental group is
F
(
S
). To see this, we have to
understand the universal cover
˜
X
r
of
X
r
. We know
˜
X
r
is a simply connected
graph, so it is a tree. Moreover, since it is a covering space of
X
r
, it is regular
with degree 2r on each vertex.
If r = 2, then this is our good old picture
∗
a
1
a
1
a
2
a
2
We can use this to understand the fundamental group
π
1
(
X
r
). The elements
on
π
1
(
X
r
) are homotopy classes [
γ
] of based loops in
X
r
. Using the homotopy
lifting lemma, this map can be lifted to a path
˜γ
in the universal cover starting
at ∗.
∗
In general, there are many paths that represent the same element. However, there
is a “canonical” such path. Indeed, we can eliminate all backtracks in the path
to obtain an embedded path from
∗
to the end point of
˜γ
. We may reparametrize
so that it spends equal time on each edge, but that is not important.
Algebraically, how do we understand this? Each loop in
X
r
represents an
element
a
i
in
π
1
(
X
r
). If we look at the “canonical” form of the path above, then
we see that we are writing γ in the form
γ = a
±1
i
1
a
±1
i
2
a
±1
i
3
· · · a
±1
i
n
.
The fact that we have eliminated all backtracks says this expression is reduced,
i.e. we don’t see anything looking like
a
i
a
−1
i
, or
a
−1
i
a
i
. Importantly, there is a
unique way to write
γ
as a reduced word, corresponding to the unique embedded
path from
∗
to the end point of
˜γ
in the universal cover. Thus, if we somehow
expressed
γ
as a nonreduced word in the
a
1
, . . . , a
r
, then we can reduce it by
removing instances of
a
i
a
−1
i
or
a
−1
i
a
i
, and keep doing so until we don’t see any.
This terminates since the length of the word is finite, and the result does not
depend on the order we performed the reduction in, by uniqueness.
Example. Take the case π
1
X
2
= ha, bi. We might be given a word
a
3
a
−2
b
−2
b
3
a
5
a
−3
a
−3
b
−1
.
We can reduce this to
aba
−1
b
−1
.
In particular, since this is not the identity, we know the original path was not
nullhomotopic.
This gives us a very concrete understanding of
π
1
(
X
r
) — it is generated
by
a
1
, . . . , a
r
, and each element is represented uniquely by a reduced word. To
multiply words together, write them one after another, and then reduce the
result.
From this description, it is clear that
Corollary. π
1
(
X
r
) has the universal property of
F
(
S
), with
S
=
{a
1
, . . . , a
r
}
.
So π
1
(X
r
)
∼
=
F
r
.
Proof. Given any map f : S → G, define
˜
f : F (S) → G by
˜
f(a
±1
i
1
· · · a
±1
i
n
) =
˜
f(a
i
)
±1
· · ·
˜
f(a
i
n
)
±1
for any reduced word
a
±1
i
1
· · · a
±1
i
n
. This is easily seen to be welldefined and is
the unique map making the diagram commute.
1.3 Finitelypresented groups
Let’s try to consider groups that are slightly more complex than free groups. If a
group Γ is generated by
S
, then we have a surjective homomorphism
F
(
S
)
→
Γ.
Let K = ker η. Then the first isomorphism theorem tells us
Γ
∼
=
F (S)
K
.
Since we understand
F
(
S
) quite explicitly, it would be nice if we have a solid
gasp on
K
as well. If
R
normally generates
K
, so that
K
=
hhRii
, then we say
that hS  Ri is a presentation for Γ. We will often write that
Γ
∼
=
hS  Ri.
Example.
Z
2
= ha, b  aba
−1
b
−1
i
Definition
(Finitelypresented group)
.
A finitelypresentable group is a group
Γ such that there are finite S and R such that Γ
∼
=
hS  Ri.
A finitelypresented group is a group Γ equipped
S
and
R
such that Γ
∼
=
hS 
Ri.
Presentations give us ways to geometrically understand a group. Given a
presentation
P
=
hS  Ri
, we can construct space
X
P
such that
π
1
X
P
∼
=
hS  Ri
To do so, we first construct a rose with
S
many petals, each labeled by an
element of
S
. For each
r ∈ R
, glue a disk onto the rose along the path specified
by r. The Seifert–van Kampen theorem then tells us π
1
X
P
∼
=
Γ.
Example.
We take the presentation
Z
2
∼
=
ha, b  aba
−1
b
−1
i
. If we think hard
enough (or a bit), we see this construction gives the torus.
Conversely, if we are given a connected cell complex
X
, we can obtain a
presentation of the fundamental group. This is easy if the cell complex has a
single 0cell. If it has multiple 0cells, then we choose a maximal tree in
X
(1)
,
and the edges
S
=
{a
i
}
not in the maximal tree define a generating set for
π
1
X
(1)
. The attaching maps of the 2cells in
X
define elements
R
=
{r
j
}
of
π
1
X
(1)
, and these data define a presentation P
X
= hS  Ri for π
1
X.
This is not canonical, since we have to pick a maximal tree, but let’s not
worry too much about that. The point of maximal trees is to get around the
problem that we might have more than one vertex.
Exercise. If X has one vertex, then Cay
S
π
1
X =
˜
X
(1)
.
1.4 The word problem
Long time ago, Poincar´e was interested in characterizing the 3sphere. His
first conjecture was that if a closed 3manifold
M
had
H
∗
(
M
)
∼
=
H
∗
(
S
3
), then
M
∼
=
S
3
. Poincar´e thought very hard about the problem, and came up with a
counterexample. The counterexample is known as the Poincar´e homology sphere.
This is a manifold P
3
with
H
∗
(P )
∼
=
H
∗
(S
3
),
but it turns out there is a surjection
π
1
(
P
3
)
→ A
5
. In particular,
P
3
is not
homeomorphic to a sphere.
So he made a second conjecture, that if
M
is a compact manifold with
π
1
M
∼
=
1, then
M
∼
=
S
3
. Note that the condition already implies
H
∗
(
M
)
∼
=
H
∗
(
S
3
)
by Hurewicz theorem and Poincar´e duality. This is known as the Poincar´e
conjecture, which was proved in 2002 by Perelman.
But there is more to say about Poincar´e’s initial conjecture. Some time later,
Max Dehn constructed an infinite family of 3d homology spheres. In order to
prove that he genuinely had infinitely many distinct homology spheres, he had
to show that his homology spheres had different fundamental groups. He did
manage to write down the presentations of the fundamental groups, and so what
was left to do is to distinguish whether the presentations actually presented the
same group.
To make sense of this statement, we must have some way to distinguish
between the homology spheres. To do so, he managed to write down the
presentation of the fundamental group of his homology spheres, and he had to
figure out if those groups are genuinely the same.
For our purposes, perhaps we should be slightly less ambitious. Suppose we
are given a presentation
hS  Ri
of a finitelypresented group Γ. Define the set
of alphabets
S
±
= S q S
−1
= {s, s
−1
: s ∈ S},
and let
S
∗
be the set of all finite words in
S
±
. Then the fundamental question is,
given a word
w ∈ S
∗
, when does it represent the identity element 1
∈
Γ? Ideally,
we would want an algorithm that determines whether this is the case.
Example.
Recall that in the free group
F
(
S
) =
hS  ∅i
, we had a normal form
theorem, namely that every element in Γ can be written uniquely as a product
of generator (and inverses) such that there aren’t any occurrences of
ss
−1
or
s
−1
s
. This gives a way of determining whether a given word
w ∈ S
∗
represents
the identity. We perform elementary reductions, removing every occurrence of
ss
−1
or
s
−1
s
. Since each such procedure reduces the length of the word, this
eventually stops, and we would end up with a reduced word. If this reduced
word is empty, then w represents the identity. If it is nonempty, w does not.
Thus, we have a complete solution to the word problem for free groups, and
moreover the algorithm is something we can practically implement.
For a general finitelypresented group, this is more difficult. We can first
reformulate our problem. The presentation Γ =
hS  Ri
gives us a map
F
(
S
)
→
Γ,
sending
s
to
s
. Each word gives us an element in
F
(
S
), and thus we can
reformulate our problem as identifying the elements in the kernel of this map.
Lemma.
Let Γ =
hS  Ri
. Then the elements of
ker
(
F
(
S
)
→
Γ) are precisely
those of the form
d
Y
i=1
g
i
r
±1
i
g
−1
i
,
where g
i
∈ F (S) and r
i
∈ R.
Proof.
We know that
ker
(
F
(
S
)
→
Γ) =
hhRii
, and the set described above is
exactly hhRii, noting that gxyg
−1
= (gxg
−1
)(gyg
−1
).
This tells us the set of elements in
S
∗
that represent the identity is recursively
enumerable, i.e. there is an algorithm that lists out all words that represent the
identity. However, this is not very helpful when we want to identify whether a
word represents the identity. If it does, then our algorithm will eventually tell
us it is (maybe after 3 trillion years), but if it doesn’t, the program will just run
forever, and we will never know that it doesn’t represent the identity.
Example. Let Γ = Z
2
∼
=
ha, b  [a, b]i. Consider the word
w = a
n
b
n
a
−n
b
−n
We see that this represents our identity element. But how long would it take for
our algorithm to figure this fact out? Equivalently, what
d
do we need to pick
so that w is of the form
d
Y
i=1
g
i
r
±1
i
g
−1
i
?
We can draw a picture. Our element [a, b] looks like
bb
a
a
If, say, n = 2, then w is given by
a
a
bb
a
a
bb
If we think about it hard, then we see that what we need to do is to find out how
to fill in this big square with the small squares, and we need 4 squares. Indeed,
we can read off the sequence
w = [a, b] · (b[a, b]b
−1
) · (b
2
ab
−1
[a, b]ba
−1
b
−2
) · (b
2
a
2
b
−1
a
−1
b
−1
[a, b]baba
−2
b
−2
),
which corresponds to filling in the square one by one as follows:
Of course, this is fine if we know very well what the Cayley graph of the
group looks like, but in general it is quite hard. Indeed, solving the word problem
is part of what you do if you want to draw the Cayley graph, since you need to
know when two words give the same element!
So how do we solve the word problem? Our previous partial algorithm would
make a good, full solution if we knew how far we had to search. If we know that
we will only need at most
d
= 10
10
100
, then if we searched for all expressions of
the form
Q
d
i=1
g
i
r
±
i
g
−1
i
for
d <
10
10
100
, and didn’t find
w
, then we know
w
does
not represent the identity element (we will later argue that we don’t have to
worry about there being infinitely many g
i
’s to look through).
Definition (Nullhomotopic). We say w ∈ S
∗
is nullhomotopic if w = 1 in Γ.
Definition
(Algebraic area)
.
Let
w ∈ S
∗
be mullhomotopic. Its algebraic area
is
Area
a,P
(w) = min
(
d : w =
d
Y
i=1
g
i
r
±1
i
g
−1
i
)
.
We write the subscript
a
to emphasize this is the algebraic area. We will
later define the geometric area, and we will write it as
Area
g
. Afterwards, we
will show that they are the same, and we will stop writing the subscript.
Let us use
 · 
S
to denote word length in
F
(
S
), while
S
continues to denote
the word length in Γ.
Definition
(Dehn function)
.
Then Dehn function is the function
δ
P
:
N → N
mapping
n 7→ max {Area
a,P
(w)  w
S
≤ n, w is nullhomotopic} .
This Dehn function measures the difficulty of the word problem in P.
Proposition.
The word problem for
P
is solvable iff
δ
P
is a computable function.
We will postpone the proof. The hard part of the proof is that we don’t have
to worry about the infinitely many possible g
i
that may be used to conjugate.
It would be really nice if the Dehn function can be viewed as a property of
the group, and not the presentation. This requires us to come up with a notion
of equivalence relation on functions N → [0, ∞) (or [0, ∞) → [0, ∞)).
Definition (4). We write f 4 g iff for some C > 0, we have
f(x) ≤ Cg(Cx + C) + Cx + C.
for all x.
We write f ≈ g if f 4 g and g 4 f .
This captures the (superlinear) asymptotic behaviour of f.
Example. For α, β ≥ 1, n
α
4 n
β
iff α ≤ β.
Proposition. If P and Q are two finite presentations for Γ, then δ
P
≈ δ
Q
.
We start with two special cases, and then show that we can reduce everything
else to these two special cases.
Lemma. If R
0
⊆ hhRii is a finite set, and
P = hS  Ri, Q = hS  R ∪ R
0
i,
then δ
P
' δ
Q
.
Proof. Clearly, δ
Q
≤ δ
P
. Let
m = max
r
0
∈R
0
Area
P
(r
0
).
It is then easy to see that
Area
P
(w) ≤ m Area
Q
(w).
Lemma. Let P = hS  Ri, and let
Q = hS q T  R q R
0
i,
where
R
0
= {tw
−1
t
: t ∈ T, w
t
∈ F (S)}.
Then δ
P
≈ δ
Q
.
Proof. We first show δ
P
4 δ
Q
. Define
ρ : F (S q T ) → F (s)
s 7→ s
t 7→ w
t
.
In particular, ρ(r) = r for all r ∈ R and ρ(r
0
) = 1 for all r
0
∈ R
0
.
Given w ∈ F (S), we need to show that
Area
P
(w) ≤ Area
Q
(w).
Let d = Area
Q
(w). Then
w =
d
Y
i=1
g
i
r
±1
i
g
−1
i
,
where
g
i
∈ F
(
S
`
T
)
, r
i
∈ R ∪ R
0
. We now apply
ρ
. Since
ρ
is a retraction,
ρ(w) = w. Thus,
w =
d
Y
i=1
ρ(g
i
)ρ(r
i
)
±1
ρ(g
i
)
−1
.
Now
ρ
(
r
i
) is either
r
i
or 1. In the first case, nothing happens, and in the second
case, we can just forget the ith term. So we get
w =
d
Y
i=1,r
i
∈R
ρ(g
i
)r
±1
i
ρ(g
i
)
−1
.
Since this is a valid proof in P that w = 1, we know that
Area
P
(w) ≤ d = Area
Q
(w).
We next prove that
δ
Q
4 δ
P
. It is unsurprising that some constants will appear
this time, instead of just inequality on the nose. We let
C = max
t∈T
w
t

S
.
Consider a nullhomotopic word w ∈ F (S q T ). This word looks like
w = s
±
i
1
s
±
i
2
· · · t
j
1
· · · s · · · t · · · ∈ F (S q T ).
We want to turn these t’s into s’s, and we need to use the relators to do so.
We apply relations from R
0
to write this as
w
0
= s
±
i
1
s
±
i
2
· · · w
t
j
1
· · · s · · · w
t
· · · ∈ F (S).
We certainly have w
0

S
≤ Cw
SqT
. With a bit of thought, we see that
Area
Q
(w) ≤ Area
P
(w
0
) + w
SqT
≤ δ
P
(Cw
SqT
) + w
SqT
.
So it follows that
δ
Q
(n) ≤ δ
P
(Cn) + n.
Proof of proposition.
Let
P
=
hS  Ri
and
Q
=
hS
0
 R
0
i
. If
P, Q
present
isomorphic groups, then we can write
s = u
s
∈ F (S
0
) for all s ∈ S
Similarly, we have
s
0
= v
s
0
∈ F (S) for all s
0
∈ S
0
We let
T = {su
−1
s
 s ∈ S}
T
0
= {s
0
v
−1
s
0
 s
0
∈ S
0
}
We then let
M = hS q S
0
 R ∪ R
0
∪ T ∪ T
0
i.
We can then apply our lemmas several times to show that δ
P
≈ δ
M
≈ δ
Q
.
Now we can talk about
δ
Γ
, the Dehn function of Γ, as long as we know we
are only talking up to ≈. In fact, it is true that
Fact. If Γ
1
'
qi
Γ
2
, then δ
Γ
1
≈ δ
Γ
2
.
The proof is rather delicate, and we shall not attempt to reproduce it.
2 Van Kampen diagrams
Recall we previously tried to understand the Dehn function of
Z
2
in terms of
filling in squares on a grid. In this chapter, we pursue this idea further, and
come up with an actual theorem.
To make this more precise, recall that in the case of
Z
2
, we could embed a
nullhomotopic word into the plane of the form
We saw, at least heuristically, the algebraic area of
w
is just the geometric area
of this figure. Of course, sometimes it is slightly more complicated. For example,
if we have the word w = [a, b]
2
, then we have to use the cell [a, b] twice.
Definition
(Singular disc diagram)
.
A (singular) disc diagram is a compact,
contractible subcomplex of R
2
, e.g.
We usually endow the disc diagram D with a base point p, and define
Area
g
(D) = number of 2cells of D
and
Diam
p
(D) = length of the longest embedded path in D
(1)
, starting at p.
If we orient
R
2
, then
D
has a welldefined boundary cycle. A disc diagram is
labelled if each (oriented) edge is labelled by an element
s ∈ S
±
. The set of face
labels is the set of boundary cycles of the 2cells.
If all the face labels are elements of
R
±
, then we say
D
is a diagram over
hS  Ri.
Definition
(van Kampen diagram)
.
If
w ∈ hhRii
, and
w
is the boundary cycle
of a singular disc diagram
D
over the presentation
hS  Ri
, we say that
D
is a
van Kampen diagram for w.
Example. The diagram
a
b
is a van Kampen diagram for abababa
−2
b
−1
a
−1
b
−1
ab
−1
a
−1
.
Note that in this case, the map
S
1
→ X
P
that represents
w
factors through
a map D → X
P
.
Lemma
(van Kampen’s lemma)
.
Let
P
=
hS  Ri
be a presentation and
w ∈ S
∗
.
Then the following are equivalent:
(i) w = 1 in Γ presented by P (i.e. w is nullhomotopic)
(ii) There is a van Kampen diagram for w given P.
If so, then
Area
a
(w) = min{Area
g
(D) : D is a van Kampen diagram for w over P}.
Proof. In one direction, given
w =
n
Y
i=1
g
i
r
±
i
g
−1
i
∈ F (S)
such that w = 1 ∈ Γ, we start by writing down a “lollipop diagram”
g
d
r
±
d
g
3
r
±
3
g
2
r
±
2
g
1
r
±
1
.
.
.
This defines a diagram for the word
Q
n
i=1
g
i
r
±
i
g
−1
i
which is equal in
F
(
S
) to
w
, but is not exactly
w
. However, note that performing elementary reductions
(or their inverses) correspond to operations on the van Kampen diagram that
does not increase the area. We will be careful and not say that the area doesn’t
change, since we may have to collapse paths that look like
In the other direction, given a diagram
D
for
w
, we need to produce an
expression
w =
d
Y
i=1
g
i
r
±
i
g
−1
i
such that d ≤ Area(D).
We let
e
be the first 2cell that the boundary curve arrives at, reading
anticlockwise around
∂D
. Let
g
be the path from
p
to
e
, and let
r
be the relator
read anticlockwise around e. Let
D
0
= D − e,
and let w
0
= ∂D
0
. Note that
w = gr
±1
g
−1
w
0
.
Since Area(D
0
) = Area(D) − 1, we see by induction that
w =
d
Y
i=1
g
i
r
±1
i
g
−1
i
with d = Area(D).
We observe that in our algorithm about, the
g
i
’s produced have length
≤ Diam(D). So we see that
Corollary. If w is nullhomotopic, then we can write w in the form
w =
d
Y
i=1
g
i
r
±
i
g
−1
i
,
where
g
i

S
≤ Diam D
with D a minimal van Kampen diagram for w. We can further bound this by
(max r
i

S
) Area(D) + w
S
≤ constant · δ
P
(w
S
) + w
S
.
So in particular, if we know
δ
P
(
w
S
), then we can bound the maximum
length of g
i
needed.
It is now easy to prove that
Proposition.
The word problem for a presentation
P
is solvable iff
δ
P
is
computable.
Proof.
(⇐)
By the corollary, the maximum length of a conjugator
g
i
that we need to
consider is computable. Therefore we know how long the partial algorithm
needs to run for.
(⇒)
To compute
δ
P
(
n
), we use the word problem solving ability to find all
nullhomotopic words in
F
(
S
) of length
≤ n
. Then for each
d
, go out and
look for expressions
w =
d
Y
i=1
g
i
r
±1
i
g
−1
i
.
A naive search would find the smallest area expression, and this gives us
the Dehn function.
It is a hard theorem that
Theorem
(Novikov–Boone theorem)
.
There exists a finitelypresented group
with an unsolvable word problem.
Corollary. δ
P
is sometimes noncomputable.
Let’s look at some applications to geometry. In geometry, people wanted to
classify manifolds. Classifying (orientable) 2dimensional manifolds is easy. They
are completely labeled by the genus. This classification can in fact be performed
by a computer. If we manage to triangulate our manifold, then we can feed the
information of the triangulation into a computer, and then the computer can
compute the Euler characteristic.
Is it possible to do this in higher dimensions? It turns out the word problem
gives a severe hindrance to doing so.
Theorem.
Let
n ≥
4 and Γ =
hS  Ri
be a finitelypresented group. Then we
can construct a closed, smooth, orientable manifold M
n
such that π
1
M
∼
=
Γ.
This is a nice, little surgery argument.
Proof. Let S = {a
1
, . . . , a
m
} and R = {r
1
, . . . , r
n
}. We start with
M
0
= #
m
i=0
(S
1
× S
n−1
).
Note that when we perform this construction, as n ≥ 3, we have
π
1
M
0
∼
=
F
m
by Seifert–van Kampen theorem. We now construct M
k
from M
k−1
such that
π
1
M
k
∼
=
ha
1
, . . . , a
m
 r
1
, . . . , r
k
i.
We realize
r
k
as a loop in
M
k−1
. Because
n ≥
3, we may assume (after a small
homotopy) that this is represented by a smooth embedded map
r
k
:
S
1
→ M
k−1
.
We take
N
k
to be a smooth tubular neighbourhood of
r
k
. Then
N
k
∼
=
S
1
× D
n−1
⊆ M
k−1
. Note that ∂N
k
∼
=
S
1
× S
n−2
.
Let
U
k
=
D
2
× S
n−2
. Notice that
∂U
k
∼
=
∂N
k
. Since
n ≥
4, we know
U
k
is
simply connected. So we let
M
0
k−1
= M
k
\
˚
N
k
,
a manifold with boundary
S
1
× S
n−2
. Choose an orientationreversing diffeo
morphism ϕ
k
: ∂U
k
→ ∂M
0
k−1
. Let
M
k
= M
0
k−1
∪
ϕ
k
U
k
.
Then by applying Seifert van Kampen repeatedly, we see that
π
1
M
k
= π
1
M
k−1
/hhr
k
ii,
as desired.
Thus, if we have an algorithm that can distinguish between the fundamental
groups of manifolds, then that would solve (some variant of) the word problem
for us, which is impossible.
Finally, we provide some (even more) geometric interpretation of the Dehn
function.
Definition
(Filling disc)
.
Let (
M, g
) be a closed Riemannian manifold. Let
γ
:
S
1
→ M
be a smooth nullhomotopic loop. A filling disk for
γ
in
M
is a
smooth map f : D
2
→ M such that the diagram
S
1
D
2
M
γ
f
Since there is a metric
g
on
M
, we can pull it back to a (possibly degenerate)
metric
f
∗
g
on
D
2
, and hence measure quantities like the length
(
γ
) of
γ
and
the area Area(f) of D
2
.
A classic result is
Theorem
(Douglas, Rad´u, Murray)
.
If
γ
is embedded, then there is a leastarea
filling disc.
So we can define
Definition (FArea).
FArea(γ) = inf{Area(f)  f : D
2
→ M is a filling disc for γ}.
Definition (Isoperimetric function). The isoperimetric function of (M, g) is
F iU
M
: [0, ∞) → [0, ∞)
7→ sup{FArea(γ) : γ : S
1
→ M is a smooth nullhomotopic loop, (γ) ≤ }
Theorem
(Filling theorem)
.
let
M
be a closed Riemannian manifold. Then
F iU
M
' δ
π
1
M
.
By our previous construction, we know this applies to every finitelypresented
group.
3 Bass–Serre theory
3.1 Graphs of spaces
Bass–Serre theory is a way of building spaces by gluing old spaces together, in a
way that allows us to understand the fundamental group of the resulting space.
In this section, we will give a brief (and sketchy) introduction to BassSerre
theory, which generalizes some of the ideas we have previously seen, but they
will not be used in the rest of the course.
Suppose we have two spaces
X, Y
, and we want to glue them along some
subspace. For concreteness, suppose we have another space
Z
, and maps
∂
−
:
Z → X
and
∂
+
:
Z → Y
. We want to glue
X
and
Y
by identifying
∂
−
(
z
)
∼ ∂
+
(
z
)
for all z ∈ Z.
If we simply take the disjoint union of
X
and
Y
and then take the quotient,
then this is a pretty poorlybehaved construction. Crucially, if we want to
understand the fundamental group of the resulting space via Seifert–van Kampen,
then the maps
∂
±
must be very wellbehaved for the theorem to be applicable.
The homotopy pushout corrects this problem by gluing like this:
∂
−
∂
+
X Y
Z
Definition
(Homotopy pushout)
.
Let
X, Y, Z
be spaces, and
∂
−
:
Z → X
and
∂
+
: Z → Y be maps. We define
X q
Z
Y = (X q Y q Z × [−1, 1])/ ∼,
where we identify ∂
±
(z) ∼ (z, ±1) for all z ∈ Z.
By Seifert–van Kampen, we know π
1
(X q
Z
Y ) is the pushout
π
1
Z π
1
(X)
π
1
Y π
1
(X q
Z
Y )
∂
+
∗
∂
−
∗
In other words, we have
π
1
(X q
Z
Y )
∼
=
π
1
X ∗
π
1
Z
π
1
Y.
In general, this is more wellbehaved if the maps
∂
±
∗
are in fact injective, and we
shall focus on this case.
With this construction in mind, we can try to glue together something more
complicated:
Definition
(Graph of spaces)
.
A graph of spaces
X
consists of the following
data
– A connected graph Ξ.
– For each vertex v ∈ V (Ξ), a pathconnected space X
v
.
– For each edge e ∈ E(Ξ), a pathconnected space X
e
.
–
For each edge
e ∈ E
(Ξ) attached to
v
±
∈ V
(Ξ), we have
π
1
injective maps
∂
±
e
: X
e
→ X
v
±
.
The realization of X is
X  = X =
`
v∈V (Ξ)
X
v
q
`
e∈E(Ξ)
(X
e
× [−1, 1])
(∀e ∈ E(Ξ), ∀x ∈ X
e
, (x, ±1) ∼ ∂
±
e
(x))
.
These conditions are not too restrictive. If our vertex or edge space were
not pathconnected, then we can just treat each path component as a separate
vertex/edge. If our maps are not
π
1
injective, as long as we are careful enough,
we can attach 2cells to kill the relevant loops.
Example.
A homotopy pushout is a special case of a realization of a graph of
spaces.
Example. Suppose that the underlying graph looks like this:
v
e
The corresponding gluing diagram looks like
Fix a basepoint
∗ ∈ X
e
. Pick a path from
∂
−
e
(
∗
) to
∂
+
e
(
∗
). This then gives a
loop
t
that “goes around”
X
e
×
[
−
1
,
1] by first starting at
∂
−
e
(
∗
), move along
∗ × [−1, 1], then returning using the path we chose.
Since every loop inside
X
e
can be pushed down along the “tube” to a loop
in X
v
, it should not be surprising that the group π
1
(X) is in fact generated by
π
1
(X
v
) and t.
In fact, we can explicitly write
π
1
X =
π
1
X
v
∗ hti
hh(∂
+
e
)
∗
(g) = t(∂
−
e
)
∗
(g)t
−1
∀g ∈ π
1
X
e
ii
.
This is known as an HNN extension. The way to think about this is as follows
— we have a group
π
1
X
v
, and we have two subgroups that are isomorphic to each
other. Then the HNN extension is the “freeest” way to modify the group so
that these two subgroups are conjugate.
How about for a general graph of spaces? If Ξ is a graph of spaces, then its
fundamental group has the structure of a graph of groups G.
Definition (Graph of groups). A graph of groups G consists of
– A graph Γ
– Groups G
v
for all v ∈ V (Γ)
– Groups G
e
for all e ∈ E(Γ)
– For each edge e with vertices v
±
(e), injective group homomorphisms
∂
±
e
: G
e
→ G
v
±
(e)
.
In the case of a graph of spaces, it was easy to define a realization. One way
to do so is that we have already see how to do so in the two above simple cases,
and we can build the general case up inductively from the simple cases, but that
is not so canonical. However, this has a whole lot of choices involved. Instead,
we are just going to do is to associate to a graph of groups
G
a graph of spaces
X
which “inverts” the natural map from graphs of spaces to graphs of groups,
given by taking π
1
of everything.
This can be done by taking Eilenberg–Maclane spaces.
Definition
(Aspherical space)
.
A space
X
is aspherical if
˜
X
is contractible. By
Whitehead’s theorem and the lifting criterion, this is true iff
π
n
(
X
) = 0 for all
n ≥ 2.
Proposition.
For all groups
G
there exists an aspherical space
BG
=
K
(
G,
1)
such that
π
1
(
K
(
G,
1))
∼
=
G
. Moreover, for any two choices of
K
(
G,
1) and
K
(
H,
1), and for every homomorphism
f
:
G → H
, there is a unique map (up to
homotopy)
¯
f
:
K
(
G,
1)
→ K
(
H,
1) that induces this homomorphism on
π
1
. In
particular, K(G, 1) is welldefined up to homotopy equivalence.
Moreover, we can choose
K
(
G,
1) functorially, namely there are choices of
K
(
G,
1) for each
G
and choices of
¯
f
such that
f
1
◦ f
2
=
f
1
◦
¯
f
2
and
id
G
=
id
K(G,1)
for all f, G, H.
These K(G, 1) are known as Eilenberg–MacLane spaces.
When we talked about presentations, we saw that this is true if we don’t
have the word “aspherical”. But the aspherical requirement makes the space
unique (up to homotopy).
Using Eilenberg–MacLane spaces, given any graph of groups
G
, we can
construct a graph of spaces Ξ such that when we apply
π
1
to all the spaces in
X , we recover G.
We can now set
π
1
G = π
1
X .
Note that if Γ is finite, and all the G
v
’s are finitelygenerated, then π
1
G is also
finitelygenerated, which one can see by looking into the construction of
K
(
G,
1).
If Γ is finite, all
G
v
’s are finitelypresented and all
G
e
’s are finitelygenerated,
then π
1
G is finitelypresented.
For more details, read Trees by Serre, or Topological methods in group theory
by Scott and Wall.
3.2 The Bass–Serre tree
Given a graph of groups
G
, we want to analyze
π
1
G
=
π
1
X
, and we do this by
the natural action of G on
˜
X by deck transformations.
Recall that to understand the free group, we looked at the universal cover of
the rose with r petals. Since the rose was a graph, the universal cover is also a
graph, and because it is simply connected, it must be a tree, and this gives us a
normal form theorem. The key result was that the covering space of a graph is
a graph.
Lemma.
If
X
is a graph of spaces and
ˆ
X → X
is a covering map, then
ˆ
X
naturally has the structure of a graph of spaces
ˆ
X
, and
p
respects that structure.
Note that the underlying graph of
ˆ
X
is not necessarily a covering space of
the underlying graph of X .
Proof sketch. Consider
[
v∈V (Ξ)
X
v
⊆ X.
Let
p
−1
[
v∈V (Ξ)
X
V
=
a
ˆv∈V (
ˆ
Ξ)
ˆ
W
ˆv
.
This defines the vertices of
ˆ
Ξ
, the underlying graph of
ˆ
X
. The path components
ˆ
X
ˆv
are going to be the vertex spaces of
ˆ
X
. Note that for each
ˆv
, there exists a
unique v ∈ V (Ξ) such that p :
ˆ
X
ˆv
→ X
v
is a covering map.
Likewise, the path components of
p
−1
[
e∈E(Ξ)
X
e
× {0}
form the edge spaces
`
e∈E(Ξ)
ˆ
X
ˆe
of
ˆ
X
, which again are covering spaces of the
edge space of X.
Now let’s define the edge maps
∂
±
ˆe
for each
ˆe ∈ E
(
ˆ
Ξ
)
7→ e ∈ E
(Ξ). To do so,
we consider the diagram
ˆ
X
ˆe
ˆ
X
ˆe
× [−1, 1]
ˆ
X
X
e
X
e
× [−1, 1] X
∼
p
∼
By the lifting criterion, for the dashed map to exist, t