Part IV Topics in Geometric Group Theory
Based on lectures by H. Wilton
Notes taken by Dexter Chua
Michaelmas 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
The subject of geometric group theory is founded on the observation that the algebraic
and algorithmic properties of a discrete group are closely related to the geometric
features of the spaces on which the group acts. This graduate course will provide an
introduction to the basic ideas of the subject.
Suppose Γ is a discrete group of isometries of a metric space
X
. We focus on the
theorems we can prove about Γ by imposing geometric conditions on
X
. These
conditions are motivated by curvature conditions in differential geometry, but apply
to general metric spaces and are much easier to state. First we study the case when
X
is Gromov-hyperbolic, which corresponds to negative curvature. Then we study
the case when
X
is CAT(0), which corresponds to non-positive curvature. In order
for this theory to be useful, we need a rich supply of negatively and non-positively
curved spaces. We develop the theory of non-positively curved cube complexes, which
provide many examples of CAT(0) spaces and have been the source of some dramatic
developments in low-dimensional topology over the last twenty years.
Part 1.
We will introduce the basic notions of geometric group theory: Cayley graphs,
quasiisometries, the Schwarz–Milnor Lemma, and the connection with algebraic
topology via presentation complexes. We will discuss the word problem, which
is quantified using the Dehn functions of a group.
Part 2.
We will cover the basic theory of word-hyperbolic groups, including the Morse
lemma, local characterization of quasigeodesics, linear isoperimetric inequality,
finitely presentedness, quasiconvex subgroups etc.
Part 3.
We will cover the basic theory of CAT(0) spaces, working up to the Cartan–
Hadamard theorem and Gromov’s Link Condition. These two results together
enable us to check whether the universal cover of a complex admits a CAT(1)
metric.
Part 4.
We will introduce cube complexes, in which Gromov’s link condition becomes
purely combinatorial. If there is time, we will discuss Haglund–Wise’s special
cube complexes, which combine the good geometric properties of CAT(0) spaces
with some strong algebraic and topological properties.
Pre-requisites
Part IB Geometry and Part II Algebraic topology are required.
Contents
1 Cayley graphs and the word metric
1.1 The word metric
1.2 Free groups
1.3 Finitely-presented groups
1.4 The word problem
2 Van Kampen diagrams
3 Bass–Serre theory
3.1 Graphs of spaces
3.2 The Bass–Serre tree
4 Hyperbolic groups
4.1 Definitions and examples
4.2 Quasi-geodesics and hyperbolicity
4.3 Dehn functions of hyperbolic groups
5 CAT(0) spaces and groups
5.1 Some basic motivations
5.2 CAT(κ) spaces
5.3 Length metrics
5.4 Alexandrov’s lemma
5.5 Cartan–Hadamard theorem
5.6 Gromov’s link condition
5.7 Cube complexes
5.8 Special cube complexes
1 Cayley graphs and the word metric
1.1 The word metric
There is an unfortunate tendency for people to think of groups as being part
of algebra. Which is, perhaps, reasonable. There are elements and there are
operations on them. These operations satisfy some algebraic laws. But there is
also geometry involved. For example, one of the simplest non-abelian groups
we know is
D
6
, the group of symmetries of a triangle. This is fundamentally a
geometric idea, and often this geometric interpretation can let us prove a lot of
things about D
6
.
In general, the way we find groups in nature is that we have some object,
and we look at its symmetry. In geometric group theory, what we do is that we
want to remember the object the group acts on, and often these objects have
some geometric structure. In fact, we shall see that all groups are symmetries of
some geometric object.
Let Γ be a group, and
S
a generating set for Γ. For the purposes of this
course,
S
will be finite. So in this course, we are mostly going to think about
finitely-generated groups. Of course, there are non-finitely-generated groups out
there in the wild, but we will have to put in some more effort to make our ideas
work.
The simplest geometric idea we can think of is that of distance, i.e. a metric.
So we want to use our generating set S to make Γ into a metric space.
Definition
(Word length)
.
Let Γ be a group and
S
a finite generating set. If
γ Γ, the word length of γ is
S
(γ) = min{n : γ = s
±1
1
· · · s
±1
n
for some s
i
S}.
Definition
(Word metric)
.
Let Γ be a group and
S
a finite generating set. The
word metric on Γ is given by
d
S
(γ
1
, γ
2
) =
s
(γ
1
1
γ
2
).
If we stare at the definition, we see that the word metric is left-invariant, i.e.
for any g, γ
1
, γ
2
Γ, we have
d
S
(gγ
1
, gγ
2
) = d
S
(γ
1
, γ
2
).
However, it is usually not the case that the metric is right invariant, i.e.
d
S
(
γ
1
g, γ
2
g
) is not equal to
d
S
(
γ
1
, γ
2
). This is one of the prices we have to
pay for wanting to do geometry.
If we tried to draw our metric space out, it looks pretty unhelpful it’s just
a bunch of dots. It is not path connected. We can’t really think about it well.
To fix this, we talk about Cayley graphs.
Definition (Cayley graph). The Cayley graph Cay
S
(Γ) is defined as follows:
V (Cay
S
(Γ)) = Γ
For each γ Γ and s S, we draw an edge from γ to γs.
If we want it to be a directed and labelled graph, we can label each edge by the
generator s.
There is, of course, an embedding of Γ into
Cay
S
(Γ). Moreover, if we
metrize
Cay
S
(Γ) such that each edge has length [0
,
1], then this is an isometric
embedding.
It will be useful for future purposes to note that there is a left action of Γ
on
Cay
S
(Γ) which extends the left action of Γ on itself. This works precisely
because the edges are defined by multiplication on the right, and left and right
multiplication commute.
Example.
Take Γ =
C
2
=
Z/
2
Z
, and take
S
=
{
1
}
. Then the Cayley graph
looks like
0 1
Example. Take Γ = S
3
and S = {(1 2), (1 2 3)}. The Cayley graph is
Example. Take Γ = Z, and S = {1}. Then the Cayley graph looks like
4 3 2 1 0 1 2 3 4
Example. Take Γ = Z and S = {2, 3}. Then the Cayley graph looks like
Now this seems quite complicated. It’s still the same group Γ =
Z
, but with
a different choice of generating set, it looks very different. It seems like what we
do depends heavily on the choice of the generating set.
Example. If Γ = Z
2
and S = {(1, 0), (0, 1)}, then the Cayley graph is a grid
Example. If Γ = F
2
= ha, bi, S = {a, b}, then the Cayley graph looks like
If one has done algebraic topology, then one might recognize this as the
universal cover of a space. This is not a coincidence, and we will talk about this
later.
Let’s now thing a bit more about the choice of generating set. As one would
expect, the Cayley graph depends a lot on the generating set chosen, but often we
only care about the group itself, and not a group-with-a-choice-of-generating-set.
The key observation is that while the two Cayley graphs of
Z
we drew seem
quite different, if we looked at them from 100 meters away, they look quite
similar they both look like a long, thin line.
Definition
(Quasi-isometry)
.
Let
λ
1 and
c
0. A function between metric
spaces f : X Y is a (λ, c)-quasi-isometric embedding if for all x
1
, x
2
X
1
λ
d
X
(x
1
, x
2
) c d
Y
(f(x
1
), f(x
2
)) λd
X
(x,
1
, x
2
) + c
If, in addition, there is a
C
such that for all
y Y
, there exists
x X
such that
d
Y
(
y, f
(
x
))
C
, we say
f
is a quasi-isometry, and
X
is quasi-isometric to
Y
.
We write X '
qi
Y .
We can think of the first condition as saying we are quasi-injective, and the
second as saying we are quasi-surjective.
The right way to think about the definition is that the
c
says we don’t care
about what happens at scales less than
c
, and the
λ
is saying we allow ourselves
to stretch distances. Note that if we take
c
= 0, then this is just the notion of
a bi-Lipschitz map. But in general, we don’t even require
f
to be continuous,
since continuity is a rather fine-grained property.
Exercise. Check that X '
qi
Y is an equivalence relation.
This is not immediately obvious, because there is no inverse to
f
. In fact, we
need the axiom of choice to prove this result.
Example.
Any bounded metric spaces is quasi-isometric to a point. In particu-
lar, if Γ is finite, then , d
S
) '
qi
1.
For this reason, this is not a great point of view for studying finite groups.
On the other hand, for infinite groups, this is really useful.
Example.
For any Γ and
S
, the inclusion
, d
S
)
(
Cay
S
(Γ)
, d
S
) is a quasi-
isometry (take λ = 1, c = 0, C =
1
2
).
Of course, the important result is that any two word metrics on the same
group are quasi-isomorphic.
Theorem.
For any two finite generating sets
S, S
0
of a group Γ, the identity
map , d
S
) , d
S
0
) is a quasi-isometry.
Proof. Pick
λ = max
sS
S
0
(s), λ
0
= max
sS
0
S
(s),
We then see
S
(γ) λ
0
S
0
(γ),
S
0
(γ) λ
s
(γ).
for all γ Γ. Then the claim follows.
So as long as we are willing to work up to quasi-isometry, we have a canonically
defined geometric object associated to each finitely-generated group.
Our next objective is to be able to state and prove the Schwarz–Milnor
lemma. This is an important theorem in geometric group theory, saying that if
a group Γ acts “nicely” on a metric space
X
, then Γ must be finitely generated,
and in fact
, d
s
) is quasi-isomorphic to
X
. This allows us to produce some
rather concrete geometric realizations of , d
s
), as we will see in examples.
In order to do that, we must write down a bunch of definitions.
Definition
(Geodesic)
.
Let
X
be a metric space. A geodesic in
X
is an isometric
embedding of a closed interval γ : [a, b] X.
This is not exactly the same as the notion in differential geometry. For
example, if we have a sphere and two non-antipodal points on the sphere, then
there are many geodesics connecting them in the differential geometry sense,
but only the shortest one is a geodesic in our sense. To recover the differential
geometry notion, we need to insert the words “locally” somewhere, but we shall
not need that.
Definition
(Geodesic metric space)
.
A metric space
X
is called geodesic if every
pair of points x, y X is joined by a geodesic denoted by [x, y].
Note that the notation [
x, y
] is not entirely honest, since there may be
many geodesics joining two points. However, this notation turns out to work
surprisingly well.
Definition
(Proper metric space)
.
A metric space is proper if closed balls in
X
are compact.
For example, infinite-dimensional Hilbert spaces are not proper, but finite-
dimensional ones are.
Example.
If Γ =
hSi
, then
Cay
S
(Γ) is geodesic. If
S
is finite, then
Cay
S
(Γ) is
proper.
Example.
Let
M
be a connected Riemannian manifold. Then there is a metric
on M defined by
d(x, y) = inf
α:xy
length(α),
where we take the infimum over all smooth paths from x to y.
The Hopf–Rinow theorem says if
M
is complete (as a metric space), then the
metric is proper and geodesic. This is great, because completeness is in some
sense a local property, but “proper” and “geodesic” are global properties.
We need two more definitions, before we can state the Schwarz–Milnor lemma.
Definition
(Proper discontinuous action)
.
An action Γ on a topological space
X is proper discontinuous if for every compact set K, the set
{g Γ : gK K 6= ∅}
is finite.
Definition
(Cocompact action)
.
An action Γ on a topological space
X
is
cocompact if the quotient Γ \ X is compact.
Lemma
(Schwarz–Milnor lemma)
.
Let
X
be a proper geodesic metric space,
and let Γ act properly discontinuously, cocompactly on X by isometries. Then
(i) Γ is finitely-generated.
(ii) For any x
0
X, the orbit map
Γ X
γ 7→ γx
0
is a quasi-isometry , d
s
) '
qi
(X, d).
An easy application is that Γ acting on its Caylay graph satisfies these
conditions. So this reproduces our previous observation that a group is quasi-
isometric to its Cayley graph. More interesting examples involve manifolds.
Example.
Let
M
be a closed (i.e. compact without boundary), connected
Riemannian manifold. Then the universal cover
˜
M
is also a complete, connected,
Riemannian manifold. By the Hopf–Rinow theorem, this is proper and geodesic.
Since the metric of
˜
M
is pulled back from
M
, we know
π
1
(
M
) acts on
˜
M
by
isometries. Therefore by the Schwarz–Milnor lemma, we know
π
1
(M) '
qi
˜
M.
Example.
The universal cover of the torus
S
1
× S
1
is
R
2
, and the fundamental
group is Z
2
. So we know Z
2
'
qi
R
2
, which is not surprising.
Example. Let M = Σ
2
, the surface of genus 2.
We then have
π
1
Σ
2
=
ha
1
, b
1
, a
2
, b
2
| [a
1
, b
1
][a
2
, b
2
]i.
On the other hand, the universal cover can be thought of as the hyperbolic plane
H
2
, as we saw in IB Geometry.
So it follows that
π
1
Σ
2
'
qi
H
2
.
This gives us some concrete hold on what the group
π
1
Σ
2
is like, since people
have been thinking about the hyperbolic plane for centuries.
Proof of Schwarz–Milnor lemma.
Let
¯
B
=
¯
B
(
x, R
) be such that Γ
¯
B
=
X
. This
is possible since the quotient is compact.
Let S = {γ Γ : γ
¯
B
¯
B 6= ∅}. By proper discontinuity, this set is finite.
We let
r = inf
γ
0
6∈S
d(
¯
B, γ
0
¯
B).
If we think about it, we see that in fact
r
is the minimum of this set, and in
particular r > 0.
Finally, let
λ = max
sS
d(x
0
, sx
0
).
We will show that
S
generates Γ, and use the word metric given by
S
to show
that Γ is quasi-isometric to X.
We first show that Γ =
hSi
. We let
γ
Γ be arbitrary. Let [
x
0
, γx
0
] be a
geodesic from x
0
to γx
0
. Let be such that
( 1)r d(x
0
, γx
0
) < r.
Then we can divide the geodesic into
pieces of length about
r
. We can choose
x
1
, . . . x
`1
, x
`
= γx
0
such that d(x
i1
, x
i
) < r.
By assumption, we can pick
γ
i
Γ such that
x
i
γ
i
¯
B
, and further we pick
γ
`
= γ, γ
0
= e. Now for each i, we know
d(
¯
B, γ
1
i1
γ
i
¯
B) = d(γ
i1
¯
B, γ
i
¯
B) d(x
i1
, x
i
) < r.
So it follows that γ
1
i1
γ
i
S. So we have
γ = γ
`
= (γ
1
0
γ
1
)(γ
1
1
γ
2
) · · · (γ
1
`1
γ
`
) hSi.
This proves Γ = hSi.
To prove the second part, we simply note that
r r d(x
0
γx
0
).
We also saw that
is an upper bound for the word length of
γ
under
S
. So we
have
r
s
(γ) r d(x
0
, γx
0
).
On the other hand, by definition of λ, we have
d(x
0
, γx
0
) λ
s
(γ).
So the orbit map is an orbit-embedding, and quasi-surjectivity follows from
cocompactness.
1.2 Free groups
If we want to understand a group, then a reasonable strategy now would be to
come up with a well-understood (proper geodesic) space for Γ to act (proper
discontinuously, cocompactly) on, and see what we can tell from this quasi-
isometry. There is no general algorithm for doing so, but it turns out for certain
groups, there are some “obvious” good choices to make.
Let’s begin by understanding the free group. In some sense, they are the
“simplest” groups, but they also contain some rich structure. Roughly speaking,
a free group on a set
S
is a group that is “freely” generated by
S
. This freeness
is made precise by the following universal property.
Definition
(Free group)
.
Let
S
be a (usually finite) set. A group
F
(
S
) with a
map
S F
(
S
) is called the free group on
S
if it satisfies the following universal
property: for any set map
S G
, there is a unique group homomorphism
F (S) G such that the following diagram commutes:
S F (S)
G
.
Usually, if |S| = r, we just write F (S) = F
r
.
In fancy category-theoretic language, this says the functor
F
:
Sets Grps
is left adjoint to the forgetful functor U : Grps Sets.
This definition is good for some things, but not good for others. One thing it
does well is it makes it clear that
F
(
S
) is unique up to isomorphism if it exists.
However, it is not immediately clear that
F
(
S
) exists! (unless one applies the
adjoint functor theorem)
Thus, we must concretely construct a group satisfying this property. Apart
from telling us that
F
(
S
) actually exists, a concrete construction lets us actually
do things with the group.
For concreteness, suppose
S
=
{a
1
, . . . , a
r
}
is a finite set. We consider a
graph
X
r
=
.
.
.
where there are r “petals”. This is known as the rose with r petals.
We claim that its fundamental group is
F
(
S
). To see this, we have to
understand the universal cover
˜
X
r
of
X
r
. We know
˜
X
r
is a simply connected
graph, so it is a tree. Moreover, since it is a covering space of
X
r
, it is regular
with degree 2r on each vertex.
If r = 2, then this is our good old picture
a
1
a
1
a
2
a
2
We can use this to understand the fundamental group
π
1
(
X
r
). The elements
on
π
1
(
X
r
) are homotopy classes [
γ
] of based loops in
X
r
. Using the homotopy
lifting lemma, this map can be lifted to a path
˜γ
in the universal cover starting
at .
In general, there are many paths that represent the same element. However, there
is a “canonical” such path. Indeed, we can eliminate all backtracks in the path
to obtain an embedded path from
to the end point of
˜γ
. We may reparametrize
so that it spends equal time on each edge, but that is not important.
Algebraically, how do we understand this? Each loop in
X
r
represents an
element
a
i
in
π
1
(
X
r
). If we look at the “canonical” form of the path above, then
we see that we are writing γ in the form
γ = a
±1
i
1
a
±1
i
2
a
±1
i
3
· · · a
±1
i
n
.
The fact that we have eliminated all backtracks says this expression is reduced,
i.e. we don’t see anything looking like
a
i
a
1
i
, or
a
1
i
a
i
. Importantly, there is a
unique way to write
γ
as a reduced word, corresponding to the unique embedded
path from
to the end point of
˜γ
in the universal cover. Thus, if we somehow
expressed
γ
as a non-reduced word in the
a
1
, . . . , a
r
, then we can reduce it by
removing instances of
a
i
a
1
i
or
a
1
i
a
i
, and keep doing so until we don’t see any.
This terminates since the length of the word is finite, and the result does not
depend on the order we performed the reduction in, by uniqueness.
Example. Take the case π
1
X
2
= ha, bi. We might be given a word
a
3
a
2
b
2
b
3
a
5
a
3
a
3
b
1
.
We can reduce this to
aba
1
b
1
.
In particular, since this is not the identity, we know the original path was not
null-homotopic.
This gives us a very concrete understanding of
π
1
(
X
r
) it is generated
by
a
1
, . . . , a
r
, and each element is represented uniquely by a reduced word. To
multiply words together, write them one after another, and then reduce the
result.
From this description, it is clear that
Corollary. π
1
(
X
r
) has the universal property of
F
(
S
), with
S
=
{a
1
, . . . , a
r
}
.
So π
1
(X
r
)
=
F
r
.
Proof. Given any map f : S G, define
˜
f : F (S) G by
˜
f(a
±1
i
1
· · · a
±1
i
n
) =
˜
f(a
i
)
±1
· · ·
˜
f(a
i
n
)
±1
for any reduced word
a
±1
i
1
· · · a
±1
i
n
. This is easily seen to be well-defined and is
the unique map making the diagram commute.
1.3 Finitely-presented groups
Let’s try to consider groups that are slightly more complex than free groups. If a
group Γ is generated by
S
, then we have a surjective homomorphism
F
(
S
)
Γ.
Let K = ker η. Then the first isomorphism theorem tells us
Γ
=
F (S)
K
.
Since we understand
F
(
S
) quite explicitly, it would be nice if we have a solid
gasp on
K
as well. If
R
normally generates
K
, so that
K
=
hhRii
, then we say
that hS | Ri is a presentation for Γ. We will often write that
Γ
=
hS | Ri.
Example.
Z
2
= ha, b | aba
1
b
1
i
Definition
(Finitely-presented group)
.
A finitely-presentable group is a group
Γ such that there are finite S and R such that Γ
=
hS | Ri.
A finitely-presented group is a group Γ equipped
S
and
R
such that Γ
=
hS |
Ri.
Presentations give us ways to geometrically understand a group. Given a
presentation
P
=
hS | Ri
, we can construct space
X
P
such that
π
1
X
P
=
hS | Ri
To do so, we first construct a rose with
|S|
many petals, each labeled by an
element of
S
. For each
r R
, glue a disk onto the rose along the path specified
by r. The Seifert–van Kampen theorem then tells us π
1
X
P
=
Γ.
Example.
We take the presentation
Z
2
=
ha, b | aba
1
b
1
i
. If we think hard
enough (or a bit), we see this construction gives the torus.
Conversely, if we are given a connected cell complex
X
, we can obtain a
presentation of the fundamental group. This is easy if the cell complex has a
single 0-cell. If it has multiple 0-cells, then we choose a maximal tree in
X
(1)
,
and the edges
S
=
{a
i
}
not in the maximal tree define a generating set for
π
1
X
(1)
. The attaching maps of the 2-cells in
X
define elements
R
=
{r
j
}
of
π
1
X
(1)
, and these data define a presentation P
X
= hS | Ri for π
1
X.
This is not canonical, since we have to pick a maximal tree, but let’s not
worry too much about that. The point of maximal trees is to get around the
problem that we might have more than one vertex.
Exercise. If X has one vertex, then Cay
S
π
1
X =
˜
X
(1)
.
1.4 The word problem
Long time ago, Poincar´e was interested in characterizing the 3-sphere. His
first conjecture was that if a closed 3-manifold
M
had
H
(
M
)
=
H
(
S
3
), then
M
=
S
3
. Poincar´e thought very hard about the problem, and came up with a
counterexample. The counterexample is known as the Poincar´e homology sphere.
This is a manifold P
3
with
H
(P )
=
H
(S
3
),
but it turns out there is a surjection
π
1
(
P
3
)
A
5
. In particular,
P
3
is not
homeomorphic to a sphere.
So he made a second conjecture, that if
M
is a compact manifold with
π
1
M
=
1, then
M
=
S
3
. Note that the condition already implies
H
(
M
)
=
H
(
S
3
)
by Hurewicz theorem and Poincar´e duality. This is known as the Poincar´e
conjecture, which was proved in 2002 by Perelman.
But there is more to say about Poincar´e’s initial conjecture. Some time later,
Max Dehn constructed an infinite family of 3d homology spheres. In order to
prove that he genuinely had infinitely many distinct homology spheres, he had
to show that his homology spheres had different fundamental groups. He did
manage to write down the presentations of the fundamental groups, and so what
was left to do is to distinguish whether the presentations actually presented the
same group.
To make sense of this statement, we must have some way to distinguish
between the homology spheres. To do so, he managed to write down the
presentation of the fundamental group of his homology spheres, and he had to
figure out if those groups are genuinely the same.
For our purposes, perhaps we should be slightly less ambitious. Suppose we
are given a presentation
hS | Ri
of a finitely-presented group Γ. Define the set
of alphabets
S
±
= S q S
1
= {s, s
1
: s S},
and let
S
be the set of all finite words in
S
±
. Then the fundamental question is,
given a word
w S
, when does it represent the identity element 1
Γ? Ideally,
we would want an algorithm that determines whether this is the case.
Example.
Recall that in the free group
F
(
S
) =
hS | ∅i
, we had a normal form
theorem, namely that every element in Γ can be written uniquely as a product
of generator (and inverses) such that there aren’t any occurrences of
ss
1
or
s
1
s
. This gives a way of determining whether a given word
w S
represents
the identity. We perform elementary reductions, removing every occurrence of
ss
1
or
s
1
s
. Since each such procedure reduces the length of the word, this
eventually stops, and we would end up with a reduced word. If this reduced
word is empty, then w represents the identity. If it is non-empty, w does not.
Thus, we have a complete solution to the word problem for free groups, and
moreover the algorithm is something we can practically implement.
For a general finitely-presented group, this is more difficult. We can first
reformulate our problem. The presentation Γ =
hS | Ri
gives us a map
F
(
S
)
Γ,
sending
s
to
s
. Each word gives us an element in
F
(
S
), and thus we can
reformulate our problem as identifying the elements in the kernel of this map.
Lemma.
Let Γ =
hS | Ri
. Then the elements of
ker
(
F
(
S
)
Γ) are precisely
those of the form
d
Y
i=1
g
i
r
±1
i
g
1
i
,
where g
i
F (S) and r
i
R.
Proof.
We know that
ker
(
F
(
S
)
Γ) =
hhRii
, and the set described above is
exactly hhRii, noting that gxyg
1
= (gxg
1
)(gyg
1
).
This tells us the set of elements in
S
that represent the identity is recursively
enumerable, i.e. there is an algorithm that lists out all words that represent the
identity. However, this is not very helpful when we want to identify whether a
word represents the identity. If it does, then our algorithm will eventually tell
us it is (maybe after 3 trillion years), but if it doesn’t, the program will just run
forever, and we will never know that it doesn’t represent the identity.
Example. Let Γ = Z
2
=
ha, b | [a, b]i. Consider the word
w = a
n
b
n
a
n
b
n
We see that this represents our identity element. But how long would it take for
our algorithm to figure this fact out? Equivalently, what
d
do we need to pick
so that w is of the form
d
Y
i=1
g
i
r
±1
i
g
1
i
?
We can draw a picture. Our element [a, b] looks like
bb
a
a
If, say, n = 2, then w is given by
a
a
bb
a
a
bb
If we think about it hard, then we see that what we need to do is to find out how
to fill in this big square with the small squares, and we need 4 squares. Indeed,
we can read off the sequence
w = [a, b] · (b[a, b]b
1
) · (b
2
ab
1
[a, b]ba
1
b
2
) · (b
2
a
2
b
1
a
1
b
1
[a, b]baba
2
b
2
),
which corresponds to filling in the square one by one as follows:
Of course, this is fine if we know very well what the Cayley graph of the
group looks like, but in general it is quite hard. Indeed, solving the word problem
is part of what you do if you want to draw the Cayley graph, since you need to
know when two words give the same element!
So how do we solve the word problem? Our previous partial algorithm would
make a good, full solution if we knew how far we had to search. If we know that
we will only need at most
d
= 10
10
100
, then if we searched for all expressions of
the form
Q
d
i=1
g
i
r
±
i
g
1
i
for
d <
10
10
100
, and didn’t find
w
, then we know
w
does
not represent the identity element (we will later argue that we don’t have to
worry about there being infinitely many g
i
’s to look through).
Definition (Null-homotopic). We say w S
is null-homotopic if w = 1 in Γ.
Definition
(Algebraic area)
.
Let
w S
be mull-homotopic. Its algebraic area
is
Area
a,P
(w) = min
(
d : w =
d
Y
i=1
g
i
r
±1
i
g
1
i
)
.
We write the subscript
a
to emphasize this is the algebraic area. We will
later define the geometric area, and we will write it as
Area
g
. Afterwards, we
will show that they are the same, and we will stop writing the subscript.
Let us use
| · |
S
to denote word length in
F
(
S
), while
S
continues to denote
the word length in Γ.
Definition
(Dehn function)
.
Then Dehn function is the function
δ
P
:
N N
mapping
n 7→ max {Area
a,P
(w) | |w|
S
n, w is null-homotopic} .
This Dehn function measures the difficulty of the word problem in P.
Proposition.
The word problem for
P
is solvable iff
δ
P
is a computable function.
We will postpone the proof. The hard part of the proof is that we don’t have
to worry about the infinitely many possible g
i
that may be used to conjugate.
It would be really nice if the Dehn function can be viewed as a property of
the group, and not the presentation. This requires us to come up with a notion
of equivalence relation on functions N [0, ) (or [0, ) [0, )).
Definition (4). We write f 4 g iff for some C > 0, we have
f(x) Cg(Cx + C) + Cx + C.
for all x.
We write f g if f 4 g and g 4 f .
This captures the (super-linear) asymptotic behaviour of f.
Example. For α, β 1, n
α
4 n
β
iff α β.
Proposition. If P and Q are two finite presentations for Γ, then δ
P
δ
Q
.
We start with two special cases, and then show that we can reduce everything
else to these two special cases.
Lemma. If R
0
hhRii is a finite set, and
P = hS | Ri, Q = hS | R R
0
i,
then δ
P
' δ
Q
.
Proof. Clearly, δ
Q
δ
P
. Let
m = max
r
0
R
0
Area
P
(r
0
).
It is then easy to see that
Area
P
(w) m Area
Q
(w).
Lemma. Let P = hS | Ri, and let
Q = hS q T | R q R
0
i,
where
R
0
= {tw
1
t
: t T, w
t
F (S)}.
Then δ
P
δ
Q
.
Proof. We first show δ
P
4 δ
Q
. Define
ρ : F (S q T ) F (s)
s 7→ s
t 7→ w
t
.
In particular, ρ(r) = r for all r R and ρ(r
0
) = 1 for all r
0
R
0
.
Given w F (S), we need to show that
Area
P
(w) Area
Q
(w).
Let d = Area
Q
(w). Then
w =
d
Y
i=1
g
i
r
±1
i
g
1
i
,
where
g
i
F
(
S
`
T
)
, r
i
R R
0
. We now apply
ρ
. Since
ρ
is a retraction,
ρ(w) = w. Thus,
w =
d
Y
i=1
ρ(g
i
)ρ(r
i
)
±1
ρ(g
i
)
1
.
Now
ρ
(
r
i
) is either
r
i
or 1. In the first case, nothing happens, and in the second
case, we can just forget the ith term. So we get
w =
d
Y
i=1,r
i
R
ρ(g
i
)r
±1
i
ρ(g
i
)
1
.
Since this is a valid proof in P that w = 1, we know that
Area
P
(w) d = Area
Q
(w).
We next prove that
δ
Q
4 δ
P
. It is unsurprising that some constants will appear
this time, instead of just inequality on the nose. We let
C = max
tT
|w
t
|
S
.
Consider a null-homotopic word w F (S q T ). This word looks like
w = s
±
i
1
s
±
i
2
· · · t
j
1
· · · s · · · t · · · F (S q T ).
We want to turn these t’s into s’s, and we need to use the relators to do so.
We apply relations from R
0
to write this as
w
0
= s
±
i
1
s
±
i
2
· · · w
t
j
1
· · · s · · · w
t
· · · F (S).
We certainly have |w
0
|
S
C|w|
SqT
. With a bit of thought, we see that
Area
Q
(w) Area
P
(w
0
) + |w|
SqT
δ
P
(C|w|
SqT
) + |w|
SqT
.
So it follows that
δ
Q
(n) δ
P
(Cn) + n.
Proof of proposition.
Let
P
=
hS | Ri
and
Q
=
hS
0
| R
0
i
. If
P, Q
present
isomorphic groups, then we can write
s = u
s
F (S
0
) for all s S
Similarly, we have
s
0
= v
s
0
F (S) for all s
0
S
0
We let
T = {su
1
s
| s S}
T
0
= {s
0
v
1
s
0
| s
0
S
0
}
We then let
M = hS q S
0
| R R
0
T T
0
i.
We can then apply our lemmas several times to show that δ
P
δ
M
δ
Q
.
Now we can talk about
δ
Γ
, the Dehn function of Γ, as long as we know we
are only talking up to . In fact, it is true that
Fact. If Γ
1
'
qi
Γ
2
, then δ
Γ
1
δ
Γ
2
.
The proof is rather delicate, and we shall not attempt to reproduce it.
2 Van Kampen diagrams
Recall we previously tried to understand the Dehn function of
Z
2
in terms of
filling in squares on a grid. In this chapter, we pursue this idea further, and
come up with an actual theorem.
To make this more precise, recall that in the case of
Z
2
, we could embed a
null-homotopic word into the plane of the form
We saw, at least heuristically, the algebraic area of
w
is just the geometric area
of this figure. Of course, sometimes it is slightly more complicated. For example,
if we have the word w = [a, b]
2
, then we have to use the cell [a, b] twice.
Definition
(Singular disc diagram)
.
A (singular) disc diagram is a compact,
contractible subcomplex of R
2
, e.g.
We usually endow the disc diagram D with a base point p, and define
Area
g
(D) = number of 2-cells of D
and
Diam
p
(D) = length of the longest embedded path in D
(1)
, starting at p.
If we orient
R
2
, then
D
has a well-defined boundary cycle. A disc diagram is
labelled if each (oriented) edge is labelled by an element
s S
±
. The set of face
labels is the set of boundary cycles of the 2-cells.
If all the face labels are elements of
R
±
, then we say
D
is a diagram over
hS | Ri.
Definition
(van Kampen diagram)
.
If
w hhRii
, and
w
is the boundary cycle
of a singular disc diagram
D
over the presentation
hS | Ri
, we say that
D
is a
van Kampen diagram for w.
Example. The diagram
a
b
is a van Kampen diagram for abababa
2
b
1
a
1
b
1
ab
1
a
1
.
Note that in this case, the map
S
1
X
P
that represents
w
factors through
a map D X
P
.
Lemma
(van Kampen’s lemma)
.
Let
P
=
hS | Ri
be a presentation and
w S
.
Then the following are equivalent:
(i) w = 1 in Γ presented by P (i.e. w is null-homotopic)
(ii) There is a van Kampen diagram for w given P.
If so, then
Area
a
(w) = min{Area
g
(D) : D is a van Kampen diagram for w over P}.
Proof. In one direction, given
w =
n
Y
i=1
g
i
r
±
i
g
1
i
F (S)
such that w = 1 Γ, we start by writing down a “lollipop diagram”
g
d
r
±
d
g
3
r
±
3
g
2
r
±
2
g
1
r
±
1
.
.
.
This defines a diagram for the word
Q
n
i=1
g
i
r
±
i
g
1
i
which is equal in
F
(
S
) to
w
, but is not exactly
w
. However, note that performing elementary reductions
(or their inverses) correspond to operations on the van Kampen diagram that
does not increase the area. We will be careful and not say that the area doesn’t
change, since we may have to collapse paths that look like
In the other direction, given a diagram
D
for
w
, we need to produce an
expression
w =
d
Y
i=1
g
i
r
±
i
g
1
i
such that d Area(D).
We let
e
be the first 2-cell that the boundary curve arrives at, reading
anticlockwise around
D
. Let
g
be the path from
p
to
e
, and let
r
be the relator
read anti-clockwise around e. Let
D
0
= D e,
and let w
0
= D
0
. Note that
w = gr
±1
g
1
w
0
.
Since Area(D
0
) = Area(D) 1, we see by induction that
w =
d
Y
i=1
g
i
r
±1
i
g
1
i
with d = Area(D).
We observe that in our algorithm about, the
g
i
’s produced have length
Diam(D). So we see that
Corollary. If w is null-homotopic, then we can write w in the form
w =
d
Y
i=1
g
i
r
±
i
g
1
i
,
where
|g
i
|
S
Diam D
with D a minimal van Kampen diagram for w. We can further bound this by
(max |r
i
|
S
) Area(D) + |w|
S
constant · δ
P
(|w|
S
) + |w|
S
.
So in particular, if we know
δ
P
(
|w|
S
), then we can bound the maximum
length of g
i
needed.
It is now easy to prove that
Proposition.
The word problem for a presentation
P
is solvable iff
δ
P
is
computable.
Proof.
()
By the corollary, the maximum length of a conjugator
g
i
that we need to
consider is computable. Therefore we know how long the partial algorithm
needs to run for.
()
To compute
δ
P
(
n
), we use the word problem solving ability to find all
null-homotopic words in
F
(
S
) of length
n
. Then for each
d
, go out and
look for expressions
w =
d
Y
i=1
g
i
r
±1
i
g
1
i
.
A naive search would find the smallest area expression, and this gives us
the Dehn function.
It is a hard theorem that
Theorem
(Novikov–Boone theorem)
.
There exists a finitely-presented group
with an unsolvable word problem.
Corollary. δ
P
is sometimes non-computable.
Let’s look at some applications to geometry. In geometry, people wanted to
classify manifolds. Classifying (orientable) 2-dimensional manifolds is easy. They
are completely labeled by the genus. This classification can in fact be performed
by a computer. If we manage to triangulate our manifold, then we can feed the
information of the triangulation into a computer, and then the computer can
compute the Euler characteristic.
Is it possible to do this in higher dimensions? It turns out the word problem
gives a severe hindrance to doing so.
Theorem.
Let
n
4 and Γ =
hS | Ri
be a finitely-presented group. Then we
can construct a closed, smooth, orientable manifold M
n
such that π
1
M
=
Γ.
This is a nice, little surgery argument.
Proof. Let S = {a
1
, . . . , a
m
} and R = {r
1
, . . . , r
n
}. We start with
M
0
= #
m
i=0
(S
1
× S
n1
).
Note that when we perform this construction, as n 3, we have
π
1
M
0
=
F
m
by Seifert–van Kampen theorem. We now construct M
k
from M
k1
such that
π
1
M
k
=
ha
1
, . . . , a
m
| r
1
, . . . , r
k
i.
We realize
r
k
as a loop in
M
k1
. Because
n
3, we may assume (after a small
homotopy) that this is represented by a smooth embedded map
r
k
:
S
1
M
k1
.
We take
N
k
to be a smooth tubular neighbourhood of
r
k
. Then
N
k
=
S
1
× D
n1
M
k1
. Note that N
k
=
S
1
× S
n2
.
Let
U
k
=
D
2
× S
n2
. Notice that
U
k
=
N
k
. Since
n
4, we know
U
k
is
simply connected. So we let
M
0
k1
= M
k
\
˚
N
k
,
a manifold with boundary
S
1
× S
n2
. Choose an orientation-reversing diffeo-
morphism ϕ
k
: U
k
M
0
k1
. Let
M
k
= M
0
k1
ϕ
k
U
k
.
Then by applying Seifert van Kampen repeatedly, we see that
π
1
M
k
= π
1
M
k1
/hhr
k
ii,
as desired.
Thus, if we have an algorithm that can distinguish between the fundamental
groups of manifolds, then that would solve (some variant of) the word problem
for us, which is impossible.
Finally, we provide some (even more) geometric interpretation of the Dehn
function.
Definition
(Filling disc)
.
Let (
M, g
) be a closed Riemannian manifold. Let
γ
:
S
1
M
be a smooth null-homotopic loop. A filling disk for
γ
in
M
is a
smooth map f : D
2
M such that the diagram
S
1
D
2
M
γ
f
Since there is a metric
g
on
M
, we can pull it back to a (possibly degenerate)
metric
f
g
on
D
2
, and hence measure quantities like the length
(
γ
) of
γ
and
the area Area(f) of D
2
.
A classic result is
Theorem
(Douglas, Rad´u, Murray)
.
If
γ
is embedded, then there is a least-area
filling disc.
So we can define
Definition (FArea).
FArea(γ) = inf{Area(f) | f : D
2
M is a filling disc for γ}.
Definition (Isoperimetric function). The isoperimetric function of (M, g) is
F iU
M
: [0, ) [0, )
7→ sup{FArea(γ) : γ : S
1
M is a smooth null-homotopic loop, (γ) }
Theorem
(Filling theorem)
.
let
M
be a closed Riemannian manifold. Then
F iU
M
' δ
π
1
M
.
By our previous construction, we know this applies to every finitely-presented
group.
3 Bass–Serre theory
3.1 Graphs of spaces
Bass–Serre theory is a way of building spaces by gluing old spaces together, in a
way that allows us to understand the fundamental group of the resulting space.
In this section, we will give a brief (and sketchy) introduction to Bass-Serre
theory, which generalizes some of the ideas we have previously seen, but they
will not be used in the rest of the course.
Suppose we have two spaces
X, Y
, and we want to glue them along some
subspace. For concreteness, suppose we have another space
Z
, and maps
:
Z X
and
+
:
Z Y
. We want to glue
X
and
Y
by identifying
(
z
)
+
(
z
)
for all z Z.
If we simply take the disjoint union of
X
and
Y
and then take the quotient,
then this is a pretty poorly-behaved construction. Crucially, if we want to
understand the fundamental group of the resulting space via Seifert–van Kampen,
then the maps
±
must be very well-behaved for the theorem to be applicable.
The homotopy pushout corrects this problem by gluing like this:
+
X Y
Z
Definition
(Homotopy pushout)
.
Let
X, Y, Z
be spaces, and
:
Z X
and
+
: Z Y be maps. We define
X q
Z
Y = (X q Y q Z × [1, 1])/ ,
where we identify
±
(z) (z, ±1) for all z Z.
By Seifert–van Kampen, we know π
1
(X q
Z
Y ) is the pushout
π
1
Z π
1
(X)
π
1
Y π
1
(X q
Z
Y )
+
In other words, we have
π
1
(X q
Z
Y )
=
π
1
X
π
1
Z
π
1
Y.
In general, this is more well-behaved if the maps
±
are in fact injective, and we
shall focus on this case.
With this construction in mind, we can try to glue together something more
complicated:
Definition
(Graph of spaces)
.
A graph of spaces
X
consists of the following
data
A connected graph Ξ.
For each vertex v V (Ξ), a path-connected space X
v
.
For each edge e E(Ξ), a path-connected space X
e
.
For each edge
e E
(Ξ) attached to
v
±
V
(Ξ), we have
π
1
-injective maps
±
e
: X
e
X
v
±
.
The realization of X is
|X | = X =
`
vV (Ξ)
X
v
q
`
eE(Ξ)
(X
e
× [1, 1])
(e E(Ξ), x X
e
, (x, ±1)
±
e
(x))
.
These conditions are not too restrictive. If our vertex or edge space were
not path-connected, then we can just treat each path component as a separate
vertex/edge. If our maps are not
π
1
injective, as long as we are careful enough,
we can attach 2-cells to kill the relevant loops.
Example.
A homotopy pushout is a special case of a realization of a graph of
spaces.
Example. Suppose that the underlying graph looks like this:
v
e
The corresponding gluing diagram looks like
Fix a basepoint
X
e
. Pick a path from
e
(
) to
+
e
(
). This then gives a
loop
t
that “goes around”
X
e
×
[
1
,
1] by first starting at
e
(
), move along
× [1, 1], then returning using the path we chose.
Since every loop inside
X
e
can be pushed down along the “tube” to a loop
in X
v
, it should not be surprising that the group π
1
(X) is in fact generated by
π
1
(X
v
) and t.
In fact, we can explicitly write
π
1
X =
π
1
X
v
hti
hh(
+
e
)
(g) = t(
e
)
(g)t
1
g π
1
X
e
ii
.
This is known as an HNN extension. The way to think about this is as follows
— we have a group
π
1
X
v
, and we have two subgroups that are isomorphic to each
other. Then the HNN extension is the “free-est” way to modify the group so
that these two subgroups are conjugate.
How about for a general graph of spaces? If Ξ is a graph of spaces, then its
fundamental group has the structure of a graph of groups G.
Definition (Graph of groups). A graph of groups G consists of
A graph Γ
Groups G
v
for all v V (Γ)
Groups G
e
for all e E(Γ)
For each edge e with vertices v
±
(e), injective group homomorphisms
±
e
: G
e
G
v
±
(e)
.
In the case of a graph of spaces, it was easy to define a realization. One way
to do so is that we have already see how to do so in the two above simple cases,
and we can build the general case up inductively from the simple cases, but that
is not so canonical. However, this has a whole lot of choices involved. Instead,
we are just going to do is to associate to a graph of groups
G
a graph of spaces
X
which “inverts” the natural map from graphs of spaces to graphs of groups,
given by taking π
1
of everything.
This can be done by taking Eilenberg–Maclane spaces.
Definition
(Aspherical space)
.
A space
X
is aspherical if
˜
X
is contractible. By
Whitehead’s theorem and the lifting criterion, this is true iff
π
n
(
X
) = 0 for all
n 2.
Proposition.
For all groups
G
there exists an aspherical space
BG
=
K
(
G,
1)
such that
π
1
(
K
(
G,
1))
=
G
. Moreover, for any two choices of
K
(
G,
1) and
K
(
H,
1), and for every homomorphism
f
:
G H
, there is a unique map (up to
homotopy)
¯
f
:
K
(
G,
1)
K
(
H,
1) that induces this homomorphism on
π
1
. In
particular, K(G, 1) is well-defined up to homotopy equivalence.
Moreover, we can choose
K
(
G,
1) functorially, namely there are choices of
K
(
G,
1) for each
G
and choices of
¯
f
such that
f
1
f
2
=
f
1
¯
f
2
and
id
G
=
id
K(G,1)
for all f, G, H.
These K(G, 1) are known as Eilenberg–MacLane spaces.
When we talked about presentations, we saw that this is true if we don’t
have the word “aspherical”. But the aspherical requirement makes the space
unique (up to homotopy).
Using Eilenberg–MacLane spaces, given any graph of groups
G
, we can
construct a graph of spaces Ξ such that when we apply
π
1
to all the spaces in
X , we recover G.
We can now set
π
1
G = π
1
|X |.
Note that if Γ is finite, and all the G
v
’s are finitely-generated, then π
1
G is also
finitely-generated, which one can see by looking into the construction of
K
(
G,
1).
If Γ is finite, all
G
v
’s are finitely-presented and all
G
e
’s are finitely-generated,
then π
1
G is finitely-presented.
For more details, read Trees by Serre, or Topological methods in group theory
by Scott and Wall.
3.2 The Bass–Serre tree
Given a graph of groups
G
, we want to analyze
π
1
G
=
π
1
X
, and we do this by
the natural action of G on
˜
X by deck transformations.
Recall that to understand the free group, we looked at the universal cover of
the rose with r petals. Since the rose was a graph, the universal cover is also a
graph, and because it is simply connected, it must be a tree, and this gives us a
normal form theorem. The key result was that the covering space of a graph is
a graph.
Lemma.
If
X
is a graph of spaces and
ˆ
X X
is a covering map, then
ˆ
X
naturally has the structure of a graph of spaces
ˆ
X
, and
p
respects that structure.
Note that the underlying graph of
ˆ
X
is not necessarily a covering space of
the underlying graph of X .
Proof sketch. Consider
[
vV (Ξ)
X
v
X.
Let
p
1
[
vV (Ξ)
X
V
=
a
ˆvV (
ˆ
Ξ)
ˆ
W
ˆv
.
This defines the vertices of
ˆ
Ξ
, the underlying graph of
ˆ
X
. The path components
ˆ
X
ˆv
are going to be the vertex spaces of
ˆ
X
. Note that for each
ˆv
, there exists a
unique v V (Ξ) such that p :
ˆ
X
ˆv
X
v
is a covering map.
Likewise, the path components of
p
1
[
eE(Ξ)
X
e
× {0}
form the edge spaces
`
eE(Ξ)
ˆ
X
ˆe
of
ˆ
X
, which again are covering spaces of the
edge space of X.
Now let’s define the edge maps
±
ˆe
for each
ˆe E
(
ˆ
Ξ
)
7→ e E
(Ξ). To do so,
we consider the diagram
ˆ
X
ˆe
ˆ
X
ˆe
× [1, 1]
ˆ
X
X
e
X
e
× [1, 1] X
p
By the lifting criterion, for the dashed map to exist, there is a necessary and
sufficient condition on (
ˆ
X
ˆe
×
[
1
,
1]
X
e
×
[
1
,
1]
X
)
. But since this
condition is homotopy invariant, we can check it on the composition (
ˆ
X
ˆe
X
e
X
)
instead, and we know it must be satisfied because a lift exists in this
case.
The attaching maps
±
ˆe
:
ˆ
X
e
ˆ
X
are precisely the restriction to
ˆ
X
e
×{±
1
}
ˆ
X.
Finally, check using covering space theory that the maps
ˆ
X
ˆe
×
[
1
,
1]
ˆ
X
can injective on the interior of the cylinder, and verify that the appropriate maps
are π
1
-injective.
Now let’s apply this to the universal cover
˜
X X
. We see that
˜
X
has a
natural action of G = π
1
X, which preserves the graph of spaces structure.
Note that for any graph of spaces X, there are maps
ι : Ξ X
ρ : X Ξ
such that
ρ ι ' id
Ξ
. In particular, this implies
ρ
is surjective (and of course
ρ
itself is also surjective).
In the case of the universal cover
˜
X
, we see that the underlying graph
˜
Ξ
=
T
is connected and simply connected, because
π
1
˜
Ξ
=
ρ
(
π
1
˜
X
) = 1. So it is a tree!
The action of
G
on
˜
X
descends to an action of
G
on
˜
Ξ
. So whenever we have
a graph of spaces, or a graph of groups
G
, we have an action of the fundamental
group on a tree. This tree is called the Bass–Serre tree of G.
Just like the case of the free group, careful analysis of the Bass–Serre tree
leads to theorems relating π
1
(G) to the vertex groups G
v
and edge groups G
e
.
Example. Let
G = F
2
= hai hbi = Z Z.
In this case, we take X to be
X
e
X
u
X
v
Here we view this as a graph where the two vertex spaces are circles, and there
is a single edge connecting them. The covering space
˜
X then looks like
This is not the Bass–Serre tree. The Bass–Serre tree is an infinite-valent bipartite
tree, which looks roughly like
This point of view gives two important results that relate elements of
G
to
the vertex groups G
v
i
and the edge maps
±
e
j
: G
e
j
G
v
±
j
.
Lemma
(Britton’s lemma)
.
For any vertex Ξ, the natural map
G
v
G
is
injective.
Unsurprisingly, this really requires that the edge maps are injective. It is an
exercise to find examples to show that this fails if the boundary maps are not
injective.
Proof sketch.
Observe that the universal cover
˜
X
can be produce by first building
universal covers of the vertex space, which are then glued together in a way that
doesn’t kill the fundamental groups.
More importantly, Bass-Serre trees give us normal form theorems! Pick a
base vertex v V (Ξ). We can then represent elements of G in the form
γ = g
0
e
±1
1
g
1
e
±1
2
· · · e
±1
n
g
n
where each
e
i
is an edge such that
e
±1
1
e
±1
2
· · · e
±1
n
forms a closed loop in the
underlying graph of Ξ, and each
g
i
is an element of the graph at the vertex
connecting e
±1
i
and e
±1
i+1
.
We say a pinch is a sub-word of the form
e
±1
±1
e
(g)e
1
,
which can be replaced by
1
e
(g).
We say a loop is reduced if it contains no pinches.
Theorem
(Normal form theorem)
.
Every element can be represented by a
reduced loop, and the only reduced loop representing the identity is the trivial
loop.
This is good enough, since if we can recognize is something is the identity,
then we see if the product of one with the inverse of the other is the identity.
An exact normal form for all words would be a bit too ambitious.
Proof idea.
It all boils down to the fact that the Bass–Serre tree is a tree.
Connectedness gives the existence, and the simply-connectedness gives the
“uniqueness”.
4 Hyperbolic groups
So far, we obtained a lot of information about groups by seeing how they act
on trees. Philosophically, the reason for this is that trees are very “negatively
curved” spaces. We shall see in this chapter than in general, whenever a group
acts on a negatively curved space, we can learn a lot about the group.
4.1 Definitions and examples
We now want to define a negatively-curved space in great generality. Let
X
be a geodesic metric space. Given
x, y X
, we will write [
x, y
] for a choice of
geodesic between x and y.
Definition
(Geodesic triangle)
.
A geodesic triangle ∆ is a choice of three points
x, y, z and geodesics [x, y], [y, z], [z, x].
Geodesic triangles look like this:
Note that in general, the geodesics may intersect.
Definition
(
δ
-slim triangle)
.
We say is
δ
-slim if every side of is contained
in the union of the δ-neighbourhoods of the other two sides.
Definition
(Hyperbolic space)
.
A metric space is (Gromov) hyperbolic if there
exists
δ
0 such that every geodesic triangle in
X
is
δ
-slim. In this case, we say
it is δ-hyperbolic.
Example. R
2
is not Gromov hyperbolic.
Example.
If
X
is a tree, then
X
is 0-hyperbolic! Indeed, each triangle looks
like
We call this a tripod.
Unfortunately, none of these examples really justify why we call these things
hyperbolic. Let’s look at the actual motivating example.
Example.
Let
X
=
H
2
, the hyperbolic plane. Let
H
2
be a triangle. Then
is
δ
-slim, where
δ
is the maximum radius of an inscribed semi-circle
D
in
with the center on one of the edges.
But we know that the radius of
D
is bounded by some increasing function
of the area of
D
, and the area of
D
is bounded above by the area of ∆. On
the other hand, by hyperbolic geometry, we know the area of any triangle is
bounded by π. So H
2
is δ-hyperbolic for some δ.
If we worked a bit harder, then we can figure out the best value of
δ
. However,
for the arguments we are going to do, we don’t really care about what δ is.
Example.
Let
X
be any bounded metric space, e.g.
S
2
. Then
X
is Gromov
hyperbolic, since we can just take δ to be the diameter of the metric space.
This is rather silly, but it makes sense if we take the “coarse point of view”,
and we have to ignore bounded things.
What we would like to do is to say is that a group Γ if for every finite
generating set
S
, the Cayley graph
Cay
S
(Γ) equipped with the word metric is
δ-hyperbolic for some δ.
However, this is not very helpful, since we have to check it for all finite
generating sets. So we want to say that being hyperbolic is quasi-isometry
invariant, in some sense.
This is slightly difficult, because we lose control of how the geodesic behaves
if we only look at things up to isometry. To do so, we have to talk about
quasi-geodesics.
4.2 Quasi-geodesics and hyperbolicity
Definition
(Quasi-geodesic)
.
A (
λ, ε
)-quasi-geodesic for
λ
1
, ε
0 is a
(λ, ε)-quasi-isometric embedding I X, where I R is a closed interval.
Note that our definition allows for
I
to be unbounded, i.e.
I
may be of the
forms [
a, b
], [0
,
) or
R
respectively. We call these quasi-geodesic intervals,
quasi-geodesic rays and quasi-geodesic lines respectively.
Example. The map [0, ) R
2
given in polar coordinates by
t 7→ (t, log(1 + t))
is a quasigeodesic ray.
This should be alarming. The quasi-geodesics in the Euclidean plane can
look unlike any genuine geodesic. However, it turns out things are well-behaved
in hyperbolic spaces.
Theorem
(Morse lemma)
.
For all
δ
0
, λ
1 there is
R
(
δ, λ, ε
) such that the
following holds:
If
X
is a
δ
-hyperbolic metric space, and
c
: [
a, b
]
X
is a (
λ, ε
)-quasigeodesic
from p to q, and [p, q] is a choice of geodesic from p to q, then
d
Haus
([p, q], im(c)) R(δ, λ, ε),
where
d
Haus
(A, B) = inf{ε > 0 | A N
ε
(B) and B N
ε
(A)}
is the Hausdorff distance.
This has nothing to do with the Morse lemma in differential topology.
Corollary.
There is an
M
(
δ, λ, ε
) such that a geodesic metric space
X
is
δ
-
hyperbolic iff any (λ, ε)-quasigeodesic triangle is M-slim.
Corollary.
Suppose
X, X
0
are geodesic metric spaces, and
f
:
X X
00
is a
quasi-isometric embedding. If X
0
is hyperbolic, then so is X.
In particular, hyperbolicity is a quasi-isometrically invariant property, when
restricted to geodesic metric spaces.
Thus, we can make the following definition:
Definition
(Hyperbolic group)
.
A group Γ is hyperbolic if it acts properly
discontinuously and cocompactly by isometries on a proper, geodesic hyperbolic
metric space. Equivalently, if it is finitely-generated and with a hyperbolic
Cayley graph.
Let’s prove the Morse lemma. To do so, we need the following definition:
Definition
(Length of path)
.
Let
c
: [
a, b
]
X
be a continuous path. Let
a = t
0
< t
1
< · · · < t
n
= b be a dissection D of [a, b]. Define
(c) = sup
D
n
X
i=1
d(c(t
i1
), c(t
i
)).
In general, this number is extremely hard to compute, and may even be
infinite.
Definition (Rectifiable path). We say a path c is rectifiable if (c) < .
Example. Piecewise geodesic paths are rectifiable.
Lemma.
Let
X
be a geodesic space. For any (
λ, ε
)-quasigeodesic
c
: [
a, b
]
X
,
there exists a continuous, rectifiable (
λ, ε
0
)-quasigeodesic
c
0
: [
a, b
]
X
with
ε
0
= 2(λ + ε) such that
(i) c
0
(a) = c(a), c
0
(b) = c(b).
(ii) For all a t < t
0
b, we have
(c
0
|
[t,t
0
]
) k
1
d(c
0
(t), c
0
(t
0
)) + k
2
where k
1
= λ(λ + ε) and k
2
= (λε
0
+ 3)(λ + 3).
(iii) d
Haus
(im c, im c
0
) λ + ε.
Proof sketch.
Let Σ =
{a, b}
((
a, b
)
Z
). For
t
Σ, we let
c
0
(
t
) =
c
(
t
), and
define
c
0
to be geodesic between the points of Σ. Then claims (i) and (iii) are clear,
and to prove quasigeodesicity and (ii), let
σ
: [
a, b
]
Σ be a choice of closest
point in Σ, and then estimate d(c
0
(t), c
0
(t
0
)) in terms of d(c(σ(t)), c(σ(t
0
))).
Lemma.
let
X
be
δ
-hyperbolic, and
c
: [
a, b
]
X
a continuous, rectifiable path
in X joining p to q. For [p, q] a geodesic, for any x [p, q], we have
d(x, im c) δ| log
2
(c)| + 1.
Proof.
We may assume
c
: [0
,
1]
X
is parametrized proportional to arc length.
Suppose
(c)
2
N
< 1
(c)
2
N1
.
Let
x
0
=
x
. Pick a geodesic triangle between
p, q, c
(
1
2
). By
δ
-hyperbolicity, there
exists a point
x
1
lying on the other two edges such that
d
(
x
0
, x
1
)
δ
. We wlog
assume x
1
[p, c(
1
2
)]. We can repeat the argument with c|
[0,
1
2
]
.
p q
c(
1
2
)
x
0
x
1
Formally, we proceed by induction on
N
. If
N
= 0 so that
(
c
)
<
1, then we
are done by taking desired point on
im c
to be
p
(or
q
). Otherwise, there is some
x
1
[p, c(
1
2
)] such that d(x
0
, x
1
) δ. Then
(c|
[0,
1
2
]
)
2
N1
< 1
(c|
[0,
1
2
])
2
N2
.
So by the induction hypothesis,
d(x
1
, im c) δ| log
2
(c|
[0,
1
2
]
| + 1
= δ
1
2
log
2
(c)
+ 1
= δ(|log
2
(c)| 1) + 1.
Note that we used the fact that (c) > 1, so that log
2
(c) > 0.
Then we are done since
d(x, im c) d(x, x
1
) + d(x
1
, im c).
Proof of Morse lemma.
By the first lemma, we may assume that
c
is continuous
and rectifiable, and satisfies the properties as in the lemma.
Let
p, q
be the end points of
c
, and [
p, q
] a geodesic. First we show that [
p, q
]
is contained in a bounded neighbourhood of im c. Let
D = sup
x[p,q]
d(x, im c).
By compactness of the interval, let
x
0
[
p, q
] where the supremum is attained.
Then by assumption,
im c
lies outside
˚
B
(
x
0
, D
). Choose
y, z
[
p, q
] be such that
d
(
x
0
, y
) =
d
(
x
0
, z
) = 2
D
and
y, x
0
, z
appear on the geodesic in this order (take
y = p, or z = q if that is not possible).
Let
y
0
=
c
(
s
)
im c
be such that
d
(
y
0
, y
)
D
, and similarly let
z
0
=
c
(
t
)
im c be such that d(z, z
0
) D.
p q
c
x
0
D
y z
2D 2D
y
0
z
0
D D
Let γ = [y, y
0
] · c|
[s,t]
· [z
0
, z]. Then
(γ) = d(y, y
0
) + d(z, z
0
) + (c|
[s,t]
) D + D + k
1
d(y
0
, z
0
) + k
2
,
by assumption. Also, we know that d(y
0
, z
0
) 6D. So we have
(γ) 6k
1
D + 2D + k
2
.
But we know that
d(x
0
, im γ) = D.
So the second lemma tells us
D δ| log
2
(6k
1
D + 2D + k
2
)| + 1.
The left hand side is linear in
D
, and the right hand side is a logarithm in
D
.
So it must be the case that
D
is bounded. Hence [
p, q
]
N
D
0
(
im c
), where
D
0
is some constant.
It remains to find a bound
M
such that
im c N
M
([
p, q
]). Let [
a
0
, b
0
] be a
maximal subinterval of [
a, b
] such that
c
[
a
0
, b
0
] lies entirely outside
˚
N
D
0
([
p, q
]).
Since
¯
N
D
(
c
[
a, a
0
]) and
¯
N
D
(
c
[
b
0
, b
]) are both closed, and they collectively cover
the connected set [p, q], there exists
w [p, q]
¯
N
D
0
(c[a, a
0
])
¯
N
D
0
(c[b
0
, b]).
Therefore there exists
t
[
a, a
0
] and
t
0
[
b
0
, b
] such that
d
(
w, c
(
t
))
D
0
and
d(w, c(t
0
)) D
0
. In particular, d(c(t), c(t
0
)) 2D
0
.
By the first lemma, we know
(c|
[t,t
0
]
) 2k
1
D
0
+ k
2
.
So we know that for s [a
0
, b
0
], we have
d(c(s), [p, q]) d(c(s), w)
d(c(s), c(t)) + d(c(t), w)
(c|
[t,t
0
]
) + D
0
D
0
+ 2k
1
D
0
+ k
2
.
With the Morse lemma proven, we can sensibly talk about hyperbolic groups.
We have already seen many examples of hyperbolic spaces, such as trees and the
hyperbolic plane.
Example.
Any finite group is hyperbolic since they have bounded Cayley
graphs.
Example.
Finitely-generated free groups are hyperbolic, since their Cayley
graphs are trees.
Example.
Surface groups Σ
g
for
g
2 acts on the hyperbolic plane properly
discontinuously and cocompactly by isometries (since the universal cover of the
genus
g
surface is the hyperbolic plane by tessellating the hyperbolic plane with
4g-gons, cf. IB Geometry).
The following notion of “virtually” would be helpful
Definition
(virtually-P)
.
Let P be a property of groups. Then we say
G
is
virtually-P if there is a finite index subgroup G
0
G such that G
0
is P.
Example. Finite groups are virtually trivial!
Note that
G
0
G
is finite-index, then
G
0
acts on
Cay
S
(
G
) in a way that
satisfies the properties of the Schwarz–Milnor lemma (the only property we may
worry about is cocompactness, which is what we need finite-index for) and hence
G
0
'
qi
G. In particular,
Example.
A virtually hyperbolic group is hyperbolic. For example, virtually
free groups such that (Z/2) (Z/3) = PSL
2
Z are hyperbolic.
These are some nice classes of examples, but they will be dwarfed by our
next class of examples.
A “random group” is hyperbolic. More precisely, fix
m
2 and
n
1. We
temporarily fix 0. Consider groups of the form
Γ = ha
1
, . . . , a
m
| r
1
, . . . , r
n
i.
such that
r
i
are all cyclicly reduced words of length
(i.e. reduced words such
that the first letter is not the inverse of the last). Put the uniform probability
distribution on the set of all such groups. This defines a group-valued random
variable Γ
`
. For a property P , we say that “a random group is P if
lim
`→∞
P
`
is hyperbolic) = 1
for all n, m.
Theorem (Gromov). A random group is infinite and hyperbolic.
There are, of course, other ways to define a “random group”. As long as we
control the number of relations well so that we don’t end up with finite groups
all the time, the “random group” should still be hyperbolic.
Recall that in differential geometry, geodesics are defined locally. However,
we defined our geodesics to be embedded interval, which is necessarily a global
notion. We want an analogous local version. However, if we want to work up to
quasi-isomorphism, then we cannot go completely local, because locally, you are
allowed to be anything.
Definition
(
k
-local geodesic)
.
Let
X
be a geodesic metric space, and
k >
0. A
path c : [a, b] X is a k-local geodesic if
d(c(t), c(t
0
)) = |t t
0
|
for all t, t
0
[a, b] with |t t
0
| k.
We know very well that on a general Riemannian manifold, local geodesics
need not be actual geodesics. In fact, they can be far from being an actual
geodesic, since, for example, we can wrap around the sphere many times. Again,
things are much better in the hyperbolic world.
Theorem.
Let
X
be
δ
-hyperbolic and
c
: [
a, b
]
X
be a
k
-local geodesic where
k > 8δ. Then c is a (λ, ε)-quasigeodesic for some λ = λ(δ, k) and ε = ε(δ, k).
First, we prove
Lemma.
Let
X
be
δ
-hyperbolic and
k >
8
δ
. If
c
: [
a, b
]
X
is a
k
-local
geodesic, then im c is contained in the 2δ-neighbourhood of [c(a), c(b)].
Observe that by iterating the definition of hyperbolicity, on a
δ
-hyperbolic
space, any point on an
n
-gon is at most (
n
2)
δ
away from a point on another
side.
Proof. Let x = c(t) maximize d(x, [c(a), c(b)]). Let
y = c
t
k
2
, z = c
t
k
2
.
If t
k
2
< a, we set y = c(a) instead, and similarly for z.
Let
y
0
[
c
(
a
)
, c
(
b
)] minimize
d
(
y, y
0
), and likewise let
z
0
[
c
(
a
)
, c
(
b
)] minimize
d(z, z
0
).
Fix geodesics [
y, y
0
] and [
z, z
0
]. Then we have a geodesic rectangle with
vertices
y, y
0
, z, z
0
. By
δ
-hyperbolicity, there exists
w
on the rectangle not on
im c, such that d(x, w) = 2δ.
If
w
[
y
0
, z
0
], then we win. Otherwise, we may wlog assume
w
[
y, y
0
].
Note that in the case
y
=
c
(
a
), we must have
y
=
y
0
, and so this would imply
w
=
y
[
c
(
a
)
, c
(
b
)]. So we are only worried about the case
y
=
c
t
k
2
.
So
d
(
y, x
) =
k
2
>
4
δ
. But then by the triangle inequality, we must have
d(y, w) > 2δ = d(x, w).
However,
d(x, y
0
) d(x, w) + d(w, y
0
) < d(y, w) + d(w, y
0
) = d(y, y
0
).
So it follows that
d(x, [c(a), c(b)]) < d(y, y
0
) = d(y, [c(a), c(b)]).
This contradicts our choice of x.
Proof of theorem.
Let
c
: [
a, b
]
X
be a
k
-local geodesic, and
t t
0
[
a, b
].
Choose
t
0
=
t < t
1
< · · · < t
n
< t
0
such that
t
i
=
t
i1
+
k
for all
i
and
t
0
t
n
< k
.
Then by definition, we have
d(c(t
i1
), c(t
i
)) = k, d(c(t
n
), c(t
0
)) = |t
n
t
0
|.
for all i. So by the triangle inequality, we have
d(c(t), c(t
0
))
n
X
i=1
d(c(t
i1
), c(t
i
)) + d(t
n
, t
0
) = |t t
0
|.
We now have to establish a coarse lower bound on d(c(t), c(t
0
)).
We may wlog assume t = a and t
0
= b. We need to show that
d(c(a), c(b))
1
λ
|b a| ε.
We divide
c
up into regular subintervals [
x
i
, x
i+1
], and choose
x
0
i
close to
x
i
. The
goal is then to prove that the x
0
i
appear in order along [c(a), c(b)].
Let
k
0
=
k
2
+ 2δ > 6δ.
Let
b a
=
Mk
0
+
η
for 0
η < k
0
and
M N
. Put
x
i
=
c
(
ik
0
) for
i
= 1
, . . . , M
,
and let
x
0
i
be a closest point on [
c
(
a
)
, c
(
b
)] to
x
i
. By the lemma, we know
d(x
i
, x
0
i
) 2δ.
Claim. x
0
1
, . . . , x
0
m
appear in the correct order along [c(a), c(b)].
Let’s finish the proof assuming the claim. If this holds, then note that
d(x
0
i
, x
0
i+1
) k
0
4δ > 2δ
because we know d(x
i
, x
i+1
) = 6δ, and also d(x
m
, c(b)) η 2δ. Therefore
d(c(a), c(b)) =
M
X
i=1
d(x
i
, x
i1
) + d(x
m
, c(b)) 2δM + η 2δ 2δ(M 1).
On the other hand, we have
M =
|b a| η
k
0
|b a|
k
0
1.
Thus, we find that
d(c(a), c(b))
2δ
k
0
|b a| 4δ.
To prove the claim, let x
i
= c(t
i
) for all i. We let
y = c(t
i1
+ 2δ)
z = c(t
i+1
2δ).
Define
= ∆(x
i1
, x
0
i1
, y)
+
= ∆(x
i+1
, x
0
i+1
, z).
Both
and
+
are disjoint from
B
(
x
i
,
3
δ
). Indeed, if
w
with
d
(
x
i
, w
)
3
δ
, then by
δ
-slimness of
, we know
d
(
w, x
i1
)
3
δ
, and so
d
(
x
i
, x
i1
)
6
δ
,
which is not possible.
Therefore, since the rectangle
y, z, x
0
i+1
, x
0
i1
is 2
δ
-slim, and
x
i
is more than
2
δ
away from the sides
yx
0
i1
and
zx
0
i+1
. So there must be some
x
00
i
[
x
0
i1
, x
0
i+1
]
with d(x
i
, x
00
i
) 2δ.
Now consider ∆ = ∆(
x
i
, x
0
i
, x
00
i
). We know
x
i
x
0
i
and
x
i
x
00
i
are both of length
2
δ
. Note that every point in this triangle is within 3
δ
of
x
i
by
δ
-slimness. So
B
(
x
i
,
3
δ
), and this implies is disjoint from
B
(
x
i1
,
3
δ
) and
B
(
x
i+1
,
3
δ
)
as before.
But
x
0
i1
B
(
x
i1
,
3
δ
) and
x
0
i+1
B
(
x
i+1
,
3
δ
). Moreover, contains the
segment of [
c
(
a
)
, c
(
b
)] joining
x
0
i
and
x
00
i
. Therefore, it must be the case that
x
0
i
[x
0
i1
, x
0
i+1
].
4.3 Dehn functions of hyperbolic groups
We now use our new understanding of quasi-geodesics in hyperbolic spaces to
try to understand the word problem in hyperbolic groups. Note that by the
Schwarz–Milnor lemma, hyperbolic groups are finitely-generated and their Cayley
graphs are hyperbolic.
Corollary.
Let
X
be
δ
-hyperbolic. Then there exists a constant
C
=
C
(
δ
) such
that any non-constant loop in X is not C-locally geodesic.
Proof. Take k = 8δ + 1, and let
C = max{λε, k}
where λ, ε are as in the theorem.
Let
γ
: [
a, b
]
X
be a closed loop. If
γ
were
C
-locally geodesic, then it
would be (λ, ε)-quasigeodesic. So
0 = d(γ(a), γ(b))
|b a|
λ
ε.
So
|b a| λε < C.
But γ is a C-local geodesic. This implies γ is a constant loop.
Definition
(Dehn presentation)
.
A finite presentation
hS | Ri
for a group Γ
is called Dehn if for every null-homotopic reduced word
w S
, there is (a
cyclic conjugate of) a relator
r R
such that
r
=
u
1
v
with
S
(
u
)
<
S
(
v
), and
w = w
1
vw
2
(without cancellation).
The point about this is that if we have a null-homotopic word, then there is
some part in the word that can be replaced with a shorter word using a single
relator.
If a presentation is Dehn, then the naive way of solving the word problem
just works. In fact,
Lemma. If Γ has a Dehn presentation, then δ
Γ
is linear.
Proof. Exercise.
Theorem.
Every hyperbolic group Γ is finitely-presented and admits a Dehn
presentation.
In particular, the Dehn function is linear, and the word problem is solvable.
So while an arbitrary group can be very difficult, the generic group is easy.
Proof.
Let
S
be a finite generating set for Γ, and
δ
a constant of hyperbolicity
for Cay
S
(Γ).
Let C = C(δ) be such that every non-trivial loop is not C-locally geodesic.
Take
{u
i
}
to be the set of all words in
F
(
S
) representing geodesics [1
, u
i
] in
Cay
S
(Γ) with
|u
i
| < C
. Let
{v
j
} F
(
S
) be the set of all non-geodesic words of
length C in Cay
S
(Γ). Let R = {u
1
i
v
j
F (S) : u
i
=
Γ
v
j
}.
We now just observe that this gives the desired Dehn presentation! Indeed,
any non-trivial loop must contain one of the
v
j
’s, since
Cay
S
(Γ) is not
C
-locally
geodesic, and re can replace it with u
i
!
This argument was developed by Dehn to prove results about the fundamental
group of surface groups in 1912. In the 1980’s, Gromov noticed that Dehn’s
argument works for an arbitrary hyperbolic group!
One can keep on proving new things about hyperbolic groups if we wished
to, but there are better uses of our time. So for the remaining of the chapter,
we shall just write down random facts about hyperbolic groups without proof.
So hyperbolic groups have linear Dehn functions. In fact,
Theorem
(Gromov, Bowditch, etc)
.
If Γ is a finitely-presented group and
δ
Γ
ň n
2
, then Γ is hyperbolic.
Thus, there is a “gap” in the isoperimetric spectrum. We can collect our
results as
Theorem. If Γ is finitely-generated, then the following are equivalent:
(i) Γ is hyperbolic.
(ii) Γ has a Dehn presentation.
(iii) Γ satisfies a linear isoperimetric inequality.
(iv) Γ has a subquadratic isoperimetric inequality.
In general, we can ask the question for which
α R
is
n
α
'
a Dehn
function of a finitely-presented group? As we saw,
α
cannot lie in (1
,
2), and
it is a theorem that the set of such
α
is dense in [2
,
). In fact, it includes all
rationals in the interval.
Subgroup structure
When considering subgroups of a hyperbolic group Γ, it is natural to consider
“geometrically nice” subgroups, i.e. finitely-generated subgroups
H
Γ such
that the inclusion is a quasi-isometric embedding. Such subgroups are called
quasi-convex , and they are always hyperbolic.
What sort of such subgroups can we find? There are zillions of free quasi-
convex subgroups!
Lemma
(Ping-pong lemma)
.
Let Γ be hyperbolic and torsion-free (for conve-
nience of statement). If
γ
1
, γ
2
Γ do not commute, then for large enough
n
, the
subgroup hγ
n
1
, γ
n
2
i
=
F
2
and is quasi-convex.
How about non-free subgroups? Can we find surface groups? Of course, we
cannot always guarantee the existence of such surface groups, since all subgroups
of free groups are free.
Question.
Let Γ be hyperbolic and torsion-free, and not itself free. Must Γ
contain a quasi-convex subgroup isomorphic to
π
1
Σ for some closed hyperbolic
surface Σ?
We have no idea if it is true or false.
Another open problem we can ask is the following:
Question.
If Γ is hyperbolic and not the trivial group, must Γ have a proper
subgroup of finite index?
Proposition.
Let Γ be hyperbolic, and
γ
Γ. Then
C
(
γ
) is quasiconvex. In
particular, it is hyperbolic.
Corollary. Γ does not contain a copy of Z
2
.
The boundary
Recall that if Σ is a compact surface of genus
g
2, then
π
1
Σ
'
qi
H
2
. If we try
to draw the hyperbolic plane in the disc model, then we would probably draw a
circle and fill it in. One might think the drawing of the circle is just an artifact
of the choice of the model, but it’s not! It’s genuinely there.
Definition
(Geodesic ray)
.
Let
X
be a
δ
-hyperbolic geodesic metric space. A
geodesic ray is an isometric embedding r : [0, ) X.
We say
r
1
r
2
if there exists
M
such that
d
(
r
1
(
t
)
, r
2
(
t
))
M
for all
t
. In
the disc model of
H
2
, this is the scenario where two geodesic rays get very close
together as
t
. For example, in the upper half plane model of
H
2
, all vertical
lines are equivalent in this sense.
We define
X
=
{geodesic rays}/
. This can be topologized in a sensible
way, and in this case
X
X
is compact. By the Morse lemma, for hyperbolic
spaces, this is quasi-isometry invariant.
Example.
If Γ =
π
1
Σ, with Σ closed hyperbolic surface, then
Γ =
S
1
and
the union X
X gives us the closed unit disc.
Theorem
(Casson–Jungreis, Gabai)
.
If Γ is hyperbolic and
Γ
=
S
1
, then Γ
is virtually π
1
Σ for some closed hyperbolic Σ.
Example. If Γ is free, then
Γ is the Cantor set.
Conjecture
(Cannon)
.
If Γ is hyperbolic and
Γ
=
S
2
, then Γ is virtually
π
1
M for M a closed hyperbolic 3-manifold.
5 CAT(0) spaces and groups
From now on, instead of thinking of geodesics as being isometric embeddings,
we reparametrize them linearly so that the domain is always [0, 1].
5.1 Some basic motivations
Given a discrete group Γ, there are two basic problems you might want to solve.
Question. Can we solve the word problem in Γ?
Question. Can we compute the (co)homology of Γ?
Definition
(Group (co)homology)
.
The (co)homology of a group Γ is the
(co)homology of K, 1).
We can define this in terms of the group itself, but would require knowing
some extra homological algebra. A very closely related question is
Question.
Can we find an explicit
X
such that Γ =
π
1
X
and
˜
X
is contractible?
We know that these problems are not solvable in general:
Theorem
(Novikov–Boone theorem)
.
There exists a finitely-presented group
with an unsolvable word problem.
Theorem
(Gordon)
.
There exists a sequence of finitely generated groups Γ
n
such that H
2
n
) is not computable.
As before, we might expect that we can solve these problems if our groups
come with some nice geometry. In the previous chapter, we talked about
hyperbolic groups, which are negatively curved. In this section, we shall work
with slightly more general spaces, namely those that are non-positively curved.
Let
M
be a compact manifold of non-positive sectional curvature. It is a
classical fact that such a manifold satisfies a quadratic isoperimetric inequality.
This is not too surprising, since the “worst case” we can get is a space with
constant zero curvature, which implies
˜
M
=
R
n
.
If we know this, then by the Filling theorem, we know the Dehn function of
the fundamental group is at worst quadratic, and in particular it is computable.
This solves the first question.
What about the second question?
Theorem
(Cartan–Hadamard theorem)
.
Let
M
be a non-positively curved com-
pact manifold. Then
˜
M
is diffeomorphic to
R
n
. In particular, it is contractible.
Thus, M = K(π
1
M, 1).
For example, this applies to the torus, which is not hyperbolic.
So non-positively curved manifolds are good. However, there aren’t enough
of them. Why? In general, the homology of a group can be very complicated,
and in particular can be infinite dimensional. However, manifolds always have
finite-dimensional homology groups. Moreover, they satisfy Poincar´e duality.
Theorem
(Poincar´e duality)
.
Let
M
be an orientable compact
n
-manifold.
Then
H
k
(M; R)
=
H
nk
(M; R).
This is a very big constraint, and comes very close to characterizing manifolds.
In general, it is difficult to write down a group whose homology satisfies Poincar´e
duality, unless we started off with a manifold whose universal cover is contractible,
and then took its fundamental group.
Thus, we cannot hope to realize lots of groups as the
π
1
of a non-positively
curved manifold. The idea of CAT(0) spaces is to mimic the properties of
non-positively curved manifolds in a much more general setting.
5.2 CAT(κ) spaces
Let
κ
=
1
,
0 or 1, and let
M
κ
be the unique, simply connected, complete
2-dimensional Riemannian manifold of curvature κ. Thus,
M
1
= S
2
, M
0
= R
2
, M
1
= H
2
.
We can also talk about
M
κ
for other
κ
, but we can just obtain those by scaling
M
±1
.
Instead of working with Riemannian manifolds, we shall just think of these
as complete geodesic metric spaces. We shall now try to write down a “CAT(
κ
)”
condition, that says the curvature is bounded by κ in some space.
Definition (Triangle). A triangle with vertices {p, q, r} X is a choice
∆(p, q, r) = [p, q] [q, r] [r, p].
If we want to talk about triangles on a sphere, then we have to be a bit more
careful since the sides cannot be too long. Let
D
κ
=
diam M
κ
, i.e.
D
κ
=
for
κ = 0, 1 and D
κ
= π for κ = +1.
Suppose ∆ = ∆(
x
1
, x
2
, x
3
) is a triangle of perimeter
2
D
κ
in some complete
geodesic metric space (
X, d
). Then there is, up to isometry, a unique comparison
triangle
¯
∆ = ∆(¯x
1
, ¯x
2
, ¯x
3
) M
κ
such that
d
M
k
(¯x
i
, ¯x
j
) = d(x
i
, x
j
).
This is just the fact we know from high school that a triangle is determined
by the lengths of its side. The natural map
¯
is called the comparison
triangle.
Similarly, given a point
p
[
x
i
, x
j
], there is a comparison point
¯p
[
¯x
i
, ¯y
j
].
Note that
p
might be on multiple edges, so it could have multiple comparison
points. However, the comparison point is well-defined as long as we specify the
edge as well.
Definition
(CAT(
κ
) space)
.
We say a space (
X, d
) is CAT(
κ
) if for any geodesic
triangle
X
of diameter
2
D
κ
, any
p, q
and any comparison points
¯p, ¯q
¯
∆,
d(p, q) d
M
κ
(¯p, ¯q).
If X is locally CAT(κ), then K is said to be of curvature at most κ.
In particular, a locally CAT(0) space is called a non-positively curved space.
We are mostly interested in CAT(0) spaces. At some point, we will talk
about CAT(1) spaces.
Example. The following are CAT(0):
(i) Any Hilbert space.
(ii) Any simply-connected manifold of non-positive sectional curvature.
(iii) Symmetric spaces.
(iv) Any tree.
(v) If X, Y are CAT(0), then X × Y with the
2
metric is CAT(0).
(vi) In particular, a product of trees is CAT(0).
Lemma
(Convexity of the metric)
.
Let
X
be a CAT(0) space, and
γ, δ
: [0
,
1]
X be geodesics (reparameterized). Then for all t [0, 1], we have
d(γ(t), δ(t)) (1 t)d(γ(0), δ(0)) + td(γ(1), δ(1)).
Note that we have strict equality if this is in Euclidean space. So it makes
sense that in CAT(0) spaces, we have an inequality.
Proof. Consider the rectangle
α
γ(0) γ(1)
δ(0) δ(1)
γ
δ
Let
α
: [0
,
1]
X
be a geodesic from
γ
(0) to
δ
(1). Applying the CAT(0) estimate
to ∆(γ(0), γ(1), δ(1)), we get
d(γ(t), α(t)) d(γ(t), α(t)) = td(γ(1), α(1)) = td(γ(1), α(1)) = td(γ(1), δ(1)),
using what we know in plane Euclidean geometry. The same argument shows
that
d(δ(t), α(t)) (1 t)d(δ(0), γ(0)).
So we know that
d(γ(t), δ(t)) d(γ(t), α(t)) + d(α(t), δ(t)) (1 t)d(γ(0), δ(0)) + td(γ(1), δ(1)).
Lemma.
If
X
is CAT(0), then
X
is uniquely geodesic, i.e. each pair of points is
joined by a unique geodesic.
Proof.
Suppose
x
0
, x
1
X
and
γ
(0) =
δ
(0) =
x
0
and
γ
(1) =
δ
(1) =
x
1
. Then
by the convexity of the metric, we have
d
(
γ
(
t
)
, δ
(
t
))
0. So
γ
(
t
) =
δ
(
t
) for all
t.
This is not surprising, because this is true in the Euclidean plane and the
hyperbolic plane, but not in the sphere. Of course, one can also prove this result
directly.
Lemma.
Let
X
be a proper, uniquely geodesic metric space. Then geodesics in
X
vary continuously with their end points in the compact-open topology (which
is the same as the uniform convergence topology).
This is actually true without the word “proper”, but the proof is harder.
Proof. This is an easy application of the Arzel´a–Ascoli theorem.
Proposition. Any proper CAT(0) space X is contractible.
Proof.
Pick a point
x
0
X
. Then the map
X Maps
([0
,
1]
, X
) sending
x
to
the unique geodesic from
x
0
to
x
is continuous. The adjoint map
X ×
[0
,
1]
X
is then a homotopy from the constant map at x
0
to the identity map.
Definition
(CAT(0) group)
.
A group is CAT(0) if it acts properly discontinu-
ously and cocompactly by isometries on a proper CAT(0) space.
Usually, for us, the action will also be free. This is the case, for example,
when a fundamental group acts on the covering space.
Note that a space being CAT(0) is a very fine-grained property, so this is
not the same as saying the Cayley graph of the group is CAT(0).
Example. Z
n
for any n is CAT(0), since it acts on R
n
.
Example.
More generally,
π
1
M
for any closed manifold
M
of non-positive
curvature is CAT(0).
Example.
Uniform lattices in semi-simple Lie groups. Examples include
SL
n
O
K
for certain number fields K.
Example. Any free group, or direct product of free groups is CAT(0).
We remark, but will not prove
Proposition.
Any CAT(0) group Γ satisfies a quadratic isoperimetric inequality,
that is δ
Γ
' n or n
2
.
Note that if Γ is in fact CAT(-1), then Γ is hyperbolic, which is not terribly
difficult to see, since
H
2
is hyperbolic. But if a group is hyperbolic, is it necessarily
CAT(-1)? Or even just CAT(0)? This is an open question. The difficulty in
answering this question is that hyperbolicity is a “coarse condition”, but being
CAT(0) is a fine condition. For example, Cayley graphs are not CAT(0) spaces
unless they are trees.
5.3 Length metrics
In differential geometry, if we have a covering space
˜
X X
, and we have a
Riemannian metric on
X
, then we can lift this to a Riemannian metric to
˜
X
.
This is possible since the Riemannian metric is a purely local notion, and hence
we can lift it locally. Can we do the same for metric spaces?
Recall that if γ : [a, b] X is a path, then
(γ) = sup
a=t
0
<t
1
<···<t
n
=b
n
X
i=1
d(γ(t
i1
), γ(t
i
)).
We say γ is rectifiable if (γ) < .
Definition
(Length space)
.
A metric space
X
is called a length space if for all
x, y X, we have
d(x, y) = inf
γ:xy
(γ).
Given any metric space (
X, d
), we can construct a length pseudometric
ˆ
d : X × X [0, ] given by
ˆ
d(x, y) = inf
γ:xy
(γ).
Now given a covering space
p
:
˜
X X
and (
X, d
) a metric space, we can define
a length pseudometric on
˜
X by, for any path ˜γ : [a, b]
˜
X,
(˜γ) = (p ˜γ).
This induces an induced pseudometric on
˜
X.
Exercise. If X is a length space, then so is
˜
X.
Note that if
A, B
are length spaces, and
X
=
A B
(such that the metrics
agree on the intersection), then
X
has a natural induced length metric. Recall we
previously stated the Hopf–Rinow theorem in the context of differential geometry.
In fact, it is actually just a statement about length spaces.
Theorem
(Hopf–Rinow theorem)
.
If a length space
X
is complete and locally
compact, then X is proper and geodesic.
This is another application of the Arzel´a–Ascoli theorem.
5.4 Alexandrov’s lemma
Alexandrov’s lemma is a lemma that enables us to glue CAT(0) space together
to obtain new examples.
Lemma
(Alexandrov’s lemma)
.
Suppose the triangles
1
= ∆(
x, y, z
1
) and
2
= ∆(
x, y, z
2
) in a metric space satisfy the CAT(0) condition, and
y
[
z
1
, z
2
].
z
1
z
2
x
y
Then ∆ = ∆(x, z
1
, z
2
) also satisfies the CAT(0) condition.
This is the basic result we need if we want to prove “gluing theorems” for
CAT(0) spaces.
Proof.
Consider
¯
1
and
¯
2
, which together form a Euclidean quadrilateral
¯
Q
with with vertices
¯x, ¯z
1
, ¯z
2
, ¯y
. We claim that then the interior angle at
¯y
is
180
. Suppose not, and it looked like this:
¯x
¯y
¯z
1
¯z
2
If not, there exists ¯p
i
[¯y, ¯z
i
] such that [¯p
1
, ¯p
2
] [¯x, ¯y] = {¯q} and ¯q 6= ¯y. Now
d(p
1
, p
2
) = d(p
2
, y) + d(y, p
2
)
= d(¯p
1
, ¯y) + d(¯y, ¯p
1
)
= d(¯p, ¯y) + d(¯y, ¯p
2
)
> d(¯p
1
, ¯q) + d(¯q, ¯p
2
)
d(p
1
, q) + d(q, p
2
)
d(p
1
, p
2
),
which is a contradiction.
Thus, we know the right picture looks like this:
¯x
¯y
¯z
1
¯z
2
To obtain
¯
, we have to “push”
¯y
out so that the edge
¯z
1
¯z
2
is straight, while
keeping the lengths fixed. There is a natural map
π
:
¯
¯
Q
, and the lemma
follows by checking that for any a, b
¯
∆, we have
d(π(a), π(b)) d(a, b).
This is an easy case analysis (or is obvious).
A sample application is the following:
Proposition.
If
X
1
, X
2
are both locally compact, complete CAT(0) spaces and
Y
is isometric to closed, subspaces of both
X
1
and
X
2
. Then
X
1
Y
X
2
, equipped
with the induced length metric, is CAT(0).
5.5 Cartan–Hadamard theorem
Theorem
(Cartan–Hadamard theorem)
.
If
X
is a complete, connected length
space of non-positive curvature, then the universal cover
˜
X, equipped with the
induced length metric, is CAT(0).
This was proved by Cartan and Hadamard in the differential geometric
setting.
Corollary.
A (torsion free) group Γ is CAT(0) iff it is the
π
1
of a complete,
connected space X of non-positive curvature.
We’ll indicate some steps in the proof of the theorem.
Lemma.
If
X
is proper, non-positively curved and uniquely geodesic, then
X
is CAT(0).
Proof idea.
The idea is that given a triangle, we cut it up into a lot of small
triangles, and since
X
is locally CAT(0), we can use Alexandrov’s lemma to
conclude that the large triangle is CAT(0).
Recall that geodesics vary continuously with their endpoints. Consider a
triangle ∆ = ∆(
x, y, z
)
¯
B X
, where
¯
B
is a compact ball. By compactness,
there is an ε such that for every x
¯
B, the ball B
x
(¯ε) is CAT(0).
We let
β
t
be the geodesic from
x
to
α
(
t
). Using continuity, we can choose
0 < t
1
< · · · < t
N
= 1 such that
d(β
t
i
(s), β
t
i+1
(s)) < ε
for all s [0, 1].
Now divide up into a “patchwork” of triangles, each contained in an
ε
ball,
so each satisfies the CAT(0) condition, and apply induction and Alexandrov’s
lemma to conclude.
Now to prove the Cartan–Hadamard theorem, we only have to show that the
universal cover is uniquely geodesic. Here we must use the simply-connectedness
condition.
Theorem.
Let
X
be a proper length space of non-positive curvature, and
p, q X
. Then each homotopy class of paths from
p
to
q
contains a unique
(local) geodesic representative.
5.6 Gromov’s link condition
Gromov’s link condition is a criterion that makes it very easy to write down
interesting examples of non-positively-curved metric spaces.
Definition
(Euclidean cell complex)
.
A locally finite cell complex
X
is Euclidean
if every cell is isometric to a convex polyhedron in Euclidean space and the
attaching maps are isometries from the lower-dimensional cell to a face of the
new cell.
Such a complex
X
has a natural length metric which is proper and geodesic
by Hopf–Rinow. What we’d like to do is to come up with a condition that
ensures X is CAT(0).
Example.
The usual diagram for a torus gives an example of a Euclidean
complex.
Example. We can construct a sphere this way, just by taking a cube!
We know that
T
2
is non-positively curved (since it is flat), but the cube is
not, because by Cartan–Hadamard, it would be CAT(0), hence contractible, but
it is not.
Definition
(Link)
.
Let
X
be a Euclidean complex, and let
v
be a vertex of
X
,
and let 0 < ε shortest 1-cell. Then the link of v is
Lk(v) = S
v
(ε) = {x X : d(x, v) = ε}.
Note that
Lk
(
V
) is a cell complex: the intersection of
Lk
(
v
) with a cell of
X
of dimension is a cell of dimension n 1.
Example. In the torus, there is only one vertex. The link looks like
So the link is S
1
.
Example.
If we take the corner of a cube, then
Lk
(
v
) is homeomorphic to
S
1
,
but it is a weird one, because it is made up of three right angles, not two.
How can we distinguish between these two
S
1
’s? Angle puts a metric on
Lk
(
v
). We can do this for general metric spaces, but we only need it for Euclidean
complexes, in which case there is not much to do.
Restricted to each cell, the link is just a part of a sphere, and it has a natural
spherical metric, which is a length metric. These then glue together to a length
metric on Lk(v). Note that this is not the same as the induced metric.
Example.
In the torus, the total length of the link is 2
π
, while that of the cube
is
3π
2
.
The important theorem, which we are not going to prove, is this:
Theorem
(Gromov’s link criterion)
.
A Euclidean complex
X
is non-positively-
curved iff for every vertex v of X, Lk(v) is CAT(1).
Note that by definition, we only have to check the CAT(1) inequality on
triangles of perimeter < 2π.
Exercise. Check these in the case of the torus and the cube.
Thus, given a group, we can try to construct a space whose
π
1
is it, and then
put a Euclidean structure on it, then check Gromov’s link criterion.
In general, Gromov’s link condition might not be too useful, because we still
don’t know how to check if something is CAT(1)! However, it is very simple in
dimension 2.
Corollary.
If
X
is a 2-dimensional Euclidean complex, then for all vertices
v
,
Lk
(
v
) is a metric graph, and
X
is CAT(0) iff
Lk
(
v
) has no loop of length
<
2
π
for all v.
Example (Wise’s example). Consider the group
W = ha, b, s, t | [a, b] = 1, s
1
as = (ab)
2
, t
1
bt = (ab)
2
i.
Letting
c
=
ab
, it is easy to see that this is the
π
1
of the following Euclidean
complex:
b
a
c
s
t
This is metrized in the obvious way, where all edges have length 2 except the
black ones of length 1. To understand the links, we set
α = sin
1
1
4
, β = cos
1
1
4
.
Then the triangles each has angles 2α, β, β.
We show that
W
is non-positively curved, and then use this to show that
there is a homomorphism
W W
that is surjective but not injective. We say
W
is non-Hopfian. In particular, this will show that
W
is not linear, i.e. it is
not a matrix group.
To show that X is non-positively curved, we check the link condition:
a
a
c c
b
b
t
t
s
s
To check the link condition, we have to check that there are no loops of length
<
2
π
in
Lk
(
v
). Note that by trigonometric identities, we know
α
+
β
=
π
2
. So
we can just observe that everything is fine.
To see that W is non-Hopfian, we define
f : W W
a 7→ a
2
b 7→ b
2
s 7→ s
2
t 7→ t
We check that this is a well-defined homomorphism, and is surjective, since we
observe
a = sa
2
b
2
s
1
, b = ta
2
b
2
t
1
,
and so a, b, s, t im t. The non-trivial claim is that ker f 6= 0. Let
g = [scs
1
, tct
1
].
Note that
f(g) = [f(scs
1
), f(tct
1
)] = [sc
2
s
1
, tc
2
t
1
] = [a, b] = 1.
So the crucial claim is that
g 6
= 1. This is where geometry comes to the
rescue. For convenience, write
p
=
scs
1
, q
=
tct
1
. Consider the following local
geodesics in the two squares:
c
s
t
The left- and right-hand local geodesics represent p and q respectively. Then
g = p · q · ¯p · ¯q.
We claim that this represents
g
by a local geodesic. The only place that can
go wrong is the points where they are joined. By the proof of the Gromov link
condition, to check this, we check that the three “turns” at the vertex are all of
angle π.
a
a
c c
b
b
t
t
s
s
q
+
q
p
p
+
Here
p
+
is the point where
p
ends and
p
is the point where
p
starts, and
similarly for
q
. Moreover, the distance from
p
±
, q
±
to the top/bottom left/right
vertices is β. So we can just check and see that everything works.
Recall that every homotopy class of paths in
X
contains a unique locally
geodesic representative. Since the constant path is locally geodesic, we know
g 6= 1.
Definition
(Residually finite group)
.
A group
G
is residually finite if for every
g G \ {
1
}
, there is a homomorphism
ϕ
:
G finite group
such that
ϕ
(
g
)
6
= 0.
Theorem
(Mal’cev)
.
Every finitely generated linear subgroup (i.e. a subgroup
of GL
n
(C)) is resudially finite.
Proof sketch.
If the group is in fact a subgroup of
GL
n
(
Z
), then we just reduce
mod
p
for
p
0. To make it work over
GL
n
(
C
), we need a suitable version of
the Nullstellensatz.
Theorem
(Mal’cev)
.
Every finitely generated residually finite group is Hopfian.
Proof. Finding a proof is a fun exercise!
We know that Wise’s example is not Hopfian, hence not residually finite,
hence not a linear group.
Contrast this with the amazing theorem of Sela that all hyperbolic groups
are Hopfian. However, a major open question is whether all hyperbolic groups
are residually finite. On the other hand, it is known that not all hyperbolic
groups are not linear.
How are we supposed to think about residually finite groups?
Lemma
(Scott’s criterion)
.
Let
X
be a cell complex, and
G
=
π
1
X
. Then
G
is
residually finite if and only if the following holds:
Let
p
:
˜
X X
be the universal cover. For all compact subcomplexes
K
˜
X
,
there is a finite-sheeted cover
X
0
X
such that the natural covering map
p
0
:
˜
X X
0
is injective on K.
A good (though not technically correct) way to think about this is follows: if
we have a map
f
:
K X
that may be complicated, and in particular is not
injective, then we might hope that there is some “finite resolution”
X
0
X
such
that
f
lifts to
X
0
, and the lift is injective. Of course, this is not always possible,
and a necessary condition for this to work is that the lift to the universal cover
is injective. If this is not true, then obviously no such resolution can exist. And
residual finiteness says if there is no obvious reason to fail, then it in fact does
not fail.
5.7 Cube complexes
Definition
(Cube complex)
.
A Euclidean complex is a cube complex if every
cell is isometric to a cube (of any dimension).
This is less general than Euclidean complexes, but we can still make high
dimension things out of this. Of course, general Euclidean complexes also work in
high dimensions, but except in two dimensions, the link condition is rather tricky
to check. It turns out the link condition is easy to check for cube complexes.
Purely topologically, the link of each vertex is made out of simplices, and,
subdividing if necessary, they are simplicial complexes. The metric is such that
every edge has length
π
2
, and this is called the “all-right simplicial complex”.
Recall that
is non-positively curved, while
is not.
Definition
(Flag simplicial complex)
.
A simplicial complex
X
is flag if for all
n 2, each copy of
n
in X is in fact the boundary of a copy of
n
in X.
Note that topologically, flag complexes are not special in any sense. For any
simplicial complex K, the first barycentric subdivision is flag.
Theorem (Gromov). A cube complex is non-positively curved iff every link is
flag.
Now the property of being flag is purely combinatorial, and easy to check.
So this lets us work with cube complexes.
Right-angled Artin groups
Definition
(right-angled Artin group)
.
Let
N
be a simplicial graph, i.e. a graph
where the vertices determine the edges, i.e. a graph as a graph theorist would
consider. Then
A
N
= hV (N ) | [u, v] = 1 for all (u, v) E(N)i
is the right-angled Artin group, or graph group of N.
Example. If N is the discrete graph on n vertices, then A
N
= F
n
.
Example. If N is the complete graph on n vertices, then A
N
= Z
n
.
Example.
If
N
is a square, i.e. the complete bipartite graph
K
2,2
, then
A
N
=
F
2
× F
2
.
Example.
When
N
is the path with 4 vertices, then this is a complicated group
that doesn’t have a good, alternative description. This is quite an interesting
group.
Definition
(Salvetti complex)
.
Given a simplicial group
N
, the Salvetti complex
S
N
is the cube complex defined as follows:
Set S
(2)
N
is the presentation complex for A
N
.
For any immersion of the 2-skeleton of a
d
-dimensional cube, we glue in
an d-dimensional cube to S
(2)
N
.
Alternatively, we have a natural inclusion
S
(2)
N
(
S
1
)
|V (N)|
, and
S
N
is the
largest subcomplex whose 2-skeleton coincides with S
(2)
N
.
Example. If N is
then A
n
= Z Z
2
, and S
N
is a circle glued to a torus.
Example. If N is
,
then S
N
is T
2
S
1
. There is a unique vertex v, and its link looks like
In fact, there is a recipe for getting the link out of N.
Definition
(Double)
.
The double
D
(
K
) of a simplicial complex
K
is defined as
follows:
The vertices are
{v
+
1
, . . . , v
+
n
, v
1
, . . . , v
n
}
, where
{v
1
, . . . , v
n
}
are the ver-
tices of k.
The simplices are those of the form
hv
±
i
0
, . . . , v
±
i
k
i
, where
hv
i
0
, . . . , v
i
k
i K
.
Example. In the example above, the double of N is just Lk(v)!
Definition
(Flag complex)
.
The flag complex of
N
, written
¯
N
, is the only flag
simplicial complex with 1-skeleton N.
Lemma.
For any (simplicial) graph
N
, the link of the unique vertex of
S
N
is
D(
¯
N). In particular, S
N
is non-positively curved.
Thus, right-angled Artin groups and their Salvetti complexes give examples
of non-positively curved spaces with very general links. It turns out
Theorem.
Right-angled Artin groups embed into
GL
n
Z
(where
n
depends on
N).
5.8 Special cube complexes
Let
X
be a non-positively curved cube complex. We will write down explicit
geometric/combinatorial conditions on
X
so that
π
1
X
embeds into
A
N
for some
N.
Hyperplanes and their pathologies
If
C
=
[
1
,
1]
n
, then a midcube
M C
is the intersection of
C
with
{x
i
= 0
}
for some i.
Now if
X
is a non-positively curved cube complex, and
M
1
, M
2
are midcubes
of cubes in
X
, we say
M
1
M
2
if they have a common face, and extend this to
an equivalence relation. The equivalence classes are immersed hyperplanes. We
usually visualize these as the union of all the midcubes in the equivalence class.
Note that in general, these immersed hyperplanes can have self-intersections,
hence the word “immersed”. Thus, an immersed hyperplane can be thought of
as a locally isometric map H # X, where H is a cube complex.
In general, these immersed hyperplanes can have several “pathologies”. The
first is self-intersections, as we have already met. The next problem is that of
orientation, or sidedness. For example, we can have the following cube complex
with an immersed hyperplane:
This is bad, for the reason that if we think of this as a (
1
,
1)-bundle over
H, then it is non-orientable, and in particular, non-trivial.
In general, there could be self intersections. So we let
N
H
be the pullback
interval bundle over
H
. That is,
N
H
is obtained by gluing together
{M ×
(
1
,
1)
|
M is a cube in H}
. Then we say
H
is two-sided if this bundle is trivial, and
one-sided otherwise.
Sometimes, we might not have self-intersections, but something like this:
This is a direct self-osculation. We can also have indirect osculations that look
like this:
Note that it makes sense to distinguish between direct and indirect osculations
only if our hyperplane is two-sided.
Finally, we have inter-osculations, which look roughly like this:
Haglund and Wise defined special cube complexes to be cube complexes that
are free of pathologies.
Definition
(Special cube complex)
.
A cube complex is special if its hyperplanes
do not exhibit any of the following four pathologies:
One-sidedness
Self-intersection
Direct self-osculation
Inter-osculation
Example. A cube is a special cube complex.
Example.
Traditionally, the way to exhibit a surface as a cube complex is to
first tile it by right-angled polygons, so that every vertex has degree 4, and then
the dual exhibits the surface as a cube complex. The advantage of this approach
is that the hyperplanes are exactly the edges in the original tiling!
From this, one checks that we in fact have a special curve complex.
This is one example, but it is quite nice to have infinitely many. It is true
that covers of special things are special. So this already gives us infinitely many
special cube complexes. But we want others.
Example.
If
X
=
S
N
is a Salvetti complex, then it is a special cube complex,
and it is not terribly difficult to check.
The key theorem is the following:
Theorem
(Haglund–Wise)
.
If
X
is a compact special cube complex, then there
exists a graph N and a local isometry of cube complexes
ϕ
X
: X # S
N
.
Corollary. π
1
X A
N
.
Proof of corollary.
If
g π
1
X
, then
g
is uniquely represented by a local geodesic
γ
:
I X
. Then
ϕ γ
is a local geodesic in
S
N
. Since homotopy classes
of loops are represented by unique local geodesics, this implies
ϕ
X
γ
is not
null-homotopic. So the map (ϕ
X
)
is injective.
So if we know some nice group-theoretic facts about right-angled Artin groups,
then we can use them to understand π
1
X. For example,
Corollary.
If
X
is a special cube complex, then
π
1
X
is linear, residually finite,
Hopfian, etc.
We shall try to give an indication of how we can prove the Haglund–Wise
theorem. We first make the following definition.
Definition
(Virtually special group)
.
A group Γ is virtually special if there
exists a finite index subgroup Γ
0
Γ such that Γ
0
=
π
1
X
, where
X
is a compact
special cube complex.
Sketch proof of Haglund–Wise.
We have to first come up with an
N
. We set
the vertices of
N
to be the hyperplanes of
X
, and we join two vertices iff the
hyperplanes cross in
X
. This gives
S
N
. We choose a transverse orientation on
each hyperplane of X.
Now we define ϕ
X
: X # S
N
cell by cell.
Vertices: There is only one vertex in S
N
.
Edges: Let
e
be an edge of
X
. Then
e
crosses a unique hyperplane
H
.
Then
H
is a vertex of
N
. This corresponds to a generator in
A
N
, hence a
corresponding edge
e
(
H
) of
S
N
. Send
e
to
e
(
H
). The choice of transverse
orientation tells us which way round to do it
Squares: given hyperplanes
f
2
e
2
f
1
e
1
H
0
H
Note that we already mapped
e
1
, e
2
to
e
(
H
), and
f
1
, f
2
to
e
(
H
0
). Since
H
and
H
0
cross in
X
, we know
e
(
H
) and
e
(
H
0
) bound a square in
S
N
. Send
this square in X to that square in S
N
.
There is nothing to do for the higher-dimensional cubes, since by definition
of S
N
, they have all the higher-dimensional cubes we can hope for.
We haven’t used a lot of the nice properties of special cube complexes. They
are needed to show that the map is a local isometric embedding. What we do
is to use the hypothesis to show that the induced map on links is an isometric
embedding, which implies ϕ
X
is a local isometry.
This really applies to a really wide selection of objects.
Example. The following groups are virtually special groups:
π
1
M for M almost any 3-manifold.
Random groups
This is pretty amazing. A “random group” is linear!