1Cayley graphs and the word metric
IV Topics in Geometric Group Theory
1.4 The word problem
Long time ago, Poincar´e was interested in characterizing the 3sphere. His
first conjecture was that if a closed 3manifold
M
had
H
∗
(
M
)
∼
=
H
∗
(
S
3
), then
M
∼
=
S
3
. Poincar´e thought very hard about the problem, and came up with a
counterexample. The counterexample is known as the Poincar´e homology sphere.
This is a manifold P
3
with
H
∗
(P )
∼
=
H
∗
(S
3
),
but it turns out there is a surjection
π
1
(
P
3
)
→ A
5
. In particular,
P
3
is not
homeomorphic to a sphere.
So he made a second conjecture, that if
M
is a compact manifold with
π
1
M
∼
=
1, then
M
∼
=
S
3
. Note that the condition already implies
H
∗
(
M
)
∼
=
H
∗
(
S
3
)
by Hurewicz theorem and Poincar´e duality. This is known as the Poincar´e
conjecture, which was proved in 2002 by Perelman.
But there is more to say about Poincar´e’s initial conjecture. Some time later,
Max Dehn constructed an infinite family of 3d homology spheres. In order to
prove that he genuinely had infinitely many distinct homology spheres, he had
to show that his homology spheres had different fundamental groups. He did
manage to write down the presentations of the fundamental groups, and so what
was left to do is to distinguish whether the presentations actually presented the
same group.
To make sense of this statement, we must have some way to distinguish
between the homology spheres. To do so, he managed to write down the
presentation of the fundamental group of his homology spheres, and he had to
figure out if those groups are genuinely the same.
For our purposes, perhaps we should be slightly less ambitious. Suppose we
are given a presentation
hS  Ri
of a finitelypresented group Γ. Define the set
of alphabets
S
±
= S q S
−1
= {s, s
−1
: s ∈ S},
and let
S
∗
be the set of all finite words in
S
±
. Then the fundamental question is,
given a word
w ∈ S
∗
, when does it represent the identity element 1
∈
Γ? Ideally,
we would want an algorithm that determines whether this is the case.
Example.
Recall that in the free group
F
(
S
) =
hS  ∅i
, we had a normal form
theorem, namely that every element in Γ can be written uniquely as a product
of generator (and inverses) such that there aren’t any occurrences of
ss
−1
or
s
−1
s
. This gives a way of determining whether a given word
w ∈ S
∗
represents
the identity. We perform elementary reductions, removing every occurrence of
ss
−1
or
s
−1
s
. Since each such procedure reduces the length of the word, this
eventually stops, and we would end up with a reduced word . If this reduced
word is empty, then w represents the identity. If it is nonempty, w does not.
Thus, we have a complete solution to the word problem for free groups, and
moreover the algorithm is something we can practically implement.
For a general finitelypresented group, this is more difficult. We can first
reformulate our problem. The presentation Γ =
hS  Ri
gives us a map
F
(
S
)
→
Γ,
sending
s
to
s
. Each word gives us an element in
F
(
S
), and thus we can
reformulate our problem as identifying the elements in the kernel of this map.
Lemma.
Let Γ =
hS  Ri
. Then the elements of
ker
(
F
(
S
)
→
Γ) are precisely
those of the form
d
Y
i=1
g
i
r
±1
i
g
−1
i
,
where g
i
∈ F (S) and r
i
∈ R.
Proof.
We know that
ker
(
F
(
S
)
→
Γ) =
hhRii
, and the set described above is
exactly hhRii, noting that gxyg
−1
= (gxg
−1
)(gyg
−1
).
This tells us the set of elements in
S
∗
that represent the identity is recursively
enumerable, i.e. there is an algorithm that lists out all words that represent the
identity. However, this is not very helpful when we want to identify whether a
word represents the identity. If it does, then our algorithm will eventually tell
us it is (maybe after 3 trillion years), but if it doesn’t, the program will just run
forever, and we will never know that it doesn’t represent the identity.
Example. Let Γ = Z
2
∼
=
ha, b  [a, b]i. Consider the word
w = a
n
b
n
a
−n
b
−n
We see that this represents our identity element. But how long would it take for
our algorithm to figure this fact out? Equivalently, what
d
do we need to pick
so that w is of the form
d
Y
i=1
g
i
r
±1
i
g
−1
i
?
We can draw a picture. Our element [a, b] looks like
bb
a
a
If, say, n = 2, then w is given by
a
a
bb
a
a
bb
If we think about it hard, then we see that what we need to do is to find out how
to fill in this big square with the small squares, and we need 4 squares. Indeed,
we can read off the sequence
w = [a, b] · (b[a, b]b
−1
) · (b
2
ab
−1
[a, b]ba
−1
b
−2
) · (b
2
a
2
b
−1
a
−1
b
−1
[a, b]baba
−2
b
−2
),
which corresponds to filling in the square one by one as follows:
Of course, this is fine if we know very well what the Cayley graph of the
group looks like, but in general it is quite hard. Indeed, solving the word problem
is part of what you do if you want to draw the Cayley graph, since you need to
know when two words give the same element!
So how do we solve the word problem? Our previous partial algorithm would
make a good, full solution if we knew how far we had to search. If we know that
we will only need at most
d
= 10
10
100
, then if we searched for all expressions of
the form
Q
d
i=1
g
i
r
±
i
g
−1
i
for
d <
10
10
100
, and didn’t find
w
, then we know
w
does
not represent the identity element (we will later argue that we don’t have to
worry about there being infinitely many g
i
’s to look through).
Definition (Nullhomotopic). We say w ∈ S
∗
is nullhomotopic if w = 1 in Γ.
Definition
(Algebraic area)
.
Let
w ∈ S
∗
be mullhomotopic. Its algebraic area
is
Area
a,P
(w) = min
(
d : w =
d
Y
i=1
g
i
r
±1
i
g
−1
i
)
.
We write the subscript
a
to emphasize this is the algebraic area. We will
later define the geometric area, and we will write it as
Area
g
. Afterwards, we
will show that they are the same, and we will stop writing the subscript.
Let us use
 · 
S
to denote word length in
F
(
S
), while
`
S
continues to denote
the word length in Γ.
Definition
(Dehn function)
.
Then Dehn function is the function
δ
P
:
N → N
mapping
n 7→ max {Area
a,P
(w)  w
S
≤ n, w is nullhomotopic} .
This Dehn function measures the difficulty of the word problem in P.
Proposition.
The word problem for
P
is solvable iff
δ
P
is a computable function.
We will postpone the proof. The hard part of the proof is that we don’t have
to worry about the infinitely many possible g
i
that may be used to conjugate.
It would be really nice if the Dehn function can be viewed as a property of
the group, and not the presentation. This requires us to come up with a notion
of equivalence relation on functions N → [0, ∞) (or [0, ∞) → [0, ∞)).
Definition (4). We write f 4 g iff for some C > 0, we have
f(x) ≤ Cg(Cx + C) + Cx + C.
for all x.
We write f ≈ g if f 4 g and g 4 f.
This captures the (superlinear) asymptotic behaviour of f .
Example. For α, β ≥ 1, n
α
4 n
β
iff α ≤ β.
Proposition. If P and Q are two finite presentations for Γ, then δ
P
≈ δ
Q
.
We start with two special cases, and then show that we can reduce everything
else to these two special cases.
Lemma. If R
0
⊆ hhRii is a finite set, and
P = hS  Ri, Q = hS  R ∪ R
0
i,
then δ
P
' δ
Q
.
Proof. Clearly, δ
Q
≤ δ
P
. Let
m = max
r
0
∈R
0
Area
P
(r
0
).
It is then easy to see that
Area
P
(w) ≤ m Area
Q
(w).
Lemma. Let P = hS  Ri, and let
Q = hS q T  R q R
0
i,
where
R
0
= {tw
−1
t
: t ∈ T, w
t
∈ F (S)}.
Then δ
P
≈ δ
Q
.
Proof. We first show δ
P
4 δ
Q
. Define
ρ : F (S q T ) → F (s)
s 7→ s
t 7→ w
t
.
In particular, ρ(r) = r for all r ∈ R and ρ(r
0
) = 1 for all r
0
∈ R
0
.
Given w ∈ F (S), we need to show that
Area
P
(w) ≤ Area
Q
(w).
Let d = Area
Q
(w). Then
w =
d
Y
i=1
g
i
r
±1
i
g
−1
i
,
where
g
i
∈ F
(
S
`
T
)
, r
i
∈ R ∪ R
0
. We now apply
ρ
. Since
ρ
is a retraction,
ρ(w) = w. Thus,
w =
d
Y
i=1
ρ(g
i
)ρ(r
i
)
±1
ρ(g
i
)
−1
.
Now
ρ
(
r
i
) is either
r
i
or 1. In the first case, nothing happens, and in the second
case, we can just forget the ith term. So we get
w =
d
Y
i=1,r
i
∈R
ρ(g
i
)r
±1
i
ρ(g
i
)
−1
.
Since this is a valid proof in P that w = 1, we know that
Area
P
(w) ≤ d = Area
Q
(w).
We next prove that
δ
Q
4 δ
P
. It is unsurprising that some constants will appear
this time, instead of just inequality on the nose. We let
C = max
t∈T
w
t

S
.
Consider a nullhomotopic word w ∈ F (S q T ). This word looks like
w = s
±
i
1
s
±
i
2
· · · t
j
1
· · · s · · · t · · · ∈ F (S q T ).
We want to turn these t’s into s’s, and we need to use the relators to do so.
We apply relations from R
0
to write this as
w
0
= s
±
i
1
s
±
i
2
· · · w
t
j
1
· · · s · · · w
t
· · · ∈ F (S).
We certainly have w
0

S
≤ Cw
SqT
. With a bit of thought, we see that
Area
Q
(w) ≤ Area
P
(w
0
) + w
SqT
≤ δ
P
(Cw
SqT
) + w
SqT
.
So it follows that
δ
Q
(n) ≤ δ
P
(Cn) + n.
Proof of proposition.
Let
P
=
hS  Ri
and
Q
=
hS
0
 R
0
i
. If
P, Q
present
isomorphic groups, then we can write
s = u
s
∈ F (S
0
) for all s ∈ S
Similarly, we have
s
0
= v
s
0
∈ F (S) for all s
0
∈ S
0
We let
T = {su
−1
s
 s ∈ S}
T
0
= {s
0
v
−1
s
0
 s
0
∈ S
0
}
We then let
M = hS q S
0
 R ∪ R
0
∪ T ∪ T
0
i.
We can then apply our lemmas several times to show that δ
P
≈ δ
M
≈ δ
Q
.
Now we can talk about
δ
Γ
, the Dehn function of Γ, as long as we know we
are only talking up to ≈. In fact, it is true that
Fact. If Γ
1
'
qi
Γ
2
, then δ
Γ
1
≈ δ
Γ
2
.
The proof is rather delicate, and we shall not attempt to reproduce it.