II Number Fields
6 Minkowski bound and finiteness of class group
Dedekind's criterion allowed us to find all prime factors of $\langle p \rangle$, but if we want to figure out whether, say, the class group of a number field is trivial, or even finite, we still have no idea how to do so, because we cannot go and check every single prime $p$ and see what happens.
What we are now going to do is the following: we are going to use purely geometric arguments to reason about ideals, and show that each element of the class group $\mathrm{cl}_L = I_L / P_L$ has a representative whose norm is bounded by some number $c_L$, which we will find rather explicitly. After finding $c_L$, to understand the class group, we just need to factor all prime numbers less than $c_L$ and see what they look like.
We are first going to do the case of quadratic extensions explicitly, since 2-dimensional pictures are easier to draw. We will then do the full general case afterwards.
Quadratic extensions
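Consider again the case $L = \mathbb{Q}(\sqrt{d})$, where $d < 0$. Then $\mathcal{O}_L = \mathbb{Z}[\alpha]$, where
\[
  \alpha = \begin{cases} \sqrt{d} & d \equiv 2, 3 \pmod 4 \\ \frac{1}{2}(1 + \sqrt{d}) & d \equiv 1 \pmod 4 \end{cases}
\]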
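We can embed this as a subfield $L \subseteq \mathbb{C}$. We can then plot the points of $\mathcal{O}_L$ on the complex plane. For example, if $d \equiv 2, 3 \pmod 4$, then the points look like this:
[Figure: the rectangular lattice $\mathcal{O}_L$ in the complex plane, with the points $0$, $1$, $\sqrt{d}$, $1 + \sqrt{d}$ and $2\sqrt{d}$ labelled.]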
Then an ideal of $\mathcal{O}_L$, say $\mathfrak{a} = \langle 2, \sqrt{d} \rangle$, would then be the sublattice given by the blue crosses.
[Figure: the same lattice, with the points of $\mathfrak{a}$ marked by blue crosses.]
We always get this picture, since any nonzero ideal of $\mathcal{O}_L$ is isomorphic to $\mathbb{Z}^2$ as an abelian group.
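If we are in the case where $d \equiv 1 \pmod 4$, then the lattice is hexagonal:
[Figure: the lattice $\mathcal{O}_L$ for $d \equiv 1 \pmod 4$, with the points $0$, $1$, $\sqrt{d}$ and $\frac{1}{2}(1 + \sqrt{d})$ labelled.]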
The key result is the following purely geometric lemma:
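Lemma (Minkowski's lemma). Let $\Lambda = \mathbb{Z}v_1 + \mathbb{Z}v_2 \subseteq \mathbb{R}^2$ be a lattice, with $v_1, v_2$ linearly independent over $\mathbb{R}$ (i.e. $\mathbb{R}v_1 + \mathbb{R}v_2 = \mathbb{R}^2$). We write $v_i = a_i e_1 + b_i e_2$. Then let
\[
  A(\Lambda) = \text{area of fundamental parallelogram} = \left| \det \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix} \right|,
\]
where the fundamental parallelogram is the following:
[Figure: the parallelogram with vertices $0$, $v_1$, $v_2$ and $v_1 + v_2$.]
Then a closed disc $S$ around $0$ contains a nonzero point of $\Lambda$ if
\[
  \operatorname{area}(S) \geq 4 A(\Lambda).
\]
In particular, there exists an $\alpha \in \Lambda$ with $\alpha \neq 0$, such that
\[
  0 < |\alpha|^2 \leq \frac{4 A(\Lambda)}{\pi}.
\]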
This is just an easy piece of geometry. What is remarkable is that the radius
of the disc needed depends only on the area of the fundamental parallelogram,
and not its shape.
Proof. We will prove a general result in any dimension later.
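Since the proof is deferred, it may help to see the lemma checked numerically on a few lattices. The following is a minimal sketch, not part of the notes: the function names and the search radius `search` are our own choices, and the brute-force search only looks at coefficients in a bounded box, so it returns an upper bound for the true shortest squared length (which is all we need to witness the inequality).

```python
import math
from itertools import product

def fundamental_area(v1, v2):
    """Area |det| of the parallelogram spanned by v1 and v2."""
    return abs(v1[0] * v2[1] - v1[1] * v2[0])

def shortest_nonzero_sq_length(v1, v2, search=50):
    """Smallest |m*v1 + n*v2|^2 over (m, n) != (0, 0) with |m|, |n| <= search."""
    best = math.inf
    for m, n in product(range(-search, search + 1), repeat=2):
        if (m, n) == (0, 0):
            continue
        x = m * v1[0] + n * v2[0]
        y = m * v1[1] + n * v2[1]
        best = min(best, x * x + y * y)
    return best

def satisfies_minkowski(v1, v2):
    """Check that some nonzero lattice point has |alpha|^2 <= 4 A(Lambda) / pi."""
    bound = 4 * fundamental_area(v1, v2) / math.pi
    return shortest_nonzero_sq_length(v1, v2) <= bound

# Three lattices of very different shapes: square, hexagonal, long and thin.
examples = [
    ((1.0, 0.0), (0.0, 1.0)),
    ((1.0, 0.0), (0.5, math.sqrt(3) / 2)),
    ((10.0, 0.0), (3.0, 0.1)),
]
for v1, v2 in examples:
    print(v1, v2, satisfies_minkowski(v1, v2))
```

The third example illustrates the remark above: the basis vectors are long, but the area is only 1, so a short vector must still exist.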
We now apply this to ideals $\mathfrak{a} \leq \mathcal{O}_L$, regarded as a subset of $\mathbb{C} = \mathbb{R}^2$ via some embedding $L \hookrightarrow \mathbb{C}$. The following proposition gives us the areas of the relevant lattices:
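Proposition.
(i) If $\alpha = a + b\sqrt{d}$, then as a complex number,
\[
  |\alpha|^2 = (a + b\sqrt{d})(a - b\sqrt{d}) = N(\alpha).
\]
(ii) For $\mathcal{O}_L$, we have
\[
  A(\mathcal{O}_L) = \frac{1}{2}\sqrt{|D_L|}.
\]
(iii) In general, we have
\[
  A(\mathfrak{a}) = \frac{1}{2}\sqrt{|\Delta(\alpha_1, \alpha_2)|},
\]
where $\alpha_1, \alpha_2$ is an integral basis of $\mathfrak{a}$.
(iv) We have
\[
  A(\mathfrak{a}) = N(\mathfrak{a}) A(\mathcal{O}_L).
\]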
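Proof.
(i) This is clear.
(ii) We know $\mathcal{O}_L$ has basis $1, \alpha$, where again
\[
  \alpha = \begin{cases} \sqrt{d} & d \equiv 2, 3 \pmod 4 \\ \frac{1}{2}(1 + \sqrt{d}) & d \equiv 1 \pmod 4 \end{cases}.
\]
So we can just look at the picture of the lattice, and compute to get
\[
  A(\mathcal{O}_L) = \begin{cases} \sqrt{|d|} & d \equiv 2, 3 \pmod 4 \\ \frac{1}{2}\sqrt{|d|} & d \equiv 1 \pmod 4 \end{cases} = \frac{1}{2}\sqrt{|D_L|}.
\]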
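(iii) If $\alpha_1, \alpha_2$ is an integral basis of $\mathfrak{a}$, then the lattice of $\mathfrak{a}$ is in fact spanned by the vectors $\alpha_1 = a + bi$, $\alpha_2 = a' + b'i$. This has area
\[
  A(\mathfrak{a}) = \left| \det \begin{pmatrix} a & b \\ a' & b' \end{pmatrix} \right|,
\]
whereas we have
\[
  \Delta(\alpha_1, \alpha_2) = \det \begin{pmatrix} \alpha_1 & \bar\alpha_1 \\ \alpha_2 & \bar\alpha_2 \end{pmatrix}^2 = (\alpha_1 \bar\alpha_2 - \alpha_2 \bar\alpha_1)^2 = (2i \operatorname{Im}(\alpha_1 \bar\alpha_2))^2 = -4(a'b - ab')^2 = -4A(\mathfrak{a})^2,
\]
so that $|\Delta(\alpha_1, \alpha_2)| = 4A(\mathfrak{a})^2$.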
(iv) This follows from (ii) and (iii), as $\Delta(\alpha_1, \cdots, \alpha_n) = N(\mathfrak{a})^2 D_L$ in general.
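The area formulas can be checked numerically for small $d$. This is only an illustrative sketch: the helper names are ours, $d$ is assumed to be a negative squarefree integer, and the last check uses the standard fact (assumed here, not proved) that $2$ and $1 + \sqrt{-5}$ form a $\mathbb{Z}$-basis of the ideal $\langle 2, 1 + \sqrt{-5} \rangle$, which has norm 2.

```python
import math

def discriminant(d):
    """D_L for L = Q(sqrt(d)), d a squarefree integer."""
    return d if d % 4 == 1 else 4 * d

def ring_of_integers_basis(d):
    """Basis (1, alpha) of O_L for d < 0, as complex numbers."""
    root = complex(0.0, math.sqrt(-d))            # sqrt(d) for d < 0
    alpha = (1 + root) / 2 if d % 4 == 1 else root
    return complex(1.0, 0.0), alpha

def lattice_area(z1, z2):
    """Area of the parallelogram spanned by z1, z2 in C = R^2."""
    return abs(z1.real * z2.imag - z1.imag * z2.real)

# Part (ii): A(O_L) = sqrt(|D_L|) / 2.
for d in (-1, -2, -3, -5, -7, -17):
    area = lattice_area(*ring_of_integers_basis(d))
    print(d, area, math.sqrt(abs(discriminant(d))) / 2)

# Part (iv) for the ideal <2, 1 + sqrt(-5)> of norm 2 (Z-basis assumed as stated above).
ideal_area = lattice_area(complex(2.0, 0.0), complex(1.0, math.sqrt(5)))
print(ideal_area, 2 * lattice_area(*ring_of_integers_basis(-5)))
```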
Now what does Minkowski's lemma tell us? We know there is a nonzero $\alpha \in \mathfrak{a}$ such that
\[
  N(\alpha) \leq \frac{4A(\mathfrak{a})}{\pi} = N(\mathfrak{a}) c_L,
\]
where
\[
  c_L = \frac{2\sqrt{|D_L|}}{\pi}.
\]
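But $\alpha \in \mathfrak{a}$ implies $\langle \alpha \rangle \subseteq \mathfrak{a}$, which implies $\langle \alpha \rangle = \mathfrak{a}\mathfrak{b}$ for some ideal $\mathfrak{b}$. So
\[
  N(\alpha) = N(\langle \alpha \rangle) = N(\mathfrak{a}) N(\mathfrak{b}).
\]
So this implies
\[
  N(\mathfrak{b}) \leq c_L = \frac{2\sqrt{|D_L|}}{\pi}.
\]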
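Recall that the class group is $\mathrm{cl}_L = I_L / P_L$, the fractional ideals quotiented by principal ideals, and we write $[\mathfrak{a}]$ for the class of $\mathfrak{a}$ in $\mathrm{cl}_L$. Then if $\mathfrak{a}\mathfrak{b} = \langle \alpha \rangle$, then we have
\[
  [\mathfrak{b}] = [\mathfrak{a}^{-1}]
\]
in $\mathrm{cl}_L$. So we have just shown: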
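Proposition (Minkowski bound). For all $[\mathfrak{a}] \in \mathrm{cl}_L$, there is a representative $\mathfrak{b}$ of $[\mathfrak{a}]$ (i.e. an ideal $\mathfrak{b} \leq \mathcal{O}_L$ such that $[\mathfrak{b}] = [\mathfrak{a}]$) such that
\[
  N(\mathfrak{b}) \leq c_L = \frac{2\sqrt{|D_L|}}{\pi}.
\]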
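Proof. Apply the argument above to an ideal of $\mathcal{O}_L$ representing $[\mathfrak{a}^{-1}]$: this gives an ideal $\mathfrak{b}$ with $[\mathfrak{b}] = [(\mathfrak{a}^{-1})^{-1}] = [\mathfrak{a}]$ and $N(\mathfrak{b}) \leq c_L$.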
Combining this with the following easy lemma, we know that the class group
is finite!
Lemma. For every $m \in \mathbb{Z}$, there are only finitely many ideals $\mathfrak{a} \leq \mathcal{O}_L$ with $N(\mathfrak{a}) = m$.
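Proof. If $N(\mathfrak{a}) = m$, then by definition $|\mathcal{O}_L / \mathfrak{a}| = m$. So $m \in \mathfrak{a}$ by Lagrange's theorem. So $\langle m \rangle \subseteq \mathfrak{a}$, i.e. $\mathfrak{a} \mid \langle m \rangle$. Hence $\mathfrak{a}$ is a factor of $\langle m \rangle$. By unique factorization into prime ideals, there are only finitely many such ideals.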
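Another proof is as follows:
Proof. As above, any such ideal contains $m$, so it corresponds to an ideal of the quotient $\mathcal{O}_L / m\mathcal{O}_L$, which is isomorphic to $(\mathbb{Z}/m)^n$ as an abelian group and hence finite. So there are only finitely many.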
Thus, we have proved
Theorem. The class group $\mathrm{cl}_L$ is a finite group, and the divisors of ideals of the form $\langle p \rangle$ for $p \in \mathbb{Z}$, $p$ a prime, and $0 < p < c_L$, collectively generate $\mathrm{cl}_L$.
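Proof.
(i) Each element is represented by an ideal of norm less than $c_L = 2\sqrt{|D_L|}/\pi$ (the inequality is strict because norms are integers while $c_L$ is irrational), and there are only finitely many ideals of each norm.
(ii) Given any element of $\mathrm{cl}_L$, we pick a representative $\mathfrak{a}$ such that $N(\mathfrak{a}) < c_L$. We factorize
\[
  \mathfrak{a} = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_r^{e_r}.
\]
Then
\[
  N(\mathfrak{p}_i) \leq N(\mathfrak{a}) < c_L.
\]
Suppose $\mathfrak{p}_i \mid \langle p \rangle$. Then $N(\mathfrak{p}_i)$ is a power of $p$, and is thus at least $p$. So $p < c_L$.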
We now try to work with some explicit examples, utilizing Dedekind’s criterion
and the Minkowski bound.
Example. Consider $d = -7$. So $L = \mathbb{Q}(\sqrt{-7})$, and $D_L = -7$. Then we have
\[
  1 < c_L = \frac{2\sqrt{7}}{\pi} < 2.
\]
So $\mathrm{cl}_L = \{1\}$, since there are no primes $p < c_L$. So $\mathcal{O}_L$ is a UFD.
Similarly, if $d = -1, -2, -3$, then $\mathcal{O}_L$ is a UFD.
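Example. Let $d = -5$. Then $D_L = -20$. We have
\[
  2 < c_L = \frac{4\sqrt{5}}{\pi} < 3.
\]
So $\mathrm{cl}_L$ is generated by primes dividing $\langle 2 \rangle$.
Recall that Dedekind's criterion implies
\[
  \langle 2 \rangle = \langle 2, 1 + \sqrt{-5} \rangle^2 = \mathfrak{p}^2.
\]
Also, $\mathfrak{p} = \langle 2, 1 + \sqrt{-5} \rangle$ is not principal. If it were, then $\mathfrak{p} = \langle \beta \rangle$, with $\beta = x + y\sqrt{-5}$, and $N(\beta) = 2$. But there are no solutions in $\mathbb{Z}$ of $x^2 + 5y^2 = 2$. So $\mathrm{cl}_L = \langle [\mathfrak{p}] \rangle = \mathbb{Z}/2$.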
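The non-principality argument comes down to a finite search, since $x^2 + ky^2 = m$ forces $|y| \leq \sqrt{m/k}$. As a small sketch (the function name is ours, and it only covers the case $\mathcal{O}_L = \mathbb{Z}[\sqrt{-k}]$, i.e. $d = -k \equiv 2, 3 \pmod 4$), we can enumerate all elements of a given norm:

```python
import math

def norm_form_solutions(m, k):
    """All integer (x, y) with x^2 + k*y^2 == m, for integers m >= 0 and k >= 1."""
    solutions = []
    y_max = math.isqrt(m // k)
    for y in range(-y_max, y_max + 1):
        rest = m - k * y * y
        x = math.isqrt(rest)
        if x * x == rest:
            solutions.extend({(x, y), (-x, y)})   # a set, so x = 0 is not listed twice
    return sorted(solutions)

print(norm_form_solutions(2, 5))   # elements of norm 2 in Z[sqrt(-5)]
print(norm_form_solutions(6, 5))   # elements of norm 6 in Z[sqrt(-5)]
```

An empty list for norm 2 is what rules out a generator of $\mathfrak{p}$, whereas norm 6 is attained, for instance by $1 + \sqrt{-5}$.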
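Example. Consider $d = -17 \equiv 3 \pmod 4$. So $c_L \approx 5.25$. So $\mathrm{cl}_L$ is generated by primes dividing $\langle 2 \rangle$, $\langle 3 \rangle$, $\langle 5 \rangle$. We factor
\[
  x^2 + 17 \equiv x^2 + 1 \equiv (x + 1)^2 \pmod 2.
\]
So
\[
  \langle 2 \rangle = \mathfrak{p}^2 = \langle 2, 1 + \sqrt{d} \rangle^2.
\]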
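Doing this mod 3, we have
\[
  x^2 + 17 \equiv x^2 - 1 \equiv (x - 1)(x + 1) \pmod 3.
\]
So we have
\[
  \langle 3 \rangle = \mathfrak{q}\bar{\mathfrak{q}} = \langle 3, 1 + \sqrt{d} \rangle \langle 3, 1 - \sqrt{d} \rangle.
\]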
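Finally, mod 5, we have
\[
  x^2 + 17 \equiv x^2 + 2 \pmod 5,
\]
which is irreducible. So 5 is inert, and $[\langle 5 \rangle] = 1$ in $\mathrm{cl}_L$. So
\[
  \mathrm{cl}_L = \langle [\mathfrak{p}], [\mathfrak{q}] \rangle,
\]
and we need to compute what this is. We can just compute powers $\mathfrak{q}^2, \mathfrak{q}^3, \cdots, \mathfrak{p}\mathfrak{q}, \mathfrak{p}\mathfrak{q}^2, \cdots$, and see what happens.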
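But a faster way is to look for principal ideals with small norms that are multiples of 2 and 3. For example,
\[
  N(\langle 1 + \sqrt{d} \rangle) = 18 = 2 \cdot 3^2.
\]
But we have
\[
  1 + \sqrt{d} \in \mathfrak{p}, \mathfrak{q}.
\]
So $\mathfrak{p}, \mathfrak{q} \mid \langle 1 + \sqrt{d} \rangle$. Thus we know $\mathfrak{p}\mathfrak{q} \mid \langle 1 + \sqrt{d} \rangle$. We have $N(\mathfrak{p}\mathfrak{q}) = 2 \cdot 3 = 6$. So there is another factor of 3 to account for. In fact, we have
\[
  \langle 1 + \sqrt{d} \rangle = \mathfrak{p}\mathfrak{q}^2,
\]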
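which we can show by either thinking hard or expanding it out. So we must have
\[
  [\mathfrak{p}] = [\mathfrak{q}]^{-2}
\]
in $\mathrm{cl}_L$. So we have $\mathrm{cl}_L = \langle [\mathfrak{q}] \rangle$. Also, $[\mathfrak{q}]^{-2} = [\mathfrak{p}] \neq 1$ in $\mathrm{cl}_L$: if it were trivial, then $\mathfrak{p}$ would be principal, i.e. $\mathfrak{p} = \langle x + y\sqrt{d} \rangle$, but $2 = N(\mathfrak{p}) = x^2 + 17y^2$ has no solution in the integers. Also, we know $[\mathfrak{p}]^2 = [1]$, so $[\mathfrak{q}]^4 = 1$ while $[\mathfrak{q}]^2 \neq 1$. So we know
\[
  \mathrm{cl}_L = \mathbb{Z}/4\mathbb{Z}.
\]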
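The factorizations used in this example can be double-checked by counting roots of $x^2 + 17$ modulo each prime. This is a sketch with our own function names; it assumes $\mathcal{O}_L = \mathbb{Z}[\sqrt{-c}]$ (true here since $-17 \equiv 3 \pmod 4$), so that Dedekind's criterion applies directly, and it uses the fact that a monic quadratic with exactly one root mod $p$ has that root as a double root.

```python
def roots_mod_p(c, p):
    """Roots of x^2 + c modulo the prime p, found by trying every residue."""
    return [x for x in range(p) if (x * x + c) % p == 0]

def splitting_type(c, p):
    """Behaviour of <p> in Z[sqrt(-c)], read off from x^2 + c mod p."""
    roots = roots_mod_p(c, p)
    if len(roots) == 2:
        return "split"
    if len(roots) == 1:
        return "ramified"
    return "inert"

for p in (2, 3, 5):
    print(p, roots_mod_p(17, p), splitting_type(17, p))

# N(1 + sqrt(-17)) = 1^2 + 17 * 1^2
print(1 * 1 + 17 * 1 * 1)
```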
In fact, we have
Theorem. Let $L = \mathbb{Q}(\sqrt{d})$ with $d < 0$. Then $\mathcal{O}_L$ is a UFD if
\[
  -d \in \{1, 2, 3, 7, 11, 19, 43, 67, 163\}.
\]
Moreover, this is actually an “if and only if”.
The first part is a straightforward generalization of what we have been doing, but the proof that no other $d$ works is hard.
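The "if" direction can be confirmed by machine, although not by the method of these notes. The sketch below swaps in a different, classical technique: for a negative fundamental discriminant $D_L$, the class number of $\mathcal{O}_L$ equals the number of reduced primitive binary quadratic forms $ax^2 + bxy + cy^2$ with $b^2 - 4ac = D_L$. We take that correspondence as an assumption here (it is not proved in these notes), and the function names are our own.

```python
from math import gcd

def discriminant(d):
    """D_L for L = Q(sqrt(d)), d a squarefree integer."""
    return d if d % 4 == 1 else 4 * d

def reduced_forms(D):
    """Reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    forms = []
    a = 1
    while 3 * a * a <= -D:              # |b| <= a <= c forces 3a^2 <= |D|
        for b in range(-a + 1, a + 1):  # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue                # not reduced
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def class_number(d):
    """Number of reduced forms of discriminant D_L, for squarefree d < 0."""
    return len(reduced_forms(discriminant(d)))

for d in (-1, -2, -3, -7, -11, -19, -43, -67, -163, -5, -17):
    print(d, class_number(d))
```

The counts for $d = -5$ and $d = -17$ agree with the orders of the class groups $\mathbb{Z}/2$ and $\mathbb{Z}/4\mathbb{Z}$ found in the examples above.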
General case
Now we want to extend these ideas to higher dimensions. We are really just
doing the same thing, but we need a bit harder geometry and proper definitions.
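Definition (Discrete subset). A subset $X \subseteq \mathbb{R}^n$ is discrete if for every $x \in X$, there is some $\varepsilon > 0$ such that $B_\varepsilon(x) \cap X = \{x\}$. For a subgroup $X$ of $\mathbb{R}^n$ (the case we need), this is true if and only if for every compact $K \subseteq \mathbb{R}^n$, $K \cap X$ is finite.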
We have the following very useful characterization of discrete subgroups of $\mathbb{R}^n$:
Proposition. Suppose $\Lambda \subseteq \mathbb{R}^n$ is a subgroup. Then $\Lambda$ is a discrete subgroup of $(\mathbb{R}^n, +)$ if and only if
\[
  \Lambda = \left\{ \sum_{i=1}^m n_i x_i : n_i \in \mathbb{Z} \right\}
\]
for some $x_1, \cdots, x_m$ linearly independent over $\mathbb{R}$.
Note that linear independence is important. For example, $\mathbb{Z}\sqrt{2} + \mathbb{Z}\sqrt{3} \subseteq \mathbb{R}$ is not discrete. On the other hand, if $\Lambda = \mathfrak{a} \lhd \mathcal{O}_L$ is an ideal, where $L = \mathbb{Q}(\sqrt{d})$ and $d < 0$, then this is discrete.
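The failure of discreteness for $\mathbb{Z}\sqrt{2} + \mathbb{Z}\sqrt{3}$ can be observed numerically: as we allow larger coefficients, we find nonzero elements arbitrarily close to 0. The following brute-force sketch (function name and search bounds are our own choices) only illustrates this; it is not a proof.

```python
import math

def smallest_nonzero_value(bound):
    """min |a*sqrt(2) + b*sqrt(3)| over integers |a|, |b| <= bound, not both zero."""
    r2, r3 = math.sqrt(2), math.sqrt(3)
    best = math.inf
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if (a, b) != (0, 0):
                best = min(best, abs(a * r2 + b * r3))
    return best

for bound in (10, 50, 200):
    print(bound, smallest_nonzero_value(bound))
```

The printed minima shrink as the bound grows, so no $\varepsilon$-ball around 0 isolates it from the rest of the subgroup.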
Proof. Suppose $\Lambda$ is generated by $x_1, \cdots, x_m$. By linear independence, there is some $g \in \mathrm{GL}_n(\mathbb{R})$ such that $g x_i = e_i$ for all $1 \leq i \leq m$, where $e_1, \cdots, e_n$ is the standard basis. We know acting by $g$ preserves discreteness, since it is a homeomorphism, and $g\Lambda = \mathbb{Z}^m \subseteq \mathbb{R}^m \times \mathbb{R}^{n-m}$ is clearly discrete (take $\varepsilon = \frac{1}{2}$). So this direction follows.
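For the other direction, suppose $\Lambda$ is discrete. We pick $y_1, \cdots, y_m \in \Lambda$ which are linearly independent over $\mathbb{R}$, with $m$ maximal (so $m \leq n$). Then by maximality, we know
\[
  \left\{ \sum_{i=1}^m \lambda_i y_i : \lambda_i \in \mathbb{R} \right\} = \left\{ \sum_{i} \lambda_i z_i : \lambda_i \in \mathbb{R}, z_i \in \Lambda \right\},
\]
where the sums on the right are finite, and this is the smallest vector subspace of $\mathbb{R}^n$ containing $\Lambda$. We now let
\[
  X = \left\{ \sum_{i=1}^m \lambda_i y_i : \lambda_i \in [0, 1] \right\} \cong [0, 1]^m.
\]
This is closed and bounded, and hence compact. So $X \cap \Lambda$ is finite.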
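Also, we know
\[
  \bigoplus \mathbb{Z} y_i = \mathbb{Z}^m \subseteq \Lambda,
\]
and if $\gamma$ is any element of $\Lambda$, we can write it as $\gamma = \gamma_0 + \gamma_1$, where $\gamma_0 \in X$ and $\gamma_1 \in \mathbb{Z}^m$ (so in fact $\gamma_0 = \gamma - \gamma_1 \in X \cap \Lambda$). So
\[
  \left| \frac{\Lambda}{\mathbb{Z}^m} \right| \leq |X \cap \Lambda| < \infty.
\]
So let $d = |\Lambda / \mathbb{Z}^m|$. Then $d\Lambda \subseteq \mathbb{Z}^m$, i.e. $\Lambda \subseteq \frac{1}{d}\mathbb{Z}^m$. So
\[
  \mathbb{Z}^m \subseteq \Lambda \subseteq \frac{1}{d}\mathbb{Z}^m.
\]
So $\Lambda$ is a free abelian group of rank $m$. So there exist $x_1, \cdots, x_m \in \frac{1}{d}\mathbb{Z}^m$ which form an integral basis of $\Lambda$ and are linearly independent over $\mathbb{R}$.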
Definition (Lattice). If $\operatorname{rank} \Lambda = n = \dim \mathbb{R}^n$, then $\Lambda$ is a lattice in $\mathbb{R}^n$.
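Definition (Covolume and fundamental domain). Let $\Lambda \subseteq \mathbb{R}^n$ be a lattice, and $x_1, \cdots, x_n$ be a basis of $\Lambda$. Then let
\[
  P = \left\{ \sum_{i=1}^n \lambda_i x_i : \lambda_i \in [0, 1] \right\},
\]
and define the covolume of $\Lambda$ to be
\[
  \operatorname{covol}(\Lambda) = \operatorname{vol}(P) = |\det A|,
\]
where $A$ is the matrix such that $x_i = \sum_j a_{ij} e_j$.
We say $P$ is a fundamental domain for the action of $\Lambda$ on $\mathbb{R}^n$, i.e.
\[
  \mathbb{R}^n = \bigcup_{\gamma \in \Lambda} (\gamma + P),
\]
and, for $\gamma \neq \mu$,
\[
  (\gamma + P) \cap (\mu + P) \subseteq \partial(\gamma + P).
\]
In particular, the intersection has zero volume.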
This is called the covolume since if we consider the space $\mathbb{R}^n / \Lambda$, which is an $n$-dimensional torus, then this has volume $\operatorname{covol}(\Lambda)$.
Observe now that if $x_1', \cdots, x_n'$ is a different basis of $\Lambda$, then the transition matrix $B = (b_{ij})$ given by $x_i' = \sum_j b_{ij} x_j$ has $B \in \mathrm{GL}_n(\mathbb{Z})$. So we have $\det B = \pm 1$, and $\operatorname{covol}(\Lambda)$ is independent of the basis choice.
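As a concrete illustration of this basis independence, the sketch below takes an arbitrary lattice in $\mathbb{R}^3$ and changes basis by an integer matrix of determinant 1. The particular matrices and the helper names `det` and `matmul` are our own choices for the example.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

basis = [[2, 1, 0], [0, 3, 1], [1, 0, 1]]    # rows are x_1, x_2, x_3
change = [[1, 2, 0], [0, 1, 0], [3, 1, 1]]   # integer matrix B with det B = 1
new_basis = matmul(change, basis)            # rows are x'_i = sum_j b_ij x_j

print(det(change), abs(det(basis)), abs(det(new_basis)))
```

Both bases give the same value of $|\det A|$, as the multiplicativity of the determinant predicts.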
With these notations, we can now state Minkowski’s theorem.
Theorem (Minkowski's theorem). Let $\Lambda \subseteq \mathbb{R}^n$ be a lattice, and $P$ be a fundamental domain. We let $S \subseteq \mathbb{R}^n$ be a measurable set, i.e. one for which $\operatorname{vol}(S)$ is defined.
(i) Suppose $\operatorname{vol}(S) > \operatorname{covol}(\Lambda)$. Then there exist distinct $x, y \in S$ such that $x - y \in \Lambda$.
(ii) Suppose $0 \in S$, and $S$ is symmetric around $0$, i.e. $s \in S$ if and only if $-s \in S$, and $S$ is convex, i.e. for all $x, y \in S$ and $\lambda \in [0, 1]$, we have
\[
  \lambda x + (1 - \lambda) y \in S.
\]
Then suppose either
(a) $\operatorname{vol}(S) > 2^n \operatorname{covol}(\Lambda)$; or
(b) $\operatorname{vol}(S) \geq 2^n \operatorname{covol}(\Lambda)$ and $S$ is closed.
Then $S$ contains a $\gamma \in \Lambda$ with $\gamma \neq 0$.
Note that for n = 2, this is what we used for quadratic fields.
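By considering $\Lambda = \mathbb{Z}^n \subseteq \mathbb{R}^n$ and the open cube $S = (-1, 1)^n$, which has volume exactly $2^n$ but contains no nonzero point of $\Lambda$, we know the bounds are sharp.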
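Proof.
(i) Suppose $\operatorname{vol}(S) > \operatorname{covol}(\Lambda) = \operatorname{vol}(P)$. Since $P \subseteq \mathbb{R}^n$ is a fundamental domain, we have
\[
  \operatorname{vol}(S) = \operatorname{vol}(S \cap \mathbb{R}^n) = \operatorname{vol}\left( S \cap \bigcup_{\gamma \in \Lambda} (P + \gamma) \right) = \sum_{\gamma \in \Lambda} \operatorname{vol}(S \cap (P + \gamma)).
\]
Also, we know
\[
  \operatorname{vol}(S \cap (P + \gamma)) = \operatorname{vol}((S - \gamma) \cap P),
\]
as volume is translation invariant. We now claim the sets $(S - \gamma) \cap P$ for $\gamma \in \Lambda$ are not pairwise disjoint. If they were, then
\[
  \operatorname{vol}(P) \geq \sum_{\gamma \in \Lambda} \operatorname{vol}((S - \gamma) \cap P) = \sum_{\gamma \in \Lambda} \operatorname{vol}(S \cap (P + \gamma)) = \operatorname{vol}(S),
\]
contradicting our assumption.
Then in particular, there are some distinct $\gamma$ and $\mu$ such that $(S - \gamma)$ and $(S - \mu)$ are not disjoint. In other words, there are $x, y \in S$ such that $x - \gamma = y - \mu$, i.e. $x - y = \gamma - \mu \in \Lambda \setminus \{0\}$.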
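(ii) We now let
\[
  S' = \frac{1}{2} S = \left\{ \frac{1}{2} s : s \in S \right\}.
\]
(a) In this case we have
\[
  \operatorname{vol}(S') = 2^{-n} \operatorname{vol}(S) > \operatorname{covol}(\Lambda)
\]
by assumption. So by (i) there exist distinct $y, z \in S'$ such that $y - z \in \Lambda \setminus \{0\}$. We now write
\[
  y - z = \frac{1}{2}(2y + (-2z)).
\]
We have $2y \in S$, and $2z \in S$ implies $-2z \in S$ by symmetry around $0$, so we know $y - z \in S$ by convexity.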
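(b) We apply the previous part to $S_m = \left(1 + \frac{1}{m}\right) S$ for all $m \in \mathbb{N}$, $m > 0$. So we get a nonzero $\gamma_m \in S_m \cap \Lambda$.
By convexity, we know $S_m \subseteq S_1 = 2S$ for all $m$. So $\gamma_1, \gamma_2, \cdots \in S_1 \cap \Lambda$. But $S_1$ is a compact set. So $S_1 \cap \Lambda$ is finite. So there exists $\gamma$ such that $\gamma_m = \gamma$ for infinitely many $m$. Since the $S_m$ are decreasing and $S$ is closed, this gives
\[
  \gamma \in \bigcap_{m \geq 1} S_m = S.
\]
So $\gamma \in S$.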
We are now going to use this to mimic our previous proof that the class
group of an imaginary quadratic field is finite.
To begin with, we need to produce lattices from ideals of $\mathcal{O}_L$. Let $L$ be a number field, and $[L : \mathbb{Q}] = n$. We let $\sigma_1, \cdots, \sigma_r : L \to \mathbb{R}$ be the real embeddings, and $\sigma_{r+1}, \cdots, \sigma_{r+s}, \bar\sigma_{r+1}, \cdots, \bar\sigma_{r+s} : L \to \mathbb{C}$ be the complex embeddings (note that which embedding is $\sigma_{r+i}$ and which is $\bar\sigma_{r+i}$ is an arbitrary choice).
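Then this defines an embedding
\[
  \sigma = (\sigma_1, \sigma_2, \cdots, \sigma_r, \sigma_{r+1}, \cdots, \sigma_{r+s}) : L \to \mathbb{R}^r \times \mathbb{C}^s \cong \mathbb{R}^r \times \mathbb{R}^{2s} = \mathbb{R}^{r+2s} = \mathbb{R}^n,
\]
under the isomorphism $\mathbb{C} \to \mathbb{R}^2$ by $x + iy \mapsto (x, y)$.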
Just as we did for quadratic fields, we can relate the norm of ideals to their
covolume.
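Lemma.
(i) $\sigma(\mathcal{O}_L)$ is a lattice in $\mathbb{R}^n$ of covolume $2^{-s} |D_L|^{1/2}$.
(ii) More generally, if $\mathfrak{a} \lhd \mathcal{O}_L$ is an ideal, then $\sigma(\mathfrak{a})$ is a lattice and the covolume is
\[
  \operatorname{covol}(\sigma(\mathfrak{a})) = 2^{-s} |D_L|^{1/2} N(\mathfrak{a}).
\]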
Proof. Obviously (ii) implies (i). So we just prove (ii). Recall that $\mathfrak{a}$ has an integral basis $\gamma_1, \cdots, \gamma_n$. Then $\sigma(\mathfrak{a})$ is the integer span of the vectors
\[
  (\sigma_1(\gamma_i), \sigma_2(\gamma_i), \cdots, \sigma_{r+s}(\gamma_i))
\]
for $i = 1, \cdots, n$, and they are independent as we will soon see when we compute the determinant. So it is a lattice.
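We also know that
\[
  \Delta(\gamma_1, \cdots, \gamma_n) = \det(\sigma_i(\gamma_j))^2 = N(\mathfrak{a})^2 D_L,
\]
where the $\sigma_i$ run over all $\sigma_1, \cdots, \sigma_r, \sigma_{r+1}, \cdots, \sigma_{r+s}, \bar\sigma_{r+1}, \cdots, \bar\sigma_{r+s}$. So we know
\[
  |\det(\sigma_i(\gamma_j))| = N(\mathfrak{a}) |D_L|^{1/2}.
\]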
So what we have to do is to relate $\det(\sigma_i(\gamma_j))$ to the covolume of $\sigma(\mathfrak{a})$. But these two expressions are very similar.
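In the $(\sigma_i(\gamma_j))$ matrix, we have columns that look like
\[
  \begin{pmatrix} \sigma_{r+i}(\gamma_j) & \bar\sigma_{r+i}(\gamma_j) \end{pmatrix} = \begin{pmatrix} z & \bar z \end{pmatrix}.
\]
On the other hand, the matrix of $\sigma(\gamma)$ has corresponding entries
\[
  \begin{pmatrix} \operatorname{Re}(z) \\ \operatorname{Im}(z) \end{pmatrix} = \begin{pmatrix} \frac{1}{2}(z + \bar z) \\ \frac{i}{2}(\bar z - z) \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix} \begin{pmatrix} z \\ \bar z \end{pmatrix}.
\]
We call the last matrix $A = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}$. We can compute the determinant as
\[
  |\det A| = \left| \det \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix} \right| = \left| \frac{i}{2} \right| = \frac{1}{2}.
\]
Hence the change of basis matrix from $(\sigma_i(\gamma_j))$ to $\sigma(\gamma)$ consists of $s$ diagonal copies of $A$ (and the identity on the real embeddings), so its determinant has absolute value $2^{-s}$. So this proves the lemma.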
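Proposition. Let $\mathfrak{a} \lhd \mathcal{O}_L$ be an ideal. Then there exists an $\alpha \in \mathfrak{a}$ with $\alpha \neq 0$ such that
\[
  |N(\alpha)| \leq c_L N(\mathfrak{a}),
\]
where
\[
  c_L = \left( \frac{4}{\pi} \right)^s \frac{n!}{n^n} |D_L|^{1/2}.
\]
This is the Minkowski bound.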
Proof. Let
\[
  B_{r,s}(t) = \left\{ (y_1, \cdots, y_r, z_1, \cdots, z_s) \in \mathbb{R}^r \times \mathbb{C}^s : \sum |y_i| + 2 \sum |z_i| \leq t \right\}.
\]
This
(i) is closed and bounded;
(ii) is measurable (it is defined by polynomial inequalities);
(iii) has volume
\[
  \operatorname{vol}(B_{r,s}(t)) = 2^r \left( \frac{\pi}{2} \right)^s \frac{t^n}{n!};
\]
(iv) is convex and symmetric about 0.
Only (iii) requires proof, and it is on the second example sheet, i.e. we are not doing it here. It is just doing the integral.
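Although we do not do the integral in general, the volume formula in (iii) is easy to sanity-check by hand in the smallest cases, where $B_{r,s}(t)$ is a familiar shape. The following worked check is our own addition and only covers $n \leq 2$:

```latex
% (r, s) = (1, 0): B is the interval [-t, t], and n = 1.
\operatorname{vol}(B_{1,0}(t)) = 2t = 2^1 \left(\frac{\pi}{2}\right)^0 \frac{t^1}{1!}.

% (r, s) = (2, 0): B is the square |y_1| + |y_2| \le t with both diagonals of length 2t, and n = 2.
\operatorname{vol}(B_{2,0}(t)) = \frac{(2t)(2t)}{2} = 2t^2 = 2^2 \left(\frac{\pi}{2}\right)^0 \frac{t^2}{2!}.

% (r, s) = (0, 1): B is the disc 2|z_1| \le t of radius t/2, and n = 2.
\operatorname{vol}(B_{0,1}(t)) = \pi \left(\frac{t}{2}\right)^2 = \frac{\pi t^2}{4} = 2^0 \left(\frac{\pi}{2}\right)^1 \frac{t^2}{2!}.
```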
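We now choose $t$ so that
\[
  \operatorname{vol} B_{r,s}(t) = 2^n \operatorname{covol}(\sigma(\mathfrak{a})).
\]
Explicitly, we let
\[
  t^n = \left( \frac{4}{\pi} \right)^s n! \, |D_L|^{1/2} N(\mathfrak{a}).
\]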
Then by Minkowski's theorem, there is some nonzero $\alpha \in \mathfrak{a}$ such that $\sigma(\alpha) \in B_{r,s}(t)$. We write
\[
  \sigma(\alpha) = (y_1, \cdots, y_r, z_1, \cdots, z_s).
\]
Then we observe
\[
  N(\alpha) = y_1 \cdots y_r z_1 \bar z_1 z_2 \bar z_2 \cdots z_s \bar z_s = \prod y_i \prod |z_j|^2.
\]
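By the AM-GM inequality, we know
\[
  |N(\alpha)|^{1/n} \leq \frac{1}{n} \left( \sum |y_i| + 2 \sum |z_j| \right) \leq \frac{t}{n},
\]
as we know $\sigma(\alpha) \in B_{r,s}(t)$. So we get
\[
  |N(\alpha)| \leq \frac{t^n}{n^n} = c_L N(\mathfrak{a}).
\]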
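The general constant is just as easy to evaluate as the quadratic one. In the sketch below the function name and argument order are our own; `n` is the degree, `s` the number of pairs of complex embeddings, and `disc` the discriminant $D_L$, all supplied by the caller. For $n = 2$, $s = 1$ it should reproduce the constant $2\sqrt{|D_L|}/\pi$ from the quadratic case.

```python
import math

def minkowski_bound(n, s, disc):
    """c_L = (4/pi)^s * n!/n^n * sqrt(|D_L|)."""
    return (4 / math.pi) ** s * math.factorial(n) / n ** n * math.sqrt(abs(disc))

# Imaginary quadratic fields: agrees with 2 sqrt(|D_L|) / pi.
for disc in (-7, -20, -68):
    print(disc, minkowski_bound(2, 1, disc), 2 * math.sqrt(abs(disc)) / math.pi)

# A real quadratic example: Q(sqrt(2)) has n = 2, s = 0, D_L = 8.
print(minkowski_bound(2, 0, 8))
```

For $\mathbb{Q}(\sqrt{2})$ the bound is below 2, so by the same reasoning as before there are no primes to check and the class group is trivial.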
Corollary. Every $[\mathfrak{a}] \in \mathrm{cl}_L$ has a representative $\mathfrak{a} \lhd \mathcal{O}_L$ with $N(\mathfrak{a}) \leq c_L$.
Theorem (Dirichlet). The class group $\mathrm{cl}_L$ is finite, and is generated by prime ideals of norm $\leq c_L$.
Proof. Just as the case for imaginary quadratic fields.