II Number Fields
7 Dirichlet’s unit theorem
We have previously characterized the units of $\mathcal{O}_L$ as the elements of norm $\pm 1$, i.e. $\alpha \in \mathcal{O}_L$ is a unit if and only if $|N(\alpha)| = 1$. However, this doesn’t tell us much about how many units there are, or how they are distributed. The answer to this question is given by Dirichlet’s unit theorem.
Theorem (Dirichlet unit theorem). We have the isomorphism
\[ \mathcal{O}_L^\times \cong \mu_L \times \mathbb{Z}^{r+s-1}, \]
where
\[ \mu_L = \{\alpha \in L : \alpha^N = 1 \text{ for some } N > 0\} \]
is the group of roots of unity in $L$, and is a finite cyclic group.
Just as in the finiteness of the class group, we do it for an example first, or
else it will be utterly incomprehensible.
We do the example of real quadratic fields, $L = \mathbb{Q}(\sqrt{d})$, where $d > 1$ is squarefree. So $r = 2$, $s = 0$, and $L \subseteq \mathbb{R}$ implies $\mu_L = \{\pm 1\}$. So
\[ \mathcal{O}_L^\times \cong \{\pm 1\} \times \mathbb{Z}. \]
Also, we know that
\[ N(x + y\sqrt{d}) = (x + y\sqrt{d})(x - y\sqrt{d}) = x^2 - dy^2. \]
So Dirichlet’s theorem is saying that there are infinitely many solutions of $x^2 - dy^2 = \pm 1$, and that they are all (plus or minus) powers of one single element.
Theorem (Pell’s equation). There are infinitely many $x + y\sqrt{d} \in \mathcal{O}_L$ such that $x^2 - dy^2 = \pm 1$.
You might have seen this in IIC Number Theory, where we proved it directly
by continued fractions. We will provide a totally unconstructive proof here, since
this is more easily generalized to arbitrary number fields.
This is actually just half of Dirichlet’s theorem. The remaining part is to
show that they are all powers of the same element.
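To get a feel for both halves of the statement, here is a small brute-force sketch in Python for $d = 2$, where $\mathcal{O}_L = \mathbb{Z}[\sqrt{2}]$. It lists the solutions of $x^2 - 2y^2 = \pm 1$ with $x > 0$ and $0 \le y \le 100$, and compares them with the powers of $1 + \sqrt{2}$. The function names `pell_solutions` and `powers` are ours, chosen for illustration, and a finite search is of course evidence rather than a proof.

```python
from math import isqrt

def pell_solutions(d, y_max):
    """All (x, y) with x > 0, 0 <= y <= y_max and x^2 - d*y^2 = +1 or -1."""
    found = []
    for y in range(y_max + 1):
        for rhs in (1, -1):
            x2 = d * y * y + rhs
            if x2 > 0:
                x = isqrt(x2)
                if x * x == x2:
                    found.append((x, y))
    return found

def powers(d, x0, y0, count):
    """First `count` powers (x0 + y0*sqrt(d))^n, n = 0, 1, ..., as pairs (x, y)."""
    out, (x, y) = [], (1, 0)
    for _ in range(count):
        out.append((x, y))
        # (x + y*sqrt(d)) * (x0 + y0*sqrt(d))
        x, y = x * x0 + d * y * y0, x * y0 + y * x0
    return out

solutions = pell_solutions(2, 100)
print(solutions)
print(powers(2, 1, 1, len(solutions)))
```

In this range the two lists agree, which is what the second half of Dirichlet’s theorem predicts for $d = 2$.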
Proof. Recall that $\sigma : \mathcal{O}_L \to \mathbb{R}^2$ sends
\[ \alpha = x + y\sqrt{d} \mapsto (\sigma_1(\alpha), \sigma_2(\alpha)) = (x + y\sqrt{d}, x - y\sqrt{d}). \]
(In the domain, $\sqrt{d}$ is a formal symbol, while in the codomain, it is a real number, namely the positive square root of $d$.)
Also, we know
\[ \operatorname{covol}(\sigma(\mathcal{O}_L)) = |D_L|^{1/2}. \]
[Figure: lattice picture with labels $(1, 1)$, $\mathbb{Z}[X]$ and $N(\alpha) = 1$.]
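Consider
\[ s_t = \left\{ (y_1, y_2) \in \mathbb{R}^2 : |y_1| \le t,\ |y_2| \le \frac{|D_L|^{1/2}}{t} \right\}. \]
So
\[ \operatorname{vol}(s_t) = 4|D_L|^{1/2} = 2^n \operatorname{covol}(\sigma(\mathcal{O}_L)), \]
as $n = [L : \mathbb{Q}] = 2$. Now Minkowski implies there is a nonzero $\alpha \in \mathcal{O}_L$ such that $\sigma(\alpha) \in s_t$. Also, if we write
\[ \sigma(\alpha) = (y_1, y_2), \]
then
\[ N(\alpha) = y_1 y_2. \]
So such an $\alpha$ will satisfy
\[ 1 \le |N(\alpha)| \le |D_L|^{1/2}. \]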
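As a sanity check on the Minkowski step, the following Python sketch searches the box $s_t$ for a nonzero lattice point in the case $d = 2$, where $\mathcal{O}_L = \mathbb{Z}[\sqrt{2}]$ and $D_L = 8$. The helper `point_in_box` is a hypothetical name, and the search is a naive brute force in floating point, not part of the proof.

```python
from math import sqrt, floor

def point_in_box(d, disc, t):
    """Find a nonzero x + y*sqrt(d) with (x + y*sqrt(d), x - y*sqrt(d)) in s_t."""
    rd, bound = sqrt(d), sqrt(disc) / t
    # For a point of s_t, |x| and |y|*sqrt(d) are each at most (t + bound) / 2.
    x_max = floor((t + bound) / 2) + 1
    y_max = floor((t + bound) / (2 * rd)) + 1
    for x in range(-x_max, x_max + 1):
        for y in range(-y_max, y_max + 1):
            if (x, y) == (0, 0):
                continue
            if abs(x + y * rd) <= t and abs(x - y * rd) <= bound:
                return x, y
    return None

for t in (1.0, 0.5, 0.1, 0.02):
    x, y = point_in_box(2, 8, t)
    print(t, (x, y), x * x - 2 * y * y)
```

Every point found has $|N(\alpha)| \le \sqrt{8}$, so the norm is $\pm 1$ or $\pm 2$, as the bound predicts. Shrinking $t$ forces the search to return different points.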
This is not quite what we want, since we need $|N(\alpha)| = 1$ exactly. Nevertheless, this is a good start. So let’s try to find infinitely many such elements.
First notice that no point of the lattice (apart from the origin) lies on the $x$ or $y$ axis, since any such point must satisfy $x \pm y\sqrt{d} = 0$, but $\sqrt{d}$ is not rational.
Also, $s_t$ is compact. So $s_t \cap \sigma(\mathcal{O}_L)$ contains finitely many points. So we can find a $t_2$ such that for each nonzero $(y_1, y_2) \in s_t \cap \sigma(\mathcal{O}_L)$, we have $|y_1| > t_2$. In particular, $s_{t_2}$ does not contain any nonzero point of $s_t \cap \sigma(\mathcal{O}_L)$. So we get a new nonzero $\alpha \in \mathcal{O}_L$ with $\sigma(\alpha) \in s_{t_2}$ such that $1 \le |N(\alpha)| \le |D_L|^{1/2}$.
[Figure: the regions labelled $s_1$ and $s_2$.]
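We can do the same thing for $s_{t_2}$ and get a new $t_3$. In general, given $t_1 > \cdots > t_n$, pick $t_{n+1}$ such that
\[ 0 < t_{n+1} < \min\left\{ |y_1| : (y_1, y_2) \in \bigcup_{i=1}^n s_{t_i} \cap \sigma(\mathcal{O}_L),\ (y_1, y_2) \ne (0, 0) \right\}. \]
The minimum exists and is positive since each $s_{t_i}$ is compact, hence contains only finitely many points of the lattice $\sigma(\mathcal{O}_L)$, and no nonzero lattice point has $y_1 = 0$.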
Then we get an infinite sequence of $t_i$ such that the sets of nonzero points of $s_{t_i} \cap \sigma(\mathcal{O}_L)$ are disjoint for different $i$. Since each must contain at least one point, we have got infinitely many points in $\mathcal{O}_L$ satisfying $1 \le |N(\alpha)| \le |D_L|^{1/2}$.
Since there are only finitely many integers $m$ with $1 \le |m| \le |D_L|^{1/2}$, we can apply the pigeonhole principle, and get that there is some integer $m$ satisfying $1 \le |m| \le |D_L|^{1/2}$ such that there exist infinitely many $\alpha \in \mathcal{O}_L$ with $N(\alpha) = m$.
This is not quite good enough. We consider
\[ \mathcal{O}_L / m\mathcal{O}_L \cong (\mathbb{Z}/m\mathbb{Z})^{[L:\mathbb{Q}]}, \]
another finite set. We notice that each $\alpha \in \mathcal{O}_L$ must fall into one of the finitely many cosets of $m\mathcal{O}_L$ in $\mathcal{O}_L$. In particular, each $\alpha$ such that $N(\alpha) = m$ must belong to one of these cosets.
So again by the pigeonhole principle, there exists a $\beta \in \mathcal{O}_L$ with $N(\beta) = m$, and infinitely many $\alpha \in \mathcal{O}_L$ with $N(\alpha) = m$ and $\alpha \equiv \beta \pmod{m\mathcal{O}_L}$.
Now of course $\alpha$ and $\beta$ are not necessarily units if $|m| \ne 1$. However, we will show that $\alpha/\beta$ is. The hard part is of course showing that it is in $\mathcal{O}_L$ itself, because it is clear that $\alpha/\beta$ has norm 1 (alternatively, by symmetry, $\beta/\alpha$ is in $\mathcal{O}_L$, so an inverse exists).
Hence all that remains is to prove the general fact that if
\[ \alpha = \beta + m\gamma, \]
where $\alpha, \beta, \gamma \in \mathcal{O}_L$ and $N(\alpha) = N(\beta) = m$, then $\alpha/\beta \in \mathcal{O}_L$.
To show this, we just have to compute
\[ \frac{\alpha}{\beta} = 1 + \frac{m}{\beta}\gamma = 1 + \frac{N(\beta)}{\beta}\gamma = 1 + \bar\beta\gamma \in \mathcal{O}_L, \]
since $N(\beta) = \beta\bar\beta$. So done.
We have thus constructed infinitely many units.
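The quotient trick is easy to test by hand. The sketch below works in $\mathbb{Z}[\sqrt{7}]$ (so $d = 7$, an illustrative choice of ours), with $\beta = 3 + \sqrt{7}$ and $\alpha = 45 + 17\sqrt{7}$, both of norm $m = 2$ and congruent modulo $2$. The helper names `norm` and `divide` are hypothetical.

```python
def norm(d, a):
    """Norm x^2 - d*y^2 of a = (x, y), read as x + y*sqrt(d)."""
    x, y = a
    return x * x - d * y * y

def divide(d, a, b):
    """Return a / b as an integer pair, or None if the quotient is not in Z[sqrt(d)]."""
    (x, y), (u, v) = a, b
    n = norm(d, b)
    # a / b = a * conj(b) / N(b), where conj(b) = u - v*sqrt(d).
    p, q = x * u - d * y * v, y * u - x * v
    if p % n == 0 and q % n == 0:
        return p // n, q // n
    return None

d, m = 7, 2
beta, alpha = (3, 1), (45, 17)
assert norm(d, alpha) == norm(d, beta) == m
assert (alpha[0] - beta[0]) % m == 0 and (alpha[1] - beta[1]) % m == 0

unit = divide(d, alpha, beta)
print(unit, norm(d, unit))          # quotient and its norm
print(divide(d, (2, 1), beta))      # norms -3 and 2: quotient is not integral
```

Here the quotient is $8 + 3\sqrt{7}$, which has norm $64 - 63 = 1$. The last line shows that without the two hypotheses the quotient need not lie in $\mathcal{O}_L$.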
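We now prove the remaining part.
Theorem (Dirichlet’s unit theorem for real quadratic fields). Let $L = \mathbb{Q}(\sqrt{d})$. Then there is some $\varepsilon_0 \in \mathcal{O}_L^\times$ such that
\[ \mathcal{O}_L^\times = \{\pm\varepsilon_0^n : n \in \mathbb{Z}\}. \]
We call such an $\varepsilon_0$ a fundamental unit (which is not unique). So
\[ \mathcal{O}_L^\times \cong \{\pm 1\} \times \mathbb{Z}. \]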
Proof. We have just proved the really powerful theorem that there are infinitely many $\varepsilon$ with $N(\varepsilon) = 1$. We are not going to need the full theorem. All we need is that there are three, so that in particular there is one that is not $\pm 1$.
We pick some $\varepsilon \in \mathcal{O}_L^\times$ with $\varepsilon \ne \pm 1$. This exists by what we just proved.
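Then we know
\[ |\sigma_1(\varepsilon)| \ne 1, \]
as $|\sigma_1(\varepsilon)| = 1$ if and only if $\varepsilon = \pm 1$. Replacing $\varepsilon$ by $\varepsilon^{-1}$ if necessary, we may assume wlog that $E = |\sigma_1(\varepsilon)| > 1$. Now consider
\[ \{\alpha \in \mathcal{O}_L : N(\alpha) = \pm 1,\ 1 \le |\sigma_1(\alpha)| \le E\}. \]
This is again finite, since it is specified by a compact subset of the $\mathcal{O}_L$ lattice (note that $|\sigma_2(\alpha)| = 1/|\sigma_1(\alpha)| \le 1$ on this set).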
So we pick $\varepsilon_0$ in this set with $\varepsilon_0 \ne \pm 1$ and $|\sigma_1(\varepsilon_0)|$ minimal ($> 1$). Replacing $\varepsilon_0$ by $-\varepsilon_0$ if necessary, we can assume $\sigma_1(\varepsilon_0) > 1$.
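Finally, we claim that if $\varepsilon \in \mathcal{O}_L^\times$ and $\sigma_1(\varepsilon) > 0$, then $\varepsilon = \varepsilon_0^N$ for some $N \in \mathbb{Z}$. This is obvious if we have addition instead of multiplication. So we take logs. Suppose
\[ \frac{\log \varepsilon}{\log \varepsilon_0} = N + \gamma, \]
where $N \in \mathbb{Z}$ and $0 \le \gamma < 1$. Then we know
\[ \varepsilon\varepsilon_0^{-N} = \varepsilon_0^\gamma \in \mathcal{O}_L^\times, \]
but $|\varepsilon_0^\gamma| = |\varepsilon_0|^\gamma < |\varepsilon_0|$, as $|\varepsilon_0| > 1$. By our choice of $\varepsilon_0$, we must have $\gamma = 0$. So done.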
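The logarithm argument also tells us how to recover the exponent in practice. As a rough sketch for $\mathbb{Q}(\sqrt{2})$, assuming the fundamental unit $\varepsilon_0 = 1 + \sqrt{2}$, we can compute $\log|u| / \log \varepsilon_0$ in floating point, round it, and then confirm the answer with exact integer arithmetic. The function names are hypothetical, and the floating-point step is only reliable for units of moderate size.

```python
from math import log, sqrt

def mul(d, a, b):
    """Multiply a = (x, y) and b = (u, v), read as x + y*sqrt(d) and u + v*sqrt(d)."""
    (x, y), (u, v) = a, b
    return x * u + d * y * v, x * v + y * u

def unit_power(d, eps, n):
    """Return eps^n for a unit eps of Z[sqrt(d)] and any integer n."""
    x, y = eps
    if n < 0:
        nrm = x * x - d * y * y            # +1 or -1, because eps is a unit
        x, y, n = nrm * x, -nrm * y, -n    # inverse = conjugate / norm
    result = (1, 0)
    for _ in range(n):
        result = mul(d, result, (x, y))
    return result

def decompose(d, eps0, u):
    """Return (sign, n) with u = sign * eps0^n, checked exactly."""
    value = u[0] + u[1] * sqrt(d)
    sign = 1 if value > 0 else -1
    n = round(log(abs(value)) / log(eps0[0] + eps0[1] * sqrt(d)))
    x, y = unit_power(d, eps0, n)
    if (sign * x, sign * y) != tuple(u):
        raise ValueError("u is not of the form +-eps0^n")
    return sign, n

print(decompose(2, (1, 1), (7, 5)))    # 7 + 5*sqrt(2)
print(decompose(2, (1, 1), (-3, 2)))   # -3 + 2*sqrt(2)
```

For example, $7 + 5\sqrt{2} = (1 + \sqrt{2})^3$ and $-3 + 2\sqrt{2} = -(1 + \sqrt{2})^{-2}$.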
Now we get to prove the Dirichlet unit theorem in its full glory.
Theorem (Dirichlet unit theorem). We have the isomorphism
\[ \mathcal{O}_L^\times \cong \mu_L \times \mathbb{Z}^{r+s-1}, \]
where
\[ \mu_L = \{\alpha \in L : \alpha^N = 1 \text{ for some } N > 0\} \]
is the group of roots of unity in $L$, and is a finite cyclic group.
Proof. We do the proof in the opposite order. We throw in the logarithm at the very beginning. We define
\[ \ell : \mathcal{O}_L^\times \to \mathbb{R}^{r+s} \]
by
\[ x \mapsto (\log|\sigma_1(x)|, \cdots, \log|\sigma_r(x)|, 2\log|\sigma_{r+1}(x)|, \cdots, 2\log|\sigma_{r+s}(x)|). \]
Note that $|\sigma_{r+i}(x)| = |\overline{\sigma_{r+i}(x)}|$. So this is independent of the choice of one of $\sigma_{r+i}, \bar\sigma_{r+i}$.
Claim. We now claim that $\operatorname{im} \ell$ is a discrete group in $\mathbb{R}^{r+s}$ and $\ker \ell = \mu_L$ is a finite cyclic group.
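We note that
\[ \log|ab| = \log|a| + \log|b|. \]
So this is a group homomorphism, and the image is a subgroup. To prove the first part, it suffices to show that $\operatorname{im} \ell \cap [-A, A]^{r+s}$ is finite for all $A > 0$. We notice $\ell$ factors as
\[ \mathcal{O}_L^\times \hookrightarrow \mathcal{O}_L \xrightarrow{\sigma} \mathbb{R}^r \times \mathbb{C}^s \xrightarrow{j} \mathbb{R}^{r+s}, \]
where $\sigma$ maps $\alpha \mapsto (\sigma_1(\alpha), \cdots, \sigma_{r+s}(\alpha))$, and
\[ j : (y_1, \cdots, y_r, z_1, \cdots, z_s) \mapsto (\log|y_1|, \cdots, \log|y_r|, 2\log|z_1|, \cdots, 2\log|z_s|). \]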
We see
\[ j^{-1}([-A, A]^{r+s}) = \{(y_i, z_j) : e^{-A} \le |y_i| \le e^A,\ e^{-A} \le |z_j|^2 \le e^A\} \]
is a compact set, and $\sigma(\mathcal{O}_L)$ is a lattice, in particular discrete. So $\sigma(\mathcal{O}_L) \cap j^{-1}([-A, A]^{r+s})$ is finite. This also shows the kernel is finite, since the image of the kernel under $\sigma$ is contained in this compact set.
Now as $\ker \ell$ is finite, all elements are of finite order. So $\ker \ell \subseteq \mu_L$. Conversely, it is clear that $\mu_L \subseteq \ker \ell$. So it remains to show that $\mu_L$ is cyclic. Since $L$ embeds in $\mathbb{C}$, we know $\mu_L$ is contained in the roots of unity in $\mathbb{C}$. Since $\mu_L$ is finite, we know it is generated by a root of unity with the smallest positive argument (from, say, IA Groups).
Claim. We claim that
\[ \operatorname{im} \ell \subseteq \left\{ (y_1, \cdots, y_{r+s}) : \sum y_i = 0 \right\} \cong \mathbb{R}^{r+s-1}. \]
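To show this, note that if $\alpha \in \mathcal{O}_L^\times$, then
\[ N(\alpha) = \prod_{i=1}^r \sigma_i(\alpha) \prod_{\ell=1}^s \sigma_{r+\ell}(\alpha)\bar\sigma_{r+\ell}(\alpha) = \pm 1. \]
Taking the log of the absolute values, we get
\[ 0 = \sum_{i=1}^r \log|\sigma_i(\alpha)| + 2\sum_{i=1}^s \log|\sigma_{r+i}(\alpha)|. \]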
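In the real quadratic case ($r = 2$, $s = 0$) this is easy to see numerically. The sketch below, with the hypothetical helper `log_embedding`, evaluates $\ell$ on a few units of $\mathbb{Z}[\sqrt{2}]$ and prints the coordinate sum, which should vanish up to floating-point error.

```python
from math import log, sqrt

def log_embedding(d, u):
    """l(u) = (log|x + y*sqrt(d)|, log|x - y*sqrt(d)|) for u = (x, y)."""
    x, y = u
    return log(abs(x + y * sqrt(d))), log(abs(x - y * sqrt(d)))

# A few units of Z[sqrt(2)]: each satisfies x^2 - 2*y^2 = +1 or -1.
for u in [(1, 1), (3, 2), (7, 5), (-17, 12)]:
    l1, l2 = log_embedding(2, u)
    print(u, l1, l2, l1 + l2)
```

So the image of $\ell$ lies on the line $y_1 + y_2 = 0$ in $\mathbb{R}^2$, in agreement with $r + s - 1 = 1$.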
So we know $\operatorname{im} \ell \subseteq \mathbb{R}^{r+s-1}$ as a discrete subgroup. So it is isomorphic to $\mathbb{Z}^a$ for some $a \le r + s - 1$. Then what we want to show is that $\operatorname{im} \ell \subseteq \mathbb{R}^{r+s-1}$ is a lattice, i.e. it is isomorphic to $\mathbb{Z}^{r+s-1}$.
Note that so far what we have done is the second part of what we did for the real quadratic fields. We took the logarithm to show that these form a discrete subgroup. Next, we want to find $r + s - 1$ independent elements to show it is a lattice.
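Claim. Fix a $k$ such that $1 \le k \le r + s$ and $\alpha \in \mathcal{O}_L$ with $\alpha \ne 0$. Then there exists a nonzero $\beta \in \mathcal{O}_L$ such that
\[ |N(\beta)| \le \left(\frac{2}{\pi}\right)^s |D_L|^{1/2}, \]
and moreover if we write
\[ \ell(\alpha) = (a_1, \cdots, a_{r+s}), \quad \ell(\beta) = (b_1, \cdots, b_{r+s}), \]
then we have $b_i < a_i$ for all $i \ne k$.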
We can apply Minkowski to the region
\[ S = \{(y_1, \cdots, y_r, z_1, \cdots, z_s) \in \mathbb{R}^r \times \mathbb{C}^s : |y_i| \le c_i,\ |z_j|^2 \le c_{r+j}\} \]
(we will decide what values of $c_i$ to take later). Then this has volume
\[ \operatorname{vol}(S) = 2^r \pi^s c_1 \cdots c_{r+s}. \]
We notice $S$ is convex and symmetric around 0. So we choose $0 < c_i < e^{a_i}$ for $i \ne k$, and choose
\[ c_k = \left(\frac{2}{\pi}\right)^s |D_L|^{1/2} \frac{1}{c_1 \cdots \hat{c}_k \cdots c_{r+s}}. \]
Then Minkowski gives a nonzero $\beta \in \mathcal{O}_L$ with $\sigma(\beta) \in S$, satisfying the two conditions above.
Claim. For any $k = 1, \cdots, r + s$, there is a unit $u_k \in \mathcal{O}_L^\times$ such that if $\ell(u_k) = (y_1, \cdots, y_{r+s})$, then $y_i < 0$ for all $i \ne k$ (and hence $y_k > 0$ since $\sum y_i = 0$).
This is just as in the proof for the real quadratic case. We can repeatedly apply the previous claim to get a sequence $\alpha_1, \alpha_2, \cdots \in \mathcal{O}_L$ such that $|N(\alpha_t)|$ is bounded for all $t$, and for all $i \ne k$, the $i$th coordinate of $\ell(\alpha_1), \ell(\alpha_2), \cdots$ is strictly decreasing. But then as with real quadratic fields, the pigeonhole principle implies we can find $t > t'$ such that
\[ N(\alpha_t) = N(\alpha_{t'}) = m, \]
say, and
\[ \alpha_t \equiv \alpha_{t'} \pmod{m\mathcal{O}_L}, \]
i.e. $\alpha_t = \alpha_{t'}$ in $\mathcal{O}_L/m\mathcal{O}_L$. Hence for each $k$, we get a unit $u_k = \alpha_t/\alpha_{t'}$ such that
\[ \ell(u_k) = \ell(\alpha_t) - \ell(\alpha_{t'}) = (y_1, \cdots, y_{r+s}) \]
has $y_i < 0$ if $i \ne k$ (and hence $y_k > 0$, since $\sum y_i = 0$). We need a final trick to show the following:
Claim. The vectors $\ell(u_1), \cdots, \ell(u_{r+s-1})$ are linearly independent in $\mathbb{R}^{r+s-1}$. Hence the rank of $\ell(\mathcal{O}_L^\times)$ is $r + s - 1$, and Dirichlet’s theorem is proved.
We let $A$ be the $(r + s) \times (r + s)$ matrix whose $j$th row is $\ell(u_j)$, and apply the following lemma:
Claim. Let $A \in \operatorname{Mat}_m(\mathbb{R})$ be such that $a_{ii} > 0$ for all $i$ and $a_{ij} < 0$ for all $i \ne j$, and $\sum_j a_{ij} \ge 0$ for each $i$. Then $\operatorname{rank}(A) \ge m - 1$.
To show this, we let $v_i$ be the $i$th column of $A$. We show that $v_1, \cdots, v_{m-1}$ are linearly independent. If not, there exist $t_i \in \mathbb{R}$ such that
\[ \sum_{i=1}^{m-1} t_i v_i = 0, \tag{$*$} \]
with not all of the $t_i$ zero. We choose $k$ so that $|t_k|$ is maximal among $|t_1|, \cdots, |t_{m-1}|$. We divide the whole equation by $t_k$. So we can wlog assume $t_k = 1$ and $t_i \le 1$ for all $i$.
Now consider the $k$th row of $(*)$. We get
\[ 0 = \sum_{i=1}^{m-1} t_i a_{ki} \ge \sum_{i=1}^{m-1} a_{ki}, \]
as $a < 0$ and $t \le 1$ implies $at \ge a$. Moreover, we know $a_{km} < 0$ strictly, since $k \ne m$. So we get
\[ 0 \ge \sum_{i=1}^{m-1} a_{ki} > \sum_{i=1}^m a_{ki} \ge 0. \]
This is a contradiction. So done.
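To see the lemma in action, here is a small sketch that computes ranks exactly over the rationals by Gaussian elimination. Both test matrices are our own illustrative choices satisfying the hypotheses: the first has all row sums equal to zero (as for the matrix of $\ell(u_j)$), the second has one strictly positive row sum. The function `rank` is a plain textbook elimination, not a library routine.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix with rational entries, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk, n_cols = 0, len(rows[0])
    for col in range(n_cols):
        # Find a pivot in this column among the rows not yet used.
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

zero_row_sums = [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]]
positive_row_sum = [[3, -1, -1], [-1, 2, -1], [-1, -1, 2]]
print(rank(zero_row_sums), rank(positive_row_sum))
```

With $m = 3$, the first matrix has rank exactly $m - 1 = 2$ (its rows sum to zero, so it cannot be invertible), while the second has full rank. Both are consistent with the bound $\operatorname{rank}(A) \ge m - 1$.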
You should not expect this to be examinable.
We make a quick definition that we will need later.
Definition (Regulator). The regulator of a number field $L$ is
\[ R_L = \operatorname{covol}(\ell(\mathcal{O}_L^\times) \subseteq \mathbb{R}^{r+s-1}). \]
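More concretely, we pick fundamental units $\varepsilon_1, \cdots, \varepsilon_{r+s-1} \in \mathcal{O}_L^\times$ so that
\[ \mathcal{O}_L^\times = \mu_L \times \{\varepsilon_1^{n_1} \cdots \varepsilon_{r+s-1}^{n_{r+s-1}} : n_i \in \mathbb{Z}\}. \]
We take any $(r + s - 1) \times (r + s - 1)$ subminor of the matrix
\[ \begin{pmatrix} \ell(\varepsilon_1) & \cdots & \ell(\varepsilon_{r+s-1}) \end{pmatrix}. \]
Their determinants all have the same absolute value, and
\[ |\det(\text{subminor})| = R_L. \]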
We quickly look at some examples with quadratic fields. Consider $L = \mathbb{Q}(\sqrt{d})$, where $d \ne 0, 1$ is squarefree.
Example. If $d < 0$, then $r = 0$ and $s = 1$. So $r + s - 1 = 0$. So $\mathcal{O}_L^\times = \mu_L$ is a finite group. So $R_L = 1$.
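Lemma.
(i) If $d = -1$, then $\mathbb{Z}[i]^\times = \{\pm 1, \pm i\} \cong \mathbb{Z}/4\mathbb{Z}$.
(ii) If $d = -3$, then let $\omega = \frac{1}{2}(1 + \sqrt{d})$, and we have $\omega^6 = 1$. So $\mathbb{Z}[\omega]^\times = \{1, \omega, \cdots, \omega^5\} \cong \mathbb{Z}/6\mathbb{Z}$.
(iii) For any other $d < 0$, we have $\mathcal{O}_L^\times = \{\pm 1\}$.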
Proof. This is just a direct check.
If $d \equiv 2, 3 \pmod 4$, then by looking at the solutions of $x^2 - dy^2 = \pm 1$ in the integers, we get (i) and (iii).
If $d \equiv 1 \pmod 4$, then by looking at the solutions to
\[ \left(x + \frac{y}{2}\right)^2 - \frac{d}{4}y^2 = \pm 1 \]
in the integers, we get (ii) and (iii).
Now if $d > 0$, then $R_L = |\log \varepsilon|$, where $\varepsilon$ is a fundamental unit. So how do we find a fundamental unit? In general, there is no good algorithm for finding the fundamental units of a general number field. The best algorithm takes exponential time. We do have a good algorithm for quadratic fields using continued fractions, but we are not allowed to use that.
Instead, we could just guess a solution: we find a unit by guessing, and then show there is no smaller one by direct check.
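The guessing can be automated in a naive way. The sketch below assumes $d \equiv 2, 3 \pmod 4$, so that $\mathcal{O}_L = \mathbb{Z}[\sqrt{d}]$, and uses the observation that a unit $a + b\sqrt{d} > 1$ has $a, b > 0$, and that the coefficient of $\sqrt{d}$ strictly increases along the powers of such a unit. Hence the smallest $b \ge 1$ for which $db^2 \pm 1$ is a square gives the fundamental unit greater than 1. The name `fundamental_unit` and the cut-off `b_max` are ours, and this is a brute-force search (exponential in the size of the answer), not the continued fraction algorithm.

```python
from math import isqrt, log, sqrt

def fundamental_unit(d, b_max=10**6):
    """Smallest unit a + b*sqrt(d) > 1 of Z[sqrt(d)], found by trying b = 1, 2, ..."""
    for b in range(1, b_max + 1):
        for rhs in (-1, 1):
            a2 = d * b * b + rhs       # we need a^2 = d*b^2 + rhs
            a = isqrt(a2)
            if a > 0 and a * a == a2:
                return a, b
    return None

for d in (2, 3, 7, 11):
    a, b = fundamental_unit(d)
    # The regulator is the log of the fundamental unit, taken > 1.
    print(d, (a, b), a * a - d * b * b, log(a + b * sqrt(d)))
```

For $d = 2$ and $d = 11$ this returns $1 + \sqrt{2}$ and $10 + 3\sqrt{11}$ respectively.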
Example. Consider the field $\mathbb{Q}(\sqrt{2})$. We can try $\varepsilon = 1 + \sqrt{2}$. We have $N(\varepsilon) = 1 - 2 = -1$. So this is a unit. We claim this is fundamental. If not, then there exists $u = a + b\sqrt{2}$, where $a, b \in \mathbb{Z}$ and $1 < u < \varepsilon$ (as real numbers). Then we have
\[ \bar{u} = a - b\sqrt{2} \]
has $u\bar{u} = \pm 1$. Since $u > 1$, we know $|\bar{u}| < 1$. Then we must have $u \pm \bar{u} > 0$. So we need $a, b > 0$. We know there can only be finitely many possibilities for
\[ 1 < a + b\sqrt{2} < 1 + \sqrt{2}, \]
where $a, b$ are positive integers. But there actually are none. So done.
Example. Consider $\mathbb{Q}(\sqrt{11})$. We guess $\varepsilon = 10 - 3\sqrt{11}$ is a unit. We can compute $N(\varepsilon) = 100 - 99 = 1$. Note that $\varepsilon < 1$ and $\varepsilon^{-1} > 1$.
Suppose this is not fundamental. Then we have some $u$ such that
\[ 1 < u = a + b\sqrt{11} < 10 + 3\sqrt{11} = \varepsilon^{-1} < 20. \tag{$*$} \]
We can check all the cases, but there is a faster way.
We must have $N(u) = \pm 1$. If $N(u) = -1$, then $a^2 - 11b^2 = -1$. But $-1$ is not a square mod 11.
So we must have $N(u) = 1$. Then $u^{-1} = \bar{u}$. We get $0 < \varepsilon < u^{-1} = \bar{u} < 1$ also. So
\[ -1 < -a + b\sqrt{11} < 0. \]
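Adding this to $(*)$, we get
\[ 0 < 2b\sqrt{11} < 10 + 3\sqrt{11} < 7\sqrt{11}. \]
So $b = 1, 2$ or $3$. But $11b^2 + 1$ is not a square for $b = 1, 2$, and for $b = 3$ we get $a = 10$, i.e. $u = \varepsilon^{-1}$, which is excluded since $u < \varepsilon^{-1}$ strictly. So done.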
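The "check all the cases" route is also feasible by machine. As a sketch, the following Python snippet lists every unit $a + b\sqrt{11}$ of $\mathbb{Z}[\sqrt{11}]$ with $|a|, |b| \le 20$ lying strictly between $1$ and $10 + 3\sqrt{11}$. The range is enough, since $u < 20$ and $|\bar{u}| < 1$ force $|a|$ and $|b|\sqrt{11}$ to be less than $10.5$. The comparison is in floating point, which is adequate at this size. Note that $b = 3$ does make $11b^2 + 1 = 100$ a square, but it gives $10 + 3\sqrt{11}$ itself, which the strict inequality excludes.

```python
from math import sqrt

d = 11
upper = 10 + 3 * sqrt(d)

# Units a + b*sqrt(11) with 1 < a + b*sqrt(11) < 10 + 3*sqrt(11).
units = [
    (a, b)
    for a in range(-20, 21)
    for b in range(-20, 21)
    if abs(a * a - d * b * b) == 1 and 1 < a + b * sqrt(d) < upper
]
print(units)

# The values 11*b^2 + 1 for the candidate b = 1, 2, 3.
print([d * b * b + 1 for b in (1, 2, 3)])
```

The list of units comes out empty, which agrees with $10 + 3\sqrt{11}$ being the smallest unit greater than 1.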