II Number Fields

2 Norm, trace, discriminant, numbers

Recall that in our motivating example of $\mathbb{Z}[i]$, one important tool was the norm of an algebraic integer $x + iy$, given by $N(x + iy) = x^2 + y^2$. This can be generalized to arbitrary number fields, and will prove itself to be a very useful notion to consider. Apart from the norm, we will also consider a number known as the trace, which is also useful. We will also study numbers associated with the number field itself, rather than particular elements of the field, and it turns out they tell us a lot about how the field behaves.

Norm and trace

Recall the following definition from IID Galois Theory:

Definition (Norm and trace). Let $L/K$ be a field extension, and $\alpha \in L$. We write $m_\alpha : L \to L$ for the map $\ell \mapsto \alpha\ell$. Viewing this as a linear map of $K$-vector spaces, we define the norm of $\alpha$ to be
$$N_{L/K}(\alpha) = \det m_\alpha,$$
and the trace to be
$$\operatorname{tr}_{L/K}(\alpha) = \operatorname{tr} m_\alpha.$$

The following property is immediate:

Proposition. For a field extension $L/K$ and $a, b \in L$, we have $N(ab) = N(a)N(b)$ and $\operatorname{tr}(a + b) = \operatorname{tr}(a) + \operatorname{tr}(b)$.

We can alternatively define the norm and trace as follows:

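Proposition. Let $p_\alpha \in K[x]$ be the minimal polynomial of $\alpha$. Then the characteristic polynomial of $m_\alpha$ is
$$\det(xI - m_\alpha) = p_\alpha^{[L:K(\alpha)]}.$$
Hence if $p_\alpha(x)$ splits in some field $L' \supseteq K(\alpha)$, say
$$p_\alpha(x) = (x - \alpha_1) \cdots (x - \alpha_r),$$
then
$$N_{K(\alpha)/K}(\alpha) = \prod_i \alpha_i, \quad \operatorname{tr}_{K(\alpha)/K}(\alpha) = \sum_i \alpha_i,$$
and hence
$$N_{L/K}(\alpha) = \left(\prod_i \alpha_i\right)^{[L:K(\alpha)]}, \quad \operatorname{tr}_{L/K}(\alpha) = [L : K(\alpha)] \sum_i \alpha_i.$$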

This was proved in the IID Galois Theory course, and we will use it without proof.

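Corollary. Let $L \supseteq \mathbb{Q}$ be a number field and $\alpha \in L$. Then the following are equivalent:

(i) $\alpha \in \mathcal{O}_L$.

(ii) The minimal polynomial $p_\alpha$ is in $\mathbb{Z}[x]$.

(iii) The characteristic polynomial of $m_\alpha$ is in $\mathbb{Z}[x]$.

This in particular implies $N_{L/\mathbb{Q}}(\alpha) \in \mathbb{Z}$ and $\operatorname{tr}_{L/\mathbb{Q}}(\alpha) \in \mathbb{Z}$.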

Proof. The equivalence between the first two was already proven. For the remaining equivalence, if the characteristic polynomial of $m_\alpha$ is in $\mathbb{Z}[x]$, then $\alpha \in \mathcal{O}_L$, since $\alpha$ is a root of this monic polynomial in $\mathbb{Z}[x]$ (the characteristic polynomial is a power of $p_\alpha$). On the other hand, if $p_\alpha \in \mathbb{Z}[x]$, then so is the characteristic polynomial, since it is just $p_\alpha^N$ with $N = [L : \mathbb{Q}(\alpha)]$.

The final implication comes from the fact that the norm and trace are, up to sign, coefficients of the characteristic polynomial.

It would be nice if the last implication were an if and only if. This is not true in general, but it does hold when the characteristic polynomial is quadratic, since the norm and trace are then, up to sign, the only non-leading coefficients.

Example. Let $L = K(\sqrt{d}) = K[z]/(z^2 - d)$, where $d$ is not a square in $K$. As a vector space over $K$, we can take $1, \sqrt{d}$ as our basis. So every $\alpha \in L$ can be written as
$$\alpha = x + y\sqrt{d}$$
with $x, y \in K$. Hence the matrix of multiplication by $\alpha$ is
$$m_\alpha = \begin{pmatrix} x & dy \\ y & x \end{pmatrix}.$$
So the trace and norm are given by
$$\operatorname{tr}_{L/K}(x + y\sqrt{d}) = 2x = (x + y\sqrt{d}) + (x - y\sqrt{d}),$$
$$N_{L/K}(x + y\sqrt{d}) = x^2 - dy^2 = (x + y\sqrt{d})(x - y\sqrt{d}).$$
We can also obtain this by considering the roots of the minimal polynomial of $\alpha = x + y\sqrt{d}$, namely $(\alpha - x)^2 - y^2 d = 0$, which has roots $x \pm y\sqrt{d}$.

In particular, if $L = \mathbb{Q}(\sqrt{d})$ with $d < 0$, then the norm of an element is just the square of its absolute value as a complex number.
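The computation in this example can be checked mechanically. The following is a minimal sketch, assuming $K = \mathbb{Q}$ and representing $x + y\sqrt{d}$ as a pair `(x, y)` of exact fractions; the helper names `mult_matrix`, `trace`, `det` and `multiply` are illustrative and not part of the notes.

```python
from fractions import Fraction

def mult_matrix(x, y, d):
    """Matrix of multiplication by x + y*sqrt(d) in the basis (1, sqrt(d)).

    Columns are the images of the basis vectors:
      alpha * 1       = x + y*sqrt(d)    -> column (x, y)
      alpha * sqrt(d) = d*y + x*sqrt(d)  -> column (d*y, x)
    """
    return [[x, d * y],
            [y, x]]

def trace(m):
    return m[0][0] + m[1][1]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def multiply(a, b, d):
    """Product of a = (x1, y1) and b = (x2, y2), each meaning x + y*sqrt(d)."""
    (x1, y1), (x2, y2) = a, b
    return (x1 * x2 + d * y1 * y2, x1 * y2 + x2 * y1)

d = 5
a = (Fraction(1, 2), Fraction(3))    # 1/2 + 3*sqrt(5)
b = (Fraction(-2), Fraction(1, 3))   # -2 + (1/3)*sqrt(5)

ma, mb = mult_matrix(*a, d), mult_matrix(*b, d)
mab = mult_matrix(*multiply(a, b, d), d)

norm_a, norm_b, norm_ab = det(ma), det(mb), det(mab)
trace_a = trace(ma)
```

Here `norm_a` is $x^2 - dy^2 = \tfrac14 - 45 = -\tfrac{179}{4}$ and `trace_a` is $2x = 1$, matching the formulas above, and `norm_ab` equals `norm_a * norm_b`, as multiplicativity of the norm predicts.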

Now that we have computed the general trace and norm, we can use the corollary above to find out what the algebraic integers are. It turns out the result is (slightly) unexpected:

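Lemma. Let $L = \mathbb{Q}(\sqrt{d})$, where $d \in \mathbb{Z}$ is not $0, 1$ and is square-free. Then
$$\mathcal{O}_L = \begin{cases} \mathbb{Z}[\sqrt{d}] & d \equiv 2 \text{ or } 3 \pmod 4 \\ \mathbb{Z}\left[\tfrac{1}{2}(1 + \sqrt{d})\right] & d \equiv 1 \pmod 4 \end{cases}$$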

Proof. We know $x + y\sqrt{d} \in \mathcal{O}_L$ if and only if $2x, x^2 - dy^2 \in \mathbb{Z}$ by the previous example. These imply $4dy^2 \in \mathbb{Z}$. So if $y = \frac{r}{s}$ with $r, s \in \mathbb{Z}$ coprime, then we must have $s^2 \mid 4d$. But $d$ is square-free. So $s = 1$ or $2$. So
$$x = \frac{u}{2}, \quad y = \frac{v}{2}$$
for some $u, v \in \mathbb{Z}$. Then we know $u^2 - dv^2 \in 4\mathbb{Z}$, i.e. $u^2 \equiv dv^2 \pmod 4$. But we know the squares mod 4 are always 0 and 1. So if $d \not\equiv 1 \pmod 4$, then $u^2 \equiv dv^2 \pmod 4$ implies that $u^2 \equiv v^2 \equiv 0 \pmod 4$, and hence $u, v$ are even. So $x, y \in \mathbb{Z}$, giving $\mathcal{O}_L = \mathbb{Z}[\sqrt{d}]$.

On the other hand, if $d \equiv 1 \pmod 4$, then $u, v$ have the same parity, i.e. we can write $x + y\sqrt{d}$ as a $\mathbb{Z}$-combination of $1$ and $\frac{1}{2}(1 + \sqrt{d})$.

As a sanity check, we find that the minimal polynomial of $\frac{1}{2}(1 + \sqrt{d})$ is $x^2 - x + \frac{1}{4}(1 - d)$, which is in $\mathbb{Z}[x]$ if and only if $d \equiv 1 \pmod 4$.
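The criterion used in the proof (both $2x$ and $x^2 - dy^2$ are integers) is easy to test by machine. The sketch below is only an illustration under that criterion; the function name `is_algebraic_integer` and the sample values of $d$ are assumptions made for the example.

```python
from fractions import Fraction

def is_algebraic_integer(x, y, d):
    """Test whether x + y*sqrt(d) lies in O_L for L = Q(sqrt(d)).

    Uses the criterion from the proof: the trace 2x and the norm
    x^2 - d*y^2 must both be ordinary integers.
    """
    x, y = Fraction(x), Fraction(y)
    tr = 2 * x
    nm = x * x - d * y * y
    return tr.denominator == 1 and nm.denominator == 1

half = Fraction(1, 2)
# Is (1 + sqrt(d))/2 integral? Try a few square-free d.
results = {d: is_algebraic_integer(half, half, d) for d in (-3, -1, 2, 3, 5, 13)}
```

For these sample values, `results[d]` is true exactly when $d \equiv 1 \pmod 4$ (that is, for $d = -3, 5, 13$), in line with the lemma.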

Field embeddings

Recall the following theorem from IID Galois Theory:

Theorem (Primitive element theorem). Let $K \subseteq L$ be a finite separable field extension. Then there exists an $\alpha \in L$ such that $K(\alpha) = L$.

For example, $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2} + \sqrt{3})$.

Since $\mathbb{Q}$ has characteristic zero, it follows that all number fields are separable extensions. So any number field $L/\mathbb{Q}$ is of the form $L = \mathbb{Q}(\alpha)$. This makes it much easier to study number fields, as the only extra “stuff” we have on top of $\mathbb{Q}$ is the single element $\alpha$.

One particular thing we can do is to look at the number of ways we can embed $L \to \mathbb{C}$. For example, for $\mathbb{Q}(\sqrt{-1})$, there are two such embeddings: one sends $\sqrt{-1}$ to $i$ and the other sends $\sqrt{-1}$ to $-i$.

Lemma. The degree $[L : \mathbb{Q}] = n$ of a number field is the number of field embeddings $L \to \mathbb{C}$.

Proof. Let $\alpha$ be a primitive element, and $p_\alpha(x) \in \mathbb{Q}[x]$ its minimal polynomial. Then we have $\deg p_\alpha = [L : \mathbb{Q}] = n$, as $1, \alpha, \alpha^2, \cdots, \alpha^{n-1}$ is a basis. Moreover,
$$\frac{\mathbb{Q}[x]}{(p_\alpha)} \cong \mathbb{Q}(\alpha) = L.$$
Since $L/\mathbb{Q}$ is separable, we know $p_\alpha$ has $n$ distinct roots in $\mathbb{C}$. Write
$$p_\alpha(x) = (x - \alpha_1) \cdots (x - \alpha_n).$$
Now an embedding $\mathbb{Q}[x]/(p_\alpha) \to \mathbb{C}$ is uniquely determined by the image of $x$, and $x$ must be sent to one of the roots of $p_\alpha$. So for each $i$, the map $x \mapsto \alpha_i$ gives us a field embedding, and these are all. So there are $n$ of them.
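As a numerical illustration of this count, take $L = \mathbb{Q}(\sqrt{2} + \sqrt{3})$. Squaring twice shows that $\alpha = \sqrt{2} + \sqrt{3}$ satisfies $\alpha^4 - 10\alpha^2 + 1 = 0$, and this quartic is the minimal polynomial of $\alpha$. The sketch below uses floating-point arithmetic, so it is a sanity check rather than a proof; the names `p` and `images` are chosen for this example only.

```python
import math

def p(x):
    """Minimal polynomial of sqrt(2) + sqrt(3) over Q: x^4 - 10x^2 + 1."""
    return x**4 - 10 * x**2 + 1

r2, r3 = math.sqrt(2), math.sqrt(3)

# Each embedding L -> C sends alpha = sqrt(2) + sqrt(3) to one of these roots.
images = [s2 * r2 + s3 * r3 for s2 in (1, -1) for s3 in (1, -1)]

all_roots = all(abs(p(a)) < 1e-9 for a in images)
distinct = len({round(a, 9) for a in images})
```

Here `all_roots` is true and `distinct` is 4, so there are four embeddings, matching $[L : \mathbb{Q}] = 4$. All four roots are real in this case.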

Using these field embeddings, we can come up with the following alternative formula for the norm and trace.

Corollary. Let $L/\mathbb{Q}$ be a number field. If $\sigma_1, \cdots, \sigma_n : L \to \mathbb{C}$ are the different field embeddings and $\beta \in L$, then
$$\operatorname{tr}_{L/\mathbb{Q}}(\beta) = \sum_i \sigma_i(\beta), \quad N_{L/\mathbb{Q}}(\beta) = \prod_i \sigma_i(\beta).$$
We call $\sigma_1(\beta), \cdots, \sigma_n(\beta)$ the conjugates of $\beta$ in $\mathbb{C}$.

The proof is in the Galois theory course.
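The two descriptions of the norm and trace can be compared on a concrete cubic field. The sketch below assumes $L = \mathbb{Q}(t)$ with $t^3 = 2$ and $\beta = a + bt + ct^2$; the matrix is written in the basis $1, t, t^2$, the embeddings send $t$ to the three complex cube roots of 2, and the helper names are illustrative. The embedding side uses floating-point numbers, so agreement is only up to rounding.

```python
import cmath

def mult_matrix(a, b, c):
    """Multiplication by beta = a + b*t + c*t^2 on the basis (1, t, t^2), where t^3 = 2."""
    return [[a, 2 * c, 2 * b],
            [b, a, 2 * c],
            [c, b, a]]

def det3(m):
    (p, q, r), (s, t, u), (v, w, x) = m
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)

a, b, c = 1, 2, -1
m = mult_matrix(a, b, c)
norm_from_matrix = det3(m)
trace_from_matrix = m[0][0] + m[1][1] + m[2][2]

# The three embeddings send t to the three complex cube roots of 2.
omega = cmath.exp(2j * cmath.pi / 3)
roots = [2 ** (1 / 3) * omega**k for k in range(3)]
conjugates = [a + b * z + c * z * z for z in roots]

norm_from_embeddings = conjugates[0] * conjugates[1] * conjugates[2]
trace_from_embeddings = sum(conjugates)
```

For $\beta = 1 + 2t - t^2$ the matrix gives norm $a^3 + 2b^3 + 4c^3 - 6abc = 25$ and trace $3a = 3$, and the product and sum of the three conjugates agree with these values up to floating-point error.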

Using this characterization, we have the following very concrete test for when something is a unit.

Lemma. Let $x \in \mathcal{O}_L$. Then $x$ is a unit if and only if $N_{L/\mathbb{Q}}(x) = \pm 1$.

Notation. Write $\mathcal{O}_L^\times = \{x \in \mathcal{O}_L : x^{-1} \in \mathcal{O}_L\}$, the units in $\mathcal{O}_L$.

Proof. ($\Rightarrow$) We know $N(ab) = N(a)N(b)$. So if $x \in \mathcal{O}_L^\times$, then there is some $y \in \mathcal{O}_L$ such that $xy = 1$. So $N(x)N(y) = 1$. So $N(x)$ is a unit in $\mathbb{Z}$, i.e. $\pm 1$.

($\Leftarrow$) Let $\sigma_1, \cdots, \sigma_n : L \to \mathbb{C}$ be the $n$ embeddings of $L$ in $\mathbb{C}$. For notational convenience, we suppose that $L$ is already a subfield of $\mathbb{C}$, and $\sigma_1$ is the inclusion map. Then for each $x \in \mathcal{O}_L$, we have
$$N(x) = x\,\sigma_2(x) \cdots \sigma_n(x).$$
Now if $N(x) = \pm 1$, then $x^{-1} = \pm\sigma_2(x) \cdots \sigma_n(x)$. So we have $x^{-1} \in \mathcal{O}_L$, since this is a product of algebraic integers that lies in $L$. So $x$ is a unit in $\mathcal{O}_L$.

Corollary. If $x \in \mathcal{O}_L$ is such that $N(x)$ is prime, then $x$ is irreducible.

Proof. If $x = ab$ with $a, b \in \mathcal{O}_L$, then $N(a)N(b) = N(x)$. Since $N(x)$ is prime, either $N(a) = \pm 1$ or $N(b) = \pm 1$. So $a$ or $b$ is a unit.
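Both tests are easy to try out in a quadratic ring. The sketch below works in $\mathbb{Z}[\sqrt{d}]$ with elements stored as integer pairs `(x, y)` meaning $x + y\sqrt{d}$; the helper names are assumptions made for this example, and the inverse formula uses the fact that $x \cdot \bar{x} = N(x) = \pm 1$ for a unit.

```python
def norm(x, y, d):
    """Norm of x + y*sqrt(d) for integers x, y."""
    return x * x - d * y * y

def is_unit(x, y, d):
    return norm(x, y, d) in (1, -1)

def inverse(x, y, d):
    """Inverse of a unit: N(x) times the conjugate, since x * conj(x) = N(x) = +-1."""
    n = norm(x, y, d)
    if n not in (1, -1):
        raise ValueError("not a unit")
    return (n * x, -n * y)

def multiply(a, b, d):
    (x1, y1), (x2, y2) = a, b
    return (x1 * x2 + d * y1 * y2, x1 * y2 + x2 * y1)

def is_prime(n):
    n = abs(n)
    return n > 1 and all(n % k for k in range(2, int(n**0.5) + 1))

# 1 + sqrt(2) has norm -1, so it is a unit in Z[sqrt(2)].
u = (1, 1)
u_inv = inverse(*u, 2)
product = multiply(u, u_inv, 2)

# 2 + i has norm 5 in Z[i] (d = -1), which is prime, so 2 + i is irreducible.
gaussian_norm = norm(2, 1, -1)
```

Here `u_inv` is `(-1, 1)`, i.e. $-1 + \sqrt{2}$, and `product` is `(1, 0)`. Note that the corollary gives only a sufficient condition: an element whose norm is not prime may still be irreducible.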

We can consider a more refined notion than just the number of field embeddings.

Definition ($r$ and $s$). We write $r$ for the number of field embeddings $L \to \mathbb{R}$, and $s$ for the number of pairs of non-real field embeddings $L \to \mathbb{C}$. Then
$$n = r + 2s.$$
Alternatively, $r$ is the number of real roots of $p_\alpha$, and $s$ is the number of pairs of complex conjugate roots.

The distinction between real embeddings and complex embeddings will be important in the second half of the course.
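Given the complex roots of $p_\alpha$, the pair $(r, s)$ can be read off by counting. The following sketch assumes the roots are supplied numerically, so it needs a tolerance to decide which roots are real; the function name `signature` and the three sample fields are chosen for illustration.

```python
import cmath

def signature(roots, tol=1e-9):
    """Return (r, s) given the complex roots of the minimal polynomial p_alpha."""
    r = sum(1 for z in roots if abs(z.imag) < tol)
    non_real = len(roots) - r
    if non_real % 2 != 0:
        raise ValueError("non-real roots should come in conjugate pairs")
    return r, non_real // 2

omega = cmath.exp(2j * cmath.pi / 3)
cube_roots_of_2 = [2 ** (1 / 3) * omega**k for k in range(3)]  # L = Q(2^(1/3))
sqrt5_roots = [complex(5 ** 0.5), complex(-(5 ** 0.5))]        # L = Q(sqrt(5))
i_roots = [1j, -1j]                                            # L = Q(i)

sig_cubic = signature(cube_roots_of_2)
sig_real_quadratic = signature(sqrt5_roots)
sig_imaginary_quadratic = signature(i_roots)
```

This gives $(r, s) = (1, 1)$ for $\mathbb{Q}(\sqrt[3]{2})$, $(2, 0)$ for $\mathbb{Q}(\sqrt{5})$ and $(0, 1)$ for $\mathbb{Q}(i)$, and in each case $n = r + 2s$.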

Discriminant

The final invariant we will look at in this chapter is the discriminant. It is based on the following observation:

Proposition. Let $L/K$ be a separable extension. Then the $K$-bilinear form $L \times L \to K$ defined by $(x, y) \mapsto \operatorname{tr}_{L/K}(xy)$ is non-degenerate. Equivalently, if $\alpha_1, \cdots, \alpha_n$ is a $K$-basis for $L$, the Gram matrix $(\operatorname{tr}(\alpha_i \alpha_j))_{i,j=1,\cdots,n}$ has non-zero determinant.

Recall from Galois theory that if $L/K$ is not separable, then $\operatorname{tr}_{L/K} = 0$, so the form is as degenerate as it can be. Also, note that if $K$ is of characteristic 0, then there is a quick and dirty proof of this fact: the trace form is non-degenerate, because for any non-zero $x \in L$, we have $\operatorname{tr}_{L/K}(x \cdot x^{-1}) = n \neq 0$. This is really the only case we care about, but in the proof of the general result, we will also find a useful formula for the discriminant when the basis is $1, \theta, \theta^2, \ldots, \theta^{n-1}$.

We will use the following important notation:

Notation.
$$\Delta(\alpha_1, \cdots, \alpha_n) = \det(\operatorname{tr}_{L/K}(\alpha_i \alpha_j)).$$

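Proof. Let $\sigma_1, \cdots, \sigma_n : L \to \bar{K}$ be the $n$ distinct $K$-linear field embeddings $L \to \bar{K}$. Put
$$S = (\sigma_i(\alpha_j))_{i,j=1,\cdots,n} = \begin{pmatrix} \sigma_1(\alpha_1) & \cdots & \sigma_1(\alpha_n) \\ \vdots & \ddots & \vdots \\ \sigma_n(\alpha_1) & \cdots & \sigma_n(\alpha_n) \end{pmatrix}.$$
Then
$$S^T S = \left(\sum_{k=1}^n \sigma_k(\alpha_i)\sigma_k(\alpha_j)\right)_{i,j=1,\cdots,n}.$$
We know $\sigma_k$ is a field homomorphism. So
$$\sum_{k=1}^n \sigma_k(\alpha_i)\sigma_k(\alpha_j) = \sum_{k=1}^n \sigma_k(\alpha_i\alpha_j) = \operatorname{tr}_{L/K}(\alpha_i\alpha_j).$$
So
$$S^T S = (\operatorname{tr}(\alpha_i\alpha_j))_{i,j=1,\cdots,n}.$$
So we have
$$\Delta(\alpha_1, \cdots, \alpha_n) = \det(S^T S) = \det(S)^2.$$
Now we use the theorem of primitive elements to write $L = K(\theta)$ such that $1, \theta, \cdots, \theta^{n-1}$ is a basis for $L$ over $K$, with $[L : K] = n$. Now $S$ is just
$$S = \begin{pmatrix} 1 & \sigma_1(\theta) & \cdots & \sigma_1(\theta)^{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \sigma_n(\theta) & \cdots & \sigma_n(\theta)^{n-1} \end{pmatrix}.$$
This is a Vandermonde matrix, and so
$$\Delta(1, \theta, \cdots, \theta^{n-1}) = (\det S)^2 = \prod_{i<j} (\sigma_i(\theta) - \sigma_j(\theta))^2.$$
Since the field extension is separable, we have $\sigma_i \neq \sigma_j$ for all $i \neq j$, and this implies $\sigma_i(\theta) \neq \sigma_j(\theta)$, since $\theta$ generates the field. So the product above is non-zero. As non-degeneracy of a bilinear form does not depend on the basis used to write its Gram matrix, this proves the proposition.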
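The identities in this proof can be checked numerically on a small example. The sketch below assumes $K = \mathbb{Q}$ and $\theta = \sqrt[3]{2}$, with the three embeddings sending $\theta$ to the complex cube roots of 2. It compares $\det(S)^2$, the determinant of the Gram matrix of traces, and the Vandermonde product; all three are floating-point approximations, and the helper `det3` is written out only for this example.

```python
import cmath

def det3(m):
    (p, q, r), (s, t, u), (v, w, x) = m
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)

# Images of theta = 2^(1/3) under the three embeddings of Q(theta) into C.
omega = cmath.exp(2j * cmath.pi / 3)
roots = [2 ** (1 / 3) * omega**k for k in range(3)]

# S has rows (1, sigma_i(theta), sigma_i(theta)^2).
S = [[1, z, z * z] for z in roots]

# Gram matrix of traces: entry (i, j) is tr(theta^(i+j)) = sum_k sigma_k(theta)^(i+j).
gram = [[sum(z ** (i + j) for z in roots) for j in range(3)] for i in range(3)]

det_S_squared = det3(S) ** 2
det_gram = det3(gram)

vandermonde = 1
for i in range(3):
    for j in range(i + 1, 3):
        vandermonde *= (roots[i] - roots[j]) ** 2
```

All three quantities come out as approximately $-108$. The exact Gram matrix here is $\begin{pmatrix} 3 & 0 & 0 \\ 0 & 0 & 6 \\ 0 & 6 & 0 \end{pmatrix}$, whose determinant is $-108$.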

So we have this nice canonical bilinear map. However, this determinant is not canonical. Recall that if $\alpha_1, \cdots, \alpha_n$ is a basis for $L/K$, and $\alpha_1', \cdots, \alpha_n'$ is another basis, then
$$\alpha_i' = \sum_j a_{ij}\alpha_j$$
for some $A = (a_{ij}) \in \mathrm{GL}_n(K)$. So
$$\Delta(\alpha_1', \cdots, \alpha_n') = (\det A)^2 \Delta(\alpha_1, \cdots, \alpha_n).$$
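The change-of-basis formula can be verified exactly in a quadratic field. The sketch below assumes $L = \mathbb{Q}(\sqrt{5})$, stores $x + y\sqrt{5}$ as a pair of fractions, and uses one particular matrix $A$ chosen for illustration; the helper names are not from the notes.

```python
from fractions import Fraction as F

D = 5  # work in Q(sqrt(5))

def multiply(a, b):
    (x1, y1), (x2, y2) = a, b
    return (x1 * x2 + D * y1 * y2, x1 * y2 + x2 * y1)

def trace(a):
    """tr(x + y*sqrt(D)) = 2x."""
    return 2 * a[0]

def delta(basis):
    """Determinant of the 2x2 Gram matrix (tr(e_i e_j))."""
    g = [[trace(multiply(ei, ej)) for ej in basis] for ei in basis]
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

old = [(F(1), F(0)), (F(0), F(1))]   # basis 1, sqrt(5)
A = [[F(1, 2), F(1, 2)],
     [F(0), F(1)]]                    # change-of-basis matrix (a_ij)

# new_i = sum_j a_ij * old_j
new = [tuple(sum(A[i][j] * old[j][k] for j in range(2)) for k in range(2))
       for i in range(2)]

det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
delta_old, delta_new = delta(old), delta(new)
```

Here the new basis is $\tfrac12(1 + \sqrt{5}), \sqrt{5}$, with `delta_old` equal to 20, `det_A` equal to $\tfrac12$ and `delta_new` equal to 5, so `delta_new == det_A**2 * delta_old`.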

However, for number fields, we shall see that we can pick a “canonical” basis, and get a canonical value for $\Delta$. We will call this the discriminant.

Definition (Integral basis). Let $L/\mathbb{Q}$ be a number field. Then a basis $\alpha_1, \cdots, \alpha_n$ of $L$ is an integral basis if
$$\mathcal{O}_L = \left\{\sum_{i=1}^n m_i\alpha_i : m_i \in \mathbb{Z}\right\} = \bigoplus_{i=1}^n \mathbb{Z}\alpha_i.$$
In other words, it is simultaneously a basis for $L$ over $\mathbb{Q}$ and $\mathcal{O}_L$ over $\mathbb{Z}$.

Note that integral bases are not unique, just as with usual bases. Given one integral basis, you can get any other by acting by $\mathrm{GL}_n(\mathbb{Z})$.

Example. Consider $\mathbb{Q}(\sqrt{d})$ with $d$ square-free, $d \neq 0, 1$. If $d \equiv 1 \pmod 4$, we've seen that $1, \frac{1}{2}(1 + \sqrt{d})$ is an integral basis. Otherwise, if $d \equiv 2, 3 \pmod 4$, then $1, \sqrt{d}$ is an integral basis.

The important theorem is that an integral basis always exists.

Theorem. Let $L/\mathbb{Q}$ be a number field. Then there exists an integral basis for $\mathcal{O}_L$. In particular, $\mathcal{O}_L \cong \mathbb{Z}^n$ with $n = [L : \mathbb{Q}]$.

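Proof. Let $\alpha_1, \cdots, \alpha_n$ be any basis of $L$ over $\mathbb{Q}$. We have proved that there is some non-zero $n_i \in \mathbb{Z}$ such that $n_i\alpha_i \in \mathcal{O}_L$. So wlog $\alpha_1, \cdots, \alpha_n \in \mathcal{O}_L$, and they are a basis of $L$ over $\mathbb{Q}$. Since the $\alpha_i$ are integral, so are the $\alpha_i\alpha_j$, and so all these have integer trace, as we have previously shown. Hence $\Delta(\alpha_1, \cdots, \alpha_n)$, being the determinant of a matrix with integer entries, is an integer, and it is non-zero by the previous proposition.

Now choose a $\mathbb{Q}$-basis $\alpha_1, \cdots, \alpha_n \in \mathcal{O}_L$ such that $\Delta(\alpha_1, \cdots, \alpha_n) \in \mathbb{Z} \setminus \{0\}$ has minimal absolute value. We will show that these are an integral basis.

Let $x \in \mathcal{O}_L$, and write
$$x = \sum_i \lambda_i\alpha_i$$
for some $\lambda_i \in \mathbb{Q}$. These $\lambda_i$ are necessarily unique since $\alpha_1, \cdots, \alpha_n$ is a basis.

Suppose some $\lambda_i \notin \mathbb{Z}$. wlog say $\lambda_1 \notin \mathbb{Z}$. We write
$$\lambda_1 = n_1 + \varepsilon_1,$$
for $n_1 \in \mathbb{Z}$ and $0 < \varepsilon_1 < 1$. We put
$$\alpha_1' = x - n_1\alpha_1 = \varepsilon_1\alpha_1 + \lambda_2\alpha_2 + \cdots + \lambda_n\alpha_n \in \mathcal{O}_L.$$
So $\alpha_1', \alpha_2, \cdots, \alpha_n$ is still a basis for $L/\mathbb{Q}$, and its elements are still in $\mathcal{O}_L$. The change-of-basis matrix has determinant $\varepsilon_1$, so
$$|\Delta(\alpha_1', \alpha_2, \cdots, \alpha_n)| = \varepsilon_1^2 \cdot |\Delta(\alpha_1, \cdots, \alpha_n)| < |\Delta(\alpha_1, \cdots, \alpha_n)|.$$
This contradicts minimality. So we must have $\lambda_i \in \mathbb{Z}$ for all $i$. So this is a basis for $\mathcal{O}_L$.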

Now if $\alpha_1', \cdots, \alpha_n'$ is another integral basis of $L$ over $\mathbb{Q}$, then there is some $g = (g_{ij}) \in \mathrm{GL}_n(\mathbb{Z})$ such that $\alpha_i' = \sum_j g_{ij}\alpha_j$. Since $\det(g)$ is invertible in $\mathbb{Z}$, it must be $1$ or $-1$, and hence
$$\Delta(\alpha_1', \cdots, \alpha_n') = \det(g)^2 \Delta(\alpha_1, \cdots, \alpha_n) = \Delta(\alpha_1, \cdots, \alpha_n),$$
so $\Delta$ is independent of the choice of integral basis.

Definition (Discriminant). The discriminant $D_L$ of a number field $L$ is defined as
$$D_L = \Delta(\alpha_1, \cdots, \alpha_n)$$
for any integral basis $\alpha_1, \cdots, \alpha_n$.

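Example. Let $L = \mathbb{Q}(\sqrt{d})$, where $d \neq 0, 1$ and $d$ is square-free. If $d \equiv 2, 3 \pmod 4$, then it has an integral basis $1, \sqrt{d}$. So
$$D_L = \det\begin{pmatrix} 1 & \sqrt{d} \\ 1 & -\sqrt{d} \end{pmatrix}^2 = 4d.$$
Otherwise, if $d \equiv 1 \pmod 4$, then
$$D_L = \det\begin{pmatrix} 1 & \frac{1}{2}(1 + \sqrt{d}) \\ 1 & \frac{1}{2}(1 - \sqrt{d}) \end{pmatrix}^2 = d.$$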
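The same values can be obtained from the definition $\Delta = \det(\operatorname{tr}(\alpha_i\alpha_j))$ using exact arithmetic, without any square roots. The sketch below assumes the integral bases stated in this example; the function names `is_squarefree` and `quadratic_discriminant` are illustrative.

```python
from fractions import Fraction as F

def is_squarefree(d):
    return all(d % (k * k) for k in range(2, int(abs(d) ** 0.5) + 1))

def quadratic_discriminant(d):
    """D_L for L = Q(sqrt(d)), from the trace Gram matrix of an integral basis."""
    if d in (0, 1) or not is_squarefree(d):
        raise ValueError("d must be square-free and not 0 or 1")

    def multiply(a, b):
        # (x1 + y1*sqrt(d)) * (x2 + y2*sqrt(d)), stored as pairs (x, y)
        (x1, y1), (x2, y2) = a, b
        return (x1 * x2 + d * y1 * y2, x1 * y2 + x2 * y1)

    one = (F(1), F(0))
    # Second basis element: (1 + sqrt(d))/2 if d = 1 mod 4, else sqrt(d).
    second = (F(1, 2), F(1, 2)) if d % 4 == 1 else (F(0), F(1))
    basis = [one, second]

    # tr(x + y*sqrt(d)) = 2x
    g = [[2 * multiply(ei, ej)[0] for ej in basis] for ei in basis]
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

values = {d: quadratic_discriminant(d) for d in (-7, -3, -1, 2, 3, 5, 6, 13)}
```

For each sample $d$ the result is $d$ when $d \equiv 1 \pmod 4$ and $4d$ otherwise; for instance $\mathbb{Q}(i)$ has discriminant $-4$ and $\mathbb{Q}(\sqrt{5})$ has discriminant $5$.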

Recall that we have seen the word discriminant before, and let's make sure these concepts are more-or-less consistent. Recall that the discriminant of a polynomial $f(x) = \prod_i (x - \alpha_i)$ is defined as
$$\operatorname{disc}(f) = \prod_{i<j} (\alpha_i - \alpha_j)^2 = (-1)^{n(n-1)/2} \prod_{i \neq j} (\alpha_i - \alpha_j).$$
If $p_\theta(x) \in K[x]$ is the minimal polynomial of $\theta$ (where $L = K[\theta]$), then the roots of $p_\theta$ are $\sigma_i(\theta)$. Hence we get
$$\operatorname{disc}(p_\theta) = \prod_{i<j} (\sigma_i(\theta) - \sigma_j(\theta))^2.$$
In other words,
$$\operatorname{disc}(p_\theta) = \Delta(1, \theta, \cdots, \theta^{n-1}).$$
So this makes sense.
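This identity also gives a way to compute $\operatorname{disc}(p_\theta)$ exactly without finding any roots. The entries $\operatorname{tr}(\theta^{i+j})$ of the Gram matrix are power sums of the roots, and Newton's identities express those in terms of the coefficients. The sketch below is one way to do this, assuming $p_\theta = x^n + c_1 x^{n-1} + \cdots + c_n$ is monic with rational coefficients; the function names are illustrative, and Newton's identities are a standard tool brought in for the example rather than part of the argument above.

```python
from fractions import Fraction

def power_sums(coeffs, count):
    """Power sums p_0..p_{count-1} of the roots of x^n + c_1 x^(n-1) + ... + c_n.

    coeffs = [c_1, ..., c_n]. Uses Newton's identities, so no roots are computed.
    """
    n = len(coeffs)
    p = [Fraction(n)]
    for k in range(1, count):
        total = sum(coeffs[i - 1] * p[k - i] for i in range(1, min(k, n + 1)))
        if k <= n:
            total += k * coeffs[k - 1]
        p.append(-Fraction(total))
    return p

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n, result = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def delta_power_basis(coeffs):
    """Delta(1, theta, ..., theta^(n-1)) = det(tr(theta^(i+j))) for a root theta."""
    n = len(coeffs)
    p = power_sums(coeffs, 2 * n - 1)
    return det([[p[i + j] for j in range(n)] for i in range(n)])

disc_cubic = delta_power_basis([0, 0, -2])   # x^3 - 2
disc_sqrt5 = delta_power_basis([0, -5])      # x^2 - 5
disc_golden = delta_power_basis([-1, -1])    # x^2 - x - 1
```

These evaluate to $-108$, $20$ and $5$ respectively. The last two are $\Delta(1, \sqrt{5})$ and $\Delta(1, \tfrac12(1 + \sqrt{5}))$ for $\mathbb{Q}(\sqrt{5})$, and they differ by the factor $(\det A)^2 = 4$ relating the two bases.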