3 Modules

IB Groups, Rings and Modules

3.4 Modules over F[X] and normal forms for matrices

That was one promise delivered. We next want to consider the Jordan normal form. This is less straightforward, since considering $V$ directly as an $F$-module would not be too helpful (since that would just be pure linear algebra). Instead, we use the following trick:

For a field $F$, the polynomial ring $F[X]$ is a Euclidean domain, so the results of the last few sections apply. If $V$ is a vector space over $F$, and $\alpha: V \to V$ is a linear map, then we can make $V$ into an $F[X]$-module via
$$\begin{aligned}
F[X] \times V &\to V\\
(f, v) &\mapsto (f(\alpha))(v).
\end{aligned}$$
We write $V_\alpha$ for this $F[X]$-module.
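To make the action concrete, here is a small numerical sketch (not from the notes: numpy and the particular matrix $A$ are illustrative choices), showing a polynomial $f$ acting on a vector as $f(\alpha)(v)$:

```python
import numpy as np

# Sketch of the F[X]-action on V_alpha: a polynomial f acts on v as f(alpha)(v).
# Here alpha is represented by an illustrative 2x2 real matrix A.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])  # rotation by 90 degrees; A^2 = -I
v = np.array([1.0, 0.0])

def act(coeffs, A, v):
    """Apply f(alpha) to v, where coeffs = [a_0, a_1, ...] gives f = sum a_i X^i."""
    result = np.zeros_like(v)
    power = v.copy()             # A^0 v
    for a in coeffs:
        result = result + a * power
        power = A @ power        # next power of A applied to v
    return result

# f = X^2 + 1 kills every vector here, since A^2 = -I for this choice of A.
print(act([1.0, 0.0, 1.0], A, v))  # (A^2 + I)v = [0. 0.]
```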

Lemma. If $V$ is a finite-dimensional vector space, then $V_\alpha$ is a finitely-generated $F[X]$-module.

Proof. If $v_1, \cdots, v_n$ generate $V$ as an $F$-module, i.e. span $V$ as a vector space over $F$, then they also generate $V_\alpha$ as an $F[X]$-module, since $F \leq F[X]$.

Example. Suppose $V_\alpha \cong F[X]/(X^r)$ as $F[X]$-modules. Then in particular they are isomorphic as $F$-modules (since being a map of $F$-modules has fewer requirements than being a map of $F[X]$-modules).

Under this bijection, the elements $1, X, X^2, \cdots, X^{r-1} \in F[X]/(X^r)$ form a vector space basis for $V_\alpha$. Viewing $F[X]/(X^r)$ as an $F$-vector space, the action of $X$ has the matrix
$$\begin{pmatrix}
0 & 0 & \cdots & 0 & 0\\
1 & 0 & \cdots & 0 & 0\\
0 & 1 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & 1 & 0
\end{pmatrix}.$$
We also know that in $V_\alpha$, the action of $X$ is by definition the linear map $\alpha$. So under this basis, $\alpha$ also has this matrix.

Example. Suppose
$$V_\alpha \cong \frac{F[X]}{((X - \lambda)^r)}$$
for some $\lambda \in F$. Consider the new linear map
$$\beta = \alpha - \lambda \cdot \mathrm{id}: V \to V.$$
Then $V_\beta \cong F[Y]/(Y^r)$, for $Y = X - \lambda$. So there is a basis for $V$ so that $\beta$ looks like
$$\begin{pmatrix}
0 & 0 & \cdots & 0 & 0\\
1 & 0 & \cdots & 0 & 0\\
0 & 1 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & 1 & 0
\end{pmatrix}.$$
So we know $\alpha = \beta + \lambda \cdot \mathrm{id}$ has matrix
$$\begin{pmatrix}
\lambda & 0 & \cdots & 0 & 0\\
1 & \lambda & \cdots & 0 & 0\\
0 & 1 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & 1 & \lambda
\end{pmatrix}.$$
So it is a Jordan block (except that the usual convention puts the ones above the diagonal rather than below it).

Example. Suppose $V_\alpha \cong F[X]/(f)$ for some polynomial $f$, for
$$f = a_0 + a_1 X + \cdots + a_{r-1} X^{r-1} + X^r.$$
This has a basis $1, X, X^2, \cdots, X^{r-1}$ as well, in which $\alpha$ is
$$c(f) = \begin{pmatrix}
0 & 0 & \cdots & 0 & -a_0\\
1 & 0 & \cdots & 0 & -a_1\\
0 & 1 & \cdots & 0 & -a_2\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & 1 & -a_{r-1}
\end{pmatrix}.$$
We call this the companion matrix for the monic polynomial $f$.
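As a quick sanity check (a sympy sketch, not part of the notes; the helper `companion` and the polynomial chosen are illustrative), the characteristic polynomial of $c(f)$ recovers $f$:

```python
import sympy as sp

X = sp.symbols('X')

def companion(coeffs):
    """Companion matrix c(f) for monic f = a_0 + a_1 X + ... + a_{r-1} X^{r-1} + X^r,
    with ones below the diagonal and -a_i in the last column, as in the text."""
    r = len(coeffs)
    C = sp.zeros(r, r)
    for i in range(1, r):
        C[i, i - 1] = 1           # ones on the subdiagonal
    for i in range(r):
        C[i, r - 1] = -coeffs[i]  # negated coefficients in the last column
    return C

# Illustrative choice: f = X^3 - 2X + 5, i.e. a_0 = 5, a_1 = -2, a_2 = 0.
C = companion([5, -2, 0])
p = C.charpoly(X).as_expr()  # det(X*I - C)
print(sp.expand(p))          # X**3 - 2*X + 5
```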

These are the different things that can possibly happen. Since we have already classified all finitely-generated $F[X]$-modules, this allows us to put matrices in a rather nice form.

Theorem (Rational canonical form). Let $\alpha: V \to V$ be a linear endomorphism of a finite-dimensional vector space over $F$, and $V_\alpha$ be the associated $F[X]$-module. Then
$$V_\alpha \cong \frac{F[X]}{(f_1)} \oplus \frac{F[X]}{(f_2)} \oplus \cdots \oplus \frac{F[X]}{(f_s)},$$
with $f_1 \mid f_2 \mid \cdots \mid f_s$. Thus there is a basis for $V$ in which the matrix for $\alpha$ is the block diagonal
$$\begin{pmatrix}
c(f_1) & 0 & \cdots & 0\\
0 & c(f_2) & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & c(f_s)
\end{pmatrix}.$$
This is the sort of theorem whose statement is longer than the proof.

Proof. We already know that $V_\alpha$ is a finitely-generated $F[X]$-module. By the structure theorem for $F[X]$-modules, we know
$$V_\alpha \cong \frac{F[X]}{(f_1)} \oplus \frac{F[X]}{(f_2)} \oplus \cdots \oplus \frac{F[X]}{(f_s)}.$$
We know there are no copies of $F[X]$ in the decomposition, since $V_\alpha = V$ is finite-dimensional over $F$, but $F[X]$ is not. The divisibility criterion also follows from the structure theorem. Then the form of the matrix is immediate.

This is really a canonical form. The Jordan normal form is not canonical, since we can move the blocks around. The structure theorem determines the factors $f_i$ up to units, and once we require them to be monic, there is no choice left.

In terms of matrices, this says that if $\alpha$ is represented by a matrix $A \in M_{n,n}(F)$ in some basis, then $A$ is conjugate to a matrix of the form above.

From the rational canonical form, we can immediately read off the minimal polynomial as $f_s$. This is since if we view $V_\alpha$ via the decomposition above, we find that $f_s(\alpha)$ kills everything in $\frac{F[X]}{(f_s)}$. It also kills the other factors since $f_i \mid f_s$ for all $i$. So $f_s(\alpha) = 0$. We also know no smaller polynomial kills $V$, since it does not kill $\frac{F[X]}{(f_s)}$.

Similarly, we find that the characteristic polynomial of $\alpha$ is $f_1 f_2 \cdots f_s$.
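Both claims are easy to verify by machine for a particular choice of invariant factors (a sympy sketch, not from the notes; $f_1 = X - 1$ and $f_2 = (X-1)(X-2)$ are illustrative choices satisfying $f_1 \mid f_2$):

```python
import sympy as sp

X = sp.symbols('X')

# Illustrative invariant factors with f1 | f2:
#   f1 = X - 1,  f2 = (X - 1)(X - 2) = X^2 - 3X + 2.
# Their companion matrices, written as in the text (ones below the
# diagonal, negated coefficients in the last column):
c_f1 = sp.Matrix([[1]])
c_f2 = sp.Matrix([[0, -2],
                  [1,  3]])
A = sp.diag(c_f1, c_f2)  # the rational canonical form of alpha

# The characteristic polynomial is the product f1 * f2:
assert sp.expand(A.charpoly(X).as_expr() - (X - 1)**2 * (X - 2)) == 0

# The minimal polynomial is f2 alone: already f2(A) = 0.
f2_at_A = A**2 - 3*A + 2*sp.eye(3)
print(f2_at_A)  # the zero matrix
```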

Recall we had a different way of decomposing a module over a Euclidean domain, namely the prime decomposition, and this gives us the Jordan normal form.

Before we can use that, we need to know what the primes are. This is why we need to work over $\mathbb{C}$.

Lemma. The prime elements of $\mathbb{C}[X]$ are the $X - \lambda$ for $\lambda \in \mathbb{C}$ (up to multiplication by units).

Proof. Let $f \in \mathbb{C}[X]$. If $f$ is constant, then it is either a unit or $0$. Otherwise, by the fundamental theorem of algebra, it has a root $\lambda$. So it is divisible by $X - \lambda$. So if $f$ is irreducible, it must have degree $1$. And clearly everything of degree $1$ is prime.

Applying the prime decomposition theorem to $\mathbb{C}[X]$-modules gives us the Jordan normal form.

Theorem (Jordan normal form). Let $\alpha: V \to V$ be an endomorphism of a finite-dimensional vector space $V$ over $\mathbb{C}$, and $V_\alpha$ be the associated $\mathbb{C}[X]$-module. Then
$$V_\alpha \cong \frac{\mathbb{C}[X]}{((X - \lambda_1)^{a_1})} \oplus \frac{\mathbb{C}[X]}{((X - \lambda_2)^{a_2})} \oplus \cdots \oplus \frac{\mathbb{C}[X]}{((X - \lambda_t)^{a_t})},$$
where the $\lambda_i \in \mathbb{C}$ do not have to be distinct. So there is a basis of $V$ in which $\alpha$ has matrix
$$\begin{pmatrix}
J_{a_1}(\lambda_1) & & & 0\\
& J_{a_2}(\lambda_2) & &\\
& & \ddots &\\
0 & & & J_{a_t}(\lambda_t)
\end{pmatrix},$$
where
$$J_m(\lambda) = \begin{pmatrix}
\lambda & 0 & \cdots & 0\\
1 & \lambda & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & \cdots & 1 & \lambda
\end{pmatrix}$$
is an $m \times m$ matrix.

Proof. Apply the prime decomposition theorem to $V_\alpha$. Then all primes are of the form $X - \lambda$. We then use our second example above to get the form of the matrix.

The blocks $J_m(\lambda)$ are called the Jordan $\lambda$-blocks. It turns out that the Jordan blocks are unique up to reordering, but it does not immediately follow from what we have so far, and we will not prove it. It is done in the IB Linear Algebra course.

We can also read off the minimal polynomial and characteristic polynomial of $\alpha$. The minimal polynomial is
$$\prod_\lambda (X - \lambda)^{a_\lambda},$$
where $a_\lambda$ is the size of the largest $\lambda$-block. The characteristic polynomial of $\alpha$ is
$$\prod_\lambda (X - \lambda)^{b_\lambda},$$
where $b_\lambda$ is the sum of the sizes of the $\lambda$-blocks. Alternatively, it is
$$\prod_{i=1}^t (X - \lambda_i)^{a_i}.$$

From the Jordan normal form, we can also read off another invariant, namely the dimension of the $\lambda$-eigenspace of $\alpha$: it is the number of $\lambda$-blocks.
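These invariants can be checked on a small example (a sympy sketch with an illustrative matrix; note that sympy's `jordan_form` puts the ones above the diagonal, the opposite convention to these notes):

```python
import sympy as sp

X = sp.symbols('X')

# Illustrative matrix with single eigenvalue 2, written with ones below
# the diagonal as in these notes: one 2x2 block and one 1x1 block.
A = sp.Matrix([[2, 0, 0],
               [1, 2, 0],
               [0, 0, 2]])

# sympy uses the transposed convention (ones above the diagonal); A = P J P^{-1}.
P, J = A.jordan_form()
assert sp.simplify(P * J * P.inv() - A) == sp.zeros(3, 3)

# Largest 2-block has size 2, so the minimal polynomial is (X - 2)^2:
assert (A - 2*sp.eye(3))**2 == sp.zeros(3, 3)
assert (A - 2*sp.eye(3)) != sp.zeros(3, 3)
# The characteristic polynomial is (X - 2)^3 (the block sizes sum to 3):
assert sp.expand(A.charpoly(X).as_expr() - (X - 2)**3) == 0
# Two blocks means the 2-eigenspace is 2-dimensional:
print(len(A.eigenvects()[0][2]))  # 2
```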

We can also use the idea of viewing $V$ as an $F[X]$-module to prove the Cayley-Hamilton theorem. In fact, we don't need $F$ to be a field.

Theorem (Cayley-Hamilton theorem). Let $M$ be a finitely-generated $R$-module, where $R$ is some commutative ring. Let $\alpha: M \to M$ be an $R$-module homomorphism. Let $A$ be a matrix representation of $\alpha$ under some choice of generators, and let $p(t) = \det(tI - A)$. Then $p(\alpha) = 0$.

Proof. We consider $M$ as an $R[X]$-module with action given by
$$(f(X))(m) = f(\alpha)m.$$
Suppose $e_1, \cdots, e_n$ span $M$, and that for all $i$, we have
$$\alpha(e_i) = \sum_{j=1}^n a_{ij} e_j.$$
Then
$$\sum_{j=1}^n (X\delta_{ij} - a_{ij}) e_j = 0.$$
We write $C$ for the matrix with entries
$$c_{ij} = X\delta_{ij} - a_{ij} \in R[X].$$
We now use the fact that
$$\operatorname{adj}(C)\, C = \det(C)\, I,$$
which we proved in IB Linear Algebra (and the proof did not assume that the underlying ring is a field). Expanding this out, we get the following equation (in $R[X]$):
$$\chi_\alpha(X) I = \det(XI - A) I = (\operatorname{adj}(XI - A))(XI - A).$$
Writing this in components, and multiplying by $e_k$, we have
$$\chi_\alpha(X)\delta_{ik}\, e_k = \sum_{j=1}^n (\operatorname{adj}(XI - A)_{ij})(X\delta_{jk} - a_{jk})\, e_k.$$
Then for each $i$, we sum over $k$ to obtain
$$\sum_{k=1}^n \chi_\alpha(X)\delta_{ik}\, e_k = \sum_{k=1}^n \sum_{j=1}^n (\operatorname{adj}(XI - A)_{ij})(X\delta_{jk} - a_{jk})\, e_k = 0,$$
by our choice of the $a_{ij}$ (for each $j$, the inner sum over $k$ vanishes). But the left hand side is just $\chi_\alpha(X)\, e_i$. So $\chi_\alpha(X)$ acts trivially on all of the generators $e_i$. So it in fact acts trivially. So $\chi_\alpha(\alpha)$ is the zero map (since acting by $X$ is the same as acting by $\alpha$, by construction).
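The theorem is easy to verify by machine for any particular matrix (a sympy sketch; the integer matrix is an illustrative choice, and integer entries emphasise that no field is needed):

```python
import sympy as sp

t = sp.symbols('t')

# An illustrative 3x3 integer matrix (Z is a commutative ring, not a field).
A = sp.Matrix([[1, 2, 0],
               [0, 1, 3],
               [4, 0, 1]])

p = A.charpoly(t)       # p(t) = det(tI - A)
coeffs = p.all_coeffs() # [1, c2, c1, c0], leading coefficient first

# Evaluate p at the matrix A by Horner's rule:
# p(A) = ((I*A + c2*I)*A + c1*I)*A + c0*I.
n = A.shape[0]
result = sp.zeros(n, n)
for c in coeffs:
    result = result * A + c * sp.eye(n)

print(result)  # the zero matrix, as Cayley-Hamilton predicts
```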