6 Endomorphisms
IB Linear Algebra
6.4 Multiplicities of eigenvalues and Jordan normal form
We will want to put our matrices in their “Jordan normal forms”, which is
a unique form for each equivalence class of similar matrices. The following
properties will help determine which Jordan normal form a matrix can have.
Definition (Algebraic and geometric multiplicity). Let α ∈ End(V) and λ an eigenvalue of α. Then
(i) The algebraic multiplicity of λ, written a_λ, is the multiplicity of λ as a root of χ_α(t).
(ii) The geometric multiplicity of λ, written g_λ, is the dimension of the corresponding eigenspace, dim E_α(λ).
(iii) c_λ is the multiplicity of λ as a root of the minimal polynomial M_α(t).
We will look at some extreme examples:
Example.
– Let
A =
⎛λ 1 ⋯ 0⎞
⎜0 λ ⋱ ⋮⎟
⎜⋮ ⋱ ⋱ 1⎟
⎝0 ⋯ 0 λ⎠
.
We will later show that a_λ = n = c_λ and g_λ = 1.
– Consider A = λI. Then a_λ = g_λ = n, c_λ = 1.
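These two extremes can be checked numerically. Below is a small sketch (not part of the notes; the helper names are my own) that computes g_λ = n − r(A − λI) for both matrices, using exact rational arithmetic so no floating-point rank issues arise:

```python
from fractions import Fraction

def rank(rows):
    """Rank via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

n, lam = 5, 3

# Jordan block J_n(lam): lam on the diagonal, 1 on the superdiagonal.
J = [[lam if j == i else (1 if j == i + 1 else 0) for j in range(n)]
     for i in range(n)]
# g_lam = nullity(J - lam*I) = n - rank(J - lam*I)
g_block = n - rank([[J[i][j] - (lam if i == j else 0) for j in range(n)]
                    for i in range(n)])
print(g_block)   # 1: a single Jordan block has a one-dimensional eigenspace

# A = lam*I: A - lam*I is the zero matrix, so the nullity is all of n.
g_scalar = n - rank([[0] * n for _ in range(n)])
print(g_scalar)  # 5
```

The superdiagonal of ones forces rank n − 1 in the first case, while the scalar matrix gives rank 0, matching the two extremes above.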
Lemma. If λ is an eigenvalue of α, then
(i) 1 ≤ g_λ ≤ a_λ;
(ii) 1 ≤ c_λ ≤ a_λ.
Proof.
(i) The first inequality is easy. If λ is an eigenvalue, then E(λ) ≠ 0. So g_λ = dim E(λ) ≥ 1. To prove the other inequality, if v_1, ⋯, v_g is a basis for E(λ), then we can extend it to a basis for V, and then α is represented by
⎛λI_g ∗⎞
⎝ 0   B⎠
So χ_α(t) = (t − λ)^g χ_B(t). So a_λ ≥ g = g_λ.
(ii) This is straightforward, since M_α(λ) = 0 implies 1 ≤ c_λ, and since M_α(t) | χ_α(t), we know that c_λ ≤ a_λ.
Lemma. Suppose F = C and α ∈ End(V). Then the following are equivalent:
(i) α is diagonalizable.
(ii) g_λ = a_λ for all eigenvalues λ of α.
(iii) c_λ = 1 for all eigenvalues λ.
Proof.
– (i) ⇔ (ii): α is diagonalizable iff dim V = Σ dim E_α(λ_i). But this is equivalent to
dim V = Σ g_{λ_i} ≤ Σ a_{λ_i} = deg χ_α = dim V.
So we must have Σ g_{λ_i} = Σ a_{λ_i}. Since each g_{λ_i} is at most a_{λ_i}, they must be individually equal.
– (i) ⇔ (iii): α is diagonalizable if and only if M_α(t) is a product of distinct linear factors, if and only if c_λ = 1 for all eigenvalues λ.
Definition (Jordan normal form). We say A ∈ Mat_n(C) is in Jordan normal form if it is a block diagonal matrix of the form
⎛J_{n_1}(λ_1)                       0⎞
⎜       J_{n_2}(λ_2)                 ⎟
⎜              ⋱                    ⎟
⎝0                     J_{n_k}(λ_k) ⎠
where k ≥ 1 and n_1, ⋯, n_k ∈ N are such that n = Σ n_i, the λ_1, ⋯, λ_k are not necessarily distinct, and
J_m(λ) =
⎛λ 1 ⋯ 0⎞
⎜0 λ ⋱ ⋮⎟
⎜⋮ ⋱ ⋱ 1⎟
⎝0 ⋯ 0 λ⎠
is an m × m matrix. Note that J_m(λ) = λI_m + J_m(0).
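As a concrete sketch (the helper functions are my own, not from the notes), a matrix in Jordan normal form can be assembled by placing Jordan blocks down the diagonal:

```python
def jordan(m, lam):
    """The m x m Jordan block J_m(lam): lam on the diagonal, 1 just above it."""
    return [[lam if j == i else (1 if j == i + 1 else 0) for j in range(m)]
            for i in range(m)]

def block_diag(blocks):
    """Stack square blocks down the diagonal, zeros elsewhere."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    off = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[off + i][off + j] = x
        off += len(b)
    return M

# J_2(1) followed by J_1(2): a 3 x 3 matrix in Jordan normal form.
A = block_diag([jordan(2, 1), jordan(1, 2)])
for row in A:
    print(row)
# [1, 1, 0]
# [0, 1, 0]
# [0, 0, 2]
```

Note that the block sizes n_i and eigenvalues λ_i are the only data needed; everything off the blocks is zero.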
For example, it might look something like
⎛λ_1  1                              ⎞
⎜    λ_1  1                          ⎟
⎜        λ_1                         ⎟
⎜            λ_2                     ⎟
⎜                λ_3  1              ⎟
⎜                    λ_3             ⎟
⎜                        ⋱          ⎟
⎜                          λ_n  1    ⎟
⎝                              λ_n   ⎠
with all other entries zero.
Then we have the following theorem:
Theorem (Jordan normal form theorem). Every matrix A ∈ Mat_n(C) is similar to a matrix in Jordan normal form. Moreover, this Jordan normal form matrix is unique up to permutation of the blocks.
This is a complete solution to the classification problem of matrices, at least in C. We will not prove this completely. We will only prove the uniqueness part, and then reduce the existence part to a special form of endomorphisms. The remainder of the proof is left for IB Groups, Rings and Modules.
We can rephrase this result using linear maps. If α ∈ End(V) is an endomorphism of a finite-dimensional vector space V over C, then the theorem says there exists a basis such that α is represented by a matrix in Jordan normal form, and this is unique as before.
Note that the permutation clause is necessary, since if two matrices in Jordan normal form differ only by a rearrangement of blocks, then they are similar, by permuting the basis.
Example. Every 2 × 2 matrix in Jordan normal form is of one of three types:

Jordan normal form    χ_A               M_A
(λ 0; 0 λ)            (t − λ)^2         (t − λ)
(λ 0; 0 µ)            (t − λ)(t − µ)    (t − λ)(t − µ)
(λ 1; 0 λ)            (t − λ)^2         (t − λ)^2

with λ, µ ∈ C distinct. We see that M_A determines the Jordan normal form of A, but χ_A does not.
Every 3 × 3 matrix in Jordan normal form is of one of six types. Here λ_1, λ_2 and λ_3 are distinct complex numbers.

Jordan normal form             χ_A                           M_A
(λ_1 0 0; 0 λ_2 0; 0 0 λ_3)    (t − λ_1)(t − λ_2)(t − λ_3)   (t − λ_1)(t − λ_2)(t − λ_3)
(λ_1 0 0; 0 λ_1 0; 0 0 λ_2)    (t − λ_1)^2(t − λ_2)          (t − λ_1)(t − λ_2)
(λ_1 1 0; 0 λ_1 0; 0 0 λ_2)    (t − λ_1)^2(t − λ_2)          (t − λ_1)^2(t − λ_2)
(λ_1 0 0; 0 λ_1 0; 0 0 λ_1)    (t − λ_1)^3                   (t − λ_1)
(λ_1 1 0; 0 λ_1 0; 0 0 λ_1)    (t − λ_1)^3                   (t − λ_1)^2
(λ_1 1 0; 0 λ_1 1; 0 0 λ_1)    (t − λ_1)^3                   (t − λ_1)^3
Notice that χ_A and M_A together determine the Jordan normal form of a 3 × 3 complex matrix. We do indeed need χ_A in the second case, since if we are only given M_A = (t − λ_1)(t − λ_2), we know one of the roots is a double root, but not which one. In general, though, even χ_A and M_A together do not suffice.
We now want to understand the Jordan normal blocks better. Recall the definition
J_n(λ) =
⎛λ 1 ⋯ 0⎞
⎜0 λ ⋱ ⋮⎟
⎜⋮ ⋱ ⋱ 1⎟
⎝0 ⋯ 0 λ⎠
= λI_n + J_n(0).
If (e_1, ⋯, e_n) is the standard basis for C^n, we have
J_n(0)(e_1) = 0,  J_n(0)(e_i) = e_{i−1} for 2 ≤ i ≤ n.
Thus we know
J_n(0)^k(e_i) =
{ 0        if i ≤ k,
  e_{i−k}  if k < i ≤ n.
In other words, for k < n, we have
(J_n(λ) − λI)^k = J_n(0)^k =
⎛0 I_{n−k}⎞
⎝0    0   ⎠
.
If k ≥ n, then we have (J_n(λ) − λI)^k = 0. Hence we can summarize this as
n((J_m(λ) − λI_m)^r) = min{r, m}.
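The formula n((J_m(λ) − λI_m)^r) = min{r, m} can be checked directly. The following sketch (helper names are my own) verifies it for J_m(0) with m = 4, working over exact rationals:

```python
from fractions import Fraction

def nullity(rows):
    """Nullity = number of columns minus rank (Gaussian elimination)."""
    M = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(rank, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        for i in range(len(M)):
            if i != rank and M[i][c] != 0:
                f = M[i][c] / M[rank][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[rank])]
        rank += 1
    return len(M[0]) - rank

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

m = 4
N = [[1 if j == i + 1 else 0 for j in range(m)] for i in range(m)]  # J_m(0)
P = [[1 if i == j else 0 for j in range(m)] for i in range(m)]      # N^0 = I
for r in range(m + 3):
    assert nullity(P) == min(r, m)  # n(J_m(0)^r) = min{r, m}
    P = matmul(P, N)
print("n(J_m(0)^r) = min{r, m} verified for m =", m)
```

Each extra power shifts the band of ones one step further towards the corner, so the nullity climbs by one per step until the matrix vanishes, exactly as the formula says.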
Note that if A = J_n(λ), then χ_A(t) = M_A(t) = (t − λ)^n. So λ is the only eigenvalue of A, and a_λ = c_λ = n. We also know that n(A − λI) = n − r(A − λI) = 1. So g_λ = 1.
Recall that a general Jordan normal form is a block diagonal matrix of Jordan blocks. We have just studied individual Jordan blocks. Next, we want to look at some properties of block diagonal matrices in general. If A is the block diagonal matrix
A =
⎛A_1            ⎞
⎜    A_2        ⎟
⎜        ⋱     ⎟
⎝           A_k ⎠
,
then
χ_A(t) = ∏_{i=1}^k χ_{A_i}(t).
Moreover, if ρ ∈ F[t], then
ρ(A) =
⎛ρ(A_1)                  ⎞
⎜      ρ(A_2)            ⎟
⎜             ⋱         ⎟
⎝               ρ(A_k)  ⎠
.
Hence
M_A(t) = lcm(M_{A_1}(t), ⋯, M_{A_k}(t)).
By the rank-nullity theorem, we have
n(ρ(A)) = Σ_{i=1}^k n(ρ(A_i)).
Thus if A is in Jordan normal form, we get the following:
– g_λ is the number of Jordan blocks in A with eigenvalue λ.
– a_λ is the sum of the sizes of the Jordan blocks of A with eigenvalue λ.
– c_λ is the size of the largest Jordan block with eigenvalue λ.
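These three descriptions can be checked on a concrete matrix. The sketch below (my own helper functions, not from the notes) builds A from blocks J_2(1), J_1(1), J_3(2) and reads off g_1, a_1, c_1 from the nullities of powers of A − I:

```python
from fractions import Fraction

def nullity(rows):
    """Nullity via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(rank, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        for i in range(len(M)):
            if i != rank and M[i][c] != 0:
                f = M[i][c] / M[rank][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[rank])]
        rank += 1
    return len(M[0]) - rank

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def jordan(m, lam):
    return [[lam if j == i else (1 if j == i + 1 else 0) for j in range(m)]
            for i in range(m)]

def block_diag(blocks):
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    off = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[off + i][off + j] = x
        off += len(b)
    return M

A = block_diag([jordan(2, 1), jordan(1, 1), jordan(3, 2)])
n = len(A)
S = [[A[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]

# Nullities of (A - I)^r for r = 0, 1, ..., n.
nulls = []
P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
for r in range(n + 1):
    nulls.append(nullity(P))
    P = matmul(P, S)

g = nulls[1]                   # number of Jordan blocks with eigenvalue 1
a = nulls[n]                   # total size of those blocks
c = next(r for r in range(n) if nulls[r] == nulls[r + 1])  # largest block
print(g, a, c)  # 2 3 2
```

With two blocks of eigenvalue 1 of sizes 2 and 1, the expected answer is g_1 = 2, a_1 = 3, c_1 = 2, which is what the nullity computation returns.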
We are now going to prove the uniqueness part of the Jordan normal form theorem.
Theorem. Let α ∈ End(V), and let A be in Jordan normal form representing α. Then the number of Jordan blocks J_n(λ) in A with n ≥ r is
n((α − λι)^r) − n((α − λι)^{r−1}).
This is clearly independent of the choice of basis. Also, given this information, we can figure out how many Jordan blocks of size exactly n there are by doing the right subtraction. Hence this tells us that Jordan normal forms are unique up to permutation of blocks.
Proof. We work blockwise for
A =
⎛J_{n_1}(λ_1)                       ⎞
⎜       J_{n_2}(λ_2)                ⎟
⎜              ⋱                   ⎟
⎝                     J_{n_k}(λ_k) ⎠
.
We have previously computed
n((J_m(λ) − λI_m)^r) =
{ r  if r ≤ m,
  m  if r > m.
Hence we know
n((J_m(λ) − λI_m)^r) − n((J_m(λ) − λI_m)^{r−1}) =
{ 1  if r ≤ m,
  0  otherwise.
It is also easy to see that for µ ≠ λ,
n((J_m(µ) − λI_m)^r) = n(J_m(µ − λ)^r) = 0.
Adding up for each block, for r ≥ 1, we have
n((α − λι)^r) − n((α − λι)^{r−1}) = number of Jordan blocks J_n(λ) with n ≥ r.
We can interpret this result as follows: if r ≤ m, when we take an additional power of J_m(λ) − λI_m, we go from
⎛0 I_{m−r}⎞
⎝0    0   ⎠
to
⎛0 I_{m−r−1}⎞
⎝0     0    ⎠
. So we kill off one more column in the matrix, and the nullity increases by one. This happens until (J_m(λ) − λI_m)^r = 0, in which case increasing the power no longer affects the matrix. So when we look at the difference in nullity, we are counting the number of blocks that are affected by the increase in power, which is the number of blocks of size at least r.
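The counting recipe can be made concrete. In the sketch below (my own choice of example, not from the notes) we take a nilpotent matrix with blocks J_1(0), J_2(0), J_2(0), J_3(0); first differences of the nullities of A^r count blocks of size ≥ r, and a further difference counts blocks of size exactly r:

```python
from fractions import Fraction

def nullity(rows):
    """Nullity via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(rank, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        for i in range(len(M)):
            if i != rank and M[i][c] != 0:
                f = M[i][c] / M[rank][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[rank])]
        rank += 1
    return len(M[0]) - rank

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def jordan(m, lam):
    return [[lam if j == i else (1 if j == i + 1 else 0) for j in range(m)]
            for i in range(m)]

def block_diag(blocks):
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    off = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[off + i][off + j] = x
        off += len(b)
    return M

# Nilpotent example, so lambda = 0 and alpha - lambda*iota is just A.
A = block_diag([jordan(1, 0), jordan(2, 0), jordan(2, 0), jordan(3, 0)])
n = len(A)

nulls = []
P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0 = I
for r in range(n + 2):
    nulls.append(nullity(P))
    P = matmul(P, A)

# Number of blocks of size >= r is n(A^r) - n(A^(r-1)).
blocks_ge = [nulls[r] - nulls[r - 1] for r in range(1, n + 2)]
# Number of blocks of size exactly r is the further difference.
blocks_exact = [blocks_ge[i] - blocks_ge[i + 1] for i in range(n)]
print(blocks_ge[:4])     # [4, 3, 1, 0]
print(blocks_exact[:3])  # [1, 2, 1]
```

The output recovers exactly the block structure we started from: one block of size 1, two of size 2, one of size 3, as the theorem guarantees.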
We have now proved uniqueness, but existence is not yet clear. To show
this, we will reduce it to the case where there is exactly one eigenvalue. This
reduction is easy if the matrix is diagonalizable, because we can decompose the
matrix into each eigenspace and then work in the corresponding eigenspace. In
general, we need to work with “generalized eigenspaces”.
Theorem (Generalized eigenspace decomposition). Let V be a finite-dimensional vector space over C and α ∈ End(V). Suppose that
M_α(t) = ∏_{i=1}^k (t − λ_i)^{c_i},
with λ_1, ⋯, λ_k ∈ C distinct. Then
V = V_1 ⊕ ⋯ ⊕ V_k,
where V_i = ker((α − λ_i ι)^{c_i}) is the generalized eigenspace.
This allows us to represent α by a block diagonal matrix, where each block has only one eigenvalue.
Note that if c_1 = ⋯ = c_k = 1, then we recover the diagonalizability theorem. Hence, it is not surprising that the proof of this is similar to that of the diagonalizability theorem. We will again prove this by constructing projection maps onto each of the V_i.
Proof. Let
p_j(t) = ∏_{i≠j} (t − λ_i)^{c_i}.
Then p_1, ⋯, p_k have no common factors, i.e. they are coprime. Thus by Euclid's algorithm, there exist q_1, ⋯, q_k ∈ C[t] such that
Σ p_i q_i = 1.
We now define the endomorphisms
π_j = q_j(α) p_j(α)
for j = 1, ⋯, k. Then Σ π_j = ι. Since M_α(α) = 0 and M_α(t) = (t − λ_j)^{c_j} p_j(t), we get
(α − λ_j ι)^{c_j} π_j = 0.
So im π_j ⊆ V_j.
Now suppose v ∈ V. Then
v = ι(v) = Σ_{j=1}^k π_j(v) ∈ Σ V_j.
So
V = Σ V_j.
To show this is a direct sum, note that π_i π_j = 0 for i ≠ j, since the product contains M_α(α) as a factor. So
π_i = ι π_i = (Σ_j π_j) π_i = π_i^2.
So each π_i is a projection, and π_j|_{V_j} = ι|_{V_j}. (Indeed, if v ∈ V_j, then p_i(α)(v) = 0 for all i ≠ j, since p_i contains the factor (t − λ_j)^{c_j}; so π_j(v) = Σ_i π_i(v) = v.) So if v = Σ v_i with v_i ∈ V_i, then applying π_i to both sides gives v_i = π_i(v). Hence there is a unique way of writing v as a sum of elements of the V_i. So V = ⊕ V_j as claimed.
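As a numerical sanity check (a sketch on a matrix of my own choosing), take A already in Jordan normal form with M_A(t) = (t − 1)^2(t − 2); the generalized eigenspace dimensions should then sum to dim V:

```python
from fractions import Fraction

def nullity(rows):
    """Nullity via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(rank, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        for i in range(len(M)):
            if i != rank and M[i][c] != 0:
                f = M[i][c] / M[rank][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[rank])]
        rank += 1
    return len(M[0]) - rank

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# M_A(t) = (t - 1)^2 (t - 2), so c_1 = 2, c_2 = 1.
A = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]
S1 = [[A[i][j] - (1 if i == j else 0) for j in range(3)] for i in range(3)]
S2 = [[A[i][j] - (2 if i == j else 0) for j in range(3)] for i in range(3)]
d1 = nullity(matmul(S1, S1))  # dim V_1 = dim ker((A - I)^2)
d2 = nullity(S2)              # dim V_2 = dim ker(A - 2I)
print(d1, d2)  # 2 1, and 2 + 1 = 3 = dim V
```

Here V_1 picks up the full size of the J_2(1) block (dimension 2, bigger than the eigenspace, which is only 1-dimensional), which is exactly why generalized eigenspaces are needed.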
Note that we didn’t really use the fact that the vector space is over
C
, except
to get that the minimal polynomial is a product of linear factors. In fact, for
arbitrary vector spaces, if the minimal polynomial of a matrix is a product of
linear factors, then it can be put into Jordan normal form. The converse is also
true — if it can be put into Jordan normal form, then the minimal polynomial
is a product of linear factors, since we’ve seen that a necessary and sufficient
condition for the minimal polynomial to be a product of linear factors is for
there to be a basis in which the matrix is upper triangular.
Using this theorem, by restricting α to its generalized eigenspaces, we can reduce the existence part of the Jordan normal form theorem to the case M_α(t) = (t − λ)^c. Further, by replacing α by α − λι, we can reduce this to the case where 0 is the only eigenvalue.
Definition (Nilpotent). We say α ∈ End(V) is nilpotent if there is some r such that α^r = 0.
Over C, α is nilpotent if and only if the only eigenvalue of α is 0. This is since α is nilpotent if and only if the minimal polynomial is t^r for some r.
We’ve now reduced the problem of classifying complex endomorphisms to the
classifying nilpotent endomorphisms. This is the point where we stop. For the
remainder of the proof, see IB Groups, Rings and Modules. There is in fact an
elementary proof of it, and we’re not doing it not because it’s hard, but because
we don’t have time.
Example. Let
A =
⎛3 −2 0⎞
⎜1  0 0⎟
⎝1  0 1⎠
.
We know we can find the Jordan normal form by just computing the minimal polynomial and the characteristic polynomial. But we can do better and find a P such that P^{−1}AP is in Jordan normal form.
We first compute the eigenvalues of A. The characteristic polynomial is
det
⎛t − 3  2    0  ⎞
⎜ −1    t    0  ⎟
⎝ −1    0  t − 1⎠
= (t − 1)((t − 3)t + 2) = (t − 1)^2(t − 2).
We now compute the eigenspaces of A. We have
A − I =
⎛2 −2 0⎞
⎜1 −1 0⎟
⎝1  0 0⎠
We see this has rank 2 and hence nullity 1, and the eigenspace is the kernel
E_A(1) = ⟨(0, 0, 1)^T⟩.
We can also compute the other eigenspace. We have
A − 2I =
⎛1 −2  0⎞
⎜1 −2  0⎟
⎝1  0 −1⎠
This has rank 2 and
E_A(2) = ⟨(2, 1, 2)^T⟩.
Since
dim E_A(1) + dim E_A(2) = 2 < 3,
A is not diagonalizable. So the minimal polynomial must also be M_A(t) = χ_A(t) = (t − 1)^2(t − 2). From the classification last time, we know that A is similar to
⎛1 1 0⎞
⎜0 1 0⎟
⎝0 0 2⎠
.
We now want to compute a basis that transforms A to this. We want a basis (v_1, v_2, v_3) of C^3 such that
Av_1 = v_1,  Av_2 = v_1 + v_2,  Av_3 = 2v_3.
Equivalently, we have
(A − I)v_1 = 0,  (A − I)v_2 = v_1,  (A − 2I)v_3 = 0.
There is an obvious choice of v_3, namely the eigenvector of eigenvalue 2.
To find v_1 and v_2, the idea is to find some v_2 such that (A − I)v_2 ≠ 0 but (A − I)^2 v_2 = 0. Then we can let v_1 = (A − I)v_2.
We can compute the kernel of (A − I)^2. We have
(A − I)^2 =
⎛2 −2 0⎞
⎜1 −1 0⎟
⎝2 −2 0⎠
The kernel of this is
ker(A − I)^2 = ⟨(0, 0, 1)^T, (1, 1, 0)^T⟩.
We need to pick our v_2 in this kernel but not in the kernel of A − I (which is the eigenspace E_A(1) we have computed above). So we have
v_2 = (1, 1, 0)^T,  v_1 = (0, 0, 1)^T,  v_3 = (2, 1, 2)^T.
Hence we have
P =
⎛0 1 2⎞
⎜0 1 1⎟
⎝1 0 2⎠
and
P^{−1}AP =
⎛1 1 0⎞
⎜0 1 0⎟
⎝0 0 2⎠
.
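This can be verified without computing P^{−1}: since P is invertible, P^{−1}AP = J is equivalent to AP = PJ. A quick sketch of the check:

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[3, -2, 0],
     [1,  0, 0],
     [1,  0, 1]]
P = [[0, 1, 2],   # columns are v_1, v_2, v_3
     [0, 1, 1],
     [1, 0, 2]]
J = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]

# P^{-1} A P = J  <=>  A P = P J.
print(matmul(A, P) == matmul(P, J))  # True
```

Column by column, this is exactly the system Av_1 = v_1, Av_2 = v_1 + v_2, Av_3 = 2v_3 that we solved above.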