6 Endomorphisms

IB Linear Algebra



6.2 The minimal polynomial
6.2.1 Aside on polynomials
Before we talk about minimal polynomials, we first talk about polynomials in
general.
Definition (Polynomial). A polynomial over F is an object of the form

f(t) = a_m t^m + a_{m−1} t^{m−1} + ··· + a_1 t + a_0,

with m ≥ 0 and a_0, ··· , a_m ∈ F.
We write F[t] for the set of polynomials over F.
Note that we don't identify a polynomial f with the corresponding function it represents. For example, if F = Z/pZ, then t^p and t are different polynomials, even though they define the same function (by Fermat's little theorem/Lagrange's theorem). Two polynomials are equal if and only if they have the same coefficients.
However, we will later see that if F is R or C, then two polynomials are equal if and only if they represent the same function, and this distinction is not as important.
Definition (Degree). Let f ∈ F[t]. Then the degree of f, written deg f, is the largest n such that a_n ≠ 0. In particular, deg 0 = −∞.
Notice that deg(fg) = deg f + deg g and deg(f + g) ≤ max{deg f, deg g}.
Lemma (Polynomial division). If f, g ∈ F[t] (and g ≠ 0), then there exist q, r ∈ F[t] with deg r < deg g such that

f = qg + r.

Proof is omitted.
Lemma. If λ ∈ F is a root of f, i.e. f(λ) = 0, then there is some g ∈ F[t] such that

f(t) = (t − λ)g(t).
Proof. By polynomial division, let

f(t) = (t − λ)g(t) + r(t)

for some g(t), r(t) ∈ F[t] with deg r < deg(t − λ) = 1. So r has to be constant, i.e. r(t) = a_0 for some a_0 ∈ F. Now evaluate this at λ. So

0 = f(λ) = (λ − λ)g(λ) + r(λ) = a_0.

So a_0 = 0. So r = 0. So done.
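As a quick numerical check of this lemma (a hypothetical example using NumPy's polynomial division, not part of the notes), dividing a cubic by (t − λ) at a root λ leaves zero remainder:

```python
import numpy as np
from numpy.polynomial import polynomial as P

# f(t) = (t - 1)(t - 2)(t - 3) = t^3 - 6t^2 + 11t - 6,
# coefficients in low-to-high order as NumPy expects.
f = np.array([-6.0, 11.0, -6.0, 1.0])
lam = 2.0                              # a root of f

# Divide f by (t - lam); the lemma says the remainder vanishes.
g, r = P.polydiv(f, np.array([-lam, 1.0]))
# g holds the quotient (t - 1)(t - 3) = t^2 - 4t + 3; r is the remainder.
```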
Definition (Multiplicity of a root). Let f ∈ F[t] and λ a root of f. We say λ has multiplicity k if (t − λ)^k is a factor of f but (t − λ)^{k+1} is not, i.e.

f(t) = (t − λ)^k g(t)

for some g(t) ∈ F[t] with g(λ) ≠ 0.
We can use the last lemma and induction to show that any non-zero f ∈ F[t] can be written as

f = g(t) ∏_{i=1}^{k} (t − λ_i)^{a_i},

where λ_1, ··· , λ_k are all distinct, a_i ≥ 1, and g is a polynomial with no roots in F.
Hence we obtain the following:
Lemma. A non-zero polynomial f ∈ F[t] has at most deg f roots, counted with multiplicity.
Corollary. Let f, g ∈ F[t] have degree < n. If there are λ_1, ··· , λ_n distinct such that f(λ_i) = g(λ_i) for all i, then f = g.
Proof. Given the lemma, consider f − g. This has degree less than n, and (f − g)(λ_i) = 0 for i = 1, ··· , n. Since it has at least n > deg(f − g) roots, we must have f − g = 0. So f = g.
Corollary. If F is infinite, then f and g are equal if and only if they agree on all points.
More importantly, we have the following:
Theorem (The fundamental theorem of algebra). Every non-constant polynomial over C has a root in C.
We will not prove this.
We say C is an algebraically closed field.
It thus follows that every polynomial over C of degree n > 0 has precisely n roots, counted with multiplicity, since if we write f(t) = g(t) ∏ (t − λ_i)^{a_i} and g has no roots, then g is constant. So the number of roots is ∑ a_i = deg f, counted with multiplicity.
It also follows that every polynomial over R factors into linear polynomials and quadratic polynomials with no real roots (since complex roots of real polynomials come in complex conjugate pairs).
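As an illustration (a hypothetical example, using numpy.roots as the root-finder), the complex roots of a real cubic do come in a conjugate pair:

```python
import numpy as np

# p(t) = t^3 - t^2 + t - 1 = (t - 1)(t^2 + 1): one real linear factor
# and one quadratic factor with no real roots.
# numpy.roots takes coefficients in high-to-low order.
roots = np.roots([1.0, -1.0, 1.0, -1.0])

real_roots = roots[np.isclose(roots.imag, 0.0)]
complex_roots = roots[~np.isclose(roots.imag, 0.0)]
# complex_roots is a conjugate pair; real_roots is just {1}.
```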
6.2.2 Minimal polynomial
Notation. Given f(t) = ∑_{i=0}^{m} a_i t^i ∈ F[t], A ∈ Mat_n(F) and α ∈ End(V), we can write

f(A) = ∑_{i=0}^{m} a_i A^i,    f(α) = ∑_{i=0}^{m} a_i α^i,

where A^0 = I and α^0 = ι.
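Evaluating a polynomial at a matrix is easy to carry out directly; here is a minimal sketch (the helper name poly_of_matrix and the example matrix are my own, not from the notes):

```python
import numpy as np

def poly_of_matrix(coeffs, A):
    """Evaluate f(A) = sum_i a_i A^i for coeffs = [a_0, ..., a_m]."""
    n = A.shape[0]
    result = np.zeros((n, n))
    power = np.eye(n)          # A^0 = I
    for a in coeffs:
        result = result + a * power
        power = power @ A
    return result

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
# f(t) = (t - 2)(t - 3) = t^2 - 5t + 6 annihilates this diagonal matrix.
fA = poly_of_matrix([6.0, -5.0, 1.0], A)
```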
Theorem (Diagonalizability theorem). Suppose α ∈ End(V). Then α is diagonalizable if and only if there exists non-zero p(t) ∈ F[t] such that p(α) = 0, and p(t) can be factored as a product of distinct linear factors.
Proof. Suppose α is diagonalizable. Let λ_1, ··· , λ_k be the distinct eigenvalues of α. We have

V = ⊕_{i=1}^{k} E(λ_i).

So each v ∈ V can be written (uniquely) as

v = ∑_{i=1}^{k} v_i  with  α(v_i) = λ_i v_i.

Now let

p(t) = ∏_{i=1}^{k} (t − λ_i).

Then for any v, we get

p(α)(v) = ∑_{i=1}^{k} p(α)(v_i) = ∑_{i=1}^{k} p(λ_i) v_i = 0.

So p(α) = 0. By construction, p has distinct linear factors.
Conversely, suppose we have our polynomial

p(t) = ∏_{i=1}^{k} (t − λ_i),

with λ_1, ··· , λ_k ∈ F distinct, and p(α) = 0 (we can wlog assume p is monic, i.e. the leading coefficient is 1). We will show that

V = ∑_{i=1}^{k} E_α(λ_i).

In other words, we want to show that for all v ∈ V, there is some v_i ∈ E_α(λ_i) for i = 1, ··· , k such that v = ∑ v_i.
To find these v_i, we let

q_j(t) = ∏_{i≠j} (t − λ_i)/(λ_j − λ_i).

This is a polynomial of degree k − 1, and q_j(λ_i) = δ_{ij}.
Now consider

q(t) = ∑_{i=1}^{k} q_i(t).

We still have deg q ≤ k − 1, but q(λ_i) = 1 for any i. Since q and 1 agree on k points, we must have q = 1.
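These Lagrange-style polynomials are easy to check numerically; the sketch below (with arbitrarily chosen distinct values for the λ_i) verifies both q_j(λ_i) = δ_{ij} and that the q_j sum to the constant polynomial 1:

```python
import numpy as np

def q(j, lambdas, t):
    """q_j(t) = prod_{i != j} (t - lambda_i) / (lambda_j - lambda_i)."""
    val = 1.0
    for i, lam in enumerate(lambdas):
        if i != j:
            val *= (t - lam) / (lambdas[j] - lam)
    return val

lambdas = [1.0, 2.0, 4.0]          # arbitrary distinct "eigenvalues"

# q_j(lambda_i) = delta_ij:
table = [[q(j, lambdas, lam) for lam in lambdas] for j in range(3)]

# sum_j q_j has degree <= 2 and equals 1 at three points, so it is 1
# everywhere; we spot-check a few values of t.
q_sums = [sum(q(j, lambdas, t) for j in range(3)) for t in (0.0, 3.7, 10.0)]
```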
Let π_j : V → V be given by π_j = q_j(α). Then the above says that

∑_{j=1}^{k} π_j = ι.

Hence given v ∈ V, we know that v = ∑_j π_j v.
We now check that π_j v ∈ E_α(λ_j). This is true since

(α − λ_j ι)π_j v = (1/∏_{i≠j}(λ_j − λ_i)) ∏_{i=1}^{k} (α − λ_i ι)(v)
                = (1/∏_{i≠j}(λ_j − λ_i)) p(α)(v) = 0.

So, writing v_j = π_j v, we have αv_j = λ_j v_j. So done.
In the above proof, if v ∈ E_α(λ_i), then π_j(v) = δ_{ij} v. So π_i is a projection onto E_α(λ_i).
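The projections π_j can be computed concretely for a matrix; this sketch (the 2×2 symmetric matrix is a hypothetical example with eigenvalues 1 and 3) checks that they sum to the identity, square to themselves, and land in the right eigenspaces:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])       # eigenvalues 1 and 3
lambdas = [1.0, 3.0]
I = np.eye(2)

# pi_j = q_j(A), with q_j the Lagrange polynomials from the proof above.
pi = []
for j, lj in enumerate(lambdas):
    P_j = I.copy()
    for i, li in enumerate(lambdas):
        if i != j:
            P_j = P_j @ (A - li * I) / (lj - li)
    pi.append(P_j)
```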
Definition (Minimal polynomial). The minimal polynomial of α ∈ End(V) is the non-zero monic polynomial M_α(t) of least degree such that M_α(α) = 0.
The monic requirement is just for things to look nice, since we can always
divide by the leading coefficient of a polynomial to get a monic version.
Note that if A represents α, then for all p ∈ F[t], p(A) represents p(α). Thus p(α) = 0 iff p(A) = 0. So the minimal polynomial of α is the minimal polynomial of A if we define M_A analogously.
There are two things we want to know: whether the minimal polynomial exists, and whether it is unique.
Existence is always guaranteed in finite-dimensional cases. If dim V = n < ∞, then dim End(V) = n^2. So ι, α, α^2, ··· , α^{n^2} are linearly dependent. So there are some λ_0, ··· , λ_{n^2} ∈ F, not all zero, such that

∑_{i=0}^{n^2} λ_i α^i = 0.

So deg M_α ≤ n^2. So we must have a minimal polynomial.
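This existence argument is effective: searching for the first linear dependence among I, A, A^2, ··· recovers the minimal polynomial. A sketch of that idea (the helper minimal_polynomial and its least-squares search are my own, for real matrices only, not from the notes):

```python
import numpy as np

def minimal_polynomial(A, tol=1e-9):
    """Return monic coefficients [a_0, ..., a_{d-1}, 1] of least degree d
    with A^d + a_{d-1} A^{d-1} + ... + a_0 I = 0 (real matrices only)."""
    n = A.shape[0]
    powers = [np.eye(n).ravel()]
    for d in range(1, n * n + 1):          # deg M_A <= n^2 always suffices
        powers.append(np.linalg.matrix_power(A, d).ravel())
        M = np.stack(powers[:-1], axis=1)  # columns I, A, ..., A^{d-1}
        coeffs, *_ = np.linalg.lstsq(M, -powers[-1], rcond=None)
        if np.linalg.norm(M @ coeffs + powers[-1]) < tol:
            return np.append(coeffs, 1.0)  # make it monic
    raise RuntimeError("unreachable: degree n^2 always gives a dependence")

# diag(2, 2, 3): the characteristic polynomial (t - 2)^2 (t - 3) has
# degree 3, but the minimal polynomial is (t - 2)(t - 3) = t^2 - 5t + 6.
m = minimal_polynomial(np.diag([2.0, 2.0, 3.0]))
```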
To show that the minimal polynomial is unique, we will prove the following
stronger characterization of the minimal polynomial:
Lemma. Let α ∈ End(V), and p ∈ F[t]. Then p(α) = 0 if and only if M_α(t) is a factor of p(t). In particular, M_α is unique.
Proof. For all such p, we can write p(t) = q(t)M_α(t) + r(t) for some r of degree less than deg M_α. Then

p(α) = q(α)M_α(α) + r(α).

So r(α) = 0 iff p(α) = 0. But deg r < deg M_α. By the minimality of M_α, we must have r(α) = 0 iff r = 0. So p(α) = 0 iff M_α(t) | p(t).
So if M_1 and M_2 are both minimal polynomials for α, then M_1 | M_2 and M_2 | M_1. So M_2 is just a scalar multiple of M_1. But since M_1 and M_2 are monic, they must be equal.
Example. Let V = F^2, and consider the following matrices:

A = [1 0]    B = [1 1]
    [0 1],       [0 1].

Consider the polynomial p(t) = (t − 1)^2. We can compute p(A) = p(B) = 0. So M_A(t) and M_B(t) are factors of (t − 1)^2. There aren't many factors of (t − 1)^2. So the minimal polynomials are either (t − 1) or (t − 1)^2. Since A − I = 0 and B − I ≠ 0, the minimal polynomial of A is t − 1 and the minimal polynomial of B is (t − 1)^2.
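This example is easy to verify numerically (a quick check with NumPy, not part of the notes):

```python
import numpy as np

I = np.eye(2)
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# A - I = 0, so M_A(t) = t - 1.
A_shift = A - I
# B - I != 0 but (B - I)^2 = 0, so M_B(t) = (t - 1)^2.
B_shift = B - I
B_shift_sq = B_shift @ B_shift
```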
We can now re-state our diagonalizability theorem.
Theorem (Diagonalizability theorem 2.0). Let α ∈ End(V). Then α is diagonalizable if and only if M_α(t) is a product of its distinct linear factors.
Proof. (⇐) This follows directly from the previous diagonalizability theorem.
(⇒) Suppose α is diagonalizable. Then there is some p ∈ F[t] non-zero such that p(α) = 0 and p is a product of distinct linear factors. Since M_α divides p, M_α also has distinct linear factors.
Theorem. Let α, β ∈ End(V) be both diagonalizable. Then α and β are simultaneously diagonalizable (i.e. there exists a basis with respect to which both are diagonal) if and only if αβ = βα.
This is important in quantum mechanics. This means that if two operators
do not commute, then they do not have a common eigenbasis. Hence we have
the uncertainty principle.
Proof. (⇒) If there exists a basis (e_1, ··· , e_n) for V such that α and β are represented by A and B respectively, with both diagonal, then by direct computation, AB = BA. But AB represents αβ and BA represents βα. So αβ = βα.
(⇐) Suppose αβ = βα. The idea is to consider each eigenspace of α individually, and then diagonalize β in each of the eigenspaces. Since α is diagonalizable, we can write

V = ⊕_{i=1}^{k} E_α(λ_i),

where the λ_i are the distinct eigenvalues of α. We write E_i for E_α(λ_i). We want to show that β sends E_i to itself, i.e. β(E_i) ⊆ E_i. Let v ∈ E_i. Then we want β(v) to be in E_i. This is true since

α(β(v)) = β(α(v)) = β(λ_i v) = λ_i β(v).

So β(v) is an eigenvector of α with eigenvalue λ_i.
Now we can view β|_{E_i} ∈ End(E_i). Note that

M_β(β|_{E_i}) = M_β(β)|_{E_i} = 0.

Since M_β(t) is a product of its distinct linear factors, it follows that β|_{E_i} is diagonalizable. So we can choose a basis B_i of eigenvectors for β|_{E_i}. We can do this for all i.
Then since V is a direct sum of the E_i's, we know that B = ⋃_{i=1}^{k} B_i is a basis for V consisting of eigenvectors for both α and β. So done.
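A concrete instance of the theorem (a hypothetical 2×2 example: B is a polynomial in A, so they commute, and the eigenvectors of A diagonalize both):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])          # eigenvalues +1 and -1
B = 2.0 * np.eye(2) + 3.0 * A       # commutes with A by construction

# Eigenvectors of the symmetric matrix A give an orthonormal basis Q.
_, Q = np.linalg.eigh(A)

# Conjugating by Q diagonalizes A and B simultaneously.
DA = Q.T @ A @ Q
DB = Q.T @ B @ Q
```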