5 Determinants of matrices
We probably all know what the determinant is. Here we are going to give a slightly more abstract definition, and spend quite a lot of time trying to motivate this definition.
Recall that $S_n$ is the group of permutations of $\{1, \cdots, n\}$, and there is a unique group homomorphism $\varepsilon: S_n \to \{\pm 1\}$ such that $\varepsilon(\sigma) = 1$ if $\sigma$ can be written as a product of an even number of transpositions, and $\varepsilon(\sigma) = -1$ if $\sigma$ can be written as a product of an odd number of transpositions. It is proved in IA Groups that this is well-defined.
Definition (Determinant). Let $A \in \mathrm{Mat}_{n,n}(\mathbb{F})$. Its determinant is
\[
  \det A = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{i=1}^n A_{i\sigma(i)}.
\]
This is a big scary definition. Hence, we will spend the first half of the
chapter trying to understand what this really means, and how it behaves. We
will eventually prove a formula that is useful for computing the determinant,
which is probably how you were first exposed to the determinant.
Example. If $n = 2$, then $S_2 = \{\mathrm{id}, (1\ 2)\}$. So
\[
  \det A = A_{11}A_{22} - A_{12}A_{21}.
\]
When $n = 3$, then $S_3$ has 6 elements, and
\begin{align*}
  \det A ={}& A_{11}A_{22}A_{33} + A_{12}A_{23}A_{31} + A_{13}A_{21}A_{32} \\
  &- A_{11}A_{23}A_{32} - A_{22}A_{31}A_{13} - A_{33}A_{12}A_{21}.
\end{align*}
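To see the definition in action, here is a direct transcription into Python, a minimal sketch of our own (the names `sign` and `det_leibniz` are not from the course, and summing over all $n!$ permutations is hopelessly inefficient for large $n$):

```python
from itertools import permutations
from math import prod

def sign(perm):
    """epsilon(sigma): +1 for an even permutation, -1 for an odd one.
    Counting inversions gives the same parity as any decomposition
    into transpositions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_leibniz(A):
    """det A = sum over sigma in S_n of eps(sigma) * prod_i A[i][sigma(i)],
    with rows and columns 0-indexed."""
    n = len(A)
    return sum(sign(sigma) * prod(A[i][sigma[i]] for i in range(n))
               for sigma in permutations(range(n)))

print(det_leibniz([[1, 2], [3, 4]]))  # A11*A22 - A12*A21 = -2
```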
We will first prove a few easy and useful lemmas about the determinant.
Lemma. $\det A = \det A^T$.
Proof.
\[
  \det A^T = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{i=1}^n A_{\sigma(i)i}
  = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{j=1}^n A_{j\sigma^{-1}(j)}
  = \sum_{\tau \in S_n} \varepsilon(\tau^{-1}) \prod_{j=1}^n A_{j\tau(j)}.
\]
Since $\varepsilon(\tau) = \varepsilon(\tau^{-1})$, we get
\[
  \det A^T = \sum_{\tau \in S_n} \varepsilon(\tau) \prod_{j=1}^n A_{j\tau(j)} = \det A.
\]
Lemma. If $A$ is an upper triangular matrix, i.e.
\[
  A =
  \begin{pmatrix}
    a_{11} & a_{12} & \cdots & a_{1n} \\
    0 & a_{22} & \cdots & a_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & a_{nn}
  \end{pmatrix},
\]
then
\[
  \det A = \prod_{i=1}^n a_{ii}.
\]
Proof. We have
\[
  \det A = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{i=1}^n A_{i\sigma(i)}.
\]
But $A_{i\sigma(i)} = 0$ whenever $i > \sigma(i)$. So
\[
  \prod_{i=1}^n A_{i\sigma(i)} = 0
\]
if there is some $i \in \{1, \cdots, n\}$ such that $i > \sigma(i)$.
However, the only permutation in which $i \leq \sigma(i)$ for all $i$ is the identity. So the only thing that contributes in the sum is $\sigma = \mathrm{id}$. So
\[
  \det A = \prod_{i=1}^n A_{ii}.
\]
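As a quick sanity check of the lemma, reusing `det_leibniz` from the sketch above (the matrix is an arbitrary choice of ours):

```python
U = [[2, 5, 7],
     [0, 3, 1],
     [0, 0, 4]]
print(det_leibniz(U))  # 24 = 2 * 3 * 4, the product of the diagonal entries
```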
To motivate this definition, we need a notion of volume. How can we define volume on a vector space? It should be clear that the “volume” cannot be uniquely determined, since it depends on what units we are using. For example, saying the volume is “1” is meaningless unless we provide the units, e.g. $\mathrm{cm}^3$. So we have an axiomatic definition for what it means for something to denote a “volume”.
Definition (Volume form). A volume form on $\mathbb{F}^n$ is a function $d: \mathbb{F}^n \times \cdots \times \mathbb{F}^n \to \mathbb{F}$ that is

(i) Multilinear, i.e. for all $i$ and all $v_1, \cdots, v_{i-1}, v_{i+1}, \cdots, v_n \in \mathbb{F}^n$, we have
\[
  d(v_1, \cdots, v_{i-1}, \,\cdot\,, v_{i+1}, \cdots, v_n) \in (\mathbb{F}^n)^*.
\]
(ii) Alternating, i.e. if $v_i = v_j$ for some $i \neq j$, then
\[
  d(v_1, \cdots, v_n) = 0.
\]
We should think of $d(v_1, \cdots, v_n)$ as the $n$-dimensional volume of the parallelepiped spanned by $v_1, \cdots, v_n$.
We can view $A \in \mathrm{Mat}_n(\mathbb{F})$ as $n$-many vectors in $\mathbb{F}^n$ by considering its columns $A = (A^{(1)}\ A^{(2)}\ \cdots\ A^{(n)})$, with $A^{(i)} \in \mathbb{F}^n$. Then we have
Lemma. $\det A$ is a volume form.
Proof. To see that $\det$ is multilinear, it is sufficient to show that each
\[
  \prod_{i=1}^n A_{i\sigma(i)}
\]
is multilinear for all $\sigma \in S_n$, since linear combinations of multilinear forms are multilinear. But each such product contains precisely one entry from each column, and so is multilinear.
To show it is alternating, suppose now there are some $k, \ell$ distinct such that $A^{(k)} = A^{(\ell)}$. We let $\tau$ be the transposition $(k\ \ell)$. By Lagrange's theorem, we can write
\[
  S_n = A_n \sqcup \tau A_n,
\]
where $A_n = \ker \varepsilon$ and $\sqcup$ is the disjoint union. We also know that
\[
  \sum_{\sigma \in A_n} \prod_{i=1}^n A_{i\sigma(i)} = \sum_{\sigma \in A_n} \prod_{i=1}^n A_{i, \tau\sigma(i)},
\]
since if $\sigma(i)$ is not $k$ or $\ell$, then $\tau$ does nothing; if $\sigma(i)$ is $k$ or $\ell$, then $\tau$ just swaps them around, but $A^{(k)} = A^{(\ell)}$. So we get
\[
  \sum_{\sigma \in A_n} \prod_{i=1}^n A_{i\sigma(i)} = \sum_{\sigma' \in \tau A_n} \prod_{i=1}^n A_{i\sigma'(i)}.
\]
But we know that
\[
  \det A = \mathrm{LHS} - \mathrm{RHS} = 0.
\]
So done.
We have shown that determinants are volume forms, but is this the only volume form? Well obviously not, since $2 \det A$ is also a valid volume form. However, in some sense, all volume forms are “derived” from the determinant. Before we show that, we need the following
Lemma. Let $d$ be a volume form on $\mathbb{F}^n$. Then swapping two entries changes the sign, i.e.
\[
  d(v_1, \cdots, v_i, \cdots, v_j, \cdots, v_n) = -d(v_1, \cdots, v_j, \cdots, v_i, \cdots, v_n).
\]
Proof. By linearity, we have
\begin{align*}
  0 &= d(v_1, \cdots, v_i + v_j, \cdots, v_i + v_j, \cdots, v_n) \\
  &= d(v_1, \cdots, v_i, \cdots, v_i, \cdots, v_n) + d(v_1, \cdots, v_i, \cdots, v_j, \cdots, v_n) \\
  &\quad + d(v_1, \cdots, v_j, \cdots, v_i, \cdots, v_n) + d(v_1, \cdots, v_j, \cdots, v_j, \cdots, v_n) \\
  &= d(v_1, \cdots, v_i, \cdots, v_j, \cdots, v_n) + d(v_1, \cdots, v_j, \cdots, v_i, \cdots, v_n),
\end{align*}
where the first and last terms vanish since $d$ is alternating. So done.
Corollary. If $\sigma \in S_n$, then
\[
  d(v_{\sigma(1)}, \cdots, v_{\sigma(n)}) = \varepsilon(\sigma)\, d(v_1, \cdots, v_n)
\]
for any $v_i \in \mathbb{F}^n$.
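For a concrete instance of the corollary (with $d = \det$ and $\sigma$ a single transposition of two columns), again reusing `det_leibniz`; the matrix is an arbitrary choice of ours:

```python
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
# Swapping two columns applies a transposition, so eps(sigma) = -1
# and the determinant should flip sign.
A_swapped = [[row[2], row[1], row[0]] for row in A]
print(det_leibniz(A), det_leibniz(A_swapped))  # -3 3
```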
Theorem. Let $d$ be any volume form on $\mathbb{F}^n$, and let $A = (A^{(1)}\ \cdots\ A^{(n)}) \in \mathrm{Mat}_n(\mathbb{F})$. Then
\[
  d(A^{(1)}, \cdots, A^{(n)}) = (\det A)\, d(e_1, \cdots, e_n),
\]
where $\{e_1, \cdots, e_n\}$ is the standard basis.
Proof. We can compute
\begin{align*}
  d(A^{(1)}, \cdots, A^{(n)}) &= d\left(\sum_{i=1}^n A_{i1} e_i,\ A^{(2)}, \cdots, A^{(n)}\right) \\
  &= \sum_{i=1}^n A_{i1}\, d(e_i, A^{(2)}, \cdots, A^{(n)}) \\
  &= \sum_{i,j=1}^n A_{i1} A_{j2}\, d(e_i, e_j, A^{(3)}, \cdots, A^{(n)}) \\
  &= \sum_{i_1, \cdots, i_n} d(e_{i_1}, \cdots, e_{i_n}) \prod_{j=1}^n A_{i_j j}.
\end{align*}
We know that lots of these terms are zero, since if $i_k = i_j$ for some $k \neq j$, then the term is zero. So we are just summing over distinct tuples, i.e. those for which there is some $\sigma \in S_n$ such that $i_j = \sigma(j)$. So we get
\[
  d(A^{(1)}, \cdots, A^{(n)}) = \sum_{\sigma \in S_n} d(e_{\sigma(1)}, \cdots, e_{\sigma(n)}) \prod_{j=1}^n A_{\sigma(j)j}.
\]
However, by the corollary above, this is just
\[
  d(A^{(1)}, \cdots, A^{(n)}) = \sum_{\sigma \in S_n} \varepsilon(\sigma)\, d(e_1, \cdots, e_n) \prod_{j=1}^n A_{\sigma(j)j} = (\det A)\, d(e_1, \cdots, e_n),
\]
where the last equality uses $\det A = \det A^T$. So done.
We can rewrite the formula as
\[
  d(Ae_1, \cdots, Ae_n) = (\det A)\, d(e_1, \cdots, e_n).
\]
It is not hard to see that the same proof shows that for any $v_1, \cdots, v_n$, we have
\[
  d(Av_1, \cdots, Av_n) = (\det A)\, d(v_1, \cdots, v_n).
\]
So we know that $\det A$ is the volume rescaling factor of an arbitrary parallelepiped, and this is true for any volume form $d$.
Theorem. Let $A, B \in \mathrm{Mat}_n(\mathbb{F})$. Then $\det(AB) = \det(A)\det(B)$.
Proof. Let $d$ be a non-zero volume form on $\mathbb{F}^n$ (e.g. the determinant). Then we can compute
\[
  d(ABe_1, \cdots, ABe_n) = (\det AB)\, d(e_1, \cdots, e_n),
\]
but we also have
\[
  d(ABe_1, \cdots, ABe_n) = (\det A)\, d(Be_1, \cdots, Be_n) = (\det A)(\det B)\, d(e_1, \cdots, e_n).
\]
Since $d$ is non-zero, we must have $\det AB = \det A \det B$.
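A numerical spot check of multiplicativity (not a proof, of course), reusing `det_leibniz`; the helper `matmul` and the matrices are ad hoc choices of ours:

```python
def matmul(A, B):
    # (AB)_{ij} = sum_k A_{ik} B_{kj}, for square matrices as lists of rows
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 1, 1], [0, 1, 0], [5, 2, 2]]
print(det_leibniz(matmul(A, B)))        # -25
print(det_leibniz(A) * det_leibniz(B))  # -25, as the theorem predicts
```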
Corollary. If $A \in \mathrm{Mat}_n(\mathbb{F})$ is invertible, then $\det A \neq 0$. In fact, when $A$ is invertible, then $\det(A^{-1}) = (\det A)^{-1}$.
Proof. We have
\[
  1 = \det I = \det(AA^{-1}) = \det A \det A^{-1}.
\]
So done.
Definition (Singular matrices). A matrix $A$ is singular if $\det A = 0$. Otherwise, it is non-singular.
We have just shown that if $\det A = 0$, then $A$ is not invertible. Is the converse true? If $\det A \neq 0$, then can we conclude that $A$ is invertible? The answer is yes. We are now going to prove it in an abstract and clean way. We will later prove this fact again by constructing an explicit formula for the inverse, which involves dividing by the determinant. So if the determinant is non-zero, then we know an inverse exists.
Theorem. Let $A \in \mathrm{Mat}_n(\mathbb{F})$. Then the following are equivalent:

(i) $A$ is invertible.

(ii) $\det A \neq 0$.

(iii) $r(A) = n$.
Proof. We have proved that (i) $\Rightarrow$ (ii) above, and the rank-nullity theorem implies (iii) $\Rightarrow$ (i). We will prove (ii) $\Rightarrow$ (iii). In fact we will show the contrapositive. Suppose $r(A) < n$. By the rank-nullity theorem, $n(A) > 0$. So there is some
\[
  x = \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix}
\]
such that $Ax = 0$. Suppose $\lambda_k \neq 0$. We define $B$ as follows:
\[
  B =
  \begin{pmatrix}
    1 & & & \lambda_1 & & & \\
    & \ddots & & \vdots & & & \\
    & & 1 & \lambda_{k-1} & & & \\
    & & & \lambda_k & & & \\
    & & & \lambda_{k+1} & 1 & & \\
    & & & \vdots & & \ddots & \\
    & & & \lambda_n & & & 1
  \end{pmatrix},
\]
i.e. the identity matrix with the $k$th column replaced by $x$. So $AB$ has the $k$th column identically zero, since that column is $Ax = 0$. So $\det(AB) = 0$. So it is sufficient to prove that $\det(B) \neq 0$. But $\det B = \lambda_k \neq 0$. So done.
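For instance (an ad hoc example of ours), a matrix with a dependent row has rank $< n$ and, as the theorem demands, determinant zero:

```python
S = [[1, 2, 3],
     [2, 4, 6],   # twice the first row, so r(S) < 3
     [0, 1, 1]]
print(det_leibniz(S))  # 0
```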
We are now going to come up with an alternative formula for the determinant (which is probably the one you are familiar with). To do so, we introduce the following notation:

Notation. Write $\hat{A}_{ij}$ for the matrix obtained from $A$ by deleting the $i$th row and $j$th column.
Lemma. Let $A \in \mathrm{Mat}_n(\mathbb{F})$. Then

(i) We can expand $\det A$ along the $j$th column by
\[
  \det A = \sum_{i=1}^n (-1)^{i+j} A_{ij} \det \hat{A}_{ij}.
\]
(ii) We can expand $\det A$ along the $i$th row by
\[
  \det A = \sum_{j=1}^n (-1)^{i+j} A_{ij} \det \hat{A}_{ij}.
\]
We could prove this directly from the definition, but that is messy and scary,
so let’s use volume forms instead.
Proof. Since $\det A = \det A^T$, (i) and (ii) are equivalent. So it suffices to prove just one of them. We have
\[
  \det A = d(A^{(1)}, \cdots, A^{(n)}),
\]
where $d$ is the volume form induced by the determinant. Then we can write this as
\begin{align*}
  \det A &= d\left(A^{(1)}, \cdots, \sum_{i=1}^n A_{ij} e_i, \cdots, A^{(n)}\right) \\
  &= \sum_{i=1}^n A_{ij}\, d(A^{(1)}, \cdots, e_i, \cdots, A^{(n)}).
\end{align*}
The volume form on the right is the determinant of a matrix with the $j$th column replaced with $e_i$. We can move our columns and rows around so that our matrix becomes
\[
  B = \begin{pmatrix} \hat{A}_{ij} & 0 \\ \text{stuff} & 1 \end{pmatrix}.
\]
We get that $\det B = \det \hat{A}_{ij}$, since the only permutations that give a non-zero sum are those that send $n$ to $n$. In the row and column swapping, we have made $n - j$ column transpositions and $n - i$ row transpositions. So we have
\begin{align*}
  \det A &= \sum_{i=1}^n A_{ij} (-1)^{n-j} (-1)^{n-i} \det B \\
  &= \sum_{i=1}^n A_{ij} (-1)^{i+j} \det \hat{A}_{ij}.
\end{align*}
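Here is a sketch of the row expansion as a recursive Python function, expanding along the first row; `minor` plays the role of $\hat{A}_{ij}$, and with 0-indexing the sign $(-1)^{i+j}$ becomes $(-1)^j$ (the names are our own):

```python
def minor(A, i, j):
    # A-hat_{ij}: delete row i and column j (0-indexed)
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det_laplace(A):
    # Expand along the first row: det A = sum_j (-1)^j A[0][j] det(minor)
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det_laplace(minor(A, 0, j))
               for j in range(len(A)))

print(det_laplace([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```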
This is not only useful for computing determinants, but also for computing inverses.
Definition (Adjugate matrix). Let $A \in \mathrm{Mat}_n(\mathbb{F})$. The adjugate matrix of $A$, written $\mathrm{adj}\, A$, is the $n \times n$ matrix such that $(\mathrm{adj}\, A)_{ij} = (-1)^{i+j} \det \hat{A}_{ji}$.
The relevance is the following result:
Theorem. If $A \in \mathrm{Mat}_n(\mathbb{F})$, then $A(\mathrm{adj}\, A) = (\det A) I_n = (\mathrm{adj}\, A) A$. In particular, if $\det A \neq 0$, then
\[
  A^{-1} = \frac{1}{\det A} \mathrm{adj}\, A.
\]
Note that this is not an efficient way to compute the inverse.
Proof. We compute
\[
  [(\mathrm{adj}\, A)A]_{jk} = \sum_{i=1}^n (\mathrm{adj}\, A)_{ji} A_{ik} = \sum_{i=1}^n (-1)^{i+j} \det \hat{A}_{ij}\, A_{ik}. \tag{$*$}
\]
So if $j = k$, then $[(\mathrm{adj}\, A)A]_{jk} = \det A$ by the lemma.

Otherwise, if $j \neq k$, consider the matrix $B$ obtained from $A$ by replacing the $j$th column by the $k$th column. Then the right hand side of $(*)$ is just $\det B$ by the lemma. But we know that if two columns are the same, the determinant is zero. So the right hand side of $(*)$ is zero. So
\[
  [(\mathrm{adj}\, A)A]_{jk} = (\det A)\, \delta_{jk}.
\]
The calculation for $A\, \mathrm{adj}\, A = (\det A) I_n$ can be done in a similar manner, or by considering
\[
  (A\, \mathrm{adj}\, A)^T = (\mathrm{adj}\, A)^T A^T = (\mathrm{adj}(A^T)) A^T = (\det A) I_n.
\]
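A sketch of the adjugate and the resulting inverse in Python, reusing `det_laplace`, `minor` and `matmul` from above; `Fraction` keeps the arithmetic exact, and note the transposed indices coming from the $\hat{A}_{ji}$ in the definition:

```python
from fractions import Fraction

def adjugate(A):
    # (adj A)_{ij} = (-1)^{i+j} det(A-hat_{ji}): the minor at the
    # *transposed* position (j, i)
    n = len(A)
    return [[(-1) ** (i + j) * det_laplace(minor(A, j, i))
             for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]
adjA = adjugate(A)      # [[4, -2], [-3, 1]]
print(matmul(A, adjA))  # [[-2, 0], [0, -2]] = (det A) * I_2
d = det_laplace(A)      # -2
print([[Fraction(x, d) for x in row] for row in adjA])  # A^{-1}
```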
Note that the coefficients of $\mathrm{adj}\, A$ are just given by polynomials in the entries of $A$, and so is the determinant. So if $A$ is invertible, then its inverse is given by a rational function (i.e. a ratio of two polynomials) in the entries of $A$.

This is very useful theoretically, but not computationally, since the polynomials are very large. There are better ways computationally, such as Gaussian elimination.
We'll end with a useful trick to compute the determinant.
Lemma. Let $A, B$ be square matrices. Then for any $C$, we have
\[
  \det \begin{pmatrix} A & C \\ 0 & B \end{pmatrix} = (\det A)(\det B).
\]
Proof. Suppose $A \in \mathrm{Mat}_k(\mathbb{F})$ and $B \in \mathrm{Mat}_\ell(\mathbb{F})$, so $C \in \mathrm{Mat}_{k,\ell}(\mathbb{F})$. Let
\[
  X = \begin{pmatrix} A & C \\ 0 & B \end{pmatrix}.
\]
Then by definition, we have
\[
  \det X = \sum_{\sigma \in S_{k+\ell}} \varepsilon(\sigma) \prod_{i=1}^{k+\ell} X_{i\sigma(i)}.
\]
If $j \leq k$ and $i > k$, then $X_{ij} = 0$. We only want to sum over permutations $\sigma$ such that $\sigma(i) > k$ if $i > k$. So we are permuting the last $\ell$ things among themselves, and hence the first $k$ things among themselves. So we can decompose $\sigma = \sigma_1 \sigma_2$, where $\sigma_1$ is a permutation of $\{1, \cdots, k\}$ and fixes the remaining things, while $\sigma_2$ fixes $\{1, \cdots, k\}$ and permutes the remaining. Then
\begin{align*}
  \det X &= \sum_{\sigma = \sigma_1 \sigma_2} \varepsilon(\sigma_1 \sigma_2) \prod_{i=1}^k X_{i\sigma_1(i)} \prod_{j=1}^\ell X_{k+j,\, \sigma_2(k+j)} \\
  &= \left(\sum_{\sigma_1 \in S_k} \varepsilon(\sigma_1) \prod_{i=1}^k A_{i\sigma_1(i)}\right) \left(\sum_{\sigma_2 \in S_\ell} \varepsilon(\sigma_2) \prod_{j=1}^\ell B_{j\sigma_2(j)}\right) \\
  &= (\det A)(\det B).
\end{align*}
Corollary.
\[
  \det
  \begin{pmatrix}
    A_1 & & & \text{stuff} \\
    & A_2 & & \\
    & & \ddots & \\
    0 & & & A_n
  \end{pmatrix}
  = \prod_{i=1}^n \det A_i.
\]
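A quick check of the lemma with explicit $2 \times 2$ blocks (our own choice of matrices), reusing `det_leibniz`:

```python
A = [[1, 2], [3, 4]]           # det A = -2
B = [[5, 6], [7, 9]]           # det B = 3
X = [[1, 2, 1, 1],             # X = [[A, C], [0, B]] with C all ones;
     [3, 4, 1, 1],             # C should not affect det X at all
     [0, 0, 5, 6],
     [0, 0, 7, 9]]
print(det_leibniz(X))                   # -6
print(det_leibniz(A) * det_leibniz(B))  # -6
```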