IB Linear Algebra

5 Determinants of matrices

We probably all know what the determinant is. Here we are going to give a slightly more abstract definition, and spend quite a lot of time trying to motivate this definition.

Recall that $S_n$ is the group of permutations of $\{1, \cdots, n\}$, and there is a unique group homomorphism $\varepsilon: S_n \to \{\pm 1\}$ such that $\varepsilon(\sigma) = 1$ if $\sigma$ can be written as a product of an even number of transpositions, and $\varepsilon(\sigma) = -1$ if $\sigma$ can be written as a product of an odd number of transpositions. It is proved in IA Groups that this is well-defined.

Definition (Determinant). Let $A \in \mathrm{Mat}_{n,n}(\mathbb{F})$. Its determinant is
\[
  \det A = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{i=1}^n A_{i\sigma(i)}.
\]

This is a big scary definition. Hence, we will spend the first half of the

chapter trying to understand what this really means, and how it behaves. We

will eventually prove a formula that is useful for computing the determinant,

which is probably how you were first exposed to the determinant.

Example. If $n = 2$, then $S_2 = \{\mathrm{id}, (1\ 2)\}$. So
\[
  \det A = A_{11}A_{22} - A_{12}A_{21}.
\]
When $n = 3$, then $S_3$ has 6 elements, and
\[
  \det A = A_{11}A_{22}A_{33} + A_{12}A_{23}A_{31} + A_{13}A_{21}A_{32} - A_{11}A_{23}A_{32} - A_{22}A_{31}A_{13} - A_{33}A_{12}A_{21}.
\]
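The permutation-sum definition translates directly into code, which makes a handy sanity check. The following is an illustrative brute-force sketch (the helper names are my own; the algorithm is exponential in $n$, so it is only sensible for small matrices):

```python
from itertools import permutations
from math import prod

def sign(p):
    """Sign of a permutation of (0, ..., n-1), via the parity of its inversion count."""
    n = len(p)
    inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
    return -1 if inversions % 2 else 1

def det(A):
    """Determinant straight from the definition:
    sum over all sigma of sign(sigma) * prod_i A[i][sigma(i)]."""
    n = len(A)
    return sum(sign(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

# Matches the n = 2 formula A11*A22 - A12*A21:
assert det([[1, 2], [3, 4]]) == 1 * 4 - 2 * 3
```

For $n = 2$ and $n = 3$ this reproduces exactly the expansions in the example above.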

We will first prove a few easy and useful lemmas about the determinant.

Lemma. $\det A = \det A^T$.

Proof.
\[
  \det A^T = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{i=1}^n A_{\sigma(i)i}
           = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{j=1}^n A_{j\sigma^{-1}(j)}
           = \sum_{\tau \in S_n} \varepsilon(\tau^{-1}) \prod_{j=1}^n A_{j\tau(j)}.
\]
Since $\varepsilon(\tau) = \varepsilon(\tau^{-1})$, we get
\[
  \det A^T = \sum_{\tau \in S_n} \varepsilon(\tau) \prod_{j=1}^n A_{j\tau(j)} = \det A.
\]

Lemma. If $A$ is an upper triangular matrix, i.e.
\[
  A = \begin{pmatrix}
    a_{11} & a_{12} & \cdots & a_{1n} \\
    0 & a_{22} & \cdots & a_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & a_{nn}
  \end{pmatrix},
\]
then
\[
  \det A = \prod_{i=1}^n a_{ii}.
\]

Proof. We have
\[
  \det A = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{i=1}^n A_{i\sigma(i)}.
\]
But $A_{i\sigma(i)} = 0$ whenever $i > \sigma(i)$. So
\[
  \prod_{i=1}^n A_{i\sigma(i)} = 0
\]
if there is some $i \in \{1, \cdots, n\}$ such that $i > \sigma(i)$.

However, the only permutation with $i \leq \sigma(i)$ for all $i$ is the identity. So the only term that contributes to the sum is $\sigma = \mathrm{id}$. So
\[
  \det A = \prod_{i=1}^n A_{ii}.
\]
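Both of the lemmas above are easy to spot-check with a brute-force implementation of the defining formula (an illustrative sketch, exact over the integers; the helper names are my own):

```python
from itertools import permutations
from math import prod

def sgn(p):
    """Sign of a permutation, via the parity of its inversion count."""
    n = len(p)
    return (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))

def det(A):
    """Determinant straight from the permutation-sum definition."""
    n = len(A)
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def transpose(A):
    return [list(row) for row in zip(*A)]

A = [[2, 1, 5], [0, 3, 7], [4, 2, 6]]
assert det(A) == det(transpose(A))     # det A = det A^T

U = [[2, 5, 7], [0, 3, 9], [0, 0, 4]]  # upper triangular
assert det(U) == 2 * 3 * 4             # product of the diagonal entries
```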

To motivate this definition, we need a notion of volume. How can we define

volume on a vector space? It should be clear that the “volume” cannot be

uniquely determined, since it depends on what units we are using. For example,

saying the volume is “1” is meaningless unless we provide the units, e.g. cm$^3$. So we have an axiomatic definition for what it means for something to denote a “volume”.

Definition (Volume form). A volume form on $\mathbb{F}^n$ is a function $d: \mathbb{F}^n \times \cdots \times \mathbb{F}^n \to \mathbb{F}$ that is

(i) Multilinear, i.e. for all $i$ and all $v_1, \cdots, v_{i-1}, v_{i+1}, \cdots, v_n \in \mathbb{F}^n$, we have
\[
  d(v_1, \cdots, v_{i-1}, \,\cdot\,, v_{i+1}, \cdots, v_n) \in (\mathbb{F}^n)^*.
\]

(ii) Alternating, i.e. if $v_i = v_j$ for some $i \neq j$, then
\[
  d(v_1, \cdots, v_n) = 0.
\]

We should think of $d(v_1, \cdots, v_n)$ as the $n$-dimensional volume of the parallelopiped spanned by $v_1, \cdots, v_n$.

We can view $A \in \mathrm{Mat}_n(\mathbb{F})$ as $n$ vectors in $\mathbb{F}^n$ by considering its columns $A = (A^{(1)}\ A^{(2)}\ \cdots\ A^{(n)})$, with $A^{(i)} \in \mathbb{F}^n$. Then we have

Lemma. det A is a volume form.

Proof. To see that det is multilinear, it is sufficient to show that each
\[
  \prod_{i=1}^n A_{i\sigma(i)}
\]
is multilinear for all $\sigma \in S_n$, since linear combinations of multilinear forms are multilinear. But each such product contains precisely one entry from each column, and so is multilinear.

To show it is alternating, suppose now there are some $k, \ell$ distinct such that $A^{(k)} = A^{(\ell)}$. We let $\tau$ be the transposition $(k\ \ell)$. By Lagrange’s theorem, we can write
\[
  S_n = A_n \amalg \tau A_n,
\]
where $A_n = \ker \varepsilon$ and $\amalg$ denotes the disjoint union. We also know that
\[
  \sum_{\sigma \in A_n} \prod_{i=1}^n A_{i\sigma(i)} = \sum_{\sigma \in A_n} \prod_{i=1}^n A_{i\tau\sigma(i)},
\]
since if $\sigma(i)$ is not $k$ or $\ell$, then $\tau$ does nothing; if $\sigma(i)$ is $k$ or $\ell$, then $\tau$ just swaps them around, but $A^{(k)} = A^{(\ell)}$. So we get
\[
  \sum_{\sigma \in A_n} \prod_{i=1}^n A_{i\sigma(i)} = \sum_{\sigma' \in \tau A_n} \prod_{i=1}^n A_{i\sigma'(i)}.
\]
But since $\varepsilon = 1$ on $A_n$ and $\varepsilon = -1$ on $\tau A_n$, we know that
\[
  \det A = \mathrm{LHS} - \mathrm{RHS} = 0.
\]
So done.

We have shown that determinants are volume forms, but is this the only volume form? Well obviously not, since $2 \det A$ is also a valid volume form. However, in some sense, all volume forms are “derived” from the determinant. Before we show that, we need the following

Lemma. Let $d$ be a volume form on $\mathbb{F}^n$. Then swapping two entries changes the sign, i.e.
\[
  d(v_1, \cdots, v_i, \cdots, v_j, \cdots, v_n) = -d(v_1, \cdots, v_j, \cdots, v_i, \cdots, v_n).
\]

Proof. By linearity, we have
\begin{align*}
  0 &= d(v_1, \cdots, v_i + v_j, \cdots, v_i + v_j, \cdots, v_n) \\
    &= d(v_1, \cdots, v_i, \cdots, v_i, \cdots, v_n) + d(v_1, \cdots, v_i, \cdots, v_j, \cdots, v_n) \\
    &\quad + d(v_1, \cdots, v_j, \cdots, v_i, \cdots, v_n) + d(v_1, \cdots, v_j, \cdots, v_j, \cdots, v_n) \\
    &= d(v_1, \cdots, v_i, \cdots, v_j, \cdots, v_n) + d(v_1, \cdots, v_j, \cdots, v_i, \cdots, v_n).
\end{align*}
So done.

Corollary. If $\sigma \in S_n$, then
\[
  d(v_{\sigma(1)}, \cdots, v_{\sigma(n)}) = \varepsilon(\sigma)\, d(v_1, \cdots, v_n)
\]
for any $v_i \in \mathbb{F}^n$.

Theorem. Let $d$ be any volume form on $\mathbb{F}^n$, and let $A = (A^{(1)} \cdots A^{(n)}) \in \mathrm{Mat}_n(\mathbb{F})$. Then
\[
  d(A^{(1)}, \cdots, A^{(n)}) = (\det A)\, d(e_1, \cdots, e_n),
\]
where $\{e_1, \cdots, e_n\}$ is the standard basis.

Proof. We can compute
\begin{align*}
  d(A^{(1)}, \cdots, A^{(n)}) &= d\left(\sum_{i=1}^n A_{i1} e_i,\ A^{(2)}, \cdots, A^{(n)}\right) \\
  &= \sum_{i=1}^n A_{i1}\, d(e_i, A^{(2)}, \cdots, A^{(n)}) \\
  &= \sum_{i,j=1}^n A_{i1} A_{j2}\, d(e_i, e_j, A^{(3)}, \cdots, A^{(n)}) \\
  &= \sum_{i_1, \cdots, i_n} d(e_{i_1}, \cdots, e_{i_n}) \prod_{j=1}^n A_{i_j j}.
\end{align*}

We know that lots of these terms are zero, since if $i_k = i_j$ for some $k \neq j$, then the term vanishes. So we are just summing over tuples of distinct indices, i.e. those for which there is some $\sigma \in S_n$ such that $i_j = \sigma(j)$. So we get
\[
  d(A^{(1)}, \cdots, A^{(n)}) = \sum_{\sigma \in S_n} d(e_{\sigma(1)}, \cdots, e_{\sigma(n)}) \prod_{j=1}^n A_{\sigma(j)j}.
\]

However, by our corollary up there, this is just
\[
  d(A^{(1)}, \cdots, A^{(n)}) = \sum_{\sigma \in S_n} \varepsilon(\sigma)\, d(e_1, \cdots, e_n) \prod_{j=1}^n A_{\sigma(j)j} = (\det A)\, d(e_1, \cdots, e_n),
\]
where the last step uses $\det A = \det A^T$. So done.

We can rewrite the formula as
\[
  d(Ae_1, \cdots, Ae_n) = (\det A)\, d(e_1, \cdots, e_n).
\]

It is not hard to see that the same proof shows that for any $v_1, \cdots, v_n$, we have
\[
  d(Av_1, \cdots, Av_n) = (\det A)\, d(v_1, \cdots, v_n).
\]
So we know that $\det A$ is the volume rescaling factor of an arbitrary parallelopiped, and this is true for any volume form $d$.

Theorem. Let $A, B \in \mathrm{Mat}_n(\mathbb{F})$. Then $\det(AB) = \det(A)\det(B)$.

Proof. Let $d$ be a non-zero volume form on $\mathbb{F}^n$ (e.g. the “determinant”). Then we can compute
\[
  d(ABe_1, \cdots, ABe_n) = (\det AB)\, d(e_1, \cdots, e_n),
\]
but we also have
\[
  d(ABe_1, \cdots, ABe_n) = (\det A)\, d(Be_1, \cdots, Be_n) = (\det A)(\det B)\, d(e_1, \cdots, e_n).
\]
Since $d$ is non-zero, we must have $\det AB = \det A \det B$.
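Multiplicativity can be checked exactly over the integers with a brute-force determinant (an illustrative sketch; the helper names are my own):

```python
from itertools import permutations
from math import prod

def sgn(p):
    """Sign of a permutation, via the parity of its inversion count."""
    n = len(p)
    return (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))

def det(A):
    """Determinant from the permutation-sum definition."""
    n = len(A)
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0], [3, 1, 4], [2, 0, 1]]
B = [[0, 1, 2], [1, 3, 0], [2, 2, 1]]
# det(AB) = det(A) det(B), exactly (integer arithmetic):
assert det(matmul(A, B)) == det(A) * det(B)
```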

Corollary. If $A \in \mathrm{Mat}_n(\mathbb{F})$ is invertible, then $\det A \neq 0$. In fact, when $A$ is invertible, then $\det(A^{-1}) = (\det A)^{-1}$.

Proof. We have
\[
  1 = \det I = \det(AA^{-1}) = \det A \det A^{-1}.
\]
So done.

Definition (Singular matrices). A matrix $A$ is singular if $\det A = 0$. Otherwise, it is non-singular.

We have just shown that if $\det A = 0$, then $A$ is not invertible. Is the converse true? If $\det A \neq 0$, then can we conclude that $A$ is invertible? The answer is yes. We are now going to prove it in an abstract and clean way. We will later prove this fact again by constructing an explicit formula for the inverse, which involves dividing by the determinant. So if the determinant is non-zero, then we know an inverse exists.

Theorem. Let $A \in \mathrm{Mat}_n(\mathbb{F})$. Then the following are equivalent:

(i) $A$ is invertible.

(ii) $\det A \neq 0$.

(iii) $r(A) = n$.

Proof. We have proved that (i) $\Rightarrow$ (ii) above, and the rank-nullity theorem implies (iii) $\Rightarrow$ (i). We will prove (ii) $\Rightarrow$ (iii). In fact we will show the contrapositive. Suppose $r(A) < n$. By the rank-nullity theorem, $n(A) > 0$. So there is some non-zero
\[
  x = \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix}
\]
such that $Ax = 0$. Suppose $\lambda_k \neq 0$. We define $B$ as follows:
\[
  B = \begin{pmatrix}
    1 & & & \lambda_1 & & & \\
    & \ddots & & \vdots & & & \\
    & & 1 & \lambda_{k-1} & & & \\
    & & & \lambda_k & & & \\
    & & & \lambda_{k+1} & 1 & & \\
    & & & \vdots & & \ddots & \\
    & & & \lambda_n & & & 1
  \end{pmatrix},
\]
i.e. the identity matrix with the $k$th column replaced by $x$. So $AB$ has the $k$th column identically zero. So $\det(AB) = 0$. So it is sufficient to prove that $\det B \neq 0$. But $\det B = \lambda_k \neq 0$. So done.
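As a quick numerical illustration of (ii) $\Leftrightarrow$ (iii): a matrix with linearly dependent rows (equivalently, columns, since $\det A = \det A^T$) has rank less than $n$ and determinant zero. A sketch (helper names are my own):

```python
from itertools import permutations
from math import prod

def det(A):
    """Brute-force determinant from the permutation-sum definition."""
    n = len(A)
    sgn = lambda p: (-1) ** sum(p[i] > p[j]
                                for i in range(n) for j in range(i + 1, n))
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

# The second row is twice the first, so r(A) < 3 and A is singular:
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
assert det(A) == 0
```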

We are now going to come up with an alternative formula for the determinant

(which is probably the one you are familiar with). To do so, we introduce the

following notation:

Notation. Write $\hat{A}_{ij}$ for the matrix obtained from $A$ by deleting the $i$th row and $j$th column.

Lemma. Let $A \in \mathrm{Mat}_n(\mathbb{F})$. Then

(i) We can expand $\det A$ along the $j$th column by
\[
  \det A = \sum_{i=1}^n (-1)^{i+j} A_{ij} \det \hat{A}_{ij}.
\]

(ii) We can expand $\det A$ along the $i$th row by
\[
  \det A = \sum_{j=1}^n (-1)^{i+j} A_{ij} \det \hat{A}_{ij}.
\]

We could prove this directly from the definition, but that is messy and scary,

so let’s use volume forms instead.

Proof. Since $\det A = \det A^T$, (i) and (ii) are equivalent. So it suffices to prove just one of them. We have
\[
  \det A = d(A^{(1)}, \cdots, A^{(n)}),
\]
where $d$ is the volume form induced by the determinant. Then we can write this as
\begin{align*}
  \det A &= d\left(A^{(1)}, \cdots, \sum_{i=1}^n A_{ij} e_i, \cdots, A^{(n)}\right) \\
  &= \sum_{i=1}^n A_{ij}\, d(A^{(1)}, \cdots, e_i, \cdots, A^{(n)}).
\end{align*}
The volume form on the right is the determinant of a matrix with the $j$th column replaced with $e_i$. We can move our rows and columns around so that the matrix becomes
\[
  B = \begin{pmatrix} \hat{A}_{ij} & 0 \\ \text{stuff} & 1 \end{pmatrix}.
\]
We get that $\det B = \det \hat{A}_{ij}$, since the only permutations that give a non-zero sum are those that send $n$ to $n$. In the row and column swapping, we have made $n - j$ column transpositions and $n - i$ row transpositions. So we have
\begin{align*}
  \det A &= \sum_{i=1}^n A_{ij} (-1)^{n-j} (-1)^{n-i} \det B \\
  &= \sum_{i=1}^n A_{ij} (-1)^{i+j} \det \hat{A}_{ij}.
\end{align*}
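The row expansion in (ii) gives the familiar recursive algorithm, here expanding along the first row (so $i = 1$, which becomes index 0 below; still exponential time, but this is the usual hand-computation method):

```python
def det(A):
    """Laplace expansion along the first row:
    det A = sum_j (-1)^j * A[0][j] * det(minor), 0-indexed."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in A[1:]]  # delete row 0, column j
        total += (-1) ** j * A[0][j] * det(minor)
    return total

assert det([[1, 2], [3, 4]]) == -2
```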

This is not only useful for computing determinants, but also computing

inverses.

Definition (Adjugate matrix). Let $A \in \mathrm{Mat}_n(\mathbb{F})$. The adjugate matrix of $A$, written $\operatorname{adj} A$, is the $n \times n$ matrix such that $(\operatorname{adj} A)_{ij} = (-1)^{i+j} \det \hat{A}_{ji}$.

The relevance is the following result:

Theorem. If $A \in \mathrm{Mat}_n(\mathbb{F})$, then $A(\operatorname{adj} A) = (\det A) I_n = (\operatorname{adj} A)A$. In particular, if $\det A \neq 0$, then
\[
  A^{-1} = \frac{1}{\det A} \operatorname{adj} A.
\]

Note that this is not an efficient way to compute the inverse.

Proof. We compute
\[
  [(\operatorname{adj} A)A]_{jk} = \sum_{i=1}^n (\operatorname{adj} A)_{ji} A_{ik} = \sum_{i=1}^n (-1)^{i+j} \det(\hat{A}_{ij})\, A_{ik}. \tag{$*$}
\]
So if $j = k$, then $[(\operatorname{adj} A)A]_{jk} = \det A$ by the lemma.

Otherwise, if $j \neq k$, consider the matrix $B$ obtained from $A$ by replacing the $j$th column by the $k$th column. Then the right hand side of $(*)$ is just $\det B$ by the lemma. But we know that if two columns are the same, the determinant is zero. So the right hand side of $(*)$ is zero. So
\[
  [(\operatorname{adj} A)A]_{jk} = (\det A)\, \delta_{jk}.
\]
The calculation for $A \operatorname{adj} A = (\det A) I_n$ can be done in a similar manner, or by considering $(A \operatorname{adj} A)^T = (\operatorname{adj} A)^T A^T = (\operatorname{adj}(A^T)) A^T = (\det A) I_n$.
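The identity $A(\operatorname{adj} A) = (\det A) I_n$ holds exactly over the integers, so it can be verified directly (an illustrative sketch; note the transposed indices in the adjugate, as in the definition above):

```python
def det(A):
    """Determinant by Laplace expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j+1:] for r in A[1:]])
               for j in range(n))

def adj(A):
    """Adjugate: (adj A)[i][j] = (-1)^(i+j) * det of A with row j and
    column i deleted (transposed indices, per the definition)."""
    n = len(A)
    return [[(-1) ** (i + j)
             * det([r[:i] + r[i+1:] for k, r in enumerate(A) if k != j])
             for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0], [3, 1, 4], [2, 0, 1]]
d = det(A)
# A (adj A) = (det A) I_n, exactly:
assert matmul(A, adj(A)) == [[d if i == j else 0 for j in range(3)]
                             for i in range(3)]
```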

Note that the coefficients of $\operatorname{adj} A$ are just given by polynomials in the entries of $A$, and so is the determinant. So if $A$ is invertible, then its inverse is given by a rational function (i.e. a ratio of two polynomials) in the entries of $A$.

This is very useful theoretically, but not computationally, since the polynomials are very large. There are better ways computationally, such as Gaussian elimination.

We’ll end with a useful trick to compute the determinant.

Lemma. Let $A, B$ be square matrices. Then for any $C$, we have
\[
  \det \begin{pmatrix} A & C \\ 0 & B \end{pmatrix} = (\det A)(\det B).
\]

Proof. Suppose $A \in \mathrm{Mat}_k(\mathbb{F})$ and $B \in \mathrm{Mat}_\ell(\mathbb{F})$, so $C \in \mathrm{Mat}_{k,\ell}(\mathbb{F})$. Let
\[
  X = \begin{pmatrix} A & C \\ 0 & B \end{pmatrix}.
\]

Then by definition, we have
\[
  \det X = \sum_{\sigma \in S_{k+\ell}} \varepsilon(\sigma) \prod_{i=1}^{k+\ell} X_{i\sigma(i)}.
\]

If $j \leq k$ and $i > k$, then $X_{ij} = 0$. So we only want to sum over permutations $\sigma$ such that $\sigma(i) > k$ whenever $i > k$. So we are permuting the last $\ell$ things among themselves, and hence the first $k$ things among themselves. So we can decompose $\sigma = \sigma_1 \sigma_2$, where $\sigma_1$ is a permutation of $\{1, \cdots, k\}$ and fixes the remaining things, while $\sigma_2$ fixes $\{1, \cdots, k\}$ and permutes the remaining. Then

\begin{align*}
  \det X &= \sum_{\sigma = \sigma_1 \sigma_2} \varepsilon(\sigma_1 \sigma_2) \prod_{i=1}^{k} X_{i\sigma_1(i)} \prod_{j=1}^{\ell} X_{(k+j)\,\sigma_2(k+j)} \\
  &= \left(\sum_{\sigma_1 \in S_k} \varepsilon(\sigma_1) \prod_{i=1}^{k} A_{i\sigma_1(i)}\right) \left(\sum_{\sigma_2 \in S_\ell} \varepsilon(\sigma_2) \prod_{j=1}^{\ell} B_{j\sigma_2(j)}\right) \\
  &= (\det A)(\det B).
\end{align*}

Corollary.
\[
  \det \begin{pmatrix}
    A_1 & & & \text{stuff} \\
    & A_2 & & \\
    & & \ddots & \\
    0 & & & A_n
  \end{pmatrix} = \prod_{i=1}^n \det A_i.
\]
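A quick sanity check of the block-triangular lemma, assembling a $4 \times 4$ example by hand (illustrative, exact integer arithmetic):

```python
def det(A):
    """Determinant by Laplace expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j+1:] for r in A[1:]])
               for j in range(n))

A = [[1, 2], [3, 4]]  # det A = -2
B = [[2, 1], [1, 3]]  # det B = 5

# X = [[A, C], [0, B]] written out as a 4x4 matrix (top-right block arbitrary):
X = [[1, 2, 5, 6],
     [3, 4, 7, 8],
     [0, 0, 2, 1],
     [0, 0, 1, 3]]
assert det(X) == det(A) * det(B)
```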