3 Linear maps

IA Vectors and Matrices

3.4 Matrices

In the examples above, we have represented our linear maps by some object $R$ such that $x_i' = R_{ij}x_j$. We call $R$ the matrix for the linear map. In general, let $\alpha: \mathbb{R}^n \to \mathbb{R}^m$ be a linear map, and $\mathbf{x}' = \alpha(\mathbf{x})$.

Let $\{\mathbf{e}_i\}$ be a basis of $\mathbb{R}^n$. Then $\mathbf{x} = x_j \mathbf{e}_j$ for some $x_j$. Then we get
$$\mathbf{x}' = \alpha(x_j \mathbf{e}_j) = x_j \alpha(\mathbf{e}_j).$$
So we get that $x_i' = [\alpha(\mathbf{e}_j)]_i x_j$. We now define $A_{ij} = [\alpha(\mathbf{e}_j)]_i$. Then $x_i' = A_{ij} x_j$. We write
$$A = \{A_{ij}\} = \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & A_{ij} & \vdots \\ A_{m1} & \cdots & A_{mn} \end{pmatrix}$$

Here $A_{ij}$ is the entry in the $i$th row and $j$th column. We say that $A$ is an $m \times n$ matrix, and write $\mathbf{x}' = A\mathbf{x}$.

We see that the columns of the matrix are the images of the standard basis vectors under the mapping $\alpha$.
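This observation gives a direct recipe for computing the matrix of any linear map: apply the map to each standard basis vector and use the images as columns. A minimal NumPy sketch (the map `alpha` below, a rotation of the plane by $90°$, is just an illustrative choice, not taken from the text):

```python
import numpy as np

# An illustrative linear map alpha: R^2 -> R^2 (rotation by 90 degrees);
# any linear map would serve equally well here.
def alpha(x):
    return np.array([-x[1], x[0]])

# The j-th column of A is the image alpha(e_j) of the j-th basis vector.
A = np.column_stack([alpha(e) for e in np.eye(2)])

# For every x, alpha(x) = A x.
x = np.array([3.0, 4.0])
assert np.allclose(A @ x, alpha(x))
```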

3.4.1 Examples

Example.

(i) In $\mathbb{R}^2$, consider a reflection in a line at an angle $\theta$ to the $x$-axis. We know that $\hat{\mathbf{i}} \mapsto \cos 2\theta\, \hat{\mathbf{i}} + \sin 2\theta\, \hat{\mathbf{j}}$, with $\hat{\mathbf{j}} \mapsto -\cos 2\theta\, \hat{\mathbf{j}} + \sin 2\theta\, \hat{\mathbf{i}}$. Then the matrix is
$$\begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix}.$$

(ii) In $\mathbb{R}^3$, as we've previously seen, a rotation by $\theta$ about the $z$ axis is given by
$$R = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

(iii) In $\mathbb{R}^3$, a reflection in the plane with normal $\hat{\mathbf{n}}$ is given by $R_{ij} = \delta_{ij} - 2\hat{n}_i \hat{n}_j$. Written as a matrix, we have
$$\begin{pmatrix} 1 - 2\hat{n}_1^2 & -2\hat{n}_1\hat{n}_2 & -2\hat{n}_1\hat{n}_3 \\ -2\hat{n}_2\hat{n}_1 & 1 - 2\hat{n}_2^2 & -2\hat{n}_2\hat{n}_3 \\ -2\hat{n}_3\hat{n}_1 & -2\hat{n}_3\hat{n}_2 & 1 - 2\hat{n}_3^2 \end{pmatrix}$$

(iv) Dilation ("stretching") $\alpha: \mathbb{R}^3 \to \mathbb{R}^3$ is given by a map $(x, y, z) \mapsto (\lambda x, \mu y, \nu z)$ for some $\lambda, \mu, \nu$. The matrix is
$$\begin{pmatrix} \lambda & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & \nu \end{pmatrix}$$

(v) Shear: Consider $S: \mathbb{R}^3 \to \mathbb{R}^3$ that shears in the $x$ direction:

(figure: a shear in the $x$ direction, taking $x$ to $x'$)

We have $(x, y, z) \mapsto (x + \lambda y, y, z)$. Then
$$S = \begin{pmatrix} 1 & \lambda & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
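The shear matrix in (v) can be checked directly against the formula $(x, y, z) \mapsto (x + \lambda y, y, z)$; a small NumPy sketch (the value of $\lambda$ and the test vector are arbitrary):

```python
import numpy as np

lam = 0.5  # shear parameter; arbitrary illustrative value
S = np.array([[1.0, lam, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# S sends (x, y, z) to (x + lam*y, y, z): only x changes, by lam*y.
v = np.array([2.0, 4.0, 1.0])
assert np.allclose(S @ v, [2.0 + lam * 4.0, 4.0, 1.0])
```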

3.4.2 Matrix Algebra

This section consists mostly of definitions, saying what we can do with matrices and classifying them into different types.

Definition (Addition of matrices). Consider two linear maps $\alpha, \beta: \mathbb{R}^n \to \mathbb{R}^m$. The sum of $\alpha$ and $\beta$ is defined by
$$(\alpha + \beta)(\mathbf{x}) = \alpha(\mathbf{x}) + \beta(\mathbf{x})$$
In terms of the matrices, we have
$$(A + B)_{ij} x_j = A_{ij} x_j + B_{ij} x_j,$$
or $(A + B)_{ij} = A_{ij} + B_{ij}$.

Definition (Scalar multiplication of matrices). Define $(\lambda\alpha)(\mathbf{x}) = \lambda[\alpha(\mathbf{x})]$. So $(\lambda A)_{ij} = \lambda A_{ij}$.

Definition (Matrix multiplication). Consider maps $\alpha: \mathbb{R}^\ell \to \mathbb{R}^n$ and $\beta: \mathbb{R}^n \to \mathbb{R}^m$. The composition is $\beta\alpha: \mathbb{R}^\ell \to \mathbb{R}^m$. Take $\mathbf{x} \in \mathbb{R}^\ell \mapsto \mathbf{x}'' \in \mathbb{R}^m$. Then $\mathbf{x}'' = (BA)\mathbf{x} = B\mathbf{x}'$, where $\mathbf{x}' = A\mathbf{x}$. Using suffix notation, we have
$$x_i'' = (B\mathbf{x}')_i = B_{ik} x_k' = B_{ik} A_{kj} x_j.$$
But $x_i'' = (BA)_{ij} x_j$. So
$$(BA)_{ij} = B_{ik} A_{kj}.$$

Generally, an $m \times n$ matrix multiplied by an $n \times \ell$ matrix gives an $m \times \ell$ matrix. $(BA)_{ij}$ is given by the $i$th row of $B$ dotted with the $j$th column of $A$.

Note that the number of columns of $B$ has to be equal to the number of rows of $A$ for multiplication to be defined. If $\ell = m$ as well, then both $BA$ and $AB$ make sense, but $AB \neq BA$ in general. In fact, they don't even have to have the same dimensions.

Also, since function composition is associative, we get $A(BC) = (AB)C$.
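These multiplication facts — the suffix-notation product $(BA)_{ij} = B_{ik}A_{kj}$, associativity, and non-commutativity — can all be illustrated numerically. A NumPy sketch (the random matrices are arbitrary test data):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))

# (BA)_ij = B_ik A_kj: explicit index contraction agrees with the product.
assert np.allclose(np.einsum('ik,kj->ij', B, A), B @ A)

# Composition is associative ...
assert np.allclose(A @ (B @ C), (A @ B) @ C)

# ... but not commutative in general.
assert not np.allclose(A @ B, B @ A)
```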

Definition (Transpose of matrix). If $A$ is an $m \times n$ matrix, the transpose $A^T$ is an $n \times m$ matrix defined by $(A^T)_{ij} = A_{ji}$.

Proposition.

(i) $(A^T)^T = A$.

(ii) If $\mathbf{x}$ is a column vector $\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$, then $\mathbf{x}^T$ is a row vector $(x_1\; x_2\; \cdots\; x_n)$.

(iii) $(AB)^T = B^T A^T$, since $((AB)^T)_{ij} = (AB)_{ji} = A_{jk} B_{ki} = B_{ki} A_{jk} = (B^T)_{ik} (A^T)_{kj} = (B^T A^T)_{ij}$.
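Part (iii), including the dimension bookkeeping, can be checked numerically; a NumPy sketch with arbitrary rectangular matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))  # 2x3
B = rng.standard_normal((3, 4))  # 3x4

# (AB)^T = B^T A^T; note the dimensions: (AB)^T is 4x2, as is B^T A^T.
assert np.allclose((A @ B).T, B.T @ A.T)
```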

Definition (Hermitian conjugate). Define $A^\dagger = (A^T)^*$. Similarly, $(AB)^\dagger = B^\dagger A^\dagger$.

Definition (Symmetric matrix). A matrix is symmetric if $A^T = A$.

Definition (Hermitian matrix). A matrix is Hermitian if $A^\dagger = A$. (The diagonal entries of a Hermitian matrix must be real.)

Definition (Anti/skew symmetric matrix). A matrix is anti-symmetric or skew symmetric if $A^T = -A$. The diagonal entries are all zero.

Definition (Skew-Hermitian matrix). A matrix is skew-Hermitian if $A^\dagger = -A$. The diagonal entries are purely imaginary.

Definition (Trace of matrix). The trace of an $n \times n$ matrix $A$ is the sum of the diagonal: $\operatorname{tr}(A) = A_{ii}$.

Example. Consider the reflection matrix $R_{ij} = \delta_{ij} - 2\hat{n}_i \hat{n}_j$. We have $\operatorname{tr}(R) = R_{ii} = 3 - 2\hat{\mathbf{n}} \cdot \hat{\mathbf{n}} = 3 - 2 = 1$.

Proposition. $\operatorname{tr}(BC) = \operatorname{tr}(CB)$.

Proof. $\operatorname{tr}(BC) = B_{ik} C_{ki} = C_{ki} B_{ik} = (CB)_{kk} = \operatorname{tr}(CB)$.
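The cyclic property holds even for non-square $B$ and $C$, where $BC$ and $CB$ have different sizes; a NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((2, 5))
C = rng.standard_normal((5, 2))

# tr(BC) = tr(CB) even though BC is 2x2 while CB is 5x5.
assert np.isclose(np.trace(B @ C), np.trace(C @ B))
```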

Definition (Identity matrix). The identity matrix $I$ is defined by $I_{ij} = \delta_{ij}$.

3.4.3 Decomposition of an n × n matrix

Any $n \times n$ matrix $B$ can be split as a sum of symmetric and antisymmetric parts. Write
$$B_{ij} = \underbrace{\frac{1}{2}(B_{ij} + B_{ji})}_{S_{ij}} + \underbrace{\frac{1}{2}(B_{ij} - B_{ji})}_{A_{ij}}.$$
We have $S_{ij} = S_{ji}$, so $S$ is symmetric, while $A_{ji} = -A_{ij}$, and $A$ is antisymmetric. So $B = S + A$.

Furthermore, we can decompose $S$ into an isotropic part (a scalar multiple of the identity) plus a trace-less part (i.e. sum of diagonal = 0). Write
$$S_{ij} = \underbrace{\frac{1}{n}\operatorname{tr}(S)\delta_{ij}}_{\text{isotropic part}} + \underbrace{\left(S_{ij} - \frac{1}{n}\operatorname{tr}(S)\delta_{ij}\right)}_{T_{ij}}.$$
We have $\operatorname{tr}(T) = T_{ii} = S_{ii} - \frac{1}{n}\operatorname{tr}(S)\delta_{ii} = \operatorname{tr}(S) - \frac{1}{n}\operatorname{tr}(S)(n) = 0$.

Putting all these together,
$$B = \frac{1}{n}\operatorname{tr}(B)I + \left(\frac{1}{2}(B + B^T) - \frac{1}{n}\operatorname{tr}(B)I\right) + \frac{1}{2}(B - B^T).$$
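The three-part decomposition is easy to verify directly; a NumPy sketch with an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
B = rng.standard_normal((n, n))

iso = np.trace(B) / n * np.eye(n)  # isotropic part
T = (B + B.T) / 2 - iso            # trace-less symmetric part
A = (B - B.T) / 2                  # antisymmetric part

assert np.isclose(np.trace(T), 0)  # T is trace-less
assert np.allclose(T, T.T)         # T is symmetric
assert np.allclose(A, -A.T)        # A is antisymmetric
assert np.allclose(B, iso + T + A) # the parts sum back to B
```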

In three dimensions, we can write the antisymmetric part $A$ in terms of a single vector: we have
$$A = \begin{pmatrix} 0 & a & -b \\ -a & 0 & c \\ b & -c & 0 \end{pmatrix}$$
and we can consider
$$\varepsilon_{ijk}\omega_k = \begin{pmatrix} 0 & \omega_3 & -\omega_2 \\ -\omega_3 & 0 & \omega_1 \\ \omega_2 & -\omega_1 & 0 \end{pmatrix}$$
So if we have $\boldsymbol{\omega} = (c, b, a)$, then $A_{ij} = \varepsilon_{ijk}\omega_k$.
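The correspondence $A_{ij} = \varepsilon_{ijk}\omega_k$ can be checked by building the Levi-Civita symbol explicitly; a NumPy sketch (the values of $a, b, c$ are arbitrary):

```python
import numpy as np

# Levi-Civita symbol eps[i,j,k]: +1 on even permutations, -1 on odd.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1
    eps[i, k, j] = -1

a, b, c = 1.0, 2.0, 3.0
A = np.array([[ 0.0,  a,  -b],
              [-a,   0.0,  c],
              [ b,  -c,  0.0]])

omega = np.array([c, b, a])
# A_ij = eps_ijk omega_k
assert np.allclose(A, np.einsum('ijk,k->ij', eps, omega))
```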

This decomposition can be useful in certain physical applications. For

example, if the matrix represents the stress of a system, different parts of the

decomposition will correspond to different types of stresses.

3.4.4 Matrix inverse

Definition (Inverse of matrix). Consider an $m \times n$ matrix $A$ and $n \times m$ matrices $B$ and $C$. If $BA = I$, then we say $B$ is the left inverse of $A$. If $AC = I$, then we say $C$ is the right inverse of $A$. If $A$ is square ($n \times n$), then $B = B(AC) = (BA)C = C$, i.e. the left and right inverses coincide. Both are denoted by $A^{-1}$, the inverse of $A$. Therefore we have
$$AA^{-1} = A^{-1}A = I.$$

Note that not all square matrices have inverses. For example, the zero matrix clearly has no inverse.

Definition (Invertible matrix). If A has an inverse, then A is invertible.

Proposition. $(AB)^{-1} = B^{-1}A^{-1}$.

Proof. $(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}B = I$.
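The reversed order in $(AB)^{-1} = B^{-1}A^{-1}$ is easy to confirm numerically; a NumPy sketch (random matrices are almost surely invertible, so no singularity check is made):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

inv_AB = np.linalg.inv(A @ B)
# (AB)^{-1} = B^{-1} A^{-1}: note the reversed order of the factors.
assert np.allclose(inv_AB, np.linalg.inv(B) @ np.linalg.inv(A))
```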

Definition (Orthogonal and unitary matrices). A real $n \times n$ matrix is orthogonal if $A^T A = AA^T = I$, i.e. $A^T = A^{-1}$. A complex $n \times n$ matrix is unitary if $U^\dagger U = UU^\dagger = I$, i.e. $U^\dagger = U^{-1}$.

Note that an orthogonal matrix $A$ satisfies $A_{ik}(A^T)_{kj} = \delta_{ij}$, i.e. $A_{ik}A_{jk} = \delta_{ij}$. We can see this as saying "the scalar product of two distinct rows is 0, and the scalar product of a row with itself is 1". Alternatively, the rows (and columns — by considering $A^T$) of an orthogonal matrix form an orthonormal set.

Similarly, for a unitary matrix, $U_{ik}(U^\dagger)_{kj} = \delta_{ij}$, i.e. $U_{ik}U^*_{jk} = U^*_{ik}U_{jk} = \delta_{ij}$, i.e. the rows are orthonormal, using the definition of the complex scalar product.
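As a quick numerical illustration (the $2 \times 2$ rotation matrix and the angle are arbitrary choices for this sketch), an orthogonal matrix has orthonormal rows and its transpose as its inverse:

```python
import numpy as np

theta = 0.7  # arbitrary angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# A_ik A_jk = delta_ij: rows (and columns) are orthonormal.
assert np.allclose(R @ R.T, np.eye(2))
assert np.allclose(R.T @ R, np.eye(2))
# Hence R^T = R^{-1}.
assert np.allclose(np.linalg.inv(R), R.T)
```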

Example.

(i) The reflection in a plane is an orthogonal matrix. Since $R_{ij} = \delta_{ij} - 2n_i n_j$, we have
$$\begin{aligned}
R_{ik}R_{jk} &= (\delta_{ik} - 2n_i n_k)(\delta_{jk} - 2n_j n_k) \\
&= \delta_{ik}\delta_{jk} - 2\delta_{jk}n_i n_k - 2\delta_{ik}n_j n_k + 4n_i n_k n_j n_k \\
&= \delta_{ij} - 2n_i n_j - 2n_j n_i + 4n_i n_j (n_k n_k) \\
&= \delta_{ij}
\end{aligned}$$
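The same computation in matrix form, $R = I - 2\hat{\mathbf{n}}\hat{\mathbf{n}}^T$; a NumPy sketch (the normal vector is an arbitrary choice, normalised to unit length):

```python
import numpy as np

n = np.array([1.0, 2.0, 2.0])
n = n / np.linalg.norm(n)           # unit normal (arbitrary choice)

R = np.eye(3) - 2 * np.outer(n, n)  # R_ij = delta_ij - 2 n_i n_j

assert np.allclose(R @ R.T, np.eye(3))  # orthogonal
assert np.allclose(R @ n, -n)           # the normal is flipped
assert np.isclose(np.trace(R), 1.0)     # tr R = 3 - 2 = 1, as computed earlier
```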

(ii) The rotation is an orthogonal matrix. We could multiply out using suffix notation, but it would be cumbersome to do so. Alternatively, denote the matrix of rotation by $\theta$ about $\hat{\mathbf{n}}$ as $R(\theta, \hat{\mathbf{n}})$. Clearly, $R(\theta, \hat{\mathbf{n}})^{-1} = R(-\theta, \hat{\mathbf{n}})$. We have
$$\begin{aligned}
R_{ij}(-\theta, \hat{\mathbf{n}}) &= (\cos\theta)\delta_{ij} + n_i n_j (1 - \cos\theta) + \varepsilon_{ijk} n_k \sin\theta \\
&= (\cos\theta)\delta_{ji} + n_j n_i (1 - \cos\theta) - \varepsilon_{jik} n_k \sin\theta \\
&= R_{ji}(\theta, \hat{\mathbf{n}})
\end{aligned}$$
In other words, $R(-\theta, \hat{\mathbf{n}}) = R(\theta, \hat{\mathbf{n}})^T$. So $R(\theta, \hat{\mathbf{n}})^{-1} = R(\theta, \hat{\mathbf{n}})^T$.
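Assuming the general-axis rotation formula $R_{ij}(\theta, \hat{\mathbf{n}}) = (\cos\theta)\delta_{ij} + n_i n_j(1 - \cos\theta) - \varepsilon_{ijk} n_k \sin\theta$ (consistent with the computation above), this can be confirmed numerically; a NumPy sketch using rotation about the $z$ axis:

```python
import numpy as np

# Levi-Civita symbol
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1, -1

def R(theta, n):
    # R_ij = cos(theta) d_ij + (1 - cos(theta)) n_i n_j - eps_ijk n_k sin(theta)
    return (np.cos(theta) * np.eye(3)
            + (1 - np.cos(theta)) * np.outer(n, n)
            - np.sin(theta) * np.einsum('ijk,k->ij', eps, n))

n = np.array([0.0, 0.0, 1.0])  # rotation about the z axis
theta = 0.4

assert np.allclose(R(-theta, n), R(theta, n).T)             # R(-theta) = R(theta)^T
assert np.allclose(R(theta, n) @ R(theta, n).T, np.eye(3))  # so R is orthogonal
```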