3 Linear maps
IA Vectors and Matrices
3.4 Matrices
In the examples above, we have represented our linear maps by some object $R$ such that $x_i' = R_{ij}x_j$. We call $R$ the matrix for the linear map. In general, let $\alpha: \mathbb{R}^n \to \mathbb{R}^m$ be a linear map, and $x' = \alpha(x)$.
Let $\{e_i\}$ be a basis of $\mathbb{R}^n$. Then $x = x_j e_j$ for some $x_j$. Then we get
\[
  x' = \alpha(x_j e_j) = x_j \alpha(e_j).
\]
So we get that
\[
  x_i' = [\alpha(e_j)]_i x_j.
\]
We now define $A_{ij} = [\alpha(e_j)]_i$. Then $x_i' = A_{ij}x_j$. We write
\[
  A = \{A_{ij}\} =
  \begin{pmatrix}
    A_{11} & \cdots & A_{1n} \\
    \vdots & A_{ij} & \vdots \\
    A_{m1} & \cdots & A_{mn}
  \end{pmatrix}
\]
Here $A_{ij}$ is the entry in the $i$th row and the $j$th column. We say that $A$ is an $m \times n$ matrix, and write $x' = Ax$.
We see that the columns of the matrix are the images of the standard basis
vectors under the mapping α.
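For example, the linear map $\alpha: \mathbb{R}^2 \to \mathbb{R}^2$ with $\alpha(x, y) = (x + 2y, 3y)$ sends $e_1 = (1, 0)$ to $(1, 0)$ and $e_2 = (0, 1)$ to $(2, 3)$, so its matrix is
\[
  A = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix},
\]
with these images as its columns.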
3.4.1 Examples
Example.
(i) In $\mathbb{R}^2$, consider a reflection in a line with an angle $\theta$ to the $x$ axis. We know that $\hat{i} \mapsto \cos 2\theta\, \hat{i} + \sin 2\theta\, \hat{j}$, with $\hat{j} \mapsto -\cos 2\theta\, \hat{j} + \sin 2\theta\, \hat{i}$. Then the matrix is
\[
  \begin{pmatrix}
    \cos 2\theta & \sin 2\theta \\
    \sin 2\theta & -\cos 2\theta
  \end{pmatrix}.
\]
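As a quick sanity check, setting $\theta = 0$ should give the reflection in the $x$ axis itself:
\[
  \begin{pmatrix}
    \cos 0 & \sin 0 \\
    \sin 0 & -\cos 0
  \end{pmatrix}
  =
  \begin{pmatrix}
    1 & 0 \\
    0 & -1
  \end{pmatrix},
\]
which indeed fixes $\hat{i}$ and flips $\hat{j}$.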
(ii) In $\mathbb{R}^3$, as we've previously seen, a rotation by $\theta$ about the $z$ axis is given by
\[
  R =
  \begin{pmatrix}
    \cos\theta & -\sin\theta & 0 \\
    \sin\theta & \cos\theta & 0 \\
    0 & 0 & 1
  \end{pmatrix}
\]
(iii) In $\mathbb{R}^3$, a reflection in a plane with normal $\hat{n}$ is given by $R_{ij} = \delta_{ij} - 2\hat{n}_i\hat{n}_j$. Written as a matrix, we have
\[
  \begin{pmatrix}
    1 - 2\hat{n}_1^2 & -2\hat{n}_1\hat{n}_2 & -2\hat{n}_1\hat{n}_3 \\
    -2\hat{n}_2\hat{n}_1 & 1 - 2\hat{n}_2^2 & -2\hat{n}_2\hat{n}_3 \\
    -2\hat{n}_3\hat{n}_1 & -2\hat{n}_3\hat{n}_2 & 1 - 2\hat{n}_3^2
  \end{pmatrix}
\]
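For example, taking $\hat{n} = (0, 0, 1)$, i.e. reflecting in the $xy$-plane, the formula gives
\[
  \begin{pmatrix}
    1 & 0 & 0 \\
    0 & 1 & 0 \\
    0 & 0 & -1
  \end{pmatrix},
\]
which flips the $z$ component and fixes the other two, as expected.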
(iv) Dilation ("stretching") $\alpha: \mathbb{R}^3 \to \mathbb{R}^3$ is given by a map $(x, y, z) \mapsto (\lambda x, \mu y, \nu z)$ for some $\lambda, \mu, \nu$. The matrix is
\[
  \begin{pmatrix}
    \lambda & 0 & 0 \\
    0 & \mu & 0 \\
    0 & 0 & \nu
  \end{pmatrix}
\]
(v) Shear: Consider $S: \mathbb{R}^3 \to \mathbb{R}^3$ that shears in the $x$ direction:
(figure: a shear in the $x$ direction, tilting the $y$ axis while fixing the $x$ axis)
We have $(x, y, z) \mapsto (x + \lambda y, y, z)$. Then
\[
  S =
  \begin{pmatrix}
    1 & \lambda & 0 \\
    0 & 1 & 0 \\
    0 & 0 & 1
  \end{pmatrix}
\]
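For instance, the point $(0, 1, 0)$ at the tip of the $y$ axis is sent to $(\lambda, 1, 0)$, while every point on the $x$ axis is fixed.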
3.4.2 Matrix Algebra
This part consists mostly of definitions, saying what we can do with matrices and classifying them into different types.
Definition (Addition of matrices). Consider two linear maps $\alpha, \beta: \mathbb{R}^n \to \mathbb{R}^m$. The sum of $\alpha$ and $\beta$ is defined by
\[
  (\alpha + \beta)(x) = \alpha(x) + \beta(x)
\]
In terms of the matrix, we have
\[
  (A + B)_{ij} x_j = A_{ij}x_j + B_{ij}x_j,
\]
or
\[
  (A + B)_{ij} = A_{ij} + B_{ij}.
\]
Definition (Scalar multiplication of matrices). Define $(\lambda\alpha)(x) = \lambda[\alpha(x)]$. So $(\lambda A)_{ij} = \lambda A_{ij}$.
Definition (Matrix multiplication). Consider maps $\alpha: \mathbb{R}^\ell \to \mathbb{R}^n$ and $\beta: \mathbb{R}^n \to \mathbb{R}^m$. The composition is $\beta\alpha: \mathbb{R}^\ell \to \mathbb{R}^m$. Take $x \in \mathbb{R}^\ell \mapsto x'' \in \mathbb{R}^m$. Then $x'' = (BA)x = Bx'$, where $x' = Ax$. Using suffix notation, we have
\[
  x_i'' = (Bx')_i = B_{ik}x_k' = B_{ik}A_{kj}x_j.
\]
But $x_i'' = (BA)_{ij}x_j$. So
\[
  (BA)_{ij} = B_{ik}A_{kj}.
\]
Generally, an $m \times n$ matrix multiplied by an $n \times \ell$ matrix gives an $m \times \ell$ matrix. $(BA)_{ij}$ is given by the $i$th row of $B$ dotted with the $j$th column of $A$.
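For example, multiplying a $2 \times 3$ matrix by a $3 \times 2$ matrix gives a $2 \times 2$ matrix:
\[
  \begin{pmatrix}
    1 & 0 & 2 \\
    0 & 1 & 1
  \end{pmatrix}
  \begin{pmatrix}
    1 & 1 \\
    0 & 2 \\
    3 & 0
  \end{pmatrix}
  =
  \begin{pmatrix}
    7 & 1 \\
    3 & 2
  \end{pmatrix},
\]
where, for instance, the top-left entry is the first row $(1, 0, 2)$ dotted with the first column $(1, 0, 3)$, giving $1 + 0 + 6 = 7$.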
Note that the number of columns of $B$ has to be equal to the number of rows of $A$ for multiplication to be defined. If $\ell = m$ as well, then both $BA$ and $AB$ make sense, but $AB \neq BA$ in general. In fact, they don't even have to have the same dimensions.
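For instance, with
\[
  A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
  B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},
\]
we get
\[
  AB = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad
  BA = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\]
so $AB \neq BA$.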
Also, since function composition is associative, we get $A(BC) = (AB)C$.
Definition (Transpose of matrix). If $A$ is an $m \times n$ matrix, the transpose $A^T$ is an $n \times m$ matrix defined by $(A^T)_{ij} = A_{ji}$.
Proposition.
(i) $(A^T)^T = A$.
(ii) If $x$ is a column vector $\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$, then $x^T$ is a row vector $(x_1\; x_2\; \cdots\; x_n)$.
(iii) $(AB)^T = B^TA^T$, since $(AB)^T_{ij} = (AB)_{ji} = A_{jk}B_{ki} = B_{ki}A_{jk} = (B^T)_{ik}(A^T)_{kj} = (B^TA^T)_{ij}$.
Definition (Hermitian conjugate). Define $A^\dagger = (A^T)^*$. Similarly, $(AB)^\dagger = B^\dagger A^\dagger$.
Definition (Symmetric matrix). A matrix is symmetric if $A^T = A$.
Definition (Hermitian matrix). A matrix is Hermitian if $A^\dagger = A$. (The diagonal of a Hermitian matrix must be real.)
Definition (Anti/skew symmetric matrix). A matrix is anti-symmetric or skew symmetric if $A^T = -A$. The diagonal entries are all zero, since each satisfies $A_{ii} = -A_{ii}$ (no sum).
Definition (Skew-Hermitian matrix). A matrix is skew-Hermitian if $A^\dagger = -A$. The diagonal entries are pure imaginary, since each satisfies $A_{ii}^* = -A_{ii}$ (no sum).
Definition (Trace of matrix). The trace of an $n \times n$ matrix $A$ is the sum of the diagonal: $\operatorname{tr}(A) = A_{ii}$.
Example. Consider the reflection matrix $R_{ij} = \delta_{ij} - 2\hat{n}_i\hat{n}_j$. We have $\operatorname{tr}(R) = R_{ii} = 3 - 2\hat{n}\cdot\hat{n} = 3 - 2 = 1$.
Proposition. $\operatorname{tr}(BC) = \operatorname{tr}(CB)$.
Proof. $\operatorname{tr}(BC) = B_{ik}C_{ki} = C_{ki}B_{ik} = (CB)_{kk} = \operatorname{tr}(CB)$.
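For example, with $B = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $C = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, we get $BC = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}$ and $CB = \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix}$: different matrices, but both have trace $5$.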
Definition (Identity matrix). The identity matrix $I$ is defined by $I_{ij} = \delta_{ij}$.
3.4.3 Decomposition of an n × n matrix
Any $n \times n$ matrix $B$ can be split as a sum of symmetric and antisymmetric parts. Write
\[
  B_{ij} = \underbrace{\frac{1}{2}(B_{ij} + B_{ji})}_{S_{ij}} + \underbrace{\frac{1}{2}(B_{ij} - B_{ji})}_{A_{ij}}.
\]
We have $S_{ij} = S_{ji}$, so $S$ is symmetric, while $A_{ji} = -A_{ij}$, and $A$ is antisymmetric. So $B = S + A$.
Furthermore, we can decompose $S$ into an isotropic part (a scalar multiple of the identity) plus a trace-less part (i.e. sum of diagonal = 0). Write
\[
  S_{ij} = \underbrace{\frac{1}{n}\operatorname{tr}(S)\delta_{ij}}_{\text{isotropic part}} + \underbrace{\left(S_{ij} - \frac{1}{n}\operatorname{tr}(S)\delta_{ij}\right)}_{T_{ij}}.
\]
We have $\operatorname{tr}(T) = T_{ii} = S_{ii} - \frac{1}{n}\operatorname{tr}(S)\delta_{ii} = \operatorname{tr}(S) - \frac{1}{n}\operatorname{tr}(S)(n) = 0$.
Putting all these together, and noting that $\operatorname{tr}(S) = \operatorname{tr}(B)$ (since the antisymmetric part has zero trace), we get
\[
  B = \frac{1}{n}\operatorname{tr}(B)I + \left(\frac{1}{2}(B + B^T) - \frac{1}{n}\operatorname{tr}(B)I\right) + \frac{1}{2}(B - B^T).
\]
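For example, taking $B = \begin{pmatrix} 4 & 2 \\ 0 & 0 \end{pmatrix}$ (so $n = 2$ and $\operatorname{tr}(B) = 4$), the three parts are
\[
  B = \underbrace{\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}}_{\text{isotropic}}
    + \underbrace{\begin{pmatrix} 2 & 1 \\ 1 & -2 \end{pmatrix}}_{\text{traceless symmetric}}
    + \underbrace{\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}}_{\text{antisymmetric}}.
\]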
In three dimensions, we can write the antisymmetric part $A$ in terms of a single vector: we have
\[
  A =
  \begin{pmatrix}
    0 & a & -b \\
    -a & 0 & c \\
    b & -c & 0
  \end{pmatrix}
\]
and we can consider
\[
  \varepsilon_{ijk}\omega_k =
  \begin{pmatrix}
    0 & \omega_3 & -\omega_2 \\
    -\omega_3 & 0 & \omega_1 \\
    \omega_2 & -\omega_1 & 0
  \end{pmatrix}
\]
So if we have $\omega = (c, b, a)$, then $A_{ij} = \varepsilon_{ijk}\omega_k$.
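As a check, with $a = 3$, $b = 2$, $c = 1$ we have $\omega = (1, 2, 3)$, and e.g. $A_{12} = \varepsilon_{12k}\omega_k = \varepsilon_{123}\omega_3 = 3 = a$, as required.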
This decomposition can be useful in certain physical applications. For
example, if the matrix represents the stress of a system, different parts of the
decomposition will correspond to different types of stresses.
3.4.4 Matrix inverse
Definition (Inverse of matrix). Consider an $m \times n$ matrix $A$ and $n \times m$ matrices $B$ and $C$. If $BA = I$, then we say $B$ is the left inverse of $A$. If $AC = I$, then we say $C$ is the right inverse of $A$. If $A$ is square ($n \times n$), then $B = B(AC) = (BA)C = C$, i.e. the left and right inverses coincide. Both are denoted by $A^{-1}$, the inverse of $A$. Therefore we have
\[
  AA^{-1} = A^{-1}A = I.
\]
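For example,
\[
  \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}^{-1}
  = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix},
\]
as can be verified by checking that the product in either order is $I$.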
Note that not all square matrices have inverses. For example, the zero matrix
clearly has no inverse.
Definition (Invertible matrix). If A has an inverse, then A is invertible.
Proposition. $(AB)^{-1} = B^{-1}A^{-1}$
Proof. $(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}B = I$.
Definition (Orthogonal and unitary matrices). A real $n \times n$ matrix is orthogonal if $A^TA = AA^T = I$, i.e. $A^T = A^{-1}$. A complex $n \times n$ matrix is unitary if $U^\dagger U = UU^\dagger = I$, i.e. $U^\dagger = U^{-1}$.
Note that an orthogonal matrix $A$ satisfies $A_{ik}(A^T)_{kj} = \delta_{ij}$, i.e. $A_{ik}A_{jk} = \delta_{ij}$. We can see this as saying "the scalar product of two distinct rows is 0, and the scalar product of a row with itself is 1". Alternatively, the rows (and columns, by considering $A^T$) of an orthogonal matrix form an orthonormal set.
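For example, the rows of the two-dimensional rotation matrix $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ are $(\cos\theta, -\sin\theta)$ and $(\sin\theta, \cos\theta)$: each has length $1$ since $\cos^2\theta + \sin^2\theta = 1$, and their scalar product is $\cos\theta\sin\theta - \sin\theta\cos\theta = 0$.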
Similarly, for a unitary matrix, $U_{ik}U^\dagger_{kj} = \delta_{ij}$, i.e. $U_{ik}U^*_{jk} = U^*_{ik}U_{jk} = \delta_{ij}$, i.e. the rows are orthonormal, using the definition of the complex scalar product.
Example.
(i) The reflection in a plane is an orthogonal matrix. Since $R_{ij} = \delta_{ij} - 2n_in_j$, we have
\begin{align*}
  R_{ik}R_{jk} &= (\delta_{ik} - 2n_in_k)(\delta_{jk} - 2n_jn_k) \\
  &= \delta_{ik}\delta_{jk} - 2\delta_{jk}n_in_k - 2\delta_{ik}n_jn_k + 4n_in_kn_jn_k \\
  &= \delta_{ij} - 2n_in_j - 2n_jn_i + 4n_in_j(n_kn_k) \\
  &= \delta_{ij}
\end{align*}
(ii) The rotation is an orthogonal matrix. We could multiply out using suffix notation, but it would be cumbersome to do so. Alternatively, denote the rotation matrix by angle $\theta$ about $\hat{n}$ as $R(\theta, \hat{n})$. Clearly, $R(\theta, \hat{n})^{-1} = R(-\theta, \hat{n})$. We have
\begin{align*}
  R_{ij}(-\theta, \hat{n}) &= (\cos\theta)\delta_{ij} + n_in_j(1 - \cos\theta) + \varepsilon_{ijk}n_k\sin\theta \\
  &= (\cos\theta)\delta_{ji} + n_jn_i(1 - \cos\theta) - \varepsilon_{jik}n_k\sin\theta \\
  &= R_{ji}(\theta, \hat{n})
\end{align*}
In other words, $R(-\theta, \hat{n}) = R(\theta, \hat{n})^T$. So $R(\theta, \hat{n})^{-1} = R(\theta, \hat{n})^T$.