1.2 Linear independence, bases and the Steinitz exchange lemma
Recall that in $\mathbb{R}^n$, we had the “standard basis” made of vectors of the form $e_i = (0, \cdots, 0, 1, 0, \cdots, 0)$, with 1 in the $i$th component and 0 otherwise. We call this a basis because everything in $\mathbb{R}^n$ can be (uniquely) written as a sum of (scalar multiples of) these basis elements. In other words, the whole of $\mathbb{R}^n$ is generated by taking sums and multiples of the basis elements.
We would like to capture this idea in general vector spaces. The most important result in this section is to prove that for any vector space $V$, any two bases must contain the same number of elements. This means we can define the “dimension” of a vector space as the number of elements in a basis.
While this result sounds rather trivial, it is a very important result. We will in fact prove a slightly stronger statement than what was stated above, and this ensures that the dimension of a vector space is well-behaved. For example, a proper subspace of a vector space has a strictly smaller dimension than the larger space (at least when the dimensions are finite).
This is not the case when we study modules in IB Groups, Rings and Modules, which are generalizations of vector spaces. Not all modules have bases, which makes it difficult to define the dimension. Even for those that do, the behaviour of the “dimension” is complicated when, say, we take submodules. The existence and well-behavedness of bases and dimension is what makes linear algebra different from the study of modules.
Definition (Span). Let $V$ be a vector space over $\mathbb{F}$ and $S \subseteq V$. The span of $S$ is defined as
$$\langle S \rangle = \left\{ \sum_{i=1}^n \lambda_i s_i : \lambda_i \in \mathbb{F},\ s_i \in S,\ n \geq 0 \right\}.$$
This is the smallest subspace of $V$ containing $S$.
Note that the sums must be finite. We will not play with infinite sums, since
the notion of convergence is not even well defined in a general vector space.
Example.
(i) Let $V = \mathbb{R}^3$ and
$$S = \left\{ \begin{pmatrix}1\\0\\0\end{pmatrix}, \begin{pmatrix}0\\1\\1\end{pmatrix}, \begin{pmatrix}1\\2\\2\end{pmatrix} \right\}.$$
Then
$$\langle S \rangle = \left\{ \begin{pmatrix}a\\b\\b\end{pmatrix} : a, b \in \mathbb{R} \right\}.$$
Note that any subset of $S$ of order 2 has the same span as $S$.
(ii) Let $X$ be a set, $x \in X$. Define the function $\delta_x : X \to \mathbb{F}$ by
$$\delta_x(y) = \begin{cases} 1 & y = x \\ 0 & y \neq x \end{cases}.$$
Then $\langle \delta_x : x \in X \rangle$ is the set of all functions with finite support.
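As a computational aside (an illustration added here, not part of the original notes): over $\mathbb{R}^n$, the span of a finite set is the column space of the matrix whose columns are those vectors, so its dimension can be checked by a rank computation. A minimal sketch for example (i), using Python's sympy library:

```python
from sympy import Matrix

# Columns of A are the three vectors of S from example (i).
A = Matrix([[1, 0, 1],
            [0, 1, 2],
            [0, 1, 2]])

print(A.rank())         # 2, so <S> is a plane in R^3
print(A.columnspace())  # a basis of <S>: (1,0,0) and (0,1,1), whose
                        # combinations are exactly the vectors (a, b, b)
```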
Definition (Spanning set). Let $V$ be a vector space over $\mathbb{F}$ and $S \subseteq V$. $S$ spans $V$ if $\langle S \rangle = V$.
Definition (Linear independence). Let $V$ be a vector space over $\mathbb{F}$ and $S \subseteq V$. Then $S$ is linearly independent if whenever
$$\sum_{i=1}^n \lambda_i s_i = 0 \quad \text{with } \lambda_i \in \mathbb{F},\ s_1, s_2, \cdots, s_n \in S \text{ distinct},$$
we must have $\lambda_i = 0$ for all $i$.
If $S$ is not linearly independent, we say it is linearly dependent.
Definition (Basis). Let $V$ be a vector space over $\mathbb{F}$ and $S \subseteq V$. Then $S$ is a basis for $V$ if $S$ is linearly independent and spans $V$.
Definition (Finite dimensional). A vector space is finite dimensional if there is a finite basis.
Ideally, we would want to define the dimension as the number of vectors in
the basis. However, we must first show that this is well-defined. It is certainly
plausible that a vector space has a basis of size 7 as well as a basis of size 3. We
must show that this can never happen, which is something we’ll do soon.
Let us first look at an example:
Example. Again, let $V = \mathbb{R}^3$ and
$$S = \left\{ \begin{pmatrix}1\\0\\0\end{pmatrix}, \begin{pmatrix}0\\1\\1\end{pmatrix}, \begin{pmatrix}1\\2\\2\end{pmatrix} \right\}.$$
Then $S$ is linearly dependent since
$$1 \begin{pmatrix}1\\0\\0\end{pmatrix} + 2 \begin{pmatrix}0\\1\\1\end{pmatrix} + (-1) \begin{pmatrix}1\\2\\2\end{pmatrix} = 0.$$
$S$ also does not span $V$ since
$$\begin{pmatrix}0\\0\\1\end{pmatrix} \notin \langle S \rangle.$$
Note that no linearly independent set can contain $0$, as $1 \cdot 0 = 0$. We also have $\langle \emptyset \rangle = \{0\}$ and $\emptyset$ is a basis for this space.
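Continuing the computational aside, the dependence relation in this example can be found mechanically: linear relations among the vectors of $S$ correspond exactly to vectors in the kernel of the matrix with those vectors as columns. A sketch, again with sympy:

```python
from sympy import Matrix

# Columns are the vectors of S; kernel vectors encode linear relations.
A = Matrix([[1, 0, 1],
            [0, 1, 2],
            [0, 1, 2]])

print(A.nullspace())  # [Matrix([-1, -2, 1])], i.e. (-1)s1 + (-2)s2 + s3 = 0,
                      # the same relation as above up to an overall sign
```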
There is an alternative way in which we can define linear independence.
Lemma. $S \subseteq V$ is linearly dependent if and only if there are distinct $s_0, \cdots, s_n \in S$ and $\lambda_1, \cdots, \lambda_n \in \mathbb{F}$ such that
$$\sum_{i=1}^n \lambda_i s_i = s_0.$$
Proof. If $S$ is linearly dependent, then there are some $\lambda_1, \cdots, \lambda_n \in \mathbb{F}$ not all zero and distinct $s_1, \cdots, s_n \in S$ such that $\sum \lambda_i s_i = 0$. Wlog, let $\lambda_1 \neq 0$. Then
$$s_1 = \sum_{i=2}^n -\frac{\lambda_i}{\lambda_1} s_i.$$
Conversely, if $s_0 = \sum_{i=1}^n \lambda_i s_i$, then
$$(-1) s_0 + \sum_{i=1}^n \lambda_i s_i = 0.$$
So $S$ is linearly dependent.
This in turn gives an alternative characterization of what it means to be a
basis:
Proposition. If $S = \{e_1, \cdots, e_n\}$ is a subset of $V$ over $\mathbb{F}$, then it is a basis if and only if every $v \in V$ can be written uniquely as a finite linear combination of elements in $S$, i.e. as
$$v = \sum_{i=1}^n \lambda_i e_i.$$
Proof. We can view this as a combination of two statements: every $v \in V$ can be written as a linear combination of elements of $S$ in at least one way, and in at most one way. We will see that the first part corresponds to $S$ spanning $V$, and the second part corresponds to $S$ being linearly independent.
In fact, $S$ spanning $V$ is defined exactly to mean that every $v \in V$ can be written as a finite linear combination of elements of $S$ in at least one way.
Now suppose that $S$ is linearly independent, and we have
$$v = \sum_{i=1}^n \lambda_i e_i = \sum_{i=1}^n \mu_i e_i.$$
Then we have
$$0 = v - v = \sum_{i=1}^n (\lambda_i - \mu_i) e_i.$$
Linear independence implies that $\lambda_i - \mu_i = 0$ for all $i$. Hence $\lambda_i = \mu_i$. So $v$ can be expressed in a unique way.
On the other hand, if $S$ is not linearly independent, then we have
$$0 = \sum_{i=1}^n \lambda_i e_i$$
where $\lambda_i \neq 0$ for some $i$. But we also know that
$$0 = \sum_{i=1}^n 0 \cdot e_i.$$
So there are two ways to write $0$ as a linear combination. So done.
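In concrete terms (an illustrative sketch, not from the notes): for $n$ vectors in $\mathbb{F}^n$, being a basis is equivalent to the matrix of those vectors being invertible, and the unique $\lambda_i$ for a given $v$ are found by solving one linear system. The particular basis and vector below are made up for illustration:

```python
from sympy import Matrix

# Columns of B are the claimed basis vectors of R^3.
B = Matrix([[1, 0, 0],
            [0, 1, 1],
            [0, 0, 1]])
v = Matrix([3, 5, 7])

assert B.det() != 0  # non-zero determinant certifies the columns form a basis
print(B.solve(v))    # the unique coordinates of v: (3, -2, 7)
```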
Now we come to the key theorem:
Theorem (Steinitz exchange lemma). Let $V$ be a vector space over $\mathbb{F}$, $S = \{e_1, \cdots, e_n\}$ a finite linearly independent subset of $V$, and $T$ a spanning subset of $V$. Then there is some $T' \subseteq T$ of order $n$ such that $(T \setminus T') \cup S$ still spans $V$. In particular, $|T| \geq n$.
What does this actually say? This says if $T$ is spanning and $S$ is independent, there is a way of grabbing $|S|$ many elements away from $T$ and replacing them with $S$, and the result will still be spanning.
In some sense, the final remark is the most important part. It tells us that we cannot have an independent set larger than a spanning set, and most of our corollaries later will only use this remark.
This is sometimes stated in the following alternative way for $|T| < \infty$.
Corollary. Let $\{e_1, \cdots, e_n\}$ be a linearly independent subset of $V$, and suppose $\{f_1, \cdots, f_m\}$ spans $V$. Then there is a re-ordering of the $\{f_i\}$ such that $\{e_1, \cdots, e_n, f_{n+1}, \cdots, f_m\}$ spans $V$.
The proof is going to be slightly technical and notationally daunting. So it helps to give a brief overview of what we are going to do in words first. The idea is to do the replacement one by one. The first one is easy. Start with $e_1$. Since $T$ is spanning, we can write
$$e_1 = \sum \lambda_i t_i$$
for some $t_i \in T$ and non-zero $\lambda_i \in \mathbb{F}$. We then replace $t_1$ with $e_1$. The result is still spanning, since the above formula allows us to write $t_1$ in terms of $e_1$ and the other $t_i$.
We continue inductively. For the $r$th element, we again write
$$e_r = \sum \lambda_i t_i.$$
We would like to just pick a random $t_i$ and replace it with $e_r$. However, we cannot do this arbitrarily, since the lemma wants us to replace something in $T$ with $e_r$. After all that replacement procedure before, some of the $t_i$ might have actually come from $S$.
This is where the linear independence of $S$ kicks in. While some of the $t_i$ might be from $S$, we cannot possibly have all of them being from $S$, or else this violates the linear independence of $S$. Hence there is something genuinely from $T$, and we can safely replace it with $e_r$.
We now write this argument properly and formally.
Proof. Suppose that we have already found $T'_r \subseteq T$ of order $r$, with $0 \leq r < n$, such that
$$T_r = (T \setminus T'_r) \cup \{e_1, \cdots, e_r\}$$
spans $V$.
(Note that the case $r = 0$ is trivial, since we can take $T'_r = \emptyset$, and the case $r = n$ is the theorem which we want to achieve.)
Suppose we have these. Since $T_r$ spans $V$, we can write
$$e_{r+1} = \sum_{i=1}^k \lambda_i t_i, \quad \lambda_i \in \mathbb{F},\ t_i \in T_r.$$
We know that the $e_i$ are linearly independent, so not all of the $t_i$ with non-zero coefficient can be $e_i$'s, or else $e_{r+1}$ would be a linear combination of $e_1, \cdots, e_r$. So there is some $j$ with $t_j \in (T \setminus T'_r)$ and $\lambda_j \neq 0$. We can write this as
$$t_j = \frac{1}{\lambda_j} e_{r+1} + \sum_{i \neq j} -\frac{\lambda_i}{\lambda_j} t_i.$$
We let $T'_{r+1} = T'_r \cup \{t_j\}$ of order $r + 1$, and
$$T_{r+1} = (T \setminus T'_{r+1}) \cup \{e_1, \cdots, e_{r+1}\} = (T_r \setminus \{t_j\}) \cup \{e_{r+1}\}.$$
By the formula above, $t_j$ is in the span of $(T_r \setminus \{t_j\}) \cup \{e_{r+1}\}$, so $t_j \in \langle T_{r+1} \rangle$. Hence
$$V \supseteq \langle T_{r+1} \rangle \supseteq \langle T_r \rangle = V.$$
So $\langle T_{r+1} \rangle = V$.
Hence we can inductively find $T_n$.
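The proof is effectively an algorithm, and it can be instructive to watch it run. The following is a sketch (illustrative only; the function name is made up), transcribing the inductive step with sympy over $\mathbb{Q}$: repeatedly write the next $e_{r+1}$ in terms of the current spanning list, then exchange it for an element still genuinely from $T$:

```python
from sympy import Matrix

def steinitz_exchange(S, T):
    """Swap |S| elements of the spanning list T for the independent
    vectors in S, keeping the list spanning, as in the proof above."""
    current = list(T)
    from_T = [True] * len(current)   # tracks which entries are still from T
    for e in S:
        A = Matrix.hstack(*current)
        sol, params = A.gauss_jordan_solve(e)      # write e in terms of current
        coeffs = sol.subs({p: 0 for p in params})  # pick a particular solution
        # Independence of S guarantees some t_j genuinely from T has a
        # non-zero coefficient; exchange it for e.
        j = next(i for i in range(len(current)) if from_T[i] and coeffs[i] != 0)
        current[j], from_T[j] = e, False
    return current

S = [Matrix([1, 1, 0]), Matrix([0, 1, 1])]                      # independent
T = [Matrix([1, 0, 0]), Matrix([0, 1, 0]), Matrix([0, 0, 1])]   # spans R^3
print(steinitz_exchange(S, T))  # still spans: (1,1,0), (0,1,1), (0,0,1)
```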
From this lemma, we can immediately deduce a lot of important corollaries.
Corollary. Suppose $V$ is a vector space over $\mathbb{F}$ with a basis of order $n$. Then
(i) Every basis of $V$ has order $n$.
(ii) Any linearly independent set of order $n$ is a basis.
(iii) Every spanning set of order $n$ is a basis.
(iv) Every finite spanning set contains a basis.
(v) Every linearly independent subset of $V$ can be extended to a basis.
Proof. Let $S = \{e_1, \cdots, e_n\}$ be the basis for $V$.
(i) Suppose $T$ is another basis. Since $S$ is independent and $T$ is spanning, $|T| \geq |S|$.
The other direction is less trivial, since $T$ might be infinite, and Steinitz does not immediately apply. Instead, we argue as follows: since $T$ is linearly independent, every finite subset of $T$ is independent. Also, $S$ is spanning. So every finite subset of $T$ has order at most $|S|$. So $|T| \leq |S|$.
So $|T| = |S|$.
(ii) Suppose now that $T$ is a linearly independent subset of order $n$, but $\langle T \rangle \neq V$. Then there is some $v \in V \setminus \langle T \rangle$. We now show that $T \cup \{v\}$ is independent. Indeed, if
$$\lambda_0 v + \sum_{i=1}^m \lambda_i t_i = 0$$
with $\lambda_i \in \mathbb{F}$ and $t_1, \cdots, t_m \in T$ distinct, then
$$\lambda_0 v = \sum_{i=1}^m (-\lambda_i) t_i.$$
Then $\lambda_0 v \in \langle T \rangle$. So $\lambda_0 = 0$, for otherwise $v \in \langle T \rangle$. As $T$ is linearly independent, we then have $\lambda_1 = \cdots = \lambda_m = 0$. So $T \cup \{v\}$ is a linearly independent subset of size $> n$. This is a contradiction since $S$ is a spanning set of size $n$.
(iii) Let $T$ be a spanning set of order $n$. If $T$ were linearly dependent, then there would be some $t_0, \cdots, t_m \in T$ distinct and $\lambda_1, \cdots, \lambda_m \in \mathbb{F}$ such that
$$t_0 = \sum \lambda_i t_i.$$
So $t_0 \in \langle T \setminus \{t_0\} \rangle$, i.e. $\langle T \setminus \{t_0\} \rangle = V$. So $T \setminus \{t_0\}$ is a spanning set of order $n - 1$, which is a contradiction, since a spanning set cannot be smaller than the independent set $S$.
(iv) Suppose $T$ is any finite spanning set. Let $T' \subseteq T$ be a spanning set of least possible size. This exists because $T$ is finite. If $T'$ has size $n$, then we are done by (iii). Otherwise, by the Steinitz exchange lemma, it has size $|T'| > n$. So $T'$ must be linearly dependent, since an independent set cannot be larger than the spanning set $S$, which has size $n$. So there are some $t_0, \cdots, t_m \in T'$ distinct and $\lambda_1, \cdots, \lambda_m \in \mathbb{F}$ such that $t_0 = \sum \lambda_i t_i$. Then $T' \setminus \{t_0\}$ is a smaller spanning set. Contradiction.
(v) Suppose $T$ is a linearly independent subset; by the argument in (i), it is finite of order at most $n$. Since $S$ spans, there is some $S' \subseteq S$ of order $|T|$ such that $(S \setminus S') \cup T$ spans $V$ by the Steinitz exchange lemma. This is a spanning set of order at most $n$, hence of order exactly $n$, so by (iii), $(S \setminus S') \cup T$ is a basis of $V$ containing $T$.
Note that the last part is where we actually use the full result of Steinitz.
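Parts (iv) and (v) are also constructive. For instance, (iv) can be run as a greedy algorithm: scan a finite spanning set and keep each vector that strictly increases the rank of what has been kept so far. A sketch (illustrative, with a made-up helper name):

```python
from sympy import Matrix

def basis_from_spanning(T):
    # Keep each vector of T that enlarges the span of those kept so far;
    # the kept vectors form a basis of <T>.
    basis = []
    for t in T:
        if Matrix.hstack(*basis, t).rank() > len(basis):
            basis.append(t)
    return basis

T = [Matrix([1, 0, 0]), Matrix([0, 1, 1]), Matrix([1, 2, 2])]
print(basis_from_spanning(T))  # drops (1,2,2), leaving a basis of <T>
```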
Finally, we can use this to define the dimension.
Definition (Dimension). If $V$ is a vector space over $\mathbb{F}$ with finite basis $S$, then the dimension of $V$ is written
$$\dim V = \dim_{\mathbb{F}} V = |S|.$$
By the corollary, $\dim V$ does not depend on the choice of $S$. However, it does depend on $\mathbb{F}$. For example, $\dim_{\mathbb{C}} \mathbb{C} = 1$ (since $\{1\}$ is a basis), but $\dim_{\mathbb{R}} \mathbb{C} = 2$ (since $\{1, i\}$ is a basis).
After defining the dimension, we can prove a few things about dimensions.
Lemma. If $V$ is a finite dimensional vector space over $\mathbb{F}$ and $U \subseteq V$ is a proper subspace, then $U$ is finite dimensional and $\dim U < \dim V$.
Proof. Every linearly independent subset of $V$ has size at most $\dim V$. So let $S \subseteq U$ be a linearly independent subset of largest size. We want to show that $S$ spans $U$ and $|S| < \dim V$.
If $v \in V \setminus \langle S \rangle$, then $S \cup \{v\}$ is linearly independent. So $v \notin U$ by maximality of $S$. This means that $\langle S \rangle = U$.
Since $U \neq V$, there is some $v \in V \setminus U = V \setminus \langle S \rangle$. So $S \cup \{v\}$ is a linearly independent subset of order $|S| + 1$. So $|S| + 1 \leq \dim V$. In particular, $\dim U = |S| < \dim V$.
Proposition. If $U, W$ are subspaces of a finite dimensional vector space $V$, then
$$\dim(U + W) = \dim U + \dim W - \dim(U \cap W).$$
The proof is not hard, as long as we manage to pick the right basis to do the
proof. This is our slogan:
When you choose a basis, always choose the right basis.
We need a basis for all four of them, and we want to compare the bases. So we
want to pick bases that are compatible.
Proof. Let $R = \{v_1, \cdots, v_r\}$ be a basis for $U \cap W$. This is a linearly independent subset of $U$, so we can extend it to a basis of $U$,
$$S = \{v_1, \cdots, v_r, u_{r+1}, \cdots, u_s\}.$$
Similarly, for $W$, we can obtain a basis
$$T = \{v_1, \cdots, v_r, w_{r+1}, \cdots, w_t\}.$$
We want to show that $\dim(U + W) = s + t - r$. It is sufficient to prove that $S \cup T$ is a basis for $U + W$.
We first show spanning. Suppose $u + w \in U + W$, with $u \in U$, $w \in W$. Then $u \in \langle S \rangle$ and $w \in \langle T \rangle$. So $u + w \in \langle S \cup T \rangle$. So $U + W = \langle S \cup T \rangle$.
To show linear independence, suppose we have a linear relation
$$\sum_{i=1}^r \lambda_i v_i + \sum_{j=r+1}^s \mu_j u_j + \sum_{k=r+1}^t \nu_k w_k = 0.$$
So
$$\sum \lambda_i v_i + \sum \mu_j u_j = -\sum \nu_k w_k.$$
Since the left hand side is something in $U$, and the right hand side is something in $W$, they both lie in $U \cap W$.
Since $S$ is a basis of $U$, there is only one way of writing the left hand vector as a sum of the $v_i$ and $u_j$. However, since $R$ is a basis of $U \cap W$, we can write the left hand vector just as a sum of $v_i$'s. So we must have $\mu_j = 0$ for all $j$. Then we have
$$\sum \lambda_i v_i + \sum \nu_k w_k = 0.$$
Finally, since $T$ is linearly independent, $\lambda_i = \nu_k = 0$ for all $i, k$. So $S \cup T$ is linearly independent.
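As a quick sanity check of the formula (an illustration, not from the notes): with concrete subspaces of $\mathbb{Q}^3$ given by matrices of independent column generators, $\dim(U + W)$ is the rank of the combined columns, and $\dim(U \cap W)$ can be read off from the kernel of the block matrix $[A \mid -B]$, whose elements encode the solutions of $Ax = By$:

```python
from sympy import Matrix

A = Matrix([[1, 0], [0, 1], [0, 0]])  # columns: a basis of U = <e1, e2>
B = Matrix([[0, 0], [1, 0], [0, 1]])  # columns: a basis of W = <e2, e3>

dim_U, dim_W = A.rank(), B.rank()
dim_sum = Matrix.hstack(A, B).rank()             # dim(U + W) = 3
dim_cap = len(Matrix.hstack(A, -B).nullspace())  # dim(U ∩ W) = 1

assert dim_sum == dim_U + dim_W - dim_cap        # 3 == 2 + 2 - 1
```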
Proposition. If $V$ is a finite dimensional vector space over $\mathbb{F}$ and $U \subseteq V$ is a subspace, then
$$\dim V = \dim U + \dim V/U.$$
We can view this as a linear algebra version of Lagrange’s theorem. Combined
with the first isomorphism theorem for vector spaces, this gives the rank-nullity
theorem.
Proof. Let $\{u_1, \cdots, u_m\}$ be a basis for $U$ and extend this to a basis $\{u_1, \cdots, u_m, v_{m+1}, \cdots, v_n\}$ for $V$. We want to show that $\{v_{m+1} + U, \cdots, v_n + U\}$ is a basis for $V/U$.
It is easy to see that this spans $V/U$. If $v + U \in V/U$, then we can write
$$v = \sum \lambda_i u_i + \sum \mu_i v_i.$$
Then
$$v + U = \sum \mu_i (v_i + U) + \sum \lambda_i (u_i + U) = \sum \mu_i (v_i + U),$$
since $u_i + U = U$ for each $i$. So done.
To show that they are linearly independent, suppose that
$$\sum \lambda_i (v_i + U) = 0 + U = U.$$
Then this requires
$$\sum \lambda_i v_i \in U.$$
Then we can write this as a linear combination of the $u_i$'s. So
$$\sum \lambda_i v_i = \sum \mu_j u_j$$
for some $\mu_j$. Since $\{u_1, \cdots, u_m, v_{m+1}, \cdots, v_n\}$ is a basis for $V$, we must have $\lambda_i = \mu_j = 0$ for all $i, j$. So $\{v_i + U\}$ is linearly independent.
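To connect this with the rank–nullity remark above (a hedged numerical sketch, not from the notes): for a matrix $M$, take $U = \ker M$ inside $V = \mathbb{Q}^n$. The first isomorphism theorem identifies $V/U$ with the image of $M$, so the proposition specialises to $n = \dim \ker M + \operatorname{rank} M$:

```python
from sympy import Matrix

M = Matrix([[1, 2, 3],
            [2, 4, 6]])  # rank 1, so its kernel has dimension 2

n = M.cols
assert n == len(M.nullspace()) + M.rank()  # dim V = dim U + dim V/U: 3 = 2 + 1
```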