1 Normed vector spaces

II Linear Analysis



1.7 Hahn–Banach Theorem
Let $V$ be a real normed vector space. What can we say about $V^* = B(V, \mathbb{R})$? For instance, if $V$ is non-trivial, must $V^*$ be non-trivial?

The main goal of this section is to prove the Hahn–Banach theorem (surprise), which allows us to produce a lot of elements in $V^*$. Moreover, it doesn’t just tell us that $V^*$ is non-empty (this is rather dull), but provides a tool to craft (or at least prove existence of) elements of $V^*$ that satisfy some property we want.
Proposition. Let $V$ be a real normed vector space, and let $W \subseteq V$ be a subspace of co-dimension 1. Assume we have the following two items:

- $p: V \to \mathbb{R}$ (not necessarily linear), which is positive homogeneous, i.e.
  \[
    p(\lambda v) = \lambda p(v)
  \]
  for all $v \in V$, $\lambda > 0$, and subadditive, i.e.
  \[
    p(v_1 + v_2) \le p(v_1) + p(v_2)
  \]
  for all $v_1, v_2 \in V$. We can think of something like a norm, but more general.

- $f: W \to \mathbb{R}$ a linear map such that $f(w) \le p(w)$ for all $w \in W$.

Then there exists an extension $\tilde{f}: V \to \mathbb{R}$ which is linear such that $\tilde{f}|_W = f$ and $\tilde{f}(v) \le p(v)$ for all $v \in V$.
Why do we want this weird theorem? Our objective is to find something in $V^*$. This theorem tells us that to find a bounded linear map in $V^*$, we just need something in $W^*$ bounded by a norm-like object, and then we can extend it to $V$.
Proof. Let $v_0 \in V \setminus W$. Since $W$ has co-dimension 1, every element $v \in V$ can be written uniquely as $v = w + av_0$, for some $w \in W$, $a \in \mathbb{R}$. Therefore it suffices to define $\tilde{f}(v_0)$ and then extend linearly to $V$.
The condition we want to meet is
\[
  \tilde{f}(w + av_0) \le p(w + av_0) \tag{$*$}
\]
for all $w \in W$, $a \in \mathbb{R}$. If $a = 0$, then this is satisfied since $\tilde{f}$ restricts to $f$ on $W$.

If $a > 0$, then $(*)$ is equivalent to
\[
  \tilde{f}(w) + a\tilde{f}(v_0) \le p(w + av_0).
\]
We can divide by $a$ to obtain
\[
  \tilde{f}(a^{-1}w) + \tilde{f}(v_0) \le p(a^{-1}w + v_0).
\]
We let $w' = a^{-1}w$. So we can write this as
\[
  \tilde{f}(v_0) \le p(w' + v_0) - f(w'),
\]
for all $w' \in W$.

If $a < 0$, then $(*)$ is equivalent to
\[
  \tilde{f}(w) + a\tilde{f}(v_0) \le p(w + av_0).
\]
We now divide by $a$ and flip the direction of the inequality. So we have
\[
  \tilde{f}(a^{-1}w) + \tilde{f}(v_0) \ge -(-a^{-1})p(w + av_0).
\]
Since $-a^{-1} > 0$, positive homogeneity gives $(-a^{-1})p(w + av_0) = p(-a^{-1}w - v_0)$. In other words, we want
\[
  \tilde{f}(v_0) \ge -p(-a^{-1}w - v_0) - f(a^{-1}w).
\]
We let $w' = -a^{-1}w$. Then we are left with
\[
  \tilde{f}(v_0) \ge -p(w' - v_0) + f(w')
\]
for all $w' \in W$.
Hence we are done if we can define a $\tilde{f}(v_0)$ that satisfies these two conditions. This is possible if and only if
\[
  -p(w_1 - v_0) + f(w_1) \le p(w_2 + v_0) - f(w_2)
\]
for all $w_1, w_2 \in W$. This holds since
\begin{align*}
  f(w_1) + f(w_2) &= f(w_1 + w_2) \\
  &\le p(w_1 + w_2) \\
  &= p(w_1 - v_0 + w_2 + v_0) \\
  &\le p(w_1 - v_0) + p(w_2 + v_0).
\end{align*}
So the result follows.
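To make the last step concrete, an admissible value of $\tilde{f}(v_0)$ can be written down explicitly. The following is only a sketch; the names $\alpha$, $\beta$ and $c$ are introduced here for illustration and do not appear in the notes.

```latex
% alpha, beta, c are labels introduced for this sketch only.
\[
  \alpha = \sup_{w_1 \in W} \bigl( f(w_1) - p(w_1 - v_0) \bigr),
  \qquad
  \beta = \inf_{w_2 \in W} \bigl( p(w_2 + v_0) - f(w_2) \bigr).
\]
% The inequality proved above says every term in the sup is at most
% every term in the inf, so alpha <= beta and both are finite.
\[
  \alpha \le \beta,
  \qquad
  \text{so any } c \in [\alpha, \beta] \text{ works: }
  \tilde{f}(w + a v_0) = f(w) + a c .
\]
```

The interval $[\alpha, \beta]$ may contain more than one point, so the extension is in general not unique.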
The goal is to “iterate” this to get a similar result without the co-dimension
1 assumption. While we can do this directly for finitely many times, this isn’t
helpful (since we already know a lot about finite dimensional normed spaces). To
perform an “infinite iteration”, we need the mysterious result known as Zorn’s
lemma.
Digression on Zorn’s lemma
We first need a few definitions before we can come to Zorn’s lemma.
Definition (Partial order). A relation $\le$ on a set $X$ is a partial order if it satisfies
(i) $x \le x$ (reflexivity)
(ii) $x \le y$ and $y \le x$ implies $x = y$ (antisymmetry)
(iii) $x \le y$ and $y \le z$ implies $x \le z$ (transitivity)
Definition (Total order). Let $(S, \le)$ be a partial order. $T \subseteq S$ is totally ordered if for all $x, y \in T$, either $x \le y$ or $y \le x$, i.e. every two things are related.
Definition (Upper bound). Let $(S, \le)$ be a partial order, and let $S' \subseteq S$ be a subset. We say $b \in S$ is an upper bound of this subset if $x \le b$ for all $x \in S'$.
Definition (Maximal element). Let $(S, \le)$ be a partial order. Then $m \in S$ is a maximal element if $x \ge m$ implies $x = m$.
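As a small illustration (not from the notes), a maximal element need not be an upper bound of the whole set, and need not be unique. The set $S$ below is chosen only for this example.

```latex
% S and its ordering by inclusion are assumptions made for this example.
\[
  S = \bigl\{ \emptyset, \{1\}, \{2\} \bigr\},
  \qquad
  A \le B \iff A \subseteq B .
\]
% {1} is maximal: the only element of S containing {1} is {1} itself.
% The same holds for {2}. Neither is an upper bound of S, because
% {1} and {2} are not comparable.
\[
  \{1\} \text{ and } \{2\} \text{ are both maximal, but } S \text{ has no upper bound in } S .
\]
```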
The glorious Zorn’s lemma tells us that:
Lemma (Zorn’s lemma). Let $(S, \le)$ be a non-empty partially ordered set such that every totally-ordered subset $S'$ has an upper bound in $S$. Then $S$ has a maximal element.
We will not give a proof of this lemma here, but can explain why it should
be true.
We start by picking one element $x_0$ in $S$. If it is maximal, then done. Otherwise, there is some $x_1 > x_0$. If this is not maximal, then pick $x_2 > x_1$. We do this to infinity “and beyond”: after picking infinitely many $x_i$, if we have not yet reached a maximal element, we take an upper bound of this set, and call it $x_\omega$. If this is not maximal, we can continue picking a larger element.

We can do this forever, but if this process never stops, even after infinite time, we would have picked out more elements than there are in $S$, which is clearly nonsense. Of course, this is hardly a formal proof. The proper proof can be found in the IID Logic and Set Theory course.
Back to vector spaces
The Hahn–Banach theorem is just our previous proposition without the constraint
that W has co-dimension 1.
Theorem (Hahn–Banach theorem*). Let $V$ be a real normed vector space, and $W \subseteq V$ a subspace. Assume we have the following two items:

- $p: V \to \mathbb{R}$ (not necessarily linear), which is positive homogeneous and subadditive;

- $f: W \to \mathbb{R}$ a linear map such that $f(w) \le p(w)$ for all $w \in W$.

Then there exists an extension $\tilde{f}: V \to \mathbb{R}$ which is linear such that $\tilde{f}|_W = f$ and $\tilde{f}(v) \le p(v)$ for all $v \in V$.
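Proof. Let $S$ be the set of all pairs $(\tilde{V}, \tilde{f})$ such that
(i) $W \subseteq \tilde{V} \subseteq V$
(ii) $\tilde{f}: \tilde{V} \to \mathbb{R}$ is linear
(iii) $\tilde{f}|_W = f$
(iv) $\tilde{f}(\tilde{v}) \le p(\tilde{v})$ for all $\tilde{v} \in \tilde{V}$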
We introduce a partial order $\le$ on $S$ by $(\tilde{V}_1, \tilde{f}_1) \le (\tilde{V}_2, \tilde{f}_2)$ if $\tilde{V}_1 \subseteq \tilde{V}_2$ and $\tilde{f}_2|_{\tilde{V}_1} = \tilde{f}_1$. It is easy to see that this is indeed a partial order.
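We now check that this satisfies the assumptions of Zorn’s lemma. First, $S$ is non-empty since $(W, f) \in S$. Let $\{(\tilde{V}_\alpha, \tilde{f}_\alpha)\}_{\alpha \in A} \subseteq S$ be a non-empty totally ordered set. Define $(\tilde{V}, \tilde{f})$ by
\[
  \tilde{V} = \bigcup_{\alpha \in A} \tilde{V}_\alpha,
  \qquad
  \tilde{f}(x) = \tilde{f}_\alpha(x) \text{ for } x \in \tilde{V}_\alpha.
\]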
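This is well-defined because $\{(\tilde{V}_\alpha, \tilde{f}_\alpha)\}_{\alpha \in A}$ is totally ordered. So if $x \in \tilde{V}_{\alpha_1}$ and $x \in \tilde{V}_{\alpha_2}$, wlog assume $(\tilde{V}_{\alpha_1}, \tilde{f}_{\alpha_1}) \le (\tilde{V}_{\alpha_2}, \tilde{f}_{\alpha_2})$. So $\tilde{f}_{\alpha_2}|_{\tilde{V}_{\alpha_1}} = \tilde{f}_{\alpha_1}$. So $\tilde{f}_{\alpha_1}(x) = \tilde{f}_{\alpha_2}(x)$.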
It should be clear that $(\tilde{V}, \tilde{f}) \in S$ and $(\tilde{V}, \tilde{f})$ is indeed an upper bound of $\{(\tilde{V}_\alpha, \tilde{f}_\alpha)\}_{\alpha \in A}$. So the conditions of Zorn’s lemma are satisfied.
Hence by Zorn’s lemma, there is a maximal element $(\tilde{W}, \tilde{f}) \in S$. Then by definition, $\tilde{f}$ is linear, restricts to $f$ on $W$, and is bounded by $p$. We now show that $\tilde{W} = V$.
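Suppose not. Then there is some $v_0 \in V \setminus \tilde{W}$. Define $\tilde{V} = \operatorname{span}\{\tilde{W}, v_0\}$. Now $\tilde{W}$ is a co-dimension 1 subspace of $\tilde{V}$. By our previous result, we know that there is some $\tilde{\tilde{f}}: \tilde{V} \to \mathbb{R}$ linear such that $\tilde{\tilde{f}}|_{\tilde{W}} = \tilde{f}$ and $\tilde{\tilde{f}}(v) \le p(v)$ for all $v \in \tilde{V}$.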
Hence we have $(\tilde{V}, \tilde{\tilde{f}}) \in S$ but $(\tilde{W}, \tilde{f}) < (\tilde{V}, \tilde{\tilde{f}})$. This contradicts the maximality of $(\tilde{W}, \tilde{f})$.
There is a particularly important special case of this, which is sometimes also known as the Hahn–Banach theorem.
Corollary (Hahn–Banach theorem 2.0). Let $W \subseteq V$ be real normed vector spaces. Given $f \in W^*$, there exists a $\tilde{f} \in V^*$ such that $\tilde{f}|_W = f$ and $\|\tilde{f}\|_{V^*} = \|f\|_{W^*}$.
Proof. Use the Hahn–Banach theorem with $p(x) = \|f\|_{W^*} \|x\|_V$ for all $x \in V$. Positive homogeneity and subadditivity follow directly from the axioms of the norm. Then by definition $f(w) \le p(w)$ for all $w \in W$. So the Hahn–Banach theorem says that there is $\tilde{f}: V \to \mathbb{R}$ linear such that $\tilde{f}|_W = f$ and $\tilde{f}(v) \le p(v) = \|f\|_{W^*} \|v\|_V$.
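Now notice that
\[
  \tilde{f}(v) \le \|f\|_{W^*} \|v\|_V,
  \qquad
  -\tilde{f}(v) = \tilde{f}(-v) \le \|f\|_{W^*} \|v\|_V
\]
implies that $|\tilde{f}(v)| \le \|f\|_{W^*} \|v\|_V$ for all $v \in V$, i.e. $\|\tilde{f}\|_{V^*} \le \|f\|_{W^*}$.

On the other hand, we have (again taking supremum over non-zero $v$)
\[
  \|\tilde{f}\|_{V^*} = \sup_{v \in V} \frac{|\tilde{f}(v)|}{\|v\|_V} \ge \sup_{w \in W} \frac{|f(w)|}{\|w\|_W} = \|f\|_{W^*}.
\]
So indeed we have $\|\tilde{f}\|_{V^*} = \|f\|_{W^*}$.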
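The claim that $p$ satisfies the hypotheses of the theorem can be checked line by line. The following is a sketch of that verification, using only the norm axioms and the definition of $\|f\|_{W^*}$, and assuming (as in the notes) that $W$ carries the norm inherited from $V$.

```latex
% p(x) = ||f||_{W^*} ||x||_V, as in the proof above.
\begin{align*}
  p(\lambda x) &= \|f\|_{W^*} \|\lambda x\|_V
     = \lambda \|f\|_{W^*} \|x\|_V = \lambda p(x)
     && (\lambda > 0), \\
  p(x + y) &= \|f\|_{W^*} \|x + y\|_V
     \le \|f\|_{W^*} \bigl( \|x\|_V + \|y\|_V \bigr) = p(x) + p(y), \\
  f(w) &\le |f(w)| \le \|f\|_{W^*} \|w\|_V = p(w)
     && (w \in W).
\end{align*}
```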
We’ll have some quick corollaries of these theorems.
Proposition. Let $V$ be a real normed vector space. For every $v \in V \setminus \{0\}$, there is some $f_v \in V^*$ such that $f_v(v) = \|v\|_V$ and $\|f_v\|_{V^*} = 1$.
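Proof. Apply the Hahn–Banach theorem (2.0) with $W = \operatorname{span}\{v\}$ and $f'_v \in W^*$ defined by $f'_v(av) = a\|v\|_V$ for $a \in \mathbb{R}$, so that $f'_v(v) = \|v\|_V$ and $\|f'_v\|_{W^*} = 1$.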
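For a concrete instance, suppose $V = \mathbb{R}^n$ with the Euclidean norm (an assumption made only for this example). Then a suitable $f_v$ can be written down directly, without appealing to Hahn–Banach:

```latex
% Assumption for this example: V = R^n with the Euclidean norm ||.||_2.
\[
  f_v(x) = \frac{\langle x, v \rangle}{\|v\|_2},
  \qquad
  f_v(v) = \frac{\|v\|_2^2}{\|v\|_2} = \|v\|_2 .
\]
% Cauchy--Schwarz gives |f_v(x)| <= ||x||_2, so ||f_v|| <= 1,
% and x = v attains equality, so ||f_v|| = 1.
\[
  |f_v(x)| \le \frac{\|x\|_2 \, \|v\|_2}{\|v\|_2} = \|x\|_2 .
\]
```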
Corollary. Let $V$ be a real normed vector space. Then $v = 0$ if and only if $f(v) = 0$ for all $f \in V^*$.
Corollary. Let $V$ be a non-trivial real normed vector space, $v, w \in V$ with $v \ne w$. Then there is some $f \in V^*$ such that $f(v) \ne f(w)$.
Corollary. If $V$ is a non-trivial real normed vector space, then $V^*$ is non-trivial.
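These corollaries are stated without proof. The following sketch indicates how each can be derived from the proposition above, writing $f_u$ for the norm-one functional with $f_u(u) = \|u\|_V$; the letter $u$ is introduced here only for illustration.

```latex
% Corollary 1: if v != 0, the functional f_v detects it.
\[
  v \ne 0 \implies f_v(v) = \|v\|_V \ne 0,
  \qquad
  v = 0 \implies f(v) = 0 \text{ for all } f \in V^* \text{ by linearity}.
\]
% Corollary 2: apply Corollary 1 to u = v - w, which is non-zero.
\[
  f_u(v) - f_u(w) = f_u(v - w) = \|v - w\|_V \ne 0 .
\]
% Corollary 3: pick any v != 0 in V; then f_v is a non-zero element of V^*.
```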
We now want to restrict the discussion to double duals. We define $\phi: V \to V^{**}$ as before by $\phi(v)(f) = f(v)$ for $v \in V$, $f \in V^*$.
Proposition. The map $\phi: V \to V^{**}$ is an isometry, i.e. $\|\phi(v)\|_{V^{**}} = \|v\|_V$.
Proof. We have previously shown that
\[
  \|\phi\|_{B(V, V^{**})} \le 1,
\]
i.e. $\|\phi(v)\|_{V^{**}} \le \|v\|_V$ for all $v \in V$. It thus suffices to show the reverse inequality
\[
  \|\phi(v)\|_{V^{**}} \ge \|v\|_V.
\]
We can assume $v \ne 0$, since for $v = 0$ the inequality is trivial. We have
\[
  \|\phi(v)\|_{V^{**}} = \sup_{f \in V^*} \frac{|\phi(v)(f)|}{\|f\|_{V^*}} \ge \frac{|\phi(v)(f_v)|}{\|f_v\|_{V^*}} = |f_v(v)| = \|v\|_V,
\]
where $f_v$ is the functional such that $f_v(v) = \|v\|_V$, $\|f_v\|_{V^*} = 1$ as we have previously defined.

So done.
In particular, $\phi$ is injective and one can view $\phi$ as an isometric embedding of $V$ into $V^{**}$.
Definition (Reflexive). We say $V$ is reflexive if $\phi(V) = V^{**}$.
Note that any reflexive space is Banach, since $V^{**}$, being the dual of $V^*$, is Banach, and $\phi$ identifies $V$ isometrically with $V^{**}$.
You might have heard that for any infinite dimensional vector space $V$, the dual of $V$ is always strictly larger than $V$. This does not prevent an infinite dimensional vector space from being reflexive. When we said the dual of $V$ is always strictly larger than $V$, we were referring to the algebraic dual, i.e. the set of all linear maps from $V$ to $\mathbb{F}$. In the definition of reflexive (and everywhere else where we mention “dual” in this course), we mean the continuous dual, where we look at the set of all bounded linear maps from $V$ to $\mathbb{F}$. It is indeed possible for the continuous dual to be isomorphic to the original space, even in infinite dimensional spaces, as we will see later.
Example. Finite-dimensional normed vector spaces are reflexive. Also $\ell^p$ is reflexive for $p \in (1, \infty)$.
Recall that given $T \in B(V, W)$, we defined $T^* \in B(W^*, V^*)$ by
\[
  T^*(f)(v) = f(Tv)
\]
for $v \in V$, $f \in W^*$.

We have previously shown that
\[
  \|T^*\|_{B(W^*, V^*)} \le \|T\|_{B(V, W)}.
\]
We will now show that in fact equality holds.
Proposition.
\[
  \|T^*\|_{B(W^*, V^*)} = \|T\|_{B(V, W)}.
\]
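Proof. We have already shown that
\[
  \|T^*\|_{B(W^*, V^*)} \le \|T\|_{B(V, W)}.
\]
For the other inequality, first let $\varepsilon > 0$. Since
\[
  \|T\|_{B(V, W)} = \sup_{v \in V} \frac{\|Tv\|_W}{\|v\|_V}
\]
by definition, there is some $v \in V$ such that $\|Tv\|_W \ge \|T\|_{B(V, W)} \|v\|_V - \varepsilon$. wlog, assume $\|v\|_V = 1$. So
\[
  \|Tv\|_W \ge \|T\|_{B(V, W)} - \varepsilon.
\]
If $Tv = 0$, this says $\|T\|_{B(V, W)} \le \varepsilon$, and the bound $\|T^*\|_{B(W^*, V^*)} \ge \|T\|_{B(V, W)} - \varepsilon$ holds trivially. Otherwise, let $f_{Tv} \in W^*$ be the functional with $f_{Tv}(Tv) = \|Tv\|_W$ and $\|f_{Tv}\|_{W^*} = 1$ given by the earlier proposition. Therefore, we get that
\begin{align*}
  \|T^*\|_{B(W^*, V^*)} &= \sup_{f \in W^*} \frac{\|T^*(f)\|_{V^*}}{\|f\|_{W^*}} \\
  &\ge \|T^*(f_{Tv})\|_{V^*} \\
  &\ge |T^*(f_{Tv})(v)| \\
  &= |f_{Tv}(Tv)| \\
  &= \|Tv\|_W \\
  &\ge \|T\|_{B(V, W)} - \varepsilon,
\end{align*}
where we used the fact that $\|f_{Tv}\|_{W^*}$ and $\|v\|_V$ are both 1. Since $\varepsilon$ is arbitrary, we are done.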
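As a sanity check in finite dimensions, take $V = W = \mathbb{R}^2$ with the Euclidean norm and identify each functional with a vector $y$ via the inner product (assumptions made only for this example). The matrix $T$ below is a hypothetical choice.

```latex
% Assumptions for this example: V = W = R^2, Euclidean norm,
% functionals written as f = <., y> so that ||f|| = ||y||_2.
\[
  T = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix},
  \qquad
  T^*(f)(v) = f(Tv) = \langle Tv, y \rangle = \langle v, T^{\mathsf{T}} y \rangle
  \quad \text{for } f = \langle \,\cdot\,, y \rangle .
\]
% So T^* acts as the transpose, which here equals T itself.
\[
  \|T\| = \sup_{\|v\|_2 = 1} \|Tv\|_2 = 2
        = \sup_{\|y\|_2 = 1} \|T^{\mathsf{T}} y\|_2 = \|T^*\| .
\]
```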