6 Differentiation from $\mathbb{R}^m$ to $\mathbb{R}^n$

IB Analysis II



6.2 The operator norm
So far, we have only looked at derivatives at a single point. We haven’t discussed
much about the derivative at, say, a neighbourhood or the whole space. We
might want to ask if the derivative is continuous or bounded. However, this is
not straightforward, since the derivative is a linear map, and we need to define
these notions for functions whose values are linear maps. In particular, we want
to understand the map $Df: B_r(a) \to L(\mathbb{R}^n; \mathbb{R}^m)$ given by
$x \mapsto Df(x)$. To do so, we need a metric on the space
$L(\mathbb{R}^n; \mathbb{R}^m)$. In fact, we will use a norm.
Let $L = L(\mathbb{R}^n; \mathbb{R}^m)$. This is a vector space over $\mathbb{R}$,
with addition and scalar multiplication defined pointwise. In fact, $L$ is a
subspace of $C(\mathbb{R}^n, \mathbb{R}^m)$.
To prove this, we have to show that all linear maps are continuous. Let
$\{e_1, \cdots, e_n\}$ be the standard basis for $\mathbb{R}^n$. For
\[
  x = \sum_{j=1}^n x_j e_j,
\]
and $A \in L$, we have
\[
  A(x) = \sum_{j=1}^n x_j A e_j.
\]
By Cauchy-Schwarz, we know
\[
  \|A(x)\| \leq \sum_{j=1}^n |x_j| \|A(e_j)\| \leq \|x\| \sqrt{\sum_{j=1}^n \|A(e_j)\|^2}.
\]
So we see $A$ is Lipschitz, and is hence continuous. Alternatively, this follows
from the fact that linear maps are differentiable and hence continuous.
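As a quick numerical aside (not part of the notes), we can check the Cauchy-Schwarz bound above with NumPy: the Lipschitz constant $\sqrt{\sum_j \|A e_j\|^2}$ is exactly the Frobenius norm of the matrix representing $A$, and it dominates $\|Ax\|/\|x\|$ for every $x$. The matrix and sample points below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary linear map A : R^3 -> R^2, represented as a 2x3 matrix.
A = rng.standard_normal((2, 3))

# The Lipschitz constant from Cauchy-Schwarz: sqrt(sum_j ||A e_j||^2).
# Summing squared column norms gives precisely the Frobenius norm.
lipschitz = np.sqrt(sum(np.linalg.norm(A[:, j]) ** 2 for j in range(3)))
assert np.isclose(lipschitz, np.linalg.norm(A, "fro"))

# Check ||A x|| <= lipschitz * ||x|| on many random inputs.
for _ in range(1000):
    x = rng.standard_normal(3)
    assert np.linalg.norm(A @ x) <= lipschitz * np.linalg.norm(x) + 1e-12
```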
We can use this fact to define the norm of linear maps. Since $L$ is finite-dimensional
(it is isomorphic to the space of real $m \times n$ matrices as vector
spaces, and hence has dimension $mn$), it really doesn't matter which norm we
pick, as they are all Lipschitz equivalent, but a convenient choice is the sup norm,
or the operator norm.
Definition (Operator norm). The operator norm on $L = L(\mathbb{R}^n; \mathbb{R}^m)$
is defined by
\[
  \|A\| = \sup_{x \in \mathbb{R}^n : \|x\| = 1} \|Ax\|.
\]
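Numerically (an aside, not part of the notes), the operator norm with respect to the Euclidean norms is the largest singular value of the representing matrix, which NumPy computes via `np.linalg.norm(A, 2)`. A sketch comparing that against the definition, using randomly sampled unit vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))  # an arbitrary linear map R^3 -> R^2

# For Euclidean norms, the operator norm is the largest singular value.
op_norm = np.linalg.norm(A, 2)

# Approximate the definition directly: sup of ||Ax|| over unit vectors x.
xs = rng.standard_normal((100000, 3))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)  # project onto the unit sphere
sampled = np.linalg.norm(xs @ A.T, axis=1).max()

assert sampled <= op_norm + 1e-12  # the sup dominates every sample
assert sampled > 0.9 * op_norm     # and dense random sampling gets close
```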
Proposition.
(i) $\|A\| < \infty$ for all $A \in L$.
(ii) $\|\cdot\|$ is indeed a norm on $L$.
(iii)
\[
  \|A\| = \sup_{x \in \mathbb{R}^n \setminus \{0\}} \frac{\|Ax\|}{\|x\|}.
\]
(iv) $\|Ax\| \leq \|A\| \|x\|$ for all $x \in \mathbb{R}^n$.
(v) Let $A \in L(\mathbb{R}^n; \mathbb{R}^m)$ and $B \in L(\mathbb{R}^m; \mathbb{R}^p)$.
Then $BA = B \circ A \in L(\mathbb{R}^n; \mathbb{R}^p)$ and
\[
  \|BA\| \leq \|B\| \|A\|.
\]
Proof.
(i) This is since $A$ is continuous and $\{x \in \mathbb{R}^n : \|x\| = 1\}$ is compact.
(ii) The only non-trivial part is the triangle inequality. We have
\begin{align*}
  \|A + B\| &= \sup_{\|x\| = 1} \|Ax + Bx\| \\
  &\leq \sup_{\|x\| = 1} (\|Ax\| + \|Bx\|) \\
  &\leq \sup_{\|x\| = 1} \|Ax\| + \sup_{\|x\| = 1} \|Bx\| \\
  &= \|A\| + \|B\|.
\end{align*}
(iii) This follows from linearity of $A$, and the fact that for any $x \in \mathbb{R}^n \setminus \{0\}$, we have
\[
  \left\| \frac{x}{\|x\|} \right\| = 1.
\]
(iv) Immediate from above.
(v)
\[
  \|BA\| = \sup_{x \in \mathbb{R}^n \setminus \{0\}} \frac{\|BAx\|}{\|x\|}
  \leq \sup_{x \in \mathbb{R}^n \setminus \{0\}} \frac{\|B\| \|Ax\|}{\|x\|}
  = \|B\| \|A\|.
\]
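The inequalities in the proposition are easy to sanity-check numerically (an aside; the matrices below are arbitrary test data):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))  # A : R^4 -> R^3
B = rng.standard_normal((2, 3))  # B : R^3 -> R^2
C = rng.standard_normal((3, 4))  # another map R^4 -> R^3

def op(M):
    """Operator norm w.r.t. Euclidean norms: largest singular value."""
    return np.linalg.norm(M, 2)

# (ii) triangle inequality: ||A + C|| <= ||A|| + ||C||
assert op(A + C) <= op(A) + op(C) + 1e-12

# (iv) ||Ax|| <= ||A|| ||x||
x = rng.standard_normal(4)
assert np.linalg.norm(A @ x) <= op(A) * np.linalg.norm(x) + 1e-12

# (v) submultiplicativity: ||BA|| <= ||B|| ||A||
assert op(B @ A) <= op(B) * op(A) + 1e-12
```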
For certain easy cases, we have a straightforward expression for the operator
norm.
Proposition.
(i) If $A \in L(\mathbb{R}, \mathbb{R}^m)$, then $A$ can be written as $Ax = xa$
for some $a \in \mathbb{R}^m$. Moreover, $\|A\| = \|a\|$, where the second norm is
the Euclidean norm in $\mathbb{R}^m$.
(ii) If $A \in L(\mathbb{R}^n, \mathbb{R})$, then $Ax = x \cdot a$ for some fixed
$a \in \mathbb{R}^n$. Again, $\|A\| = \|a\|$.
Proof.
(i) Set $a = A(1)$. Then by linearity, we get $Ax = xA(1) = xa$. Then we have
\[
  \|Ax\| = |x| \|a\|.
\]
So we have
\[
  \frac{\|Ax\|}{|x|} = \|a\|.
\]
(ii) Exercise on example sheet 4.
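Both special cases can be confirmed numerically (an aside; the vectors are arbitrary): a map $\mathbb{R} \to \mathbb{R}^m$ is an $m \times 1$ matrix and a map $\mathbb{R}^n \to \mathbb{R}$ is a $1 \times n$ matrix, and in each case the operator norm is the Euclidean norm of the defining vector.

```python
import numpy as np

rng = np.random.default_rng(3)

# Case (i): A : R -> R^m, x |-> x a, i.e. an m x 1 matrix.
a = rng.standard_normal(4)
A = a.reshape(4, 1)
assert np.isclose(np.linalg.norm(A, 2), np.linalg.norm(a))

# Case (ii): A : R^n -> R, x |-> x . a, i.e. a 1 x n matrix.
b = rng.standard_normal(5)
B = b.reshape(1, 5)
assert np.isclose(np.linalg.norm(B, 2), np.linalg.norm(b))
```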
Theorem (Chain rule). Let $U \subseteq \mathbb{R}^n$ be open, $a \in U$,
$f: U \to \mathbb{R}^m$ differentiable at $a$. Moreover, $V \subseteq \mathbb{R}^m$
is open with $f(U) \subseteq V$ and $g: V \to \mathbb{R}^p$ is differentiable at
$f(a)$. Then $g \circ f: U \to \mathbb{R}^p$ is differentiable at $a$, with derivative
\[
  D(g \circ f)(a) = Dg(f(a)) \circ Df(a).
\]
Proof.
The proof is very easy if we use the little $o$ notation. Let $A = Df(a)$ and
$B = Dg(f(a))$. By differentiability of $f$ and $g$, we know
\begin{align*}
  f(a + h) &= f(a) + Ah + o(h) \\
  g(f(a) + k) &= g(f(a)) + Bk + o(k).
\end{align*}
Now we have
\begin{align*}
  g \circ f(a + h) &= g(f(a) + \underbrace{Ah + o(h)}_{k}) \\
  &= g(f(a)) + B(Ah + o(h)) + o(Ah + o(h)) \\
  &= g \circ f(a) + BAh + B(o(h)) + o(Ah + o(h)).
\end{align*}
We just have to show the last term is $o(h)$, but this is true since $B$ and $A$ are
bounded. By boundedness,
\[
  \|B(o(h))\| \leq \|B\| \|o(h)\|.
\]
So $B(o(h)) = o(h)$. Similarly,
\[
  \|Ah + o(h)\| \leq \|A\| \|h\| + \|o(h)\| \leq (\|A\| + 1)\|h\|
\]
for sufficiently small $h$. So $o(Ah + o(h))$ is in fact $o(h)$ as well. Hence
\[
  g \circ f(a + h) = g \circ f(a) + BAh + o(h).
\]
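The chain rule can be verified numerically for a concrete example (an aside; the maps $f$, $g$ and the point $a$ below are arbitrary choices): compute $Dg(f(a))\, Df(a)$ from hand-derived Jacobians and compare it with a finite-difference Jacobian of $g \circ f$.

```python
import numpy as np

# f : R^2 -> R^2 and g : R^2 -> R, with Jacobians computed by hand.
def f(x):
    return np.array([x[0] * x[1], np.sin(x[0])])

def Df(x):
    return np.array([[x[1], x[0]],
                     [np.cos(x[0]), 0.0]])

def g(y):
    return np.array([y[0] ** 2 + y[1]])

def Dg(y):
    return np.array([[2 * y[0], 1.0]])

a = np.array([0.3, -1.2])

# Chain rule: D(g o f)(a) = Dg(f(a)) Df(a), a 1x2 matrix product.
chain = Dg(f(a)) @ Df(a)

# Central finite-difference Jacobian of g o f at a, one column per basis vector.
eps = 1e-6
fd = np.array([[(g(f(a + eps * e)) - g(f(a - eps * e)))[0] / (2 * eps)
                for e in np.eye(2)]])

assert np.allclose(chain, fd, atol=1e-6)
```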