6 Differentiation from $\mathbb{R}^m$ to $\mathbb{R}^n$

IB Analysis II



6.3 Mean value inequalities
So far, we have just looked at cases where we assume the function is differentiable
at a point. We are now going to assume the function is differentiable in a region,
and see what happens to the derivative.
Recall the mean value theorem from single-variable calculus: if $f: [a, b] \to \mathbb{R}$ is continuous on $[a, b]$ and differentiable on $(a, b)$, then
\[
  f(b) - f(a) = f'(c)(b - a)
\]
for some $c \in (a, b)$. This is our favorite theorem, and we have used it many times in IA Analysis. Here we have an exact equality. However, in general, for vector-valued functions, i.e. if we are mapping to $\mathbb{R}^m$, this is no longer true. Instead, we only have an inequality.
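To see concretely why equality can fail, here is a standard counterexample, written out as a short worked computation. The particular function (the unit circle parametrisation) is our choice for illustration and is not taken from the notes.

```latex
% Illustrative counterexample (our choice of f): the unit circle in R^2.
\[
  f: [0, 2\pi] \to \mathbb{R}^2, \qquad f(t) = (\cos t, \sin t), \qquad f'(t) = (-\sin t, \cos t).
\]
% The endpoints coincide, so the left-hand side of the would-be equality vanishes,
\[
  f(2\pi) - f(0) = (0, 0),
\]
% but the derivative never vanishes, since it always has unit length:
\[
  \|f'(c)\| = \sqrt{\sin^2 c + \cos^2 c} = 1 \quad \text{for all } c \in (0, 2\pi).
\]
% Hence there is no c with f(2\pi) - f(0) = f'(c)(2\pi - 0).
```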
We first prove it for the case when the domain is a subset of $\mathbb{R}$, and then reduce the general case to this special case.
Theorem. Let $f: [a, b] \to \mathbb{R}^m$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Suppose we can find some $M$ such that for all $t \in (a, b)$, we have $\|Df(t)\| \le M$. Then
\[
  \|f(b) - f(a)\| \le M(b - a).
\]
Proof. Let $v = f(b) - f(a)$. We define
\[
  g(t) = v \cdot f(t) = \sum_{i=1}^m v_i f_i(t).
\]
Since each $f_i$ is differentiable, $g$ is continuous on $[a, b]$ and differentiable on $(a, b)$ with
\[
  g'(t) = \sum_{i=1}^m v_i f_i'(t).
\]
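Hence, by the Cauchy-Schwarz inequality, we know
\[
  |g'(t)| = \left|\sum_{i=1}^m v_i f_i'(t)\right| \le \|v\| \left(\sum_{i=1}^m f_i'(t)^2\right)^{1/2} = \|v\| \|Df(t)\| \le M\|v\|.
\]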
We now apply the mean value theorem to $g$ to get
\[
  g(b) - g(a) = g'(t)(b - a)
\]
for some $t \in (a, b)$. By definition of $g$, we get
\[
  v \cdot (f(b) - f(a)) = g'(t)(b - a).
\]
By definition of $v$, we have
\[
  \|f(b) - f(a)\|^2 = |g'(t)(b - a)| \le (b - a) M \|f(b) - f(a)\|.
\]
If $f(b) = f(a)$, then there is nothing to prove. Otherwise, divide by $\|f(b) - f(a)\|$ and we are done.
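As a quick check of the statement, the bound can be verified by hand on a simple curve. The sketch below uses the unit circle parametrisation, which is our own choice of example rather than one from the notes.

```latex
% Illustrative check (our choice of f): f(t) = (\cos t, \sin t) on [a, b].
\[
  f'(t) = (-\sin t, \cos t), \qquad \|Df(t)\| = 1, \qquad \text{so we may take } M = 1.
\]
% Compute the left-hand side of the inequality directly:
\begin{align*}
  \|f(b) - f(a)\|^2 &= (\cos b - \cos a)^2 + (\sin b - \sin a)^2 \\
                    &= 2 - 2\cos(b - a) = 4\sin^2\!\left(\tfrac{b - a}{2}\right).
\end{align*}
% Since |\sin x| \le |x|, this agrees with the theorem:
\[
  \|f(b) - f(a)\| = 2\left|\sin\!\left(\tfrac{b - a}{2}\right)\right| \le b - a = M(b - a).
\]
```

The inequality is strict for every $b > a$ here, which illustrates that the theorem gives only a bound, not an equality.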
We now apply this to prove the general version.
Theorem (Mean value inequality). Let $a \in \mathbb{R}^n$ and $f: B_r(a) \to \mathbb{R}^m$ be differentiable on $B_r(a)$ with $\|Df(x)\| \le M$ for all $x \in B_r(a)$. Then
\[
  \|f(b_1) - f(b_2)\| \le M\|b_1 - b_2\|
\]
for any $b_1, b_2 \in B_r(a)$.
Proof. We will reduce this to the previous theorem.
Fix $b_1, b_2 \in B_r(a)$. Note that
\[
  tb_1 + (1 - t)b_2 \in B_r(a)
\]
for all $t \in [0, 1]$. Now consider $g: [0, 1] \to \mathbb{R}^m$ defined by
\[
  g(t) = f(tb_1 + (1 - t)b_2).
\]
By the chain rule, $g$ is differentiable and
\[
  g'(t) = Dg(t) = (Df(tb_1 + (1 - t)b_2))(b_1 - b_2).
\]
Therefore
\[
  \|Dg(t)\| \le \|Df(tb_1 + (1 - t)b_2)\| \|b_1 - b_2\| \le M\|b_1 - b_2\|.
\]
Now we can apply the previous theorem, and get
\[
  \|f(b_1) - f(b_2)\| = \|g(1) - g(0)\| \le M\|b_1 - b_2\|.
\]
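A simple sanity check is the case of a linear map, where the derivative is the same everywhere. This example is ours, and it assumes only that the norm on linear maps satisfies $\|Ah\| \le \|A\|\|h\|$, as the proof above already uses.

```latex
% Illustrative example (not from the notes): a linear map f(x) = Ax,
% where A is an m x n matrix.
% Assumption: \|A\| is a norm on linear maps with \|Ah\| \le \|A\| \|h\|.
\[
  f(x) = Ax, \qquad Df(x) = A \ \text{ for all } x, \qquad M = \|A\|.
\]
% The mean value inequality then reduces to the defining bound on \|A\|:
\[
  \|f(b_1) - f(b_2)\| = \|A(b_1 - b_2)\| \le \|A\| \|b_1 - b_2\| = M\|b_1 - b_2\|.
\]
```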
Note that here we worked in a ball. In general, we could have worked in a convex set, since all we need is for $tb_1 + (1 - t)b_2$ to be inside the domain.
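Convexity cannot simply be dropped. The following sketch, using the polar angle on a slit region of the plane, shows the inequality failing on a non-convex open set; the domain $U$, the function $\theta$ and the points $b_1, b_2$ are our own illustrative choices.

```latex
% Illustrative example (not from the notes): the polar angle on a slit region.
\[
  U = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 > 1\} \setminus \{(x, 0) : x \le -1\},
  \qquad \theta: U \to (-\pi, \pi) \ \text{the polar angle}.
\]
% The derivative is bounded by M = 1 on U:
\[
  D\theta(x, y) = \frac{1}{x^2 + y^2}\,(-y, \ x), \qquad
  \|D\theta(x, y)\| = \frac{1}{\sqrt{x^2 + y^2}} < 1.
\]
% Take two points just above and just below the slit, for small \eta > 0:
\[
  b_1 = (-2, \eta), \quad b_2 = (-2, -\eta), \qquad \|b_1 - b_2\| = 2\eta,
\]
\[
  |\theta(b_1) - \theta(b_2)| = 2\pi - 2\arctan(\eta/2) \to 2\pi \quad \text{as } \eta \to 0.
\]
% So |\theta(b_1) - \theta(b_2)| \le M \|b_1 - b_2\| fails once \eta is small:
% the segment from b_1 to b_2 crosses the slit and leaves U.
```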
But with this, we have the following easy corollary.
Corollary. Let $f: B_r(a) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ have $Df(x) = 0$ for all $x \in B_r(a)$. Then $f$ is constant.
Proof. Apply the mean value inequality with M = 0.
We would like to extend this corollary. Does this corollary extend to differentiable maps $f$ with $Df = 0$ defined on any open set $U \subseteq \mathbb{R}^n$?
The answer is clearly no. Even for functions $f: \mathbb{R} \to \mathbb{R}$, this is not true, since we can have two disjoint intervals $[1, 2] \cup [3, 4]$, and define $f(t)$ to be $1$ on $[1, 2]$ and $2$ on $[3, 4]$. Then $Df = 0$ but $f$ is not constant. $f$ is just locally constant on each interval.
The problem with this is that the sets are disconnected. We cannot connect points in $[1, 2]$ and points in $[3, 4]$ with a line. If we could do so, then we would be able to show that $f$ is constant.
Definition (Path-connected subset). A subset $E \subseteq \mathbb{R}^n$ is path-connected if for any $a, b \in E$, there is a continuous map $\gamma: [0, 1] \to E$ such that
\[
  \gamma(0) = a, \quad \gamma(1) = b.
\]
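For instance, every convex set is path-connected, since the straight-line segment already serves as a path. The short verification below is our own illustration of the definition.

```latex
% Illustrative example (not from the notes): convex sets are path-connected.
% Let E be convex and a, b \in E. Define the straight-line path
\[
  \gamma: [0, 1] \to \mathbb{R}^n, \qquad \gamma(t) = (1 - t)a + tb.
\]
% \gamma is continuous, and convexity of E gives \gamma(t) \in E for all t, with
\[
  \gamma(0) = a, \qquad \gamma(1) = b.
\]
% In particular, every ball B_r(a) is path-connected.
```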
Theorem. Let $U \subseteq \mathbb{R}^n$ be open and path-connected. Then for any differentiable $f: U \to \mathbb{R}^m$, if $Df(x) = 0$ for all $x \in U$, then $f$ is constant on $U$.
A naive attempt would be to replace $tb_1 + (1 - t)b_2$ in the proof of the mean value inequality with a path $\gamma(t)$. However, this is not a correct proof, since the chain rule step requires $\gamma$ to be differentiable, and a path is only assumed to be continuous. So this doesn't work. We have to think some more.
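Proof. We are going to use the fact that $f$ is locally constant. Without loss of generality, assume $m = 1$ (otherwise apply the argument to each component $f_i$). Given any $a, b \in U$, we show that $f(a) = f(b)$. Let $\gamma: [0, 1] \to U$ be a (continuous) path from $a$ to $b$. For any $s \in (0, 1)$, there exists some $\varepsilon > 0$ such that $B_\varepsilon(\gamma(s)) \subseteq U$ since $U$ is open. By continuity of $\gamma$, there is a $\delta > 0$ such that $(s - \delta, s + \delta) \subseteq [0, 1]$ with
\[
  \gamma((s - \delta, s + \delta)) \subseteq B_\varepsilon(\gamma(s)) \subseteq U.
\]
Since $f$ is constant on $B_\varepsilon(\gamma(s))$ by the previous corollary, we know that $g(t) = f \circ \gamma(t)$ is constant on $(s - \delta, s + \delta)$. In particular, $g$ is differentiable at $s$ with derivative $0$. This is true for all $s \in (0, 1)$. So the map $g: [0, 1] \to \mathbb{R}$ has zero derivative on $(0, 1)$ and, being a composition of continuous maps, is continuous on $[0, 1]$. So $g$ is constant by the mean value theorem. So $g(0) = g(1)$, i.e. $f(a) = f(b)$.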
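The last step, that $g$ is constant, is the usual single-variable consequence of the mean value theorem recalled at the start of this section. Spelled out, under the hypotheses just established:

```latex
% Final step spelled out: g is continuous on [0, 1] and g' = 0 on (0, 1).
% Apply the single-variable mean value theorem on [0, 1]:
\[
  g(1) - g(0) = g'(c)(1 - 0) = 0 \quad \text{for some } c \in (0, 1).
\]
% The same argument on any subinterval [x, y] of [0, 1] gives g(x) = g(y),
% so g is constant on [0, 1]. In particular,
\[
  f(a) = f(\gamma(0)) = g(0) = g(1) = f(\gamma(1)) = f(b).
\]
```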
If $\gamma$ were differentiable, then this is much easier, since we can show $g' = 0$ by the chain rule:
\[
  g'(t) = Df(\gamma(t))\gamma'(t).
\]