5 Differentiability

IA Analysis I



5.3 Differentiation theorems
Everything we’ve proved so far is something we already knew. It’s just that now we
can prove it rigorously. In this section, we will come up with genuinely new
theorems, including but not limited to Taylor’s theorem, which gives us Taylor
series.
Theorem (Rolle’s theorem). Let $f$ be continuous on a closed interval $[a, b]$ (with $a < b$) and differentiable on $(a, b)$. Suppose that $f(a) = f(b)$. Then there exists $x \in (a, b)$ such that $f'(x) = 0$.
It is intuitively obvious: if you move up and down, and finally return to the
same point, then you must have changed direction some time. Then $f'(x) = 0$
at that time.
Proof. If $f$ is constant, then we’re done.
Otherwise, there exists $u$ such that $f(u) \neq f(a)$. wlog, $f(u) > f(a)$. Since $f$ is continuous on $[a, b]$, it attains a maximum there, and since $f(u) > f(a) = f(b)$, the maximum is not attained at $a$ or $b$.
Suppose the maximum is attained at $x \in (a, b)$. Then for any $h \neq 0$ (small enough that $x + h \in [a, b]$), we have
\[
  \frac{f(x + h) - f(x)}{h}
  \begin{cases}
    \leq 0 & h > 0 \\
    \geq 0 & h < 0
  \end{cases}
\]
since $f(x + h) - f(x) \leq 0$ by maximality of $f(x)$. By considering both sides as we take the limit $h \to 0$, we know that $f'(x) \leq 0$ and $f'(x) \geq 0$. So $f'(x) = 0$.
Corollary (Mean value theorem). Let $f$ be continuous on $[a, b]$ ($a < b$), and differentiable on $(a, b)$. Then there exists $x \in (a, b)$ such that
\[
  f'(x) = \frac{f(b) - f(a)}{b - a}.
\]
Note that $\frac{f(b) - f(a)}{b - a}$ is the slope of the line joining the points $(a, f(a))$ and $(b, f(b))$.
[Figure: graph of $f(x)$ with the values $f(a)$ and $f(b)$ marked.]
The mean value theorem is sometimes described as “rotate your head and
apply Rolle’s”. However, if we actually rotate it, we might end up with a
non-function. What we actually want is a shear.
Proof. Let
\[
  g(x) = f(x) - \frac{f(b) - f(a)}{b - a} x.
\]
Then
\[
  g(b) - g(a) = f(b) - f(a) - \frac{f(b) - f(a)}{b - a}(b - a) = 0.
\]
So by Rolle’s theorem, we can find $x \in (a, b)$ such that $g'(x) = 0$. So
\[
  f'(x) = \frac{f(b) - f(a)}{b - a},
\]
as required.
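The proof reduces the mean value theorem to finding a zero of $g'$. As a minimal numerical sketch of that idea (not part of the notes), the following looks for such a point by bisection. The function name `mean_value_point` and the test function $t^3$ on $[0, 2]$ are illustrative assumptions, and bisection only works here because $g'$ happens to change sign between the endpoints; the theorem itself only guarantees existence.

```python
def mean_value_point(f, df, a, b, tol=1e-12):
    """Find x in (a, b) with df(x) equal to the chord slope, by bisection.

    Hypothetical helper: assumes g'(x) = df(x) - slope changes sign
    between a and b, which the theorem does not guarantee in general.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g_prime(x):
        # derivative of g(x) = f(x) - slope * x from the proof
        return df(x) - slope

    lo, hi = a, b
    if g_prime(lo) * g_prime(hi) > 0:
        raise ValueError("g' does not change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g_prime(lo) * g_prime(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# f(t) = t^3 on [0, 2]: the chord slope is 4, so we need 3 x^2 = 4.
x = mean_value_point(lambda t: t**3, lambda t: 3 * t**2, 0.0, 2.0)
print(x, (4 / 3) ** 0.5)
```

For this choice of $f$ the point can also be found by hand, since $3x^2 = 4$ gives $x = 2/\sqrt{3}$, which lies in $(0, 2)$.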
We’ve always assumed that if a function has a positive derivative everywhere,
then the function is increasing. However, it turns out that this is really hard to
prove directly. It does, however, follow quite immediately from the mean value
theorem.
Example. Suppose $f'(x) > 0$ for every $x \in (a, b)$. Then for $u, v$ in $[a, b]$ with $u < v$, we can find $w \in (u, v)$ such that
\[
  \frac{f(v) - f(u)}{v - u} = f'(w) > 0.
\]
It follows that $f(v) > f(u)$. So $f$ is strictly increasing.
Similarly, if $f'(x) \geq 2$ for every $x$ and $f(0) = 0$, then $f(1) \geq 2$, or else we can find $x \in (0, 1)$ such that
\[
  2 \leq f'(x) = \frac{f(1) - f(0)}{1 - 0} = f(1).
\]
Theorem (Local version of inverse function theorem). Let $f$ be a function with
continuous derivative on $(a, b)$.
Let $x \in (a, b)$ and suppose that $f'(x) \neq 0$. Then there is an open interval $(u, v)$ containing $x$ on which $f$ is invertible (as a function from $(u, v)$ to $f((u, v))$). Moreover, if $g$ is the inverse, then $g'(f(z)) = \frac{1}{f'(z)}$ for every $z \in (u, v)$.
This says that if $f$ has a non-zero derivative, then it has an inverse locally
and the derivative of the inverse is $1/f'$.
Note that this not only requires $f$ to be differentiable, but the derivative
itself also has to be continuous.
Proof. wlog, $f'(x) > 0$. By the continuity of $f'$, we can find $\delta > 0$ such that $f'(z) > 0$ for every $z \in (x - \delta, x + \delta)$. By the mean value theorem, $f$ is strictly increasing on $(x - \delta, x + \delta)$, hence injective. Also, $f$ is continuous on $(x - \delta, x + \delta)$ by differentiability.
The result then follows from the inverse function theorem.
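The formula $g'(f(z)) = 1/f'(z)$ can be checked numerically. The sketch below is an illustration under stated assumptions: the function $f(z) = z^3 + z$, the helper `inverse_by_bisection`, the search interval $[-2, 2]$ and the step size are all choices made here, not taken from the notes. The inverse is computed by bisection (valid because this $f$ is strictly increasing) and differentiated with a central difference.

```python
def inverse_by_bisection(f, y, lo, hi, tol=1e-13):
    """Solve f(z) = y for z in [lo, hi], assuming f is strictly increasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def f(z):
    return z**3 + z


def df(z):
    # f'(z) = 3 z^2 + 1 > 0 everywhere, so f is strictly increasing
    return 3 * z**2 + 1


def g(t):
    # numerical inverse of f, restricted to the assumed interval [-2, 2]
    return inverse_by_bisection(f, t, -2.0, 2.0)


z = 0.5
y = f(z)
h = 1e-5
g_prime_numeric = (g(y + h) - g(y - h)) / (2 * h)  # central difference
g_prime_theory = 1 / df(z)
print(g_prime_numeric, g_prime_theory)
```

The two printed values should agree to several decimal places; the agreement is limited by the finite-difference step and the bisection tolerance, not by the theorem.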
Finally, we are going to prove Taylor’s theorem. To do so, we will first need
some lemmas.
Theorem (Higher-order Rolle’s theorem). Let $f$ be continuous on $[a, b]$ ($a < b$)
and $n$-times differentiable on an open interval containing $[a, b]$. Suppose that
\[
  f(a) = f'(a) = f^{(2)}(a) = \cdots = f^{(n-1)}(a) = f(b) = 0.
\]
Then $\exists x \in (a, b)$ such that $f^{(n)}(x) = 0$.
Proof. Induct on $n$. The $n = 1$ base case is just Rolle’s theorem.
Suppose we have $k < n$ and $x_k \in (a, b)$ such that $f^{(k)}(x_k) = 0$. Since $f^{(k)}(a) = 0$, we can find $x_{k+1} \in (a, x_k)$ such that $f^{(k+1)}(x_{k+1}) = 0$ by Rolle’s
theorem.
So the result follows by induction.
Corollary. Suppose that $f$ and $g$ are both $n$-times differentiable on an open interval
containing $[a, b]$ and that $f^{(k)}(a) = g^{(k)}(a)$ for $k = 0, 1, \cdots, n - 1$, and also
$f(b) = g(b)$. Then there exists $x \in (a, b)$ such that $f^{(n)}(x) = g^{(n)}(x)$.
Proof. Apply the higher-order Rolle’s theorem to $f - g$.
Now we shall show that for any $f$, we can find a polynomial $p$ of degree at
most $n$ that satisfies the conditions for $g$, i.e. a $p$ such that $p^{(k)}(a) = f^{(k)}(a)$ for
$k = 0, 1, \cdots, n - 1$ and $p(b) = f(b)$.
A useful ingredient is the observation that if
\[
  Q_k(x) = \frac{(x - a)^k}{k!},
\]
then
\[
  Q_k^{(j)}(a) =
  \begin{cases}
    1 & j = k \\
    0 & j \neq k
  \end{cases}
\]
Therefore, if
\[
  Q(x) = \sum_{k=0}^{n-1} f^{(k)}(a) Q_k(x),
\]
then
\[
  Q^{(j)}(a) = f^{(j)}(a)
\]
for $j = 0, 1, \cdots, n - 1$. To get $p(b) = f(b)$, we use our $n$th degree polynomial
term:
\[
  p(x) = Q(x) + \frac{(x - a)^n}{(b - a)^n} \bigl(f(b) - Q(b)\bigr).
\]
Then our final term does not mess up the derivatives of order up to $n - 1$ at $a$, and gives
$p(b) = f(b)$.
By the previous corollary, we can find $x \in (a, b)$ such that
\[
  f^{(n)}(x) = p^{(n)}(x).
\]
That is,
\[
  f^{(n)}(x) = \frac{n!}{(b - a)^n} \bigl(f(b) - Q(b)\bigr).
\]
Therefore
\[
  f(b) = Q(b) + \frac{(b - a)^n}{n!} f^{(n)}(x).
\]
Alternatively,
\[
  f(b) = f(a) + (b - a) f'(a) + \cdots + \frac{(b - a)^{n-1}}{(n - 1)!} f^{(n-1)}(a) + \frac{(b - a)^n}{n!} f^{(n)}(x).
\]
Setting $b = a + h$, we can rewrite this as
Theorem (Taylor’s theorem with the Lagrange form of remainder).
\[
  f(a + h) = \underbrace{f(a) + h f'(a) + \cdots + \frac{h^{n-1}}{(n - 1)!} f^{(n-1)}(a)}_{(n-1)\text{-degree approximation to } f \text{ near } a} + \underbrace{\frac{h^n}{n!} f^{(n)}(x)}_{\text{error term}}
\]
for some $x \in (a, a + h)$.
Strictly speaking, we only proved it for the case when $h > 0$, but we can
easily show it holds for $h < 0$ too by considering $g(x) = f(-x)$.
Note that the remainder term is not necessarily small, but this often gives
us the best $(n - 1)$-degree approximation to $f$ near $a$. For example, if $f^{(n)}$ is
bounded by $C$ near $a$, then
\[
  \left| \frac{h^n}{n!} f^{(n)}(x) \right| \leq \frac{C}{n!} |h|^n = o(h^{n-1}).
\]
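The bound above can be illustrated numerically. The sketch below is an assumption-laden example rather than anything from the notes: it takes $f = \sin$ and $a = 0$, where every derivative is bounded by $C = 1$, and compares the actual error of the $(n-1)$-degree Taylor polynomial with $|h|^n / n!$. The helper name `taylor_sin` and the value $h = 0.5$ are illustrative choices.

```python
import math


def taylor_sin(h, n):
    """Taylor polynomial of sin about 0 using derivatives of order 0..n-1."""
    # The derivatives of sin at 0 cycle through 0, 1, 0, -1.
    cycle = [0.0, 1.0, 0.0, -1.0]
    return sum(cycle[k % 4] * h**k / math.factorial(k) for k in range(n))


h = 0.5
rows = []
for n in range(1, 8):
    error = abs(math.sin(h) - taylor_sin(h, n))
    bound = abs(h) ** n / math.factorial(n)  # C = 1 since |sin^(n)| <= 1
    rows.append((n, error, bound))
    print(n, error, bound)
```

In each row the error should not exceed the bound, and both shrink rapidly as $n$ grows for this fixed $h$.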
Example. Let $f: \mathbb{R} \to \mathbb{R}$ be a differentiable function such that $f(0) = 1$ and $f'(x) = f(x)$ for every $x$ (intuitively, we know it is $e^x$, but that thing doesn’t
exist!). Then for every $x$, we have
\[
  f(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^\infty \frac{x^n}{n!}.
\]
While it seems like we can prove this works by differentiating it and seeing that
$f'(x) = f(x)$, the sum rule only applies for finite sums. We don’t know we can
differentiate a sum term by term. So we have to use Taylor’s theorem.
Since $f'(x) = f(x)$, it follows that all derivatives exist. By Taylor’s theorem,
\[
  f(x) = f(0) + f'(0) x + \frac{f^{(2)}(0)}{2!} x^2 + \cdots + \frac{f^{(n-1)}(0)}{(n - 1)!} x^{n-1} + \frac{f^{(n)}(u)}{n!} x^n
\]
for some $u$ between $0$ and $x$. This equals
\[
  f(x) = \sum_{k=0}^{n-1} \frac{x^k}{k!} + \frac{f^{(n)}(u)}{n!} x^n.
\]
We must show that the remainder term $\frac{f^{(n)}(u)}{n!} x^n \to 0$ as $n \to \infty$. Note here
that $x$ is fixed, but $u$ can depend on $n$.
But we know that $f^{(n)}(u) = f(u)$, and since $f$ is differentiable, it is continuous,
and is bounded on $[0, x]$. Suppose $|f(u)| \leq C$ on $[0, x]$. Then
\[
  \left| \frac{f^{(n)}(u)}{n!} x^n \right| \leq \frac{C}{n!} |x|^n \to 0
\]
from limit theorems. So it follows that
\[
  f(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^\infty \frac{x^n}{n!}.
\]
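To see the convergence concretely, the following sketch computes partial sums of the series and the remainder bound $C |x|^n / n!$ for a fixed $x$. It rests on assumptions not made in the notes: it uses Python's `math.exp` as a reference value for $f$ (which the text has deliberately not constructed yet), takes $C = e^{|x|}$ as the bound on $[0, x]$, and the name `partial_sum` and the value $x = 3$ are illustrative.

```python
import math


def partial_sum(x, n):
    """Sum of x^k / k! for k = 0, ..., n - 1."""
    total, term = 0.0, 1.0
    for k in range(n):
        total += term
        term *= x / (k + 1)  # turn x^k / k! into x^(k+1) / (k+1)!
    return total


x = 3.0
C = math.exp(abs(x))  # assumed bound for |f| on [0, x], taking f = exp
rows = []
for n in (5, 10, 20, 30):
    bound = C * abs(x) ** n / math.factorial(n)
    gap = abs(math.exp(x) - partial_sum(x, n))
    rows.append((n, gap, bound))
    print(n, gap, bound)
```

The bound decays faster than any geometric sequence once $n$ exceeds $|x|$, which is why the argument works for every fixed $x$. For large $n$ the printed gap is dominated by floating-point rounding rather than by the true remainder.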