5 Differentiability

IA Analysis I



5.2 Differentiation
Similar to what we did in IA Differential Equations, we define the derivative as
a limit.
Definition (Differentiable function). $f$ is differentiable at $a$ with derivative $\lambda$ if
\[
  \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lambda.
\]
Equivalently, if
\[
  \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = \lambda.
\]
We write $\lambda = f'(a)$.
Here we see why, in the definition of the limit, we say that we don't care what happens when $x = a$. In our definition here, the difference quotient takes the form $0/0$ when $x = a$, so we cannot make any sense of its value there.
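As a rough numerical illustration of the definition, the sketch below evaluates the difference quotient of $f(x) = x^2$ at $a = 3$ for shrinking $h$. The function name `difference_quotient` and the choice of $f$ and $a$ are assumptions made for this example only; floating-point arithmetic is used, so the values are approximate.

```python
def difference_quotient(f, a, h):
    """Return (f(a + h) - f(a)) / h, which has the form 0/0 at h = 0."""
    return (f(a + h) - f(a)) / h


def square(x):
    return x * x


a = 3.0
# The derivative of x^2 at a = 3 is 6, so the quotients should approach 6.
quotients = [difference_quotient(square, a, 10.0 ** -k) for k in range(1, 7)]
for k, q in enumerate(quotients, start=1):
    print(f"h = 1e-{k}: quotient = {q:.8f}")

# At h = 0 the quotient is 0/0; Python raises an error instead of returning a value.
try:
    difference_quotient(square, a, 0.0)
    defined_at_zero = True
except ZeroDivisionError:
    defined_at_zero = False
print("defined at h = 0:", defined_at_zero)
```

The quotient is never evaluated at $h = 0$ when taking the limit; only its behaviour for small non-zero $h$ matters.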
Alternatively, we write the definition of differentiation as
\[
  \frac{f(x + h) - f(x)}{h} = f'(x) + \varepsilon(h),
\]
where $\varepsilon(h) \to 0$ as $h \to 0$. Rearranging, we can deduce that
\[
  f(x + h) = f(x) + h f'(x) + h\varepsilon(h).
\]
Note that by the definition of the limit, we don't have to care what value $\varepsilon$ takes when $h = 0$. It can be $0$, $\pi$ or $10^{10^{10}}$. However, we usually take $\varepsilon(0) = 0$ so that $\varepsilon$ is continuous.
Using the small-$o$ notation, we usually write $o(h)$ for a function that satisfies $\frac{o(h)}{h} \to 0$ as $h \to 0$. Hence we have
Proposition.
\[
  f(x + h) = f(x) + h f'(x) + o(h).
\]
We can interpret this as an approximation of $f(x + h)$:
\[
  f(x + h) = \underbrace{f(x) + h f'(x)}_{\text{linear approximation}} + \underbrace{o(h)}_{\text{error term}}.
\]
Differentiability says that this is a very good approximation: the error is $o(h)$, i.e. it tends to 0 faster than $h$ does.
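To see what an $o(h)$ error looks like in practice, the following sketch computes the error of the linear approximation for $f = \sin$ at $x = 1$ and divides it by $h$. The helper name `linear_error` and the sample point are assumptions chosen for illustration; the ratios are floating-point approximations.

```python
import math


def linear_error(f, df, x, h):
    """Error of the linear approximation f(x) + h f'(x) to f(x + h)."""
    return f(x + h) - (f(x) + h * df(x))


x = 1.0
ratios = []
for k in range(1, 6):
    h = 10.0 ** -k
    # For an o(h) error, |error| / h should tend to 0 as h tends to 0.
    ratio = abs(linear_error(math.sin, math.cos, x, h)) / h
    ratios.append(ratio)
    print(f"h = 1e-{k}: |error| / h = {ratio:.3e}")
```

The printed ratios shrink roughly in proportion to $h$, which is consistent with the error term being $o(h)$.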
Conversely, we have
Proposition. If $f(x + h) = f(x) + h f'(x) + o(h)$, then $f$ is differentiable at $x$ with derivative $f'(x)$.
Proof.
\[
  \frac{f(x + h) - f(x)}{h} = f'(x) + \frac{o(h)}{h} \to f'(x).
\]
We can take derivatives multiple times, and get multiple derivatives.
Definition (Multiple derivatives). This is defined recursively: $f$ is $(n + 1)$-times differentiable if it is $n$-times differentiable and its $n$th derivative $f^{(n)}$ is differentiable. We write $f^{(n+1)}$ for the derivative of $f^{(n)}$, i.e. the $(n + 1)$th derivative of $f$.
Informally, we will say $f$ is $n$-times differentiable if we can differentiate it $n$ times, and the $n$th derivative is $f^{(n)}$.
We can prove the usual rules of differentiation using the small-$o$ notation. They can also be proved by considering limits directly, but the notation becomes a bit more daunting.
Lemma (Sum and product rule). Let $f, g$ be differentiable at $x$. Then $f + g$ and $fg$ are differentiable at $x$, with
\begin{align*}
  (f + g)'(x) &= f'(x) + g'(x),\\
  (fg)'(x) &= f'(x)g(x) + f(x)g'(x).
\end{align*}
Proof.
\begin{align*}
  (f + g)(x + h) &= f(x + h) + g(x + h)\\
  &= f(x) + h f'(x) + o(h) + g(x) + h g'(x) + o(h)\\
  &= (f + g)(x) + h(f'(x) + g'(x)) + o(h),\\
  fg(x + h) &= f(x + h)g(x + h)\\
  &= [f(x) + h f'(x) + o(h)][g(x) + h g'(x) + o(h)]\\
  &= f(x)g(x) + h[f'(x)g(x) + f(x)g'(x)]\\
  &\quad + \underbrace{o(h)[g(x) + f(x) + h f'(x) + h g'(x) + o(h)] + h^2 f'(x)g'(x)}_{\text{error term}}.
\end{align*}
By limit theorems, the error term is $o(h)$. So we can write this as
\[
  fg(x + h) = fg(x) + h(f'(x)g(x) + f(x)g'(x)) + o(h).
\]
Lemma (Chain rule). If $f$ is differentiable at $x$ and $g$ is differentiable at $f(x)$, then $g \circ f$ is differentiable at $x$ with derivative $g'(f(x))f'(x)$.
Proof. If one is sufficiently familiar with the small-$o$ notation, then we can proceed as
\[
  g(f(x + h)) = g(f(x) + h f'(x) + o(h)) = g(f(x)) + h f'(x) g'(f(x)) + o(h).
\]
If not, we can be a bit more explicit about the computations, and use $h\varepsilon(h)$ instead of $o(h)$:
\begin{align*}
  (g \circ f)(x + h) &= g(f(x + h))\\
  &= g[f(x) + \underbrace{h f'(x) + h\varepsilon_1(h)}_{\text{the ``$h$'' term}}]\\
  &= g(f(x)) + \big(h f'(x) + h\varepsilon_1(h)\big) g'(f(x))\\
  &\quad + \big(h f'(x) + h\varepsilon_1(h)\big)\, \varepsilon_2\big(h f'(x) + h\varepsilon_1(h)\big)\\
  &= g \circ f(x) + h g'(f(x)) f'(x)\\
  &\quad + \underbrace{h\Big[\varepsilon_1(h) g'(f(x)) + \big(f'(x) + \varepsilon_1(h)\big)\, \varepsilon_2\big(h f'(x) + h\varepsilon_1(h)\big)\Big]}_{\text{error term}}.
\end{align*}
We want to show that the error term is $o(h)$, i.e. that it divided by $h$ tends to 0 as $h \to 0$.
But $\varepsilon_1(h) g'(f(x)) \to 0$, $f'(x) + \varepsilon_1(h)$ is bounded, and $\varepsilon_2(h f'(x) + h\varepsilon_1(h)) \to 0$ because $h f'(x) + h\varepsilon_1(h) \to 0$ and $\varepsilon_2(0) = 0$. So our error term is $o(h)$.
We usually don't write out the error terms so explicitly, and just use heuristics like $f(x + o(h)) = f(x) + o(h)$; $o(h) + o(h) = o(h)$; and $g(x) \cdot o(h) = o(h)$ for any (bounded) function $g$.
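The chain rule can be sanity-checked numerically. The sketch below compares a symmetric difference quotient of $g \circ f$ with $g'(f(x))f'(x)$, taking $f(x) = x^2 + 1$ and $g = \sin$. These functions, the sample point and the helper `central_difference` are assumptions for illustration; the symmetric quotient is a numerical convenience, not the one-sided quotient used in the definition.

```python
import math


def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def f(x):
    return x * x + 1.0


def f_prime(x):
    return 2.0 * x


g = math.sin
g_prime = math.cos


def composite(x):
    """The composition g(f(x))."""
    return g(f(x))


x = 0.8
numerical = central_difference(composite, x)
chain_rule = g_prime(f(x)) * f_prime(x)
print(f"numerical estimate: {numerical:.8f}")
print(f"g'(f(x)) f'(x):     {chain_rule:.8f}")
```

Agreement to several decimal places is only evidence, not a proof; the proof is the error-term argument above.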
Example.
(i) Constant functions are differentiable with derivative 0.
(ii) $f(x) = \lambda x$ is differentiable with derivative $\lambda$.
(iii) Using the product rule, we can show by induction that $x^n$ is differentiable with derivative $nx^{n-1}$.
(iv) Hence all polynomials are differentiable.
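Since polynomials can be differentiated term by term using (iii), the product rule can be checked exactly on them. The sketch below represents a polynomial as a list of integer coefficients (constant term first), a representation chosen here purely for illustration, and compares $(fg)'$ with $f'g + fg'$ for two sample polynomials.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (constant term first)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def poly_diff(p):
    """Differentiate term by term, using d/dx x^n = n x^(n-1)."""
    return [n * c for n, c in enumerate(p)][1:] or [0]


f = [1, 2, 3]      # 1 + 2x + 3x^2
g = [0, -1, 0, 4]  # -x + 4x^3

lhs = poly_diff(poly_mul(f, g))  # (fg)'
rhs = poly_add(poly_mul(poly_diff(f), g), poly_mul(f, poly_diff(g)))  # f'g + fg'
print(lhs)
print(rhs)
```

Because the coefficients are integers, the comparison involves no rounding: the two lists agree entry by entry.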
Example. Let $f(x) = 1/x$. If $x \neq 0$, then
\[
  \frac{f(x + h) - f(x)}{h} = \frac{\frac{1}{x + h} - \frac{1}{x}}{h} = \frac{\frac{-h}{x(x + h)}}{h} = \frac{-1}{x(x + h)} \to \frac{-1}{x^2}
\]
by limit theorems.
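The algebra in this example can be checked with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction` at the sample point $x = 3$ (an arbitrary choice for illustration) to confirm that the difference quotient equals $-1/(x(x+h))$ exactly, and that it approaches $-1/x^2$.

```python
from fractions import Fraction


def reciprocal(x):
    """f(x) = 1/x, evaluated exactly when x is a Fraction."""
    return 1 / x


x = Fraction(3)
steps = []
for k in range(1, 7):
    h = Fraction(1, 10 ** k)
    quotient = (reciprocal(x + h) - reciprocal(x)) / h
    closed_form = -1 / (x * (x + h))  # the simplified expression from the example
    steps.append((h, quotient, closed_form))
    print(f"h = {h}: quotient = {float(quotient):.8f}")

limit_value = -1 / x ** 2  # the claimed derivative at x
print("limit:", limit_value)
```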
Lemma (Quotient rule). If $f$ and $g$ are differentiable at $x$, and $g(x) \neq 0$, then $f/g$ is differentiable at $x$ with derivative
\[
  \left(\frac{f}{g}\right)'(x) = \frac{f'(x)g(x) - g'(x)f(x)}{g(x)^2}.
\]
Proof. First note that $1/g(x) = h(g(x))$ where $h(y) = 1/y$. So $1/g(x)$ is differentiable at $x$ with derivative $\frac{-1}{g(x)^2} g'(x)$ by the chain rule.
By the product rule, $f/g$ is differentiable at $x$ with derivative
\[
  \frac{f'(x)}{g(x)} - f(x)\frac{g'(x)}{g(x)^2} = \frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}.
\]
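As a quick check of the quotient rule, the sketch below takes $f(x) = x^2 + 1$ and $g(x) = x + 2$ at $x = 1$ (sample choices, not from the notes) and compares the exact difference quotient of $f/g$ with the value given by the formula, using rational arithmetic so that no rounding is involved.

```python
from fractions import Fraction


def f(x):
    return x * x + 1


def f_prime(x):
    return 2 * x


def g(x):
    return x + 2


def g_prime(x):
    return 1


def quotient_rule(x):
    """(f'(x) g(x) - g'(x) f(x)) / g(x)^2, as in the lemma."""
    return (f_prime(x) * g(x) - g_prime(x) * f(x)) / g(x) ** 2


x = Fraction(1)
formula_value = quotient_rule(x)

gaps = []
for k in range(1, 7):
    h = Fraction(1, 10 ** k)
    difference_quotient = (f(x + h) / g(x + h) - f(x) / g(x)) / h
    # The gap between the quotient and the formula should shrink with h.
    gaps.append(abs(difference_quotient - formula_value))
    print(f"h = {h}: gap = {float(gaps[-1]):.3e}")
```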
Lemma. If $f$ is differentiable at $x$, then it is continuous at $x$.
Proof. As $y \to x$, $\frac{f(y) - f(x)}{y - x} \to f'(x)$. Since $y - x \to 0$, we have $f(y) - f(x) \to 0$ by the product theorem of limits. So $f(y) \to f(x)$. So $f$ is continuous at $x$.
Theorem. Let $f: [a, b] \to [c, d]$ be differentiable on $(a, b)$, continuous on $[a, b]$, and strictly increasing. Suppose that $f'(x)$ never vanishes. Suppose further that $f(a) = c$ and $f(b) = d$. Then $f$ has an inverse $g$ and for each $y \in (c, d)$, $g$ is differentiable at $y$ with derivative $1/f'(g(y))$.
In human language, this states that if $f$ is invertible, then the derivative of $f^{-1}$ is $1/f'$, with $f'$ evaluated at the corresponding point $f^{-1}(y)$.
Note that the conditions will (almost) always require $f$ to be differentiable on the open interval $(a, b)$ and continuous on the closed interval $[a, b]$. This is because it doesn't make sense to talk about differentiability at $a$ or $b$, since the definition of $f'(a)$ requires $f$ to be defined on both sides of $a$.
Proof. $g$ exists by an earlier theorem about inverses of continuous functions.
Let $y, y + k \in (c, d)$ with $k \neq 0$. Let $x = g(y)$, $x + h = g(y + k)$. Since $g$ is injective, $h \neq 0$.
Since $g(y + k) = x + h$, we have $y + k = f(x + h)$. So $k = f(x + h) - y = f(x + h) - f(x)$. So
\[
  \frac{g(y + k) - g(y)}{k} = \frac{(x + h) - x}{f(x + h) - f(x)} = \left(\frac{f(x + h) - f(x)}{h}\right)^{-1}.
\]
As $k \to 0$, since $g$ is continuous, $g(y + k) \to g(y)$. So $h \to 0$. So
\[
  \frac{g(y + k) - g(y)}{k} \to [f'(x)]^{-1} = [f'(g(y))]^{-1}.
\]
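The theorem can be illustrated numerically on a function whose inverse has no convenient closed form. The sketch below assumes $f(x) = x^3 + x$ on $[0, 2]$, which is strictly increasing with $f'(x) = 3x^2 + 1$ never zero, so $c = 0$ and $d = 10$. The inverse $g$ is computed by bisection; this example and the helper names are illustrative choices, not part of the notes.

```python
def f(x):
    return x ** 3 + x


def f_prime(x):
    return 3 * x ** 2 + 1


def g(y, lo=0.0, hi=2.0):
    """Inverse of f on [0, 2] by bisection, using that f is strictly increasing."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


y = 2.0   # f(1) = 2, so g(2) should be 1
k = 1e-5
# Symmetric difference quotient of the inverse g at y (a numerical convenience).
numerical = (g(y + k) - g(y - k)) / (2 * k)
predicted = 1 / f_prime(g(y))  # the theorem's value 1 / f'(g(y))
print(f"g(y) = {g(y):.8f}")
print(f"numerical g'(y) = {numerical:.8f}")
print(f"1 / f'(g(y))   = {predicted:.8f}")
```

Here $g(2) = 1$ and $f'(1) = 4$, so the theorem predicts $g'(2) = 1/4$.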
Example. Let $f(x) = x^{1/2}$ for $x > 0$. Then $f$ is the inverse of $g(x) = x^2$. So
\[
  f'(x) = \frac{1}{g'(f(x))} = \frac{1}{2x^{1/2}} = \frac{1}{2} x^{-1/2}.
\]
Similarly, we can show that the derivative of $x^{1/q}$ is $\frac{1}{q} x^{1/q - 1}$.
Then let's take $x^{p/q} = (x^{1/q})^p$. By the chain rule, its derivative is
\[
  p(x^{1/q})^{p-1} \cdot \frac{1}{q} x^{1/q - 1} = \frac{p}{q} x^{\frac{p-1}{q} + \frac{1}{q} - 1} = \frac{p}{q} x^{\frac{p}{q} - 1}.
\]
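The formula for the derivative of $x^{p/q}$ can be checked at a sample point. The sketch below assumes $p = 3$, $q = 2$ and $x = 4$ (arbitrary illustrative values) and compares a symmetric difference quotient with both the intermediate chain-rule expression and the simplified form $\frac{p}{q} x^{p/q - 1}$.

```python
def power(x, p, q):
    """x^(p/q) for x > 0, computed in floating point."""
    return x ** (p / q)


p, q, x = 3, 2, 4.0
h = 1e-6

# Symmetric difference quotient of x^(p/q) at x (a numerical convenience).
numerical = (power(x + h, p, q) - power(x - h, p, q)) / (2 * h)

root = x ** (1 / q)  # x^(1/q)
chain_rule_form = p * root ** (p - 1) * (1 / q) * x ** (1 / q - 1)
closed_form = (p / q) * x ** (p / q - 1)

print(f"numerical:       {numerical:.8f}")
print(f"chain rule form: {chain_rule_form:.8f}")
print(f"closed form:     {closed_form:.8f}")
```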