1 Manifolds
III Differential Geometry
1.2 Smooth functions and derivatives
From now on, $M$ and $N$ will be manifolds. As usual, we would like to talk about maps between manifolds. What does it mean for such a map to be smooth? In the case of a function $M \to \mathbb{R}$, we had to check it on each chart of $M$. Now that we have functions $M \to N$, we need to check it on charts of both $N$ and $M$.
Definition (Smooth function). A function $f : M \to N$ is smooth at a point $p \in M$ if there are charts $(U, \varphi)$ for $M$ and $(V, \xi)$ for $N$ with $p \in U$ and $f(p) \in V$ such that $\xi \circ f \circ \varphi^{-1} : \varphi(U) \to \xi(V)$ is smooth at $\varphi(p)$.
A function is smooth if it is smooth at all points p ∈ M.
A diffeomorphism is a smooth f with a smooth inverse.
We write $C^\infty(M, N)$ for the space of smooth maps $f : M \to N$. We write $C^\infty(M)$ for $C^\infty(M, \mathbb{R})$, and this has the additional structure of an algebra, i.e. a vector space with multiplication.
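[Figure: the charts $\varphi$ and $\xi$, the map $f$, and the coordinate representation $\xi \circ f \circ \varphi^{-1}$.]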
Equivalently, $f$ is smooth at $p$ if $\xi \circ f \circ \varphi^{-1}$ is smooth at $\varphi(p)$ for any such charts $(U, \varphi)$ and $(V, \xi)$.
Example. Let $\varphi : U \to \mathbb{R}^n$ be a chart. Then $\varphi : U \to \varphi(U)$ is a diffeomorphism.
Definition (Curve). A curve is a smooth map $I \to M$, where $I$ is a non-empty open interval.
To discuss derivatives, we first look at the case where $U \subseteq \mathbb{R}^n$ is open. Suppose $f : U \to \mathbb{R}$ is smooth. If $p \in U$ and $v \in \mathbb{R}^n$, recall that the directional derivative is defined by
\[
  Df|_p(v) = \lim_{t \to 0} \frac{f(p + tv) - f(p)}{t}.
\]
If $v = e_i = (0, \cdots, 0, 1, 0, \cdots, 0)$, then we write
\[
  Df|_p(e_i) = \left.\frac{\partial f}{\partial x_i}\right|_p.
\]
Also, we know $Df|_p : \mathbb{R}^n \to \mathbb{R}$ is a linear map (by definition of smooth).
Note that here $p$ and $v$ are both vectors, but they play different roles: $p$ is an element in the domain $U$, while $v$ is an arbitrary vector in $\mathbb{R}^n$. Even if $v$ is enormous, by taking a small enough $t$, we find that $p + tv$ will eventually be inside $U$.
If we have a general manifold, we can still talk about the point $p$. However, we don't have anything that plays the role of the vector $v$. Our first goal is to define the tangent space to a manifold, which captures where the "directions" live.
An obvious way to do so would be to use a curve. Suppose $\gamma : I \to M$ is a curve, with $\gamma(0) = p \in U \subseteq M$, and $f : U \to \mathbb{R}$ is smooth. We can then take the derivative of $f$ along $\gamma$ as before. We let
\[
  X(f) = \left.\frac{d}{dt}\right|_{t=0} f(\gamma(t)).
\]
It is an exercise to see that $X : C^\infty(U) \to \mathbb{R}$ is a linear map, and it satisfies the Leibniz rule
\[
  X(fg) = f(p)X(g) + g(p)X(f).
\]
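The exercise can be sketched as follows, using only the ordinary product rule for the one-variable functions $t \mapsto f(\gamma(t))$ and $t \mapsto g(\gamma(t))$. Here $g \in C^\infty(U)$ is a second smooth function, as in the statement of the Leibniz rule, and $\gamma(0) = p$.

```latex
\begin{align*}
X(fg)
  &= \left.\frac{d}{dt}\right|_{t=0} \bigl( f(\gamma(t))\, g(\gamma(t)) \bigr) \\
  &= f(\gamma(0)) \left.\frac{d}{dt}\right|_{t=0} g(\gamma(t))
   + g(\gamma(0)) \left.\frac{d}{dt}\right|_{t=0} f(\gamma(t)) \\
  &= f(p)\, X(g) + g(p)\, X(f).
\end{align*}
```

Linearity of $X$ follows in the same way from linearity of $\frac{d}{dt}$.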
We denote $X$ by $\dot\gamma(0)$. We might think of defining the tangent space as curves up to some equivalence relation, but if we do this, there is no obvious vector space structure on it. The trick is to instead define a vector to be the derivation $X$ that the curve induces. The set of such objects then has an obvious vector space structure.
Definition (Derivation). A derivation on an open subset $U \subseteq M$ at $p \in U$ is a linear map $X : C^\infty(U) \to \mathbb{R}$ satisfying the Leibniz rule
\[
  X(fg) = f(p)X(g) + g(p)X(f).
\]
Definition (Tangent space). Let $p \in U \subseteq M$, where $U$ is open. The tangent space of $M$ at $p$ is the vector space
\[
  T_pM = \{ \text{derivations on } U \text{ at } p \} \equiv \operatorname{Der}_p(C^\infty(U)).
\]
The subscript $p$ tells us the point at which we are taking the tangent space.
Why is this the “right” definition? There are two things we would want to
be true:
(i) The definition doesn’t actually depend on U .
(ii) This definition agrees with the usual definition of tangent vectors in $\mathbb{R}^n$.
We will do the first part at the end by bump functions, and will do the second part now. Note that it follows from the second part that every tangent vector comes from the derivative of a path, because this is certainly true for the usual definition of tangent vectors in $\mathbb{R}^n$ (take a straight line), and this is a completely local problem.
Example. Let $U \subseteq \mathbb{R}^n$ be open, and let $p \in U$. Then we have tangent vectors
\[
  \left.\frac{\partial}{\partial x_i}\right|_p \in T_p\mathbb{R}^n, \quad i = 1, \ldots, n.
\]
These correspond to the canonical basis vectors in $\mathbb{R}^n$.
Lemma. $\left.\frac{\partial}{\partial x_1}\right|_p, \cdots, \left.\frac{\partial}{\partial x_n}\right|_p$ is a basis of $T_p\mathbb{R}^n$. So these are all the derivations.
The idea of the proof is to show that a derivation can only depend on the first order derivatives of a function, and all possibilities will be covered by the $\frac{\partial}{\partial x_i}$.
Proof. Independence is clear as
\[
  \frac{\partial x_j}{\partial x_i} = \delta_{ij}.
\]
We need to show spanning. For notational convenience, we wlog take $p = 0$. Let $X \in T_0\mathbb{R}^n$.
We first show that if $g \in C^\infty(U)$ is the constant function $g = 1$, then $X(g) = 0$. Indeed, we have
\[
  X(g) = X(g^2) = g(0)X(g) + X(g)g(0) = 2X(g),
\]
hence $X(g) = 0$. Thus, if $h$ is any constant function, say, $c$, then $X(h) = X(cg) = cX(g) = 0$. So the derivative of any constant function vanishes.
In general, let $f \in C^\infty(U)$. By Taylor's theorem, we have
\[
  f(x_1, \cdots, x_n) = f(0) + \sum_{i=1}^n \left.\frac{\partial f}{\partial x_i}\right|_0 x_i + \varepsilon,
\]
where $\varepsilon$ is a sum of terms of the form $x_i x_j h$ with $h \in C^\infty(U)$.
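One way to see that the remainder has this form is Taylor's theorem with integral remainder. The sketch below assumes, for simplicity, that $U$ is star-shaped about $0$ (for example an open ball), so that the segment from $0$ to $x$ lies in $U$. The names $h_{ij}$ are introduced here only for illustration.

```latex
\[
  \varepsilon(x) = \sum_{i,j=1}^n x_i x_j\, h_{ij}(x),
  \qquad
  h_{ij}(x) = \int_0^1 (1 - t)\,
    \frac{\partial^2 f}{\partial x_i\, \partial x_j}(tx)\, dt .
\]
```

Each $h_{ij}$ is smooth on $U$ by differentiation under the integral sign. For a general open $U$, one would first restrict to a small ball around $0$, which is harmless once the tangent space is known not to depend on $U$.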
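We set $\lambda_i = X(x_i) \in \mathbb{R}$. We first claim that $X(\varepsilon) = 0$. Indeed, since the functions $x_i$ and $x_j h$ both vanish at $0$, we have
\[
  X(x_i x_j h) = x_i(0)X(x_j h) + (x_j h)(0)X(x_i) = 0.
\]
Since $X$ is linear and vanishes on the constant $f(0)$, we have
\[
  X(f) = \sum_{i=1}^n \lambda_i \left.\frac{\partial f}{\partial x_i}\right|_0.
\]
So we have
\[
  X = \sum_{i=1}^n \lambda_i \left.\frac{\partial}{\partial x_i}\right|_0.
\]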
Given this definition of a tangent vector, we have a rather silly and tautological
definition of the derivative of a smooth function.
Definition (Derivative). Suppose $F \in C^\infty(M, N)$, say $F(p) = q$. We define $DF|_p : T_pM \to T_qN$ by
\[
  DF|_p(X)(g) = X(g \circ F)
\]
for $X \in T_pM$ and $g \in C^\infty(V)$ with $q \in V \subseteq N$.
This is a linear map called the derivative of $F$ at $p$.
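[Figure: commutative triangle with $F : M \to N$, $g : N \to \mathbb{R}$ and the composite $g \circ F : M \to \mathbb{R}$.]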
With a silly definition of a derivative comes a silly definition of the chain
rule.
Proposition (Chain rule). Let $M, N, P$ be manifolds, and $F \in C^\infty(M, N)$, $G \in C^\infty(N, P)$, and $p \in M$, $q = F(p)$. Then we have
\[
  D(G \circ F)|_p = DG|_q \circ DF|_p.
\]
Proof. Let $h \in C^\infty(P)$ and $X \in T_pM$. We have
\[
  DG|_q(DF|_p(X))(h) = DF|_p(X)(h \circ G) = X(h \circ G \circ F) = D(G \circ F)|_p(X)(h).
\]
Note that this does not provide a new, easy proof of the chain rule. Indeed,
to come this far into the course, we have used the actual chain rule something
like ten thousand times.
Corollary. If $F$ is a diffeomorphism, then $DF|_p$ is a linear isomorphism, and
\[
  (DF|_p)^{-1} = D(F^{-1})|_{F(p)}.
\]
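The corollary follows from the chain rule applied to $F^{-1} \circ F = \mathrm{id}_M$ and $F \circ F^{-1} = \mathrm{id}_N$, together with the observation, immediate from the definition, that the derivative of an identity map is the identity on the tangent space. A short sketch, where $N$ denotes the codomain of $F$:

```latex
\begin{align*}
D(F^{-1})|_{F(p)} \circ DF|_p
  &= D(F^{-1} \circ F)|_p = D(\mathrm{id}_M)|_p = \mathrm{id}_{T_pM}, \\
DF|_p \circ D(F^{-1})|_{F(p)}
  &= D(F \circ F^{-1})|_{F(p)} = D(\mathrm{id}_N)|_{F(p)} = \mathrm{id}_{T_{F(p)}N}.
\end{align*}
```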
In the special case where the domain is $\mathbb{R}$, there is a canonical choice of tangent vector at each point, namely $1$.
Definition (Derivative). Let $\gamma : \mathbb{R} \to M$ be a smooth function. Then we write
\[
  \frac{d\gamma}{dt}(t) = \dot\gamma(t) = D\gamma|_t(1).
\]
We now go back to understanding what $T_pM$ is if $p \in M$. We let $p \in U$ where $(U, \varphi)$ is a chart. Then if $q = \varphi(p)$, the map $D\varphi|_p : T_pM \to T_q\mathbb{R}^n$ is a linear isomorphism.
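Definition ($\frac{\partial}{\partial x_i}$). Given a chart $\varphi : U \to \mathbb{R}^n$ with $\varphi = (x_1, \cdots, x_n)$, we define
\[
  \left.\frac{\partial}{\partial x_i}\right|_p = (D\varphi|_p)^{-1}\left( \left.\frac{\partial}{\partial x_i}\right|_{\varphi(p)} \right) \in T_pM.
\]
So $\left.\frac{\partial}{\partial x_1}\right|_p, \cdots, \left.\frac{\partial}{\partial x_n}\right|_p$ is a basis for $T_pM$.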
Recall that if $f : U \to \mathbb{R}$ is smooth, then we can write $f(x_1, \cdots, x_n)$. Then we have
\[
  \left.\frac{\partial}{\partial x_i}\right|_p (f) = \left.\frac{\partial f}{\partial x_i}\right|_{\varphi(p)}.
\]
So we have a consistent notation.
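Now, how does this basis change when we change coordinates? Suppose we also have coordinates $y_1, \cdots, y_n$ near $p$ given by some other chart. We then have $\left.\frac{\partial}{\partial y_i}\right|_p \in T_pM$. So we have
\[
  \left.\frac{\partial}{\partial y_i}\right|_p = \sum_{j=1}^n \alpha_j \left.\frac{\partial}{\partial x_j}\right|_p
\]
for some $\alpha_j$. To figure out what they are, we apply both sides to the function $x_k$. So we have
\[
  \left.\frac{\partial}{\partial y_i}\right|_p (x_k) = \frac{\partial x_k}{\partial y_i}(p) = \alpha_k.
\]
So we obtain
\[
  \left.\frac{\partial}{\partial y_i}\right|_p = \sum_{j=1}^n \frac{\partial x_j}{\partial y_i}(p) \left.\frac{\partial}{\partial x_j}\right|_p.
\]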
This is the usual change-of-coordinate formula!
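As an illustration (not taken from the text above), consider polar coordinates on a region of $\mathbb{R}^2$ where $(r, \theta)$ is a chart, for example the plane with the non-positive $x_1$-axis removed, with $x_1 = r\cos\theta$ and $x_2 = r\sin\theta$. Taking $(y_1, y_2) = (r, \theta)$ in the formula gives, with $r$ and $\theta$ evaluated at $p$:

```latex
\begin{align*}
\left.\frac{\partial}{\partial r}\right|_p
  &= \cos\theta \left.\frac{\partial}{\partial x_1}\right|_p
   + \sin\theta \left.\frac{\partial}{\partial x_2}\right|_p, \\
\left.\frac{\partial}{\partial \theta}\right|_p
  &= -r\sin\theta \left.\frac{\partial}{\partial x_1}\right|_p
   + r\cos\theta \left.\frac{\partial}{\partial x_2}\right|_p.
\end{align*}
```

The coefficients are the partial derivatives $\partial x_j / \partial r$ and $\partial x_j / \partial \theta$.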
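Now let $F \in C^\infty(M, N)$, $(U, \varphi)$ be a chart on $M$ containing $p$ with coordinates $x_1, \cdots, x_n$, and $(V, \xi)$ a chart on $N$ containing $q = F(p)$ with coordinates $y_1, \cdots, y_m$. By abuse of notation, we confuse $F$ and $\xi \circ F \circ \varphi^{-1}$. So we write
\[
  F = (F_1, \cdots, F_m) \quad\text{with}\quad F_i = F_i(x_1, \cdots, x_n) : U \to \mathbb{R}.
\]
As before, we have a basis
\[
  \left.\frac{\partial}{\partial x_1}\right|_p, \cdots, \left.\frac{\partial}{\partial x_n}\right|_p \text{ for } T_pM,
  \qquad
  \left.\frac{\partial}{\partial y_1}\right|_q, \cdots, \left.\frac{\partial}{\partial y_m}\right|_q \text{ for } T_qN.
\]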
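Lemma. We have
\[
  DF|_p\left( \left.\frac{\partial}{\partial x_i}\right|_p \right) = \sum_{j=1}^m \frac{\partial F_j}{\partial x_i}(p) \left.\frac{\partial}{\partial y_j}\right|_q.
\]
In other words, $DF|_p$ has matrix representation
\[
  \left( \frac{\partial F_j}{\partial x_i}(p) \right)_{ij}.
\]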
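Proof. We let
\[
  DF|_p\left( \left.\frac{\partial}{\partial x_i}\right|_p \right) = \sum_{j=1}^m \lambda_j \left.\frac{\partial}{\partial y_j}\right|_q
\]
for some $\lambda_j$. We apply this to the local function $y_k$ to obtain
\begin{align*}
  \lambda_k &= \sum_{j=1}^m \lambda_j \left.\frac{\partial}{\partial y_j}\right|_q (y_k) \\
  &= DF|_p\left( \left.\frac{\partial}{\partial x_i}\right|_p \right)(y_k) \\
  &= \left.\frac{\partial}{\partial x_i}\right|_p (y_k \circ F) \\
  &= \left.\frac{\partial}{\partial x_i}\right|_p (F_k) \\
  &= \frac{\partial F_k}{\partial x_i}(p).
\end{align*}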
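A small worked instance, chosen here purely for illustration: take $M = N = \mathbb{R}^2$ with the identity charts and $F(x_1, x_2) = (x_1^2 - x_2^2,\ 2x_1x_2)$, so that $F_1 = x_1^2 - x_2^2$ and $F_2 = 2x_1x_2$. At $p = (a, b)$ with $q = F(p)$, the lemma gives:

```latex
\begin{align*}
DF|_p\left( \left.\frac{\partial}{\partial x_1}\right|_p \right)
  &= 2a \left.\frac{\partial}{\partial y_1}\right|_q
   + 2b \left.\frac{\partial}{\partial y_2}\right|_q, \\
DF|_p\left( \left.\frac{\partial}{\partial x_2}\right|_p \right)
  &= -2b \left.\frac{\partial}{\partial y_1}\right|_q
   + 2a \left.\frac{\partial}{\partial y_2}\right|_q.
\end{align*}
```

The coefficients are exactly the entries $\frac{\partial F_j}{\partial x_i}(p)$ of the usual Jacobian matrix of $F$ at $p$.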
Example. Let $f \in C^\infty(U)$ where $U \subseteq M$ is an open set containing $p$. Then $Df|_p : T_pM \to T_{f(p)}\mathbb{R} \cong \mathbb{R}$ is a linear map. So $Df|_p$ is an element in the dual space $(T_pM)^*$, called the differential of $f$ at $p$, and is denoted $df|_p$. Then we have
\[
  df|_p(X) = X(f).
\]
(This can, e.g., be checked in local coordinates.)
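The check in local coordinates can be sketched as follows. It assumes a chart with coordinates $x_1, \cdots, x_n$ near $p$, and writes $y$ for the standard coordinate on $\mathbb{R}$ (a name introduced here), so that the identification $T_{f(p)}\mathbb{R} \cong \mathbb{R}$ sends $\lambda \left.\frac{\partial}{\partial y}\right|_{f(p)}$ to $\lambda$. Applying the lemma on the matrix representation of $DF|_p$ with $m = 1$ and $F = f$:

```latex
\[
  Df|_p\left( \left.\frac{\partial}{\partial x_i}\right|_p \right)
  = \frac{\partial f}{\partial x_i}(p) \left.\frac{\partial}{\partial y}\right|_{f(p)}
  \quad\Longrightarrow\quad
  df|_p\left( \left.\frac{\partial}{\partial x_i}\right|_p \right)
  = \frac{\partial f}{\partial x_i}(p)
  = \left.\frac{\partial}{\partial x_i}\right|_p (f).
\]
```

Both sides of $df|_p(X) = X(f)$ are linear in $X$ and agree on a basis of $T_pM$, so they agree for all $X$.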