Part IA — Vector Calculus
Based on lectures by B. Allanach
Notes taken by Dexter Chua
Lent 2015
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
Curves in $\mathbb{R}^3$
Parameterised curves and arc length, tangents and normals to curves in $\mathbb{R}^3$, the radius of curvature. [1]
Integration in $\mathbb{R}^2$ and $\mathbb{R}^3$
Line integrals. Surface and volume integrals: definitions, examples using Cartesian,
cylindrical and spherical coordinates; change of variables. [4]
Vector operators
Directional derivatives. The gradient of a real-valued function: definition; interpretation
as normal to level surfaces; examples including the use of cylindrical, spherical *and
general orthogonal curvilinear* coordinates.
Divergence, curl and $\nabla^2$ in Cartesian coordinates, examples; formulae for these operators (statement only) in cylindrical, spherical *and general orthogonal curvilinear* coordinates. Solenoidal fields, irrotational fields and conservative fields; scalar potentials. Vector derivative identities. [5]
Integration theorems
Divergence theorem, Green’s theorem, Stokes’s theorem, Green’s second theorem: statements; informal proofs; examples; application to fluid dynamics, and to electromagnetism including statement of Maxwell’s equations. [5]
Laplace’s equation
Laplace’s equation in $\mathbb{R}^2$ and $\mathbb{R}^3$: uniqueness theorem and maximum principle. Solution of Poisson’s equation by Gauss’s method (for spherical and cylindrical symmetry) and as an integral. [4]
Cartesian tensors in $\mathbb{R}^3$
Tensor transformation laws, addition, multiplication, contraction, with emphasis on
tensors of second rank. Isotropic second and third rank tensors. Symmetric and
antisymmetric tensors. Revision of principal axes and diagonalization. Quotient
theorem. Examples including inertia and conductivity. [5]
Contents
0 Introduction
1 Derivatives and coordinates
1.1 Derivative of functions
1.2 Inverse functions
1.3 Coordinate systems
2 Curves and Line
2.1 Parametrised curves, lengths and arc length
2.2 Line integrals of vector fields
2.3 Gradients and Differentials
2.4 Work and potential energy
3 Integration in $\mathbb{R}^2$ and $\mathbb{R}^3$
3.1 Integrals over subsets of $\mathbb{R}^2$
3.2 Change of variables for an integral in $\mathbb{R}^2$
3.3 Generalization to $\mathbb{R}^3$
3.4 Further generalizations
4 Surfaces and surface integrals
4.1 Surfaces and Normal
4.2 Parametrized surfaces and area
4.3 Surface integral of vector fields
4.4 Change of variables in $\mathbb{R}^2$ and $\mathbb{R}^3$ revisited
5 Geometry of curves and surfaces
6 Div, Grad, Curl and ∇
6.1 Div, Grad, Curl and ∇
6.2 Second-order derivatives
7 Integral theorems
7.1 Statement and examples
7.1.1 Green’s theorem (in the plane)
7.1.2 Stokes’ theorem
7.1.3 Divergence/Gauss theorem
7.2 Relating and proving integral theorems
8 Some applications of integral theorems
8.1 Integral expressions for div and curl
8.2 Conservative fields and scalar products
8.3 Conservation laws
9 Orthogonal curvilinear coordinates
9.1 Line, area and volume elements
9.2 Grad, Div and Curl
10 Gauss’ Law and Poisson’s equation
10.1 Laws of gravitation
10.2 Laws of electrostatics
10.3 Poisson’s Equation and Laplace’s equation
11 Laplace’s and Poisson’s equations
11.1 Uniqueness theorems
11.2 Laplace’s equation and harmonic functions
11.2.1 The mean value property
11.2.2 The maximum (or minimum) principle
11.3 Integral solutions of Poisson’s equations
11.3.1 Statement and informal derivation
11.3.2 Point sources and δ-functions*
12 Maxwell’s equations
12.1 Laws of electromagnetism
12.2 Static charges and steady currents
12.3 Electromagnetic waves
13 Tensors and tensor fields
13.1 Definition
13.2 Tensor algebra
13.3 Symmetric and antisymmetric tensors
13.4 Tensors, multi-linear maps and the quotient rule
13.5 Tensor calculus
14 Tensors of rank 2
14.1 Decomposition of a second-rank tensor
14.2 The inertia tensor
14.3 Diagonalization of a symmetric second rank tensor
15 Invariant and isotropic tensors
15.1 Definitions and classification results
15.2 Application to invariant integrals
0 Introduction
In the differential equations class, we learnt how to do calculus in one dimension.
However, (apparently) the world has more than one dimension. We live in a
3 (or 4) dimensional world, and string theorists think that the world has more
than 10 dimensions. It is thus important to know how to do calculus in many
dimensions.
For example, the position of a particle in a three dimensional world can be given by a position vector $\mathbf{x}$. Then by definition, the velocity is given by $\frac{\mathrm{d}}{\mathrm{d}t}\mathbf{x} = \dot{\mathbf{x}}$. This would require us to take the derivative of a vector.
This is not too difficult. We can just differentiate the vector componentwise.
However, we can reverse the problem and get a more complicated one. We can
assign a number to each point in (3D) space, and ask how this number changes
as we move in space. For example, the function might tell us the temperature at
each point in space, and we want to know how the temperature changes with
position.
In the most general case, we will assign a vector to each point in space. For
example, the electric field vector E(x) tells us the direction of the electric field
at each point in space.
On the other side of the story, we also want to do integration in multiple dimensions. Apart from the obvious “integrating a vector”, we might want to integrate over surfaces. For example, we can let $\mathbf{v}(\mathbf{x})$ be the velocity of some fluid at each point in space. Then to find the total fluid flow through a surface, we integrate $\mathbf{v}$ over the surface.
In this course, we are mostly going to learn about doing calculus in many dimensions. In the last few lectures, we are going to learn about Cartesian tensors, which are a generalization of vectors.
Note that throughout the course (and lecture notes), summation convention
is implied unless otherwise stated.
1 Derivatives and coordinates
1.1 Derivative of functions
We used to define a derivative as the limit of a quotient and a function is differ-
entiable if the derivative exists. However, this obviously cannot be generalized
to vector-valued functions, since you cannot divide by vectors. So we want
an alternative definition of differentiation, which can be easily generalized to
vectors.
Recall that if a function $f$ is differentiable at $x$, then for a small perturbation $\delta x$, we have
\[
  \delta f \overset{\text{def}}{=} f(x + \delta x) - f(x) = f'(x)\,\delta x + o(\delta x),
\]
which says that the resulting change in $f$ is approximately proportional to $\delta x$ (as opposed to $1/\delta x$ or something else). It can be easily shown that the converse is true: if $f$ satisfies this relation, then $f$ is differentiable.
This definition is more easily extended to vector functions. We say a function $\mathbf{F}$ is differentiable if, when $\mathbf{x}$ is perturbed by $\delta\mathbf{x}$, the resulting change is “something” times $\delta\mathbf{x}$ plus an $o(\delta\mathbf{x})$ error term. In the most general case, $\delta\mathbf{x}$ will be a vector and that “something” will be a matrix. Then that “something” will be what we call the derivative.
Vector functions $\mathbb{R} \to \mathbb{R}^n$
We start with the simple case of vector functions.
Definition (Vector function). A vector function is a function $\mathbf{F}: \mathbb{R} \to \mathbb{R}^n$.
This takes in a number and returns a vector. For example, it can map a time
to the velocity of a particle at that time.
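Definition (Derivative of vector function). A vector function $\mathbf{F}(x)$ is differentiable if
\[
  \delta\mathbf{F} \overset{\text{def}}{=} \mathbf{F}(x + \delta x) - \mathbf{F}(x) = \mathbf{F}'(x)\,\delta x + o(\delta x)
\]
for some $\mathbf{F}'(x)$. $\mathbf{F}'(x)$ is called the derivative of $\mathbf{F}(x)$.
We don’t have anything new and special here, since we might as well have defined $\mathbf{F}'(x)$ as
\[
  \mathbf{F}' = \frac{\mathrm{d}\mathbf{F}}{\mathrm{d}x} = \lim_{\delta x \to 0} \frac{1}{\delta x}\left[\mathbf{F}(x + \delta x) - \mathbf{F}(x)\right],
\]
which is easily shown to be equivalent to the above definition.
Using differential notation, the differentiability condition can be written as
\[
  \mathrm{d}\mathbf{F} = \mathbf{F}'(x)\,\mathrm{d}x.
\]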
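Given a basis $\mathbf{e}_i$ that is independent of $x$, vector differentiation is performed componentwise, i.e.
Proposition.
\[
  \mathbf{F}'(x) = F_i'(x)\,\mathbf{e}_i.
\]
Leibniz identities hold for the products of scalar and vector functions.
Proposition.
\begin{align*}
  \frac{\mathrm{d}}{\mathrm{d}t}(f\mathbf{g}) &= \frac{\mathrm{d}f}{\mathrm{d}t}\mathbf{g} + f\frac{\mathrm{d}\mathbf{g}}{\mathrm{d}t}\\
  \frac{\mathrm{d}}{\mathrm{d}t}(\mathbf{g}\cdot\mathbf{h}) &= \frac{\mathrm{d}\mathbf{g}}{\mathrm{d}t}\cdot\mathbf{h} + \mathbf{g}\cdot\frac{\mathrm{d}\mathbf{h}}{\mathrm{d}t}\\
  \frac{\mathrm{d}}{\mathrm{d}t}(\mathbf{g}\times\mathbf{h}) &= \frac{\mathrm{d}\mathbf{g}}{\mathrm{d}t}\times\mathbf{h} + \mathbf{g}\times\frac{\mathrm{d}\mathbf{h}}{\mathrm{d}t}
\end{align*}
Note that the order of multiplication must be retained in the case of the cross product.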
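Example. Consider a particle with mass $m$. It has position $\mathbf{r}(t)$, velocity $\dot{\mathbf{r}}(t)$ and acceleration $\ddot{\mathbf{r}}$. Its momentum is $\mathbf{p} = m\dot{\mathbf{r}}(t)$.
Note that derivatives with respect to $t$ are usually denoted by dots instead of dashes.
If $\mathbf{F}(\mathbf{r})$ is the force on a particle, then Newton’s second law states that
\[
  \dot{\mathbf{p}} = m\ddot{\mathbf{r}} = \mathbf{F}.
\]
We can define the angular momentum about the origin to be
\[
  \mathbf{L} = \mathbf{r}\times\mathbf{p} = m\mathbf{r}\times\dot{\mathbf{r}}.
\]
If we want to know how the angular momentum changes over time, then
\[
  \dot{\mathbf{L}} = m\dot{\mathbf{r}}\times\dot{\mathbf{r}} + m\mathbf{r}\times\ddot{\mathbf{r}} = m\mathbf{r}\times\ddot{\mathbf{r}} = \mathbf{r}\times\mathbf{F},
\]
which is the torque of $\mathbf{F}$ about the origin.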
Scalar functions $\mathbb{R}^n \to \mathbb{R}$
We can also define derivatives for a different kind of function:
Definition. A scalar function is a function $f: \mathbb{R}^n \to \mathbb{R}$.
A scalar function takes in a position and gives you a number, e.g. the potential
energy of a particle at different positions.
Before we define the derivative of a scalar function, we have to first define
what it means to take a limit of a vector.
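Definition (Limit of vector). The limit of vectors is defined using the norm. So $\mathbf{v} \to \mathbf{c}$ iff $|\mathbf{v} - \mathbf{c}| \to 0$. Similarly, $f(\mathbf{r}) = o(\mathbf{r})$ means $\frac{|f(\mathbf{r})|}{|\mathbf{r}|} \to 0$ as $\mathbf{r} \to \mathbf{0}$.
Definition (Gradient of scalar function). A scalar function $f(\mathbf{r})$ is differentiable at $\mathbf{r}$ if
\[
  \delta f \overset{\text{def}}{=} f(\mathbf{r} + \delta\mathbf{r}) - f(\mathbf{r}) = (\nabla f)\cdot\delta\mathbf{r} + o(\delta\mathbf{r})
\]
for some vector $\nabla f$, the gradient of $f$ at $\mathbf{r}$.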
Here we have a fancy name “gradient” for the derivative. But we will soon
give up on finding fancy names and just call everything the “derivative”!
Note also that here we genuinely need the new notion of derivative, since
“dividing by δr” makes no sense at all!
The above definition considers the case where $\delta\mathbf{r}$ comes in all directions. What if we only care about the case where $\delta\mathbf{r}$ is in some particular direction $\mathbf{n}$? For example, maybe $f$ is the potential of a particle that is confined to move in one straight line only.
Then taking $\delta\mathbf{r} = h\mathbf{n}$, with $\mathbf{n}$ a unit vector,
\[
  f(\mathbf{r} + h\mathbf{n}) - f(\mathbf{r}) = \nabla f\cdot(h\mathbf{n}) + o(h) = h(\nabla f\cdot\mathbf{n}) + o(h),
\]
which gives
Definition (Directional derivative). The directional derivative of $f$ along $\mathbf{n}$ is
\[
  \mathbf{n}\cdot\nabla f = \lim_{h\to 0}\frac{1}{h}\left[f(\mathbf{r} + h\mathbf{n}) - f(\mathbf{r})\right].
\]
It refers to how fast $f$ changes when we move in the direction of $\mathbf{n}$.
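Using this expression, the directional derivative is maximized when $\mathbf{n}$ is in the same direction as $\nabla f$ (then $\mathbf{n}\cdot\nabla f = |\nabla f|$). So $\nabla f$ points in the direction of greatest slope.
How do we evaluate $\nabla f$? Suppose we have an orthonormal basis $\mathbf{e}_i$. Setting $\mathbf{n} = \mathbf{e}_i$ in the above equation, we obtain
\[
  \mathbf{e}_i\cdot\nabla f = \lim_{h\to 0}\frac{1}{h}\left[f(\mathbf{r} + h\mathbf{e}_i) - f(\mathbf{r})\right] = \frac{\partial f}{\partial x_i}.
\]
Hence
Theorem. The gradient is
\[
  \nabla f = \frac{\partial f}{\partial x_i}\mathbf{e}_i.
\]
Hence we can write the condition of differentiability as
\[
  \delta f = \frac{\partial f}{\partial x_i}\delta x_i + o(\delta\mathbf{x}).
\]
In differential notation, we write
\[
  \mathrm{d}f = \nabla f\cdot\mathrm{d}\mathbf{r} = \frac{\partial f}{\partial x_i}\,\mathrm{d}x_i,
\]
which is the chain rule for partial derivatives.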
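Example. Take $f(x, y, z) = x + e^{xy}\sin z$. Then
\[
  \nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = (1 + ye^{xy}\sin z,\; xe^{xy}\sin z,\; e^{xy}\cos z).
\]
At $(x, y, z) = (0, 1, 0)$, $\nabla f = (1, 0, 1)$. So $f$ increases/decreases most rapidly for $\mathbf{n} = \pm\frac{1}{\sqrt{2}}(1, 0, 1)$ with a rate of change of $\pm\sqrt{2}$. There is no change in $f$ if $\mathbf{n}$ is perpendicular to $\pm\frac{1}{\sqrt{2}}(1, 0, 1)$.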
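As a numerical sanity check of this example, here is a minimal Python sketch that approximates the gradient and some directional derivatives by central differences. The helper names `numerical_gradient` and `directional_derivative` and the step size `h` are illustrative choices, not part of the notes.

```python
import math

def f(x, y, z):
    # The scalar function from the example: f = x + e^{xy} sin z.
    return x + math.exp(x * y) * math.sin(z)

def numerical_gradient(func, point, h=1e-6):
    # Central-difference estimate of each partial derivative at `point`.
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

def directional_derivative(func, point, n, h=1e-6):
    # Central-difference estimate of n . grad f for a unit vector n.
    ahead = [p + h * c for p, c in zip(point, n)]
    behind = [p - h * c for p, c in zip(point, n)]
    return (func(*ahead) - func(*behind)) / (2 * h)

point = (0.0, 1.0, 0.0)
grad = numerical_gradient(f, point)

# Direction of steepest ascent, (1, 0, 1)/sqrt(2), and a perpendicular direction.
n_steepest = (1 / math.sqrt(2), 0.0, 1 / math.sqrt(2))
n_perpendicular = (0.0, 1.0, 0.0)

rate_steepest = directional_derivative(f, point, n_steepest)
rate_perpendicular = directional_derivative(f, point, n_perpendicular)
```

Under these assumptions, `grad` should be close to $(1, 0, 1)$, `rate_steepest` close to $\sqrt{2}$, and `rate_perpendicular` close to $0$.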
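Now suppose we have a scalar function $f(\mathbf{r})$ and we want to consider the rate of change along a path $\mathbf{r}(u)$. A change $\delta u$ produces a change $\delta\mathbf{r} = \mathbf{r}'\,\delta u + o(\delta u)$, and
\[
  \delta f = \nabla f\cdot\delta\mathbf{r} + o(|\delta\mathbf{r}|) = \nabla f\cdot\mathbf{r}'(u)\,\delta u + o(\delta u).
\]
This shows that $f$ is differentiable as a function of $u$ and
Theorem (Chain rule). Given a function $f(\mathbf{r}(u))$,
\[
  \frac{\mathrm{d}f}{\mathrm{d}u} = \nabla f\cdot\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}u} = \frac{\partial f}{\partial x_i}\frac{\mathrm{d}x_i}{\mathrm{d}u}.
\]
Note that if we drop the $\mathrm{d}u$, we simply get
\[
  \mathrm{d}f = \nabla f\cdot\mathrm{d}\mathbf{r} = \frac{\partial f}{\partial x_i}\,\mathrm{d}x_i,
\]
which is what we’ve previously had.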
Vector fields $\mathbb{R}^n \to \mathbb{R}^m$
We are now ready to tackle the general case, which is given the fancy name of vector fields.
Definition (Vector field). A vector field is a function $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^m$.
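Definition (Derivative of vector field). A vector field $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^m$ is differentiable if
\[
  \delta\mathbf{F} \overset{\text{def}}{=} \mathbf{F}(\mathbf{x} + \delta\mathbf{x}) - \mathbf{F}(\mathbf{x}) = M\,\delta\mathbf{x} + o(\delta\mathbf{x})
\]
for some $m \times n$ matrix $M$. $M$ is the derivative of $\mathbf{F}$.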
As promised, M does not have a fancy name.
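Given an arbitrary function $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^m$ that maps $\mathbf{x} \mapsto \mathbf{y}$ and a choice of basis, we can write $\mathbf{F}$ as a set of $m$ functions $y_j = F_j(\mathbf{x})$ such that $\mathbf{y} = (y_1, y_2, \cdots, y_m)$. Then
\[
  \mathrm{d}y_j = \frac{\partial F_j}{\partial x_i}\,\mathrm{d}x_i,
\]
and we can write the derivative as
Theorem. The derivative of $\mathbf{F}$ is given by
\[
  M_{ji} = \frac{\partial y_j}{\partial x_i}.
\]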
Note that we could have used this as the definition of the derivative. However,
the original definition is superior because it does not require a selection of
coordinate system.
Definition. A function is smooth if it can be differentiated any number of times. This requires that all partial derivatives exist and are totally symmetric in $i$, $j$ and $k$ (i.e. the differential operator is commutative).
The functions we will consider will be smooth except where things obviously
go wrong (e.g. f (x) = 1/x at x = 0).
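Theorem (Chain rule). Suppose $g: \mathbb{R}^p \to \mathbb{R}^n$ and $f: \mathbb{R}^n \to \mathbb{R}^m$. Suppose that the coordinates of the vectors in $\mathbb{R}^p$, $\mathbb{R}^n$ and $\mathbb{R}^m$ are $u_a$, $x_i$ and $y_r$ respectively. By the chain rule,
\[
  \frac{\partial y_r}{\partial u_a} = \frac{\partial y_r}{\partial x_i}\frac{\partial x_i}{\partial u_a},
\]
with summation implied. Writing in matrix form,
\[
  M(f\circ g)_{ra} = M(f)_{ri}M(g)_{ia}.
\]
Alternatively, in operator form,
\[
  \frac{\partial}{\partial u_a} = \frac{\partial x_i}{\partial u_a}\frac{\partial}{\partial x_i}.
\]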
1.2 Inverse functions
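Suppose $g, f: \mathbb{R}^n \to \mathbb{R}^n$ are inverse functions, i.e. $g\circ f = f\circ g = \mathrm{id}$. Suppose that $f(\mathbf{x}) = \mathbf{u}$ and $g(\mathbf{u}) = \mathbf{x}$.
Since the derivative of the identity function is the identity matrix (if you differentiate $x$ with respect to $x$, you get 1), we must have
\[
  M(f\circ g) = I.
\]
Since $M(f\circ g) = M(f)M(g)$ by the chain rule, we know that
\[
  M(g) = M(f)^{-1}.
\]
We derive this result more formally by noting
\[
  \frac{\partial u_b}{\partial u_a} = \delta_{ab}.
\]
So by the chain rule,
\[
  \frac{\partial u_b}{\partial x_i}\frac{\partial x_i}{\partial u_a} = \delta_{ab},
\]
i.e. $M(f\circ g) = I$.
In the $n = 1$ case, it is the familiar result that $\mathrm{d}u/\mathrm{d}x = 1/(\mathrm{d}x/\mathrm{d}u)$.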
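Example. For $n = 2$, write $u_1 = \rho$, $u_2 = \varphi$ and let $x_1 = \rho\cos\varphi$ and $x_2 = \rho\sin\varphi$. Then the function used to convert between the coordinate systems is
\[
  g(u_1, u_2) = (u_1\cos u_2,\; u_1\sin u_2).
\]
Then
\[
  M(g) = \begin{pmatrix} \partial x_1/\partial\rho & \partial x_1/\partial\varphi\\ \partial x_2/\partial\rho & \partial x_2/\partial\varphi \end{pmatrix} = \begin{pmatrix} \cos\varphi & -\rho\sin\varphi\\ \sin\varphi & \rho\cos\varphi \end{pmatrix}.
\]
We can invert the relations between $(x_1, x_2)$ and $(\rho, \varphi)$ to obtain
\[
  \varphi = \tan^{-1}\frac{x_2}{x_1}, \qquad \rho = \sqrt{x_1^2 + x_2^2}.
\]
We can calculate
\[
  M(f) = \begin{pmatrix} \partial\rho/\partial x_1 & \partial\rho/\partial x_2\\ \partial\varphi/\partial x_1 & \partial\varphi/\partial x_2 \end{pmatrix} = M(g)^{-1}.
\]
These matrices are known as Jacobian matrices, and their determinants are known as the Jacobians.
Note that
\[
  \det M(f)\,\det M(g) = 1.
\]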
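The inverse relation between the two Jacobian matrices can be checked numerically. The sketch below is a minimal illustration in Python: `jacobian_g` is the matrix $M(g)$ given in the example, while the entries of `jacobian_f` are worked out here by differentiating $\rho = \sqrt{x_1^2 + x_2^2}$ and $\varphi = \tan^{-1}(x_2/x_1)$ by hand, so they are an addition to the notes. The sample point is arbitrary.

```python
import math

def jacobian_g(rho, phi):
    # M(g): derivative of (rho, phi) -> (x1, x2) = (rho cos phi, rho sin phi).
    return [[math.cos(phi), -rho * math.sin(phi)],
            [math.sin(phi), rho * math.cos(phi)]]

def jacobian_f(x1, x2):
    # M(f): derivative of (x1, x2) -> (rho, phi), differentiated by hand.
    rho_squared = x1 ** 2 + x2 ** 2
    rho = math.sqrt(rho_squared)
    return [[x1 / rho, x2 / rho],
            [-x2 / rho_squared, x1 / rho_squared]]

def matmul(a, b):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

rho, phi = 2.0, 0.7  # arbitrary sample point away from the origin
x1, x2 = rho * math.cos(phi), rho * math.sin(phi)

M_g = jacobian_g(rho, phi)
M_f = jacobian_f(x1, x2)
product = matmul(M_f, M_g)  # should be close to the identity matrix
```

At this sample point `det(M_g)` should equal $\rho = 2$, and `det(M_f) * det(M_g)` should be close to 1.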
1.3 Coordinate systems
Now we can apply the results above to changes of coordinates on Euclidean space. Suppose $x_i$ are Cartesian coordinates. Then we can define an arbitrary new coordinate system $u_a$ in which each coordinate $u_a$ is a function of $\mathbf{x}$. For example, we can define the plane polar coordinates $\rho, \varphi$ by
\[
  x_1 = \rho\cos\varphi, \qquad x_2 = \rho\sin\varphi.
\]
However, note that $\rho$ and $\varphi$ are not components of a position vector, i.e. they are not the “coefficients” of basis vectors like $\mathbf{r} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2$ are. But we can associate related basis vectors that point to directions of increasing $\rho$ and $\varphi$, obtained by differentiating $\mathbf{r}$ with respect to the variables and then normalizing:
\[
  \mathbf{e}_\rho = \cos\varphi\,\mathbf{e}_1 + \sin\varphi\,\mathbf{e}_2, \qquad \mathbf{e}_\varphi = -\sin\varphi\,\mathbf{e}_1 + \cos\varphi\,\mathbf{e}_2.
\]
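[Figure: the plane polar coordinates $\rho$, $\varphi$ of a point, with the Cartesian basis vectors $\mathbf{e}_1$, $\mathbf{e}_2$ and the polar basis vectors $\mathbf{e}_\rho$, $\mathbf{e}_\varphi$.]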
These are not “usual” basis vectors in the sense that these basis vectors vary
with position and are undefined at the origin. However, they are still very useful
when dealing with systems with rotational symmetry.
In three dimensions, we have cylindrical polars and spherical polars.
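Cylindrical polars
Conversion formulae: $x_1 = \rho\cos\varphi$, $x_2 = \rho\sin\varphi$, $x_3 = z$
Basis vectors: $\mathbf{e}_\rho = (\cos\varphi, \sin\varphi, 0)$, $\mathbf{e}_\varphi = (-\sin\varphi, \cos\varphi, 0)$, $\mathbf{e}_z = (0, 0, 1)$
Spherical polars
Conversion formulae: $x_1 = r\sin\theta\cos\varphi$, $x_2 = r\sin\theta\sin\varphi$, $x_3 = r\cos\theta$
Basis vectors: $\mathbf{e}_r = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$, $\mathbf{e}_\varphi = (-\sin\varphi, \cos\varphi, 0)$, $\mathbf{e}_\theta = (\cos\theta\cos\varphi, \cos\theta\sin\varphi, -\sin\theta)$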
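The spherical basis vectors listed above can be checked to be orthonormal at any point away from the $x_3$ axis. The following Python sketch is a minimal illustration; the function names and the sample angles are arbitrary choices, and the final right-handedness check $\mathbf{e}_r \times \mathbf{e}_\theta = \mathbf{e}_\varphi$ is a standard fact that the notes do not state explicitly.

```python
import math

def spherical_basis(theta, phi):
    # Basis vectors of spherical polars, written in Cartesian components.
    e_r = (math.sin(theta) * math.cos(phi),
           math.sin(theta) * math.sin(phi),
           math.cos(theta))
    e_theta = (math.cos(theta) * math.cos(phi),
               math.cos(theta) * math.sin(phi),
               -math.sin(theta))
    e_phi = (-math.sin(phi), math.cos(phi), 0.0)
    return e_r, e_theta, e_phi

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

e_r, e_theta, e_phi = spherical_basis(theta=0.9, phi=2.3)

# Gram matrix of the basis: should be close to the 3x3 identity.
basis = (e_r, e_theta, e_phi)
gram = [[dot(a, b) for b in basis] for a in basis]

# Right-handedness in the order (r, theta, phi).
handedness = cross(e_r, e_theta)
```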
2 Curves and Line
2.1 Parametrised curves, lengths and arc length
There are many ways we can describe a curve. We can, say, describe it by an equation that the points on the curve satisfy. For example, a circle can be described by $x^2 + y^2 = 1$. However, this is not a good way to do so, as it is rather difficult to work with. It is also often difficult to find a closed form like this for a curve.
Instead, we can imagine the curve to be specified by a particle moving along the path. So it is represented by a function $\mathbf{x}: \mathbb{R} \to \mathbb{R}^n$, and the curve itself is the image of the function. This is known as a parametrisation of a curve. In addition to simplified notation, this also has the benefit of giving the curve an orientation.
Definition (Parametrisation of curve). Given a curve $C$ in $\mathbb{R}^n$, a parametrisation of it is a continuous and invertible function $\mathbf{r}: D \to \mathbb{R}^n$ for some $D \subseteq \mathbb{R}$ whose image is $C$.
$\mathbf{r}'(u)$ is a vector tangent to the curve at each point. A parametrisation is regular if $\mathbf{r}'(u) \neq \mathbf{0}$ for all $u$.
Clearly, a curve can have many different parametrizations.
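Example. The curve
\[
  \tfrac{1}{4}x^2 + y^2 = 1, \quad y \geq 0, \quad z = 3
\]
can be parametrised by $\mathbf{r}(u) = 2\cos u\,\hat{\mathbf{i}} + \sin u\,\hat{\mathbf{j}} + 3\hat{\mathbf{k}}$ for $0 \leq u \leq \pi$.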
If we change $u$ (and hence $\mathbf{r}$) by a small amount, then the distance $|\delta\mathbf{r}|$ is roughly equal to the change in arclength $\delta s$. So $\delta s = |\delta\mathbf{r}| + o(\delta\mathbf{r})$. Then we have
Proposition. Let $s$ denote the arclength of a curve $\mathbf{r}(u)$. Then
\[
  \frac{\mathrm{d}s}{\mathrm{d}u} = \pm\left|\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}u}\right| = \pm|\mathbf{r}'(u)|,
\]
with the sign depending on whether it is in the direction of increasing or decreasing arclength.
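Example. Consider a helix described by $\mathbf{r}(u) = (3\cos u, 3\sin u, 4u)$. Then
\begin{align*}
  \mathbf{r}'(u) &= (-3\sin u, 3\cos u, 4)\\
  \frac{\mathrm{d}s}{\mathrm{d}u} &= |\mathbf{r}'(u)| = \sqrt{3^2 + 4^2} = 5
\end{align*}
So $s = 5u$, i.e. the arclength from $\mathbf{r}(0)$ to $\mathbf{r}(u)$ is $s = 5u$.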
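The helix result can be reproduced by integrating $|\mathbf{r}'(u)|$ numerically. This is a minimal Python sketch using the midpoint rule; the function names and the number of steps are illustrative assumptions, not part of the notes.

```python
import math

def speed(u):
    # |r'(u)| for the helix r(u) = (3 cos u, 3 sin u, 4u).
    dx, dy, dz = -3 * math.sin(u), 3 * math.cos(u), 4.0
    return math.sqrt(dx * dx + dy * dy + dz * dz)

def arclength(u_end, steps=1000):
    # Midpoint-rule approximation of the integral of |r'(u)| from 0 to u_end.
    h = u_end / steps
    return sum(speed((k + 0.5) * h) for k in range(steps)) * h

s_at_2 = arclength(2.0)  # should agree with s = 5u, i.e. 10
```

Because the speed is constant here, the midpoint rule is exact up to rounding; for a general curve the same function gives only an approximation.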
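We can change parametrisation of $\mathbf{r}$ by taking an invertible smooth function $u \mapsto \tilde{u}$, and have a new parametrisation $\mathbf{r}(\tilde{u}) = \mathbf{r}(\tilde{u}(u))$. Then by the chain rule (the products below are ordinary multiplication by a scalar, not cross products),
\begin{align*}
  \frac{\mathrm{d}\mathbf{r}}{\mathrm{d}u} &= \frac{\mathrm{d}\mathbf{r}}{\mathrm{d}\tilde{u}}\,\frac{\mathrm{d}\tilde{u}}{\mathrm{d}u}\\
  \frac{\mathrm{d}\mathbf{r}}{\mathrm{d}\tilde{u}} &= \frac{\mathrm{d}\mathbf{r}}{\mathrm{d}u}\bigg/\frac{\mathrm{d}\tilde{u}}{\mathrm{d}u}
\end{align*}
It is often convenient to use the arclength $s$ as the parameter. Then the tangent vector will always have unit length since the proposition above yields
\[
  |\mathbf{r}'(s)| = \frac{\mathrm{d}s}{\mathrm{d}s} = 1.
\]
We call $\mathrm{d}s$ the scalar line element, which will be used when we consider integrals.
Definition (Scalar line element). The scalar line element of $C$ is $\mathrm{d}s$.
Proposition. $\mathrm{d}s = \pm|\mathbf{r}'(u)|\,\mathrm{d}u$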
2.2 Line integrals of vector fields
Definition (Line integral). The line integral of a smooth vector field $\mathbf{F}(\mathbf{r})$ along a path $C$ parametrised by $\mathbf{r}(u)$ along the direction (orientation) $\mathbf{r}(\alpha) \to \mathbf{r}(\beta)$ is
\[
  \int_C \mathbf{F}(\mathbf{r})\cdot\mathrm{d}\mathbf{r} = \int_\alpha^\beta \mathbf{F}(\mathbf{r}(u))\cdot\mathbf{r}'(u)\,\mathrm{d}u.
\]
We say $\mathrm{d}\mathbf{r} = \mathbf{r}'(u)\,\mathrm{d}u$ is the line element on $C$. Note that the upper and lower limits of the integral are the end point and start point respectively, and $\beta$ is not necessarily larger than $\alpha$.
For example, we may be moving a particle from $\mathbf{a}$ to $\mathbf{b}$ along a curve $C$ under a force field $\mathbf{F}$. Then we may divide the curve into many small segments $\delta\mathbf{r}$. Then for each segment, the force experienced is $\mathbf{F}(\mathbf{r})$ and the work done is $\mathbf{F}(\mathbf{r})\cdot\delta\mathbf{r}$. Then the total work done across the curve is
\[
  W = \int_C \mathbf{F}(\mathbf{r})\cdot\mathrm{d}\mathbf{r}.
\]
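Example. Take $\mathbf{F}(\mathbf{r}) = (xe^y, z^2, xy)$ and we want to find the line integral from $\mathbf{a} = (0, 0, 0)$ to $\mathbf{b} = (1, 1, 1)$.
[Figure: two curves $C_1$ and $C_2$ from $\mathbf{a}$ to $\mathbf{b}$.]
We first integrate along the curve $C_1: \mathbf{r}(u) = (u, u^2, u^3)$. Then $\mathbf{r}'(u) = (1, 2u, 3u^2)$, and $\mathbf{F}(\mathbf{r}(u)) = (ue^{u^2}, u^6, u^3)$. So
\begin{align*}
  \int_{C_1}\mathbf{F}\cdot\mathrm{d}\mathbf{r} &= \int_0^1 \mathbf{F}\cdot\mathbf{r}'(u)\,\mathrm{d}u\\
  &= \int_0^1 \left(ue^{u^2} + 2u^7 + 3u^5\right)\mathrm{d}u\\
  &= \frac{e}{2} - \frac{1}{2} + \frac{1}{4} + \frac{1}{2}\\
  &= \frac{e}{2} + \frac{1}{4}
\end{align*}
Now we try to integrate along another curve $C_2: \mathbf{r}(t) = (t, t, t)$. So $\mathbf{r}'(t) = (1, 1, 1)$.
\begin{align*}
  \int_{C_2}\mathbf{F}\cdot\mathrm{d}\mathbf{r} &= \int_0^1 \mathbf{F}\cdot\mathbf{r}'(t)\,\mathrm{d}t\\
  &= \int_0^1 \left(te^t + 2t^2\right)\mathrm{d}t\\
  &= \frac{5}{3}.
\end{align*}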
We see that the line integral depends on the curve C in general, not just a, b.
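The two values can be reproduced numerically straight from the definition $\int_\alpha^\beta \mathbf{F}(\mathbf{r}(u))\cdot\mathbf{r}'(u)\,\mathrm{d}u$. The following Python sketch uses a simple midpoint rule; the helper `line_integral` and its step count are illustrative assumptions, and the derivative of each path is supplied by hand.

```python
import math

def F(x, y, z):
    # The vector field from the example: F = (x e^y, z^2, x y).
    return (x * math.exp(y), z ** 2, x * y)

def line_integral(r, r_prime, alpha, beta, steps=2000):
    # Midpoint-rule approximation of the integral of F(r(u)) . r'(u) du.
    h = (beta - alpha) / steps
    total = 0.0
    for k in range(steps):
        u = alpha + (k + 0.5) * h
        field = F(*r(u))
        tangent = r_prime(u)
        total += sum(f_i * t_i for f_i, t_i in zip(field, tangent)) * h
    return total

# C1: r(u) = (u, u^2, u^3); C2: r(t) = (t, t, t); both run from a to b.
along_c1 = line_integral(lambda u: (u, u ** 2, u ** 3),
                         lambda u: (1.0, 2 * u, 3 * u ** 2), 0.0, 1.0)
along_c2 = line_integral(lambda t: (t, t, t),
                         lambda t: (1.0, 1.0, 1.0), 0.0, 1.0)
```

Here `along_c1` should be close to $e/2 + 1/4$ and `along_c2` close to $5/3$. Swapping `alpha` and `beta` reverses the orientation and flips the sign of the result.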
We can also use the arclength $s$ as the parameter. Since $\mathrm{d}\mathbf{r} = \mathbf{t}\,\mathrm{d}s$, with $\mathbf{t}$ being the unit tangent vector, we have
\[
  \int_C \mathbf{F}\cdot\mathrm{d}\mathbf{r} = \int_C \mathbf{F}\cdot\mathbf{t}\,\mathrm{d}s.
\]
Note that we do not necessarily have to integrate $\mathbf{F}\cdot\mathbf{t}$ with respect to $s$. We can also integrate a scalar function as a function of $s$, $\int_C f(s)\,\mathrm{d}s$. By convention, this is calculated in the direction of increasing $s$. In particular, we have
\[
  \int_C 1\,\mathrm{d}s = \text{length of } C.
\]
Definition (Closed curve). A closed curve is a curve with the same start and end point. The line integral along a closed curve is (sometimes) written as $\oint$ and is (sometimes) called the circulation of $\mathbf{F}$ around $C$.
Sometimes we are not that lucky and our curve is not smooth. For example,
the graph of an absolute value function is not smooth. However, often we can
break it apart into many smaller segments, each of which is smooth. Alternatively,
we can write the curve as a sum of smooth curves. We call these piecewise smooth
curves.
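Definition (Piecewise smooth curve). A piecewise smooth curve is a curve $C = C_1 + C_2 + \cdots + C_n$ with all $C_i$ smooth with regular parametrisations. The line integral over a piecewise smooth $C$ is
\[
  \int_C \mathbf{F}\cdot\mathrm{d}\mathbf{r} = \int_{C_1}\mathbf{F}\cdot\mathrm{d}\mathbf{r} + \int_{C_2}\mathbf{F}\cdot\mathrm{d}\mathbf{r} + \cdots + \int_{C_n}\mathbf{F}\cdot\mathrm{d}\mathbf{r}.
\]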
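Example. Take the example above, and let $C_3 = -C_2$. Then $C = C_1 + C_3$ is piecewise smooth but not smooth. Then
\begin{align*}
  \oint_C \mathbf{F}\cdot\mathrm{d}\mathbf{r} &= \int_{C_1}\mathbf{F}\cdot\mathrm{d}\mathbf{r} + \int_{C_3}\mathbf{F}\cdot\mathrm{d}\mathbf{r}\\
  &= \left(\frac{e}{2} + \frac{1}{4}\right) - \frac{5}{3}\\
  &= -\frac{17}{12} + \frac{e}{2}.
\end{align*}
[Figure: the closed curve $C$, made of $C_1$ from $\mathbf{a}$ to $\mathbf{b}$ followed by $C_3$ from $\mathbf{b}$ back to $\mathbf{a}$.]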
2.3 Gradients and Differentials
Recall that the line integral depends on the actual curve taken, and not just the
end points. However, for some nice functions, the integral does depend on the
end points only.
Theorem. If $\mathbf{F} = \nabla f(\mathbf{r})$, then
\[
  \int_C \mathbf{F}\cdot\mathrm{d}\mathbf{r} = f(\mathbf{b}) - f(\mathbf{a}),
\]
where $\mathbf{a}$ and $\mathbf{b}$ are the start and end points of the curve.
In particular, the line integral does not depend on the curve, but the end points only. This is the vector counterpart of the fundamental theorem of calculus. A special case is when $C$ is a closed curve, then $\oint_C \mathbf{F}\cdot\mathrm{d}\mathbf{r} = 0$.
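Proof. Let $\mathbf{r}(u)$ be any parametrization of the curve, and suppose $\mathbf{a} = \mathbf{r}(\alpha)$, $\mathbf{b} = \mathbf{r}(\beta)$. Then
\[
  \int_C \mathbf{F}\cdot\mathrm{d}\mathbf{r} = \int_C \nabla f\cdot\mathrm{d}\mathbf{r} = \int_\alpha^\beta \nabla f\cdot\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}u}\,\mathrm{d}u.
\]
So by the chain rule, this is equal to
\[
  \int_\alpha^\beta \frac{\mathrm{d}}{\mathrm{d}u}\bigl(f(\mathbf{r}(u))\bigr)\,\mathrm{d}u = \bigl[f(\mathbf{r}(u))\bigr]_\alpha^\beta = f(\mathbf{b}) - f(\mathbf{a}).
\]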
Definition (Conservative vector field). If $\mathbf{F} = \nabla f$ for some $f$, then $\mathbf{F}$ is called a conservative vector field.
The name conservative comes from mechanics, where conservative vector
fields represent conservative forces that conserve energy. This is since if the
force is conservative, then the integral (i.e. work done) about a closed curve is 0,
which means that we cannot gain energy after travelling around the loop.
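It is convenient to treat differentials $\mathbf{F}\cdot\mathrm{d}\mathbf{r} = F_i\,\mathrm{d}x_i$ as if they were objects by themselves, which we can integrate along curves if we feel like doing so.
Then we can define
Definition (Exact differential). A differential $\mathbf{F}\cdot\mathrm{d}\mathbf{r}$ is exact if there is an $f$ such that $\mathbf{F} = \nabla f$. Then
\[
  \mathrm{d}f = \nabla f\cdot\mathrm{d}\mathbf{r} = \frac{\partial f}{\partial x_i}\,\mathrm{d}x_i.
\]
To test if this holds, we can use the necessary condition
Proposition. If $\mathbf{F} = \nabla f$ for some $f$, then
\[
  \frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i}.
\]
This is because both are equal to $\partial^2 f/\partial x_i\partial x_j$.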
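For an exact differential, the theorem above reads
\[
  \int_C \mathbf{F}\cdot\mathrm{d}\mathbf{r} = \int_C \mathrm{d}f = f(\mathbf{b}) - f(\mathbf{a}).
\]
Differentials can be manipulated using (for constant $\lambda$, $\mu$):
Proposition.
\begin{align*}
  \mathrm{d}(\lambda f + \mu g) &= \lambda\,\mathrm{d}f + \mu\,\mathrm{d}g\\
  \mathrm{d}(fg) &= (\mathrm{d}f)g + f(\mathrm{d}g)
\end{align*}
Using these, it may be possible to find $f$ by inspection.
Example. Consider
\[
  \int_C 3x^2y\sin z\,\mathrm{d}x + x^3\sin z\,\mathrm{d}y + x^3y\cos z\,\mathrm{d}z.
\]
We see that if we integrate the first term with respect to $x$, we obtain $x^3y\sin z$. We obtain the same thing if we integrate the second term with respect to $y$ and the third term with respect to $z$. So this is equal to
\[
  \int_C \mathrm{d}(x^3y\sin z) = \bigl[x^3y\sin z\bigr]_{\mathbf{a}}^{\mathbf{b}}.
\]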
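Path independence for this exact differential can be illustrated numerically. The Python sketch below integrates the same field along two different paths between the same end points and compares both with $f(\mathbf{b}) - f(\mathbf{a})$. The end points $\mathbf{a} = (0, 0, 0)$, $\mathbf{b} = (1, 2, 1)$, the two paths, and the helper `line_integral` are all arbitrary illustrative choices, not taken from the notes.

```python
import math

def F(x, y, z):
    # Components of the differential: F = grad(x^3 y sin z).
    return (3 * x ** 2 * y * math.sin(z),
            x ** 3 * math.sin(z),
            x ** 3 * y * math.cos(z))

def f(x, y, z):
    # The potential found by inspection.
    return x ** 3 * y * math.sin(z)

def line_integral(r, r_prime, steps=4000):
    # Midpoint-rule approximation of the integral of F(r(u)) . r'(u) du on [0, 1].
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += sum(a * b for a, b in zip(F(*r(u)), r_prime(u))) * h
    return total

# Two different paths from a = (0, 0, 0) to b = (1, 2, 1).
straight = line_integral(lambda u: (u, 2 * u, u),
                         lambda u: (1.0, 2.0, 1.0))
curved = line_integral(lambda u: (u ** 2, 2 * u, u ** 3),
                       lambda u: (2 * u, 2.0, 3 * u ** 2))

potential_difference = f(1.0, 2.0, 1.0) - f(0.0, 0.0, 0.0)
```

Both `straight` and `curved` should agree with `potential_difference` $= 2\sin 1$ up to the discretisation error of the midpoint rule.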
2.4 Work and potential energy
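Definition (Work and potential energy). If $\mathbf{F}(\mathbf{r})$ is a force, then $\int_C \mathbf{F}\cdot\mathrm{d}\mathbf{r}$ is the work done by the force along the curve $C$. It is the limit of a sum of terms $\mathbf{F}(\mathbf{r})\cdot\delta\mathbf{r}$, i.e. the component of the force along the direction of $\delta\mathbf{r}$ times the distance moved.
Consider a point particle moving under $\mathbf{F}(\mathbf{r})$ according to Newton’s second law: $\mathbf{F}(\mathbf{r}) = m\ddot{\mathbf{r}}$.
Since the kinetic energy is defined as
\[
  T(t) = \frac{1}{2}m\dot{\mathbf{r}}^2,
\]
the rate of change of energy is
\[
  \frac{\mathrm{d}}{\mathrm{d}t}T(t) = m\dot{\mathbf{r}}\cdot\ddot{\mathbf{r}} = \mathbf{F}\cdot\dot{\mathbf{r}}.
\]
Suppose the path of the particle is a curve $C$ from $\mathbf{a} = \mathbf{r}(\alpha)$ to $\mathbf{b} = \mathbf{r}(\beta)$. Then
\[
  T(\beta) - T(\alpha) = \int_\alpha^\beta \frac{\mathrm{d}T}{\mathrm{d}t}\,\mathrm{d}t = \int_\alpha^\beta \mathbf{F}\cdot\dot{\mathbf{r}}\,\mathrm{d}t = \int_C \mathbf{F}\cdot\mathrm{d}\mathbf{r}.
\]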
So the work done on the particle is the change in kinetic energy.
Definition (Potential energy). Given a conservative force $\mathbf{F} = -\nabla V$, $V(\mathbf{x})$ is the potential energy. Then
\[
  \int_C \mathbf{F}\cdot\mathrm{d}\mathbf{r} = V(\mathbf{a}) - V(\mathbf{b}).
\]
Therefore, for a conservative force, we have $\mathbf{F} = -\nabla V$, where $V(\mathbf{r})$ is the potential energy.
So the work done (gain in kinetic energy) is the loss in potential energy. So the total energy $T + V$ is conserved, i.e. constant during motion.
We see that energy is conserved for conservative forces. In fact, the converse is true: the energy is conserved only for conservative forces.
3 Integration in $\mathbb{R}^2$ and $\mathbb{R}^3$
3.1 Integrals over subsets of $\mathbb{R}^2$
Definition (Surface integral). Let $D \subseteq \mathbb{R}^2$. Let $\mathbf{r} = (x, y)$ be in Cartesian coordinates. We can approximate $D$ by $N$ disjoint subsets of simple shapes, e.g. triangles, parallelograms. These shapes are labelled by $I$ and have areas $\delta A_I$.
[Figure: a region $D$ in the $(x, y)$ plane.]
To integrate a function $f$ over $D$, we would like to take the sum $\sum f(\mathbf{r}_I)\,\delta A_I$, and take the limit as $\delta A_I \to 0$. But we need a condition stronger than simply $\delta A_I \to 0$. We don’t want the areas to grow into arbitrarily long yet thin strips whose area decreases to 0. So we say that we find an $\ell$ such that each area can be contained in a disc of diameter $\ell$.
Then we take the limit as $\ell \to 0$, $N \to \infty$, and the union of the pieces tends to $D$. For a function $f(\mathbf{r})$, we define the surface integral as
\[
  \int_D f(\mathbf{r})\,\mathrm{d}A = \lim_{\ell\to 0}\sum_I f(\mathbf{r}_I)\,\delta A_I,
\]
where $\mathbf{r}_I$ is some point within each subset $A_I$. The integral exists if the limit is well-defined (i.e. the same regardless of what $A_I$ and $\mathbf{r}_I$ we choose before we take the limit) and exists.
If we take f = 1, then the surface integral is the area of D.
On the other hand, if we put $z = f(x, y)$ and plot out the surface $z = f(x, y)$, then the area integral is the volume under the surface.
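The definition allows us to take the $\delta A_I$ to be any weird shape we want. However, the sensible thing is clearly to take $A_I$ to be rectangles.
We choose the small sets in the definition to be rectangles, each of size $\delta A_I = \delta x\,\delta y$. We sum over subsets in a narrow horizontal strip of height $\delta y$ with $y$ and $\delta y$ held constant. Take the limit as $\delta x \to 0$. We get a contribution $\delta y\int_{x_y} f(x, y)\,\mathrm{d}x$, where the integral is taken over the set $x_y = \{x : (x, y) \in D\}$.
[Figure: a horizontal strip of height $\delta y$ at height $y$ across the region $D$; $x$ ranges over $x_y$ and $y$ ranges over $Y$.]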
We sum over all such strips and take $\delta y \to 0$, giving
Proposition.
\[
  \int_D f(x, y)\,\mathrm{d}A = \int_Y\left(\int_{x_y} f(x, y)\,\mathrm{d}x\right)\mathrm{d}y,
\]
with $x_y$ ranging over $\{x : (x, y) \in D\}$.
Note that the range of the inner integral is given by a set $x_y$. This can be an interval, or many disconnected intervals, e.g. $x_y = [a_1, b_1] \cup [a_2, b_2]$. In this case,
\[
  \int_{x_y} f(x)\,\mathrm{d}x = \int_{a_1}^{b_1} f(x)\,\mathrm{d}x + \int_{a_2}^{b_2} f(x)\,\mathrm{d}x.
\]
This is useful if we want to integrate over a non-convex area, where a horizontal strip can be split into disconnected pieces.
[Figure: a region in the $(x, y)$ plane.]
We could also do it the other way round, integrating over $y$ first, and come up with the result
\[
  \int_D f(x, y)\,\mathrm{d}A = \int_X\left(\int_{y_x} f(x, y)\,\mathrm{d}y\right)\mathrm{d}x.
\]
Theorem (Fubini’s theorem). If $f$ is a continuous function and $D$ is a compact (i.e. closed and bounded) subset of $\mathbb{R}^2$, then
\[
  \iint f\,\mathrm{d}x\,\mathrm{d}y = \iint f\,\mathrm{d}y\,\mathrm{d}x.
\]
While we have rather strict conditions for this theorem, it actually holds in many more cases, but those situations have to be checked manually.
Definition (Area element). The area element is dA.
Proposition. dA = dx dy in Cartesian coordinates.
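Example. We integrate over the triangle bounded by $(0, 0)$, $(2, 0)$ and $(0, 1)$. We want to integrate the function $f(x, y) = x^2y$ over the area. So
\begin{align*}
  \int_D f(x, y)\,\mathrm{d}A &= \int_0^1\left(\int_0^{2-2y} x^2y\,\mathrm{d}x\right)\mathrm{d}y\\
  &= \int_0^1 y\left[\frac{x^3}{3}\right]_0^{2-2y}\mathrm{d}y\\
  &= \frac{8}{3}\int_0^1 y(1-y)^3\,\mathrm{d}y\\
  &= \frac{2}{15}
\end{align*}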
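We can integrate it the other way round:
\begin{align*}
  \int_D x^2y\,\mathrm{d}A &= \int_0^2\left(\int_0^{1-x/2} x^2y\,\mathrm{d}y\right)\mathrm{d}x\\
  &= \int_0^2 x^2\left[\frac{1}{2}y^2\right]_0^{1-x/2}\mathrm{d}x\\
  &= \int_0^2 \frac{x^2}{2}\left(1 - \frac{x}{2}\right)^2\mathrm{d}x\\
  &= \frac{2}{15}
\end{align*}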
Since it doesn’t matter whether we integrate
x
first or
y
first, if we find it
difficult to integrate one way, we can try doing it the other way and see if it is
easier.
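Both orders of integration over the triangle can be checked numerically with nested one-dimensional integrals. This is a minimal Python sketch using the midpoint rule; the helper `midpoint` and its step count are illustrative assumptions.

```python
def midpoint(g, a, b, steps=400):
    # Midpoint-rule approximation of the integral of g over [a, b].
    h = (b - a) / steps
    return sum(g(a + (k + 0.5) * h) for k in range(steps)) * h

# Integrate x first: for fixed y, x runs from 0 to 2 - 2y.
x_first = midpoint(
    lambda y: midpoint(lambda x: x ** 2 * y, 0.0, 2 - 2 * y),
    0.0, 1.0)

# Integrate y first: for fixed x, y runs from 0 to 1 - x/2.
y_first = midpoint(
    lambda x: midpoint(lambda y: x ** 2 * y, 0.0, 1 - x / 2),
    0.0, 2.0)
```

Both `x_first` and `y_first` should be close to $2/15 \approx 0.1333$. The inner limits depend on the outer variable, which is what encodes the shape of the triangle.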
While this integral is tedious in general, there is a special case where it is
substantially easier.
Definition (Separable function). A function $f(x, y)$ is separable if it can be written as $f(x, y) = h(y)g(x)$.
Proposition. Take separable $f(x, y) = h(y)g(x)$ and $D$ be a rectangle $\{(x, y) : a \leq x \leq b, c \leq y \leq d\}$. Then
\[
  \int_D f(x, y)\,\mathrm{d}x\,\mathrm{d}y = \left(\int_a^b g(x)\,\mathrm{d}x\right)\left(\int_c^d h(y)\,\mathrm{d}y\right)
\]
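3.2 Change of variables for an integral in $\mathbb{R}^2$
Proposition. Suppose we have a change of variables $(x, y) \leftrightarrow (u, v)$ that is smooth and invertible, with regions $D, D'$ in one-to-one correspondence. Then
\[
  \int_D f(x, y)\,\mathrm{d}x\,\mathrm{d}y = \int_{D'} f(x(u, v), y(u, v))\,|J|\,\mathrm{d}u\,\mathrm{d}v,
\]
where
\[
  J = \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v}\\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}
\]
is the Jacobian. In other words,
\[
  \mathrm{d}x\,\mathrm{d}y = |J|\,\mathrm{d}u\,\mathrm{d}v.
\]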
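Proof. Since we are writing $(x(u, v), y(u, v))$, we are actually transforming from $(u, v)$ to $(x, y)$ and not the other way round.
Suppose we start with an area $\delta A' = \delta u\,\delta v$ in the $(u, v)$ plane. Then by Taylor’s theorem, we have
\[
  \delta x = x(u + \delta u, v + \delta v) - x(u, v) \approx \frac{\partial x}{\partial u}\delta u + \frac{\partial x}{\partial v}\delta v.
\]
We have a similar expression for $\delta y$ and we obtain
\[
  \begin{pmatrix}\delta x\\ \delta y\end{pmatrix} \approx \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v}\\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix}\begin{pmatrix}\delta u\\ \delta v\end{pmatrix}
\]
Recall from Vectors and Matrices that the determinant of the matrix is how much it scales up an area. So the area formed by $\delta x$ and $\delta y$ is $|J|$ times the area formed by $\delta u$ and $\delta v$. Hence
\[
  \mathrm{d}x\,\mathrm{d}y = |J|\,\mathrm{d}u\,\mathrm{d}v.
\]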
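Example. We transform from $(x, y)$ to $(\rho, \varphi)$ with
\begin{align*}
  x &= \rho\cos\varphi\\
  y &= \rho\sin\varphi
\end{align*}
We have previously calculated that $|J| = \rho$. So
\[
  \mathrm{d}A = \rho\,\mathrm{d}\rho\,\mathrm{d}\varphi.
\]
Suppose we want to integrate a function over a quarter disc $D$ of radius $R$.
[Figure: the quarter disc $D$ of radius $R$ in the first quadrant of the $(x, y)$ plane.]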
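Let the function to be integrated be $f = \exp(-(x^2 + y^2)/2) = \exp(-\rho^2/2)$. Then
\begin{align*}
  \int f\,\mathrm{d}A &= \int f\rho\,\mathrm{d}\rho\,\mathrm{d}\varphi\\
  &= \int_{\rho=0}^R\left(\int_{\varphi=0}^{\pi/2} e^{-\rho^2/2}\rho\,\mathrm{d}\varphi\right)\mathrm{d}\rho
\end{align*}
Note that in polar coordinates, we are integrating over a rectangle and the function is separable. So this is equal to
\begin{align*}
  &= \left[-e^{-\rho^2/2}\right]_0^R\bigl[\varphi\bigr]_0^{\pi/2}\\
  &= \frac{\pi}{2}\left(1 - e^{-R^2/2}\right). \tag{$*$}
\end{align*}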
Note that the integral exists as R → ∞.
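Now we take the case of $x, y \to \infty$ and consider the original integral.
\begin{align*}
  \int_D f\,\mathrm{d}A &= \int_{x=0}^\infty\int_{y=0}^\infty e^{-(x^2+y^2)/2}\,\mathrm{d}x\,\mathrm{d}y\\
  &= \left(\int_0^\infty e^{-x^2/2}\,\mathrm{d}x\right)\left(\int_0^\infty e^{-y^2/2}\,\mathrm{d}y\right)\\
  &= \frac{\pi}{2}
\end{align*}
where the last line is from $(*)$ in the limit $R \to \infty$. So each of the two integrals must be $\sqrt{\pi/2}$, i.e.
\[
  \int_0^\infty e^{-x^2/2}\,\mathrm{d}x = \sqrt{\frac{\pi}{2}}.
\]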
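The value $\sqrt{\pi/2}$ can be confirmed by direct numerical integration. The Python sketch below truncates the infinite range at $x = 12$, where the integrand is negligibly small; the cut-off, the step count and the helper names are illustrative assumptions rather than anything prescribed by the notes.

```python
import math

def midpoint(g, a, b, steps=20000):
    # Midpoint-rule approximation of the integral of g over [a, b].
    h = (b - a) / steps
    return sum(g(a + (k + 0.5) * h) for k in range(steps)) * h

def quarter_disc_integral(R):
    # Closed form (*) for the integral of exp(-rho^2/2) over the quarter disc.
    return (math.pi / 2) * (1 - math.exp(-R ** 2 / 2))

# One-dimensional Gaussian integral on [0, infinity), truncated at x = 12.
one_dimensional = midpoint(lambda x: math.exp(-x ** 2 / 2), 0.0, 12.0)

# The square of the 1D integral should match the R -> infinity limit of (*).
squared = one_dimensional ** 2
limit_of_star = quarter_disc_integral(12.0)
```

Here `one_dimensional` should be close to $\sqrt{\pi/2} \approx 1.2533$, and `squared` and `limit_of_star` should both be close to $\pi/2$.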
3.3 Generalization to $\mathbb{R}^3$
We will do exactly the same thing as we just did, but with one more dimension:
Definition (Volume integral). Consider a volume $V \subseteq \mathbb{R}^3$ with position vector $\mathbf{r} = (x, y, z)$. We approximate $V$ by $N$ small disjoint subsets of some simple shape (e.g. cuboids) labelled by $I$, volume $\delta V_I$, contained within a solid sphere of diameter $\ell$.
Assume that as $\ell \to 0$ and $N \to \infty$, the union of the small subsets tends to $V$. Then
\[
  \int_V f(\mathbf{r})\,\mathrm{d}V = \lim_{\ell\to 0}\sum_I f(\mathbf{r}_I^*)\,\delta V_I,
\]
where $\mathbf{r}_I^*$ is any chosen point in each small subset.
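To evaluate this, we can take $\delta V_I = \delta x\,\delta y\,\delta z$, and take $\delta x \to 0$, $\delta y \to 0$ and $\delta z \to 0$ in some order. For example,
\[
  \int_V f(\mathbf{r})\,\mathrm{d}V = \int_D\left(\int_{Z_{xy}} f(x, y, z)\,\mathrm{d}z\right)\mathrm{d}x\,\mathrm{d}y.
\]
So we integrate $f(x, y, z)$ over $z$ at each point $(x, y)$, then take the integral of that over the area containing all required $(x, y)$.
Alternatively, we can take the area integral first, and have
\[
  \int_V f(\mathbf{r})\,\mathrm{d}V = \int_z\left(\int_{D_z} f(x, y, z)\,\mathrm{d}x\,\mathrm{d}y\right)\mathrm{d}z.
\]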
Again, if we take f = 1, then we obtain the volume of V .
Often, $f(\mathbf{r})$ is the density of some quantity, and is usually denoted by $\rho$. For example, we might have mass density, charge density, or probability density. $\rho(\mathbf{r})\,\delta V$ is then the amount of quantity in a small volume $\delta V$ at $\mathbf{r}$. Then $\int_V \rho(\mathbf{r})\,\mathrm{d}V$ is the total amount of quantity in $V$.
Definition (Volume element). The volume element is dV .
Proposition. dV = dx dy dz.
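We can change variables by some smooth, invertible transformation $(x, y, z) \mapsto (u, v, w)$. Then
Proposition.
\[
  \int_V f\,\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z = \int_{V'} f\,|J|\,\mathrm{d}u\,\mathrm{d}v\,\mathrm{d}w,
\]
where $V'$ is the region in $(u, v, w)$ corresponding to $V$, with
\[
  J = \frac{\partial(x, y, z)}{\partial(u, v, w)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w}\\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w}\\[2ex] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w} \end{vmatrix}
\]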
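Proposition. In cylindrical coordinates,
\[
  \mathrm{d}V = \rho\,\mathrm{d}\rho\,\mathrm{d}\varphi\,\mathrm{d}z.
\]
In spherical coordinates,
\[
  \mathrm{d}V = r^2\sin\theta\,\mathrm{d}r\,\mathrm{d}\theta\,\mathrm{d}\varphi.
\]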
Proof. Loads of algebra.
Example.
Suppose
f
(
r
) is spherically symmetric and
V
is a sphere of radius
a
centered on the origin. Then
Z
V
f dV =
Z
a
r=0
Z
π
θ=0
Z
2π
ϕ=0
f(r)r
2
sin θ dr dθ dϕ
=
Z
a
0
dr
Z
π
0
dθ
Z
2π
0
dϕ r
2
f(r) sin θ
=
Z
a
0
r
2
f(r)dr
h
− cos θ
i
π
0
h
ϕ
i
2π
0
= 4π
Z
a
0
f(r)r
2
dr.
where we separated the integral into three parts as in the area integrals.
Note that in the second line, we rewrote the integrals to write the differentials
next to the integral sign. This is simply a different notation that saves us from
writing r = 0 etc. in the limits of the integrals.
This is a useful general result. We understand it as the sum of spherical shells of thickness $\delta r$ and volume $4\pi r^2\,\delta r$.
If we take $f = 1$, then we have the familiar result that the volume of a sphere is $\frac{4}{3}\pi a^3$.
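The shell picture can be turned directly into a sum. The sketch below is an illustration under our own assumptions (the name `shell_sum`, the number of shells and the midpoint sampling are not from the lectures): it adds up $f(r)\,4\pi r^2\,\delta r$ over thin shells and compares the $f = 1$ case with $\frac{4}{3}\pi a^3$.

```python
import math

def shell_sum(f, a, n=1000):
    """Sum f(r) * 4*pi*r^2 * dr over n spherical shells of thickness dr = a/n,
    sampling each shell at its mid-radius."""
    dr = a / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        total += f(r) * 4 * math.pi * r * r * dr
    return total

a = 2.0
print(shell_sum(lambda r: 1.0, a), 4 / 3 * math.pi * a ** 3)
```

Any other spherically symmetric $f$ can be passed in place of the constant function.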
Example. Consider a volume within a sphere of radius $a$ with a cylinder of radius $b$ ($b < a$) removed. The region is defined as
\begin{align*}
  x^2 + y^2 + z^2 &\leq a^2\\
  x^2 + y^2 &\geq b^2.
\end{align*}
[Figure: the sphere of radius $a$ with the cylinder of radius $b$ removed.]
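We use cylindrical coordinates. The second criterion gives
\[
  b \leq \rho \leq a.
\]
For the $x^2 + y^2 + z^2 \leq a^2$ criterion, we have
\[
  -\sqrt{a^2 - \rho^2} \leq z \leq \sqrt{a^2 - \rho^2}.
\]
So the volume is
\begin{align*}
  \int_V \mathrm{d}V &= \int_b^a \mathrm{d}\rho \int_0^{2\pi} \mathrm{d}\varphi \int_{-\sqrt{a^2 - \rho^2}}^{\sqrt{a^2 - \rho^2}} \mathrm{d}z\; \rho\\
  &= 2\pi \int_b^a 2\rho\sqrt{a^2 - \rho^2}\,\mathrm{d}\rho\\
  &= 2\pi \left[-\frac{2}{3}(a^2 - \rho^2)^{3/2}\right]_b^a\\
  &= \frac{4}{3}\pi (a^2 - b^2)^{3/2}.
\end{align*}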
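As a sanity check on the closed form, we can evaluate the remaining $\rho$ integral numerically. This is a rough sketch: the midpoint rule, the helper names `midpoint` and `drilled_sphere_volume`, and the sample values of $a$ and $b$ are our own choices for illustration.

```python
import math

def midpoint(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(g(lo + (k + 0.5) * step) for k in range(n))

def drilled_sphere_volume(a, b):
    """Volume of x^2 + y^2 + z^2 <= a^2 with x^2 + y^2 >= b^2.
    The phi integral contributes 2*pi and the z integral 2*sqrt(a^2 - rho^2)."""
    return 2 * math.pi * midpoint(
        lambda rho: 2 * rho * math.sqrt(a * a - rho * rho), b, a)

a, b = 2.0, 1.0
print(drilled_sphere_volume(a, b), 4 / 3 * math.pi * (a * a - b * b) ** 1.5)
```

The two printed numbers should agree closely. Convergence is slightly slower than usual for the midpoint rule because the integrand has an unbounded derivative at $\rho = a$.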
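Example. Suppose the density of electric charge is $\rho(\mathbf{r}) = \rho_0 \frac{z}{a}$ in a hemisphere $H$ of radius $a$, with $z \geq 0$. What is the total charge of $H$?
We use spherical polars. So
\[
  r \leq a,\quad 0 \leq \varphi \leq 2\pi,\quad 0 \leq \theta \leq \frac{\pi}{2}.
\]
We have
\[
  \rho(\mathbf{r}) = \frac{\rho_0}{a} r\cos\theta.
\]
The total charge $Q$ in $H$ is
\begin{align*}
  \int_H \rho\,\mathrm{d}V &= \int_0^a \mathrm{d}r \int_0^{\pi/2} \mathrm{d}\theta \int_0^{2\pi} \mathrm{d}\varphi\; \frac{\rho_0}{a} r\cos\theta\, r^2\sin\theta\\
  &= \frac{\rho_0}{a} \int_0^a r^3\,\mathrm{d}r \int_0^{\pi/2} \sin\theta\cos\theta\,\mathrm{d}\theta \int_0^{2\pi} \mathrm{d}\varphi\\
  &= \frac{\rho_0}{a} \left[\frac{r^4}{4}\right]_0^a \left[\frac{1}{2}\sin^2\theta\right]_0^{\pi/2} \Big[\varphi\Big]_0^{2\pi}\\
  &= \frac{\rho_0 \pi a^3}{4}.
\end{align*}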
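To connect this back to the definition of the volume integral as a limit of sums, here is a minimal sketch that forms a triple midpoint sum of $\rho\, r^2 \sin\theta\, \delta r\, \delta\theta\, \delta\varphi$ over the hemisphere. The function name `hemisphere_charge`, the grid size and the sample values $\rho_0 = a = 1$ are assumptions made for illustration.

```python
import math

def hemisphere_charge(rho0, a, n=40):
    """Triple midpoint sum of (rho0 * z / a) * r^2 * sin(theta) over
    0 <= r <= a, 0 <= theta <= pi/2, 0 <= phi <= 2*pi."""
    dr = a / n
    dth = (math.pi / 2) / n
    dph = (2 * math.pi) / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            z = r * math.cos(th)
            # The integrand does not depend on phi, so every phi cell
            # contributes the same amount; we still loop to mirror the sum.
            for _ in range(n):
                total += (rho0 * z / a) * r * r * math.sin(th) * dr * dth * dph
    return total

rho0, a = 1.0, 1.0
print(hemisphere_charge(rho0, a), rho0 * math.pi * a ** 3 / 4)
```

Increasing `n` should bring the sum closer to $\rho_0 \pi a^3/4$, at the cost of $n^3$ evaluations.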
3.4 Further generalizations
Integration in $\mathbb{R}^n$
Similar to the above, $\int_D f(x_1, x_2, \cdots, x_n)\,\mathrm{d}x_1\,\mathrm{d}x_2 \cdots \mathrm{d}x_n$ is simply the integration over an $n$-dimensional volume. The change of variable formula is
Proposition.
\[
  \int_D f(x_1, x_2, \cdots, x_n)\,\mathrm{d}x_1\,\mathrm{d}x_2 \cdots \mathrm{d}x_n = \int_{D'} f(\{x_i(\mathbf{u})\})\,|J|\,\mathrm{d}u_1\,\mathrm{d}u_2 \cdots \mathrm{d}u_n.
\]
Change of variables for n = 1
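In the $n = 1$ case, the Jacobian is $\frac{\mathrm{d}x}{\mathrm{d}u}$. However, we use the following formula for change of variables:
\[
  \int_D f(x)\,\mathrm{d}x = \int_{D'} f(x(u)) \left|\frac{\mathrm{d}x}{\mathrm{d}u}\right| \mathrm{d}u.
\]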
We introduce the modulus because of our natural convention about integrating over $D$ and $D'$. If $D = [a, b]$ with $a < b$, we write $\int_a^b$. But if $a \mapsto \alpha$ and $b \mapsto \beta$, but $\alpha > \beta$, we would like to write $\int_\beta^\alpha$ instead, so we introduce the modulus in the 1D case.
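To show that the modulus is the right thing to do, we check case by case: if $a < b$ and $\alpha < \beta$, then $\frac{\mathrm{d}x}{\mathrm{d}u}$ is positive, and we have, as expected,
\[
  \int_a^b f(x)\,\mathrm{d}x = \int_\alpha^\beta f(x(u)) \frac{\mathrm{d}x}{\mathrm{d}u}\,\mathrm{d}u.
\]
If $\alpha > \beta$, then $\frac{\mathrm{d}x}{\mathrm{d}u}$ is negative. So
\[
  \int_a^b f(x)\,\mathrm{d}x = \int_\alpha^\beta f(x(u)) \frac{\mathrm{d}x}{\mathrm{d}u}\,\mathrm{d}u = -\int_\beta^\alpha f(x(u)) \frac{\mathrm{d}x}{\mathrm{d}u}\,\mathrm{d}u = \int_\beta^\alpha f(x(u)) \left|\frac{\mathrm{d}x}{\mathrm{d}u}\right| \mathrm{d}u.
\]
By taking the absolute value of $\frac{\mathrm{d}x}{\mathrm{d}u}$, we ensure that we always have the numerically smaller bound as the lower bound.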
This is not easily generalized to higher dimensions, so we don’t employ the
same trick in other cases.
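The role of the modulus in the 1D case can be seen with a decreasing substitution. The sketch below uses an example of our own choosing, not one from the lectures: $f(x) = x^2$ on $D = [0, 1]$ with $x = \cos u$, so that $D' = [0, \pi/2]$ and $\frac{\mathrm{d}x}{\mathrm{d}u} = -\sin u < 0$. The helper names are illustrative.

```python
import math

def midpoint(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(g(lo + (k + 0.5) * step) for k in range(n))

def f(x):
    return x * x               # arbitrary test integrand (an assumption)

def x_of_u(u):
    return math.cos(u)         # decreasing on [0, pi/2]: u = 0 -> 1, u = pi/2 -> 0

def dx_du(u):
    return -math.sin(u)

# Both sides integrate over D and D' with the smaller bound as the lower bound.
lhs = midpoint(f, 0.0, 1.0)
with_modulus = midpoint(lambda u: f(x_of_u(u)) * abs(dx_du(u)), 0.0, math.pi / 2)
without_modulus = midpoint(lambda u: f(x_of_u(u)) * dx_du(u), 0.0, math.pi / 2)
print(lhs, with_modulus, without_modulus)
```

With the modulus the two sides agree. Without it, the right-hand side comes out with the opposite sign.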
Vector-valued integrals
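We can define $\int_V \mathbf{F}(\mathbf{r})\,\mathrm{d}V$ in a similar way to $\int_V f(\mathbf{r})\,\mathrm{d}V$, as the limit of a sum over small contributions of volume. In practice, we integrate them componentwise. If
\[
  \mathbf{F}(\mathbf{r}) = F_i(\mathbf{r})\,\mathbf{e}_i,
\]
then
\[
  \int_V \mathbf{F}(\mathbf{r})\,\mathrm{d}V = \left(\int_V F_i(\mathbf{r})\,\mathrm{d}V\right) \mathbf{e}_i.
\]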
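For example, if a body has mass density $\rho(\mathbf{r})$, then its mass is
\[
  M = \int_V \rho(\mathbf{r})\,\mathrm{d}V
\]
and its center of mass is
\[
  \mathbf{R} = \frac{1}{M} \int_V \mathbf{r}\rho(\mathbf{r})\,\mathrm{d}V.
\]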
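Example. Consider a solid hemisphere $H$ with $r \leq a$, $z \geq 0$ with uniform density $\rho$. The mass is
\[
  M = \int_H \rho\,\mathrm{d}V = \frac{2}{3}\pi a^3 \rho.
\]
Now suppose that $\mathbf{R} = (X, Y, Z)$. By symmetry, we expect $X = Y = 0$. We can find this formally by
\begin{align*}
  X &= \frac{1}{M} \int_H x\rho\,\mathrm{d}V\\
  &= \frac{\rho}{M} \int_0^a \int_0^{\pi/2} \int_0^{2\pi} x\, r^2 \sin\theta\,\mathrm{d}\varphi\,\mathrm{d}\theta\,\mathrm{d}r\\
  &= \frac{\rho}{M} \int_0^a r^3\,\mathrm{d}r \times \int_0^{\pi/2} \sin^2\theta\,\mathrm{d}\theta \times \int_0^{2\pi} \cos\varphi\,\mathrm{d}\varphi\\
  &= 0
\end{align*}
as expected, where we used $x = r\sin\theta\cos\varphi$. Note that it evaluates to 0 because the integral of $\cos\varphi$ from $0$ to $2\pi$ is 0. Similarly, we obtain $Y = 0$.
Finally, we find $Z$. Using $z = r\cos\theta$,
\begin{align*}
  Z &= \frac{\rho}{M} \int_0^a r^3\,\mathrm{d}r \int_0^{\pi/2} \sin\theta\cos\theta\,\mathrm{d}\theta \int_0^{2\pi} \mathrm{d}\varphi\\
  &= \frac{\rho}{M} \left[\frac{a^4}{4}\right] \left[\frac{1}{2}\sin^2\theta\right]_0^{\pi/2} 2\pi\\
  &= \frac{3a}{8}.
\end{align*}
So $\mathbf{R} = (0, 0, 3a/8)$.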
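The centre of mass is a vector-valued integral, so it can be approximated componentwise as a sum over small volumes. The following sketch chops the bounding box of the hemisphere into small cubes and keeps those whose centres lie inside $H$. The function name, the grid size and the choice of a Cartesian grid are assumptions for illustration, and the answer is only accurate to roughly the grid spacing.

```python
def hemisphere_centre_of_mass(a=1.0, n=60):
    """Approximate R = (1/M) * integral of r * rho dV for a uniform solid
    hemisphere r <= a, z >= 0, by summing over small cubes whose centres lie
    inside it. The uniform density cancels between numerator and denominator,
    so it is taken to be 1."""
    h = 2 * a / n                  # cube side; the grid covers [-a, a]^2 x [0, a]
    dV = h ** 3
    mass = 0.0
    moment = [0.0, 0.0, 0.0]       # componentwise sum of r * dV
    for i in range(n):
        x = -a + (i + 0.5) * h
        for j in range(n):
            y = -a + (j + 0.5) * h
            for k in range(n // 2):
                z = (k + 0.5) * h
                if x * x + y * y + z * z <= a * a:
                    mass += dV
                    moment[0] += x * dV
                    moment[1] += y * dV
                    moment[2] += z * dV
    return tuple(m / mass for m in moment)

X, Y, Z = hemisphere_centre_of_mass()
print(X, Y, Z)  # compare with (0, 0, 3a/8) for a = 1
```

Here $X$ and $Y$ vanish up to rounding because the grid is symmetric about the $z$-axis, mirroring the symmetry argument in the example.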
4 Surfaces and surface integrals
4.1 Surfaces and Normal
So far, we have learnt how to do calculus with regions of the plane or space.
What we would like to do now is to study surfaces in $\mathbb{R}^3$. The first thing to figure out is how to specify surfaces. One way to specify a surface is to use an equation. We let $f$ be a smooth function on $\mathbb{R}^3$, and $c$ be a constant. Then $f(\mathbf{r}) = c$ defines a smooth surface (e.g. $x^2 + y^2 + z^2 = 1$ denotes the unit sphere).
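Now consider any curve $\mathbf{r}(u)$ on this surface, which we call $S$. Then by the chain rule, if we differentiate $f(\mathbf{r}) = c$ with respect to $u$, we obtain
\[
  \frac{\mathrm{d}}{\mathrm{d}u}[f(\mathbf{r}(u))] = \nabla f \cdot \frac{\mathrm{d}\mathbf{r}}{\mathrm{d}u} = 0.
\]
This means that $\nabla f$ is always perpendicular to $\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}u}$. Since $\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}u}$ is the tangent to the curve, $\nabla f$ is perpendicular to the tangent. Since this is true for any curve $\mathbf{r}(u)$, $\nabla f$ is perpendicular to any tangent of the surface. Therefore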
Proposition. ∇f is the normal to the surface f(r) = c.
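The chain-rule argument can be illustrated numerically. In this sketch we take $f = x^2 + y^2 + z^2$ and $c = 1$, pick an arbitrary curve on the unit sphere, approximate its tangent by central differences, and check that $\nabla f \cdot \frac{\mathrm{d}\mathbf{r}}{\mathrm{d}u}$ is close to zero. The particular curve and the function names are our own choices for illustration.

```python
import math

def curve(u):
    """A curve on the unit sphere (chosen arbitrarily for illustration)."""
    return (math.sin(u) * math.cos(2 * u),
            math.sin(u) * math.sin(2 * u),
            math.cos(u))

def grad_f(p):
    """Gradient of f(x, y, z) = x^2 + y^2 + z^2, namely 2r."""
    return tuple(2 * c for c in p)

def tangent(u, eps=1e-6):
    """Central-difference approximation to dr/du."""
    plus, minus = curve(u + eps), curve(u - eps)
    return tuple((a - b) / (2 * eps) for a, b in zip(plus, minus))

def grad_dot_tangent(u):
    """The dot product of grad f with dr/du at the point r(u)."""
    return sum(g * t for g, t in zip(grad_f(curve(u)), tangent(u)))

for u in (0.3, 1.1, 2.5):
    print(u, grad_dot_tangent(u))
```

The printed dot products should be zero up to finite-difference error, consistent with $\nabla f$ being normal to the sphere.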
Example.
(i) Take the sphere $f(\mathbf{r}) = x^2 + y^2 + z^2 = c$ for $c > 0$. Then $\nabla f = 2(x, y, z) = 2\mathbf{r}$, which is clearly normal to the sphere.
(ii) Take $f(\mathbf{r}) = x^2 + y^2 - z^2 = c$, which is a hyperboloid. Then $\nabla f = 2(x, y, -z)$.
In the special case where $c = 0$, we have a double cone, with a singular apex $\mathbf{0}$. Here $\nabla f = \mathbf{0}$, and we cannot find a meaningful direction of normal.
Definition (Boundary). A surface $S$ can be defined to have a boundary $\partial S$ consisting of a piecewise smooth curve. If we define $S$ as in the above examples but with the additional restriction $z \geq 0$, then $\partial S$ is the circle $x^2 + y^2 = c$, $z = 0$.
A surface is bounded if it can be contained in a solid sphere, unbounded
otherwise. A bounded surface with no boundary is called closed (e.g. sphere).
Example.
The boundary of a hemisphere is a circle (drawn in red).
Definition (Orientable surface). At each point, there is a unit normal $\mathbf{n}$ that's unique up to a sign.
If we can find a consistent choice of $\mathbf{n}$ that varies smoothly across $S$, then we say $S$ is orientable, and the choice of sign of $\mathbf{n}$ is called the orientation of the surface.
Most surfaces we encounter are orientable. For example, for a sphere, we can
declare that the normal should always point outwards. A notable example of a
non-orientable surface is the Möbius strip (or Klein bottle).
For simple cases, we can describe the orientation as “inward” and “outward”.
4.2 Parametrized surfaces and area
However, specifying a surface by an equation $f(\mathbf{r}) = c$ is often not too helpful. What we would like is to put some coordinate system onto the surface, so that we can label each point by a pair of numbers $(u, v)$, just like how we label points in the $x, y$-plane by $(x, y)$. We write $\mathbf{r}(u, v)$ for the point labelled by $(u, v)$.
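Example. Let $S$ be part of a sphere of radius $a$ with $0 \leq \theta \leq \alpha$.
[Figure: the part of the sphere with polar angle between $0$ and $\alpha$.]
We can then label the points on the sphere by the angles $\theta, \varphi$, with
\[
  \mathbf{r}(\theta, \varphi) = (a\sin\theta\cos\varphi, a\sin\theta\sin\varphi, a\cos\theta) = a\mathbf{e}_r.
\]
We restrict the values of $\theta, \varphi$ by $0 \leq \theta \leq \alpha$, $0 \leq \varphi \leq 2\pi$, so that each point is only covered once.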
Note that to specify a surface, in addition to the function $\mathbf{r}$, we also have to specify what values of $(u, v)$ we are allowed to take. This corresponds to a region $D$ of allowed values of $u$ and $v$. When we do integrals with these surfaces, these will become the bounds of integration.
When we have such a parametrization $\mathbf{r}$, we would want to make sure this indeed gives us a two-dimensional surface. For example, the following two parametrizations would both be bad:
\[
  \mathbf{r}(u, v) = u,\quad \mathbf{r}(u, v) = u + v.
\]
The idea is that $\mathbf{r}$ has to depend on both $u$ and $v$, and in “different ways”.
More precisely, when we vary the coordinates $(u, v)$, the point $\mathbf{r}$