IA Vector Calculus (Full)

Part IA — Vector Calculus

Based on lectures by B. Allanach

Notes taken by Dexter Chua

Lent 2015

These notes are not endorsed by the lecturers, and I have modified them (often

significantly) after lectures. They are nowhere near accurate representations of what

was actually lectured, and in particular, all errors are almost surely mine.

Curves in R

Parameterised curves and arc length, tangents and normals to curves in

, the radius

of curvature. [1]

Integration in R

and R

Line integrals. Surface and volume integrals: definitions, examples using Cartesian,

cylindrical and spherical coordinates; change of variables. [4]

Vector operators

Directional derivatives. The gradient of a real-valued function: definition; interpretation

as normal to level surfaces; examples including the use of cylindrical, spherical *and

general orthogonal curvilinear* coordinates.

Divergence, curl and

∇

in Cartesian coordinates, examples; formulae for these oper-

ators (statement only) in cylindrical, spherical *and general orthogonal curvilinear*

co ordinates. Solenoidal fields, irrotational fields and conservative fields; scalar poten-

tials. Vector derivative identities. [5]

Integration theorems

Divergence theorem, Green’s theorem, Stokes’s theorem, Green’s second theorem:

statements; informal proofs; examples; application to fluid dynamics, and to electro-

magnetism including statement of Maxwell’s equations. [5]

Laplace’s equation

Laplace’s equation in

and

: uniqueness theorem and maximum principle. Solution

of Poisson’s equation by Gauss’s method (for spherical and cylindrical symmetry) and

as an integral. [4]

Cartesian tensors in R

Tensor transformation laws, addition, multiplication, contraction, with emphasis on

tensors of second rank. Isotropic second and third rank tensors. Symmetric and

antisymmetric tensors. Revision of principal axes and diagonalization. Quotient

theorem. Examples including inertia and conductivity. [5]

Contents

0 Introduction

1 Derivatives and coordinates

1.1 Derivative of functions

1.2 Inverse functions

1.3 Coordinate systems

2 Curves and Line

2.1 Parametrised curves, lengths and arc length

2.2 Line integrals of vector fields

2.3 Gradients and Differentials

2.4 Work and potential energy

3 Integration in R

and R

3.1 Integrals over subsets of R

3.2 Change of variables for an integral in R

3.3 Generalization to R

3.4 Further generalizations

4 Surfaces and surface integrals

4.1 Surfaces and Normal

4.2 Parametrized surfaces and area

4.3 Surface integral of vector fields

4.4 Change of variables in R

and R

revisited

5 Geometry of curves and surfaces

6 Div, Grad, Curl and ∇

6.1 Div, Grad, Curl and ∇

6.2 Second-order derivatives

7 Integral theorems

7.1 Statement and examples

7.1.1 Green’s theorem (in the plane)

7.1.2 Stokes’ theorem

7.1.3 Divergence/Gauss theorem

7.2 Relating and proving integral theorems

8 Some applications of integral theorems

8.1 Integral expressions for div and curl

8.2 Conservative fields and scalar products

8.3 Conservation laws

9 Orthogonal curvilinear coordinates

9.1 Line, area and volume elements

9.2 Grad, Div and Curl

10 Gauss’ Law and Poisson’s equation

10.1 Laws of gravitation

10.2 Laws of electrostatics

10.3 Poisson’s Equation and Laplace’s equation

11 Laplace’s and Poisson’s equations

11.1 Uniqueness theorems

11.2 Laplace’s equation and harmonic functions

11.2.1 The mean value property

11.2.2 The maximum (or minimum) principle

11.3 Integral solutions of Poisson’s equations

11.3.1 Statement and informal derivation

11.3.2 Point sources and δ-functions*

12 Maxwell’s equations

12.1 Laws of electromagnetism

12.2 Static charges and steady currents

12.3 Electromagnetic waves

13 Tensors and tensor fields

13.1 Definition

13.2 Tensor algebra

13.3 Symmetric and antisymmetric tensors

13.4 Tensors, multi-linear maps and the quotient rule

13.5 Tensor calculus

14 Tensors of rank 2

14.1 Decomposition of a second-rank tensor

14.2 The inertia tensor

14.3 Diagonalization of a symmetric second rank tensor

15 Invariant and isotropic tensors

15.1 Definitions and classification results

15.2 Application to invariant integrals

0 Introduction

In the differential equations class, we learnt how to do calculus in one dimension.

However, (apparently) the world has more than one dimension. We live in a

3 (or 4) dimensional world, and string theorists think that the world has more

than 10 dimensions. It is thus important to know how to do calculus in many

dimensions.

For example, the position of a particle in a three dimensional world can be

given by a position vector

. Then by definition, the velocity is given by

This would require us to take the derivative of a vector.

This is not too difficult. We can just differentiate the vector componentwise.

However, we can reverse the problem and get a more complicated one. We can

assign a number to each point in (3D) space, and ask how this number changes

as we move in space. For example, the function might tell us the temperature at

each point in space, and we want to know how the temperature changes with

position.

In the most general case, we will assign a vector to each point in space. For

example, the electric field vector E(x) tells us the direction of the electric field

at each point in space.

On the other side of the story, we also want to do integration in multiple

dimensions. Apart from the obvious “integrating a vector”, we might want to

integrate over surfaces. For example, we can let

(

) be the velocity of some

fluid at each point in space. Then to find the total fluid flow through a surface,

we integrate v over the surface.

In this course, we are mostly going to learn about doing calculus in many

dimensions. In the last few lectures, we are going to learn about Cartesian

tensors, which is a generalization of vectors.

Note that throughout the course (and lecture notes), summation convention

is implied unless otherwise stated.

1 Derivatives and coordinates

1.1 Derivative of functions

We used to define a derivative as the limit of a quotient and a function is differ-

entiable if the derivative exists. However, this obviously cannot be generalized

to vector-valued functions, since you cannot divide by vectors. So we want

an alternative definition of differentiation, which can be easily generalized to

vectors.

Recall, that if a function

is differentiable at

, then for a small perturbation

δx, we have

δf

def

= f (x + δx) − f(x) = f

(x)δx + o(δx),

which says that the resulting change in

is approximately proportional to

δx

(as opposed to 1

/δx

or something else). It can be easily shown that the converse

is true — if f satisfies this relation, then f is differentiable.

This definition is more easily extended to vector functions. We say a function

is differentiable if, when

is perturbed by

δx

, then the resulting change is

“something” times

δx

plus an

(

δx

) error term. In the most general case,

δx

will

be a vector and that “something” will be a matrix. Then that “something” will

be what we call the derivative.

Vector functions R → R

We start with the simple case of vector functions.

Definition (Vector function). A vector function is a function F : R → R

This takes in a number and returns a vector. For example, it can map a time

to the velocity of a particle at that time.

Definition

(Derivative of vector function)

A vector function

(

) is differen-

tiable if

δF

def

= F(x + δx) − F(x) = F

(x)δx + o(δx)

for some F

(x). F

(x) is called the derivative of F(x).

We don’t have anything new and special here, since we might as well have

defined F

(x) as

= lim

δx→0

δx

[F(x + δx) − F(x)],

which is easily shown to be equivalent to the above definition.

Using differential notation, the differentiability condition can be written as

dF = F

(x) dx.

Given a basis

that is independent of

, vector differentiation is performed

componentwise, i.e.

Proposition.

(x) = F

(x)e

Leibnitz identities hold for the products of scalar and vector functions.

Proposition.

(fg ) =

g + f

(g · h) =

· h + g ·

(g × h) =

× h + g ×

Note that the order of multiplication must be retained in the case of the cross

product.

Example.

Consider a particle with mass

. It has position

(

), velocity

(

)

and acceleration

r. Its momentum is p = m

r(t).

Note that derivatives with respect to

are usually denoted by dots instead

of dashes.

If F(r) is the force on a particle, then Newton’s second law states that

p = m

r = F.

We can define the angular momentum about the origin to be

L = r × p = mr ×

If we want to know how the angular momentum changes over time, then

L = m

r ×

r + mr ×

r = mr ×

r = r × F.

which is the torque of F about the origin.

Scalar functions R

→ R

We can also define derivatives for a different kind of function:

Definition. A scalar function is a function f : R

→ R.

A scalar function takes in a position and gives you a number, e.g. the potential

energy of a particle at different positions.

Before we define the derivative of a scalar function, we have to first define

what it means to take a limit of a vector.

Definition

(Limit of vector)

The limit of vectors is defined using the norm.

So v → c iff |v − c| → 0. Similarly, f (r) = o(r) means

|f(r)|

|r|

→ 0 as r → 0.

Definition

(Gradient of scalar function)

A scalar function

(

) is differentiable

at r if

δf

def

= f (r + δr) − f(r) = (∇f ) · δr + o(δr)

for some vector ∇f , the gradient of f at r.

Here we have a fancy name “gradient” for the derivative. But we will soon

give up on finding fancy names and just call everything the “derivative”!

Note also that here we genuinely need the new notion of derivative, since

“dividing by δr” makes no sense at all!

The above definition considers the case where

δr

comes in all directions.

What if we only care about the case where

δr

is in some particular direction

For example, maybe

is the potential of a particle that is confined to move in

one straight line only.

Then taking δr = hn, with n a unit vector,

f(r + hn) − f(r) = ∇f · (hn) + o(h) = h(∇f · n) + o(h),

which gives

Definition (Directional derivative). The directional derivative of f along n is

n · ∇f = lim

h→0

[f(r + hn) − f(r)],

It refers to how fast f changes when we move in the direction of n.

Using this expression, the directional derivative is maximized when

is in

the same direction as

∇f

(then

n · ∇f

|∇f|

). So

∇f

points in the direction

of greatest slope.

How do we evaluate

∇f

? Suppose we have an orthonormal basis

. Setting

n = e

in the above equation, we obtain

· ∇f = lim

h→0

[f(r + he

) − f(r)] =

∂f

∂x

Hence

Theorem. The gradient is

∇f =

∂f

∂x

Hence we can write the condition of differentiability as

δf =

∂f

∂x

δx

+ o(δx).

In differential notation, we write

df = ∇f · dr =

∂f

∂x

which is the chain rule for partial derivatives.

Example. Take f(x, y, z) = x + e

sin z. Then

∇f =



∂f

∂x

∂f

∂y

∂f

∂z



= (1 + ye

sin z, xe

sin z, e

cos z)

At (x, y, z) = (0, 1, 0), ∇f = (1, 0, 1). So f increases/decreases most rapidly for

√

1) with a rate of change of

√

. There is no change in

perpendicular to ±

√

(1, 0, 1).

Now suppose we have a scalar function

(

) and we want to consider the rate

of change along a path

(

). A change

δu

produces a change

δr

δu

(

δu

and

δf = ∇f · δr + o(|δr|) = ∇f · r

(u)δu + o(δu).

This shows that f is differentiable as a function of u and

Theorem (Chain rule). Given a function f(r(u)),

= ∇f ·

∂f

∂x

Note that if we drop the du, we simply get

df = ∇f · dr =

∂f

∂x

which is what we’ve previously had.

Vector fields R

→ R

We are now ready to tackle the general case, which are given the fancy name of

vector fields.

Definition (Vector field). A vector field is a function F : R

→ R

Definition

(Derivative of vector field)

A vector field

→ R

is differen-

tiable if

δF

def

= F(x + δx) − F(x) = Mδx + o(δx)

for some m × n matrix M . M is the derivative of F.

As promised, M does not have a fancy name.

Given an arbitrary function

→ R

that maps

x 7→ y

and a choice

of basis, we can write

as a set of

functions

(

) such that

, y

, ··· , y

). Then

∂F

∂x

and we can write the derivative as

Theorem. The derivative of F is given by

∂y

∂x

Note that we could have used this as the definition of the derivative. However,

the original definition is superior because it does not require a selection of

coordinate system.

Definition.

A function is smooth if it can be differentiated any number of times.

This requires that all partial derivatives exist and are totally symmetric in

i, j

and k (i.e. the differential operator is commutative).

The functions we will consider will be smooth except where things obviously

go wrong (e.g. f (x) = 1/x at x = 0).

Theorem

(Chain rule)

Suppose

→ R

and

→ R

. Suppose that

the coordinates of the vectors in

, R

and

are

, x

and

respectively.

By the chain rule,

∂y

∂u

∂y

∂x

∂u

with summation implied. Writing in matrix form,

M(f ◦ g)

= M(f)

M(g)

Alternatively, in operator form,

∂

∂u

∂x

∂u

∂

∂x

1.2 Inverse functions

Suppose

g, f

→ R

are inverse functions, i.e.

g ◦f

f ◦ g

. Suppose

that f (x) = u and g(u) = x.

Since the derivative of the identity function is the identity matrix (if you

differentiate x wrt to x, you get 1), we must have

M(f ◦ g) = I.

Therefore we know that

M(g) = M(f)

−1

We derive this result more formally by noting

∂u

= δ

So by the chain rule,

∂u

∂x

∂u

= δ

i.e. M (f ◦ g) = I.

In the n = 1 case, it is the familiar result that du/dx = 1/(dx/du).

Example.

For

= 2, write

and let

ρ cos ϕ

and

ρ sin ϕ

. Then the function used to convert between the coordinate systems is

g(u

, u

) = (u

cos u

, u

sin u

)

Then

M(g) =



∂x

/∂ρ ∂x

/∂ϕ

∂x

/∂ρ ∂x

/∂ϕ





cos ϕ −ρ sin ϕ

sin ϕ ρ cos ϕ



We can invert the relations between (x

, x

) and (ρ, ϕ) to obtain

ϕ = tan

−1

ρ =

+ x

We can calculate

M(f) =



∂ρ/∂x

∂ϕ/∂x



= M(g)

−1

These matrices are known as Jacobian matrices, and their determinants are

known as the Jacobians.

Note that

det M(f) det M (g) = 1.

1.3 Coordinate systems

Now we can apply the results above the changes of coordinates on Euclidean

space. Suppose

are the coordinates are Cartesian coordinates. Then we can

define an arbitrary new coordinate system

in which each coordinate

is a

function of x. For example, we can define the plane polar coordinates ρ, ϕ by

= ρ cos ϕ, x

= ρ sin ϕ.

However, note that

and

are not components of a position vector, i.e. they

are not the “coefficients” of basis vectors like

are. But we can

associate related basis vectors that point to directions of increasing

and

obtained by differentiating

with respect to the variables and then normalizing:

= cos ϕ e

+ sin ϕ e

, e

= −sin ϕ e

+ cos ϕ e

These are not “usual” basis vectors in the sense that these basis vectors vary

with position and are undefined at the origin. However, they are still very useful

when dealing with systems with rotational symmetry.

In three dimensions, we have cylindrical polars and spherical polars.

Cylindrical polars Spherical polars

Conversion formulae

= ρ cos ϕ x

= r sin θ cos ϕ

= ρ sin ϕ x

= r sin θ sin ϕ

= z x

= r cos θ

Basis vectors

= (cos ϕ, sin ϕ, 0) e

= (sin θ cos ϕ, sin θ sin ϕ, cos θ)

= (−sin ϕ, cos ϕ, 0) e

= (−sin ϕ, cos ϕ, 0)

= (0, 0, 1) e

= (cos θ cos ϕ, cos θ sin ϕ, −sin θ)

2 Curves and Line

2.1 Parametrised curves, lengths and arc length

There are many ways we can described a curve. We can, say, describe it by

a equation that the points on the curve satisfy. For example, a circle can be

described by

= 1. However, this is not a good way to do so, as it is

rather difficult to work with. It is also often difficult to find a closed form like

this for a curve.

Instead, we can imagine the curve to be specified by a particle moving along

the path. So it is represented by a function

R → R

, and the curve itself is

the image of the function. This is known as a parametrisation of a curve. In

addition to simplified notation, this also has the benefit of giving the curve an

orientation.

Definition

(Parametrisation of curve)

Given a curve

, a parametrisation

of it is a continuous and invertible function

D → R

for some

D ⊆ R

whose

image is C.

(

) is a vector tangent to the curve at each point. A parametrization is

regular if r

(u) 6= 0 for all u.

Clearly, a curve can have many different parametrizations.

Example. The curve

+ y

= 1, y ≥ 0, z = 3.

can be parametrised by 2 cos u

i + sin u

j + 3

If we change

(and hence

) by a small amount, then the distance

|δr|

roughly equal to the change in arclength

δs

. So

δs

|δr|

(

δr

). Then we have

Proposition. Let s denote the arclength of a curve r(u). Then

= ±



= ±|r

(u)|

with the sign depending on whether it is in the direction of increasing or decreasing

arclength.

Example. Consider a helix described by r(u) = (3 cos u, 3 sin u, 4u). Then

(u) = (−3 sin u, 3 cos u, 4)

= |r

(u)| =

+ 4

= 5

So s = 5u. i.e. the arclength from r(0) and r(u) is s = 5u.

We can change parametrisation of

by taking an invertible smooth function

u 7→ ˜u

, and have a new parametrization

(

˜u

) =

(

˜u

(

)). Then by the chain rule,

d˜u

It is often convenient to use the arclength

as the parameter. Then the tangent

vector will always have unit length since the proposition above yields

(s)| =

= 1.

We call d

the scalar line element, which will be used when we consider integrals.

Definition (Scalar line element). The scalar line element of C is ds.

Proposition. ds = ±|r

(u)|du

2.2 Line integrals of vector fields

Definition

(Line integral)

The line integral of a smooth vector field

(

) along

a path

parametrised by

(

) along the direction (orientation)

(

)

→ r

(

) is

F(r) · dr =

F(r(u)) · r

(u) du.

We say d

(

is the line element on

. Note that the upper and lower

limits of the integral are the end point and start point respectively, and

is not

necessarily larger than α.

For example, we may be moving a particle from

along a curve

under a force field

. Then we may divide the curve into many small segments

δr

. Then for each segment, the force experienced is

(

) and the work done is

F(r) · δr. Then the total work done across the curve is

W =

F(r) · dr.

Example.

Take

(

) = (

, z

, xy

) and we want to find the line integral from

a = (0, 0, 0) to b = (1, 1, 1).

We first integrate along the curve

(

) = (

u, u

, u

). Then

(

) =

(1, 2u, 3u

), and F(r(u)) = (ue

, u

). So

F · dr =

F · r

(u) du

+ 2u

+ 3u

−

Now we try to integrate along another curve

(

) = (

t, t, t

). So

(

) =

(1, 1, 1).

F · dr =

F · r

(t)dt

+ 2t

We see that the line integral depends on the curve C in general, not just a, b.

We can also use the arclength

as the parameter. Since d

, with

being the unit tangent vector, we have

F · dr =

F · t ds.

Note that we do not necessarily have to integrate

F ·t

with respect to

. We can

also integrate a scalar function as a function of

(

) d

. By convention,

this is calculated in the direction of increasing s. In particular, we have

1 ds = length of C.

Definition

(Closed curve)

A closed curve is a curve with the same start and

end point. The line integral along a closed curve is (sometimes) written as

and is (sometimes) called the circulation of F around C.

Sometimes we are not that lucky and our curve is not smooth. For example,

the graph of an absolute value function is not smooth. However, often we can

break it apart into many smaller segments, each of which is smooth. Alternatively,

we can write the curve as a sum of smooth curves. We call these piecewise smooth

curves.

Definition

(Piecewise smooth curve)

A piecewise smooth curve is a curve

···

with all

smooth with regular parametrisations. The

line integral over a piecewise smooth C is

F · dr =

F · dr +

F · dr + ··· +

F · dr.

Example.

Take the example above, and let

−C

. Then

piecewise smooth but not smooth. Then

F · dr =

F · dr +

F · dr





−

= −

2.3 Gradients and Differentials

Recall that the line integral depends on the actual curve taken, and not just the

end points. However, for some nice functions, the integral does depend on the

end points only.

Theorem. If F = ∇f (r), then

F · dr = f(b) − f(a),

where b and a are the end points of the curve.

In particular, the line integral does not depend on the curve, but the end

points only. This is the vector counterpart of the fundamental theorem of

calculus. A special case is when C is a closed curve, then

F · dr = 0.

Proof.

Let

(

) be any parametrization of the curve, and suppose

(

b = r(β). Then

F · dr =

∇f · dr =

∇f ·

du.

So by the chain rule, this is equal to

(f(r(u))) du = [f(r(u))]

= f(b) − f (a).

Definition

(Conservative vector field)

∇f

for some

, the

is called a

conservative vector field.

The name conservative comes from mechanics, where conservative vector

fields represent conservative forces that conserve energy. This is since if the

force is conservative, then the integral (i.e. work done) about a closed curve is 0,

which means that we cannot gain energy after travelling around the loop.

It is convenient to treat differentials

F ·

as if they were objects

by themselves, which we can integrate along curves if we feel like doing so.

Then we can define

Definition

(Exact differential)

A differential

F ·

is exact if there is an

such that F = ∇f. Then

df = ∇f · dr =

∂f

∂x

To test if this holds, we can use the necessary condition

Proposition. If F = ∇f for some f , then

∂F

∂x

∂F

∂x

This is because both are equal to ∂

f/∂x

∂x

For an exact differential, the result from the previous section reads

F · dr =

df = f(b) −f(a).

Differentials can be manipulated using (for constant λ, µ):

Proposition.

d(λf + µg) = λdf + µdg

d(fg) = (df)g + f(dg)

Using these, it may be possible to find f by inspection.

Example. Consider

y sin z dx + x

sin z dy + x

y cos z dz.

We see that if we integrate the first term with respect to

, we obtain

y sin z

We obtain the same thing if we integrate the second and third term. So this is

equal to

d(x

y sin z) = [x

y sin z]

2.4 Work and potential energy

Definition

(Work and potential energy)

(

) is a force, then

F ·

the work done by the force along the curve

. It is the limit of a sum of terms

F(r) · δr, i.e. the force along the direction of δr.

Consider a point particle moving under

(

) according to Newton’s second

law: F(r) = m

Since the kinetic energy is defined as

T (t) =

the rate of change of energy is

T (t) = m

r ·

r = F ·

Suppose the path of particle is a curve C from a = r(α) to b = r(β), Then

T (β) − T (α) =

dt =

F ·

r dt =

F · dr.

So the work done on the particle is the change in kinetic energy.

Definition

(Potential energy)

Given a conservative force

−∇V

(

) is

the potential energy. Then

F · dr = V (a) − V (b).

Therefore, for a conservative force, we have

∇V

, where

(

) is the

potential energy.

So the work done (gain in kinetic energy) is the loss in potential energy. So

the total energy T + V is conserved, i.e. constant during motion.

We see that energy is conserved for conservative forces. In fact, the converse

is true — the energy is conserved only for conservative forces.

3 Integration in R

and R

3.1 Integrals over subsets of R

Definition

(Surface integral)

Let

D ⊆ R

. Let

= (

x, y

) be in Cartesian

coordinates. We can approximate

disjoint subsets of simple shapes, e.g.

triangles, parallelograms. These shapes are labelled by I and have areas δA

To integrate a function

over

, we would like to take the sum

(

)

δA

and take the limit as

δA

→

0. But we need a condition stronger than simply

δA

→

0. We won’t want the areas to grow into arbitrarily long yet thin strips

whose area decreases to 0. So we say that we find an

such that each area can

be contained in a disc of diameter `.

Then we take the limit as

` →

N → ∞

, and the union of the pieces tends

to D. For a function f(r), we define the surface integral as

f(r) dA = lim

`→0

f(r

)δA

where

is some point within each subset

. The integral exists if the limit

is well-defined (i.e. the same regardless of what

and

we choose before we

take the limit) and exists.

If we take f = 1, then the surface integral is the area of D.

On the other hand, if we put

(

x, y

) and plot out the surface

(

x, y

then the area integral is the volume under the surface.

The definition allows us to take the

δA

to be any weird shape we want.

However, the sensible thing is clearly to take A

to be rectangles.

We choose the small sets in the definition to be rectangles, each of size

δA

δxδy

. We sum over subsets in a narrow horizontal strip of height

δy

with

and

δy

held constant. Take the limit as

δx →

0. We get a contribution

δy

f(y, x) dx with range x

∈ {x : (x, y) ∈ D}.

δy

We sum over all such strips and take δy → 0, giving

Proposition.

f(x, y) dA =

f(x, y) dx

dy.

with x

ranging over {x : (x, y) ∈ D}.

Note that the range of the inner integral is given by a set

. This can be an

interval, or many disconnected intervals, x

= [a

, b

] ∪ [a

, b

]. In this case,

f(x) dx =

f(x) dx +

f(x) dx.

This is useful if we want to integrate over a concave area and we have disconnected

vertical strips.

We could also do it the other way round, integrating over

first, and come up

with the result

f(x, y) dA =



f(x, y) dy



dx.

Theorem

(Fubini’s theorem)

is a continuous function and

is a compact

(i.e. closed and bounded) subset of R

, then

f dx dy =

f dy dx.

While we have rather strict conditions for this theorem, it actually holds in many

more cases, but those situations have to be checked manually.

Definition (Area element). The area element is dA.

Proposition. dA = dx dy in Cartesian coordinates.

Example.

We integrate over the triangle bounded by (0

0) and (0

1).

We want to integrate the function f(x, y) = x

y over the area. So

f(xy) dA =



2−2y

y dx







2−2y

y(1 − y)

We can integrate it the other way round:

y dA =

1−x/2

y dy dx





1−x/2



1 −



Since it doesn’t matter whether we integrate

first or

first, if we find it

difficult to integrate one way, we can try doing it the other way and see if it is

easier.

While this integral is tedious in general, there is a special case where it is

substantially easier.

Definition

(Separable function)

A function

(

x, y

) is separable if it can be

written as f (x, y) = h(y)g(x).

Proposition.

Take separable

(

x, y

) =

(

)

(

) and

be a rectangle

{

(

x, y

) :

a ≤ x ≤ b, c ≤ y ≤ d}. Then

f(x, y) dx dy =

g(x) dx

h(y) dy

3.2 Change of variables for an integral in R

Proposition.

Suppose we have a change of variables (

x, y

)

↔

(

u, v

) that is

smooth and invertible, with regions D, D

in one-to-one correspondence. Then

f(x, y) dx dy =

f(x(u, v), y(u, v))|J| du dv,

where

J =

∂(x, y)

∂(u, v)



∂x

∂u

∂x

∂v

∂y

∂u

∂y

∂v



is the Jacobian. In other words,

dx dy = |J| du dv.

Proof.

Since we are writing (

(

u, v

)

, y

(

u, v

)), we are actually transforming from

(u, v) to (x, y) and not the other way round.

Suppose we start with an area

δA

δuδv

in the (

u, v

) plane. Then by

Taylors’ theorem, we have

δx = x(u + δu, v + δv) − x(u, v) ≈

∂x

∂u

δu +

∂x

∂v

δv.

We have a similar expression for δy and we obtain



δx

δy



≈



∂x

∂u

∂x

∂v

∂y

∂u

∂y

∂v



δu

δv



Recall from Vectors and Matrices that the determinant of the matrix is how

much it scales up an area. So the area formed by

δx

and

δy

|J|

times the area

formed by δu and δv. Hence

dx dy = |J| du dv.

Example. We transform from (x, y) to (ρ, ϕ) with

x = ρ cos ϕ

y = ρ sin ϕ

We have previously calculated that |J| = ρ. So

dA = ρ dρ dϕ.

Suppose we want to integrate a function over a quarter area D of radius R.

Let the function to be integrated be

exp

(

−

(

)

2) =

exp

(

−ρ

2). Then

f dA =

fρ dρ dϕ

ρ=0

π/2

ϕ=0

−ρ

ρ dϕ

δρ

Note that in polar coordinates, we are integrating over a rectangle and the

function is separable. So this is equal to

−e

−ρ

[ϕ]

π/2



1 − e

−R



. (∗)

Note that the integral exists as R → ∞.

Now we take the case of x, y → ∞ and consider the original integral.

f dA =

∞

x=0

∞

y=0

−(x

)/2

dx dy



∞

−x



∞

−y



where the last line is from (*). So each of the two integrals must be

π/2, i.e.

∞

−x

dx =

3.3 Generalization to R

We will do exactly the same thing as we just did, but with one more dimension:

Definition

(Volume integral)

Consider a volume

V ⊆ R

with position vector

= (

x, y, z

). We approximate

small disjoint subsets of some simple

shape (e.g. cuboids) labelled by

, volume

δV

, contained within a solid sphere

of diameter `.

Assume that as

` →

0 and

N → ∞

, the union of the small subsets tend to

V . Then

f(r) dV = lim

`→0

f(r

∗

)δV

where r

∗

is any chosen point in each small subset.

To evaluate this, we can take

δV

δxδyδz

, and take

δx →

δy →

0 and

δz in some order. For example,

f(r) dv =

f(x, y, z) dz

dx dy.

So we integrate

(

x, y, z

) over

at each point (

x, y

), then take the integral of

that over the area containing all required (x, y).

Alternatively, we can take the area integral first, and have

f(r) dV =



f(x, y, z) dx dy



dz.

Again, if we take f = 1, then we obtain the volume of V .

Often,

(

) is the density of some quantity, and is usually denoted by

. For

example, we might have mass density, charge density, or probability density.

(

)

δV

is then the amount of quantity in a small volume

δV

. Then

ρ(r) dV is the total amount of quantity in V .

Definition (Volume element). The volume element is dV .

Proposition. dV = dx dy dz.

We can change variables by some smooth, invertible transformation (

x, y, z

)

7→

(u, v, w). Then

Proposition.

f dx dy dz =

f|J| du dv dw,

with

J =

∂(x, y, z)

∂(u, v, w)



∂x

∂u

∂x

∂v

∂x

∂w

∂y

∂u

∂y

∂v

∂y

∂w

∂z

∂u

∂z

∂v

∂z

∂w



Proposition. In cylindrical coordinates,

dV = ρ dρ dϕ dz.

In spherical coordinates

dV = r

sin θ dr dθ dϕ.

Proof. Loads of algebra.

Example.

Suppose

(

) is spherically symmetric and

is a sphere of radius

centered on the origin. Then

f dV =

r=0

θ=0

2π

ϕ=0

f(r)r

sin θ dr dθ dϕ

dθ

2π

dϕ r

f(r) sin θ

f(r)dr

− cos θ

2π

= 4π

f(r)r

dr.

where we separated the integral into three parts as in the area integrals.

Note that in the second line, we rewrote the integrals to write the differentials

next to the integral sign. This is simply a different notation that saves us from

writing r = 0 etc. in the limits of the integrals.

This is a useful general result. We understand it as the sum of spherical

shells of thickness δr and volume 4πr

δr.

If we take

= 1, then we have the familiar result that the volume of a sphere

πa

Example.

Consider a volume within a sphere of radius

with a cylinder of

radius b (b < a) removed. The region is defined as

+ y

+ z

≤ a

+ y

≥ b

We use cylindrical coordinates. The second criteria gives

b ≤ ρ ≤ a.

For the x

+ y

+ z

≤ a

criterion, we have

−

− ρ

≤ z ≤

− ρ

So the volume is

dV =

dρ

2π

dϕ

√

−ρ

−

√

−ρ

dz ρ

= 2π

2ρ

− ρ

dρ

= 2π



− ρ

)

3/2



π(a

− b

)

3/2

Example.

Suppose the density of electric charge is

(

) =

in a hemisphere

H of radius a, with z ≥ 0. What is the total charge of H?

We use spherical polars. So

r ≤ a, 0 ≤ ϕ ≤ 2π, 0 ≤ θ ≤

We have

ρ(r) =

r cos θ.

The total charge Q in H is

ρ dV =

π/2

dθ

2π

dϕ

r cos θr

sin θ

π/2

sin θ cos θ dθ

2π

dϕ







sin



π/2

[ϕ]

2π

πa

3.4 Further generalizations

Integration in R

Similar to the above,

(

, x

, ···x

) d

···

is simply the integra-

tion over an n-dimensional volume. The change of variable formula is

Proposition.

f(x

, x

, ···x

) dx

··· dx

f({x

(u)})|J| du

··· du

Change of variables for n = 1

In the

= 1 case, the Jacobian is

. However, we use the following formula for

change of variables:

f(x) dx =

f(x(u))



du.

We introduce the modulus because of our natural convention about integrating

over

and

. If

= [

a, b

] with

a < b

, we write

. But if

a 7→ α

and

b 7→ β

but

α > β

, we would like to write

instead, so we introduce the modulus in

the 1D case.

To show that the modulus is the right thing to do, we check case by case: If

a < b and α < β, then

is positive, and we have, as expected

f(x) dx =

f(u)

du.

If α > β, then

is negative. So

f(x) dx =

f(u)

du = −

f(u)

du.

By taking the absolute value of

, we ensure that we always have the numerically

smaller bound as the lower bound.

This is not easily generalized to higher dimensions, so we don’t employ the

same trick in other cases.

Vector-valued integrals

We can define

(

) d

in a similar way to

(

) d

as the limit of a sum over

small contributions of volume. In practice, we integrate them componentwise. If

F(r) = F

(r)e

then

F(r) dV =

(r) dV )e

For example, if a mass has density ρ(r), then its mass is

M =

ρ(r) dV

and its center of mass is

R =

rρ(r) dV.

Example.

Consider a solid hemisphere

with

r ≤ a

z ≥

0 with uniform

density ρ. The mass is

M =

ρ dV =

πa

ρ.

Now suppose that

= (

X, Y, Z

). By symmetry, we expect

= 0. We can

find this formally by

X =

xρ dV

π/2

2π

sin θ dϕ dθ dr

dr ×

π/2

sin

θ dθ ×

2π

cos ϕ dϕ

= 0

as expected. Note that it evaluates to 0 because the integral of

cos

from 0 to 2

is 0. Similarly, we obtain Y = 0.

Finally, we find Z.

Z =

π/2

sin θ cos θ dθ

2π

dϕ





sin



π/2

2π

So R = (0, 0, 3a/8).

4 Surfaces and surface integrals

4.1 Surfaces and Normal

So far, we have learnt how to do calculus with regions of the plane or space.

What we would like to do now is to study surfaces in

. The first thing to

figure out is how to specify surfaces. One way to specify a surface is to use an

equation. We let

be a smooth function on

, and

be a constant. Then

(

) =

defines a smooth surface (e.g.

= 1 denotes the unit sphere).

Now consider any curve

(

) on

. Then by the chain rule, if we differentiate

f(r) = c with respect to u, we obtain

[f(r(u))] = ∇f ·

= 0.

This means that

∇f

is always perpendicular to

. Since

is the tangent to

the curve,

∇f

is perpendicular to the tangent. Since this is true for any curve

r(u), ∇f is perpendicular to any tangent of the surface. Therefore

Proposition. ∇f is the normal to the surface f(r) = c.

Example.

(i)

Take the sphere

(

) =

for

c >

0. Then

∇f

= 2(

x, y, z

) =

2r, which is clearly normal to the sphere.

(ii)

Take

(

) =

− z

, which is a hyperboloid. Then

∇f

2(x, y, −z).

In the special case where

= 0, we have a double cone, with a singular apex

0. Here ∇f = 0, and we cannot find a meaningful direction of normal.

Definition

(Boundary)

A surface

can be defined to have a boundary

∂S

consisting of a piecewise smooth curve. If we define

as in the above examples

but with the additional restriction

z ≥

0, then

∂S

is the circle

= 0.

A surface is bounded if it can be contained in a solid sphere, unbounded

otherwise. A bounded surface with no boundary is called closed (e.g. sphere).

Example.

The boundary of a hemisphere is a circle (drawn in red).

Definition

(Orientable surface)

At each point, there is a unit normal

that’s

unique up to a sign.

If we can find a consistent choice of

that varies smoothly across

, then

we say

is orientable, and the choice of sign of

is called the orientation of the

surface.

Most surfaces we encounter are orientable. For example, for a sphere, we can

declare that the normal should always point outwards. A notable example of a

non-orientable surface is the M¨obius strip (or Klein bottle).

For simple cases, we can describe the orientation as “inward” and “outward”.

4.2 Parametrized surfaces and area

However, specifying a surface by an equation

(

) =

is often not too helpful.

What we would like is to put some coordinate system onto the surface, so that

we can label each point by a pair of numbers (

u, v

), just like how we label points

in the x, y-plane by (x, y). We write r(u, v) for the point labelled by (u, v).

Example. Let S be part of a sphere of radius a with 0 ≤ θ ≤ α.

We can then label the points on the spheres by the angles θ, ϕ, with

r(θ, ϕ) = (a cos ϕ sin θ, a sin θ sin ϕ, a cos θ) = ae

We restrict the values of

θ, ϕ

by 0

≤ θ ≤ α

, 0

≤ ϕ ≤

, so that each point is

only covered once.

Note that to specify a surface, in addition to the function

, we also have

to specify what values of (

u, v

) we are allowed to take. This corresponds to a

region

of allowed values of

and

. When we do integrals with these surfaces,

these will become the bounds of integration.

When we have such a parametrization

, we would want to make sure this

indeed gives us a two-dimensional surface. For example, the following two

parametrizations would both be bad:

r(u, v) = u, r(u, v) = u + v.

The idea is that

has to depend on both

and

, and in “different ways”.

More precisely, when we vary the coordinates (

u, v

), the point

will change

accordingly. By the chain rule, this is given by

δr =

∂r

∂u

δu +

∂r

∂v

δv + o(δu, δv).

Then

∂r

δu

and

∂r

∂v

are tangent vectors to curves on

with

and

constant

respectively. What we want is for them to point in different directions.

Definition

(Regular parametrization)

A parametrization is regular if for all

u, v,

∂r

∂u

∂r

∂v

6= 0,

i.e. there are always two independent tangent directions.

The parametrizations we use will all be regular.

Given a surface, how could we, say, find its area? We can use our parametriza-

tion. Suppose points on the surface are given by

(

u, v

) for (

u, v

)

∈ D

. If we

want to find the area of D itself, we would simply integrate

du dv.

However, we are just using

and

as arbitrary labels for points in the surface,

and one unit of area in

does not correspond to one unit of area in

. Instead,

suppose we produce a small rectangle in

by changing

and

by small

δu, δv

, this corresponds to a rectangle with vertices (

u, v

)

(

δu, v

)

(

u, v

δv

)

(

δu, v

δv

), and spans an area

δuδv

. In the surface

, these small

changes

δu, δv

correspond to changes

∂r

∂u

δu

and

∂r

∂v

δv

, and these span a vector

area of

δS =

∂r

∂u

∂r

∂v

δuδv = n δS.

Note that the order of u, v gives the choice of the sign of the unit normal.

The actual area is then given by

δS =



∂r

∂u

∂r

∂v



δu δv.

Making these into differentials instead of deltas, we have

Proposition. The vector area element is

dS =

∂r

∂u

∂r

∂v

du dv.

The scalar area element is

dS =



∂r

∂u

∂r

∂v



du dv.

By summing and taking limits, the area of S is

dS =



∂r

∂u

∂r

∂v



du dv.

Example. Consider again the part of the sphere of radius a with 0 ≤ θ ≤ α.

Then we have

r(θ, ϕ) = (a cos ϕ sin θ, a sin θ sin ϕ, a cos θ) = ae

So we find

∂r

∂θ

= ae

Similarly, we have

∂r

∂ϕ

= a sin θe

Then

∂r

∂θ

∂r

∂ϕ

= a

sin θ e

dS = a

sin θ dθ dϕ.

Our bounds are 0 ≤ θ ≤ α, 0 ≤ ϕ ≤ 2π.

Then the area is

2π

sin θ dθ dϕ = 2πa

(1 − cos α).

4.3 Surface integral of vector fields

Just computing the area of a surface would be boring. Suppose we have a surface

parametrized by

(

u, v

), where (

u, v

) takes values in

. We would like to ask

how much “stuff” is passing through

, where the flow of stuff is given by a

vector field F(r).

We might attempt to use the integral

|F| dS.

However, this doesn’t work. For example, if all the flow is tangential to the

surface, then nothing is really passing through the surface, but

|F|

is non-zero,

so we get a non-zero integral. Instead, what we should do is to consider the

component of F that is normal to the surface S, i.e. parallel to its normal.

Definition

(Surface integral)

The surface integral or

flux

of a vector field

(

)

over S is defined by

F(r) · dS =

F(r) · n dS =

F(r(u, v)) ·



∂r

∂u

∂r

∂v



du dv.

Intuitively, this is the total amount of

passing through

. For example, if

is the electric field, the flux is the amount of electric field passing through a

surface.

For a given orientation, the integral

F·

is independent of the parametriza-

tion. Changing orientation is equivalent to changing the sign of

, which is in

turn equivalent to changing the order of

and

in the definition of

, which is

also equivalent to changing the sign of the flux integral.

Example. Consider a sphere of radius a, r(θ, ϕ). Then

∂r

∂θ

= ae

∂r

∂ϕ

= a sin θe

The vector area element is

dS = a

sin θe

dθ dϕ,

taking the outward normal n = e

= r/a.

Suppose we want to calculate the fluid flux through the surface. The velocity

field

(

) of a fluid gives the motion of a small volume of fluid

. Assume that

depends smoothly on

(and

). For any small area

δS

, on a surface

, the

volume of fluid crossing it in time δt is u · δS δt.

δS

u δt

So the amount of flow of u over at time δt through S is

δt

u · dS.

u · dS is the rate of volume crossing S.

For example, let

= (

−x,

, z

) and

be the section of a sphere of radius

with 0 ≤ ϕ ≤ 2π and 0 ≤ θ ≤ α. Then

dS = a

sin θn dϕ dθ,

with

n =

(x, y, z).

n · u =

(−x

+ z

) = a(−sin

θ cos

ϕ + cos

θ).

Therefore

u · dS =

2π

sin θ[(cos

θ − 1) cos

ϕ + cos

θ] dϕ dθ

sin θ[π(cos

θ − 1) + 2π cos

θ] dθ

π(3 cos

θ − 1) sin θ dθ

= πa

[cosθ − cos

θ]

= πa

cos α sin

α.

What happens when we change parametrization? Let r(u, v) and r(˜u, ˜v) be

two regular parametrizations for the surface. By the chain rule,

∂r

∂u

∂r

∂˜u

∂u

∂r

∂˜v

∂u

∂r

∂v

∂r

∂˜u

∂v

∂r

∂˜v

∂v

∂r

∂u

∂r

∂v

∂(˜u, ˜v)

∂(u, v)

∂r

∂˜u

∂r

∂˜v

where

∂(˜u,˜v)

∂(u,v)

is the Jacobian.

Since

d˜u d˜v =

∂(˜u, ˜v)

∂(u, v)

du dv,

We recover the formula

dS =



∂r

∂u

∂r

∂v



du dv =



∂r

∂˜u

∂r

∂˜v



d˜u d˜v.

Similarly, we have

dS =

∂r

∂u

∂r

∂v

du dv =

∂r

∂˜u

∂r

∂˜v

d˜u d˜v.

provided (u, v) and (˜u, ˜v) have the same orientation.

4.4 Change of variables in R

and R

revisited

In this section, we derive our change of variable formulae in a slightly different

way.

Change of variable formula in R

We first derive the 2D change of variable formula from the 3D surface integral

formula.

Consider a subset

of the plane

parametrized by

(

u, v

)

, y

(

u, v

)). We

can embed it to R

as r(x(u, v), y(u, v), 0). Then

∂r

∂u

∂r

∂v

= (0, 0, J),

with J being the Jacobian. Therefore

f(r) dS =

f(r(u, v))



∂r

∂u

∂r

∂v



du dv =

f(r(u, v))|J| du dv,

and we recover the formula for changing variables in R

Change of variable formula in R

In R

, suppose we have a volume parametrised by r(u, v, w). Then

δr =

∂r

∂u

δu +

∂r

∂v

δv +

∂r

∂w

δw + o(δu, δv, δw).

Then the cuboid

δu, δv, δw

u, v, w

space is mapped to a parallelopiped of

volume

δV =



∂r

∂u

δu ·



∂r

∂v

δv ×

∂r

∂w

δw





= |J| δu δv δw.

So dV = |J| du dv dw.

5 Geometry of curves and surfaces

Let

(

) be a curve parametrized by arclength

. Since

(

) =

is a unit vector,

t · t

= 1. Differentiating yields

t · t

= 0. So

is a normal to the curve if

= 0.

We define the following:

Definition

(Principal normal and curvature)

Write

κn

, where

is a unit

vector and

κ >

0. Then

(

) is called the principal normal and

(

) is called

the curvature.

Note that we must be differentiating against

, not any other parametrization!

If the curve is given in another parametrization, we can either change the

parametrization or use the chain rule.

We take a curve that can Taylor expanded around s = 0. Then

r(s) = r(0) + sr

(0) +

(0) + O(s

We know that r

= t and r

= t

. So we have

r(s) = r(0) + st(0) +

κ(0)s

n + O(s

How can we interpret

as the curvature? Suppose we want to approximate the

curve near

(0) by a circle. We would expect a more “curved” curve would be

approximated by a circle of smaller radius. So

should be inversely proportional

to the radius of the circle. In fact, we will show that

= 1

, where

is the

radius of the best-fit circle.

Consider the vector equation for a circle passing through

(0) with radius

in the plane defined by t and n.

r(0)

Then the equation of the circle is

r = r(0) + a(1 − cos θ)n + a sin θt.

We can expand this to obtain

r = r(0) + aθt +

an + o(θ

Since the arclength s = aθ, we obtain

r = r(0) + st +

n + O(s

As promised, κ = 1/a, for a the radius of the circle of best fit.

Definition

(Radius of curvature)

The radius of curvature of a curve at a point

r(s) is 1/κ(s).

Since we are in 3D, given

(

) and

(

), there is another normal to the curve.

We can add a third normal to generate an orthonormal basis.

Definition (Binormal). The binormal of a curve is b = t × n.

We can define the torsion similar to the curvature, but with the binormal

instead of the tangent.

Definition (Torsion). Let b

= −τn. Then τ is the torsion.

Note that this makes sense, since

is both perpendicular to

and

, and

hence must be in the same direction as n. (b

= t

× n + t × n

= t × n

, so b

is perpendicular to t; and b ·b = 1 ⇒ b · b

= 0. So b

is perpendicular to b).

The geometry of the curve is encoded in how this basis (

t, n, b

) changes along

it. This can be specified by two scalar functions of arc length — the curvature

(

) and the torsion

(

) (which determines what the curve looks like to third

order in its Taylor expansions and how the curve lifts out of the t, r plane).

Surfaces and intrinsic geometry*

We can study the geometry of surfaces through curves which lie on them. At a

given point

at a surface

with normal

, consider a plane containing

. The

intersection of the plane with the surface yields a curve on the surface through

P . This curve has a curvature κ at P .

If we choose different planes containing

, we end up with different curves of

different curvature. Then we define the following:

Definition

(Principal curvature)

The principal curvatures of a surface at

are

the minimum and maximum possible curvature of a curve through

, denoted

min

and κ

max

respectively.

Definition

(Gaussian curvature)

The Gaussian curvature of a surface at a

point P is K = κ

min

max

Theorem

(Theorema Egregium)

. K

is intrinsic to the surface

. It can be

expressed in terms of lengths, angles etc. which are measured entirely on the

surface. So

can be defined on an arbitrary surface without embedding it on a

higher dimension surface.

The is the start of intrinsic geometry: if we embed a surface in Euclidean

space, we can determine lengths, angles etc on it. But we don’t have to do so —

we can “live in ” the surface and do geometry in it without an embedding.

For example, we can consider a geodesic triangle

on a surface

. It consists

of three geodesics: shortest curves between two points.

Let

be the interior angles of the triangle (defined by using scalar products

of tangent vectors). Then

Theorem (Gauss-Bonnet theorem).

+ θ

= π +

K dA,

integrating over the area of the triangle.

This was not taught in lectures, but there is a question on the example sheet about the

torsion, so I might as well include it here.

6 Div, Grad, Curl and ∇

6.1 Div, Grad, Curl and ∇

Recalled that

∇f

is given by (

∇f

)

∂f

∂x

. We can regard this as obtained from

the scalar field f by applying

∇ = e

∂

∂x

for cartesian coordinates

and orthonormal basis

, where

are orthonormal

and right-handed, i.e. e

× e

= ε

ijk

(it is left handed if e

× e

= −ε

ijk

We can alternatively write this as

∇ =



∂

∂x

∂

∂y

∂

∂z



∇ (nabla or del) is both an operator and a vector. We can apply it to a vector

field F(r) = F

(r)e

using the scalar or vector product.

Definition (Divergence). The divergence or div of F is

∇ · F =

∂F

∂x

∂F

∂x

∂F

∂x

∂F

∂x

Definition (Curl). The curl of F is

∇ × F = ε

ijk

∂F

∂x



∂

∂x

∂

∂y

∂

∂z



Example. Let F = (xe

, y

sin x, xyz). Then

∇ · F =

∂

∂x

∂

∂y

sin x +

∂

∂z

xyz = e

+ 2y sin x + xy.

and

∇ × F =



∂

∂y

(xyz) −

∂

∂z

sin x)





∂

∂z

(xe

) +

∂

∂x

(xyz)





∂

∂x

sin x) −

∂

∂y

(xe

)



= (xz, xe

− yz, y

cos x).

Note that ∇ is an operator, so ordering is important. For example,

F · ∇ = F

∂

∂x

is a scalar differential operator, and

F × ∇ = e

ijk

∂

∂x

is a vector differential operator.

Proposition.

Let

f, g

be scalar functions,

F, G

be vector functions, and

µ, λ

be constants. Then

∇(λf + µg) = λ∇f + µ∇g

∇ · (λF + µG) = λ∇ · F + µ∇ · G

∇ × (λF + µG) = λ∇ × F + µ∇ × G.

Note that Grad and Div can be analogously defined in any dimension

, but

curl is specific to n = 3 because it uses the vector product.

Example.

Consider

with

|r|

. We know that

. So

Therefore

∂r

∂x

= 2x

∂r

∂x

∇r

= e

∂

∂x

) = e

αr

α−1

∂r

∂x

= αr

α−2

Also,

∇ · r =

∂x

= 3.

and

∇ × r = e

ijk

∂x

= 0.

Proposition. We have the following Leibnitz properties:

∇(fg) = (∇f)g + f(∇g)

∇ · (fF) = (∇f ) · F + f (∇ · F)

∇ × (fF) = (∇f ) × F + f (∇ × F)

∇(F · G) = F × (∇ × G) + G × (∇ × F) + (F · ∇)G + (G · ∇)F

∇ × (F × G) = F(∇ · G) − G(∇ · F) + (G · ∇)F − (F · ∇)G

∇ · (F × G) = (∇ × F) · G − F · (∇ × G)

which can be proven by brute-forcing with suffix notation and summation

convention.

There is absolutely no point in memorizing these (at least the last three).

They can be derived when needed via suffix notation.

Example.

∇ · (r

r) = (∇r

)r + r

∇ · r

= (αr

α−2

r) · r + r

(3)

= (α + 3)r

∇ × (r

r) = (∇(r

)) × r + r

(∇ × r)

= αr

α−2

r × r

= 0

6.2 Second-order derivatives

We have

Proposition.

∇ × (∇f) = 0

∇ · (∇ × F) = 0

Proof. Expand out using suffix notation, noting that

ijk

∂

∂x

= 0.

since if, say, k = 3, then

ijk

∂

∂x

∂

∂x

−

∂

∂x

= 0.

The converse of each result holds for fields defined in all of R

Proposition. If F is defined in all of R

, then

∇ × F = 0 ⇒ F = ∇f

for some f .

Definition

(Conservative/irrotational field and scalar potential)

∇f

then f is the scalar potential. We say F is conservative or irrotational.

Similarly,

Proposition.

is defined over all of

and

∇ · H

= 0, then

∇ × A

for some A.

Definition

(Solenoidal field and vector potential)

∇ × A

is the

vector potential and H is said to be solenoidal.

Not that is is true only if F or H is defined on all of R

Definition (Laplacian operator). The Laplacian operator is defined by

∇

= ∇ · ∇ =

∂

∂x



∂

∂x

∂

∂x

∂

∂x



This operation is defined on both scalar and vector fields — on a scalar field,

∇

f = ∇ · (∇f),

whereas on a vector field,

∇

A = ∇(∇ · A) − ∇ × (∇ × A).

7 Integral theorems

7.1 Statement and examples

There are three big integral theorems, known as Green’s theorem, Stoke’s theorem

and Gauss’ theorem. There are all generalizations of the fundamental theorem of

calculus in some sense. In particular, they all say that an

dimensional integral

of a derivative is equivalent to an

n −

1 dimensional integral of the original

function.

We will first state all three theorems with some simple applications. In the

next section, we will see that the three integral theorems are so closely related

that it’s easiest to show their equivalence first, and then prove just one of them.

7.1.1 Green’s theorem (in the plane)

Theorem

(Green’s theorem)

For smooth functions

(

x, y

(

x, y

) and

bounded region in the (x, y) plane with boundary ∂A = C,



∂Q

∂x

−

∂P

∂y



dA =

(P dx + Q dy).

Here

is assumed to be piecewise smooth, non-intersecting closed curve, tra-

versed anti-clockwise.

Example.

Let

and

. If

is the parabola

= 4

and the

line x = a, both with −2a ≤ y ≤ 2a, then Green’s theorem says

− x

) dA =

dx + xy

dy.

From example sheet 1, each side gives

104

105

Example. Let A be a rectangle confined by 0 ≤ x ≤ a and 0 ≤ y ≤ b.

Then Green’s theorem follows directly from the fundamental theorem of calculus

in 1D. We first consider the first term of Green’s theorem:

−

∂P

∂y

dA =

−

∂P

∂y

dy dx

[−P (x, b) + P (x, 0)] dx

P dx

Note that we can convert the 1D integral in the second-to-last line to a line integral

around the curve

, since the

(

0) and

(

x, b

) terms give the horizontal part

, and the lack of d

term means that the integral is nil when integrating the

vertical parts.

Similarly,

∂Q

∂x

dA =

Q dy.

Combining them gives Green’s theorem.

Green’s theorem also holds for a bounded region

, where the boundary

∂A

consists of disconnected components (each piecewise smooth, non-intersecting

and closed) with anti-clockwise orientation on the exterior, and clockwise on the

interior boundary, e.g.

The orientation of the curve comes from imagining the surface as:

and take the limit as the gap shrinks to 0.

7.1.2 Stokes’ theorem

Theorem (Stokes’ theorem). For a smooth vector field F(r),

∇ × F · dS =

∂S

F · dr,

where

is a smooth, bounded surface and

∂S

is a piecewise smooth boundary

of S.

The direction of the line integral is as follows: If we walk along

with

facing up, then the surface is on your left.

It also holds if

∂S

is a collection of disconnected piecewise smooth closed

curves, with the orientation determined in the same way as Green’s theorem.

Example.

Let

be the section of a sphere of radius

with 0

≤ θ ≤ α

. In

spherical coordinates,

dS = a

sin θe

dθ dϕ.

Let F = (0, xz, 0). Then ∇ × F = (−x, 0, z). We have previously shown that

∇ × F · dS = πa

cos α sin

α.

Our boundary ∂C is

r(ϕ) = a(sin α cos ϕ, sin α sin ϕ, cos α).

The right hand side of Stokes’ is

F · dr =

2π

a sin α cos ϕ

| {z }

a cos α

| {z }

a sin α cos ϕ dϕ

| {z }

= a

sin

α cos α

2π

cos

ϕ dϕ

= πa

sin

α cos α.

So they agree.

7.1.3 Divergence/Gauss theorem

Theorem (Divergence/Gauss theorem). For a smooth vector field F(r),

∇ · F dV =

∂V

F · dS,

where

is a bounded volume with boundary

∂V

, a piecewise smooth, closed

surface, with outward normal n.

Example. Consider a hemisphere.

V is a solid hemisphere

+ y

+ z

≤ a

, z ≥ 0,

and ∂V = S

+ S

, the hemisphere and the disc at the bottom.

Take F = (0, 0, z + a) and ∇ ·F = 1. Then

∇ · F dV =

πa

the volume of the hemisphere.

On S

dS = n dS =

(x, y, z) dS.

Then

F · dS =

z(z + a) dS = cos θa(cos θ + 1) a

sin θ dθ dϕ

| {z }

Then

F · dS = a

2π

dϕ

π/2

sin θ(cos

θ + cos θ) dθ

= 2πa



−1

cos

θ −

cos



π/2

πa

On S

, dS = n dS = −(0, 0, 1) dS. Then F · dS = −a dS. So

F · dS = −πa

F · dS +

F · dS =



− 1



πa

in accordance with Gauss’ theorem.

7.2 Relating and proving integral theorems

We will first show the following two equivalences:

– Stokes’ theorem ⇔ Green’s theorem

– 2D divergence theorem ⇔ Greens’ theorem

Then we prove the 2D version of divergence theorem directly to show that all of

the above hold. A sketch of the proof of the 3D version of divergence theorem

will be provided, because it is simply a generalization of the 2D version, except

that the extra dimension makes the notation tedious and difficult to follow.

Proposition. Stokes’ theorem ⇒ Green’s theorem

Proof.

Stokes’ theorem talks about 3D surfaces and Green’s theorem is about

2D regions. So given a region

on the (

x, y

) plane, we pretend that there is a

third dimension and apply Stokes’ theorem to derive Green’s theorem.

Let

be a region in the (

x, y

) plane with boundary

∂A

, parametrised

by arc length, (x(s), y(s), 0). Then the tangent to C is

t =



, 0



Given any P (x, y) and Q(x, y), we can consider the vector field

F = (P, Q, 0),

∇ × F =



0, 0,

∂Q

∂x

−

∂P

∂y



Then the left hand side of Stokes is

F · dr =

F · t ds =

P dx + Q dy,

and the right hand side is

(∇ × F) ·

k dA =



∂Q

∂x

−

∂P

∂y



dA.

Proposition. Green’s theorem ⇒ Stokes’ theorem.

Proof.

Green’s theorem describes a 2D region, while Stokes’ theorem describes

a 3D surface

(

u, v

). Hence to use Green’s to derive Stokes’ we need find some

2D thing to act on. The natural choice is the parameter space, u, v.

Consider a parametrised surface

(

u, v

) corresponding to the region

the

u, v

plane. Write the boundary as

∂A

= (

(

)

, v

(

)). Then

∂S

(

)

, v

(

)).

We want to prove

∂S

F · dr =

(∇ × F) · dS

given

∂A

du + F

dv =



∂F

∂u

−

∂F

∂v



dA.

Doing some pattern-matching, we want

F · dr = F

du + F

for some F

and F

By the chain rule, we know that

dr =

∂r

∂u

du +

∂r

∂v

dv.

So we choose

= F ·

∂r

∂u

, F

= F ·

∂r

∂v

This choice matches the left hand sides of the two equations.

To match the right, recall that

(∇ × F) · dS = (∇ × F) ·



∂r

∂u

∂r

∂v



du dv.

Therefore, for the right hand sides to match, we want

∂F

∂u

−

∂F

∂v

= (∇ × F) ·



∂r

∂u

∂r

∂v



. (∗)

Fortunately, this is true. Unfortunately, the proof involves complicated suffix

notation and summation convention:

∂F

∂u

∂

∂u



F ·

∂r

∂v



∂

∂u



∂x

∂v





∂F

∂x

∂u



∂x

∂v

+ F

∂x

∂u∂v

Similarly,

∂F

∂u

∂

∂u



F ·

∂r

∂u



∂

∂u



∂x

∂u





∂F

∂x

∂v



∂x

∂u

+ F

∂x

∂u∂v

∂F

∂u

−

∂F

∂v

∂x

∂u

∂x

∂v



∂F

∂x

−

∂F

∂x



This is the left hand side of (∗).

The right hand side of (∗) is

(∇ × F) ·



∂r

∂u

∂r

∂v



= ε

ijk

∂F

∂x

kpq

∂x

∂u

∂x

∂v

= (δ

− δ

)

∂F

∂x

∂u

∂x

∂v



∂F

∂x

−

∂F

∂x



∂x

∂u

∂x

∂v

So they match. Therefore, given our choice of

and

, Green’s theorem

translates to Stokes’ theorem.

Proposition. Greens theorem ⇔ 2D divergence theorem.

Proof. The 2D divergence theorem states that

(∇ · G) dA =

∂A

G · n ds.

with an outward normal n.

Write G as (Q, −P ). Then

∇ · G =

∂Q

∂x

−

∂P

∂y

Around the curve

(

) = (

(

)

, y

(

)),

(

) = (

(

)

, y

(

)). Then the normal,

being tangent to

, is

(

) = (

(

)

, −x

(

)) (check that it points outwards!). So

G · n = P

+ Q

Then we can expand out the integrals to obtain

G · n ds =

P dx + Q dy,

and

(∇ · G) dA =



∂Q

∂x

−

∂P

∂y



dA.

Now 2D version of Gauss’ theorem says the two LHS are the equal, and Green’s

theorem says the two RHS are equal. So the result follows.

Proposition. 2D divergence theorem.

(∇ · G) dA =

C=∂A

G · n ds.

Proof.

For the sake of simplicity, we assume that

only has a vertical component,

noting that the same proof works for purely horizontal

, and an arbitrary

just a linear combination of the two.

Furthermore, we assume that

is a simple, convex shape. A more complicated

shape can be cut into smaller simple regions, and we can apply the simple case

to each of the small regions.

Suppose G = G(x, y)

j. Then

∇ · G =

∂G

∂y

Then

∇ · G dA =



∂G

∂y



dx.

Now we divide

into an upper and lower part, with boundaries

(

)

and

−

(

) respectively. Since I cannot draw,

will be pictured as a circle,

but the proof is valid for any simple convex shape.

−

We see that the boundary of

at any specific

is given by

−

(

) and

(

Hence by the Fundamental theorem of Calculus,

∂G

∂y

dy =

(x)

−

(x)

∂G

∂y

dy = G(x, y

(x)) − G(x, y

−

(x)).

To compute the full area integral, we want to integrate over all

. However, the

divergence theorem talks in terms of d

, not d

. So we need to find some way

to relate d

and d

. If we move a distance

δs

, the change in

δs cos θ

, where

is the angle between the tangent and the horizontal. But

is also the angle

between the normal and the vertical. So cos θ = n ·

j. Therefore dx =

j · n ds.

In particular, G dx = G

j · n ds = G · n ds, since G = G

However, at

−

points downwards, so

n ·

happens to be negative. So,

actually, at C

−

, dx = −G · n ds.

Therefore, our full integral is

∇ · G dA =



∂G

∂y



G(x, y

(x)) − G(x, y

−

(x)) dx

G · n ds +

−

G · n ds

G · n ds.

To prove the 3D version, we again consider

(

x, y, z

)

, a purely vertical

vector field. Then

∇ · F dV =

∂F

∂z

dA.

Again, split

∂V

into the top and bottom parts

and

−

(ie the parts

with

k · n ≥

0 and

k · n <

0), and parametrize by

(

x, y

) and

−

(

x, y

). Then

the integral becomes

∇ · F dV =

(F (x, y, z

) − F (x, y, z

−

)) dA =

F · n dS.

8 Some applications of integral theorems

8.1 Integral expressions for div and curl

We can use these theorems to come up with alternative definitions of the div and

curl. The advantage of these alternative definitions is that they do not require a

choice of coordinate axes. They also better describe how we should interpret div

and curl.

Gauss’ theorem for F in a small volume V containing r

gives

∂V

F · dS =

∇ · F dV ≈ (∇· F)(r

) vol(V ).

We take the limit as V → 0 to obtain

Proposition.

(∇ · F)(r

) = lim

diam(V )→1

vol(V )

∂V

F · dS,

where the limit is taken over volumes containing the point r

Similarly, Stokes’ theorem gives, for A a surface containing the point r

∂A

F · dr =

(∇ × F) · n dA ≈ n · (∇ × F)(r

) area(A).

Proposition.

n · (∇ × F)(r

) = lim

diam(A)→0

area(A)

∂A

F · dr,

where the limit is taken over all surfaces A containing r

with normal n.

These are coordinate-independent definitions of div and curl.

Example. Suppose u is a velocity field of fluid flow. Then

u · dS

is the rate of which fluid crosses

. Taking

to be the volume occupied by a

fixed quantity of fluid material, we have

V =

∂V

u · dS

Then, at r

∇ · u = lim

V →0

the relative rate of change of volume. For example, if

(

) =

αr

(ie fluid flowing

out of origin), then ∇ · u = 3α, which increases at a constant rate everywhere.

Alternatively, take a planar area A to be a disc of radius a. Then

∂A

u · dr =

∂A

u · t ds = 2πa × average of u · t around the circumference.

(

u · t

is the component of

which is tangential to the boundary) We define the

quantity

ω =

× (average of u · t).

This is the local angular velocity of the current. As

a →

→ ∞

, but the

average of

u · t

will also decrease since a smooth field is less “twirly” if you look

closer. So ω tends to some finite value as a → 0. We have

∂A

u · dr = 2πa

ω.

Recall that

n · ∇ × u = lim

A→0

πa

∂A

u · dr = 2ω,

ie twice the local angular velocity. For example, if you have a washing machine

rotating at a rate of ω , Then the velocity u = ω × r. Then the curl is

∇ × (ω ×r) = 2ω,

which is twice the angular velocity.

8.2 Conservative fields and scalar products

Definition (Conservative field). A vector field F is conservative if

(i) F = ∇f for some scalar field f; or

(ii)

F · dr is independent of C, for fixed end points and orientation; or

(iii) ∇ × F = 0.

In R

, all three formulations are equivalent.

We have previously shown (i) ⇒ (ii) since

F · dr = f(b) − f(a).

We have also shown that (i) ⇒ (iii) since

∇ × (∇f) = 0.

So we want to show that (iii) ⇒ (ii) and (ii) ⇒ (i)

Proposition. If (iii) ∇ × F = 0, then (ii)

F · dr is independent of C.

Proof.

Given

(

) satisfying

∇ × F

= 0, let

and

be any two curves from

to b.

If S is any surface with boundary ∂S = C −

C, By Stokes’ theorem,

∇ × F · dS =

∂S

F · dr =

F · dr −

F · dr.

But ∇ × F = 0. So

F · dr −

F · dr = 0,

F · dr =

F · dr.

Proposition.

If (ii)

F ·

is independent of

for fixed end points and

orientation, then (i) F = ∇f for some scalar field f.

Proof.

We fix

and define

(

) =

(

)

for any curve from

Assuming (ii),

is well-defined. For small changes

δr

, there is a small

extension of C by δC. Then

f(r + δr) =

C+δC

F(r

) · dr

F · dr

δC

F · dr

= f(r) + F(r) · δr + o(δr).

δf = f(r + δr) − f(r) = F(r) · δr + o(δr).

But the definition of grad is exactly

δf = ∇f · δr + o(δr).

So we have F = ∇f.

Note that these results assume

is defined on the whole of

. It also

works of

is defined on a simply connected domain

, ie a subspace of

without holes. By definition, this means that any two curves

with fixed

end points can be smoothly deformed into one another (alternatively, any loop

can be shrunk into a point).

If we have a smooth transformation from

, the process sweeps out a

surface bounded by C and

C. This is required by the proof that (iii) ⇒ (ii).

is not simply connected, then we obtain a multi-valued

(

) on

general (for the proof (ii)

⇒

(i)). However, we can choose to restrict to a subset

⊆ D such that f (r) is single-valued on D

Example. Take

F =



−y

+ y

, 0



This obeys

∇ × F

= 0, and is defined on

\ {z-axis}

, which is not

simply-connected. We can also write

F = ∇f,

where

f = tan

−1

which is multi-valued. If we integrate it about the closed loop

= 1

, z

= 0,

i.e. a circle about the

axis, the integral gives 2

, as opposed to the expected 0

for a conservative force. This shows that the simply-connected-domain criterion

is important!

However f can be single-valued if we restrict it to

= R

− {half-plane x ≥ 0, y = 0},

which is simply-connected. (Draw and check!) Any closed curve we can draw in

this area will have an integral of 0 (the circle mentioned above will no longer be

closed!).

8.3 Conservation laws

Definition

(Conservation equation)

Suppose we are interested in a quantity

. Let

(

r, t

) be the amount of stuff per unit volume and

(

r, t

) be the flow rate

of the quantity (eg if Q is charge, j is the current density).

The conservation equation is

∂ρ

∂t

+ ∇ · j = 0.

This is stronger than the claim that the total amount of

in the universe is

fixed. It says that

cannot just disappear here and appear elsewhere. It must

continuously flow out.

In particular, let

be a fixed time-independent volume with boundary

S = ∂V . Then

Q(t) =

ρ(r, t) dV

Then the rate of change of amount of Q in V is

∂ρ

∂t

dV = −

∇ · j dV = −

j · ds.

by divergence theorem. So this states that the rate of change of the quantity

is the flux of the stuff flowing out of the surface. ie

cannot just disappear

but must smoothly flow out.

In particular, if

is the whole universe (ie

), and

j →

0 sufficiently rapidly

|r| → ∞

, then we calculate the total amount of

in the universe by taking

to be a solid sphere of radius

, and take the limit as

R → ∞

. Then the surface

integral → 0, and the equation states that

= 0,

Example.

(

r, t

) is the charge density (i.e.

ρδV

is the amount of charge in

a small volume

δV

), then

(

) is the total charge in

(

r, t

) is the electric

current density. So j · dS is the charge flowing through δS per unit time.

Example.

Let

ρu

with

being the velocity field. Then (

ρu δt

)

·δS

is equal

to the mass of fluid crossing δS in time δt. So

= −

j · dS

does indeed imply the conservation of mass. The conservation equation in this

case is

∂ρ

∂t

+ ∇ · (ρu) = 0

For the case where

is constant and uniform (i.e. independent of

and

), we

get that ∇ · u = 0. We say that the fluid is incompressible.

9 Orthogonal curvilinear coordinates

9.1 Line, area and volume elements

In this chapter, we study funny coordinate systems. A coordinate system is,

roughly speaking, a way to specify a point in space by a set of (usually 3)

numbers. We can think of this as a function r(u, v, w).

By the chain rule, we have

dr =

∂r

∂u

du +

∂r

∂v

dv +

∂r

∂w

For a good parametrization,

∂r

∂u



∂r

∂v

∂r

∂w



6= 0,

i.e.

∂r

∂u

∂r

∂v

and

∂r

∂w

are linearly independent. These vectors are tangent to the

curves parametrized by u, v, w respectively when the other two are being fixed.

Even better, they should be orthogonal:

Definition

(Orthogonal curvilinear coordinates)

. u, v, w

are orthogonal curvi-

linear if the tangent vectors are orthogonal.

We can then set

∂r

∂u

= h

∂r

∂v

= h

∂r

∂w

= h

with

, h

0 and

, e

form an orthonormal right-handed basis (i.e.

× e

= e

). Then

dr = h

du + h

dv + h

dw,

and

, h

determine the changes in length along each orthogonal direction

resulting from changes in u, v, w. Note that clearly by definition, we have



∂r

∂u



Example.

(i)

In cartesian coordinates,

(

x, y, z

) =

. Then

= 1,

and e

i, e

j and e

(ii)

In cylindrical polars,

(

ρ, ϕ, z

) =

[

cos ϕ

sin ϕ

] +

. Then

= 1,

and



∂r

∂ϕ



= |(−ρ sin ϕ, ρ sin ϕ, 0)| = ρ.

The basis vectors e

, e

are as in section 1.

(iii) In spherical polars,

r(r, θ, ϕ) = r(cos ϕ sin θ

i + sin θ sin ϕ

j + cos θ

k).

Then h

= 1, h

= r and h

= r sin θ.

Consider a surface with

constant and parametrised by

and

. The vector

area element is

dS =

∂r

∂u

∂r

∂v

du dv = h

× h

du dv = h

du dv.

We interpret this as

δS

having a small rectangle with sides approximately

δu

and h

δv. The volume element is

dV =

∂r

∂u



∂r

∂v

∂r

∂w



du dv dw = h

du dv dw,

i.e. a small cuboid with sides h

, h

and h

respectively.

9.2 Grad, Div and Curl

Consider f (r(u, v, w)) and compare

df =

∂f

∂u

du +

∂f

∂v

dv +

∂f

∂w

dw,

with df = (∇f ) · dr. Since we know that

dr =

∂r

∂u

du +

∂r

∂v

dv +

∂r

∂w

dw = h

du + h

dv + h

dv,

we can compare the terms to know that

Proposition.

∇f =

∂f

∂u

∂f

∂v

∂f

∂w

Example. Take f = r sin θ cos ϕ in spherical polars. Then

∇f = sin θ cos ϕ e

(r cos θ cos ϕ) e

r sin θ

(−r sin θ sin ϕ) e

= cos ϕ(sin θ e

+ cos θ e

) − sin ϕ e

Then we know that the differential operator is

Proposition.

∇ =

∂

∂u

∂

∂v

∂

∂w

We can apply this to a vector field

F = F

+ F

using scalar or vector products to obtain

Proposition.

∇ × F =



∂

∂v

) −

∂

∂w

)



+ two similar terms



∂

∂u

∂

∂v

∂

∂w



and

∇ · F =



∂

∂u

) + two similar terms



There are several ways to obtain these formulae. We can

Proof. (non-examinable)

(i) Apply ∇· or ∇× and differentiate the basis vectors explicitly.

(ii)

First, apply

∇·

∇×

, but calculate the result by writing

in terms of

∇u, ∇v

and

∇w

in a suitable way. Then use

∇×∇f

= 0 and

∇·

(

∇×f

) = 0.

(iii) Use the integral expressions for div and curl.

Recall that

n · ∇ × F = lim

A→0

∂A

F · dr.

So to calculate the curl, we first find the e

component.

Consider an area with

fixed and change

δu

and

δv

. Then

this has an area of h

δuδv with normal e

. Let C be its boundary.

δu

δv

We then integrate around the curve

. We split the curve

up into 4 parts

(corresponding to the four sides), and take linear approximations by assum-

ing

and

are constant when moving through each horizontal/vertical

segment.

F · dr ≈ F

(u, v)h

(u, v) δu + F

(u + δu, v)h

(u + δu, v) δu

− F

(u, v + δv)h

(u, v + δv) δu − F

(u, v)h

(u, v) δv

≈



∂

∂u

−

∂

∂v

)



δuδv.

Divide by the area and take the limit as area → 0, we obtain

lim

A→0

F · dr =



∂

∂u

−

∂

∂v

)



So, by the integral definition of divergence,

· ∇ × F =



∂

∂u

) −

∂

∂v

)



and similarly for other components.

We can find the divergence similarly.

Example. Let A =

tan

in spherical polars. Then

∇ × A =

sin θ



r sin θe

∂

∂r

∂

∂θ

∂

∂ϕ

0 0 r sin θ ·

tan



sin θ

∂

∂θ



sin θ tan



10 Gauss’ Law and Poisson’s equation

10.1 Laws of gravitation

Consider a distribution of mass producing a gravitational force

on a point

mass

. The total force is a sum of contributions from each part of the

mass distribution, and is proportional to m. Write

F = mg(r),

Definition

(Gravitational field)

. g

(

) is the gravitational field, acceleration due

to gravity, or force per unit mass.

The gravitational field is conservative, ie

g · dr = 0.

This means that if you walk around the place and return to the same position,

the total work done is 0 and you did not gain energy, i.e. gravitational potential

energy is conserved.

Gauss’ law tells us what this gravitational field looks like:

Law

(Gauss’ law for gravitation)

Given any volume

bounded by closed

surface S,

g · dS = −4πGM,

where

is Newton’s gravitational constant, and

is the total mass contained

in V .

These equations determine g(r) from a mass distribution.

Example.

We can obtain Newton’s law of gravitation from Gauss’ law together

with an assumption about symmetry.

Consider a total mass

distributed with a spherical symmetry about the

origin

, with all the mass contained within some radius

. By spherical

symmetry, we have g(r) = g(r)

Consider Gauss’ law with

being a sphere of radius

R > a

. Then

g · dS =

g(R)

r ·

r dS =

g(R)dS = 4πR

g(R).

By Gauss’ law, we obtain

4πR

g(R) = −4πGM.

g(R) = −

for R > a.

Therefore the gravitational force on a mass m at r is

F(r) = −

GMm

If we take the limit as

a →

0, we get a point mass

at the origin. Then we

recover Newton’s law of gravitation for point masses.

The condition

g ·

= 0 for any closed

can be re-written by Stoke’s

theorem as

∇ × g · dS = 0,

where S is bounded by the closed curve C. This is true for arbitrary S. So

∇ × g = 0.

In our example above,

∇×g

= 0 due to spherical symmetry. But here we showed

that it is true for all cases.

Note that we exploited symmetry to solve Gauss’ law. However, if the mass

distribution is not sufficiently symmetrical, Gauss’ law in integral form can be

difficult to use. But we can rewrite it in differential form. Suppose

M =

ρ(r) dV,

where ρ is the mass density. Then by Gauss’ theorem

g · dS = −4πGM ⇒

∇ · g dV =

−4πGρ dV.

Since this is true for all V , we must have

Law (Gauss’ Law for gravitation in differential form).

∇ · g = −4πGρ.

Since

∇ × g

= 0, we can introduce a gravitational potential

(

) with

g = −∇ϕ. Then Gauss’ Law becomes

∇

ϕ = 4πGρ.

In the example with spherical symmetry, we can solve that

ϕ(r) = −

for r > a.

10.2 Laws of electrostatics

Consider a distribution of electric charge at rest. They produce a force on a

charge q, at rest at r , which is proportional to q.

Definition

(Electric field)

The force produced by electric charges on another

charge q is F = qE(r), where E(r) is the electric field, or force per unit charge.

Again, this is conservative. So

E · dr = 0

for any closed curve C. It also obeys

Law (Gauss’ law for electrostatic forces).

E · dS =

where ε

is the permittivity of free space, or electric constant.

Then we can write it in differential form, as in the gravitational case.

Law (Gauss’ law for electrostatic forces in differential form).

∇ · E =

Assuming constant (or no) magnetic field, we have

∇ × E = 0.

So we can write E = −∇ϕ.

Definition

(Electrostatic potential)

If we write

−∇ϕ

, then

is the

electrostatic potential, and

∇

ϕ =

Example.

Take a spherically symmetric charge distribution about

with total

charge

. Suppose all charge is contained within a radius

. Then similar

to the gravitational case, we have

E(r) =

4πε

and

ϕ(r) =

−Q

4πε

a →

0, we get point charges. From

, we can recover Coulomb’s law for the

force on another charge q at r:

F = qE =

4πε

Example (Line charge). Consider an infinite line with uniform charge density

per unit length σ.

We use cylindrical polar coordinates:

r =

+ y

By symmetry, the field is radial, i.e.

E(r) = E(r)

Pick

to be a cylinder of length

and radius

. We know that the end caps do

not contribute to the flux since the field lines are perpendicular to the normal.

Also, the curved surface has area 2πrL. Then by Gauss’ law in integral form,

E · dS = E(r)2πrL =

σL

E(r) =

2πε

Note that the field varies as 1

, not 1

. Intuitively, this is because we have

one more dimension of “stuff” compared to the point charge, so the field does

not drop as fast.

10.3 Poisson’s Equation and Laplace’s equation

Definition (Poisson’s equation). The Poisson’s equation is

∇

ϕ = −ρ,

where ρ is given and ϕ(r) is to be solved.

This is the form of the equations for gravity and electrostatics, with

−

πGρ

and ρ/ε

in place of ρ respectively.

When ρ = 0, we get

Definition (Laplace’s equation). Laplace’s equation is

∇

ϕ = 0.

One example is irrotational and incompressible fluid flow: if the velocity is

(

), then irrotationality gives

∇ϕ

for some velocity potential

. Since it is

incompressible, ∇ · u = 0 (cf. previous chapters). So ∇

ϕ = 0.

The expressions for

∇

can be found in non-Cartesian coordinates, but are a

bit complicated.

We’re concerned here mainly with cases exhibiting spherical or cylindrical

symmetry (use

for radial coordinate here). i.e. when

(

) has spherical or

cylindrical symmetry. Write ϕ = ϕ(r). Then

∇ϕ = ϕ

(r)

Then Laplace’s equation ∇

ϕ = 0 becomes an ordinary differential equation.

– For spherical symmetry, using the chain rule, we have

∇

ϕ = ϕ

)

= 0.

Then the general solution is

ϕ =

+ B.

– For cylindrical symmetry, with r

= x

+ x

, we have

∇

ϕ = ϕ

(rϕ

)

= 0.

Then

ϕ = A ln r + B.

Then solutions to Poisson’s equations can be obtained in a similar way, i.e. by

integrating the differential equations directly, or by adding particular integrals

to the solutions above.

For example, for a spherically symmetric solution of

∇

−ρ

, with

constant, recall that

∇

(

+ 1)

α−2

. Taking

= 2, we find the particular

integral

ϕ = −

So the general solution with spherical symmetry and constant ρ

ϕ(r) =

+ B −

To determine

A, B

, we must specify boundary conditions. If

is defined on all

, we often require

ϕ →

0 as

|r| → ∞

. If

is defined on a bounded volume

V , then there are two kinds of common boundary conditions on ∂V :

– Specify ϕ on ∂V — a Dirichlet condition

–

Specify

n · ∇ϕ

(sometimes written as

∂ϕ

∂n

): a Neumann condition. (

the outward normal on ∂V ).

The type of boundary conditions we get depends on the physical content of

the problem. For example, specifying

∂ϕ

∂n

corresponds to specifying the normal

component of g or E.

We can also specify different boundary conditions on different boundary

components.

Example.

We might have a spherically symmetric distribution with constant

, defined in a ≤ r ≤ b, with ϕ(a) = 0 and

∂ϕ

∂n

(b) = 0.

Then the general solution is

ϕ(r) =

+ B −

We apply the first boundary condition to obtain

+ B −

= 0.

The second boundary condition gives

n · ∇ϕ = −

−

b = 0.

These conditions give

A = −

, B =

Example.

We might also be interested with spherically symmetric solution with

∇

ϕ =

(

−ρ

r ≤ a

0 r > a

with

non-singular at

= 0 and

(

)

→

0 as

r → ∞

, and

ϕ, ϕ

continuous at

r = a. This models the gravitational potential on a uniform planet.

Then the general solution from above is

ϕ =

(

+ B −

r ≤ a

+ D r > a.

Since

is non-singular at

= 0, we have

= 0. Since

ϕ →

0 as

r → ∞

= 0.

ϕ =

(

B −

r ≤ a

r > a.

This is the gravitational potential inside and outside a planet of constant density

and radius a. We want ϕ and ϕ

to be continuous at r = a. So we have

B +

4πρ

πGρ

a = −

The second equation gives

−GM

. Substituting that into the first equation

to find B, we get

ϕ(r) =

(



− 3



r ≤ a

−

r > a

Since g = −ϕ

, we have

g(r) =

(

−

GMr

r ≤ a

−

r > a

We can plot the potential energy:

ϕ(r)

r = a

We can also plot −g(r), the inward acceleration:

−g(r)

r = a

Alternatively, we can apply Gauss’ Law for a flux of

(

)

out of

, a

sphere of radius R. For R ≤ a,

g · dS = 4πR

g(R) = −4πGM





g(R) = −

GMR

For R ≥ a, we can simply apply Newton’s law of gravitation.

In general, even if the problem has nothing to do with gravitation or electro-

statics, if we want to solve

∇

−ρ

with

and

sufficiently symmetric, we

can consider the flux of ∇ϕ out of a surface S = ∂V :

∇ϕ · dS = −

ρ dV,

by divergence theorem. This is called the Gauss Flux method.

11 Laplace’s and Poisson’s equations

11.1 Uniqueness theorems

Theorem.

Consider

∇

−ρ

for some

(

) on a bounded volume

with

S = ∂V being a closed surface, with an outward normal n.

Suppose ϕ satisfies either

(i) Dirichlet condition, ϕ(r) = f (r) on S

(ii) Neumann condition

∂ϕ(r)

∂n

= n · ∇ϕ = g(r) on S.

where f, g are given. Then

(i) ϕ(r) is unique

(ii) ϕ(r) is unique up to a constant.

This theorem is practically important - if you find a solution by any magical

means, you know it is the only solution (up to a constant).

Since the proof of the cases of the two different boundary conditions are very

similar, they will be proved together. When the proof is broken down into (i)

and (ii), it refers to the specific cases of each boundary condition.

Proof.

Let

(

) and

(

) satisfy Poisson’s equation, each obeying the boundary

conditions (N) or (D). Then Ψ(

) =

(

)

− ϕ

(

) satisfies

∇

Ψ = 0 on

linearity, and

(i) Ψ = 0 on S; or

(ii)

∂Ψ

∂n

= 0 on S.

Combining these two together, we know that Ψ

∂Ψ

∂n

= 0 on the surface. So using

the divergence theorem,

∇ · (Ψ∇Ψ) dV =

(Ψ∇Ψ) · dS = 0.

But

∇ · (Ψ∇Ψ) = (∇Ψ) · (∇Ψ) + Ψ ∇

|{z}

= |(∇Ψ)|

|∇Ψ|

dV = 0.

Since

|∇

≥

0, the integral can only vanish if

|∇

= 0. So

∇

Ψ = 0. So Ψ =

a constant on V . So

(i) Ψ = 0 on S ⇒ c = 0. So ϕ

= ϕ

on V .

(ii) ϕ

(r) = ϕ

(r) + C, as claimed.

We’ve proven uniqueness. How about existence? It turns out it isn’t difficult

to craft a boundary condition in which there are no solutions.

For example, if we have

∇

−ρ

with the condition

∂ϕ

∂n

, then by

the divergence theorem,

∇

ϕ dV =

∂S

∂ϕ

∂n

dS.

Using Poisson’s equation and the boundary conditions, we have

ρ dV +

∂V

g dS = 0

So if ρ and g don’t satisfy this equation, then we can’t have any solutions.

The theorem can be similarly proved and stated for regions in

, R

, ···

, by

using the definitions of grad, div and the divergence theorem. The result also

extends to unbounded domains. To prove it, we can take a sphere of radius

and impose the boundary conditions

Ψ(

)

) or

∂Ψ

∂n

(

)

) as

R → ∞. Then we just take the relevant limits to complete the proof.

Similar results also apply to related equations and different kinds of boundary

conditions, eg

on different parts of the boundary. But we have to analyse

these case by case and see if the proof still applies.

The proof uses a special case of the result

Proposition (Green’s first identity).

(u∇v) · dS =

(∇u) · (∇v) dV +

u∇

v dV,

By swapping u and v around and subtracting the equations, we have

Proposition (Green’s second identity).

(u∇v − v∇u) · dS =

(u∇

v − v∇

u) dV.

These are sometimes useful, but can be easily deduced from the divergence

theorem when needed.

11.2 Laplace’s equation and harmonic functions

Definition

(Harmonic function)

A harmonic function is a solution to Laplace’s

equation ∇

ϕ = 0.

These have some very special properties.

11.2.1 The mean value property

Proposition

(Mean value property)

Suppose

(

) is harmonic on region

containing a solid sphere defined by

|r−a| ≤ R

, with boundary

|r−a|

for some R. Define

¯ϕ(R) =

4πR

ϕ(r) dS.

Then ϕ(a) = ¯ϕ(R).

In words, this says that the value at the center of a sphere is the average of

the values on the surface on the sphere.

Proof.

Note that

¯ϕ

(

)

→ ϕ

(

) as

R →

0. We take spherical coordinates (

u, θ, χ

)

centered on r = a. The scalar element (when u = R) on S

dS = R

sin θ dθ dχ.

is independent of R. Write

¯ϕ(R) =

4π

Differentiate this with respect to

, noting that d

S/R

is independent of

Then we obtain

¯ϕ(R) =

4πR

∂ϕ

∂u



u=R

But

∂ϕ

∂u

= e

· ∇ϕ = n · ∇ϕ =

∂ϕ

∂n

on S

. So

¯ϕ(R) =

4πR

∇ϕ · dS =

4πR

∇

ϕ dV = 0

by divergence theorem. So

¯ϕ

(

) does not depend on

, and the result follows.

11.2.2 The maximum (or minimum) principle

In this section, we will talk about maxima of functions. It should be clear that

the results also hold for minima.

Definition

(Local maximum)

We say that

(

) has a local maximum at

for some ε > 0, ϕ(r) < ϕ(a) when 0 < |r − a| < ε.

Proposition

(Maximum principle)

If a function

is harmonic on a region

then ϕ cannot have a maximum at an interior point of a of V .

Proof.

Suppose that

had a local maximum at

in the interior. Then there is

an ε such that for any r such that 0 < |r − a| < ε, we have ϕ(r) < ϕ(a).

Note that if there is an

that works, then any smaller

will work. Pick an

sufficiently small such that the region

|r − a| < ε

lies within

(possible since

lies in the interior of V ).

Then for any r such that |r − a| = ε, we have ϕ(r) < ϕ(a).

¯ϕ(ε) =

4πR

ϕ(r) dS < ϕ(a),

which contradicts the mean value property.

We can understand this by performing a local analysis of stationary points

by differentiation. Suppose at

, we have

∇ϕ

= 0. Let the eigenvalues of the

Hessian matrix

∂

∂x

. But since

is harmonic, we have

∇

= 0,

i.e.

∂

∂x

= 0. But

is the trace of the Hessian matrix, which is the

sum of eigenvalues. So

= 0.

Recall that a maximum or minimum occurs when all eigenvalues have the

same sign. This clearly cannot happen if the sum is 0. Therefore we can only

have saddle points.

(note we ignored the case where all

= 0, where this analysis is inconclusive)

11.3 Integral solutions of Poisson’s equations

11.3.1 Statement and informal derivation

We want to find a solution to Poisson’s equations. We start with a discrete case,

and try to generalize it to a continuous case.

If there is a single point source of strength λ at a, the potential ϕ is

ϕ =

4π

|r − a|

(we have λ = −4πGM for gravitation and Q/ε

for electrostatics)

If we have many sources

at positions

, the potential is a sum of terms

ϕ(r) =

4π

|r − r

If we have infinitely many of them, having a distribution of

(

) with

(

) d

being the contribution from a small volume at position

. It would be reasonable

to guess that the solution is what we obtain by replacing the sum with an integral:

Proposition.

The solution to Poisson’s equation

∇

−ρ

, with boundary

conditions |ϕ(r)| = O(1/|r|) and |∇ϕ(r)| = O(1/|r|

), is

ϕ(r) =

4π

ρ(r

)

|r − r

For

(

) non-zero everywhere, but suitably well-behaved as

| → ∞

, we can

also take V

= R

Example. Suppose

∇

ϕ =

(

−ρ

|r| ≤ a

0 |r| > a.

Fix

and introduce polar coordinates

, θ, χ

for

. We take the

= 0 direction

to be the direction along the line from r

to r.

Then

ϕ(r) =

4π

|r − r

We have

= r

sin θ dr

dθ dχ.

We also have

|r − r

| =

+ r

− 2rr

cos θ

by the cosine rule (c

= a

+ b

− 2ab cos C). So

ϕ(r) =

4π

dθ

2π

dχ

sin θ

√

+ r

− 2rr

cos θ

+ r

− rr

cos θ

θ=π

θ=0

(|r + r

| + |r − r

(

r > r

2r r < r

If r > a, then r > r

always. So

ϕ(r) = ρ

If r < a, then the integral splits into two parts:

ϕ(r) = ρ





= ρ



−



11.3.2 Point sources and δ-functions*

Recall that

Ψ =

4π|r − a|

is our potential for a point source. When r 6= a, we have

∇Ψ = −

4π

r − a

|r − a|

, ∇

Ψ = 0.

What about when

? Ψ is singular at this point, but can we say anything

about ∇

Ψ?

For any sphere with center a, we have

∇Ψ · dS = −λ.

By the divergence theorem, we have

∇

Ψ dV = −λ.

for

being a solid sphere with

∂V

. Since

∇

Ψ is zero at any point

r 6

we must have

∇

Ψ = −λδ(r − a),

where δ is the 3d delta function, which satisfies

f(r)δ(r − a) dV = f(a)

for any volume containing a.

In short, we have

∇



|r − r



= −4πδ(r − r

Using these, we can verify that the integral solution of Poisson’s equation we

obtained previously is correct:

∇

Ψ(r) = ∇



4π

ρ(r

)

|r − r



4π

ρ(r

)∇



|r − r



= −

ρ(r

)δ(r − r

) dV

= −ρ(r),

as required.

12 Maxwell’s equations

12.1 Laws of electromagnetism

Maxwell’s equations are a set of four equations that describe the behaviours

of electromagnetism. Together with the Lorentz force law, these describe all

we know about (classical) electromagnetism. All other results we know are

simply mathematical consequences of these equations. It is thus important to

understand the mathematical properties of these equations.

To begin with, there are two fields that govern electromagnetism, known

as the electric and magnetic field. These are denoted by

(

r, t

) and

(

r, t

)

respectively.

To understand electromagnetism, we need to understand how these fields

are formed, and how these fields affect charged particles. The second is rather

straightforward, and is given by the Lorentz force law.

Law (Lorentz force law). A point charge q experiences a force of

F = q(E +

r × B).

The dynamics of the field itself is governed by Maxwell’s equations. To state

the equations, we need to introduce two more concepts.

Definition

(Charge and current density)

. ρ

(

r, t

) is the charge density, defined

as the charge per unit volume.

(

r, t

) is the current density, defined as the electric current per unit area of

cross section.

Then Maxwell’s equations say

Law (Maxwell’s equations).

∇ · E =

∇ · B = 0

∇ × E +

∂B

∂t

= 0

∇ × B − µ

∂E

∂t

= µ

where

is the electric constant (permittivity of free space) and

is the

magnetic constant (permeability of free space), which are constants determined

experimentally.

We can quickly derive some properties we know from these four equations.

The conservation of electric charge comes from taking the divergence of the last

equation.

∇ · (∇ × B)

| {z }

−µ

∂

∂t

(∇ · E)

| {z }

=ρ/ε

= µ

∇ · j.

∂ρ

∂t

+ ∇ · j = 0.

We can also take the volume integral of the first equation to obtain

∇ · E dV =

ρ dV =

By the divergence theorem, we have

E · dS =

which is Gauss’ law for electric fields

We can integrate the second equation to obtain

B · dS = 0.

This roughly states that there are no “magnetic charges”.

The remaining Maxwell’s equations also have integral forms. For example,

C=∂S

E · dr =

∇ × E dS = −

B · dS,

where the first equality is from from Stoke’s theorem. This says that a changing

magnetic field produces a current.

12.2 Static charges and steady currents

If ρ, j, E, B are all independent of time, E and B are no longer linked.

We can solve the equations for electric fields:

∇ · E = ρ/ε

∇ × E = 0

Second equation gives E = −∇ϕ. Substituting into first gives ∇

ϕ = −ρ/ε

The equations for the magnetic field are

∇ · B = 0

∇ × B = µ

First equation gives

∇ × A

for some vector potential

. But the vector

potential is not well-defined. Making the transformation

A 7→ A

∇χ

(

)

produces the same

, since

∇ ×

(

∇χ

) = 0. So choose

such that

∇ · A

= 0.

Then

∇

A = ∇(∇ · A

|{z}

) − ∇ × (∇ × A

| {z }

) = −µ

In summary, we have

Electrostatics Magnetostatics

∇ · E = ρ/ε

∇ · B = 0

∇ × E = 0 ∇ × B = µ

∇

ϕ = −ρ/ε

∇

A = −µ

sets the scale of electrostatic effects,

e.g. the Coulomb force

sets the scale of magnetic effects,

e.g. force between two wires with cur-

rents.

12.3 Electromagnetic waves

Consider Maxwell’s equations in empty space, i.e.

= 0,

. Then Maxwell’s

equations give

∇

E = ∇(∇ · E) − ∇ × (∇ × E) = ∇ ×

∂B

∂t

∂

∂t

(∇ × B) = µ

∂

Define c =

√

. Then the equation gives



∇

−

∂

∂t



E = 0.

This is the wave equation describing propagation with speed

. Similarly, we

can obtain



∇

−

∂

∂t



B = 0.

So Maxwell’s equations predict that there exists electromagnetic waves in free

space, which move with speed

√

≈ 3.00 × 10

m s

−1

, which is the speed

of light! Maxwell then concluded that light is electromagnetic waves!

13 Tensors and tensor fields

13.1 Definition

There are two ways we can think of a vector in

. We can either interpret it as

a “point” in space, or we can view it simply as a list of three numbers. However,

the list of three numbers is simply a representation of the vector with respect to

some particular basis. When we change basis, in order to represent the same

point, we will need to use a different list of three numbers. In particular, when

we perform a rotation by R

, the new components of the vector is given by

= R

Similarly, we can imagine a matrix as either a linear transformation or an

array of 9 numbers. Again, when we change basis, in order to represent the

same transformation, we will need a different array of numbers. This time, the

transformation is given by

= R

We can think about this from another angle. To define an arbitrary quantity

, we can always just write down 9 numbers and be done with it. Moreover,

we can write down a different set of numbers in a different basis. For example,

we can define

in our favorite basis, but

= 0 in all other bases. We

can do so because we have the power of the pen.

However, for this

to represent something physically meaningful, i.e. an

actual linear transformation, we have to make sure that the components of

transform sensibly under a basis transformation. By “sensibly”, we mean that it

has to follow the transformation rule

. For example, the

we defined in the previous paragraph does not transform sensibly. While it is

something we can define and write down, it does not correspond to anything

meaningful.

The things that transform sensibly are known as tensors. For example,

vectors and matrices (that transform according to the usual change-of-basis

rules) are tensors, but that A

is not.

In general, tensors are allowed to have an arbitrary number of indices. In

order for a quantity

ij···k

to be a tensor, we require it to transform according to

ij···k

= R

···R

pq···r

which is an obvious generalization of the rules for vectors and matrices.

Definition

(Tensor)

A tensor of rank

has components

ij···k

(with

indices)

with respect to each basis

}

or coordinate system

}

, and satisfies the

following rule of change of basis:

ij···k

= R

···R

pq···r

Example.

–

A tensor

of rank 0 doesn’t transform under change of basis, and is a

scalar.

– A tensor T of rank 1 transforms under T

= R

. This is a vector.

–

A tensor

of rank 2 transforms under

. This is a matrix.

Example.

(i) If u, v, ···w are n vectors, then

ij···k

= u

···w

defines a tensor of rank

. To check this, we check the tensor transformation

rule. We do the case for

= 2 for simplicity of expression, and it should

be clear that this can be trivially extended to arbitrary n:

= u

= (R

)(R

)

= R

)

= R

Then linear combinations of such expressions are also tensors, e.g.

+ a

for any u, v, a, b.

(ii) δ

and

ijk

are tensors of rank 2 and 3 respectively — with the special

property that their components are unchanged with respect to the basis

coordinate:

= R

= δ

since R

= (RR

)

= I

. Also

ijk

= R

pqr

= (det R)ε

ijk

= ε

ijk

using results from Vectors and Matrices.

(iii)

(Physical example) In some substances, an applied electric field

gives rise

to a current density

, according to the linear relation

, where

is the conductivity tensor.

Note that this relation entails that the resulting current need not be in the

same direction as the electric field. This might happen if the substance

has special crystallographic directions that favours electric currents.

However, if the substance is isotropic, we have

σδ

for some

. In

this case, the current is parallel to the field.

13.2 Tensor algebra

Definition

(Tensor addition)

Tensors

and

of the same rank can be added;

T + S is also a tensor of the same rank, defined as

(T + S)

ij···k

= T

ij···k

+ S

ij···k

in any coordinate system.

To check that this is a tensor, we check the transformation rule. Again, we

only show for n = 2:

(T + S)

= T

+ S

= R

+ R

= (R

)(T

+ S

Definition

(Scalar multiplication)

A tensor

of rank

can be multiplied by

a scalar α. αT is a tensor of the same rank, defined by

(αT )

= αT

It is trivial to check that the resulting object is indeed a tensor.

Definition

(Tensor product)

Let

be a tensor of rank

and

be a tensor of

rank m. The tensor product T ⊗ S is a tensor of rank n + m defined by

(T ⊗ S)

···x

···y

= T

···x

···y

It is trivial to show that this is a tensor.

We can similarly define tensor products for any (positive integer) number of

tensors, e.g. for n vectors u, v ··· , w, we can define

T = u ⊗ v ⊗··· ⊗w

ij···k

= u

···w

as defined in the example in the beginning of the chapter.

Definition

(Tensor contraction)

For a tensor

of rank

with components

ijp···q

, we can contract on the indices

i, j

to obtain a new tensor of rank

n −

p···q

= δ

ijp···q

= T

iip···q

Note that we don’t have to always contract on the first two indices. We can

contract any pair we like.

To check that contraction produces a tensor, we take the ranks 2

example.

Contracting, we get

,a rank-0 scalar. We have

= T

, since R is an orthogonal matrix.

If we view

as a matrix, then the contraction is simply the trace of

the matrix. So our result above says that the trace is invariant under basis

transformations — as we already know in IA Vectors and Matrices.

Note that our usual matrix product can be formed by first applying a tensor

product to obtain M

, then contract with δ

to obtain M

13.3 Symmetric and antisymmetric tensors

Definition

(Symmetric and anti-symmetric tensors)

A tensor

of rank

symmetric in the indices i, j if it obeys

ijp···q

= T

jip···q

It is anti-symmetric if

ijp···q

= −T

jip···q

Again, a tensor can be symmetric or anti-symmetric in any pair of indices, not

just the first two.

This is a property that holds in any coordinate systems, if it holds in one,

since

k`r...s

= R

···R

ijp···q

= ±R

···R

jip···q

= ±T

`kr···s

as required.

Definition

(Totally symmetric and anti-symmetric tensors)

A tensor is totally

(anti-)symmetric if it is (anti-)symmetric in every pair of indices.

Example. δ

is totally symmetric, while

ijk

−ε

jik

is totally antisym-

metric.

There are totally symmetric tensors of arbitrary rank n. But in R

– Any totally antisymmetric tensor of rank 3 is λε

ijk

for some scalar λ.

–

There are no totally antisymmetric tensors of rank greater than 3, except

for the trivial tensor with all components 0.

Proof: exercise (hint: pigeonhole principle)

13.4 Tensors, multi-linear maps and the quotient rule

Tensors as multi-linear maps

In Vectors and Matrices, we know that matrices are linear maps. We will prove

an analogous fact for tensors.

Definition

(Multilinear map)

A map

that maps

vectors

a, b, ··· , c

is multi-linear if it is linear in each of the vectors a, b, ··· , c individually.

We will show that a tensor

of rank

is a equivalent to a multi-linear map

from n vectors a, b, ··· , c to R defined by

T (a, b, ··· , c) = T

ij···k

···c

To show that tensors are equivalent to multi-linear maps, we have to show the

following:

(i)

Defining a map with a tensor makes sense, i.e. the expression

ij···k

···c

is the same regardless of the basis chosen;

(ii)

While it is always possible to write a multi-linear map as

ij···k

···c

we have to show that

ij···k

is indeed a tensor, i.e. transform according to

the tensor transformation rules.

To show the first property, just note that the

ij···k

···c

is a tensor

product (followed by contraction), which retains tensor-ness. So it is also a

tensor. In particular, it is a rank 0 tensor, i.e. a scalar, which is independent of

the basis.

To show the second property, assuming that

is a multi-linear map, it must

be independent of the basis, so

ij···k

···c

= T

ij···k

···c

Since

by tensor transformation rules, multiplying both sides by

gives v

= R

. Substituting in gives

ij···k

)(R

) ···(R

) = T

pq···r

···c

Since this is true for all a, b, ···c, we must have

ij···k

···R

= T

pq···r

Hence T

ij···k

obeys the tensor transformation rule, and is a tensor.

This shows that there is a one-to-one correspondence between tensors of rank

n and multi-linear maps.

This gives a way of thinking about tensors independent of any coordinate

system or choice of basis, and the tensor transformation rule emerges naturally.

Note that the above is exactly what we did with linear maps and matrices.

The quotient rule

If T

i ···j

|{z}

p ···q

|{z}

is a tensor of rank n + m, and u

p···q

is a tensor of rank m then

i,···j

= T

i···jp···q

p···q

is a tensor of rank

, since it is a tensor product of

and

, followed by

contraction.

The converse is also true:

Proposition

(Quotient rule)

Suppose that

i···jp···q

is an array defined in each

coordinate system, and that

i···j

i···jp···q

p···q

is also a tensor for any tensor

p···q

. Then T

i···jp···q

is also a tensor.

Note that we have previously seen the special case of

= 1, which says

that linear maps are tensors.

Proof.

We can check the tensor transformation rule directly. However, we can

reuse the result above to save some writing.

Consider the special form

p···q

···d

for any vectors

c, ···d

. By

assumption,

i···j

= T

i···jp···q

···d

is a tensor. Then

i···j

···b

= T

i···jp···q

···b

···d

is a scalar for any vectors

a, ··· , b, c, ··· , d

. Since

i···jp···q

···b

···d

is a

scalar and hence gives the same result in every coordinate system,

i···jp···q

is a

multi-linear map. So T

i···jp···q

is a tensor.

13.5 Tensor calculus

Tensor fields and derivatives

Just as with scalars or vectors, we can define tensor fields:

Definition

(Tensor field)

A tensor field is a tensor at each point in space

ij···k

(x), which can also be written as T

ij···k

We assume that the fields are smooth so they can be differentiated any

number of times

∂

∂x

···

∂

∂x

ij···k

except for where things obviously fail, e.g. for where

is not defined. We now

claim:

Proposition.

∂

∂x

···

∂

∂x

| {z }

ij ···k

| {z }

, (∗)

is a tensor of rank n + m.

Proof.

To show this, it suffices to show that

∂

∂x

satisfies the tensor transfor-

mation rules for rank 1 tensors (i.e. it is something like a rank 1 tensor). Then

by the exact same argument we used to show that tensor products preserve

tensorness, we can show that the (

∗

) is a tensor. (we cannot use the result of

tensor products directly, since this is not exactly a product. But the exact same

proof works!)

Since x

= R

, we have

∂x

= R

(noting that

∂x

= δ

). Similarly,

∂x

= R

Note that R

, R

are constant matrices.

Hence by the chain rule,

∂

∂x



∂x



∂

∂x

= R

∂

∂x

∂

∂x

obeys the vector transformation rule. So done.

Integrals and the tensor divergence theorem

It is also straightforward to do integrals. Since we can sum tensors and take

limits, the definition of a tensor-valued integral is straightforward.

For example,

ij···k

(

) d

is a tensor of the same rank as

ij···k

(think of

the integral as the limit of a sum).

For a physical example, recall our discussion of the flux of quantities for a

fluid with velocity

(

) through a surface element — assume a uniform density

. The flux of volume is

u · nδs

δS

. So the flux of mass is

ρu

δS

Then the flux of the

th component of momentum is

ρu

δS

kδS

(mass times velocity), where

ρu

. Then the flux through the surface

dS.

It is easy to generalize the divergence theorem from vectors to tensors. We

can then use it to discuss conservation laws for tensor quantities.

Let

be a volume bounded by a surface

∂V

and

ij···k`

be a smooth

tensor field. Then

Theorem (Divergence theorem for tensors).

ij···k`

dS =

∂

∂x

ij···k`

) dV,

with n being an outward pointing normal.

The regular divergence theorem is the case where

has one index and is a

vector field.

Proof.

Apply the usual divergence theorem to the vector field

defined by

= a

···c

ij···k`

, where a, b, ··· , c are fixed constant vectors.

Then

∇ · v =

∂v

∂x

= a

···c

∂

∂x

ij···k`

and

n · v = n

= a

···c

ij···k`

Since

a, b, ··· , c

are arbitrary, therefore they can be eliminated, and the tensor

divergence theorem follows.

14 Tensors of rank 2

14.1 Decomposition of a second-rank tensor

This decomposition might look arbitrary at first sight, but as time goes on, you

will find that it is actually very useful in your future career (at least, the lecturer

claims so).

Any second rank tensor can be written as a sum of its symmetric and

anti-symmetric parts

= S

+ A

where

+ T

), A

− T

Here

has 9 independent components, whereas

and

have 6 and 3

independent components, since they must be of the form

) =





a d e

d b f

e f c





, (A

) =





0 a b

−a 0 c

−b −c 0





The symmetric part can be be further reduced to a traceless part plus an isotropic

(i.e. multiple of δ

) part:

= P

where

is the trace of

and

−

is traceless. Then

has 5 independent components while Q has 1.

Since the antisymmetric part has 3 independent components, just like a usual

vector, we should be able to write

in terms of a single vector. In fact, we can

write the antisymmetric part as

= ε

ijk

for some vector

. To figure out what this

is, we multiply by

ij`

on both

sides and use some magic algebra to obtain

ijk

where the last equality is from the fact that only antisymmetric parts contribute

to the sum.

Then

) =





0 B

−B

0 B

−B





To summarize,

= P

+ ε

ijk

where B

pqj

, Q = T

and P

= P

−

Example.

The derivative of a vector field

(

) is a tensor

∂F

∂x

, a tensor

field. Our decomposition given above has the symmetric traceless piece



∂F

∂x

∂F

∂x



−

∂F

∂x



∂F

∂x

∂F

∂x



−

∇ · F,

an antisymmetric piece A

= ε

ijk

, where

ijk

∂F

∂x

= −

(∇ × F)

and trace

Q =

∂F

∂x

= ∇ · F.

Hence a complete description involves a scalar

∇ · F

, a vector

∇ × F

, and a

symmetric traceless tensor P

14.2 The inertia tensor

Consider masses

with positions

, all rotating with angular velocity

about

0. So the velocities are v

= ω ×r

. The total angular momentum is

L =

× m

× (ω × r

)

(|r

ω −(r

· ω )r

by vector identities. In components, we have

= I

where

Definition (Inertia tensor). The inertia tensor is

[|r

− (r

)

For a rigid body occupying volume

with mass density

(

), we replace the

sum with an integral to obtain

ρ(r)(x

− x

) dV.

By inspection, I is a symmetric tensor.

Example.

Consider a rotating cylinder with uniform density

. The total

mass is 2`πa

Use cylindrical polar coordinates:

= r cos θ

= r sin θ

= x

dV = r dr dθ dx

We have

+ x

) dV

= ρ

2π

−`

(r dr dθ dx

)

= ρ

· 2π · 2`





= ε

π`a

Similarly, we have

+ x

) dV

= ρ

2π

−`

sin

θ + x

)r dr dθ dx

= ρ

2π

sin

θ [x

]

−`





−`

dθ dr

= ρ

2π



sin

θ2` +



dθ dr

= ρ



2πa ·

+ 2`

2π

sin



= ρ

πa





By symmetry, the result for I

is the same.

How about the off-diagonal elements?

= −

= −ρ

−`

2π

cos θx

dr dx

dθ

= 0

Since

2π

θ cos θ

= 0. Similarly, the other off-diagonal elements are all 0. So

the non-zero components are

= I

= M





In the particular case where ` =

√

, we have I

. So in this case,

L =

for rotation about any axis.

14.3 Diagonalization of a symmetric second rank tensor

Recall that using matrix notation,

T = (T

), T

= (T

), R = (R

and the tensor transformation rule T

= R

becomes

= RT R

−1

is symmetric, it can be diagonalized by such an orthogonal transformation.

This means that there exists a basis of orthonormal eigenvectors

, e

for

with real eigenvalues

, λ

respectively. The directions defined by

, e

are the principal axes for

, and the tensor is diagonal in Cartesian coordinates

along these axes.

This applies to any symmetric rank-2 tensor. For the special case of the

inertia tensor, the eigenvalues are called the principal moments of inertia.

As exemplified in the previous example, we can often guess the correct

principal axes for

based on the symmetries of the body. With the axes we

chose, I

was found to be diagonal by direct calculation.

15 Invariant and isotropic tensors

15.1 Definitions and classification results

Definition

(Invariant and isotropic tensor)

A tensor

is invariant under a

particular rotation R if

ij···k

= R

···R

pq···r

= T

ij···k

i.e. every component is unchanged under the rotation.

A tensor

which is invariant under every rotation is isotropic, i.e. the same

in every direction.

Example. The inertia tensor of a sphere is isotropic by symmetry.

and

ijk

are also isotropic tensors. This ensures that the component

definitions of the scalar and vector products

a·b

and (

a×b

)

ijk

are independent of the Cartesian coordinate system.

Isotropic tensors in R

can be classified:

Theorem.

(i) There are no isotropic tensors of rank 1, except the zero tensor.

(ii) The most general rank 2 isotropic tensor is T

= αδ

for some scalar α.

(iii)

The most general rank 3 isotropic tensor is

ijk

βε

ijk

for some scalar

(iv)

All isotropic tensors of higher rank are obtained by combining

and

ijk

using tensor products, contractions, and linear combinations.

We will provide a sketch of the proof:

Proof.

We analyze conditions for invariance under specific rotations through

or π/2 about coordinate axes.

(i) Suppose T

is rank-1 isotropic. Consider a rotation about x

through π:

) =





−1 0 0

0 −1 0

0 0 1





We want

−T

. So

= 0. Similarly,

= 0. By

consider a rotation about, say x

, we have T

= 0.

(ii) Suppose T

is rank-2 isotropic. Consider

) =





0 1 0

−1 0 0

0 0 1





which is a rotation through π/2 about the x

axis. Then

= R

= T

and

= R

= −T

So T

= T

= 0. Similarly, we have T

= T

= 0.

We also have

= R

= T

So T

= T

By picking a rotation about a different axis, we have

and

= T

Hence T

= αδ

(iii)

Suppose that

ijk

is rank-3 isotropic. Using the rotation by

about the

axis, we have

133

= R

pqr

= −T

133

So T

133

= 0. We also have

111

= R

pqr

= −T

111

= 0. We have similar results for

rotations about other axes and

other choices of indices.

Then we can show that T

ijk

= 0 unless all i, j, k are distinct.

Now consider

) =





0 1 0

−1 0 0

0 0 1





a rotation about x

through π/2. Then

123

= R

pqr

= R

213

= −T

213

123

−T

213

. Along with similar results for other indices and axes of

rotation, we find that

ijk

is totally antisymmetric, and

ijk

βε

ijk

for

some β.

Example. The most general isotropic tensor of rank 4 is

ijk`

= αδ

+ βδ

+ γδ

for some scalars

α, β, γ

. There are no other independent combinations. (we

might think we can write a rank-4 isotropic tensor in terms of

ijk

, like

ijp

k`p

but this is just

− δ

. It turns out that anything you write with

ijk

can be written in terms of δ

instead)

15.2 Application to invariant integrals

We have the following very useful theorem. It might seem a bit odd and arbitrary

at first sight — if so, read the example below first (after reading the statement

of the theorem), and things will make sense!

Theorem. Let

ij···k

f(x)x

···x

dV.

where f (x) is a scalar function and V is some volume.

Given a rotation

, consider an active transformation:

is mapped

with

, i.e. we map the components but not the basis, and

V is mapped to V

Suppose that under this active transformation,

(i) f (x) = f(x

(ii) V

= V (e.g. if V is all of space or a sphere).

Then T

ij···k

is invariant under the rotation.

Proof.

First note that the Jacobian of the transformation

is 1, since it is

simply the determinant of

(

⇒

∂x

), which is by definition

1. So dV = dV

Then we have

···R

pq···r

f(x)x

···x

f(x

···x

dV using (i)

f(x

···x

using (ii)

f(x)x

···x

dV since x

and x

are dummy

= T

ij···k

The result is particularly useful if (i) and (ii) hold for any rotation

, in

which case T

ij···k

is isotropic.

Example. Let

dV,

with

being a solid sphere of

|r| < a

. Our result applies with

= 1, which,

being a constant, is clearly invariant under rotations. Also the solid sphere is

invariant under any rotation. So

must be isotropic. But the only rank 2

isotropic tensor is αδ

. Hence we must have

= αδ

and all we have to do is to determine the scalar α.

Taking the trace, we have

= 3α =

dV = 4π

· r

dr =

πa

Normally if we are only interested in the

i 6

case, we just claim that

= 0

by saying “by symmetry, it is 0”. But now we can do it (more) rigorously!

There is a closely related result for the inertia tensor of a solid sphere of

constant density ρ

, or of mass M =

πa

Recall that

− x

) dV.

We see that

is isotropic (since we have just shown that

is isotropic,

and x

is also isotropic). Let I

= βδ

. Then

− x

) dV

= ρ



dV −



= ρ

(δ

− T

)

= ρ



πa

−

πa



πa