2 Fourier series
IB Methods
2.3 Differentiability and Fourier series
Integration is a smoothening operator. If we take, say, the step function
\[
  \Theta(x) =
  \begin{cases}
    1 & x > 0\\
    0 & x < 0
  \end{cases}.
\]
Then the integral is given by
\[
  \int \Theta(x)\;\mathrm{d}x =
  \begin{cases}
    x & x > 0\\
    0 & x < 0
  \end{cases}.
\]
This is now a continuous function. If we integrate it again, we make the positive
side look like a quadratic curve, and it is now differentiable.
[Figure: the step function $\Theta(x)$ and, after applying $\int \mathrm{d}x$, its continuous ramp integral]
On the other hand, when we differentiate a function, we generally make it worse. For example, when we differentiate the continuous function $\int \Theta(x)\;\mathrm{d}x$, we obtain the discontinuous function $\Theta(x)$. If we attempt to differentiate this again, we get the $\delta$-(non)function.
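We can see this behaviour symbolically. The following sketch (using sympy; not part of the original notes) integrates the step function to obtain the continuous ramp, and differentiates the step function to obtain the Dirac delta:

```python
from sympy import symbols, Heaviside, DiracDelta, integrate, diff

x = symbols('x')

# Integration smooths: the step function becomes a continuous ramp,
# equal to x for x > 0 and 0 for x < 0.
ramp = integrate(Heaviside(x), x)

# Differentiation roughens: the step function becomes the delta (non-)function.
delta = diff(Heaviside(x), x)

print(ramp)
print(delta)
```

This is just a sanity check of the picture above; sympy's `Heaviside` and `DiracDelta` are its built-in representations of $\Theta$ and $\delta$.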
Hence, it is often helpful to characterize the “smoothness” of a function by
how many times we can differentiate it. It turns out this is rather relevant to
the behaviour of the Fourier series.
Suppose that we have a function $f$ that is itself continuous and whose first $m - 1$ derivatives are also continuous, but $f^{(m)}$ has isolated discontinuities at $\{\theta_1, \theta_2, \theta_3, \cdots, \theta_r\}$.
We can look at the $k$th Fourier coefficient:
\[
  \hat{f}_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ik\theta} f(\theta)\;\mathrm{d}\theta
  = -\frac{1}{2\pi i k} \Big[e^{-ik\theta} f(\theta)\Big]_{-\pi}^{\pi} + \frac{1}{2\pi i k} \int_{-\pi}^{\pi} e^{-ik\theta} f'(\theta)\;\mathrm{d}\theta.
\]
The first term vanishes since $f(\theta)$ is continuous and takes the same value at $\pi$ and $-\pi$. So we are left with
\[
  \hat{f}_k = \frac{1}{2\pi i k} \int_{-\pi}^{\pi} e^{-ik\theta} f'(\theta)\;\mathrm{d}\theta
  = \cdots
  = \frac{1}{(ik)^m} \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ik\theta} f^{(m)}(\theta)\;\mathrm{d}\theta.
\]
Now we have to be careful, since $f^{(m)}$ is no longer continuous. However, provided that $f^{(m)}$ is everywhere finite, we can approximate this integral by removing small strips $(\theta_i - \varepsilon, \theta_i + \varepsilon)$ from our domain of integration, and taking the limit $\varepsilon \to 0$. We can write this as
\[
  \hat{f}_k = \lim_{\varepsilon \to 0} \frac{1}{(ik)^m} \frac{1}{2\pi} \left(\int_{-\pi}^{\theta_1 - \varepsilon} + \int_{\theta_1 + \varepsilon}^{\theta_2 - \varepsilon} + \cdots + \int_{\theta_r + \varepsilon}^{\pi}\right) e^{-ik\theta} f^{(m)}(\theta)\;\mathrm{d}\theta.
\]
Integrating each piece by parts once more, the boundary terms pick up the jumps of $f^{(m)}$ across each $\theta_s$:
\[
  \hat{f}_k = \frac{1}{(ik)^{m+1}} \frac{1}{2\pi} \left[\sum_{s=1}^{r} e^{-ik\theta_s} \left(f^{(m)}(\theta_s^+) - f^{(m)}(\theta_s^-)\right) + \int_{(-\pi,\pi)\setminus\{\theta_i\}} e^{-ik\theta} f^{(m+1)}(\theta)\;\mathrm{d}\theta\right].
\]
We now have to stop. So $\hat{f}_k$ decays like $\frac{1}{k^{m+1}}$ if our function and its first $m - 1$ derivatives are continuous. This means that if a function is more differentiable, then the coefficients decay more quickly.
This makes sense. If we have a rather smooth function, then we would expect
the first few Fourier terms (with low frequency) to account for most of the
variation of f. Hence the coefficients decay really quickly.
However, if the function is jiggly and bumps around all the time, we would
expect to need some higher frequency terms to account for the minute variation.
Hence the terms would not decay away that quickly. So in general, if we can
differentiate it more times, then the terms should decay quicker.
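This decay rate is easy to observe numerically. The following sketch (not part of the original notes; the helper name `fourier_coeff` is my own) approximates $\hat{f}_k$ by the trapezoidal rule for the sawtooth $f(\theta) = \theta$, which jumps at the periodic boundary (so $m = 0$, decay $\sim 1/k$), and for $f(\theta) = \theta^2$, which is continuous and periodic with a jump only in $f'$ (so $m = 1$, decay $\sim 1/k^2$):

```python
import numpy as np

def fourier_coeff(f, k, n=20000):
    # Trapezoidal approximation of f_hat_k = (1/2pi) * integral of e^{-ik theta} f(theta)
    # over [-pi, pi].
    theta = np.linspace(-np.pi, np.pi, n + 1)
    g = np.exp(-1j * k * theta) * f(theta)
    integral = (g[0] / 2 + g[1:-1].sum() + g[-1] / 2) * (2 * np.pi / n)
    return integral / (2 * np.pi)

for k in [1, 2, 4, 8]:
    saw = abs(fourier_coeff(lambda t: t, k))       # discontinuous at the boundary: ~ 1/k
    par = abs(fourier_coeff(lambda t: t**2, k))    # jump only in f': ~ 2/k^2
    print(f"k={k}: sawtooth {saw:.5f} (1/k = {1/k:.5f}), "
          f"theta^2 {par:.5f} (2/k^2 = {2/k**2:.5f})")
```

The printed columns show the coefficients tracking the predicted $1/k^{m+1}$ rates: one extra degree of smoothness buys one extra power of $k$ in the decay.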
An important result, which is in some sense a Linear Algebra result, is
Parseval’s theorem.
Theorem (Parseval's theorem).
\[
  (f, f) = \int_{-\pi}^{\pi} |f(\theta)|^2\;\mathrm{d}\theta = 2\pi \sum_{n \in \mathbb{Z}} |\hat{f}_n|^2
\]
Proof.
\[
  (f, f) = \int_{-\pi}^{\pi} |f(\theta)|^2\;\mathrm{d}\theta
  = \int_{-\pi}^{\pi} \left(\sum_{m \in \mathbb{Z}} \hat{f}_m^* e^{-im\theta}\right) \left(\sum_{n \in \mathbb{Z}} \hat{f}_n e^{in\theta}\right) \mathrm{d}\theta
\]
\[
  = \sum_{m,n \in \mathbb{Z}} \hat{f}_m^* \hat{f}_n \int_{-\pi}^{\pi} e^{i(n - m)\theta}\;\mathrm{d}\theta
  = 2\pi \sum_{m,n \in \mathbb{Z}} \hat{f}_m^* \hat{f}_n \delta_{mn}
  = 2\pi \sum_{n \in \mathbb{Z}} |\hat{f}_n|^2
\]
Note that this is not a fully rigorous proof, since we assumed not only that
the Fourier series converges to the function, but also that we could commute the
infinite sums. However, for the purposes of an applied course, this is sufficient.
Last time, we computed that the sawtooth function $f(\theta) = \theta$ has Fourier coefficients
\[
  \hat{f}_0 = 0, \quad \hat{f}_n = \frac{i(-1)^n}{n} \text{ for } n \neq 0.
\]
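We can confirm this formula directly. The sketch below (not part of the original notes; `sawtooth_coeff` is my own helper name) evaluates the defining integral exactly with sympy for a few values of $n$:

```python
from sympy import symbols, integrate, exp, I, pi, simplify

theta = symbols('theta', real=True)

def sawtooth_coeff(k):
    # f_hat_k = (1/2pi) * integral of theta * e^{-ik theta} over [-pi, pi],
    # evaluated exactly by sympy.
    return simplify(integrate(theta * exp(-I * k * theta), (theta, -pi, pi)) / (2 * pi))

for k in [1, 2, 3]:
    # Should reproduce i(-1)^k / k, i.e. -I, I/2, -I/3.
    print(k, sawtooth_coeff(k))
```

Each value agrees with $i(-1)^n/n$, matching the coefficients quoted above.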
But why do we care? It turns out this has some applications in number theory. You might have heard of the Riemann $\zeta$-function, defined by
\[
  \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.
\]
We will show that this obeys the property that for any $m$, $\zeta(2m) = \pi^{2m} q$ for some $q \in \mathbb{Q}$. This may not be obvious at first sight. So let's apply Parseval's theorem to the sawtooth defined by $f(\theta) = \theta$. By direct computation, we know that
\[
  (f, f) = \int_{-\pi}^{\pi} \theta^2\;\mathrm{d}\theta = \frac{2\pi^3}{3}.
\]
However, by Parseval's theorem, we know that
\[
  (f, f) = 2\pi \sum_{n \in \mathbb{Z}} |\hat{f}_n|^2 = 4\pi \sum_{n=1}^{\infty} \frac{1}{n^2}.
\]
Putting these together, we learn that
\[
  \sum_{n=1}^{\infty} \frac{1}{n^2} = \zeta(2) = \frac{\pi^2}{6}.
\]
We have just done it for the case where $m = 1$. But if we integrate the sawtooth function repeatedly, then we can get the general result for all $m$.
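As a quick numerical sanity check (a sketch, not part of the original notes), the partial sums of $\zeta(2)$ do indeed approach $\pi^2/6$:

```python
import math

# Partial sum of zeta(2) = sum of 1/n^2; the tail beyond N terms is of order 1/N.
N = 100000
partial = sum(1 / n**2 for n in range(1, N + 1))

print(partial, math.pi**2 / 6)   # both approximately 1.6449
```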
At this point, we might ask: why are we choosing these $e^{im\theta}$ as our basis? Surely there are many different bases we could use. For example, in finite dimensions, we can just arbitrarily choose random vectors (that are not linearly dependent) to get a basis. However, in practice, we don't pick them randomly. We often choose a basis that reveals the symmetry of a system. For example, if we have a rotationally symmetric system, we would like to use polar coordinates. Similarly, if we have periodic functions, then $e^{im\theta}$ is often a good choice of basis.