Part III: Modular Forms and L-functions
Based on lectures by A. J. Scholl
Notes taken by Dexter Chua
Lent 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
Modular Forms are classical objects that appear in many areas of mathematics, from
number theory to representation theory and mathematical physics. Most famous is,
of course, the role they played in the proof of Fermat’s Last Theorem, through the
conjecture of Shimura-Taniyama-Weil that elliptic curves are modular. One connection
between modular forms and arithmetic is through the medium of $L$-functions, the
basic example of which is the Riemann $\zeta$-function. We will discuss various types of
$L$-function in this course and give arithmetic applications.
Prerequisites
Prerequisites for the course are fairly modest; from number theory, apart from basic
elementary notions, some knowledge of quadratic fields is desirable. A fair chunk of the
course will involve (fairly 19th-century) analysis, so we will assume the basic theory of
holomorphic functions in one complex variable, such as are found in a first course on
complex analysis (e.g. the 2nd year Complex Analysis course of the Tripos).
Contents
0 Introduction
1 Some preliminary analysis
1.1 Characters of abelian groups
1.2 Fourier transforms
1.3 Mellin transform and Γ-function
2 Riemann ζ-function
3 Dirichlet L-functions
4 The modular group
5 Modular forms of level 1
5.1 Basic definitions
5.2 The space of modular forms
5.3 Arithmetic of ∆
6 Hecke operators
6.1 Hecke operators and algebras
6.2 Hecke operators on modular forms
7 L-functions of eigenforms
8 Modular forms for subgroups of $\mathrm{SL}_2(\mathbb{Z})$
8.1 Definitions
8.2 The Petersson inner product
8.3 Examples of modular forms
9 Hecke theory for $\Gamma_0(N)$
10 Modular forms and rep theory
0 Introduction
One of the big problems in number theory is the so-called Langlands programme,
which relates “arithmetic objects” such as representations of the Galois group
and elliptic curves over $\mathbb{Q}$ with “analytic objects” such as modular forms and,
more generally, automorphic forms and representations.
Example. $E: y^2 + y = x^3 - x^2$ is an elliptic curve, and we can associate to it the
function
\[ f(z) = q \prod_{n \geq 1} (1 - q^n)^2 (1 - q^{11n})^2 = \sum_{n=1}^\infty a_n q^n, \quad q = e^{2\pi i z}, \]
where we assume $\operatorname{Im} z > 0$, so that $|q| < 1$. The relation between these two
objects is that the number of points of $E$ over $\mathbb{F}_p$ (counting the point at infinity) is equal to $1 + p - a_p$, for
$p \neq 11$. This strange function $f$ is a modular form, and is actually cooked up
from the slightly easier function
\[ \eta(z) = q^{1/24} \prod_{n=1}^\infty (1 - q^n) \]
by
\[ f(z) = (\eta(z)\eta(11z))^2. \]
This function $\eta$ is called the Dedekind eta function, and is one of the simplest
examples of a modular form (in the sense that we can write it down easily).
It satisfies the following two identities:
\[ \eta(z + 1) = e^{i\pi/12}\eta(z), \quad \eta\left(-\frac{1}{z}\right) = \sqrt{\frac{z}{i}}\,\eta(z). \]
The first is clear, and the second takes some work to show. These transformation
laws are exactly what makes this thing a modular form.
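The point-count relation can be tested numerically. The following is a minimal sketch, assuming the curve is taken to be $y^2 + y = x^3 - x^2$ (the conductor-11 curve attached to this eta product); the function names `eta_product_coeffs` and `count_points` are made up for illustration. It expands the product as a truncated power series and counts points over $\mathbb{F}_p$ by brute force.

```python
def eta_product_coeffs(bound):
    """Return a list a with a[n] = coefficient of q^n in
    q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2, for 1 <= n <= bound."""
    # series[d] holds the coefficient of q^d in the product WITHOUT the leading q,
    # truncated at degree bound - 1.
    series = [0] * bound
    series[0] = 1
    for n in range(1, bound):
        for step in (n, 11 * n):
            if step >= bound:
                continue
            for _ in range(2):  # each factor appears squared
                # multiply in place by (1 - q^step), from high degree down
                for d in range(bound - 1, step - 1, -1):
                    series[d] -= series[d - step]
    return [0] + series  # shift by one to account for the leading q


def count_points(p):
    """Number of points on y^2 + y = x^3 - x^2 over F_p,
    including the single point at infinity."""
    count = 1
    for x in range(p):
        rhs = (x**3 - x**2) % p
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                count += 1
    return count


a = eta_product_coeffs(30)
for p in [2, 3, 5, 7, 13, 17, 19, 23, 29]:
    print(p, count_points(p), 1 + p - a[p])
```

The brute-force count is quadratic in $p$, which is fine for the small primes used here.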
Another way to link $E$ and $f$ is via the $L$-series
\[ L(E, s) = \sum_{n=1}^\infty \frac{a_n}{n^s}, \]
which is a generalization of the Riemann $\zeta$-function
\[ \zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}. \]
We are in fact not going to study elliptic curves, as there is another course
on that, but we are going to study the modular forms and these $L$-series. We are
going to do this in a fairly classical way, without using algebraic number theory.
1 Some preliminary analysis
1.1 Characters of abelian groups
When we were young, we were forced to take some “applied” courses, and learnt
about these beasts known as Fourier transforms. At that time, the hope was
that we can leave them for the engineers, and never meet them ever again.
Unfortunately, it turns out Fourier transforms are also important in “pure”
mathematics, and we must understand them well.
Let’s recall how Fourier transforms worked. We had two separate but closely
related notions. First of all, we can take the Fourier transform of a function
$f: \mathbb{R} \to \mathbb{C}$. The idea is that we wanted to write any such function as
\[ f(x) = \int_{-\infty}^\infty e^{2\pi i yx} \hat{f}(y)\, dy. \]
One way to think about this is that we are expanding $f$ in the basis $\chi_y(x) = e^{2\pi i yx}$.
We also could take the Fourier series of a periodic function, i.e. a function defined
on $\mathbb{R}/\mathbb{Z}$. In this case, we are trying to write our function as
\[ f(x) = \sum_{n=-\infty}^\infty c_n e^{2\pi i nx}. \]
In this case, we are expanding our function in the basis $\chi_n(x) = e^{2\pi i nx}$. What
is special about these bases $\{\chi_y\}$ and $\{\chi_n\}$?
We observe that $\mathbb{R}$ and $\mathbb{R}/\mathbb{Z}$ are not just topological spaces, but in fact
abelian topological groups. These $\chi_y$ and $\chi_n$ are not just functions to $\mathbb{C}$, but
continuous group homomorphisms to $\mathrm{U}(1) \subseteq \mathbb{C}$. In fact, these give all continuous
group homomorphisms from $\mathbb{R}$ and $\mathbb{R}/\mathbb{Z}$ to $\mathrm{U}(1)$.
Definition (Character). Let $G$ be an abelian topological group. A (unitary)
character of $G$ is a continuous homomorphism $\chi: G \to \mathrm{U}(1)$, where
$\mathrm{U}(1) = \{z \in \mathbb{C} : |z| = 1\}$.
To better understand Fourier transforms, we must understand characters,
and we shall spend some time doing so.
Example. For any group $G$, there is the trivial character $\chi_0(g) \equiv 1$.
Example. The product of two characters is a character.
Example. If $\chi$ is a character, then so is $\chi^*$, and $\chi\chi^* = 1$.
Thus, we see that the collection of all characters forms a group under multiplication.
Definition (Character group). Let $G$ be a group. The character group (or
Pontryagin dual) $\hat{G}$ is the group of all characters of $G$.
It is usually not hard to figure out what the character group is.
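Example. Let $G = \mathbb{R}$. For $y \in \mathbb{R}$, we let $\chi_y: \mathbb{R} \to \mathrm{U}(1)$ be
\[ \chi_y(x) = e^{2\pi i xy}. \]
For each $y \in \mathbb{R}$, this is a character, and all characters are of this form. So
$\hat{\mathbb{R}} \cong \mathbb{R}$ under this correspondence.
Example. Take $G = \mathbb{Z}$ with the discrete topology. A character is uniquely
determined by the image of $1$, and any element of $\mathrm{U}(1)$ can be the image of $1$.
So we have $\hat{G} \cong \mathrm{U}(1)$.
Example. Take $G = \mathbb{Z}/N\mathbb{Z}$. Then the character is again determined by the
image of $1$, and the allowed values are exactly the $N$th roots of unity. So
\[ \hat{G} \cong \mu_N = \{\zeta \in \mathbb{C}^\times : \zeta^N = 1\}. \]
Example. Let $G = G_1 \times G_2$. Then $\hat{G} \cong \hat{G}_1 \times \hat{G}_2$. So, for example,
$\widehat{\mathbb{R}^n} = \mathbb{R}^n$. Under this correspondence, a $y \in \mathbb{R}^n$ corresponds to
\[ \chi_y(x) = e^{2\pi i x \cdot y}. \]
Example. Take $G = \mathbb{R}^\times$. We have
\[ G \cong \{\pm 1\} \times \mathbb{R}^\times_{>0} \cong \{\pm 1\} \times \mathbb{R}, \]
where we have an isomorphism $\mathbb{R}^\times_{>0} \cong \mathbb{R}$ by the exponential map. So
we have
\[ \hat{G} \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{R}. \]
Explicitly, given $(\varepsilon, \sigma) \in \mathbb{Z}/2\mathbb{Z} \times \mathbb{R}$, the character is given by
\[ x \mapsto \operatorname{sgn}(x)^\varepsilon |x|^{i\sigma}. \]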
Note that $\hat{G}$ has a natural topology for which the evaluation maps
$(\chi \in \hat{G}) \mapsto \chi(g) \in \mathrm{U}(1)$ are continuous for all $g$. Moreover, evaluation gives us a
map $G \to \hat{\hat{G}}$.
Theorem (Pontryagin duality). If $G$ is locally compact,
then $G \to \hat{\hat{G}}$ is an isomorphism.
Since this is a course on number theory, and not topological groups, we will
not prove this.
Proposition. Let $G$ be a finite abelian group. Then $|\hat{G}| = |G|$, and $G$ and $\hat{G}$
are in fact isomorphic, but not canonically.
Proof. By the classification of finite abelian groups, we know $G$ is a product of
cyclic groups. So it suffices to prove the result for cyclic groups $\mathbb{Z}/N\mathbb{Z}$, and the
result is clear since
\[ \widehat{\mathbb{Z}/N\mathbb{Z}} = \mu_N \cong \mathbb{Z}/N\mathbb{Z}. \]
1.2 Fourier transforms
Equipped with the notion of characters, we can return to our original goal of
understanding Fourier transforms. We shall first recap the familiar definitions of
Fourier transforms in specific cases, and then come up with the definition of
Fourier transforms in full generality. In the meantime, we will get some pesky
analysis out of the way.
Definition (Fourier transform). Let $f: \mathbb{R} \to \mathbb{C}$ be an $L^1$ function, i.e.
$\int |f|\, dx < \infty$. The Fourier transform is
\[ \hat{f}(y) = \int_{-\infty}^\infty e^{-2\pi i xy} f(x)\, dx = \int_{-\infty}^\infty \chi_y(x)^{-1} f(x)\, dx. \]
This is a bounded and continuous function on $\mathbb{R}$.
We will see that the “correct” way to think about the Fourier transform is
to view it as a function on $\hat{\mathbb{R}}$ instead of $\mathbb{R}$.
In general, there is not much we can say about how well-behaved $\hat{f}$ will
be. In particular, we cannot expect the “Fourier inversion theorem” to hold for
general $L^1$ functions. If we like analysis, then we can figure out exactly how
much we need to assume about $\hat{f}$. But we don’t. We chicken out and only
consider functions that decay really fast at infinity. This makes our life much
easier.
Definition (Schwartz space). The Schwartz space is defined by
\[ \mathcal{S}(\mathbb{R}) = \{f \in C^\infty(\mathbb{R}) : x^n f^{(k)}(x) \to 0 \text{ as } x \to \pm\infty \text{ for all } k, n \geq 0\}. \]
Example. The function
\[ f(x) = e^{-\pi x^2} \]
is in the Schwartz space.
One can prove the following:
Proposition. If $f \in \mathcal{S}(\mathbb{R})$, then $\hat{f} \in \mathcal{S}(\mathbb{R})$, and the Fourier inversion formula
\[ \hat{\hat{f}}(x) = f(-x) \]
holds.
Everything carries over when we replace $\mathbb{R}$ with $\mathbb{R}^n$, as long as we promote
both $x$ and $y$ into vectors.
We can also take the Fourier transform of functions defined on $G = \mathbb{R}/\mathbb{Z}$.
For $n \in \mathbb{Z}$, we let $\chi_n \in \hat{G}$ by
\[ \chi_n(x) = e^{2\pi i nx}. \]
These are exactly all the elements of $\hat{G}$, and $\hat{G} \cong \mathbb{Z}$. We then define the Fourier
coefficients of a periodic function $f: \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ by
\[ c_n(f) = \int_0^1 e^{-2\pi i nx} f(x)\, dx = \int_{\mathbb{R}/\mathbb{Z}} \chi_n(x)^{-1} f(x)\, dx. \]
Again, under suitable regularity conditions on $f$, e.g. if $f \in C^\infty(\mathbb{R}/\mathbb{Z})$, we have
Proposition.
\[ f(x) = \sum_{n \in \mathbb{Z}} c_n(f) e^{2\pi i nx} = \sum_{n \in \mathbb{Z} \cong \hat{G}} c_n(f) \chi_n(x). \]
This is the Fourier inversion formula for $G = \mathbb{R}/\mathbb{Z}$.
Finally, in the case when $G = \mathbb{Z}/N\mathbb{Z}$, we can define
Definition (Discrete Fourier transform). Given a function $f: \mathbb{Z}/N\mathbb{Z} \to \mathbb{C}$, we
define the Fourier transform $\hat{f}: \mu_N \to \mathbb{C}$ by
\[ \hat{f}(\zeta) = \sum_{a \in \mathbb{Z}/N\mathbb{Z}} \zeta^{-a} f(a). \]
This time there aren’t convergence problems to worry about, so we can quickly
prove this result:
Proposition. For a function $f: \mathbb{Z}/N\mathbb{Z} \to \mathbb{C}$, we have
\[ f(x) = \frac{1}{N} \sum_{\zeta \in \mu_N} \zeta^x \hat{f}(\zeta). \]
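Proof. We see that both sides are linear in $f$, and we can write each function $f$ as
\[ f = \sum_{a \in \mathbb{Z}/N\mathbb{Z}} f(a) \delta_a, \]
where
\[ \delta_a(x) = \begin{cases} 1 & x = a \\ 0 & x \neq a \end{cases}. \]
So we wlog $f = \delta_a$. Thus we have
\[ \hat{f}(\zeta) = \zeta^{-a}, \]
and the RHS is
\[ \frac{1}{N} \sum_{\zeta \in \mu_N} \zeta^{x - a}. \]
We now note the fact that
\[ \sum_{\zeta \in \mu_N} \zeta^k = \begin{cases} N & k \equiv 0 \pmod{N} \\ 0 & \text{otherwise} \end{cases}. \]
So we know that the RHS is equal to $\delta_a(x)$, as desired.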
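The inversion formula on $\mathbb{Z}/N\mathbb{Z}$ is easy to check by machine. Here is a minimal sketch, assuming we index $\mu_N$ by $k \mapsto e^{2\pi i k/N}$ and store $f$ as a list of its $N$ values; the names `dft` and `inverse_dft` are illustrative only.

```python
import cmath


def dft(f_values):
    """fhat[k] = sum_a zeta_k^(-a) f(a), where zeta_k = exp(2 pi i k / N)."""
    N = len(f_values)
    return [
        sum(cmath.exp(-2j * cmath.pi * k * a / N) * f_values[a] for a in range(N))
        for k in range(N)
    ]


def inverse_dft(fhat):
    """f(x) = (1/N) sum_k zeta_k^x fhat[k], the inversion formula above."""
    N = len(fhat)
    return [
        sum(cmath.exp(2j * cmath.pi * k * x / N) * fhat[k] for k in range(N)) / N
        for x in range(N)
    ]


f = [1.0, 2.0, 0.0, -1.0, 3.5]
recovered = inverse_dft(dft(f))
print(max(abs(u - v) for u, v in zip(f, recovered)))  # should be tiny
```

This is the naive $O(N^2)$ sum, written to mirror the formula rather than to be fast.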
It is now relatively clear what the general picture should be, except that we
need a way to integrate functions defined on an abelian group. Since we are not
doing analysis, we shall not be very precise about what we mean:
Definition (Haar measure). Let $G$ be a topological group. A Haar measure
is a left translation-invariant Borel measure on $G$ satisfying some regularity
conditions (e.g. being finite on compact sets).
Theorem. Let $G$ be a locally compact abelian group. Then there is a Haar
measure on $G$, unique up to scaling.
Example. On $G = \mathbb{R}$, the Haar measure is the usual Lebesgue measure.
Example. If $G$ is discrete, then the Haar measure is the counting measure, so
that
\[ \int f\, dg = \sum_{g \in G} f(g). \]
Example. If $G = \mathbb{R}^\times_{>0}$, then the integral given by the Haar measure is
\[ \int f(x)\, \frac{dx}{x}, \]
since $\frac{dx}{x}$ is invariant under multiplication of $x$ by a constant.
Now we can define the general Fourier transform.
Definition (Fourier transform). Let $G$ be a locally compact abelian group with
a Haar measure $dg$, and let $f: G \to \mathbb{C}$ be a continuous $L^1$ function. The Fourier
transform $\hat{f}: \hat{G} \to \mathbb{C}$ is given by
\[ \hat{f}(\chi) = \int_G \chi(g)^{-1} f(g)\, dg. \]
It is possible to prove the following theorem:
Theorem (Fourier inversion theorem). Given a locally compact abelian group $G$
with a fixed Haar measure, there is some constant $C$ such that for “suitable”
$f: G \to \mathbb{C}$, we have
\[ \hat{\hat{f}}(g) = C f(-g), \]
using the canonical isomorphism $G \to \hat{\hat{G}}$.
This constant is necessary, because the measure is only defined up to a
multiplicative constant.
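One of the most important results of this investigation is the following:
Theorem (Poisson summation formula). Let $f \in \mathcal{S}(\mathbb{R}^n)$. Then
\[ \sum_{a \in \mathbb{Z}^n} f(a) = \sum_{b \in \mathbb{Z}^n} \hat{f}(b). \]
Proof. Let
\[ g(x) = \sum_{a \in \mathbb{Z}^n} f(x + a). \]
This is now a function that is invariant under translation by $\mathbb{Z}^n$. It is easy to
check this is a well-defined $C^\infty$ function on $\mathbb{R}^n/\mathbb{Z}^n$, and so has a Fourier series.
We write
\[ g(x) = \sum_{b \in \mathbb{Z}^n} c_b(g) e^{2\pi i b \cdot x}, \]
with
\[ c_b(g) = \int_{\mathbb{R}^n/\mathbb{Z}^n} e^{-2\pi i b \cdot x} g(x)\, dx = \sum_{a \in \mathbb{Z}^n} \int_{[0,1]^n} e^{-2\pi i b \cdot x} f(x + a)\, dx. \]
We can then do a change of variables $x \mapsto x - a$, which does not change the
exponential term since $b \cdot a \in \mathbb{Z}$, and get that
\[ c_b(g) = \int_{\mathbb{R}^n} e^{-2\pi i b \cdot x} f(x)\, dx = \hat{f}(b). \]
Finally, we have
\[ \sum_{a \in \mathbb{Z}^n} f(a) = g(0) = \sum_{b \in \mathbb{Z}^n} c_b(g) = \sum_{b \in \mathbb{Z}^n} \hat{f}(b). \]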
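As a sanity check on the Poisson summation formula, one can try it on a shifted Gaussian. The sketch below assumes $f(x) = e^{-\pi(x - a)^2}$ on $\mathbb{R}$, whose Fourier transform (by the translation rule and $\hat{g}_1 = g_1$ for $g_1(x) = e^{-\pi x^2}$) is $\hat{f}(y) = e^{-2\pi i a y} e^{-\pi y^2}$. The function names and the truncation at 50 terms are choices made here for illustration.

```python
import cmath
import math


def sum_of_f(a, terms=50):
    """Truncated sum over n in Z of f(n) = exp(-pi (n - a)^2)."""
    return sum(math.exp(-math.pi * (n - a) ** 2) for n in range(-terms, terms + 1))


def sum_of_fhat(a, terms=50):
    """Truncated sum over b in Z of fhat(b) = exp(-2 pi i a b) exp(-pi b^2)."""
    return sum(
        cmath.exp(-2j * math.pi * a * b) * math.exp(-math.pi * b * b)
        for b in range(-terms, terms + 1)
    )


shift = 0.3
print(sum_of_f(shift), sum_of_fhat(shift))  # the two sides should agree
```

Both sums converge extremely fast, so the truncation error is far below floating-point precision.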
1.3 Mellin transform and Γ-function
We unfortunately have a bit more analysis to do, which we will use a lot later
on. This is the Mellin transform.
Definition (Mellin transform). Let $f: \mathbb{R}_{>0} \to \mathbb{C}$ be a function. We define
\[ M(f, s) = \int_0^\infty y^s f(y)\, \frac{dy}{y}, \]
whenever this converges.
We want to think of this as an analytic function of $s$. The following lemma
tells us when we can actually do so:
Lemma. Suppose $f: \mathbb{R}_{>0} \to \mathbb{C}$ is such that
– $y^N f(y) \to 0$ as $y \to \infty$ for all $N \in \mathbb{Z}$
– there exists $m$ such that $|y^m f(y)|$ is bounded as $y \to 0$
Then $M(f, s)$ converges and is an analytic function of $s$ for $\operatorname{Re}(s) > m$.
The conditions say $f$ is rapidly decreasing at $\infty$ and has moderate growth at $0$.
Proof. We know that for any $0 < r < R < \infty$, the integral
\[ \int_r^R y^s f(y)\, \frac{dy}{y} \]
is analytic for all $s$ since $f$ is continuous.
By assumption, we know $\int_R^\infty \to 0$ as $R \to \infty$ uniformly on compact subsets
of $\mathbb{C}$. So we know
\[ \int_r^\infty y^s f(y)\, \frac{dy}{y} \]
converges uniformly on compact subsets of $\mathbb{C}$.
On the other hand, the integral $\int_0^r$ converges uniformly, as $r \to 0$, on compact
subsets of $\{s \in \mathbb{C} : \operatorname{Re}(s) > m\}$ by the other assumption. So the result
follows.
This transform might seem a bit strange, but we can think of this as an
analytic continuation of the Fourier transform.
Example. Suppose we are in the rather good situation that
\[ \int_0^\infty |f|\, \frac{dy}{y} < \infty. \]
In practice, this will hardly ever be the case, but this is a good place to start
exploring. In this case, the integral actually converges on $i\mathbb{R}$, and equals the
Fourier transform of $f \in L^1(G) = L^1(\mathbb{R}^\times_{>0})$. Indeed, we find
\[ \hat{G} = \{y \mapsto y^{i\sigma} : \sigma \in \mathbb{R}\} \cong \mathbb{R}, \]
and $\frac{dy}{y}$ is just the invariant measure on $G$. So the formula for the Mellin
transform is exactly the formula for the Fourier transform, and we can view the
Mellin transform as an analytic continuation of the Fourier transform.
We now move on to explore properties of the Mellin transform. When we
make a change of variables y ↔ αy, by inspection of the formula, we find
Proposition.
\[ M(f(\alpha y), s) = \alpha^{-s} M(f, s) \]
for $\alpha > 0$.
The following is a very important example of the Mellin transform:
Definition ($\Gamma$ function). The $\Gamma$ function is the Mellin transform of
\[ f(y) = e^{-y}. \]
Explicitly, we have
\[ \Gamma(s) = \int_0^\infty e^{-y} y^s\, \frac{dy}{y}. \]
By general theory, we know this is analytic for $\operatorname{Re}(s) > 0$.
If we just integrate by parts, we find
\[ \Gamma(s) = \int_0^\infty e^{-y} y^{s-1}\, dy = \left[e^{-y} \frac{y^s}{s}\right]_0^\infty + \frac{1}{s} \int_0^\infty e^{-y} y^s\, dy = \frac{1}{s}\Gamma(s + 1). \]
So we find that
Proposition.
\[ s\Gamma(s) = \Gamma(s + 1). \]
Moreover, we can compute
\[ \Gamma(1) = \int_0^\infty e^{-y}\, dy = 1. \]
So we get
Proposition. For an integer $n \geq 1$, we have
\[ \Gamma(n) = (n - 1)!. \]
In general, iterating the above formula, we find
\[ \Gamma(s) = \frac{1}{s(s + 1) \cdots (s + N - 1)} \Gamma(s + N). \]
Note that the right hand side makes sense for $\operatorname{Re}(s) > -N$ (except at
non-positive integer points). So this allows us to extend $\Gamma(s)$ to a meromorphic
function on $\{\operatorname{Re}(s) > -N\}$, with simple poles at $0, -1, \ldots, 1 - N$ of residues
\[ \operatorname{res}_{s = 1 - N} \Gamma(s) = \frac{(-1)^{N-1}}{(N - 1)!}. \]
Of course, since $N$ was arbitrary, we know $\Gamma(s)$ extends to a meromorphic
function on $\mathbb{C}$, holomorphic on $\mathbb{C} \setminus \mathbb{Z}_{\leq 0}$.
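The continuation formula can be compared against a library implementation. The following is a small sketch for real, non-integer $s$: it evaluates $\Gamma(s + N)/(s(s+1)\cdots(s+N-1))$ and compares with Python's `math.gamma`, which already accepts negative non-integers. The helper name `gamma_continued` is made up here, and the residue check is only a finite-difference approximation.

```python
import math


def gamma_continued(s, N):
    """Gamma(s) for real non-integer s > -N, via
    Gamma(s) = Gamma(s + N) / (s (s+1) ... (s+N-1))."""
    denominator = 1.0
    for k in range(N):
        denominator *= s + k
    return math.gamma(s + N) / denominator


# Compare with the library value at a point left of the original domain.
print(gamma_continued(-2.5, 3), math.gamma(-2.5))

# Approximate the residue at s = 1 - N for N = 3, i.e. at s = -2;
# the formula above predicts (-1)^(N-1) / (N-1)! = 1/2.
eps = 1e-6
print(eps * math.gamma(-2 + eps))
```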
Here are two facts about the Γ function that we are not going to prove,
because, even if the current experience might suggest otherwise, this is not an
analysis course.
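Proposition.
(i) The Weierstrass product: We have
\[ \Gamma(s)^{-1} = e^{\gamma s} s \prod_{n \geq 1} \left(1 + \frac{s}{n}\right) e^{-s/n} \]
for all $s \in \mathbb{C}$. In particular, $\Gamma(s)$ is never zero. Here $\gamma$ is the
Euler-Mascheroni constant, given by
\[ \gamma = \lim_{n \to \infty} \left(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n\right). \]
(ii) Duplication and reflection formulae:
\[ \pi^{1/2} \Gamma(2s) = 2^{2s - 1} \Gamma(s) \Gamma\left(s + \frac{1}{2}\right) \]
and
\[ \Gamma(s)\Gamma(1 - s) = \frac{\pi}{\sin \pi s}. \]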
The main reason why we care about the Mellin transform in this course is
that a lot of Dirichlet series are actually Mellin transforms of some functions.
Suppose we have a Dirichlet series
\[ \sum_{n=1}^\infty \frac{a_n}{n^s}, \]
where the $a_n$ grow not too quickly. Then we can write
\[ (2\pi)^{-s} \Gamma(s) \sum_{n=1}^\infty \frac{a_n}{n^s} = \sum_{n=1}^\infty a_n (2\pi n)^{-s} M(e^{-y}, s) = \sum_{n=1}^\infty a_n M(e^{-2\pi n y}, s) = M(f, s), \]
where we set
\[ f(y) = \sum_{n \geq 1} a_n e^{-2\pi n y}. \]
Since we know about the analytic properties of the Γ function, by understanding
M(f, s), we can deduce useful properties about the Dirichlet series itself.
2 Riemann ζ-function
We ended the previous chapter by briefly mentioning Dirichlet series. The first
and simplest example one can write down is the Riemann $\zeta$-function.
Definition (Riemann $\zeta$-function). The Riemann $\zeta$-function is defined by
\[ \zeta(s) = \sum_{n \geq 1} \frac{1}{n^s} \]
for $\operatorname{Re}(s) > 1$.
This $\zeta$-function is related to prime numbers by the following famous formula:
Proposition (Euler product formula). We have
\[ \zeta(s) = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}. \]
Proof. Euler’s proof was purely formal, without worrying about convergence.
We simply note that
\[ \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} = \prod_p (1 + p^{-s} + (p^2)^{-s} + \cdots) = \sum_{n \geq 1} n^{-s}, \]
where the last equality follows by unique factorization in $\mathbb{Z}$. However, to prove
this properly, we need to be a bit more careful and make sure things converge.
Saying the infinite product $\prod_p$ converges is the same as saying $\sum_p |p^{-s}|$
converges, by basic analysis, which is okay since we know $\zeta(s)$ converges absolutely
when $\operatorname{Re}(s) > 1$. Then we can look at the difference
\[ \zeta(s) - \prod_{p \leq X} \frac{1}{1 - p^{-s}} = \zeta(s) - \prod_{p \leq X} (1 + p^{-s} + p^{-2s} + \cdots) = \sum_{n \in \mathcal{N}_X} n^{-s}, \]
where $\mathcal{N}_X$ is the set of all $n \geq 1$ such that at least one prime factor is $> X$. In
particular, we know
\[ \left|\zeta(s) - \prod_{p \leq X} \frac{1}{1 - p^{-s}}\right| \leq \sum_{n \geq X} |n^{-s}| \to 0 \]
as $X \to \infty$. So the result follows.
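To see the convergence of the truncated Euler product concretely, here is a short sketch at $s = 2$, where the classical value $\zeta(2) = \pi^2/6$ is available for comparison. The helper names `primes_up_to` and `euler_product` and the cutoff $X = 10000$ are choices made for this illustration.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n (for n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def euler_product(s, X):
    """Truncated Euler product: product over primes p <= X of 1 / (1 - p^(-s))."""
    product = 1.0
    for p in primes_up_to(X):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


print(euler_product(2, 10000), math.pi**2 / 6)
```

The truncated product approaches $\zeta(2)$ from below, since every omitted factor is greater than 1.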
The Euler product formula is the beginning of the connection between the
$\zeta$-function and the distribution of primes. For example, as the product converges
for $\operatorname{Re}(s) > 1$, we know in particular that $\zeta(s) \neq 0$ for all $s$ when $\operatorname{Re}(s) > 1$.
Whether or not $\zeta(s)$ vanishes elsewhere is a less straightforward matter, and
this involves quite a lot of number theory.
We will, however, not care about primes in this course. Instead, we look at
some further analytic properties of the $\zeta$-function. To do so, we first write it as
a Mellin transform.
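Theorem. If $\operatorname{Re}(s) > 1$, then
\[ (2\pi)^{-s} \Gamma(s) \zeta(s) = \int_0^\infty \frac{y^s}{e^{2\pi y} - 1}\, \frac{dy}{y} = M(f, s), \]
where
\[ f(y) = \frac{1}{e^{2\pi y} - 1}. \]
This is just a simple computation.
Proof. We can write
\[ f(y) = \frac{e^{-2\pi y}}{1 - e^{-2\pi y}} = \sum_{n \geq 1} e^{-2\pi n y} \]
for $y > 0$.
As $y \to 0$, we find
\[ f(y) \sim \frac{1}{2\pi y}. \]
So when $\operatorname{Re}(s) > 1$, the Mellin transform converges, and equals
\[ \sum_{n \geq 1} M(e^{-2\pi n y}, s) = \sum_{n \geq 1} (2\pi n)^{-s} M(e^{-y}, s) = (2\pi)^{-s} \Gamma(s) \zeta(s). \]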
Corollary. $\zeta(s)$ has a meromorphic continuation to $\mathbb{C}$ with a simple pole at
$s = 1$ as its only singularity, and
\[ \operatorname{res}_{s = 1} \zeta(s) = 1. \]
Proof. We can write
\[ M(f, s) = M_0 + M_\infty = \left(\int_0^1 + \int_1^\infty\right) \frac{y^s}{e^{2\pi y} - 1}\, \frac{dy}{y}. \]
The second integral $M_\infty$ is convergent for all $s \in \mathbb{C}$, hence defines a holomorphic
function.
For any fixed $N$, we can expand
\[ f(y) = \sum_{n = -1}^{N - 1} c_n y^n + y^N g_N(y) \]
for some $g_N \in C^\infty(\mathbb{R})$, as $f$ has a simple pole at $y = 0$, and
\[ c_{-1} = \frac{1}{2\pi}. \]
So for $\operatorname{Re}(s) > 1$, we have
\[ M_0 = \sum_{n = -1}^{N - 1} c_n \int_0^1 y^{n + s - 1}\, dy + \int_0^1 y^{N + s - 1} g_N(y)\, dy = \sum_{n = -1}^{N - 1} \frac{c_n}{s + n} + \int_0^1 g_N(y) y^{s + N - 1}\, dy. \]
We now notice that this formula makes sense for $\operatorname{Re}(s) > -N$. Thus we have
found a meromorphic continuation of
\[ (2\pi)^{-s} \Gamma(s) \zeta(s) \]
to $\{\operatorname{Re}(s) > -N\}$, with at worst simple poles at $s = 1 - N, 2 - N, \ldots, 0, 1$. Also,
we know $\Gamma(s)$ has a simple pole at $s = 0, -1, -2, \ldots$. So $\zeta(s)$ is analytic at
$s = 0, -1, -2, \ldots$. Since $c_{-1} = \frac{1}{2\pi}$ and $\Gamma(1) = 1$, we get
\[ \operatorname{res}_{s = 1} \zeta(s) = 1. \]
Now we note that by the Euler product formula, if there are only finitely
many primes, then ζ(s) is certainly analytic everywhere. So we deduce
Corollary. There are infinitely many primes.
Given a function $\zeta$, it is natural to ask what values it takes. In particular,
we might ask what values it takes at integers. There are many theorems and
conjectures concerning the values at integers of $L$-functions (which are Dirichlet
series like the $\zeta$-function). These properties relate to subtle number-theoretic
quantities. For example, the values of $\zeta(s)$ at negative integers are closely
related to the class numbers of the cyclotomic fields $\mathbb{Q}(\zeta_p)$. These are also
related to early (partial) proofs of Fermat’s last theorem, and things like the
Birch-Swinnerton-Dyer conjecture on elliptic curves.
We shall take a tiny step by figuring out the values of $\zeta(s)$ at negative integers.
They are given by the Bernoulli numbers.
Definition (Bernoulli numbers). The Bernoulli numbers are defined by a
generating function
\[ \sum_{n=0}^\infty B_n \frac{t^n}{n!} = \frac{t}{e^t - 1} = \left(1 + \frac{t}{2!} + \frac{t^2}{3!} + \cdots\right)^{-1}. \]
Clearly, all Bernoulli numbers are rational. We can directly compute
\[ B_0 = 1, \quad B_1 = -\frac{1}{2}, \quad \ldots. \]
The first thing to note about this is the following:
Proposition. $B_n = 0$ if $n$ is odd and $n \geq 3$.
Proof. Consider
\[ f(t) = \frac{t}{e^t - 1} + \frac{t}{2} = \sum_{n \geq 0, n \neq 1} B_n \frac{t^n}{n!}. \]
We find that
\[ f(t) = \frac{t}{2} \cdot \frac{e^t + 1}{e^t - 1} = f(-t). \]
So all the odd coefficients must vanish.
Corollary. We have
\[ \zeta(0) = B_1 = -\frac{1}{2}, \quad \zeta(1 - n) = -\frac{B_n}{n} \]
for $n > 1$. In particular, for all integers $n \geq 1$, we know $\zeta(1 - n) \in \mathbb{Q}$ and vanishes
if $n > 1$ is odd.
Proof. We know
\[ (2\pi)^{-s} \Gamma(s) \zeta(s) \]
has a simple pole at $s = 1 - n$, and the residue is $c_{n-1}$, where
\[ \frac{1}{e^{2\pi y} - 1} = \sum_{n \geq -1} c_n y^n. \]
So we know
\[ c_{n-1} = (2\pi)^{n-1} \frac{B_n}{n!}. \]
Since we also know that
\[ \operatorname{res}_{s = 1 - n} \Gamma(s) = \frac{(-1)^{n-1}}{(n - 1)!}, \]
we get that
\[ \zeta(1 - n) = (-1)^{n-1} \frac{B_n}{n}. \]
If $n = 1$, then this gives $-\frac{1}{2}$. If $n$ is odd but $> 1$, then this vanishes. If $n$ is even,
then this is $-\frac{B_n}{n}$, as desired.
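These values are easy to tabulate exactly. The sketch below computes Bernoulli numbers with rational arithmetic, using the standard recurrence $\sum_{k=0}^{n} \binom{n+1}{k} B_k = 0$ for $n \geq 1$ (which follows from multiplying the generating function by $e^t - 1$), and then evaluates $\zeta(1 - n) = (-1)^{n-1} B_n / n$. The function names are illustrative, and `math.comb` assumes Python 3.8 or later.

```python
from fractions import Fraction
from math import comb


def bernoulli(n_max):
    """Return [B_0, ..., B_{n_max}] for the generating function t / (e^t - 1)."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # Solve sum_{k=0}^{n} C(n+1, k) B_k = 0 for B_n.
        partial = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-partial / (n + 1))
    return B


def zeta_at_one_minus(n, B):
    """zeta(1 - n) = (-1)^(n-1) B_n / n, for an integer n >= 1."""
    return (-1) ** (n - 1) * B[n] / n


B = bernoulli(12)
for n in range(1, 9):
    print(f"zeta({1 - n}) = {zeta_at_one_minus(n, B)}")
```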
To end our discussion on the $\zeta$-function, we shall prove a functional equation,
relating $\zeta(s)$ to $\zeta(1 - s)$. To do so, we relate the $\zeta$-function to another Mellin
transform. We define
\[ \Theta(y) = \sum_{n \in \mathbb{Z}} e^{-\pi n^2 y} = 1 + 2 \sum_{n \geq 1} e^{-\pi n^2 y}. \]
This is convergent for $y > 0$. So we can write
\[ \Theta(y) = \vartheta(iy), \]
where
\[ \vartheta(z) = \sum_{n \in \mathbb{Z}} e^{\pi i n^2 z}, \]
which is analytic for $|e^{\pi i z}| < 1$, i.e. $\operatorname{Im}(z) > 0$. This is Jacobi’s $\vartheta$-function. This
function is also important in algebraic geometry, representation theory, and even
applied mathematics. But we will just use it for number theory. We note that
$\Theta(y) \to 1$ as $y \to \infty$, so we can’t take its Mellin transform. What we can do is
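Proposition.
\[ M\left(\frac{\Theta(y) - 1}{2}, \frac{s}{2}\right) = \pi^{-s/2} \Gamma\left(\frac{s}{2}\right) \zeta(s). \]
The proof is again a direct computation.
Proof. The left hand side is
\[ \sum_{n \geq 1} M\left(e^{-\pi n^2 y}, \frac{s}{2}\right) = \sum_{n \geq 1} (\pi n^2)^{-s/2} M\left(e^{-y}, \frac{s}{2}\right) = \pi^{-s/2} \Gamma\left(\frac{s}{2}\right) \zeta(s). \]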
To produce a functional equation for ζ, we first do it for Θ.
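Theorem (Functional equation for $\Theta$-function). If $y > 0$, then
\[ \Theta\left(\frac{1}{y}\right) = y^{1/2} \Theta(y), \quad (*) \]
where we take the positive square root. More generally, taking the branch of $\sqrt{\cdot}$
which is positive real on the positive real axis, we have
\[ \vartheta\left(-\frac{1}{z}\right) = \left(\frac{z}{i}\right)^{1/2} \vartheta(z). \]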
The proof is rather magical.
Proof. By analytic continuation, it suffices to prove $(*)$. Let
\[ g_t(x) = e^{-\pi t x^2} = g_1(t^{1/2} x). \]
In particular,
\[ g_1(x) = e^{-\pi x^2}. \]
Now recall that $\hat{g}_1 = g_1$. Moreover, for $\alpha > 0$, the Fourier transform of $f(\alpha x)$ is $\frac{1}{\alpha} \hat{f}(y/\alpha)$.
So
\[ \hat{g}_t(y) = t^{-1/2} \hat{g}_1(t^{-1/2} y) = t^{-1/2} g_1(t^{-1/2} y) = t^{-1/2} e^{-\pi y^2/t}. \]
We now apply the Poisson summation formula:
\[ \Theta(t) = \sum_{n \in \mathbb{Z}} e^{-\pi n^2 t} = \sum_{n \in \mathbb{Z}} g_t(n) = \sum_{n \in \mathbb{Z}} \hat{g}_t(n) = t^{-1/2} \Theta(1/t). \]
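The functional equation $(*)$ lends itself to a quick numerical check. The sketch below truncates the series for $\Theta$ at 100 terms (an arbitrary cutoff chosen here; the terms decay like $e^{-\pi n^2 y}$, so this is far more than needed for the sample values of $y$) and compares both sides.

```python
import math


def theta(y, terms=100):
    """Truncated Theta(y) = 1 + 2 * sum_{n>=1} exp(-pi n^2 y), for y > 0."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * y) for n in range(1, terms + 1))


for y in (0.5, 2.0, 3.7):
    print(y, theta(1.0 / y), math.sqrt(y) * theta(y))  # the two columns should agree
```

Note how the equation trades slow convergence for small $y$ against fast convergence for large $y$, which is exactly how it is used in the proof of the functional equation for $\zeta$.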
Before we continue, we notice that most of the time, when we talk about the
$\Gamma$-function, there are factors of $\pi$ floating around. So we can conveniently set
Notation.
\[ \Gamma_{\mathbb{R}}(s) = \pi^{-s/2} \Gamma(s/2), \quad \Gamma_{\mathbb{C}}(s) = 2(2\pi)^{-s} \Gamma(s). \]
These are the real/complex $\Gamma$-factors.
We also define
Notation.
\[ Z(s) \equiv \Gamma_{\mathbb{R}}(s) \zeta(s) = \pi^{-s/2} \Gamma\left(\frac{s}{2}\right) \zeta(s). \]
The theorem is then
Theorem (Functional equation for $\zeta$-function).
\[ Z(s) = Z(1 - s). \]
Moreover, $Z(s)$ is meromorphic, with only poles at $s = 1$ and $0$.
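Proof. For $\operatorname{Re}(s) > 1$, we have
\[ 2Z(s) = M\left(\Theta(y) - 1, \frac{s}{2}\right) = \int_0^\infty [\Theta(y) - 1] y^{s/2}\, \frac{dy}{y} = \left(\int_0^1 + \int_1^\infty\right) [\Theta(y) - 1] y^{s/2}\, \frac{dy}{y}. \]
The idea is that using the functional equation for the $\Theta$-function, we can relate
the $\int_0^1$ part and the $\int_1^\infty$ part. We have
\[ \int_0^1 (\Theta(y) - 1) y^{s/2}\, \frac{dy}{y} = \int_0^1 (\Theta(y) - y^{-1/2}) y^{s/2}\, \frac{dy}{y} + \int_0^1 \left(y^{\frac{s-1}{2}} - y^{s/2}\right) \frac{dy}{y} \]
\[ = \int_0^1 (y^{-1/2} \Theta(1/y) - y^{-1/2}) y^{s/2}\, \frac{dy}{y} + \frac{2}{s - 1} - \frac{2}{s}. \]
In the first term, we change variables $y \leftrightarrow 1/y$, and get
\[ = \int_1^\infty y^{1/2} (\Theta(y) - 1) y^{-s/2}\, \frac{dy}{y} + \frac{2}{s - 1} - \frac{2}{s}. \]
So we find that
\[ 2Z(s) = \int_1^\infty (\Theta(y) - 1)\left(y^{s/2} + y^{\frac{1-s}{2}}\right) \frac{dy}{y} + \frac{2}{s - 1} - \frac{2}{s} = 2Z(1 - s), \]
since the right hand side is visibly unchanged under $s \leftrightarrow 1 - s$.
Note that what we’ve done by separating out the $y^{\frac{s-1}{2}} - y^{s/2}$ term is that we
separated out the two poles of our function.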
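The integral formula obtained in the proof gives a practical way to evaluate $Z(s)$ for real $s \neq 0, 1$. The following sketch integrates over $[1, 30]$ with Simpson's rule (the cutoff, the step count and the function names are choices made here, not part of the notes) and compares with the known values $Z(2) = \pi^{-1}\Gamma(1)\zeta(2) = \pi/6$ and $Z(-1) = \pi^{1/2}\Gamma(-1/2)\zeta(-1)$ with $\zeta(-1) = -1/12$.

```python
import math


def theta_minus_one(y, terms=50):
    """Theta(y) - 1 = 2 * sum_{n>=1} exp(-pi n^2 y), truncated."""
    return 2.0 * sum(math.exp(-math.pi * n * n * y) for n in range(1, terms + 1))


def completed_zeta(s, upper=30.0, steps=6000):
    """Z(s) for real s not in {0, 1}, using
    2 Z(s) = int_1^inf (Theta(y) - 1)(y^(s/2) + y^((1-s)/2)) dy/y + 2/(s-1) - 2/s.
    The integral is cut off at `upper` and computed by Simpson's rule (steps even)."""

    def integrand(y):
        return theta_minus_one(y) * (y ** (s / 2) + y ** ((1 - s) / 2)) / y

    width = (upper - 1.0) / steps
    total = integrand(1.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(1.0 + i * width)
    integral = total * width / 3
    return (integral + 2 / (s - 1) - 2 / s) / 2


print(completed_zeta(2.0), math.pi / 6)
print(completed_zeta(-1.0), math.sqrt(math.pi) * math.gamma(-0.5) * (-1 / 12))
```

Because the integrand decays like $e^{-\pi y}$, truncating at $y = 30$ loses nothing visible in double precision.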
Later on, we will come across more $L$-functions, and we will prove functional
equations in the same way.
Note that we can write
\[ Z(s) = \Gamma_{\mathbb{R}}(s) \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}, \]
and the term $\Gamma_{\mathbb{R}}(s)$ should be thought of as the Euler factor for $p = \infty$, i.e. the
Archimedean valuation on $\mathbb{Q}$.
3 Dirichlet L-functions
We now move on to study a significant generalization of $\zeta$-functions, namely
Dirichlet $L$-functions. While these are generalizations of the $\zeta$-function, it turns
out the $\zeta$-function is a very particular kind of $L$-function. For example, most
$L$-functions are actually analytic on all of $\mathbb{C}$, except for (finite multiples of) the
$\zeta$-function.
Recall that a Dirichlet series is a series of the form
\[ \sum_{n=1}^\infty \frac{a_n}{n^s}. \]
A Dirichlet $L$-function is a Dirichlet series whose coefficients come from Dirichlet
characters.
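Definition (Dirichlet characters). Let $N \geq 1$. A Dirichlet character mod $N$ is
a character $\chi: (\mathbb{Z}/N\mathbb{Z})^\times \to \mathbb{C}^\times$.
As before, we write $\widehat{(\mathbb{Z}/N\mathbb{Z})^\times}$ for the group of characters.
Note that in the special case $N = 1$, we have
\[ \mathbb{Z}/N\mathbb{Z} = \{0 = 1\} = (\mathbb{Z}/N\mathbb{Z})^\times, \]
and so $\widehat{(\mathbb{Z}/N\mathbb{Z})^\times} \cong \{1\}$, and the only Dirichlet character is identically $1$.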
Not all characters are equal. Some are less exciting than others. Suppose $\chi$
is a character mod $N$, and we have some integer $d > 1$. Then we have the
reduction mod $N$ map
\[ (\mathbb{Z}/Nd\mathbb{Z})^\times \to (\mathbb{Z}/N\mathbb{Z})^\times, \]
and we can compose $\chi$ with this to get a character mod $Nd$. This is a rather
boring character, because its value at $x \in (\mathbb{Z}/Nd\mathbb{Z})^\times$ only depends on the value
of $x$ mod $N$.
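Definition (Primitive character). We say a character $\chi \in \widehat{(\mathbb{Z}/N\mathbb{Z})^\times}$ is primitive
if there is no $M < N$ with $M \mid N$ and $\chi' \in \widehat{(\mathbb{Z}/M\mathbb{Z})^\times}$ such that
\[ \chi = \chi' \circ (\text{reduction mod } M). \]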
Similarly we define
Definition (Equivalent characters). We say characters $\chi_1 \in \widehat{(\mathbb{Z}/N_1\mathbb{Z})^\times}$
and $\chi_2 \in \widehat{(\mathbb{Z}/N_2\mathbb{Z})^\times}$ are equivalent if for all $x \in \mathbb{Z}$ such that $(x, N_1 N_2) = 1$, we have
\[ \chi_1(x \bmod N_1) = \chi_2(x \bmod N_2). \]
It is clear that if we produce a new character mod $Nd$ from an old one mod $N$ by
composing with the reduction map, then they are equivalent.
One can show the following:
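Proposition. If $\chi \in \widehat{(\mathbb{Z}/N\mathbb{Z})^\times}$, then there exists a unique $M \mid N$ and a primitive
$\chi^* \in \widehat{(\mathbb{Z}/M\mathbb{Z})^\times}$ that is equivalent to $\chi$.
Definition (Conductor). The conductor of a character $\chi$ is the unique $M \mid N$
such that there is a primitive $\chi^* \in \widehat{(\mathbb{Z}/M\mathbb{Z})^\times}$ that is equivalent to $\chi$.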
Example. Take
\[ \chi = \chi_0 \in \widehat{(\mathbb{Z}/N\mathbb{Z})^\times}, \]
given by $\chi_0(x) \equiv 1$. If $N > 1$, then $\chi_0$ is not primitive, and the associated
primitive character is the trivial character modulo $M = 1$. So the conductor is $1$.
Using these Dirichlet characters, we can define Dirichlet $L$-series:
Definition (Dirichlet $L$-series). Let $\chi \in \widehat{(\mathbb{Z}/N\mathbb{Z})^\times}$ be a Dirichlet character. The
Dirichlet $L$-series of $\chi$ is
\[ L(\chi, s) = \sum_{\substack{n \geq 1 \\ (n, N) = 1}} \chi(n) n^{-s}. \]
Since $|\chi(n)| = 1$, we again know this is absolutely convergent for $\operatorname{Re}(s) > 1$.
As $\chi(mn) = \chi(m)\chi(n)$ whenever $(mn, N) = 1$, we get the same Euler product
as we got for the $\zeta$-function:
Proposition.
\[ L(\chi, s) = \prod_{\text{prime } p \nmid N} \frac{1}{1 - \chi(p) p^{-s}}. \]
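The proof of convergence is again similar to the case of the $\zeta$-function.
It follows that
Proposition. Suppose $M \mid N$ and $\chi_M \in \widehat{(\mathbb{Z}/M\mathbb{Z})^\times}$ and $\chi_N \in \widehat{(\mathbb{Z}/N\mathbb{Z})^\times}$ are
equivalent. Then
\[ L(\chi_M, s) = \prod_{\substack{p \nmid M \\ p \mid N}} \frac{1}{1 - \chi_M(p) p^{-s}} L(\chi_N, s). \]
In particular,
\[ \frac{L(\chi_M, s)}{L(\chi_N, s)} = \prod_{\substack{p \nmid M \\ p \mid N}} \frac{1}{1 - \chi_M(p) p^{-s}} \]
is analytic and non-zero for $\operatorname{Re}(s) > 0$.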
We’ll next show that $L(\chi, s)$ has a meromorphic continuation to $\mathbb{C}$, and is
analytic unless $\chi = \chi_0$.
Theorem.
(i) $L(\chi, s)$ has a meromorphic continuation to $\mathbb{C}$, which is analytic except for
at worst a simple pole at $s = 1$.
(ii) If $\chi \neq \chi_0$ (the trivial character), then $L(\chi, s)$ is analytic everywhere. On
the other hand, $L(\chi_0, s)$ has a simple pole at $s = 1$ with residue
\[ \frac{\varphi(N)}{N} = \prod_{p \mid N} \left(1 - \frac{1}{p}\right), \]
where $\varphi$ is the Euler function.
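Proof. More generally, let $\phi: \mathbb{Z}/N\mathbb{Z} \to \mathbb{C}$ be any $N$-periodic function, and let
\[ L(\phi, s) = \sum_{n=1}^\infty \phi(n) n^{-s}. \]
Then
\[ (2\pi)^{-s} \Gamma(s) L(\phi, s) = \sum_{n=1}^\infty \phi(n) M(e^{-2\pi n y}, s) = M(f(y), s), \]
where
\[ f(y) = \sum_{n \geq 1} \phi(n) e^{-2\pi n y}. \]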
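We can then write
\[ f(y) = \sum_{n=1}^N \sum_{r=0}^\infty \phi(n) e^{-2\pi(n + rN)y} = \sum_{n=1}^N \phi(n) \frac{e^{-2\pi n y}}{1 - e^{-2\pi N y}} = \sum_{n=1}^N \phi(n) \frac{e^{2\pi(N - n)y}}{e^{2\pi N y} - 1}. \]
As $0 \leq N - n < N$, this is $O(e^{-2\pi y})$ as $y \to \infty$. Copying the argument for $\zeta(s)$, we write
\[ M(f, s) = \left(\int_0^1 + \int_1^\infty\right) f(y) y^s\, \frac{dy}{y} \equiv M_0(s) + M_\infty(s). \]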
The second term is analytic for all $s \in \mathbb{C}$, and the first term can be written as
\[ M_0(s) = \sum_{n=1}^N \phi(n) \int_0^1 \frac{e^{2\pi(N - n)y}}{e^{2\pi N y} - 1} y^s\, \frac{dy}{y}. \]
Now for any $L$, we can write
\[ \frac{e^{2\pi(N - n)y}}{e^{2\pi N y} - 1} = \frac{1}{2\pi N y} + \sum_{r=0}^{L-1} c_{r,n} y^r + y^L g_{L,n}(y) \]
for some $g_{L,n}(y) \in C^\infty[0, 1]$. Hence we have
\[ M_0(s) = \sum_{n=1}^N \phi(n) \left(\int_0^1 \frac{1}{2\pi N y} y^s\, \frac{dy}{y} + \int_0^1 \sum_{r=0}^{L-1} c_{r,n} y^{r + s - 1}\, dy\right) + G(s), \]
where $G(s)$ is some function analytic for $\operatorname{Re}(s) > -L$. So we see that
\[ (2\pi)^{-s} \Gamma(s) L(\phi, s) = \sum_{n=1}^N \phi(n) \left(\frac{1}{2\pi N(s - 1)} + \frac{c_{0,n}}{s} + \cdots + \frac{c_{L-1,n}}{s + L - 1}\right) + G(s). \]
As $\Gamma(s)$ has poles at $s = 0, -1, \ldots$, this cancels with all the poles apart from the
one at $s = 1$.
The first part then follows from taking
\[ \phi(n) = \begin{cases} \chi(n) & (n, N) = 1 \\ 0 & (n, N) > 1 \end{cases}. \]
By reading off the formula, since $\Gamma(1) = 1$, we know
\[ \operatorname{res}_{s = 1} L(\chi, s) = \frac{1}{N} \sum_{n=1}^N \phi(n). \]
If $\chi \neq \chi_0$, then this vanishes by the orthogonality of characters. Otherwise, it is
$|(\mathbb{Z}/N\mathbb{Z})^\times|/N = \varphi(N)/N$.
Note that this is consistent with the result
\[ L(\chi_0, s) = \prod_{p \mid N} (1 - p^{-s}) \zeta(s). \]
So for a non-trivial character, our $L$-function doesn’t have a pole.
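The residue formula $\frac{1}{N}\sum_{n=1}^N \phi(n)$ can be evaluated directly for a small modulus. The sketch below restricts to a prime modulus $N$ with a known primitive root $g$, which the caller supplies and which is not verified; in that case every character has the form $\chi_k(g^j) = e^{2\pi i jk/(N-1)}$. The names `characters_mod_prime` and `residue_at_one` are made up for this illustration.

```python
import cmath


def characters_mod_prime(N, g):
    """All Dirichlet characters mod a prime N, as dicts unit -> value.
    Assumes g is a primitive root mod N (not checked)."""
    order = N - 1
    discrete_log = {pow(g, j, N): j for j in range(order)}
    return [
        {x: cmath.exp(2j * cmath.pi * k * discrete_log[x] / order) for x in discrete_log}
        for k in range(order)
    ]


def residue_at_one(chi, N):
    """(1/N) * sum_{n=1}^{N} phi(n), where phi is chi extended by zero to non-units."""
    return sum(chi.get(n % N, 0) for n in range(1, N + 1)) / N


chars = characters_mod_prime(5, 2)  # 2 is a primitive root mod 5
for k, chi in enumerate(chars):
    print(k, residue_at_one(chi, 5))  # k = 0 is the trivial character
```

Only the trivial character (index 0) should give a non-zero value, namely $\varphi(5)/5 = 4/5$.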
The next big theorem is that in fact $L(\chi, 1)$ is non-zero. In number theory,
there are lots of theorems of this kind, about non-vanishing of $L$-functions at
different points.
Theorem. If $\chi \neq \chi_0$, then $L(\chi, 1) \neq 0$.
Proof. The trick is to consider all characters together. We let
\[ \zeta_N(s) = \prod_{\chi \in \widehat{(\mathbb{Z}/N\mathbb{Z})^\times}} L(\chi, s) = \prod_{p \nmid N} \prod_\chi (1 - \chi(p) p^{-s})^{-1} \]
for $\operatorname{Re}(s) > 1$. Now we know $L(\chi_0, s)$ has a pole at $s = 1$, and is analytic
everywhere else. So if any other $L(\chi, 1) = 0$, then $\zeta_N(s)$ is analytic on $\operatorname{Re}(s) > 0$.
We will show that this cannot be the case.
We begin by finding a nice formula for the product of $(1 - \chi(p) p^{-s})^{-1}$ over
all characters.
Claim. If $p \nmid N$, and $T$ is any complex number, then
\[ \prod_{\chi \in \widehat{(\mathbb{Z}/N\mathbb{Z})^\times}} (1 - \chi(p) T) = (1 - T^{f_p})^{\varphi(N)/f_p}, \]
where $f_p$ is the order of $p$ in $(\mathbb{Z}/N\mathbb{Z})^\times$.
So
\[ \zeta_N(s) = \prod_{p \nmid N} (1 - p^{-f_p s})^{-\varphi(N)/f_p}. \]
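To see this, we write $f = f_p$, and, for convenience, write
\[ G = (\mathbb{Z}/N\mathbb{Z})^\times, \quad H = \langle p \rangle \subseteq G. \]
We note that $\hat{G}$ naturally contains $\widehat{G/H} = \{\chi \in \hat{G} : \chi(p) = 1\}$ as a subgroup.
Also, we know that
\[ |\widehat{G/H}| = |G/H| = \varphi(N)/f. \]
Also, the restriction map
\[ \frac{\hat{G}}{\widehat{G/H}} \to \hat{H} \]
is obviously injective, hence an isomorphism by counting orders. So
\[ \prod_{\chi \in \hat{G}} (1 - \chi(p) T) = \prod_{\chi \in \hat{H}} (1 - \chi(p) T)^{\varphi(N)/f} = \prod_{\zeta \in \mu_f} (1 - \zeta T)^{\varphi(N)/f} = (1 - T^f)^{\varphi(N)/f}. \]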
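The product identity in the claim can be verified numerically in a small case. This sketch again assumes a prime modulus $N$ with a caller-supplied primitive root $g$ (not checked), so that the values $\chi(p)$ over all characters are $e^{2\pi i k \log_g(p)/(N-1)}$ for $k = 0, \ldots, N - 2$. The function names and the sample point $T$ are arbitrary choices for illustration.

```python
import cmath


def order_mod(p, N):
    """Multiplicative order of p in (Z/NZ)^x, assuming gcd(p, N) = 1."""
    order, power = 1, p % N
    while power != 1:
        power = power * p % N
        order += 1
    return order


def character_values_at(p, N, g):
    """Values chi(p) over all characters mod a prime N with primitive root g (assumed)."""
    group_order = N - 1
    discrete_log = {pow(g, j, N): j for j in range(group_order)}
    exponent = discrete_log[p % N]
    return [cmath.exp(2j * cmath.pi * k * exponent / group_order) for k in range(group_order)]


def product_over_characters(p, N, g, T):
    """prod over chi of (1 - chi(p) T)."""
    product = 1
    for value in character_values_at(p, N, g):
        product *= 1 - value * T
    return product


N, g, p, T = 7, 3, 2, 0.3 + 0.2j  # 3 is a primitive root mod 7
f = order_mod(p, N)
print(product_over_characters(p, N, g, T), (1 - T**f) ** ((N - 1) // f))
```

Here $p = 2$ has order $3$ in $(\mathbb{Z}/7\mathbb{Z})^\times$, so each cube root of unity occurs twice among the values $\chi(2)$.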
We now notice that when we expand the product of $\zeta_N$, at least formally, then we
get a Dirichlet series with non-negative coefficients. We now prove the following
peculiar property of such Dirichlet series:
Claim. Let
\[ D(s) = \sum_{n \geq 1} a_n n^{-s} \]
be a Dirichlet series with real $a_n \geq 0$, and suppose this is absolutely convergent
for $\operatorname{Re}(s) > \sigma > 0$. Then if $D(s)$ can be analytically continued to an analytic
function $\tilde{D}$ on $\{\operatorname{Re}(s) > 0\}$, then the series converges for all real $s > 0$.
Let $\rho > \sigma$. Then by the analytic continuation, we have a convergent Taylor
series on $\{|s - \rho| < \rho\}$
\[ D(s) = \sum_{k \geq 0} \frac{1}{k!} D^{(k)}(\rho)(s - \rho)^k. \]
Moreover, since $\rho > \sigma$, we can directly differentiate the Dirichlet series to obtain
the derivatives:
\[ D^{(k)}(\rho) = \sum_{n \geq 1} a_n (-\log n)^k n^{-\rho}. \]
So if 0 < x < ρ, then
D(x) =
X
k≥0
1
k