Part IA — Groups

Based on lectures by J. Goedecke

Notes taken by Dexter Chua

Michaelmas 2014

These notes are not endorsed by the lecturers, and I have modified them (often

significantly) after lectures. They are nowhere near accurate representations of what

was actually lectured, and in particular, all errors are almost surely mine.

Examples of groups

Axioms for groups. Examples from geometry: symmetry groups of regular polygons, cube, tetrahedron. Permutations on a set; the symmetric group. Subgroups and homomorphisms. Symmetry groups as subgroups of general permutation groups. The Möbius group; cross-ratios, preservation of circles, the point at infinity. Conjugation. Fixed points of Möbius maps and iteration. [4]

Lagrange’s theorem

Cosets. Lagrange’s theorem. Groups of small order (up to order 8). Quaternions.

Fermat-Euler theorem from the group-theoretic point of view. [5]

Group actions

Group actions; orbits and stabilizers. Orbit-stabilizer theorem. Cayley’s theorem

(every group is isomorphic to a subgroup of a permutation group). Conjugacy classes.

Cauchy’s theorem. [4]

Quotient groups

Normal subgroups, quotient groups and the isomorphism theorem. [4]

Matrix groups

The general and special linear groups; relation with the Möbius group. The orthogonal and special orthogonal groups. Proof (in R^3) that every element of the orthogonal group is the product of reflections and every rotation in R^3 has an axis. Basis change as an example of conjugation. [3]

Permutations

Permutations, cycles and transpositions. The sign of a permutation. Conjugacy in S_n and in A_n. Simple groups; simplicity of A_5. [4]

Contents

0 Introduction

1 Groups and homomorphisms

1.1 Groups

1.2 Homomorphisms

1.3 Cyclic groups

1.4 Dihedral groups

1.5 Direct products of groups

2 Symmetric group I

2.1 Symmetric groups

2.2 Sign of permutations

3 Lagrange’s Theorem

3.1 Small groups

3.2 Left and right cosets

4 Quotient groups

4.1 Normal subgroups

4.2 Quotient groups

4.3 The Isomorphism Theorem

5 Group actions

5.1 Group acting on sets

5.2 Orbits and Stabilizers

5.3 Important actions

5.4 Applications

6 Symmetric groups II

6.1 Conjugacy classes in S_n

6.2 Conjugacy classes in A_n

7 Quaternions

8 Matrix groups

8.1 General and special linear groups

8.2 Actions of GL_n(C)

8.3 Orthogonal groups

8.4 Rotations and reflections in R^2 and R^3

8.5 Unitary groups

9 More on regular polyhedra

9.1 Symmetries of the cube

9.2 Symmetries of the tetrahedron

10 Möbius group

10.1 Möbius maps

10.2 Fixed points of Möbius maps

10.3 Permutation properties of Möbius maps

10.4 Cross-ratios

11 Projective line (non-examinable)

0 Introduction

Group theory is an example of algebra. In pure mathematics, algebra (usually)

does not refer to the boring mindless manipulation of symbols. Instead, in

algebra, we have some set of objects with some operations on them. For example,

we can take the integers with addition as the operation. However, in algebra, we

allow any set and any operations, not just numbers.

Of course, such a definition is too broad to be helpful. We categorize algebraic structures into different types. In this course, we will study a particular kind of structure: groups. In the IB Groups, Rings and Modules course, we will study rings and modules as well.

These different kinds of structures are defined by certain axioms. The group

axioms will say that the operation must follow certain rules, and any set and

operation that satisfies these rules will be considered to form a group. We will

then have a different set of axioms for rings, modules etc.

As mentioned above, the most familiar kinds of algebraic structures are

number systems such as integers and rational numbers. The focus of group

theory, however, is not on things that resemble “numbers”. Instead, it is the

study of symmetries.

First of all, what is a symmetry? We are all familiar with, say, the symmetries of an (equilateral) triangle (we will always assume the triangle is equilateral). We rotate a triangle by 120°, and we get the original triangle. We say that rotating by 120° is a symmetry of a triangle. In general, a symmetry is something we do to an object that leaves the object intact.

Of course, we don’t require that the symmetry leaves everything intact.

Otherwise, we would only be allowed to do nothing. Instead, we require certain

important things to be intact. For example, when considering the symmetries

of a triangle, we only care about how the resultant object looks, but don’t care

about where the individual vertices went.

In the case of the triangle, we have six symmetries: three rotations (rotation by 0°, 120° and 240°), and three reflections, one along each of the three axes of symmetry (each axis passes through a vertex and the midpoint of the opposite side).

These six together form the underlying set of the group of symmetries. A more sophisticated example is the symmetries of R^n. We define these as operations on R^n that leave distances between points unchanged. These include translations, rotations, reflections, and combinations of these.

So what is the operation? This operation combines two symmetries to give a

new symmetry. The natural thing to do is to do the symmetry one after another.

For example, if we combine two 120° rotations, we get a 240° rotation.

Now we are studying algebra, not geometry. So to define the group, we abstract away the triangle. Instead, we define the group to be six objects, say {e, r, r^2, s, rs, r^2 s}, with rules defining how we combine two elements to get a third. Officially, we do not mention the triangle at all when defining the group.

We can now come up with the group axioms. What rules should the set of

symmetries obey? First of all, we must have a “do nothing” symmetry. We call

this the identity element. When we compose the identity with another symmetry,

the other symmetry is unchanged.

Secondly, given a symmetry, we can do the reverse symmetry. So for any

element, there is an inverse element that, when combined with the original, gives

the identity.

Finally, given three symmetries, we can combine them, one after another. If we denote the operation of the group as ∗, then if we have three symmetries, x, y, z, we should be able to form x ∗ y ∗ z. If we want to define it in terms of the binary operation ∗, we can define it as (x ∗ y) ∗ z, where we first combine the first two symmetries, then combine the result with the third. Alternatively, we can also define it as x ∗ (y ∗ z). Intuitively, these two should give the same result, since both apply the same three symmetries in the same order. Hence we have the third rule

x ∗ (y ∗ z) = (x ∗ y) ∗ z.

Now a group is any set with an operation that satisfies the three rules above.

In group theory, the objective is to study the properties of groups just assuming

these three axioms. It turns out that there is a lot we can talk about.

1 Groups and homomorphisms

1.1 Groups

Definition (Binary operation). A (binary) operation is a way of combining two

elements to get a new element. Formally, it is a map ∗ : A × A → A.

Definition (Group). A group is a set G with a binary operation ∗ satisfying the following axioms:

1. There is some e ∈ G such that for all a ∈ G, we have

a ∗ e = e ∗ a = a. (identity)

2. For all a ∈ G, there is some a^{-1} ∈ G such that

a ∗ a^{-1} = a^{-1} ∗ a = e. (inverse)

3. For all a, b, c ∈ G, we have

(a ∗ b) ∗ c = a ∗ (b ∗ c). (associativity)

Definition (Order of group). The order of the group, denoted by |G|, is the number of elements in G. A group is a finite group if the order is finite.
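For a small finite set, the axioms above can be checked by brute force. The following is a minimal Python sketch and not part of the lectures; the helper name is_group and the mod-5 examples are illustrative choices of ours. It also checks closure explicitly, since a Python function is not automatically a map into the chosen set.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of closure, identity, inverses and associativity."""
    elements = list(elements)
    # Closure: op must land back inside the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Identity: some e with a * e = e * a = a for all a.
    identities = [e for e in elements
                  if all(op(a, e) == a and op(e, a) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every a has some b with a * b = b * a = e.
    for a in elements:
        if not any(op(a, b) == e and op(b, a) == e for b in elements):
            return False
    # Associativity: (a * b) * c = a * (b * c) for all triples.
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(elements, repeat=3))


def add_mod_5(a, b):
    return (a + b) % 5


def mul_mod_5(a, b):
    return (a * b) % 5


print(is_group(range(5), add_mod_5))     # integers mod 5 under addition
print(is_group(range(5), mul_mod_5))     # fails: 0 has no multiplicative inverse
print(is_group(range(1, 5), mul_mod_5))  # non-zero residues mod 5 under multiplication
```

The check is cubic in the number of elements because of associativity, so it is only sensible for very small groups.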

Note that technically, the inverse axiom makes no sense, since we have not specified what e is. Even if we take it to be the e given by the identity axiom, the identity axiom only states there is some e that satisfies that property, but there could be many! We don’t know which one a ∗ a^{-1} is supposed to be equal to! So we should technically take that to mean there is some a^{-1} such that a ∗ a^{-1} and a^{-1} ∗ a satisfy the identity axiom. Of course, we will soon show that identities are indeed unique, and we will happily talk about “the” identity.

Some people put a zeroth axiom called “closure”:

0. For all a, b ∈ G, we have a ∗ b ∈ G. (closure)

Technically speaking, this axiom also makes no sense: when we say ∗ is a binary operation, by definition, a ∗ b must be a member of G. However, in practice, we often have to check that this axiom actually holds. For example, if we let G be the set of all matrices of the form

    ( 1 x y )
    ( 0 1 z )
    ( 0 0 1 )

under matrix multiplication, we will have to check that the product of two such matrices is indeed a matrix of this form. Officially, we are checking that the binary operation is a well-defined operation on G.

It is important to know that it is generally not true that a ∗ b = b ∗ a. There is no a priori reason why this should be true. For example, if we are considering the symmetries of a triangle, rotating and then reflecting is different from reflecting and then rotating.

However, for some groups, this happens to be true. We call such groups

abelian groups.

Definition (Abelian group). A group is abelian if it satisfies

4. (∀a, b ∈ G) a ∗ b = b ∗ a. (commutativity)

If it is clear from context, we are lazy and leave out the operation ∗, and write ab for a ∗ b. We also write a^n = aaa ··· a (n copies), a^0 = e, a^{-n} = (a^{-1})^n etc.

Example. The following are abelian groups:

(i) Z with +

(ii) Q with +

(iii) Z_n (integers mod n) with +

(iv) Q^* with ×

(v) {−1, 1} with ×

The following are non-abelian groups:

(vi) Symmetries of an equilateral triangle (or any n-gon) with composition (D_{2n})

(vii) 2 × 2 invertible matrices with matrix multiplication (GL_2(R))

(viii) Symmetry groups of 3D objects

Recall that the first group axiom requires that there exists an identity element, which we shall call e. Then the second requires that for each a, there is an inverse a^{-1} such that a^{-1}a = e. This only makes sense if there is only one identity e, or else which identity should a^{-1}a be equal to?

We shall now show that there can only be one identity. It turns out that the inverses are also unique. So we will talk about the identity and the inverse.

Proposition. Let (G, ∗) be a group. Then

(i) The identity is unique.

(ii) Inverses are unique.

Proof.

(i) Suppose e and e′ are identities. Then we have ee′ = e′, treating e as an identity, and ee′ = e, treating e′ as an identity. Thus e = e′.

(ii) Suppose a^{-1} and b both satisfy the inverse axiom for some a ∈ G. Then b = be = b(aa^{-1}) = (ba)a^{-1} = ea^{-1} = a^{-1}. Thus b = a^{-1}.

Proposition. Let (G, ∗) be a group and a, b ∈ G. Then

(i) (a^{-1})^{-1} = a

(ii) (ab)^{-1} = b^{-1}a^{-1}

Proof.

(i) Given a^{-1}, both a and (a^{-1})^{-1} satisfy

xa^{-1} = a^{-1}x = e.

By uniqueness of inverses, (a^{-1})^{-1} = a.

(ii) We have

(ab)(b^{-1}a^{-1}) = a(bb^{-1})a^{-1} = aea^{-1} = aa^{-1} = e.

Similarly, (b^{-1}a^{-1})(ab) = e. So b^{-1}a^{-1} is an inverse of ab. By the uniqueness of inverses, (ab)^{-1} = b^{-1}a^{-1}.

Sometimes if we have a group G, we might want to discard some of the elements. For example if G is the group of all symmetries of a triangle, we might one day decide that we hate reflections because they reverse orientation. So we only pick the rotations in G and form a new, smaller group. We call this a subgroup of G.

Definition (Subgroup). H is a subgroup of G, written H ≤ G, if H ⊆ G and H with the restricted operation ∗ from G is also a group.

Example.

– (Z, +) ≤ (Q, +) ≤ (R, +) ≤ (C, +)

– ({e}, ∗) ≤ (G, ∗) (trivial subgroup)

– G ≤ G

– ({±1}, ×) ≤ (Q^*, ×)

According to the definition, to prove that H is a subgroup of G, we need to make sure H satisfies all group axioms. However, this is often tedious. Instead, there are some simplified criteria to decide whether H is a subgroup.

Lemma (Subgroup criteria I). Let (G, ∗) be a group and H ⊆ G. H ≤ G iff

(i) e ∈ H

(ii) (∀a, b ∈ H) ab ∈ H

(iii) (∀a ∈ H) a^{-1} ∈ H

Proof. The group axioms are satisfied as follows:

0. Closure: (ii)

1. Identity: (i). Note that H and G must have the same identity. Suppose that e_H and e_G are the identities of H and G respectively. Then e_H e_H = e_H. Now e_H has an inverse in G. Thus we have e_H e_H e_H^{-1} = e_H e_H^{-1}. So e_H e_G = e_G. Thus e_H = e_G.

2. Inverse: (iii)

3. Associativity: inherited from G.

Humans are lazy, and the test above is still too complicated. We thus come

up with an even simpler test:

Lemma (Subgroup criteria II). A subset H ⊆ G is a subgroup of G iff:

(I) H is non-empty

(II) (∀a, b ∈ H) ab^{-1} ∈ H

Proof. (I) and (II) follow trivially from (i), (ii) and (iii).

To prove that (I) and (II) imply (i), (ii) and (iii), we have

(i) H must contain at least one element a. Then aa^{-1} = e ∈ H.

(iii) For any a ∈ H, ea^{-1} = a^{-1} ∈ H.

(ii) For any a, b ∈ H, we have b^{-1} ∈ H by (iii), so a(b^{-1})^{-1} = ab ∈ H.
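Criterion II is convenient to test mechanically on a finite group. Below is a small Python sketch under our own assumptions: the group is (Z_12, +), passed in as an operation and an inverse function, and the name is_subgroup is hypothetical rather than anything from the course.

```python
def is_subgroup(H, op, inv):
    """Criterion II: H is non-empty and a * b^{-1} lies in H for all a, b in H."""
    H = set(H)
    return bool(H) and all(op(a, inv(b)) in H for a in H for b in H)


# Illustrative ambient group: the integers mod 12 under addition.
def add_mod_12(a, b):
    return (a + b) % 12


def neg_mod_12(a):
    return (-a) % 12


print(is_subgroup({0, 3, 6, 9}, add_mod_12, neg_mod_12))  # the multiples of 3
print(is_subgroup({0, 3, 6}, add_mod_12, neg_mod_12))     # fails: 0 - 3 = 9 is missing
print(is_subgroup(set(), add_mod_12, neg_mod_12))         # fails: empty
```

Note that the function trusts the caller that H really is a subset of the ambient group.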

Proposition. The subgroups of (Z, +) are exactly nZ, for n ∈ N (nZ is the integer multiples of n).

Proof. Firstly, it is trivial to show that for any n ∈ N, nZ is a subgroup. Now we show that any subgroup must be of the form nZ.

Let H ≤ Z. We know 0 ∈ H. If there are no other elements in H, then H = 0Z. Otherwise, H contains a positive integer (if a ∈ H then −a ∈ H), so pick the smallest positive integer n in H. We claim that H = nZ.

Suppose not, i.e. (∃a ∈ H) n ∤ a. Let a = pn + q, where 0 < q < n. Since a − pn ∈ H, q ∈ H. Yet q < n but n is the smallest positive member of H. Contradiction. So every a ∈ H is divisible by n. Also, by closure, all multiples of n must be in H. So H = nZ.

1.2 Homomorphisms

It is often helpful to study functions between different groups. First, we need to

define what a function is. These definitions should be familiar from IA Numbers

and Sets.

Definition (Function). Given two sets X, Y, a function f : X → Y sends each x ∈ X to a particular f(x) ∈ Y. X is called the domain and Y is the co-domain.

Example.

– Identity function: for any set X, 1_X : X → X with 1_X(x) = x is a function. This is also written as id_X.

– Inclusion map: ι : Z → Q with ι(n) = n. Note that this differs from the identity function as the domain and codomain are different in the inclusion map.

– f_1 : Z → Z: f_1(x) = x + 1.

– f_2 : Z → Z: f_2(x) = 2x.

– f_3 : Z → Z: f_3(x) = x^2.

– For g : {0, 1, 2, 3, 4} → {0, 1, 2, 3, 4}, we have:

  ◦ g_1(x) = x + 1 if x < 4; g_1(4) = 4.

  ◦ g_2(x) = x + 1 if x < 4; g_2(4) = 0.

Definition (Composition of functions). The composition of two functions is a function you get by applying one after another. In particular, if f : X → Y and g : Y → Z, then g ◦ f : X → Z with g ◦ f(x) = g(f(x)).

Example. f_2 ◦ f_1(x) = 2x + 2. f_1 ◦ f_2(x) = 2x + 1. Note that function composition is not commutative.

Definition (Injective functions). A function f is injective if it hits everything at most once, i.e.

(∀x, y ∈ X) f(x) = f(y) ⇒ x = y.

Definition (Surjective functions). A function is surjective if it hits everything

at least once, i.e.

(∀y ∈ Y )(∃x ∈ X) f(x) = y.

Definition (Bijective functions). A function is bijective if it is both injective

and surjective. i.e. it hits everything exactly once. Note that a function has an

inverse iff it is bijective.

Example. ι and f_2 are injective but not surjective. f_3 and g_1 are neither. 1_X, f_1 and g_2 are bijective.

Lemma. The composition of two bijective functions is bijective.

When considering sets, functions are allowed to do all sorts of crazy things,

and can send any element to any element without any restrictions. However, we

are currently studying groups, and groups have additional structure on top of

the set of elements. Hence we are not interested in arbitrary functions. Instead,

we are interested in functions that “respect” the group structure. We call these

homomorphisms.

Definition (Group homomorphism). Let (G, ∗) and (H, ×) be groups. A function f : G → H is a group homomorphism iff

(∀g_1, g_2 ∈ G) f(g_1) × f(g_2) = f(g_1 ∗ g_2).

Definition (Group isomorphism). Isomorphisms are bijective homomorphisms. Two groups are isomorphic if there exists an isomorphism between them. We write G ≅ H.

We will consider two isomorphic groups to be “the same”. For example, when

we say that there is only one group of order 2, it means that any two groups of

order 2 must be isomorphic.

Example.

– f : G → H defined by f(g) = e, where e is the identity of H, is a homomorphism.

– 1_G : G → G and f_2 : Z → 2Z are isomorphisms. ι : Z → Q and f_2 : Z → Z are homomorphisms.

– exp : (R, +) → (R^+, ×) with exp(x) = e^x is an isomorphism.

– Take (Z_4, +) and H = ({e^{ikπ/2} : k = 0, 1, 2, 3}, ×). Then f : Z_4 → H given by f(a) = e^{iπa/2} is an isomorphism.

– f : GL_2(R) → R^* with f(A) = det(A) is a homomorphism, where GL_2(R) is the set of 2 × 2 invertible matrices.

Proposition. Suppose that f : G → H is a homomorphism. Then

(i) Homomorphisms send the identity to the identity, i.e.

f(e_G) = e_H.

(ii) Homomorphisms send inverses to inverses, i.e.

f(a^{-1}) = f(a)^{-1}.

(iii) The composite of 2 group homomorphisms is a group homomorphism.

(iv) The inverse of an isomorphism is an isomorphism.

Proof.

(i) We have

f(e_G) = f(e_G^2) = f(e_G)^2
f(e_G)^{-1} f(e_G) = f(e_G)^{-1} f(e_G)^2
e_H = f(e_G)

(ii) We have

e_H = f(e_G) = f(aa^{-1}) = f(a)f(a^{-1}),

and similarly f(a^{-1})f(a) = e_H. Since inverses are unique, f(a^{-1}) = f(a)^{-1}.

(iii) Let f : G_1 → G_2 and g : G_2 → G_3 be homomorphisms. Then g(f(ab)) = g(f(a)f(b)) = g(f(a))g(f(b)).

(iv) Let f : G → H be an isomorphism. Then for a, b ∈ H,

f^{-1}(ab) = f^{-1}( f(f^{-1}(a)) f(f^{-1}(b)) )
           = f^{-1}( f( f^{-1}(a) f^{-1}(b) ) )
           = f^{-1}(a) f^{-1}(b).

So f^{-1} is a homomorphism. Since it is bijective, f^{-1} is an isomorphism.

Definition (Image of homomorphism). If f : G → H is a homomorphism, then the image of f is

im f = f(G) = {f(g) : g ∈ G}.

Definition (Kernel of homomorphism). The kernel of f, written as ker f, is

ker f = f^{-1}({e_H}) = {g ∈ G : f(g) = e_H}.

Proposition. Both the image and the kernel are subgroups of the respective

groups, i.e. im f ≤ H and ker f ≤ G.

Proof. Since e_H ∈ im f and e_G ∈ ker f, im f and ker f are non-empty. Moreover, suppose b_1, b_2 ∈ im f. Now ∃a_1, a_2 ∈ G such that f(a_i) = b_i. Then b_1 b_2^{-1} = f(a_1)f(a_2^{-1}) = f(a_1 a_2^{-1}) ∈ im f.

Then consider b_1, b_2 ∈ ker f. We have f(b_1 b_2^{-1}) = f(b_1)f(b_2)^{-1} = e_H e_H^{-1} = e_H. So b_1 b_2^{-1} ∈ ker f.

Proposition. Given any homomorphism f : G → H and any a ∈ G, for all k ∈ ker f, aka^{-1} ∈ ker f.

This proposition seems rather pointless. However, it is not. All subgroups

that satisfy this property are known as normal subgroups, and normal subgroups

have very important properties. We will postpone the discussion of normal

subgroups to later lectures.

Proof. f(aka^{-1}) = f(a)f(k)f(a)^{-1} = f(a)ef(a)^{-1} = e. So aka^{-1} ∈ ker f.

Example. Images and kernels for previously defined functions:

(i) For the function that sends everything to e, im f = {e} and ker f = G.

(ii) For the identity function, im 1_G = G and ker 1_G = {e}.

(iii) For the inclusion map ι : Z → Q, we have im ι = Z and ker ι = {0}.

(iv) For f_2 : Z → Z and f_2(x) = 2x, we have im f_2 = 2Z and ker f_2 = {0}.

(v) For det : GL_2(R) → R^*, we have im det = R^* and ker det = {A : det A = 1} = SL_2(R).

Proposition. For all homomorphisms f : G → H, f is

(i) surjective iff im f = H

(ii) injective iff ker f = {e}

Proof.

(i) By definition.

(ii) We know that f(e) = e. So if f is injective, then e is the only element sent to e, i.e. ker f = {e}. Conversely, if ker f = {e}, then given a, b such that f(a) = f(b), we have f(ab^{-1}) = f(a)f(b)^{-1} = e. Thus ab^{-1} ∈ ker f = {e}. Then ab^{-1} = e and a = b.
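To see kernels, images and the injectivity criterion in a concrete case, here is a short Python sketch. The two maps (reduction mod 4 on Z_12, and multiplication by 5 on Z_12) and the helper names are our own illustrative assumptions, not examples from the lectures.

```python
def kernel_and_image(f, n, m):
    """For f : Z_n -> Z_m (both additive), verify the homomorphism
    property and return (kernel, image) as sets."""
    G = range(n)
    is_hom = all(f((a + b) % n) == (f(a) + f(b)) % m for a in G for b in G)
    if not is_hom:
        raise ValueError("f is not a homomorphism")
    kernel = {g for g in G if f(g) == 0}
    image = {f(g) for g in G}
    return kernel, image


def is_injective(f, n):
    """f is injective on Z_n iff it takes n distinct values."""
    return len({f(g) for g in range(n)}) == n


def reduce_mod_4(x):   # Z_12 -> Z_4
    return x % 4


def times_5(x):        # Z_12 -> Z_12
    return (5 * x) % 12


kernel_1, image_1 = kernel_and_image(reduce_mod_4, 12, 4)
kernel_2, image_2 = kernel_and_image(times_5, 12, 12)

# Injective exactly when the kernel is trivial.
print(kernel_1 == {0}, is_injective(reduce_mod_4, 12))
print(kernel_2 == {0}, is_injective(times_5, 12))
```

In the first case the kernel is {0, 4, 8} and the map is not injective; in the second the kernel is {0} and the map is a bijection of Z_12.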

So far, the definitions of images and kernels seem to be just convenient

terminology to refer to things. However, we will later prove an important

theorem, the first isomorphism theorem, that relates these two objects and

provides deep insights (hopefully).

Before we get to that, we will first study some interesting classes of groups

and develop some necessary theory.

1.3 Cyclic groups

The simplest class of groups is cyclic groups. A cyclic group is a group of the form {e, a, a^2, a^3, ···, a^{n-1}}, where a^n = e. For example, if we consider the group of all rotations of a triangle, and write r = rotation by 120°, the elements will be {e, r, r^2} with r^3 = e.

Officially, we define a cyclic group as follows:

Definition (Cyclic group C_n). A group G is cyclic if

(∃a)(∀b)(∃n ∈ Z) b = a^n,

i.e. every element is some power of a. Such an a is called a generator of G.

We write C_n for the cyclic group of order n.

Example.

(i) Z is cyclic with generator 1 or −1. It is the infinite cyclic group.

(ii) ({+1, −1}, ×) is cyclic with generator −1.

(iii) (Z_n, +) is cyclic with all numbers coprime with n as generators.

Notation. Given a group G and a ∈ G, we write ⟨a⟩ for the cyclic group generated by a, i.e. the subgroup of all powers of a. It is the smallest subgroup containing a.

Definition (Order of element). The order of an element a is the smallest positive integer n such that a^n = e. If n doesn’t exist, a has infinite order. Write ord(a) for the order of a.

We have given two different meanings to the word “order”. One is the order

of a group and the other is the order of an element. Since mathematicians

are usually (but not always) sensible, the name wouldn’t be used twice if they

weren’t related. In fact, we have

Lemma. For a in G, ord(a) = |⟨a⟩|.

Proof. If ord(a) = ∞, then a^n ≠ a^m for all n ≠ m, since otherwise a^{m-n} = e. Thus |⟨a⟩| = ∞ = ord(a).

Otherwise, suppose ord(a) = k. Thus a^k = e. We now claim that ⟨a⟩ = {e, a, a^2, ···, a^{k-1}}. Note that ⟨a⟩ does not contain higher powers of a as a^k = e and higher powers will loop back to existing elements. There are also no repeating elements in the list provided since a^m = a^n ⇒ a^{m-n} = e. So done.
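The lemma can be checked numerically in a finite group. The sketch below works in (Z_12, +), which is our choice of example; the function names order and generated are hypothetical helpers, and both assume the element has finite order.

```python
def order(a, op, e):
    """Smallest positive n with a^n = e (assumes a has finite order)."""
    n, power = 1, a
    while power != e:
        power = op(power, a)
        n += 1
    return n


def generated(a, op, e):
    """The cyclic subgroup <a> of a finite group: all powers of a."""
    seen, power = {e}, a
    while power not in seen:
        seen.add(power)
        power = op(power, a)
    return seen


def add_mod_12(a, b):
    return (a + b) % 12


# ord(a) and |<a>| agree for every element of Z_12.
for a in range(12):
    assert order(a, add_mod_12, 0) == len(generated(a, add_mod_12, 0))

print(order(8, add_mod_12, 0), sorted(generated(8, add_mod_12, 0)))
```

For a = 8 the powers are 8, 4, 0, so both the order and the size of ⟨8⟩ are 3.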

It is easy to show (since a^m a^n = a^{m+n} = a^n a^m) that

Proposition. Cyclic groups are abelian.

Definition (Exponent of group). The exponent of a group G is the smallest positive integer n such that a^n = e for all a ∈ G.

1.4 Dihedral groups

Definition (Dihedral groups D_{2n}). Dihedral groups are the symmetries of a regular n-gon. It contains n rotations (including the identity symmetry, i.e. rotation by 0°) and n reflections.

We write the group as D_{2n}. Note that the subscript refers to the order of the group, not the number of sides of the polygon.

The dihedral group is not hard to define. However, we need to come up with a presentation of D_{2n} that is easy to work with.

We first look at the rotations. The set of all rotations is generated by r = rotation by 360°/n. This r has order n.

How about the reflections? We know that each reflection has order 2. Let s be our favorite reflection. Then using some geometric arguments, we can show that any reflection can be written as a product of r^m and s for some m. We also have srs = r^{-1}.

Hence we can define D_{2n} as follows: D_{2n} is a group generated by r and s, and every element can be written as a product of r’s and s’s. Whenever we see r^n or s^2, we replace it by e. When we see srs, we replace it by r^{-1}. It then follows that every element can be written in the form r^m or r^m s.

Formally, we can write D_{2n} as follows:

D_{2n} = ⟨r, s | r^n = s^2 = e, srs^{-1} = r^{-1}⟩
       = {e, r, r^2, ···, r^{n-1}, s, rs, r^2 s, ···, r^{n-1} s}

This is a notation we will commonly use to represent groups. For example, a cyclic group of order n can be written as

C_n = ⟨a | a^n = e⟩.
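The rewriting rules above can be turned into a concrete multiplication rule. In the following Python sketch, which is our own encoding and not from the lectures, the element r^a s^i is stored as the pair (a, i) with a taken mod n and i in {0, 1}; the relation s r = r^{-1} s then gives (r^a s^i)(r^b s^j) = r^{a ± b} s^{i+j}, with a minus sign exactly when i = 1.

```python
def dihedral_mul(x, y, n):
    """Multiply r^a s^i by r^b s^j in D_2n, using s r = r^{-1} s."""
    (a, i), (b, j) = x, y
    sign = -1 if i == 1 else 1   # moving r^b past s inverts it
    return ((a + sign * b) % n, (i + j) % 2)


def power(x, k, n):
    """x^k by repeated multiplication, starting from the identity (0, 0)."""
    result = (0, 0)
    for _ in range(k):
        result = dihedral_mul(result, x, n)
    return result


n = 5                                   # symmetries of a regular pentagon
e, r, s = (0, 0), (1, 0), (0, 1)
elements = [(m, j) for j in (0, 1) for m in range(n)]

srs = dihedral_mul(dihedral_mul(s, r, n), s, n)
print(len(elements) == 2 * n)           # the group has order 2n
print(power(r, n, n) == e, power(s, 2, n) == e)
print(srs == power(r, n - 1, n))        # srs = r^{-1}
print(dihedral_mul(r, s, n) != dihedral_mul(s, r, n))  # not abelian
```

Storing elements in this normal form means every product is reduced automatically, which mirrors the claim that every element can be written as r^m or r^m s.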

1.5 Direct products of groups

Recall that if we have two sets X, Y, then we can obtain the product X × Y = {(x, y) : x ∈ X, y ∈ Y}. We can do the same if X and Y are groups.

Definition (Direct product of groups). Given two groups (G, ◦) and (H, •), we can define a set G × H = {(g, h) : g ∈ G, h ∈ H} and an operation (a_1, a_2) ∗ (b_1, b_2) = (a_1 ◦ b_1, a_2 • b_2). This forms a group.

Why would we want to take the product of two groups? Suppose we have two independent triangles. Then the symmetries of this system include, say rotating the first triangle, rotating the second, or rotating both. The symmetry group of this combined system would then be D_6 × D_6.

Example.

C_2 × C_2 = {(0, 0), (0, 1), (1, 0), (1, 1)}
          = {e, x, y, xy} with everything order 2
          = ⟨x, y | x^2 = y^2 = e, xy = yx⟩

Proposition. C_n × C_m ≅ C_{nm} iff hcf(m, n) = 1.

Proof. Suppose that hcf(m, n) = 1. Let C_n = ⟨a⟩ and C_m = ⟨b⟩. Let k be the order of (a, b). Then (a, b)^k = (a^k, b^k) = e. This is possible only if n | k and m | k, i.e. k is a common multiple of n and m. Since the order is the minimum value of k that satisfies the above equation, k = lcm(n, m) = nm/hcf(n, m) = nm.

Now consider ⟨(a, b)⟩ ≤ C_n × C_m. Since (a, b) has order nm, ⟨(a, b)⟩ has nm elements. Since C_n × C_m also has nm elements, ⟨(a, b)⟩ must be the whole of C_n × C_m. And we know that ⟨(a, b)⟩ ≅ C_{nm}. So C_n × C_m ≅ C_{nm}.

On the other hand, suppose hcf(m, n) ≠ 1. Then k = lcm(m, n) < mn. Then for any (a, b) ∈ C_n × C_m, we have (a, b)^k = (a^k, b^k) = e. So the order of any (a, b) is at most k < mn. So there is no element of order mn. So C_n × C_m is not a cyclic group of order nm.
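The proposition is easy to test for small n and m. The Python sketch below models C_n × C_m additively as Z_n × Z_m (an assumption of convenience) and looks for an element of order nm; the helper names are ours.

```python
from math import gcd


def element_order(a, b, n, m):
    """Order of (a, b) in Z_n x Z_m, written additively."""
    k, x, y = 1, a % n, b % m
    while (x, y) != (0, 0):
        x, y = (x + a) % n, (y + b) % m
        k += 1
    return k


def product_is_cyclic(n, m):
    """Z_n x Z_m is cyclic iff some element has order nm."""
    return any(element_order(a, b, n, m) == n * m
               for a in range(n) for b in range(m))


for n, m in [(2, 3), (3, 5), (2, 2), (4, 6)]:
    print(n, m, product_is_cyclic(n, m), gcd(n, m) == 1)
```

In each case the two booleans agree, as the proposition predicts; for instance every element of Z_2 × Z_2 has order at most 2.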

Given a complicated group G, it is sometimes helpful to write it as a product H × K, which could make things a bit simpler. We can do so by the following theorem:

Proposition (Direct product theorem). Let H_1, H_2 ≤ G. Suppose the following are true:

(i) H_1 ∩ H_2 = {e}.

(ii) (∀a_i ∈ H_i) a_1 a_2 = a_2 a_1.

(iii) (∀a ∈ G)(∃a_i ∈ H_i) a = a_1 a_2. We also write this as G = H_1 H_2.

Then G ≅ H_1 × H_2.

Proof. Define f : H_1 × H_2 → G by f(a_1, a_2) = a_1 a_2. Then it is a homomorphism since

f((a_1, a_2) ∗ (b_1, b_2)) = f(a_1 b_1, a_2 b_2)
                           = a_1 b_1 a_2 b_2
                           = a_1 a_2 b_1 b_2
                           = f(a_1, a_2) f(b_1, b_2),

where the third equality uses (ii). Surjectivity follows from (iii). We’ll show injectivity by showing that the kernel is {e}. If f(a_1, a_2) = e, then we know that a_1 a_2 = e. Then a_1 = a_2^{-1}. Since a_1 ∈ H_1 and a_2^{-1} ∈ H_2, we have a_1 = a_2^{-1} ∈ H_1 ∩ H_2 = {e}. Thus a_1 = a_2 = e and ker f = {e}.

2 Symmetric group I

We will devote two full chapters to the study of symmetric groups, because they are really important. Recall that we defined a symmetry to be an operation that leaves some important property of the object intact. We can treat each such operation as a bijection. For example, a symmetry of R^2 is a bijection f : R^2 → R^2 that preserves distances. Note that we must require it to be a bijection, instead of a mere function, since we require each symmetry to have an inverse.

We can consider the case where we don’t care about anything at all. So a

“symmetry” would be any arbitrary bijection

X → X

, and the set of all bijections

will form a group, known as the symmetric group. Of course, we will no longer

think of these as “symmetries” anymore, but just bijections.

In some sense, the symmetric group is the most general case of a symmetry

group. In fact, we will later (in Chapter 5) show that every group can be written

as a subgroup of some symmetric group.

2.1 Symmetric groups

Definition (Permutation). A permutation of X is a bijection from a set X to X itself. The set of all permutations on X is Sym X.

When composing permutations, we treat them as functions. So if σ and ρ are permutations, σ ◦ ρ is given by first applying ρ, then applying σ.

Theorem. Sym X with composition forms a group.

Proof. The group axioms are satisfied as follows:

0. If σ : X → X and τ : X → X, then σ ◦ τ : X → X. If they are both bijections, then the composite is also bijective. So if σ, τ ∈ Sym X, then σ ◦ τ ∈ Sym X.

1. The identity 1_X : X → X is clearly a permutation, and gives the identity of the group.

2. Every bijective function has a bijective inverse. So if σ ∈ Sym X, then σ^{-1} ∈ Sym X.

3. Composition of functions is associative.

Definition (Symmetric group S_n). If X is finite, say |X| = n (usually use X = {1, 2, ···, n}), we write Sym X = S_n. This is the symmetric group of degree n.

It is important to note that the degree of the symmetric group is different from the order of the symmetric group. For example, S_3 has degree 3 but order 6. In general, the order of S_n is n!.

There are two ways to write out an element of the symmetric group. The

first is the two row notation.

Notation. (Two row notation) We write 1, 2, 3, ···, n on the top line and their images below, e.g.

    ( 1 2 3 )              ( 1 2 3 4 5 )
    ( 2 3 1 ) ∈ S_3  and   ( 2 1 3 4 5 ) ∈ S_5

In general, if σ : X → X, we write

    (  1     2     3    ···   n   )
    ( σ(1)  σ(2)  σ(3)  ···  σ(n) )

Example. For small n, we have

(i) When n = 1, S_1 consists of the single permutation

    ( 1 )
    ( 1 )

so S_1 = {e} ≅ C_1.

(ii) When n = 2, S_2 consists of

    ( 1 2 )   ( 1 2 )
    ( 1 2 )   ( 2 1 )

so S_2 ≅ C_2.

(iii) When n = 3, S_3 consists of

    ( 1 2 3 )   ( 1 2 3 )   ( 1 2 3 )   ( 1 2 3 )   ( 1 2 3 )   ( 1 2 3 )
    ( 1 2 3 )   ( 2 3 1 )   ( 3 1 2 )   ( 2 1 3 )   ( 3 2 1 )   ( 1 3 2 )

so S_3 ≅ D_6.

Note that S_3 is not abelian. Thus S_n is not abelian for n ≥ 3 since we can always view S_3 as a subgroup of S_n by fixing 4, 5, 6, ···, n.

In general, we can view D_{2n} as a subgroup of S_n because each symmetry is a permutation of the corners.

While the two row notation is fully general and can represent any (finite) permutation, it is clumsy to write and wastes a lot of space. It is also very annoying to type using LaTeX. Hence, most of the time, we actually use the cycle notation.

Notation (Cycle notation). If a map sends 1 ↦ 2, 2 ↦ 3, 3 ↦ 1, then we write it as a cycle (1 2 3). Alternatively, we can write (2 3 1) or (3 1 2), but by convention, we usually write the smallest number first. We leave out numbers that don’t move. So we write (1 2) instead of (1 2)(3).

For more complicated maps, we can write them as products of cycles. For example, in S_4, we can have things like (1 2)(3 4).

The order of each cycle is the length of the cycle, and the inverse is the cycle written the other way round, e.g. (1 2 3)^{-1} = (3 2 1) = (1 3 2).

Example.

(i) Suppose we want to simplify (1 2 3)(1 2). Recall that composition is from right to left. So 1 gets mapped to 3 ((1 2) maps 1 to 2, and (1 2 3) further maps it to 3). Then 3 gets mapped to 1. 2 is mapped to 2 itself. So (1 2 3)(1 2) = (1 3)(2) = (1 3).

(ii) (1 2 3 4)(1 4) = (1)(2 3 4) = (2 3 4).
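These computations can be double-checked by machine. In the Python sketch below, a permutation of {1, ···, n} is stored as a dictionary (our own representation), and compose follows the right-to-left convention used above; the function names are illustrative.

```python
def from_cycles(cycles, n):
    """Build a permutation of {1, ..., n} from a list of disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for i, x in enumerate(cycle):
            perm[x] = cycle[(i + 1) % len(cycle)]
    return perm


def compose(sigma, rho):
    """sigma ∘ rho: apply rho first, then sigma."""
    return {x: sigma[rho[x]] for x in rho}


# (1 2 3)(1 2) = (1 3) in S_3
lhs_1 = compose(from_cycles([(1, 2, 3)], 3), from_cycles([(1, 2)], 3))
print(lhs_1 == from_cycles([(1, 3)], 3))

# (1 2 3 4)(1 4) = (2 3 4) in S_4
lhs_2 = compose(from_cycles([(1, 2, 3, 4)], 4), from_cycles([(1, 4)], 4))
print(lhs_2 == from_cycles([(2, 3, 4)], 4))
```

Note that from_cycles expects the cycles it is given to be disjoint; products of overlapping cycles should be built with compose.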

Definition (k-cycles and transpositions). We call (a_1 a_2 a_3 ··· a_k) a k-cycle. 2-cycles are called transpositions. Two cycles are disjoint if no number appears in both cycles.

Example. (1 2) and (3 4) are disjoint but (1 2 3) and (1 2) are not.

Lemma. Disjoint cycles commute.

Proof. Suppose σ, τ ∈ S_n are disjoint cycles. Consider any a ∈ {1, ···, n}. We show that σ(τ(a)) = τ(σ(a)). If a is in neither σ nor τ, then σ(τ(a)) = τ(σ(a)) = a. Otherwise, wlog assume that a is in τ but not in σ. Then τ(a) ∈ τ and thus τ(a) ∉ σ. Thus σ(a) = a and σ(τ(a)) = τ(a). Therefore we have σ(τ(a)) = τ(σ(a)) = τ(a). Therefore τ and σ commute.

In general, non-disjoint cycles may not commute. For example, (1 3)(2 3) =

(1 3 2) while (2 3)(1 3) = (1 2 3).

Theorem. Any permutation in S_n can be written (essentially) uniquely as a product of disjoint cycles. (Essentially unique means unique up to re-ordering of cycles and rotation within cycles, e.g. (1 2) and (2 1).)

Proof. Let σ ∈ S_n. Start with (1 σ(1) σ^2(1) σ^3(1) ···). As the set {1, 2, 3, ···, n} is finite, for some k, we must have σ^k(1) already in the list. If σ^k(1) = σ^l(1), with l < k, then σ^{k-l}(1) = 1. So all σ^i(1) are distinct until we get back to 1. Thus we have the first cycle (1 σ(1) σ^2(1) σ^3(1) ··· σ^{k-1}(1)).

Now choose the smallest number that is not yet in a cycle, say j. Repeat to obtain a cycle (j σ(j) σ^2(j) ··· σ^{l-1}(j)). Since σ is a bijection, nothing in this cycle can be in previous cycles as well.

Repeat until all of {1, 2, 3, ···, n} are exhausted. This is essentially unique because every number j completely determines the whole cycle it belongs to, and whichever number we start with, we’ll end up with the same cycle.
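The proof is really an algorithm: follow 1 until it returns, then start again from the smallest unused number. A minimal Python sketch, assuming the permutation is given as a dictionary on {1, ···, n} (our representation) and omitting singleton cycles as in the cycle notation:

```python
def disjoint_cycles(perm):
    """Disjoint cycle decomposition of perm, a dict on {1, ..., n}.
    Each cycle starts at its smallest element; fixed points are omitted."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:        # follow start, sigma(start), sigma^2(start), ...
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


sigma = {1: 2, 2: 1, 3: 4, 4: 5, 5: 3, 6: 6}
print(disjoint_cycles(sigma))   # [(1, 2), (3, 4, 5)]
```

Because perm is a bijection, the inner loop always stops when it returns to start, exactly as argued in the proof.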

Definition (Cycle type). Write a permutation σ ∈ S_n in disjoint cycle notation.

The cycle type is the list of cycle lengths. This is unique up to re-ordering. We

often (but not always) leave out singleton cycles.

Example. (1 2) has cycle type 2 (transposition). (1 2)(3 4) has cycle type 2, 2

(double transposition). (1 2 3)(4 5) has cycle type 3, 2.

Lemma. For σ ∈ S_n, the order of σ is the least common multiple of cycle lengths in the disjoint cycle notation. In particular, a k-cycle has order k.

Proof. As disjoint cycles commute, we can group together each cycle when we take powers, i.e. if σ = τ_1 τ_2 ··· τ_l with τ_i all disjoint cycles, then σ^m = τ_1^m τ_2^m ··· τ_l^m.

Now if cycle τ_i has length k_i, then τ_i^{k_i} = e, and τ_i^m = e iff k_i | m. To get an m such that σ^m = e, we need all k_i to divide m, i.e. m is a common multiple of the k_i. Since the order is the least possible m such that σ^m = e, the order is the least common multiple of the k_i.

Example. Any transpositions and double transpositions have order 2.

(1 2 3)(4 5) has order 6.
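As a sanity check on the lemma, the following sketch computes the order of a permutation in two ways: as the lcm of its cycle lengths, and by brute-force repeated composition. The dict encoding and the function names are assumptions made for this illustration only.

```python
from math import gcd


def cycle_lengths(sigma):
    """Lengths of the disjoint cycles of sigma (dict: element -> image)."""
    seen, lengths = set(), []
    for start in sigma:
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = sigma[x]
            length += 1
        lengths.append(length)
    return lengths


def order_by_lcm(sigma):
    """Order of sigma as the least common multiple of its cycle lengths."""
    result = 1
    for k in cycle_lengths(sigma):
        result = result * k // gcd(result, k)
    return result


def order_by_powers(sigma):
    """Order of sigma found by composing it with itself until we reach e."""
    identity = {x: x for x in sigma}
    power, m = dict(sigma), 1
    while power != identity:
        power = {x: sigma[power[x]] for x in sigma}
        m += 1
    return m


sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}              # (1 2 3)(4 5)
print(order_by_lcm(sigma), order_by_powers(sigma))  # 6 6
```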

2.2 Sign of permutations

To classify different permutations, we can group different permutations according

to their cycle type. While this is a very useful thing to do, it is a rather fine

division. In this section, we will assign a “sign” to each permutation, and each

permutation can either be odd or even. This high-level classification allows us

to separate permutations into two sets, which is also a useful notion.

To define the sign, we first need to write permutations as products of

transpositions.

Proposition. Every permutation is a product of transpositions.

This is not a deep or mysterious fact. All it says is that you can rearrange

things however you want just by swapping two objects at a time.

Proof. As each permutation is a product of disjoint cycles, it suffices to prove that each cycle is a product of transpositions. Consider a cycle (a_1 a_2 a_3 ··· a_k). This is in fact equal to (a_1 a_2)(a_2 a_3) ··· (a_{k−1} a_k). Thus a k-cycle can be written as a product of k − 1 transpositions.

Note that the product is not unique. For example,

(1 2 3 4 5) = (1 2)(2 3)(3 4)(4 5) = (1 2)(2 3)(1 2)(3 4)(1 2)(4 5).

However, the number of terms in the product, mod 2, is always the same.

Theorem. Writing σ ∈ S_n as a product of transpositions in different ways, σ is either always composed of an even number of transpositions, or always an odd number of transpositions.

The proof is rather magical.

Proof. Write #(σ) for the number of cycles in disjoint cycle notation, including singleton cycles. So #(e) = n and #((1 2)) = n − 1. When we multiply σ by a transposition τ = (c d) (wlog assume c < d),

– If c, d are in the same σ-cycle, say (c a_2 a_3 ··· a_{k−1} d a_{k+1} ··· a_{k+l}), then

(c a_2 a_3 ··· a_{k−1} d a_{k+1} ··· a_{k+l})(c d) = (c a_{k+1} a_{k+2} ··· a_{k+l})(d a_2 a_3 ··· a_{k−1}).

So #(στ) = #(σ) + 1.

– If c, d are in different σ-cycles, say (d a_2 a_3 ··· a_{k−1}) and (c a_{k+1} a_{k+2} ··· a_{k+l}), then

(d a_2 a_3 ··· a_{k−1})(c a_{k+1} a_{k+2} ··· a_{k+l})(c d)
= (c a_2 a_3 ··· a_{k−1} d a_{k+1} ··· a_{k+l})(c d)(c d)
= (c a_2 a_3 ··· a_{k−1} d a_{k+1} ··· a_{k+l})

and #(στ) = #(σ) − 1.

Therefore for any transposition τ, #(στ) ≡ #(σ) + 1 (mod 2).

Now suppose σ = τ_1 ··· τ_l = τ′_1 ··· τ′_k. Since disjoint cycle notation is unique, #(σ) is uniquely determined by σ.

Now we can construct σ by starting with e and multiplying the transpositions one by one. Each time we add a transposition, we increase #(σ) by 1 (mod 2). So #(σ) ≡ #(e) + l (mod 2). Similarly, #(σ) ≡ #(e) + k (mod 2). So l ≡ k (mod 2).

Definition (Sign of permutation). Viewing σ ∈ S_n as a product of transpositions, σ = τ_1 ··· τ_l, we call sgn(σ) = (−1)^l. If sgn(σ) = 1, we call σ an even permutation. If sgn(σ) = −1, we call σ an odd permutation.

While l itself is not well-defined, it is either always odd or always even, and (−1)^l is well-defined.

Theorem. For n ≥ 2, sgn : S_n → {±1} is a surjective group homomorphism.

Proof. Suppose σ_1 = τ_1 ··· τ_{l_1} and σ_2 = τ′_1 ··· τ′_{l_2}. Then sgn(σ_1σ_2) = (−1)^{l_1+l_2} = (−1)^{l_1}(−1)^{l_2} = sgn(σ_1) sgn(σ_2). So it is a homomorphism.

It is surjective since sgn(e) = 1 and sgn((1 2)) = −1.

This was rather trivial to prove. The hard bit is showing that sgn is well-defined. If a question asks you to show that sgn is a well-defined group homomorphism, you have to show that it is well-defined.

Lemma. σ is an even permutation iff the number of cycles of even length is even.

Proof. A k-cycle can be written as k − 1 transpositions. Thus an even-length cycle is odd, and vice versa.

Since sgn is a group homomorphism, writing σ in disjoint cycle notation, σ = σ_1 σ_2 ··· σ_l, we get sgn(σ) = sgn(σ_1) ··· sgn(σ_l). Suppose there are m even-length cycles and n odd-length cycles, then sgn(σ) = (−1)^m 1^n. This is equal to 1 iff (−1)^m = 1, i.e. m is even.

Rather confusingly, odd length cycles are even, and even length cycles are odd.
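The lemma gives a practical way to compute sgn without ever writing down transpositions. The sketch below does this and then checks, by brute force over S_4, that the resulting function is multiplicative. Permutations are stored as tuples p with p[i] the image of i on {0, ..., n − 1}; this encoding and all function names are assumptions made for the illustration.

```python
from itertools import permutations


def cycle_lengths(p):
    """Cycle lengths of p, where p[i] is the image of i (singletons included)."""
    seen = [False] * len(p)
    lengths = []
    for start in range(len(p)):
        if seen[start]:
            continue
        length, x = 0, start
        while not seen[x]:
            seen[x] = True
            x = p[x]
            length += 1
        lengths.append(length)
    return lengths


def sign(p):
    """+1 if the number of even-length cycles is even, otherwise -1."""
    even_length_cycles = sum(1 for k in cycle_lengths(p) if k % 2 == 0)
    return 1 if even_length_cycles % 2 == 0 else -1


def compose(p, q):
    """Right-to-left composition: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


S4 = list(permutations(range(4)))
is_homomorphism = all(sign(compose(p, q)) == sign(p) * sign(q) for p in S4 for q in S4)
A4 = [p for p in S4 if sign(p) == 1]
print(is_homomorphism, len(A4))  # True 12
```

The count of 12 even permutations is exactly half of |S_4| = 24.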

Definition (Alternating group A_n). The alternating group A_n is the kernel of sgn, i.e. the even permutations. Since A_n is a kernel of a group homomorphism, A_n ≤ S_n.

Among the many uses of the sgn homomorphism, it is used in the definition of the determinant of a matrix: if A_{n×n} is a square matrix, then

det A = Σ_{σ∈S_n} sgn(σ) a_{1σ(1)} ··· a_{nσ(n)}.
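The determinant formula can be evaluated directly by summing over all permutations. The following is a sketch for small matrices only, since the sum has n! terms; the sign is computed from cycle lengths, and the names sign and det are assumptions made for the illustration.

```python
from itertools import permutations


def sign(p):
    """Sign of p (tuple, p[i] = image of i): each even-length cycle flips it."""
    seen = [False] * len(p)
    result = 1
    for start in range(len(p)):
        if seen[start]:
            continue
        length, x = 0, start
        while not seen[x]:
            seen[x] = True
            x = p[x]
            length += 1
        if length % 2 == 0:
            result = -result
    return result


def det(a):
    """Sum over sigma in S_n of sgn(sigma) * a[0][sigma(0)] * ... * a[n-1][sigma(n-1)]."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total


print(det([[1, 2], [3, 4]]))                   # -2
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
```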

Proposition. Any subgroup of S_n contains either no odd permutations or exactly half.

Proof. If the subgroup has at least one odd permutation τ, then there exists a bijection between its odd and even permutations by σ ↦ στ (bijection since σ ↦ στ^{-1} is a well-defined inverse). So there are as many odd permutations as even permutations.

After we prove the isomorphism theorem later, we can provide an even shorter

proof of this.

3 Lagrange’s Theorem

One can model a Rubik’s cube with a group, with each possible move correspond-

ing to a group element. Of course, Rubik’s cubes of different sizes correspond to

different groups.

Suppose I have a 4 × 4 × 4 Rubik’s cube, but I want to practice solving a 2 × 2 × 2 Rubik’s cube. It is easy. I just have to make sure every time I make a move, I move two layers together. Then I can pretend I am solving a 2 × 2 × 2 cube. This corresponds to picking a particular subgroup of the 4 × 4 × 4 group.

Now what if I have a 3 × 3 × 3 cube? I can still practice solving a 2 × 2 × 2 one. This time, I just look at the corners and pretend that the edges and centers do not exist. Then I am satisfied when the corners are in the right positions, while the centers and edges can be completely scrambled. In this case, we are not taking a subgroup. Instead, we are identifying certain moves together. In particular, we are treating two moves as the same as long as their difference is confined to the centers and edges.

Let G be the 3 × 3 × 3 cube group, and H be the subgroup of G that only permutes the edges and centers. Then for any a, b ∈ G, we think a and b are “the same” if a^{-1}b ∈ H. Then the set of things equivalent to a is aH = {ah : h ∈ H}. We call this a coset, and the set of cosets form a group.

An immediate question one can ask is: why not Ha = {ha : h ∈ H}? In this particular case, the two happen to be the same for all possible a. However, for a general subgroup H, they need not be. We can still define the coset aH = {ah : h ∈ H}, but these are less interesting. For example, the set of all {aH} will no longer form a group. We will look into these more in-depth in the next chapter. In this chapter, we will first look at results for general cosets. In particular, we will, step by step, prove the things we casually claimed above.

Definition (Cosets). Let H ≤ G and a ∈ G. Then the set aH = {ah : h ∈ H} is a left coset of H and Ha = {ha : h ∈ H} is a right coset of H.

Example.

(i) Take 2Z ≤ Z. Then 6 + 2Z = {all even numbers} = 0 + 2Z. 1 + 2Z = {all odd numbers} = 17 + 2Z.

(ii) Take G = S_3, let H = ⟨(1 2)⟩ = {e, (1 2)}. The left cosets are

eH = (1 2)H = {e, (1 2)}

(1 3)H = (1 2 3)H = {(1 3), (1 2 3)}

(2 3)H = (1 3 2)H = {(2 3), (1 3 2)}

(iii) Take G = D_6 (which is isomorphic to S_3). Recall D_6 = ⟨r, s | r^3 = e = s^2, rs = sr^{-1}⟩. Take H = ⟨s⟩ = {e, s}. We have left coset rH = {r, rs = sr^{-1}} and the right coset Hr = {r, sr}. Thus rH ≠ Hr.

Proposition. aH = bH ⇔ b^{-1}a ∈ H.

Proof. (⇒) Since a ∈ aH, a ∈ bH. Then a = bh for some h ∈ H. So b^{-1}a = h ∈ H.

(⇐). Let b^{-1}a = h_0, so a = bh_0. Then ∀ah ∈ aH, we have ah = b(h_0h) ∈ bH. So aH ⊆ bH. Similarly, bH ⊆ aH. So aH = bH.

Definition (Partition). Let X be a set, and X_1, ···, X_n be subsets of X. The X_i are called a partition of X if ⋃ X_i = X and X_i ∩ X_j = ∅ for i ≠ j. i.e. every element is in exactly one of the X_i.

Lemma. The left cosets of a subgroup H ≤ G partition G, and every coset has the same size.

Proof. For each a ∈ G, a ∈ aH. Thus the union of all cosets gives all of G. Now we have to show that for all a, b ∈ G, the cosets aH and bH are either the same or disjoint.

Suppose that aH and bH are not disjoint. Let ah_1 = bh_2 ∈ aH ∩ bH. Then b^{-1}a = h_2h_1^{-1} ∈ H. So aH = bH.

To show that each coset has the same size, note that f : H → aH with f(h) = ah is invertible with inverse f^{-1}(h) = a^{-1}h. Thus there exists a bijection between them and they have the same size.

Definition (Index of a subgroup). The index of H in G, written |G : H|, is the number of left cosets of H in G.

Theorem (Lagrange’s theorem). If G is a finite group and H is a subgroup of G, then |H| divides |G|. In particular,

|H||G : H| = |G|.

Note that the converse is not true. If k divides |G|, there is not necessarily a subgroup of order k, e.g. |A_4| = 12 but there is no subgroup of order 6. However, we will later see that this is true if k is a prime (cf. Cauchy’s theorem).

Proof. Suppose that there are |G : H| left cosets in total. Since the left cosets partition G, and each coset has size |H|, we have

|H||G : H| = |G|.

Again, the hard part of this proof is to prove that the left cosets partition G and have the same size. If you are asked to prove Lagrange’s theorem in exams, that is what you actually have to prove.

Corollary. The order of an element divides the order of the group, i.e. for any

finite group G and a ∈ G, ord(a) divides |G|.

Proof. Consider the subgroup generated by a, which has order ord(a). Then by Lagrange’s theorem, ord(a) divides |G|.

Corollary. The exponent of a group divides the order of the group, i.e. for any finite group G and a ∈ G, a^{|G|} = e.

Proof. We know that |G| = k ord(a) for some k ∈ N. Then a^{|G|} = (a^{ord(a)})^k = e.

Corollary. Groups of prime order are cyclic and are generated by every non-identity element.

Proof. Say |G| = p. If a ∈ G is not the identity, the subgroup generated by a must have order p since it has to divide p. Thus the subgroup generated by a has the same size as G and they must be equal. Then G must be cyclic since it is equal to the subgroup generated by a.

A useful way to think about cosets is to view them as equivalence classes.

To do so, we need to first define what an equivalence class is.

Definition (Equivalence relation). An equivalence relation ∼ on a set A is a relation that is reflexive, symmetric and transitive. i.e.

(i) (∀x) x ∼ x (reflexivity)

(ii) (∀x, y) x ∼ y ⇒ y ∼ x (symmetry)

(iii) (∀x, y, z) [(x ∼ y) ∧ (y ∼ z) ⇒ x ∼ z] (transitivity)

Example. The following relations are equivalence relations:

(i) Consider Z. The relation ≡_n defined as a ≡_n b ⇔ n | (a − b).

(ii) Consider the set (formally: class) of all finite groups. Then “is isomorphic to” is an equivalence relation.

Definition (Equivalence class). Given an equivalence relation ∼ on A, the equivalence class of a is

[a]_∼ = [a] = {b ∈ A : a ∼ b}

Proposition. The equivalence classes form a partition of A.

Proof. By reflexivity, we have a ∈ [a]. Thus the equivalence classes cover the whole set. We must now show that for all a, b ∈ A, either [a] = [b] or [a] ∩ [b] = ∅.

Suppose [a] ∩ [b] ≠ ∅. Then ∃c ∈ [a] ∩ [b]. So a ∼ c, b ∼ c. By symmetry, c ∼ b. By transitivity, we have a ∼ b. Now for all b′ ∈ [b], we have b ∼ b′. Thus by transitivity, we have a ∼ b′. Thus [b] ⊆ [a]. Similarly, [a] ⊆ [b] and [a] = [b].

Lemma. Given a group G and a subgroup H, define the equivalence relation on G with a ∼ b iff b^{-1}a ∈ H. The equivalence classes are the left cosets of H.

Proof. First show that it is an equivalence relation.

(i) Reflexivity: Since a^{-1}a = e ∈ H, a ∼ a.

(ii) Symmetry: a ∼ b ⇒ b^{-1}a ∈ H ⇒ (b^{-1}a)^{-1} = a^{-1}b ∈ H ⇒ b ∼ a.

(iii) Transitivity: If a ∼ b and b ∼ c, we have b^{-1}a, c^{-1}b ∈ H. So c^{-1}bb^{-1}a = c^{-1}a ∈ H. So a ∼ c.

To show that the equivalence classes are the cosets, we have a ∼ b ⇔ b^{-1}a ∈ H ⇔ aH = bH.

Example. Consider (Z, +), and for fixed n, take the subgroup nZ. The cosets are 0 + H, 1 + H, ···, (n − 1) + H. We can write these as [0], [1], [2], ···, [n − 1]. To perform arithmetic “mod n”, define [a] + [b] = [a + b], and [a][b] = [ab]. We need to check that it is well-defined, i.e. it doesn’t depend on the choice of the representative of [a].

If [a_1] = [a_2] and [b_1] = [b_2], then a_1 = a_2 + kn and b_1 = b_2 + ln, then a_1 + b_1 = a_2 + b_2 + n(k + l) and a_1b_1 = a_2b_2 + n(kb_2 + la_2 + kln). So [a_1 + b_1] = [a_2 + b_2] and [a_1b_1] = [a_2b_2].

We have seen that (Z_n, +_n) is a group. What happens with multiplication? We can only take elements which have inverses (these are called units, cf. IB Groups, Rings and Modules). Call the set of them U_n = {[a] : (a, n) = 1}. We’ll see these are the units.

Definition (Euler totient function). ϕ(n) = |U_n|.

Example. If p is a prime, ϕ(p) = p − 1. ϕ(4) = 2.

Proposition. U_n is a group under multiplication mod n.

Proof. The operation is well-defined as shown above. To check the axioms:

0. Closure: if a, b are coprime to n, then a · b is also coprime to n. So [a], [b] ∈ U_n ⇒ [a] · [b] = [a · b] ∈ U_n.

1. Identity: [1]

2. Inverse: Let [a] ∈ U_n. Consider the map U_n → U_n with [c] ↦ [ac]. This is injective: if [ac_1] = [ac_2], then n divides a(c_1 − c_2). Since a is coprime to n, n divides c_1 − c_2, so [c_1] = [c_2]. Since U_n is finite, any injection (U_n → U_n) is also a surjection. So there exists a c such that [ac] = [a][c] = [1]. So [c] = [a]^{-1}.

3. Associativity (and also commutativity): inherited from Z.

Theorem (Fermat-Euler theorem). Let n ∈ N and a ∈ Z coprime to n. Then

a^{ϕ(n)} ≡ 1 (mod n).

In particular, (Fermat’s Little Theorem) if n = p is a prime, then for any a not a multiple of p,

a^{p−1} ≡ 1 (mod p).

Proof. As a is coprime with n, [a] ∈ U_n. Then [a]^{|U_n|} = [1], i.e. a^{ϕ(n)} ≡ 1 (mod n).
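A small numerical sketch of U_n, ϕ(n) and the Fermat-Euler theorem is given below. The function names units, phi and inverse are assumptions made for this illustration, and the inverse is found by brute-force search, mirroring the existence argument in the proof rather than any efficient algorithm.

```python
from math import gcd


def units(n):
    """Representatives in {0, ..., n-1} of the classes [a] with (a, n) = 1."""
    return [a for a in range(n) if gcd(a, n) == 1]


def phi(n):
    """Euler totient function, computed as |U_n|."""
    return len(units(n))


def inverse(a, n):
    """Search U_n for the c with [a][c] = [1]; it exists when (a, n) = 1."""
    for c in units(n):
        if (a * c) % n == 1:
            return c
    raise ValueError("a is not coprime to n")


print(units(12), phi(12))   # [1, 5, 7, 11] 4
print(inverse(7, 12))       # 7

# Fermat-Euler: a^phi(n) = 1 (mod n) for every a coprime to n.
fermat_euler_holds = all(
    pow(a, phi(n), n) == 1 for n in range(2, 60) for a in units(n)
)
print(fermat_euler_holds)   # True
```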

3.1 Small groups

We will study the structures of certain small groups.

Example (Using Lagrange theorem to find subgroups). To find subgroups of D_10, we know that the subgroups must have size 1, 2, 5 or 10:

1: {e}

2: The groups generated by the 5 reflections of order 2

5: The group must be cyclic since it has prime order 5. It is then generated by an element of order 5, i.e. r, r^2, r^3 and r^4. They generate the same group ⟨r⟩.

10: D_10

As for D_8, subgroups must have order 1, 2, 4 or 8.

1: {e}

2: 5 elements of order 2, namely 4 reflections and r^2.

4: First consider the subgroup isomorphic to C_4, which is ⟨r⟩. There are two other non-cyclic groups.

8: D_8

Proposition. Any group of order 4 is either isomorphic to C_4 or C_2 × C_2.

Proof. Let |G| = 4. By Lagrange theorem, possible element orders are 1 (e only), 2 and 4. If there is an element a ∈ G of order 4, then G = ⟨a⟩ ≅ C_4.

Otherwise all non-identity elements have order 2. Then G must be abelian (For any a, b, (ab)^2 = 1 ⇒ ab = (ab)^{-1} ⇒ ab = b^{-1}a^{-1} ⇒ ab = ba). Pick 2 elements of order 2, say b, c ∈ G, then ⟨b⟩ = {e, b} and ⟨c⟩ = {e, c}. So ⟨b⟩ ∩ ⟨c⟩ = {e}. As G is abelian, ⟨b⟩ and ⟨c⟩ commute. We know that bc = cb has order 2 as well, and is the only element of G left. So G ≅ ⟨b⟩ × ⟨c⟩ ≅ C_2 × C_2 by the direct product theorem.

Proposition. A group of order 6 is either cyclic or dihedral (i.e. is isomorphic to C_6 or D_6). (See proof in next section)

3.2 Left and right cosets

As |aH| = |H| and similarly |H| = |Ha|, left and right cosets have the same size. Are they necessarily the same? We’ve previously shown that they might not be the same. In some other cases, they are.

Example.

(i) Take G = (Z, +) and H = 2Z. We have 0 + 2Z = 2Z + 0 = even numbers and 1 + 2Z = 2Z + 1 = odd numbers. Since G is abelian, aH = Ha for all a ∈ G, H ≤ G.

(ii) Let G = D_6 = ⟨r, s | r^3 = e = s^2, rs = sr^{-1}⟩. Let U = ⟨r⟩. Since the cosets partition G, one must be U and the other sU = {s, sr = r^2s, sr^2 = rs} = Us. So for all a ∈ G, aU = Ua.

(iii) Let G = D_6 and take H = ⟨s⟩. We have H = {e, s}, rH = {r, rs = sr^{-1}} and r^2H = {r^2, r^2s = sr}; while H = {e, s}, Hr = {r, sr} and Hr^2 = {r^2, sr^2}. So the left and right cosets do not coincide.

This distinction will become useful in the next chapter.

4 Quotient groups

In the previous section, when attempting to pretend that a 3 × 3 × 3 Rubik’s cube is a 2 × 2 × 2 one, we came up with the cosets aH, and claimed that these form a group. We also said that this is not the case for an arbitrary subgroup H, but only for subgroups that satisfy aH = Ha. Before we prove these, we first study these subgroups a bit.

4.1 Normal subgroups

Definition (Normal subgroup). A subgroup K of G is a normal subgroup if

(∀a ∈ G)(∀k ∈ K) aka^{-1} ∈ K.

We write K ◁ G. This is equivalent to:

(i) (∀a ∈ G) aK = Ka, i.e. left coset = right coset

(ii) (∀a ∈ G) aKa^{-1} = K (cf. conjugacy classes)

From the example last time, H = ⟨s⟩ ≤ D_6 is not a normal subgroup, but K = ⟨r⟩ ◁ D_6. We know that every group G has at least two normal subgroups {e} and G.

Lemma.

(i) Every subgroup of index 2 is normal.

(ii) Any subgroup of an abelian group is normal.

Proof.

(i) If K ≤ G has index 2, then there are only two possible cosets K and G \ K. As eK = Ke and cosets partition G, the other left coset and right coset must be G \ K. So all left cosets and right cosets are the same.

(ii) For all a ∈ G and k ∈ K, we have aka^{-1} = aa^{-1}k = k ∈ K.

Proposition. Every kernel is a normal subgroup.

Proof. Given homomorphism f : G → H and some a ∈ G, for all k ∈ ker f, we have f(aka^{-1}) = f(a)f(k)f(a)^{-1} = f(a)ef(a)^{-1} = e. Therefore aka^{-1} ∈ ker f by definition of the kernel.

In fact, we will see in the next section that all normal subgroups are kernels

of some homomorphism.

Example. Consider G = D_8. Then K = ⟨r^2⟩ is normal. Check: Any element of G is either sr^ℓ or r^ℓ for some ℓ. Clearly e satisfies aka^{-1} ∈ K. Now check r^2: For the case of sr^ℓ, we have sr^ℓ r^2 (sr^ℓ)^{-1} = sr^ℓ r^2 r^{−ℓ} s^{-1} = sr^2s = ssr^{−2} = r^2. For the case of r^ℓ, r^ℓ r^2 r^{−ℓ} = r^2.

Proposition. A group of order 6 is either cyclic or dihedral (i.e. ≅ C_6 or D_6).

Proof. Let |G| = 6. By Lagrange theorem, possible element orders are 1, 2, 3 and 6. If there is an a ∈ G of order 6, then G = ⟨a⟩ ≅ C_6. Otherwise, we can only have elements of orders 2 and 3 other than the identity. If G only has elements of order 2, the order must be a power of 2 by Sheet 1 Q. 8, which is not the case. So there must be an element r of order 3. So ⟨r⟩ ◁ G as it has index 2. Now G must also have an element s of order 2 by Sheet 1 Q. 9.

Since ⟨r⟩ is normal, we know that srs^{-1} ∈ ⟨r⟩. If srs^{-1} = e, then r = e, which is not true. If srs^{-1} = r, then sr = rs and sr has order 6 (lcm of the orders of s and r), which was ruled out above. Otherwise if srs^{-1} = r^2 = r^{-1}, then G is dihedral by definition of the dihedral group.

4.2 Quotient groups

Proposition. Let K ◁ G. Then the set of (left) cosets of K in G is a group under the operation aK ∗ bK = (ab)K.

Proof. First show that the operation is well-defined. If aK = a′K and bK = b′K, we want to show that aK ∗ bK = a′K ∗ b′K. We know that a′ = ak_1 and b′ = bk_2 for some k_1, k_2 ∈ K. Then a′b′ = ak_1bk_2. We know that b^{-1}k_1b ∈ K. Let b^{-1}k_1b = k_3. Then k_1b = bk_3. So a′b′ = abk_3k_2 ∈ (ab)K. So picking a different representative of the coset gives the same product.

1. Closure: If aK, bK are cosets, then (ab)K is also a coset

2. Identity: The identity is eK = K (clear from definition)

3. Inverse: The inverse of aK is a^{-1}K (clear from definition)

4. Associativity: Follows from the associativity of G.

Definition (Quotient group). Given a group G and a normal subgroup K, the quotient group or factor group of G by K, written as G/K, is the set of (left) cosets of K in G under the operation aK ∗ bK = (ab)K.

Note that the set of left cosets also exists for non-normal subgroups (abnormal subgroups?), but the group operation above is not well defined.

Example.

(i) Take G = Z and nZ (which must be normal since G is abelian), the cosets are k + nZ for 0 ≤ k < n. The quotient group is Z_n. So we can write Z/(nZ) = Z_n. In fact these are the only quotient groups of Z since nZ are the only subgroups.

Note that if G is abelian, G/K is also abelian.

(ii) Take K = ⟨r⟩ ◁ D_6. We have two cosets K and sK. So D_6/K has order 2 and is isomorphic to C_2.

(iii) Take K = ⟨r^2⟩ ◁ D_8. We know that G/K should have 8/2 = 4 elements. We have G/K = {K, rK = r^3K, sK = sr^2K, srK = sr^3K}. We see that all elements (except K) have order 2, so G/K ≅ C_2 × C_2.

Note that quotient groups are not subgroups of G. They contain different kinds of elements. For example, Z/nZ ≅ C_n are finite, but all non-trivial subgroups of Z are infinite.

Example. (Non-example) Consider D_6 with H = ⟨s⟩. H is not a normal subgroup. We have rH ∗ r^2H = r^3H = H, but rH = rsH and r^2H = srH (by considering the individual elements). So we have rsH ∗ srH = r^2H ≠ H, and the operation is not well-defined.
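The contrast between normal and non-normal subgroups can be checked exhaustively in a small case. The sketch below works in S_3 (isomorphic to D_6), with permutations of {0, 1, 2} stored as tuples, and tests whether aK ∗ bK = (ab)K gives the same coset for every choice of representatives. The function name product_is_well_defined and the encoding are assumptions made for this illustration.

```python
from itertools import permutations


def compose(p, q):
    """Right-to-left composition: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def product_is_well_defined(G, K):
    """Check that (a2 b2)K = (ab)K whenever a2 is in aK and b2 is in bK."""
    def coset(a):
        return frozenset(compose(a, k) for k in K)

    for a in G:
        for b in G:
            target = coset(compose(a, b))
            for a2 in coset(a):
                for b2 in coset(b):
                    if coset(compose(a2, b2)) != target:
                        return False
    return True


G = list(permutations(range(3)))
rotations = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # index 2, hence normal
reflection = [(0, 1, 2), (1, 0, 2)]             # generated by one transposition

print(product_is_well_defined(G, rotations))    # True
print(product_is_well_defined(G, reflection))   # False
```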

Lemma. Given K ◁ G, the quotient map q : G → G/K with g ↦ gK is a surjective group homomorphism.

Proof. q(ab) = (ab)K = aKbK = q(a)q(b). So q is a group homomorphism. Also for all aK ∈ G/K, q(a) = aK. So it is surjective.

Note that the kernel of the quotient map is K itself. So any normal subgroup is a kernel of some homomorphism.

Proposition. The quotient of a cyclic group is cyclic.

Proof. Let G = C_n with H ≤ C_n. We know that H is also cyclic. Say C_n = ⟨c⟩ and H = ⟨c^k⟩ ≅ C_ℓ, where kℓ = n. We have C_n/H = {H, cH, c^2H, ···, c^{k−1}H} = ⟨cH⟩ ≅ C_k.

4.3 The Isomorphism Theorem

Now we come to the Really Important Theorem

Theorem (The Isomorphism Theorem). Let f : G → H be a group homomorphism with kernel K. Then K ◁ G and G/K ≅ im f.

Proof. We have proved that K ◁ G before. We define a group homomorphism θ : G/K → im f by θ(aK) = f(a).

First check that this is well-defined: If a_1K = a_2K, then a_2^{-1}a_1 ∈ K. So

f(a_2)^{-1}f(a_1) = f(a_2^{-1}a_1) = e.

So f(a_1) = f(a_2) and θ(a_1K) = θ(a_2K).

Now we check that it is a group homomorphism:

θ(aKbK) = θ(abK) = f(ab) = f(a)f(b) = θ(aK)θ(bK).

To show that it is injective, suppose θ(aK) = θ(bK). Then f(a) = f(b). Hence f(b)^{-1}f(a) = e. Hence b^{-1}a ∈ K. So aK = bK.

By definition, θ is surjective since im θ = im f. So θ gives an isomorphism G/K ≅ im f ≤ H.

If f is injective, then the kernel is {e}, so G/K ≅ G and G is isomorphic to a subgroup of H. We can think of f as an inclusion map. If f is surjective, then im f = H. In this case, G/K ≅ H.

Example.

(i) Take f : GL_n(R) → R^∗ with A ↦ det A, ker f = SL_n(R). im f = R^∗ as for all λ ∈ R^∗, the diagonal matrix

( λ 0 ··· 0 )
( 0 1 ··· 0 )
( ⋮ ⋮ ⋱ ⋮ )
( 0 0 ··· 1 )

has determinant λ. So we know that GL_n(R)/SL_n(R) ≅ R^∗.

(ii) Define θ : (R, +) → (C^∗, ×) with r ↦ exp(2πir). This is a group homomorphism since θ(r + s) = exp(2πi(r + s)) = exp(2πir) exp(2πis) = θ(r)θ(s). We know that the kernel is Z ◁ R. Clearly the image is the unit circle (S^1, ×). So R/Z ≅ (S^1, ×).

(iii) G = (Z_p^∗, ×) for prime p ≠ 2. We have f : G → G with a ↦ a^2. This is a homomorphism since (ab)^2 = a^2b^2 (Z_p^∗ is abelian). The kernel is {±1} = {1, p − 1}. We know that im f ≅ G/ker f with order (p − 1)/2. These are known as quadratic residues.
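Example (iii) is easy to check numerically. The sketch below computes the kernel and image of the squaring map on the non-zero residues mod p; the function name squares_mod is an assumption made for this illustration, and p is assumed to be an odd prime.

```python
def squares_mod(p):
    """Kernel and image of a -> a^2 on the non-zero residues mod an odd prime p."""
    G = range(1, p)
    kernel = [a for a in G if (a * a) % p == 1]
    image = sorted({(a * a) % p for a in G})
    return kernel, image


kernel, image = squares_mod(11)
print(kernel)                         # [1, 10]
print(image)                          # [1, 3, 4, 5, 9]
print(len(image) == (11 - 1) // 2)    # True
```

The image has (p − 1)/2 elements, as the isomorphism theorem predicts from a kernel of size 2.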

Lemma. Any cyclic group is isomorphic to either Z or Z/(nZ) for some n ∈ N.

Proof. Let G = ⟨c⟩. Define f : Z → G with m ↦ c^m. This is a group homomorphism since c^{m_1+m_2} = c^{m_1}c^{m_2}. f is surjective since G is by definition all c^m for all m. We know that ker f ◁ Z. We have three possibilities. Either

(i) ker f = {e}, so f is an isomorphism and G ≅ Z; or

(ii) ker f = Z, then G ≅ Z/Z = {e} = C_1; or

(iii) ker f = nZ (since these are the only proper subgroups of Z), then G ≅ Z/(nZ).

Definition (Simple group). A group is simple if it has no non-trivial proper

normal subgroup, i.e. only {e} and G are normal subgroups.

Example. C_p for prime p are simple groups since they have no non-trivial proper subgroups at all, let alone normal ones. A_5 is simple, which we will prove after Chapter 6.

The finite simple groups are the building blocks of all finite groups. All finite simple groups have been classified (The Atlas of Finite Groups). If we have K ◁ G with K ≠ G or {e}, then we can “quotient out” G into G/K. If G/K is not simple, repeat. Then we can write G as an “inverse quotient” of simple groups.

5 Group actions

Recall that we came up with groups to model symmetries and permutations.

Intuitively, elements of groups are supposed to “do things”. However, as we

developed group theory, we abstracted these away and just looked at how

elements combine to form new elements. Group actions recapture this idea and

make each group element correspond to some function.

5.1 Group acting on sets

Definition (Group action). Let X be a set and G be a group. An action of G on X is a homomorphism φ : G → Sym X.

This means that the homomorphism φ turns each element g ∈ G into a permutation of X, in a way that respects the group structure.

Instead of writing φ(g)(x), we usually directly write g(x) or gx.

Alternatively, we can define the group action as follows:

Proposition. Let X be a set and G be a group. Then φ : G → Sym X is a homomorphism (i.e. an action) iff θ : G × X → X defined by θ(g, x) = φ(g)(x) satisfies

0. (∀g ∈ G)(∀x ∈ X) θ(g, x) ∈ X.

1. (∀x ∈ X) θ(e, x) = x.

2. (∀g, h ∈ G)(∀x ∈ X) θ(g, θ(h, x)) = θ(gh, x).

This criterion is almost the definition of a homomorphism. However, here we do not explicitly require θ(g, ·) to be a bijection, but require θ(e, ·) to be the identity function. This automatically ensures that θ(g, ·) is a bijection, since when composed with θ(g^{-1}, ·), it gives θ(e, ·), which is the identity. So θ(g, ·) has an inverse. This is usually an easier thing to show.

Example.

(i) Trivial action: for any group G acting on any set X, we can have φ(g) = 1_X for all g, i.e. G does nothing.

(ii) S_n acts on {1, ···, n} by permutation.

(iii) D_{2n} acts on the vertices of a regular n-gon (or the set {1, ···, n}).

(iv) The rotations of a cube act on the faces/vertices/diagonals/axes of the cube.

Note that different groups can act on the same sets, and the same group can

act on different sets.

Definition (Kernel of action). The kernel of an action G on X is the kernel of φ, i.e. all g such that φ(g) = 1_X.

Note that by the isomorphism theorem, ker φ ◁ G and G/ker φ is isomorphic to a subgroup of Sym X.

Example.

(i) D_{2n} acting on {1, 2, ···, n} gives φ : D_{2n} → S_n with kernel {e}.

(ii) Let G be the rotations of a cube and let it act on the three axes x, y, z through the faces. We have φ : G → S_3. Then any rotation by 180° about one of these axes doesn’t change the axes, i.e. acts as the identity. So the kernel of the action has at least 4 elements: e and the three 180° rotations. In fact, we’ll see later that these 4 are exactly the kernel.

Definition (Faithful action). An action is faithful if the kernel is just {e}.

5.2 Orbits and Stabilizers

Definition (Orbit of action). Given an action G on X, the orbit of an element x ∈ X is

orb(x) = G(x) = {y ∈ X : (∃g ∈ G) g(x) = y}.

Intuitively, it is the elements that x can possibly get mapped to.

Definition (Stabilizer of action). The stabilizer of x is

stab(x) = G_x = {g ∈ G : g(x) = x} ⊆ G.

Intuitively, it is the elements in G that do not change x.

Lemma. stab(x) is a subgroup of G.

Proof. We know that e(x) = x by definition. So stab(x) is non-empty. Suppose g, h ∈ stab(x), then gh^{-1}(x) = g(h^{-1}(x)) = g(x) = x. So gh^{-1} ∈ stab(x). So stab(x) is a subgroup.

Example.

(i) Consider D_8 acting on the corners of the square X = {1, 2, 3, 4}. Then orb(1) = X since 1 can go anywhere by rotations. stab(1) = {e, reflection in the line through 1}.

(ii) Consider the rotations of a cube acting on the three axes x, y, z. Then orb(x) is everything, and stab(x) contains e, 180° rotations and rotations about the x axis.

Definition (Transitive action). An action G on X is transitive if (∀x) orb(x) = X, i.e. you can reach any element from any element.

Lemma. The orbits of an action partition X.

Proof. Firstly, (∀x)(x ∈ orb(x)) as e(x) = x. So every x is in some orbit.

Then suppose z ∈ orb(x) and z ∈ orb(y), we have to show that orb(x) = orb(y). We know that z = g_1(x) and z = g_2(y) for some g_1, g_2. Then g_1(x) = g_2(y) and y = g_2^{-1}g_1(x).

For any w = g_3(y) ∈ orb(y), we have w = g_3g_2^{-1}g_1(x). So w ∈ orb(x). Thus orb(y) ⊆ orb(x) and similarly orb(x) ⊆ orb(y). Therefore orb(x) = orb(y).

Suppose a group G acts on X. We fix an x ∈ X. Then by definition of the orbit, given any g ∈ G, we have g(x) ∈ orb(x). So each g ∈ G gives us a member of orb(x). Conversely, every object in orb(x) arises this way, by definition of orb(x). However, different elements in G can give us the same element of the orbit. In particular, if g ∈ stab(x), then hg and h give us the same object in orb(x) since hg(x) = h(g(x)) = h(x). So we have a correspondence between things in orb(x) and members of G, “up to stab(x)”.

Theorem (Orbit-stabilizer theorem). Let the group G act on X. Then there is a bijection between orb(x) and cosets of stab(x) in G. In particular, if G is finite, then

|orb(x)||stab(x)| = |G|.

Proof. We biject the cosets of stab(x) with elements in the orbit of x. Recall that G : stab(x) is the set of cosets of stab(x). We can define

θ : (G : stab(x)) → orb(x)

g stab(x) ↦ g(x).

This is well-defined: if g stab(x) = h stab(x), then h = gk for some k ∈ stab(x). So h(x) = g(k(x)) = g(x).

This map is surjective since for any y ∈ orb(x), there is some g ∈ G such that g(x) = y, by definition. Then θ(g stab(x)) = y. It is injective since if g(x) = h(x), then h^{-1}g(x) = x. So h^{-1}g ∈ stab(x). So g stab(x) = h stab(x).

Hence the number of cosets is |orb(x)|. Then the result follows from Lagrange’s theorem.
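The orbit-stabilizer count can be verified directly for small permutation groups. In the sketch below a group element is a tuple g with g[x] the image of x, acting on {0, ..., n − 1}; this encoding and the function names orbit and stabilizer are assumptions made for the illustration.

```python
from itertools import permutations


def orbit(G, x):
    """All points that x can be sent to by some element of G."""
    return {g[x] for g in G}


def stabilizer(G, x):
    """All elements of G that fix x."""
    return [g for g in G if g[x] == x]


# S_4 acting on {0, 1, 2, 3}: the action is transitive.
S4 = list(permutations(range(4)))
print(len(orbit(S4, 0)), len(stabilizer(S4, 0)), len(S4))   # 4 6 24

# The two-element group generated by a transposition, acting on {0, 1, 2}.
H = [(0, 1, 2), (1, 0, 2)]
print(orbit(H, 0), len(stabilizer(H, 0)))   # {0, 1} 1
print(orbit(H, 2), len(stabilizer(H, 2)))   # {2} 2
```

In each case the orbit size times the stabilizer size equals the order of the group.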

An important application of the orbit-stabilizer theorem is determining group

sizes. To find the order of the symmetry group of, say, a pyramid, we find

something for it to act on, pick a favorite element, and find the orbit and

stabilizer sizes.

Example.

(i) Suppose we want to know how big D_{2n} is. D_{2n} acts on the vertices {1, 2, 3, ···, n} transitively. So |orb(1)| = n. Also, stab(1) = {e, reflection in the line through 1}. So |D_{2n}| = |orb(1)||stab(1)| = 2n.

Note that if the action is transitive, then all orbits have size |X| and thus all stabilizers have the same size.

(ii) Let ⟨(1 2)⟩ act on {1, 2, 3}. Then orb(1) = {1, 2} and stab(1) = {e}. orb(3) = {3} and stab(3) = ⟨(1 2)⟩.

(iii) Consider S_4 acting on {1, 2, 3, 4}. We know that orb(1) = X and |S_4| = 24. So |stab(1)| = 24/4 = 6. That makes it easier to find stab(1). Clearly S_{{2,3,4}} ≅ S_3 fix 1. So S_{{2,3,4}} ≤ stab(1). However, |S_3| = 6 = |stab(1)|, so this is all of the stabilizer.

5.3 Important actions

Given any group G, there are a few important actions we can define. In particular, we will define the conjugation action, which is a very important concept on its own. In fact, the whole of the next chapter will be devoted to studying conjugation in the symmetric groups.

First, we will study some less important examples of actions.

Lemma (Left regular action). Any group G acts on itself by left multiplication.

This action is faithful and transitive.

Proof. We have

1. (∀g ∈ G)(x ∈ G) g(x) = g · x ∈ G by definition of a group.

2. (∀x ∈ G) e · x = x by definition of a group.

3. g(hx) = (gh)x by associativity.

So it is an action.

To show that it is faithful, we want to know that [(∀x ∈ G) gx = x] ⇒ g = e. This follows directly from the uniqueness of identity.

To show that it is transitive, ∀x, y ∈ G, then (yx^{-1})(x) = y. So any x can be sent to any y.

Theorem (Cayley’s theorem). Every group is isomorphic to some subgroup of

some symmetric group.

Proof. Take the left regular action of G on itself. This gives a group homomorphism φ : G → Sym G with ker φ = {e} as the action is faithful. By the isomorphism theorem, G ≅ im φ ≤ Sym G.
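Cayley's theorem can be made concrete by writing out the permutation that each group element induces under the left regular action. The sketch below does this for Z_4 under addition mod 4; the function name left_regular and the representation of a group as a list of elements plus a binary operation are assumptions made for the illustration.

```python
def left_regular(elements, op):
    """Map each g to the permutation x -> g*x, written on indices 0..|G|-1."""
    index = {x: i for i, x in enumerate(elements)}
    return {g: tuple(index[op(g, x)] for x in elements) for g in elements}


def compose(p, q):
    """Right-to-left composition: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


elements = [0, 1, 2, 3]


def add_mod_4(a, b):
    return (a + b) % 4


phi = left_regular(elements, add_mod_4)
print(phi[1])   # (1, 2, 3, 0)

# phi is a homomorphism, and it is injective because the action is faithful.
is_homomorphism = all(
    phi[add_mod_4(g, h)] == compose(phi[g], phi[h])
    for g in elements for h in elements
)
is_injective = len(set(phi.values())) == len(elements)
print(is_homomorphism, is_injective)   # True True
```

The image of phi is then a copy of Z_4 sitting inside the symmetric group on four points.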

Lemma (Left coset action). Let H ≤ G. Then G acts on the left cosets of H by left multiplication transitively.

Proof. First show that it is an action:

0. g(aH) = (ga)H is a coset of H.

1. e(aH) = (ea)H = aH.

2. g_1(g_2(aH)) = g_1((g_2a)H) = (g_1g_2a)H = (g_1g_2)(aH).

To show that it is transitive, given aH, bH, we know that (ba^{-1})(aH) = bH. So any aH can be mapped to bH.

In the boring case where H = {e}, then this is just the left regular action since G/{e} ≅ G.

Definition (Conjugation of element). The conjugation of a ∈ G by b ∈ G is given by bab^{-1} ∈ G. Given any a, c, if there exists some b such that c = bab^{-1}, then we say a and c are conjugate.

What is conjugation? This bab^{-1} form looks familiar from Vectors and Matrices. It is the formula used for changing basis. If b is the change-of-basis matrix and a is a matrix, then the matrix in the new basis is given by bab^{-1}. In this case, bab^{-1} is the same matrix viewed from a different basis.

In general, two conjugate elements are “the same” in some sense. For example, we will later show that in S_n, two elements are conjugate if and only if they have the same cycle type. Conjugate elements in general have many properties in common, such as their order.

Lemma (Conjugation action). Any group $G$ acts on itself by conjugation (i.e. $g(x) = gxg^{-1}$).

Proof. To show that this is an action, we have

0. $g(x) = gxg^{-1} \in G$ for all $g, x \in G$.

1. $e(x) = exe^{-1} = x$.

2. $g(h(x)) = g(hxh^{-1}) = ghxh^{-1}g^{-1} = (gh)x(gh)^{-1} = (gh)(x)$.

Definition (Conjugacy classes and centralizers). The conjugacy classes are the

orbits of the conjugacy action.

$\operatorname{ccl}(a) = \{b \in G : (\exists g \in G)\; gag^{-1} = b\}.$

The centralizers are the stabilizers of this action, i.e. the elements that commute with $a$.

$C_G(a) = \{g \in G : gag^{-1} = a\} = \{g \in G : ga = ag\}.$

The centralizer is defined as the elements that commute with a particular

element a. For the whole group G, we can define the center.

Definition (Center of group). The center of $G$ is the set of elements that commute with all other elements.

$Z(G) = \{g \in G : (\forall a)\; gag^{-1} = a\} = \{g \in G : (\forall a)\; ga = ag\}.$

It is sometimes written as C(G) instead of Z(G).
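To make these definitions concrete, here is a minimal sketch that computes conjugacy classes, centralizers and the center of $S_3$ by brute force, and checks the orbit-stabilizer relation $|\operatorname{ccl}(a)|\,|C_G(a)| = |G|$. The tuple encoding of permutations and all function names are assumptions made for this illustration.

```python
from itertools import permutations


def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


G = list(permutations(range(3)))  # S_3


def ccl(a):
    """Conjugacy class of a: the orbit of a under g -> g a g^{-1}."""
    return {compose(compose(g, a), inverse(g)) for g in G}


def centralizer(a):
    """Elements of G commuting with a: the stabilizer of a under conjugation."""
    return {g for g in G if compose(g, a) == compose(a, g)}


# The center is the set of elements commuting with everything.
center = {g for g in G if all(compose(g, a) == compose(a, g) for a in G)}

class_sizes = sorted(len(c) for c in {frozenset(ccl(a)) for a in G})
orbit_stabilizer_ok = all(len(ccl(a)) * len(centralizer(a)) == len(G) for a in G)
print(class_sizes, orbit_stabilizer_ok, center)
```

For $S_3$ the classes are the identity, the two 3-cycles and the three transpositions, and the center is trivial.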

In many ways, conjugation is related to normal subgroups.

Lemma. Let K ◁ G. Then G acts by conjugation on K.

Proof.

We only have to prove closure as the other properties follow from the

conjugation action. However, by definition of a normal subgroup, for every

$g \in G$, $k \in K$, we have $gkg^{-1} \in K$. So it is closed.

Proposition. Normal subgroups are exactly those subgroups which are unions

of conjugacy classes.

Proof. Let $K \lhd G$. If $k \in K$, then by definition for every $g \in G$, we get $gkg^{-1} \in K$. So $\operatorname{ccl}(k) \subseteq K$. So $K$ is the union of the conjugacy classes of all its elements.

Conversely, if $K$ is a union of conjugacy classes and a subgroup of $G$, then for all $k \in K$, $g \in G$, we have $gkg^{-1} \in K$. So $K$ is normal.

Lemma. Let $X$ be the set of subgroups of $G$. Then $G$ acts by conjugation on $X$.

Proof. To show that it is an action, we have

0. If $H \le G$, then we have to show that $gHg^{-1}$ is also a subgroup. We know that $e \in H$ and thus $geg^{-1} = e \in gHg^{-1}$, so $gHg^{-1}$ is non-empty. For any two elements $gag^{-1}$ and $gbg^{-1} \in gHg^{-1}$, we have $(gag^{-1})(gbg^{-1})^{-1} = g(ab^{-1})g^{-1} \in gHg^{-1}$. So $gHg^{-1}$ is a subgroup.

1. $eHe^{-1} = H$.

2. $g_1(g_2Hg_2^{-1})g_1^{-1} = (g_1g_2)H(g_1g_2)^{-1}$.

Under this action, normal subgroups have singleton orbits.

Definition (Normalizer of subgroup). The normalizer of a subgroup is the stabilizer of the (group) conjugation action.

$N_G(H) = \{g \in G : gHg^{-1} = H\}.$

We clearly have $H \subseteq N_G(H)$. It is easy to show that $N_G(H)$ is the largest subgroup of $G$ in which $H$ is a normal subgroup, hence the name.

There is a connection between actions in general and conjugation of subgroups.

Lemma. Stabilizers of the elements in the same orbit are conjugate, i.e. let $G$ act on $X$ and let $g \in G$, $x \in X$. Then $\operatorname{stab}(g(x)) = g \operatorname{stab}(x) g^{-1}$.

5.4 Applications

Example. Let $G^+$ be the rotations of a cube acting on the vertices. Let $X$ be the set of vertices. Then $|X| = 8$. Since the action is transitive, the orbit of element 1 is the whole of $X$. The stabilizer of vertex 1 is the set of rotations through 1 and the diagonally opposite vertex, of which there are 3. So $|G^+| = |\operatorname{orb}(1)||\operatorname{stab}(1)| = 8 \cdot 3 = 24$.

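Example. Let $G$ be a finite simple group of order greater than 2, and let $H \le G$ have index $n \neq 1$. Then $|G| \le n!/2$.

Proof. Consider the action of $G$ on the left cosets of $H$. We get a group homomorphism $\varphi: G \to S_n$ since there are $n$ cosets of $H$. Since $H \neq G$, $\varphi$ is non-trivial and $\ker \varphi \neq G$. Now $\ker \varphi \lhd G$. Since $G$ is simple, $\ker \varphi = \{e\}$. So $G \cong \operatorname{im} \varphi \subseteq S_n$ by the isomorphism theorem. So $|G| \le |S_n| = n!$.

We can further refine this by considering $\operatorname{sgn} \circ \varphi: G \to \{\pm 1\}$. The kernel $K$ of this composite is normal in $G$. So $K = \ker(\operatorname{sgn} \circ \varphi) = \{e\}$ or $G$. Since $G/K \cong \operatorname{im}(\operatorname{sgn} \circ \varphi)$, we know that $|G|/|K| = 1$ or $2$, since $\operatorname{im}(\operatorname{sgn} \circ \varphi)$ has at most two elements. Hence for $|G| > 2$, we cannot have $K = \{e\}$, or else $|G|/|K| > 2$. So we must have $K = G$, so $\operatorname{sgn}(\varphi(g)) = 1$ for all $g$ and $\operatorname{im} \varphi \le A_n$. So $|G| \le n!/2$.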

We have seen on Sheet 1 that if $|G|$ is even, then $G$ has an element of order 2. In fact,

Theorem (Cauchy’s Theorem). Let $G$ be a finite group and $p$ a prime dividing $|G|$. Then $G$ has an element of order $p$ (in fact there must be at least $p - 1$ elements of order $p$).

It is important to remember that this only holds for prime $p$. For example, $A_4$ doesn’t have an element of order 6 even though $6 \mid 12 = |A_4|$. The converse, however, holds for any number trivially by Lagrange’s theorem.

Proof. Let $G$ and $p$ be fixed. Consider $G^p = G \times G \times \cdots \times G$, the set of $p$-tuples of $G$. Let $X \subseteq G^p$ be $X = \{(a_1, a_2, \cdots, a_p) \in G^p : a_1 a_2 \cdots a_p = e\}$.

In particular, if an element $b$ has order $p$, then $(b, b, \cdots, b) \in X$. In fact, if $(b, b, \cdots, b) \in X$ and $b \neq e$, then $b$ has order $p$, since $p$ is prime.

Now let $H = \langle h : h^p = e \rangle \cong C_p$ be a cyclic group of order $p$ with generator $h$. (This $h$ is not related to $G$ in any way.) Let $H$ act on $X$ by “rotation”:

$h(a_1, a_2, \cdots, a_p) = (a_2, a_3, \cdots, a_p, a_1).$

This is an action:

0. If $a_1 \cdots a_p = e$, then $a_1^{-1} = a_2 \cdots a_p$. So $a_2 \cdots a_p a_1 = a_1^{-1} a_1 = e$. So $(a_2, a_3, \cdots, a_p, a_1) \in X$.

1. $e$ acts as an identity by construction.

2. The “associativity” condition also works by construction.

As orbits partition $X$, the sum of all orbit sizes must be $|X|$. We know that $|X| = |G|^{p-1}$, since we can freely choose the first $p - 1$ entries and the last one must be the inverse of their product. Since $p$ divides $|G|$, $p$ also divides $|X|$. We have $|\operatorname{orb}(a_1, \cdots, a_p)||\operatorname{stab}_H(a_1, \cdots, a_p)| = |H| = p$. So all orbits have size 1 or $p$, and they sum to $|X| = p \times$ something. We know that there is one orbit of size 1, namely $(e, e, \cdots, e)$. So there must be at least $p - 1$ other orbits of size 1 for the sum to be divisible by $p$.

In order to have an orbit of size 1, they must look like $(a, a, \cdots, a)$ for some $a \in G$, which has order $p$.
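The counting in this proof can be checked on a small case. The sketch below takes $G = S_3$ and $p = 3$ (both chosen only for illustration), builds the set $X$ of $p$-tuples with product $e$, and counts the tuples fixed by the rotation action. The helper names are hypothetical.

```python
from itertools import permutations, product


def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


G = list(permutations(range(3)))  # S_3, of order 6
e = tuple(range(3))
p = 3  # a prime dividing |G|


def product_of(tup):
    """Multiply the entries of a tuple of group elements, left to right."""
    result = e
    for a in tup:
        result = compose(result, a)
    return result


# X = p-tuples whose product is the identity.
X = [t for t in product(G, repeat=p) if product_of(t) == e]


def rotate(t):
    """The action of the generator h: (a_1, ..., a_p) -> (a_2, ..., a_p, a_1)."""
    return t[1:] + t[:1]


# Orbits of size 1 are exactly the tuples fixed by the rotation.
fixed = [t for t in X if rotate(t) == t]
elements_of_order_p = [t[0] for t in fixed if t[0] != e]
print(len(X), len(fixed), len(elements_of_order_p))
```

Here $|X| = |G|^{p-1} = 36$, the number of size-1 orbits is divisible by $p$, and the non-identity ones correspond to the two 3-cycles of $S_3$.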

6 Symmetric groups II

In this chapter, we will look at conjugacy classes of $S_n$ and $A_n$. It turns out this is easy for $S_n$, since two elements are conjugate if and only if they have the same cycle type. However, it is slightly more complicated in $A_n$. This is because, while (1 2 3) and (1 3 2) might be conjugate in $S_n$, the element needed to perform the conjugation might be odd and not in $A_n$.

6.1 Conjugacy classes in $S_n$

Recall that $\sigma, \tau \in S_n$ are conjugate if $\exists \rho \in S_n$ such that $\rho\sigma\rho^{-1} = \tau$.

We first investigate the special case, when σ is a k-cycle.

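Proposition. If $(a_1\ a_2\ \cdots\ a_k)$ is a $k$-cycle and $\rho \in S_n$, then $\rho(a_1\ \cdots\ a_k)\rho^{-1}$ is the $k$-cycle $(\rho(a_1)\ \rho(a_2)\ \cdots\ \rho(a_k))$.

Proof. Consider any $\rho(a_1)$ acted on by $\rho(a_1\ \cdots\ a_k)\rho^{-1}$. The three permutations send it to $\rho(a_1) \mapsto a_1 \mapsto a_2 \mapsto \rho(a_2)$, and similarly for the other $a_i$s. Since $\rho$ is bijective, any $b$ can be written as $\rho(a)$ for some $a$; if $a$ is not one of the $a_i$, then $\rho(a) \mapsto a \mapsto a \mapsto \rho(a)$ is fixed. So the result is the $k$-cycle $(\rho(a_1)\ \rho(a_2)\ \cdots\ \rho(a_k))$.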
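The proposition can be tested exhaustively on a small example. The following sketch checks that $\rho\sigma\rho^{-1}$ equals the relabelled cycle for a 3-cycle $\sigma$ and every $\rho \in S_5$. Points are labelled $0, \ldots, 4$ rather than $1, \ldots, 5$, and the helper `cycle_to_perm` is an illustrative name, not something from the notes.

```python
from itertools import permutations


def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def cycle_to_perm(cycle, n):
    """Permutation of {0, ..., n-1} given by a single cycle (a_1 a_2 ... a_k)."""
    perm = list(range(n))
    for i, a in enumerate(cycle):
        perm[a] = cycle[(i + 1) % len(cycle)]
    return tuple(perm)


n = 5
cycle = (0, 1, 2)
sigma = cycle_to_perm(cycle, n)

# For every rho, compare rho sigma rho^{-1} with the cycle (rho(a_1) ... rho(a_k)).
all_match = all(
    compose(compose(rho, sigma), inverse(rho))
    == cycle_to_perm(tuple(rho[a] for a in cycle), n)
    for rho in permutations(range(n))
)
print(all_match)
```

In other words, conjugating by $\rho$ simply relabels the entries of the cycle by $\rho$.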

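Corollary. Two elements in $S_n$ are conjugate iff they have the same cycle type.

Proof. Suppose $\sigma = \sigma_1 \sigma_2 \cdots \sigma_\ell$, where the $\sigma_i$ are disjoint cycles. Then $\rho\sigma\rho^{-1} = \rho\sigma_1\rho^{-1}\rho\sigma_2\rho^{-1} \cdots \rho\sigma_\ell\rho^{-1}$. Since the conjugation of a cycle conserves its length, $\rho\sigma\rho^{-1}$ has the same cycle type.

Conversely, if $\sigma, \tau$ have the same cycle type, say

$\sigma = (a_1\ \cdots\ a_k)(a_{k+1}\ \cdots\ a_{k+\ell}), \quad \tau = (b_1\ \cdots\ b_k)(b_{k+1}\ \cdots\ b_{k+\ell}),$

if we let $\rho(a_i) = b_i$, then $\rho\sigma\rho^{-1} = \tau$.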

Example. Conjugacy classes of $S_4$.

Cycle type     Example element   Size of ccl   Size of centralizer   Sign
(1, 1, 1, 1)   e                 1             24                    +1
(2, 1, 1)      (1 2)             6             4                     −1
(2, 2)         (1 2)(3 4)        3             8                     +1
(3, 1)         (1 2 3)           8             3                     +1
(4)            (1 2 3 4)         6             4                     −1

We know that a normal subgroup is a union of conjugacy classes. We can now

find all normal subgroups by finding possible union of conjugacy classes whose

cardinality divides 24. Note that the normal subgroup must contain e.

(i) Order 1: {e}

(ii) Order 2: None

(iii) Order 3: None

(iv) Order 4: $\{e, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\} \cong C_2 \times C_2 = V_4$ is a possible candidate. We can check the group axioms and find that it is really a subgroup.

(v) Order 6: None

(vi) Order 8: None

(vii) Order 12: $A_4$ (We know it is a normal subgroup since it is the kernel of the signature and/or it has index 2)

(viii) Order 24: $S_4$

We can also obtain the quotients of $S_4$: $S_4/\{e\} \cong S_4$, $S_4/V_4 \cong S_3 \cong D_6$, $S_4/A_4 \cong C_2$, $S_4/S_4 = \{e\}$.

6.2 Conjugacy classes in $A_n$

We have seen that $|S_n| = 2|A_n|$ and that conjugacy classes in $S_n$ are “nice”. How about in $A_n$?

The first thought is that we write it down:

$\operatorname{ccl}_{S_n}(\sigma) = \{\tau \in S_n : (\exists \rho \in S_n)\; \tau = \rho\sigma\rho^{-1}\}$

$\operatorname{ccl}_{A_n}(\sigma) = \{\tau \in A_n : (\exists \rho \in A_n)\; \tau = \rho\sigma\rho^{-1}\}$

Obviously $\operatorname{ccl}_{A_n}(\sigma) \subseteq \operatorname{ccl}_{S_n}(\sigma)$, but the converse need not be true, since the conjugation needed to map $\sigma$ to $\tau$ may be odd.

Example. Consider (1 2 3) and (1 3 2). They are conjugate in $S_3$ by (2 3), but $(2\ 3) \notin A_3$. (This does not automatically entail that they are not conjugate in $A_3$, because there might be another even permutation that conjugates (1 2 3) to (1 3 2). In $A_5$, (2 3)(4 5) works (but not in $A_3$).)

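We can use the orbit-stabilizer theorem:

$|S_n| = |\operatorname{ccl}_{S_n}(\sigma)||C_{S_n}(\sigma)|$

$|A_n| = |\operatorname{ccl}_{A_n}(\sigma)||C_{A_n}(\sigma)|$

We know that $A_n$ is half of $S_n$ and $\operatorname{ccl}_{A_n}$ is contained in $\operatorname{ccl}_{S_n}$. So we have two options: either $\operatorname{ccl}_{S_n}(\sigma) = \operatorname{ccl}_{A_n}(\sigma)$ and $|C_{A_n}(\sigma)| = \frac{1}{2}|C_{S_n}(\sigma)|$; or $|\operatorname{ccl}_{A_n}(\sigma)| = \frac{1}{2}|\operatorname{ccl}_{S_n}(\sigma)|$ and $C_{A_n}(\sigma) = C_{S_n}(\sigma)$.

Definition (Splitting of conjugacy classes). When $|\operatorname{ccl}_{A_n}(\sigma)| = \frac{1}{2}|\operatorname{ccl}_{S_n}(\sigma)|$, we say that the conjugacy class of $\sigma$ splits in $A_n$.

So the conjugacy classes are either retained or split.

Proposition. For $\sigma \in A_n$, the conjugacy class of $\sigma$ splits in $A_n$ if and only if no odd permutation commutes with $\sigma$.

Proof. We have the conjugacy classes splitting if and only if the centralizer does not. So instead we check whether the centralizer splits. Clearly $C_{A_n}(\sigma) = C_{S_n}(\sigma) \cap A_n$. So splitting of the centralizer occurs if and only if an odd permutation commutes with $\sigma$.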

Example. Conjugacy classes in $A_4$.

Cycle type     Example      |ccl_{S_4}|   Odd element in C_{S_4}?   |ccl_{A_4}|
(1, 1, 1, 1)   e            1             Yes (1 2)                 1
(2, 2)         (1 2)(3 4)   3             Yes (1 2)                 3
(3, 1)         (1 2 3)      8             No                        4, 4

In the (3, 1) case, by the orbit-stabilizer theorem, $|C_{S_4}((1\ 2\ 3))| = 3$, which is odd and cannot split.

Example. Conjugacy classes in $A_5$.

Cycle type        Example       |ccl_{S_5}|   Odd element in C_{S_5}?   |ccl_{A_5}|
(1, 1, 1, 1, 1)   e             1             Yes (1 2)                 1
(2, 2, 1)         (1 2)(3 4)    15            Yes (1 2)                 15
(3, 1, 1)         (1 2 3)       20            Yes (4 5)                 20
(5)               (1 2 3 4 5)   24            No                        12, 12

Since the centralizer of (1 2 3 4 5) has size 5, it cannot split, so its conjugacy class must split.

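Lemma. $\sigma = (1\ 2\ 3\ 4\ 5) \in S_5$ has $C_{S_5}(\sigma) = \langle\sigma\rangle$.

Proof. $|\operatorname{ccl}_{S_5}(\sigma)| = 24$ and $|S_5| = 120$. So $|C_{S_5}(\sigma)| = 5$. Clearly $\langle\sigma\rangle \subseteq C_{S_5}(\sigma)$. Since they both have size 5, we know that $C_{S_5}(\sigma) = \langle\sigma\rangle$.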

Theorem. $A_5$ is simple.

Proof. We know that normal subgroups must be unions of conjugacy classes, must contain $e$, and their order must divide 60. The possible orders are 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30 and 60. However, the conjugacy class sizes 1, 15, 20, 12, 12 cannot add up to any of the possible orders apart from 1 and 60. So we only have trivial normal subgroups.
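The arithmetic in this proof is easy to verify by brute force. The sketch below enumerates every union of conjugacy classes of $A_5$ that contains the identity class and keeps the totals that divide 60. The variable names are illustrative only.

```python
from itertools import combinations

# Sizes of the non-identity conjugacy classes of A_5, as in the table above.
class_sizes = [15, 20, 12, 12]
group_order = 60

candidate_orders = set()
for r in range(len(class_sizes) + 1):
    for combo in combinations(class_sizes, r):
        total = 1 + sum(combo)  # the identity class {e} is always included
        if group_order % total == 0:
            candidate_orders.add(total)

print(sorted(candidate_orders))
```

Only the totals 1 and 60 survive, matching the claim that the only normal subgroups are $\{e\}$ and $A_5$ itself.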

In fact, all $A_n$ for $n \ge 5$ are simple, but the proof is horrible (cf. IB Groups, Rings and Modules).

7 Quaternions

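In the remainder of the course, we will look at different important groups. Here, we will have a brief look at the quaternions.

Definition (Quaternions). The quaternions is the set of matrices

$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}, \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}$

which is a subgroup of $GL_2(\mathbb{C})$.

Notation. We can also write the quaternions as

$Q_8 = \langle a, b : a^4 = e, b^2 = a^2, bab^{-1} = a^{-1} \rangle.$

Even better, we can write

$Q_8 = \{1, -1, i, -i, j, -j, k, -k\}$

with

(i) $(-1)^2 = 1$

(ii) $i^2 = j^2 = k^2 = -1$

(iii) $(-1)i = -i$ etc.

(iv) $ij = k$, $jk = i$, $ki = j$

(v) $ji = -k$, $kj = -i$, $ik = -j$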
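These relations can be checked directly on the $2 \times 2$ complex matrices from the definition, taking $i$, $j$, $k$ to be the second, third and fourth matrices listed there. The sketch below uses plain Python complex numbers; the names `matmul`, `neg`, `qi`, `qj`, `qk` are assumptions made for this illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][m] * B[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )


def neg(A):
    """Entrywise negation of a matrix."""
    return tuple(tuple(-x for x in row) for row in A)


one = ((1, 0), (0, 1))
qi = ((1j, 0), (0, -1j))
qj = ((0, 1), (-1, 0))
qk = ((0, 1j), (1j, 0))

Q8 = [one, qi, qj, qk, neg(one), neg(qi), neg(qj), neg(qk)]

# Relations (ii), (iv) and (v).
squares_ok = all(matmul(q, q) == neg(one) for q in (qi, qj, qk))
cyclic_ok = matmul(qi, qj) == qk and matmul(qj, qk) == qi and matmul(qk, qi) == qj
anti_ok = (
    matmul(qj, qi) == neg(qk)
    and matmul(qk, qj) == neg(qi)
    and matmul(qi, qk) == neg(qj)
)
# The eight matrices are closed under multiplication.
closed = all(matmul(a, b) in Q8 for a in Q8 for b in Q8)
print(squares_ok, cyclic_ok, anti_ok, closed)
```

Closure under multiplication, together with the fact that each matrix is invertible, is what makes the eight matrices a subgroup of $GL_2(\mathbb{C})$.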

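We have

$1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad i = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad j = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad k = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$

$-1 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad -i = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}, \quad -j = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad -k = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}$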

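Lemma. If $G$ has order 8, then either $G$ is abelian (i.e. $\cong C_8$, $C_4 \times C_2$ or $C_2 \times C_2 \times C_2$), or $G$ is not abelian and isomorphic to $D_8$ or $Q_8$ (dihedral or quaternion).

Proof. Consider the different possible cases:

– If $G$ contains an element of order 8, then $G \cong C_8$.

– If all non-identity elements have order 2, then $G$ is abelian (Sheet 1, Q8). Let $a \neq b \in G \setminus \{e\}$. By the direct product theorem, $\langle a, b \rangle = \langle a \rangle \times \langle b \rangle$. Then take $c \notin \langle a, b \rangle$. By the direct product theorem, we obtain $\langle a, b, c \rangle = \langle a \rangle \times \langle b \rangle \times \langle c \rangle = C_2 \times C_2 \times C_2$. Since $\langle a, b, c \rangle \subseteq G$ and $|\langle a, b, c \rangle| = |G|$, we get $G = \langle a, b, c \rangle \cong C_2 \times C_2 \times C_2$.

– $G$ has no element of order 8 but has an order 4 element $a \in G$. Let $H = \langle a \rangle$. Since $H$ has index 2, it is normal in $G$. So $G/H \cong C_2$ since $|G/H| = 2$. This means that for any $b \notin H$, $bH$ generates $G/H$. Then $(bH)^2 = b^2H = H$. So $b^2 \in H$. Since $b^2 \in \langle a \rangle$ and $\langle a \rangle$ is a cyclic group, $b^2$ commutes with $a$.

If $b^2 = a$ or $a^3$, then $b$ has order 8. Contradiction. So $b^2 = e$ or $a^2$.

We also know that $H$ is normal, so $bab^{-1} \in H$. Let $bab^{-1} = a^\ell$. Since $a$ and $b^2$ commute, we know that $a = b^2ab^{-2} = b(bab^{-1})b^{-1} = ba^\ell b^{-1} = (bab^{-1})^\ell = a^{\ell^2}$. So $\ell^2 \equiv 1 \pmod 4$. So $\ell \equiv \pm 1 \pmod 4$.

◦ When $\ell \equiv 1 \pmod 4$, $bab^{-1} = a$, i.e. $ba = ab$. So $G$ is abelian.

∗ If $b^2 = e$, then $G = \langle a, b \rangle \cong \langle a \rangle \times \langle b \rangle \cong C_4 \times C_2$.

∗ If $b^2 = a^2$, then $(ba^{-1})^2 = e$. So $G = \langle a, ba^{-1} \rangle \cong C_4 \times C_2$.

◦ If $\ell \equiv -1 \pmod 4$, then $bab^{-1} = a^{-1}$.

∗ If $b^2 = e$, then $G = \langle a, b : a^4 = e = b^2, bab^{-1} = a^{-1} \rangle$. So $G \cong D_8$ by definition.

∗ If $b^2 = a^2$, then we have $G \cong Q_8$.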

8 Matrix groups

8.1 General and special linear groups

Consider $M_{n \times n}(F)$, i.e. the set of $n \times n$ matrices over the field $F = \mathbb{R}$ (or $\mathbb{C}$). We know that matrix multiplication is associative (since matrices represent functions) but is, in general, not commutative. To make this a group, we want the identity matrix $I$ to be the identity. To ensure everything has an inverse, we can only include invertible matrices.

(We do not necessarily need to take $I$ as the identity of the group. We can, for example, take $e = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ and obtain a group in which every matrix is of the form $\begin{pmatrix} 0 & 0 \\ 0 & a \end{pmatrix}$ for some non-zero $a$. This forms a group, albeit a boring one: it is simply isomorphic to $F^*$, the non-zero elements of $F$ under multiplication.)

Definition (General linear group $GL_n(F)$).

$GL_n(F) = \{A \in M_{n \times n}(F) : A \text{ is invertible}\}$

is the general linear group.

Alternatively, we can define $GL_n(F)$ as the matrices with non-zero determinant.

Proposition. $GL_n(F)$ is a group.

Proof. The identity is $I$, which is in $GL_n(F)$ by definition ($I$ is its own inverse). The composition of invertible matrices is invertible, so $GL_n(F)$ is closed. Inverses exist by definition. Multiplication is associative.

Proposition. $\det: GL_n(F) \to F \setminus \{0\}$ is a surjective group homomorphism.

Proof. $\det AB = \det A \det B$. If $A$ is invertible, it has non-zero determinant and $\det A \in F \setminus \{0\}$.

To show it is surjective, for any $x \in F \setminus \{0\}$, if we take the identity matrix and replace $I_{11}$ with $x$, then the determinant is $x$. So it is surjective.

Definition (Special linear group $SL_n(F)$). The special linear group $SL_n(F)$ is the kernel of the determinant, i.e.

$SL_n(F) = \{A \in GL_n(F) : \det A = 1\}.$

So $SL_n(F) \lhd GL_n(F)$ as it is a kernel. Note that $Q_8 \le SL_2(\mathbb{C})$.

8.2 Actions of $GL_n(\mathbb{C})$

Proposition. $GL_n(\mathbb{C})$ acts faithfully on $\mathbb{C}^n$ by left multiplication to the vector, with two orbits ($0$ and everything else).

Proof. First show that it is a group action:

1. If $A \in GL_n(\mathbb{C})$ and $v \in \mathbb{C}^n$, then $Av \in \mathbb{C}^n$. So it is closed.

2. $Iv = v$ for all $v \in \mathbb{C}^n$.

3. $A(Bv) = (AB)v$.

Now prove that it is faithful: a linear map is determined by what it does on a basis. Take the standard basis $e_1 = (1, 0, \cdots, 0), \cdots, e_n = (0, \cdots, 0, 1)$. Any matrix which maps each $e_k$ to itself must be $I$ (since the columns of a matrix are the images of the basis vectors).

To show that there are 2 orbits, we know that $A0 = 0$ for all $A$. Also, as $A$ is invertible, $Av = 0 \Leftrightarrow v = 0$. So $0$ forms a singleton orbit. Then given any two vectors $v \neq w \in \mathbb{C}^n \setminus \{0\}$, there is a matrix $A \in GL_n(\mathbb{C})$ such that $Av = w$ (cf. Vectors and Matrices).

Similarly, $GL_n(\mathbb{R})$ acts on $\mathbb{R}^n$.

Proposition. $GL_n(\mathbb{C})$ acts on $M_{n \times n}(\mathbb{C})$ by conjugation, i.e. $P(A) = PAP^{-1}$.

This action can be thought of as a “change of basis” action. Two matrices are conjugate if they represent the same map but with respect to different bases. The $P$ is the base change matrix.

From Vectors and Matrices, we know that there are three different types of orbits for $GL_2(\mathbb{C})$: $A$ is conjugate to a matrix of one of these forms:

(i) $\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}$, with $\lambda \neq \mu$, i.e. two distinct eigenvalues

(ii) $\begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}$, i.e. a repeated eigenvalue with a 2-dimensional eigenspace

(iii) $\begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}$, i.e. a repeated eigenvalue with a 1-dimensional eigenspace

Note that we said there are three types of orbits, not three orbits. There are infinitely many orbits, e.g. one for each of $\lambda I$.

8.3 Orthogonal groups

Recall that $A^T$ is defined by $A^T_{ij} = A_{ji}$, i.e. we reflect the matrix in the diagonal. They have the following properties:

(i) $(AB)^T = B^TA^T$

(ii) $(A^{-1})^T = (A^T)^{-1}$

(iii) $A^TA = I \Leftrightarrow AA^T = I \Leftrightarrow A^{-1} = A^T$. In this case $A$ is orthogonal

(iv) $\det A^T = \det A$

We are now in $\mathbb{R}$, because orthogonal matrices don’t make sense with complex matrices.

Note that a matrix is orthogonal if the columns (or rows) form an orthonormal basis of $\mathbb{R}^n$: $A^TA = I \Leftrightarrow \sum_k a_{ki}a_{kj} = \delta_{ij} \Leftrightarrow \mathbf{a}_i \cdot \mathbf{a}_j = \delta_{ij}$, where $\mathbf{a}_i$ is the $i$th column of $A$.

The importance of orthogonal matrices is that they are the isometries of $\mathbb{R}^n$.

Lemma (Orthogonal matrices are isometries). For any orthogonal $A$ and $x, y \in \mathbb{R}^n$, we have

(i) $(Ax) \cdot (Ay) = x \cdot y$

(ii) $|Ax| = |x|$

Proof. Treat the dot product as a matrix multiplication. So

$(Ax)^T(Ay) = x^TA^TAy = x^TIy = x^Ty.$

Then we have $|Ax|^2 = (Ax) \cdot (Ax) = x \cdot x = |x|^2$. Since both are non-negative, we know that $|Ax| = |x|$.

It is important to note that orthogonal matrices are isometries, but not all

isometries are orthogonal. For example, translations are isometries but are not

represented by orthogonal matrices, since they are not linear maps and cannot

be represented by matrices at all! However, it is true that all linear isometries

can be represented by orthogonal matrices.

Definition (Orthogonal group O(n)). The orthogonal group is

$O(n) = O_n = O_n(\mathbb{R}) = \{A \in GL_n(\mathbb{R}) : A^TA = I\},$

i.e. the group of orthogonal matrices.

We will later show that this is the set of matrices that preserve distances in $\mathbb{R}^n$.

Lemma. The orthogonal group is a group.

Proof. We have to check that it is a subgroup of $GL_n(\mathbb{R})$: it is non-empty, since $I \in O(n)$. If $A, B \in O(n)$, then $(AB^{-1})(AB^{-1})^T = AB^{-1}(B^{-1})^TA^T = AB^{-1}BA^{-1} = I$, so $AB^{-1} \in O(n)$ and this is indeed a subgroup.

Proposition. $\det: O(n) \to \{\pm 1\}$ is a surjective group homomorphism.

Proof. For $A \in O(n)$, we know that $A^TA = I$. So $\det A^TA = (\det A)^2 = 1$. So $\det A = \pm 1$. Since $\det(AB) = \det A \det B$, it is a homomorphism. We have

$\det I = 1, \quad \det \begin{pmatrix} -1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} = -1,$

so it is surjective.

Definition (Special orthogonal group SO(n)). The special orthogonal group is the kernel of $\det: O(n) \to \{\pm 1\}$.

$SO(n) = SO_n = SO_n(\mathbb{R}) = \{A \in O(n) : \det A = 1\}.$

By the isomorphism theorem, $O(n)/SO(n) \cong C_2$.

What’s wrong with matrices with determinant $-1$? Why do we want to eliminate these? An important example of an orthogonal matrix with determinant $-1$ is a reflection. These transformations reverse orientation, which is often unwanted.

Lemma. $O(n) = SO(n) \cup \begin{pmatrix} -1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} SO(n)$

Proof. Cosets partition the group.

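8.4 Rotations and reflections in $\mathbb{R}^2$ and $\mathbb{R}^3$

Lemma. SO(2) consists of all rotations of $\mathbb{R}^2$ around 0.

Proof. Let $A \in SO(2)$. So $A^TA = I$ and $\det A = 1$. Suppose $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then $A^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. So $A^T = A^{-1}$ implies $ad - bc = 1$, $c = -b$, $d = a$. Combining these equations we obtain $a^2 + c^2 = 1$. Set $a = \cos\theta = d$, and $c = \sin\theta = -b$. Then these satisfy all three equations. So

$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$

Note that $A$ maps $(1, 0)$ to $(\cos\theta, \sin\theta)$, and maps $(0, 1)$ to $(-\sin\theta, \cos\theta)$, which are rotations by $\theta$ counterclockwise. So $A$ represents a rotation by $\theta$.

Corollary. Any matrix in O(2) is either a rotation around 0 or a reflection in a line through 0.

Proof. If $A \in SO(2)$, we’ve shown that it is a rotation. Otherwise,

$A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ -\sin\theta & -\cos\theta \end{pmatrix}$

since $O(2) = SO(2) \cup \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} SO(2)$. This has eigenvalues $1, -1$. So it is a reflection in the line of the eigenspace $E_1$. The line goes through 0 since the eigenspace is a subspace, which must include 0.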

Lemma. Every matrix in SO(3) is a rotation around some axis.

Proof. Let $A \in SO(3)$. We know that $\det A = 1$ and $A$ is an isometry. The eigenvalues $\lambda$ must have $|\lambda| = 1$. They also multiply to $\det A = 1$. Since we are in $\mathbb{R}$, complex eigenvalues come in complex conjugate pairs. If there are complex eigenvalues $\lambda$ and $\bar\lambda$, then $\lambda\bar\lambda = |\lambda|^2 = 1$. The third eigenvalue must be real and has to be $+1$.

If all eigenvalues are real, then the eigenvalues are either $1$ or $-1$, and must multiply to 1. The possibilities are $1, 1, 1$ and $-1, -1, 1$, all of which contain an eigenvalue of 1.

So pick an eigenvector for our eigenvalue 1 as the third basis vector. Then in some orthonormal basis,

$A = \begin{pmatrix} a & b & 0 \\ c & d & 0 \\ 0 & 0 & 1 \end{pmatrix},$

since the third column is the image of the third basis vector, and by orthogonality the third row is $0, 0, 1$. Now let

$A' = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL_2(\mathbb{R})$

with $\det A' = 1$. $A'$ is still orthogonal, so $A' \in SO(2)$. Therefore $A'$ is a rotation and

$A = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$

in some basis, and this is exactly the rotation through an axis.

Lemma. Every matrix in O(3) is the product of at most three reflections in

planes through 0.

Note that a rotation is a product of two reflections. This lemma effectively

states that every matrix in O(3) is a reflection, a rotation or a product of a

reflection and a rotation.

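Proof. Recall $O(3) = SO(3) \cup \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} SO(3)$. So if $A \in SO(3)$, we know that $A = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$ in some basis, which is a composite of two reflections:

$A = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ \sin\theta & -\cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$

(Multiplying the two reflections in the opposite order gives the rotation by $-\theta$ instead.)

Then if $A \in \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} SO(3)$, then it is automatically a product of three reflections.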
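The factorisation of a rotation into two reflections can be checked numerically. The sketch below multiplies the reflection with block $\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$ by the reflection $\operatorname{diag}(1, -1, 1)$ and compares with the rotation matrix; note that the order of the factors matters, and the reverse order gives the rotation by $-\theta$. The angle $0.7$ and all helper names are arbitrary choices for illustration.

```python
import math


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[r][m] * B[m][c] for m in range(n)) for c in range(n)] for r in range(n)]


def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def rotation(theta):
    """Rotation by theta about the third axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


theta = 0.7
c, s = math.cos(theta), math.sin(theta)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
reflect_a = [[c, s, 0], [s, -c, 0], [0, 0, 1]]   # reflection in a plane containing the axis
reflect_b = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]   # reflection in the plane y = 0

# Each factor squares to the identity, as a reflection should.
involutions = close(matmul(reflect_a, reflect_a), identity) and close(
    matmul(reflect_b, reflect_b), identity
)
forward = close(matmul(reflect_a, reflect_b), rotation(theta))
reverse = close(matmul(reflect_b, reflect_a), rotation(-theta))
print(involutions, forward, reverse)
```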

In the last line we’ve shown that everything in $O(3) \setminus SO(3)$ can be written as a product of three reflections, but it is possible that they need only 1 reflection. However, some matrices do genuinely need 3 reflections, e.g.

$\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$

8.5 Unitary groups

The concept of orthogonal matrices only makes sense if we are talking about real matrices. If we are talking about complex matrices, then instead we need unitary matrices. To do so, we replace the transposition with the Hermitian conjugate. It is defined by $A^\dagger = (A^*)^T$ with $(A^\dagger)_{ij} = A^*_{ji}$, where the asterisk is the complex conjugate. We still have

(i) $(AB)^\dagger = B^\dagger A^\dagger$

(ii) $(A^{-1})^\dagger = (A^\dagger)^{-1}$

(iii) $A^\dagger A = I \Leftrightarrow AA^\dagger = I \Leftrightarrow A^\dagger = A^{-1}$. We say $A$ is a unitary matrix

(iv) $\det A^\dagger = (\det A)^*$

Definition (Unitary group U(n)). The unitary group is

$U(n) = U_n = \{A \in GL_n(\mathbb{C}) : A^\dagger A = I\}.$

Lemma. $\det: U(n) \to S^1$, where $S^1$ is the unit circle in the complex plane, is a surjective group homomorphism.

Proof. We know that $1 = \det I = \det A^\dagger A = |\det A|^2$. So $|\det A| = 1$. Since $\det AB = \det A \det B$, it is a group homomorphism.

Now given $\lambda \in S^1$, we have $\begin{pmatrix} \lambda & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \in U(n)$. So it is surjective.

Definition (Special unitary group SU(n)). The special unitary group $SU(n) = SU_n$ is the kernel of $\det: U(n) \to S^1$.

Similarly, unitary matrices preserve the complex dot product: $(Ax) \cdot (Ay) = x \cdot y$.

9 More on regular polyhedra

In this section, we will look at the symmetry groups of the cube and the

tetrahedron.

9.1 Symmetries of the cube

Rotations

Recall that there are $|G^+| = 24$ rotations of the cube by the orbit-stabilizer theorem.

Proposition. $G^+ \cong S_4$, where $G^+$ is the group of all rotations of the cube.

Proof. Consider $G^+$ acting on the 4 diagonals of the cube. This gives a group homomorphism $\varphi: G^+ \to S_4$. We have $(1\ 2\ 3\ 4) \in \operatorname{im}\varphi$ by rotation around the axis through the top and bottom face. We also have $(1\ 2) \in \operatorname{im}\varphi$ by rotation around the axis through the mid-point of the edge connecting 1 and 2. Since (1 2) and (1 2 3 4) generate $S_4$ (Sheet 2 Q. 5d), $\operatorname{im}\varphi = S_4$, i.e. $\varphi$ is surjective. Since $|S_4| = |G^+|$, $\varphi$ must be an isomorphism.

All symmetries

Consider the reflection in the mid-point of the cube, $\tau$, sending every point to its opposite. We can view this as $-I$ in $\mathbb{R}^3$. So it commutes with all other symmetries of the cube.

Proposition. $G \cong S_4 \times C_2$, where $G$ is the group of all symmetries of the cube.

Proof. Let $\tau$ be “reflection in mid-point” as shown above. This commutes with everything. (Actually it is enough to check that it commutes with rotations only.)

We have to show that $G = G^+\langle\tau\rangle$. This can be deduced using sizes: since $G^+$ and $\langle\tau\rangle$ intersect at $e$ only, (i) and (ii) of the Direct Product Theorem give an injective group homomorphism $G^+ \times \langle\tau\rangle \to G$. Since both sides have the same size, the homomorphism must be surjective as well. So $G \cong G^+ \times \langle\tau\rangle \cong S_4 \times C_2$.

In fact, we have also proved that the group of symmetries of an octahedron is $S_4 \times C_2$, since the octahedron is the dual of the cube. (If you join the centers of each face of the cube, you get an octahedron.)

9.2 Symmetries of the tetrahedron

Rotations

Let 1, 2, 3, 4 be the vertices (in any order). $G^+$ is just the rotations. Let it act on the vertices. Then $\operatorname{orb}(1) = \{1, 2, 3, 4\}$ and $\operatorname{stab}(1) = \{$rotations in the axis through 1 and the center of the opposite face$\} = \{e, \frac{2\pi}{3}, \frac{4\pi}{3}\}$.

So $|G^+| = 4 \cdot 3 = 12$ by the orbit-stabilizer theorem.

The action gives a group homomorphism $\varphi: G^+ \to S_4$. Clearly $\ker\varphi = \{e\}$. So $G^+ \le S_4$ and $G^+$ has size 12. We “guess” it is $A_4$ (actually it must be $A_4$ since that is the only subgroup of $S_4$ of order 12, but it’s nice to see why that’s the case).

If we rotate in an axis through 1, we get (2 3 4), (2 4 3). Similarly, rotating through other axes through vertices gives all 3-cycles.

If we rotate through an axis that passes through two opposite edges, e.g. through the 1-2 edge and the 3-4 edge, then we have (1 2)(3 4), and similarly we obtain all double transpositions. So $G^+ \cong A_4$. This shows that there is no rotation that fixes two vertices and swaps the other two.

All symmetries

Now consider the plane that goes through 1, 2 and the mid-point of 3 and 4. Reflection through this plane swaps 3 and 4, but doesn’t change 1, 2. So now $\operatorname{stab}(1) = \langle (2\ 3\ 4), (3\ 4) \rangle \cong D_6$ (alternatively, if we want to fix 1, we just move 2, 3, 4 around, which is the symmetries of the triangular base).

So $|G| = 4 \cdot 6 = 24$ and $G \cong S_4$ (which makes sense since we can move any of its vertices around in any way and still have a tetrahedron, so we have all permutations of vertices as the symmetry group).

10 Möbius group

10.1 Möbius maps

We want to study maps $f: \mathbb{C} \to \mathbb{C}$ of the form $f(z) = \frac{az + b}{cz + d}$ with $a, b, c, d \in \mathbb{C}$ and $ad - bc \neq 0$.

We impose $ad - bc \neq 0$ or else the map will be constant: for any $z, w \in \mathbb{C}$,

$f(z) - f(w) = \frac{(az + b)(cw + d) - (aw + b)(cz + d)}{(cw + d)(cz + d)} = \frac{(ad - bc)(z - w)}{(cw + d)(cz + d)}.$

If $ad - bc = 0$, then $f$ is constant and boring (more importantly, it will not be invertible).

If $c \neq 0$, then $f(-\frac{d}{c})$ involves division by 0. So we add $\infty$ to $\mathbb{C}$ to form the extended complex plane (Riemann sphere) $\mathbb{C} \cup \{\infty\} = \mathbb{C}_\infty$ (cf. Vectors and Matrices). Then we define $f(-\frac{d}{c}) = \infty$. We call $\mathbb{C}_\infty$ a one-point compactification of $\mathbb{C}$ (because it adds one point to $\mathbb{C}$ to make it compact, cf. Metric and Topology).

Definition (Möbius map). A Möbius map is a map from $\mathbb{C}_\infty \to \mathbb{C}_\infty$ of the form

$f(z) = \frac{az + b}{cz + d},$

where $a, b, c, d \in \mathbb{C}$ and $ad - bc \neq 0$, with $f(-\frac{d}{c}) = \infty$ and $f(\infty) = \frac{a}{c}$ when $c \neq 0$. (If $c = 0$, then $f(\infty) = \infty$.)

Lemma. The Möbius maps are bijections $\mathbb{C}_\infty \to \mathbb{C}_\infty$.

Proof. The inverse of $f(z) = \frac{az + b}{cz + d}$ is $g(z) = \frac{dz - b}{-cz + a}$, which we can check by composition both ways.

Proposition. The Möbius maps form a group $M$ under function composition. (The Möbius group)

Proof. The group axioms are shown as follows:

0. If $f_1(z) = \frac{a_1z + b_1}{c_1z + d_1}$ and $f_2(z) = \frac{a_2z + b_2}{c_2z + d_2}$, then

$f_2 \circ f_1(z) = \frac{a_2\left(\frac{a_1z + b_1}{c_1z + d_1}\right) + b_2}{c_2\left(\frac{a_1z + b_1}{c_1z + d_1}\right) + d_2} = \frac{(a_1a_2 + b_2c_1)z + (a_2b_1 + b_2d_1)}{(c_2a_1 + d_2c_1)z + (c_2b_1 + d_1d_2)}.$

Now we have to check that $ad - bc \neq 0$: we have $(a_1a_2 + b_2c_1)(c_2b_1 + d_1d_2) - (a_2b_1 + b_2d_1)(c_2a_1 + d_2c_1) = (a_1d_1 - b_1c_1)(a_2d_2 - b_2c_2) \neq 0$.

(This works for $z \neq \infty, -\frac{d_1}{c_1}$. We have to manually check the special cases, which is simply yet more tedious algebra.)

1. The identity function is $1(z) = \frac{1z + 0}{0z + 1}$, which satisfies $ad - bc \neq 0$.

2. We have shown above that $f^{-1}(z) = \frac{dz - b}{-cz + a}$ with $da - bc \neq 0$, which is also a Möbius map.

3. Composition of functions is always associative.

$M$ is not abelian, e.g. $f_1(z) = 2z$ and $f_2(z) = z + 1$ do not commute: $f_1 \circ f_2(z) = 2z + 2$ and $f_2 \circ f_1(z) = 2z + 1$.

Note that the point at “infinity” is not special. $\infty$ is no different to any other point of the Riemann sphere. However, from the way we write down the Möbius map, we have to check infinity specially. In this particular case, we can get quite far with conventions such as $\frac{1}{\infty} = 0$, $\frac{1}{0} = \infty$ and $\frac{a \cdot \infty}{c \cdot \infty} = \frac{a}{c}$.

Clearly $\frac{az + b}{cz + d} = \frac{\lambda az + \lambda b}{\lambda cz + \lambda d}$ for any $\lambda \neq 0$. So we do not have a unique representation of a map in terms of $a, b, c, d$. But $a, b, c, d$ does uniquely determine a Möbius map.

Proposition. The map $\theta: GL_2(\mathbb{C}) \to M$ sending $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \frac{az + b}{cz + d}$ is a surjective group homomorphism.

Proof. Firstly, since the determinant $ad - bc$ of any matrix in $GL_2(\mathbb{C})$ is non-zero, it does map to a Möbius map. This also shows that $\theta$ is surjective.

We have previously calculated that

$\theta(A_2) \circ \theta(A_1) = \frac{(a_1a_2 + b_2c_1)z + (a_2b_1 + b_2d_1)}{(c_2a_1 + d_2c_1)z + (c_2b_1 + d_1d_2)} = \theta(A_2A_1).$

So it is a homomorphism.
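The homomorphism property can be spot-checked numerically: composing two Möbius maps should agree with the Möbius map of the matrix product, and scaling a matrix by a non-zero $\lambda$ should not change the map. The sketch below uses arbitrary sample matrices and a sample point chosen so that no denominator vanishes; `mobius` and `matmul` are hypothetical helper names.

```python
def mobius(M):
    """Return the map z -> (az + b)/(cz + d) for M = ((a, b), (c, d)); finite z only."""
    (a, b), (c, d) = M
    return lambda z: (a * z + b) / (c * z + d)


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][m] * B[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )


A1 = ((1, 2), (3, 4))      # determinant -2
A2 = ((2, 1j), (1, 1))     # determinant 2 - i
z = 0.5 + 0.25j            # sample point avoiding all poles

composed = mobius(A2)(mobius(A1)(z))
via_product = mobius(matmul(A2, A1))(z)
hom_ok = abs(composed - via_product) < 1e-12

# Scaling the matrix by lambda != 0 gives the same Möbius map.
lam = 3 - 2j
scaled = tuple(tuple(lam * x for x in row) for row in A1)
scale_ok = abs(mobius(scaled)(z) - mobius(A1)(z)) < 1e-12
print(hom_ok, scale_ok)
```

This only tests one finite point, so it is a sanity check rather than a proof; the point $\infty$ and the poles need the conventions discussed above.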

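The kernel of $\theta$ is

$\ker(\theta) = \left\{ A \in GL_2(\mathbb{C}) : (\forall z)\; z = \frac{az + b}{cz + d} \right\}.$

We can try different values of $z$: $z = \infty \Rightarrow c = 0$; $z = 0 \Rightarrow b = 0$; $z = 1 \Rightarrow d = a$. So

$\ker\theta = Z = \{\lambda I : \lambda \in \mathbb{C}, \lambda \neq 0\},$

where $I$ is the identity matrix and $Z$ is the centre of $GL_2(\mathbb{C})$.

By the isomorphism theorem, we have $M \cong GL_2(\mathbb{C})/Z$.

Definition (Projective general linear group $PGL_2(\mathbb{C})$). (Non-examinable) The projective general linear group is

$PGL_2(\mathbb{C}) = GL_2(\mathbb{C})/Z.$

Since $f_A = f_B$ iff $B = \lambda A$ for some $\lambda \neq 0$ (where $A, B$ are the corresponding matrices of the maps), if we restrict $\theta$ to $SL_2(\mathbb{C})$, we have that $\theta|_{SL_2(\mathbb{C})}: SL_2(\mathbb{C}) \to M$ is also surjective. The kernel is now just $\{\pm I\}$. So

$M \cong SL_2(\mathbb{C})/\{\pm I\} = PSL_2(\mathbb{C}).$

Clearly $PSL_2(\mathbb{C}) \cong PGL_2(\mathbb{C})$, since both are isomorphic to the Möbius group.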

Proposition. Every Möbius map is a composite of maps of the following form:

(i) Dilation/rotation: $f(z) = az$, $a \neq 0$

(ii) Translation: $f(z) = z + b$

(iii) Inversion: $f(z) = \frac{1}{z}$

Proof. Let $g(z) = \frac{az + b}{cz + d} \in M$.

If $c = 0$, i.e. $g(\infty) = \infty$, then $g(z) = \frac{a}{d}z + \frac{b}{d}$, i.e. $z \mapsto \frac{a}{d}z \mapsto \frac{a}{d}z + \frac{b}{d}$.

If $c \neq 0$, let $g(\infty) = z_0$. Let $h(z) = \frac{1}{z - z_0}$. Then $hg(\infty) = \infty$, so $hg$ is of the above form. We have $h^{-1}(w) = \frac{1}{w} + z_0$, being of type (iii) followed by (ii). So $g = h^{-1}(hg)$ is a composition of maps of the three forms listed above.

Alternatively, with sufficient magic, we have

$z \mapsto z + \frac{d}{c} \mapsto \frac{1}{z + \frac{d}{c}} \mapsto -\frac{ad - bc}{c^2\left(z + \frac{d}{c}\right)} \mapsto \frac{a}{c} - \frac{ad - bc}{c^2\left(z + \frac{d}{c}\right)} = \frac{az + b}{cz + d}.$

Note that the non-calculation method above can be transformed into another (different) composition with the same end result. So the way we compose a Möbius map from the “elementary” maps is not unique.
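The explicit chain of elementary maps for the case $c \neq 0$ (translate by $d/c$, invert, scale by $-(ad - bc)/c^2$, translate by $a/c$) can be verified numerically. The coefficients and sample points below are arbitrary illustrative choices, and the function names are hypothetical.

```python
a, b, c, d = 2, 1j, 1, 3   # sample coefficients with c != 0 and ad - bc = 6 - i != 0


def g(z):
    """The Möbius map z -> (az + b)/(cz + d), evaluated directly."""
    return (a * z + b) / (c * z + d)


def chain(z):
    """The same map built from a translation, an inversion, a dilation and a translation."""
    w = z + d / c                        # (ii) translation
    w = 1 / w                            # (iii) inversion
    w = -(a * d - b * c) / c**2 * w      # (i) dilation/rotation
    return w + a / c                     # (ii) translation


samples = [0, 1, -1 + 2j, 0.3 - 0.7j]   # none of these is the pole z = -d/c
chain_ok = all(abs(g(z) - chain(z)) < 1e-12 for z in samples)
print(chain_ok)
```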

10.2 Fixed points of Möbius maps

Definition (Fixed point). A fixed point of $f$ is a $z$ such that $f(z) = z$.

We know that any Möbius map with $c = 0$ fixes $\infty$. We also know that $z \mapsto z + b$ for any $b \neq 0$ fixes $\infty$ only, whereas $z \mapsto az$ for $a \neq 0, 1$ fixes 0 and $\infty$. It turns out that you cannot have more than two fixed points, unless you are the identity.

Proposition. Any Möbius map with at least 3 fixed points must be the identity.

Proof. Consider $f(z) = \frac{az + b}{cz + d}$. This has fixed points at those $z$ which satisfy $\frac{az + b}{cz + d} = z \Leftrightarrow cz^2 + (d - a)z - b = 0$. A quadratic has at most two roots, unless $c = b = 0$ and $d = a$, in which case the equation just says $0 = 0$.

However, if $c = b = 0$ and $d = a$, then $f$ is just the identity.
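For a concrete map with $c \neq 0$, the fixed points are the roots of $cz^2 + (d - a)z - b = 0$ and can be found with the quadratic formula. The sketch below does this for the sample map $f(z) = \frac{2z + 1}{z + 1}$, chosen only for illustration; `fixed_points` is a hypothetical helper and handles only the case $c \neq 0$ (so $\infty$ is not a fixed point).

```python
import cmath


def fixed_points(a, b, c, d):
    """Finite fixed points of z -> (az + b)/(cz + d), assuming c != 0.

    They are the roots of c z^2 + (d - a) z - b = 0.
    """
    disc = cmath.sqrt((d - a) ** 2 + 4 * b * c)
    return [((a - d) + disc) / (2 * c), ((a - d) - disc) / (2 * c)]


a, b, c, d = 2, 1, 1, 1   # f(z) = (2z + 1)/(z + 1), with ad - bc = 1


def f(z):
    return (a * z + b) / (c * z + d)


roots = fixed_points(a, b, c, d)
are_fixed = all(abs(f(z) - z) < 1e-12 for z in roots)
print(roots, are_fixed)
```

Here the quadratic is $z^2 - z - 1 = 0$, so the two fixed points are $\frac{1 \pm \sqrt{5}}{2}$.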

Proposition. Any Möbius map is conjugate to $f(z) = \nu z$ for some $\nu \neq 0$ or to $f(z) = z + 1$.

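Proof. We have the surjective group homomorphism $\theta : \mathrm{GL}_2(\mathbb{C}) \to M$. Since $\theta$ is a homomorphism, conjugate matrices are sent to conjugate Möbius maps. The conjugacy classes of $\mathrm{GL}_2(\mathbb{C})$ are of types
\[
  \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix} \mapsto g(z) = \frac{\lambda z + 0}{0z + \mu} = \frac{\lambda}{\mu} z
\]
\[
  \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} \mapsto g(z) = \frac{\lambda z + 0}{0z + \lambda} = 1z
\]
\[
  \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix} \mapsto g(z) = \frac{\lambda z + 1}{\lambda} = z + \frac{1}{\lambda}
\]
But the last one is not in the form $z + 1$. We know that the last $g(z)$ can also be represented by $\begin{pmatrix} 1 & \frac{1}{\lambda} \\ 0 & 1 \end{pmatrix}$, which is conjugate to $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ (since that's its Jordan normal form). So $z + \frac{1}{\lambda}$ is also conjugate to $z + 1$.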
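The last conjugacy can also be checked by hand without Jordan normal form. As a worked example (the choice of $h$ below is ours, not taken from the lectures), take $f(z) = z + \frac{1}{\lambda}$ and $h(z) = \lambda z$, so that $h^{-1}(z) = \frac{z}{\lambda}$:

```latex
\[
  h f h^{-1}(z)
  = h\!\left(f\!\left(\frac{z}{\lambda}\right)\right)
  = h\!\left(\frac{z}{\lambda} + \frac{1}{\lambda}\right)
  = \lambda\left(\frac{z + 1}{\lambda}\right)
  = z + 1 .
\]
```

So conjugating by the dilation $h$ rescales the translation distance from $\frac{1}{\lambda}$ to $1$.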

Now we see easily that (for $\nu \neq 0, 1$), $\nu z$ has $0$ and $\infty$ as fixed points, while $z + 1$ only has $\infty$. Does this transfer to their conjugates?

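Proposition. Every non-identity Möbius map has exactly 1 or 2 fixed points.

Proof. Given $f \in M$ and $f \neq \mathrm{id}$, there exists $h \in M$ such that $hfh^{-1}(z) = \nu z$ or $hfh^{-1}(z) = z + 1$. In the first case $\nu \neq 0, 1$, since a conjugate of the identity is the identity. Now
\[
  f(w) = w \iff hf(w) = h(w) \iff hfh^{-1}(h(w)) = h(w).
\]
So $h(w)$ is a fixed point of $hfh^{-1}$ exactly when $w$ is a fixed point of $f$. Since $h$ is a bijection, $f$ and $hfh^{-1}$ have the same number of fixed points.

So $f$ has exactly 2 fixed points if $f$ is conjugate to $\nu z$, and exactly 1 fixed point if $f$ is conjugate to $z + 1$.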

Intuitively, we can show that conjugation preserves fixed points because if we conjugate by $h$, we first move the Riemann sphere around by $h$, apply $f$ (that fixes the fixed points) then restore the Riemann sphere to its original orientation. So we have simply moved the fixed point around by $h$.

10.3 Permutation properties of Möbius maps

We have seen that a Möbius map with three fixed points is the identity. As a corollary, we obtain the following.

Proposition. Given $f, g \in M$. If there exist distinct $z_1, z_2, z_3 \in \mathbb{C}_\infty$ such that $f(z_i) = g(z_i)$, then $f = g$. i.e. every Möbius map is uniquely determined by three points.

Proof. As Möbius maps are invertible, write $f(z_i) = g(z_i)$ as $g^{-1}f(z_i) = z_i$. So $g^{-1}f$ has three fixed points. So $g^{-1}f$ must be the identity. So $f = g$.

Definition (Three-transitive action). An action of $G$ on a set $X$ is called three-transitive if the induced action on $\{(x_1, x_2, x_3) \in X^3 : x_i \text{ pairwise distinct}\}$, given by $g(x_1, x_2, x_3) = (g(x_1), g(x_2), g(x_3))$, is transitive.

This means that for any two triples $x_1, x_2, x_3$ and $y_1, y_2, y_3$ of distinct elements of $X$, there exists $g \in G$ such that $g(x_i) = y_i$.

If this $g$ is always unique, then the action is called sharply three-transitive.

This is a really weird definition. The reason we raise it here is that the Möbius group satisfies this property.

Proposition. The Möbius group $M$ acts sharply three-transitively on $\mathbb{C}_\infty$.

Proof. We want to show that we can send any three points to any other three points. However, it is easier to show that we can send any three points to $0, 1, \infty$.

Suppose we want to send $z_1 \mapsto \infty$, $z_2 \mapsto 0$, $z_3 \mapsto 1$. Then the following works:
\[
  f(z) = \frac{(z - z_2)(z_3 - z_1)}{(z - z_1)(z_3 - z_2)}.
\]
If any term $z_i$ is $\infty$, we simply remove the terms with $z_i$, e.g. if $z_1 = \infty$, we have $f(z) = \frac{z - z_2}{z_3 - z_2}$.

So given also $w_1, w_2, w_3$ distinct in $\mathbb{C}_\infty$ and $g \in M$ sending $w_1 \mapsto \infty$, $w_2 \mapsto 0$, $w_3 \mapsto 1$, we have $g^{-1}f(z_i) = w_i$.

The uniqueness of the map follows from the fact that a Möbius map is uniquely determined by 3 points.
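The construction $g^{-1}f$ in the proof can be carried out with $2 \times 2$ coefficient matrices. The sketch below assumes all six points are finite; the tuple layout `(a, b, c, d)` and the function names are illustrative choices, not notation from the lectures.

```python
def to_inf_zero_one(z1, z2, z3):
    """Coefficients (a, b, c, d) of the map sending z1 -> inf, z2 -> 0, z3 -> 1."""
    # f(z) = (z - z2)(z3 - z1) / ((z - z1)(z3 - z2)), expanded as (az + b)/(cz + d)
    p = z3 - z1
    q = z3 - z2
    return (p, -p * z2, q, -q * z1)


def inverse(m):
    """Coefficients of the inverse map (the adjugate; scalar multiples give the same map)."""
    a, b, c, d = m
    return (d, -b, -c, a)


def compose(m, n):
    """Coefficients of 'm after n', i.e. the matrix product m n."""
    a, b, c, d = m
    e, f, g, h = n
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)


def apply(m, z):
    """Evaluate (az + b)/(cz + d) at a finite z that is not the pole."""
    a, b, c, d = m
    return (a * z + b) / (c * z + d)


def three_point_map(zs, ws):
    """Coefficients of the Möbius map with zs[i] -> ws[i] for i = 0, 1, 2."""
    f = to_inf_zero_one(*zs)
    g = to_inf_zero_one(*ws)
    return compose(inverse(g), f)


zs = (0, 1, 2j)
ws = (1 + 1j, -3, 0.5)
m = three_point_map(zs, ws)
images = [apply(m, z) for z in zs]
```

Composing the matrices first means the intermediate value $\infty$ never has to be represented, and `images` should reproduce `ws` up to rounding.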

Three points not only define a Möbius map uniquely. They also uniquely define a line or circle. Note that on the Riemann sphere, we can think of a line as a circle through infinity, and it would be technically correct to refer to both of them as “circles”. However, we would rather be clearer and say “line/circle”.

We will see how Möbius maps relate to lines and circles. We will first recap some knowledge about lines and circles in the complex plane.

Lemma. The general equation of a circle or straight line in $\mathbb{C}$ is
\[
  Az\bar{z} + \bar{B}z + B\bar{z} + C = 0,
\]
where $A, C \in \mathbb{R}$ and $|B|^2 > AC$.

$A = 0$ gives a straight line. If $A \neq 0$, $B = 0$, we have a circle centered at the origin. If $C = 0$, the circle passes through $0$.

Proof. This comes from noting that $|z - B| = r$ for $r \in \mathbb{R}$, $r > 0$ is a circle; $|z - a| = |z - b|$ with $a \neq b$ is a line. The detailed proof can be found in Vectors and Matrices.

Proposition. Möbius maps send circles/straight lines to circles/straight lines. Note that they can send circles to straight lines and vice versa.

Alternatively, Möbius maps send circles on the Riemann sphere to circles on the Riemann sphere.

Proof. We can either calculate it directly using
\[
  w = \frac{az+b}{cz+d} \iff z = \frac{dw - b}{-cw + a}
\]
and substituting $z$ into the circle equation, which gives $A'w\bar{w} + \bar{B}'w + B'\bar{w} + C' = 0$ with $A', C' \in \mathbb{R}$.

Alternatively, we know that each Möbius map is a composition of translation, dilation/rotation and inversion. We can check for each of the three types. Clearly dilation/rotation and translation map a circle/line to a circle/line. So we simply do inversion: if $w = z^{-1}$, then
\[
  Az\bar{z} + \bar{B}z + B\bar{z} + C = 0 \iff Cw\bar{w} + Bw + \bar{B}\bar{w} + A = 0.
\]

Example. Consider $f(z) = \frac{z - i}{z + i}$. Where does the real line go? The real line is simply a circle through $0, 1, \infty$. $f$ maps this circle to the circle containing $f(\infty) = 1$, $f(0) = -1$ and $f(1) = -i$, which is the unit circle.

Where does the upper half plane go? We know that the Möbius map is smooth. So the upper half plane either maps to the inside of the circle or the outside of the circle. We try the point $i$, which maps to $0$. So the upper half plane is mapped to the inside of the circle.
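The example can be spot-checked numerically. This sketch samples a few hand-picked points (the sample lists are arbitrary choices) and tests that real inputs land on the unit circle and upper-half-plane inputs land inside it.

```python
def f(z):
    """The Möbius map z -> (z - i)/(z + i) from the example."""
    return (z - 1j) / (z + 1j)


# A few points on the real line and in the upper half plane.
real_samples = [-100.0, -2.5, -1.0, 0.0, 0.3, 1.0, 7.0]
upper_samples = [1j, 2 + 0.5j, -3 + 4j, 0.01j]

# Real points should have |f(x)| = 1; upper-half-plane points |f(z)| < 1.
on_unit_circle = all(abs(abs(f(x)) - 1) < 1e-12 for x in real_samples)
inside_disc = all(abs(f(z)) < 1 for z in upper_samples)
```

Both flags should come out true, since $|x - i| = |x + i|$ for real $x$ and $|z - i| < |z + i|$ whenever $\operatorname{Im} z > 0$.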

10.4 Cross-ratios

Finally, we’ll look at an important concept known as cross-ratios. Roughly speaking, this is a quantity that is preserved by Möbius transforms.

Definition (Cross-ratios). Given four distinct points $z_1, z_2, z_3, z_4 \in \mathbb{C}_\infty$, their cross-ratio is $[z_1, z_2, z_3, z_4] = g(z_4)$, with $g$ being the unique Möbius map that maps $z_1 \mapsto \infty$, $z_2 \mapsto 0$, $z_3 \mapsto 1$. So $[\infty, 0, 1, \lambda] = \lambda$ for any $\lambda \neq \infty, 0, 1$. We have
\[
  [z_1, z_2, z_3, z_4] = \frac{z_4 - z_2}{z_4 - z_1} \cdot \frac{z_3 - z_1}{z_3 - z_2}
\]
(with special cases as above).

We know that this exists and is uniquely defined because $M$ acts sharply three-transitively on $\mathbb{C}_\infty$.

Note that different authors use different permutations of $1, 2, 3, 4$, but they all lead to the same result as long as you are consistent.

Lemma. For $z_1, z_2, z_3, z_4 \in \mathbb{C}_\infty$ all distinct, we have
\[
  [z_1, z_2, z_3, z_4] = [z_2, z_1, z_4, z_3] = [z_3, z_4, z_1, z_2] = [z_4, z_3, z_2, z_1],
\]
i.e. if we perform a double transposition on the entries, the cross-ratio is retained.

Proof. By inspection of the formula.

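Proposition. If $f \in M$, then $[z_1, z_2, z_3, z_4] = [f(z_1), f(z_2), f(z_3), f(z_4)]$.

Proof. Use our original definition of the cross-ratio (instead of the formula). Let $g$ be the unique Möbius map such that $[z_1, z_2, z_3, z_4] = g(z_4) = \lambda$, i.e.
\[
  z_1 \overset{g}{\longmapsto} \infty, \quad
  z_2 \mapsto 0, \quad
  z_3 \mapsto 1, \quad
  z_4 \mapsto \lambda.
\]
We know that $gf^{-1}$ sends
\[
  f(z_1) \overset{f^{-1}}{\longmapsto} z_1 \overset{g}{\longmapsto} \infty
\]
\[
  f(z_2) \overset{f^{-1}}{\longmapsto} z_2 \overset{g}{\longmapsto} 0
\]
\[
  f(z_3) \overset{f^{-1}}{\longmapsto} z_3 \overset{g}{\longmapsto} 1
\]
\[
  f(z_4) \overset{f^{-1}}{\longmapsto} z_4 \overset{g}{\longmapsto} \lambda
\]
So $[f(z_1), f(z_2), f(z_3), f(z_4)] = gf^{-1}f(z_4) = g(z_4) = \lambda$.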
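Invariance of the cross-ratio, and the double-transposition symmetry, can both be tested on concrete numbers. This is a small sketch assuming finite points and a map whose pole avoids them; the coefficients and sample points are arbitrary.

```python
def cross_ratio(z1, z2, z3, z4):
    """[z1, z2, z3, z4] for four distinct finite points, using the notes' convention."""
    return (z4 - z2) * (z3 - z1) / ((z4 - z1) * (z3 - z2))


def mobius(a, b, c, d):
    """Return z -> (az + b)/(cz + d) for finite z with cz + d != 0."""
    return lambda z: (a * z + b) / (c * z + d)


f = mobius(2 + 1j, -1, 1j, 3)          # pole at z = 3i, not among the samples
zs = [0.5, 1j, -2 + 1j, 3 - 4j]

before = cross_ratio(*zs)
after = cross_ratio(*[f(z) for z in zs])

# Double transposition (1 2)(3 4) applied to the entries.
swapped = cross_ratio(zs[1], zs[0], zs[3], zs[2])
```

The three values `before`, `after` and `swapped` should coincide up to floating-point error.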

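In fact, we can see from this proof that: given $z_1, z_2, z_3, z_4$ all distinct and $w_1, w_2, w_3, w_4$ distinct in $\mathbb{C}_\infty$, then there exists $f \in M$ with $f(z_i) = w_i$ iff $[z_1, z_2, z_3, z_4] = [w_1, w_2, w_3, w_4]$.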

Corollary. $z_1, z_2, z_3, z_4$ lie on some circle/straight line iff $[z_1, z_2, z_3, z_4] \in \mathbb{R}$.

Proof. Let $C$ be the circle/line through $z_1, z_2, z_3$. Let $g$ be the unique Möbius map with $g(z_1) = \infty$, $g(z_2) = 0$, $g(z_3) = 1$. Then $g(z_4) = [z_1, z_2, z_3, z_4]$ by definition.

Since we know that Möbius maps preserve circles/lines, $z_4 \in C \iff g(z_4)$ is on the line through $\infty, 0, 1$, i.e. $g(z_4) \in \mathbb{R}$.
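The corollary gives a practical test for whether four points are concyclic or collinear. A minimal sketch, assuming four distinct finite points; the function name and the absolute tolerance `tol` are illustrative choices and would need tuning for badly scaled inputs.

```python
def cross_ratio(z1, z2, z3, z4):
    """[z1, z2, z3, z4] for four distinct finite points."""
    return (z4 - z2) * (z3 - z1) / ((z4 - z1) * (z3 - z2))


def on_common_circle_or_line(z1, z2, z3, z4, tol=1e-9):
    """True if the cross-ratio is real up to the tolerance."""
    return abs(cross_ratio(z1, z2, z3, z4).imag) < tol


unit_circle = on_common_circle_or_line(1, 1j, -1, -1j)   # four points on |z| = 1
real_line = on_common_circle_or_line(0, 1, 2, 5)         # four collinear points
generic = on_common_circle_or_line(1, 1j, -1, 0)         # 0 is not on |z| = 1
```

For the first quadruple the cross-ratio works out to $2$, and for the last it is $1 + i$, so only the first two calls should report a common circle/line.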

11 Projective line (non-examinable)

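We have seen in matrix groups that $\mathrm{GL}_2(\mathbb{C})$ acts on $\mathbb{C}^2$, the column vectors. Instead, we can also have $\mathrm{GL}_2(\mathbb{C})$ acting on the set of 1-dimensional subspaces (i.e. lines) of $\mathbb{C}^2$.

For any non-zero $v \in \mathbb{C}^2$, write the line generated by $v$ as $\langle v \rangle$. Then clearly $\langle v \rangle = \{\lambda v : \lambda \in \mathbb{C}\}$. Now for any $A \in \mathrm{GL}_2(\mathbb{C})$, define the action as $A\langle v \rangle = \langle Av \rangle$. Check that this is well-defined: for any $\langle v \rangle = \langle w \rangle$, we want to show that $\langle Av \rangle = \langle Aw \rangle$. This is true because $\langle v \rangle = \langle w \rangle$ if and only if $w = \lambda v$ for some $\lambda \in \mathbb{C} \setminus \{0\}$, and then $\langle Aw \rangle = \langle A\lambda v \rangle = \langle \lambda(Av) \rangle = \langle Av \rangle$.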

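What is the kernel of this action? By definition the kernel has to fix all lines. In particular, it has to fix our magic lines generated by $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Since we want $A\left\langle \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right\rangle = \left\langle \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right\rangle$, we must have $A\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \lambda \\ 0 \end{pmatrix}$ for some $\lambda$. Similarly, $A\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ \mu \end{pmatrix}$. So we can write $A = \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}$. However, we also need $A\left\langle \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\rangle = \left\langle \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\rangle$. Since $A$ is a linear function, we know that
\[
  A\begin{pmatrix} 1 \\ 1 \end{pmatrix} = A\begin{pmatrix} 1 \\ 0 \end{pmatrix} + A\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \lambda \\ \mu \end{pmatrix}.
\]
For the final vector to be parallel to $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, we must have $\lambda = \mu$. So $A = \lambda I$ for some $\lambda$. Clearly any matrix of this form fixes any line. So the kernel is $Z = \{\lambda I : \lambda \in \mathbb{C} \setminus \{0\}\}$.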

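Note that every line is uniquely determined by its slope. For any $v = (v_1, v_2)$, $w = (w_1, w_2)$, we have $\langle v \rangle = \langle w \rangle$ iff $v_1/v_2 = w_1/w_2$, where the ratio is read as $\infty$ when the second coordinate is $0$. So we have a one-to-one correspondence from our lines to $\mathbb{C}_\infty$, that maps $\left\langle \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \right\rangle \leftrightarrow z_1/z_2$.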

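Finally, for each $A \in \mathrm{GL}_2(\mathbb{C})$, given any line $\left\langle \begin{pmatrix} z \\ 1 \end{pmatrix} \right\rangle$, we have
\[
  \begin{pmatrix} a & b \\ c & d \end{pmatrix} \left\langle \begin{pmatrix} z \\ 1 \end{pmatrix} \right\rangle
  = \left\langle \begin{pmatrix} az + b \\ cz + d \end{pmatrix} \right\rangle
  = \left\langle \begin{pmatrix} \frac{az + b}{cz + d} \\ 1 \end{pmatrix} \right\rangle
  \leftrightarrow \frac{az + b}{cz + d}.
\]
So $\mathrm{GL}_2(\mathbb{C})$ acting on the lines is just “the same” as the Möbius group acting on points.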
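The correspondence between the two actions can be illustrated on explicit vectors. In this sketch a line is stored as any non-zero spanning vector, and `INF` is a hypothetical sentinel for the line spanned by $(1, 0)$; none of these names come from the lectures.

```python
INF = "inf"  # hypothetical sentinel for the slope of the line spanned by (1, 0)


def act_on_line(A, v):
    """Apply A = ((a, b), (c, d)) to a spanning vector v = (v1, v2) of a line."""
    (a, b), (c, d) = A
    v1, v2 = v
    return (a * v1 + b * v2, c * v1 + d * v2)


def slope(v):
    """Point of C ∪ {∞} corresponding to the line spanned by v, namely v1/v2."""
    v1, v2 = v
    return INF if v2 == 0 else v1 / v2


def mobius(A, z):
    """Evaluate (az + b)/(cz + d) at a finite z that is not the pole."""
    (a, b), (c, d) = A
    return (a * z + b) / (c * z + d)


A = ((2 + 1j, -1), (1j, 3))
z = 0.5 - 2j

via_lines = slope(act_on_line(A, (z, 1)))            # act on the line, then read off the slope
via_rescaled = slope(act_on_line(A, (3j * z, 3j)))   # a different spanning vector of the same line
via_mobius = mobius(A, z)                            # act on the point directly
```

All three values should agree up to rounding, which reflects both the well-definedness of the action on lines and its agreement with the Möbius action. Applying `A` to the line spanned by `(1, 0)` gives slope $a/c$, matching $g(\infty) = a/c$ when $c \neq 0$.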