5Set theory

II Logic and Set Theory



5.1 Axioms of set theory
Definition (Zermelo-Fraenkel set theory). Zermelo-Fraenkel set theory (ZF)
has language = , Π = {∈}, with arity 2.
Then a “universe of sets” will simply mean a model of these axioms, a pair
(
V,
), where
V
is a set and
is a binary relation on
V
in which the axioms are
true (officially, we should write
V
, but it’s so weird that we don’t do it). Each
model (
V,
) will (hopefully) contain a copy of all of maths, and so will look
very complicated!
ZF will have 2 axioms to get started with, 4 to build things, and 3 more
weird ones one might not realize are needed.
Axiom (Axiom of extension). “If two sets have the same elements, they are the
same set”.
(x)(y)((z)(z x z y) x = y).
We could replace the
with an
, but the converse statement
x
=
y
(
z x z y
) is an instance of a logical axiom, and we don’t have to explicitly
state it.
Axiom (Axiom of separation). “Can form subsets of sets”. More precisely, for
any set x and a formula p, we can form {z x : p(z)}.
(t
1
) ···(t
n
)(x)(y)(z)(z y (z x p)).
This is an axiom scheme, with one instance for each formula
p
with free variables
t
1
, ··· , t
n
, z.
Note again that we have those funny (
t
i
). We do need them to form, e.g.
{z x : t z}, where t is a parameter.
This is sometimes known as Axiom of comprehension.
Axiom (Axiom of empty set). “The empty-set exists”
(x)(y)(y ∈ x).
We write
for the (unique, by extension) set with no members. This is an
abbreviation:
p
(
) means (
x
)(
x has no members p
(
x
)). Similarly, we tend to
write {z x : p(z)} for the set given by separation.
Axiom (Axiom of pair set). “Can form {x, y}”.
(x)(y)(z)(t)(t z (t = x t = y)).
We write {x, y} for this set. We write {x} for {x, x}.
We can now define ordered pairs:
Definition (Ordered pair). An ordered pair (x, y) is {{x}, {x, y}}.
We define x is an ordered pair” to mean (y)(z)(x = (y, z)).
We can show that (a, b) = (c, d) (a = c) (b = d).
Definition (Function). We define f is a function to mean
(x)(x f x is an ordered pair)
(x)(y)(z)[(x, y) f (x, z) f ] y = z.
We define
x
=
dom f
to mean
f
is a function and (
y
)(
y x
(
z
)((
y, z
)
f
)).
We define f : x y to mean f is a function and dom f = x and
(z)[(t)((t, z) f ) z y].
Once we’ve defined them formally, let’s totally forget about the definition
and move on with life.
Axiom (Axiom of union). “We can form unions” Intuitively, we have
a b c
=
{x
:
x a or x b or x c}
. but instead of
a b c
, we write
S
{a, b, c}
so that
we can express infinite unions as well.
(x)(y)(z)(z y (t)(t x z t)).
We tend to write
S
x for the set given above. We also write x y for
S
{x, y}.
Note that we can define intersection
T
x
as a subset of
y
(for any
y x
) by
separation, so we don’t need an axiom for that.
Axiom (Axiom of power set). “Can form power sets”.
(x)(y)(z)(z y z x),
where z x means (t)(t z t x).
We tend to write P(x) for the set generated above.
We can now form
x × y
, as a subset of
P
(
P
(
x y
)), because for
t x, s y
,
we have (t, s) = {{t}, {t, s}} P(P(x y)).
Similarly, we can form the set of all functions from
x
to
y
as a subset of
P(x × y) (or P(P(P(x y)))!).
Now we’ve got the easy axioms. Time for the weird ones.
Axiom of infinity
From the axioms so far, we cannot prove that there is an infinite set! We start
from the empty set, and all the operations above can only produce finite sets.
So we need an axiom to say that the natural numbers exists.
But how could we do so in finitely many words? So far, we do have infinitely
many sets. For example, if we write
x
+
for
x {x}
, it is easy to check that
,
+
,
++
, ··· are all distinct.
(Writing them out explicitly, we have
+
=
{∅}
,
++
=
{∅, {∅}}
,
+++
=
{∅, {∅}, {∅, {∅}}}
. We can also write 0 for
, 1 for
+
, 2 for
++
. So 0 =
,
1 = {0}, 2 = {0, 1}, ···)
Note that all of these are individually sets, and so
V
is definitely infinite.
However, inside
V
,
V
is not a set, or else we can apply separation to obtain
Russell’s paradox.
So the collection of all 0
,
1
, ··· ,
need not be a set. Therefore we want an
axiom that declares this is a set. So we want an axiom stating
x
such that
x,
+
x,
++
x, ···
. But this is an infinite statement, and we need a
smarter way of formulating this.
Axiom (Axiom of infinity). “There is an infinite set”.
(x)( x (y)(y x y
+
x)).
We say any set that satisfies the above axiom is a successor set.
A successor set can contain a lot of nonsense elements. How do we want to
obtain N, without nonsense?
We know that the intersection of successors is also a successor set. So there
is a least successor set, i.e. the intersection of all successor sets. Call this
ω
. This
will be our copy of N in V . So
(x)(x ω (y)(y is a successor set x y)).
Therefore, if we have
(x)[(x ω x is a successor set) x = ω],
by definition of
ω
. By the definition of “is a successor set”, we can write this as
(x)[(x ω x (y)(y x y
+
x)) x = ω].
This is genuine induction (in
V
)! It is not just our weak first-order axiom in
Peano’s axioms.
Also, we have
(x)(x ω x
+
= ) and (x)(y)((x ω y ω x
+
= y
+
) x = y)
by
ω
-induction (i.e. induction on
ω
). Hence
ω
satisfies the axioms of natural
numbers.
We can now write, e.g.
x
is finite” for (
y
)(
f
)(
y x f bijects x with y
).
Similarly, we define
x
is countable” to mean
x
is finite or
x
bijects with
ω”.
That’s about it for this axiom. Time for the next one:
Axiom of foundation
“Sets are built out of simpler sets”. We want to ban weird things like
x x
,
or
x y y x
, or similarly for longer chains. We also don’t want infinite
descending chains like ···x
3
x
2
x
1
x
0
.
How can we capture the “wrongness” of these weird things all together? In
the first case
x x
, we see that
{x}
has no
-minimal element (we say
y
is
-minimal in
x
if (
z x
)(
z ∈ y
)). In the second case,
{x, y}
has no minimal
element. In the last case, {x
0
, x
1
, x
2
, ···} has no -minimal element.
Axiom (Axiom of foundation). “Every (non-empty) set has an
-minimal
member”
(x)(x = (y)(y x (z)(z x z ∈ y))).
This is sometimes known as the Axiom of regularity.
Note that most of the time, we don’t actually use the Axiom of Foundation.
It’s here just so that our universe “looks nice”. Most results in set theory don’t
rely on foundation.
We will later show that this entails that all sets in
V
can be “built out of”
the empty set, but that’s left for a later time.
Axiom of replacement
In ordinary mathematics, we often say things like “for each
x I
, we have some
A
i
. Now take
{A
i
:
i I}
”. For example, (after defining ordinals), we want to
have the set {ω + i : i ω}.
How do we know that
{A
i
:
i I}
is a set? How do we know that
i 7→ A
i
is a
function, i.e. that
{
(
i, A
i
) :
i I}
is a set? It feels like it should be. We want an
axiom that says something along the line of “the image of a set under a function
is a set”. However, we do not know that the thing is a function yet. So we will
have “the image of a set under something that looks like a function is a set”.
To formally talk about “something that looks like a function”, we need a
digression on classes:
Digression on classes
x 7→ {x}
for all
x
looks like a function, but isn’t it, since every function
f
has a
(set) domain, defined as a suitable subset of
SS
f
, but our function here has
domain V .
So what is this x 7→ {x}? We call it a class.
Definition (Class). Let (
V,
) be an
L
-structure. A class is a collection
C
of
points of
V
such that, for some formula
p
with free variable
x
(and maybe more
funny parameters), we have
x C p holds.
Intuitively, everything of the form {x V : p(x)} is a class.
Note that here we are abusing notation. When we say
x C
, the symbol
does not mean the membership relation in
V
. Inside the theory, we should view
x C as a shorthand of p(x) holds”.
Example.
(i) V is a class, by letting p be x = x”.
(ii)
For any
t
,
{x V
:
t x}
is a class, with
p
being
t x
. Here
t
is a
parameter.
(iii) For any set y V , y is a class let p be x y”.
Definition (Proper class). We say
C
is a proper class if
C
is not a set (in
V
), ie
¬(y)(x)(x y p).
Similarly, we can have
Definition (Function-class). A function-class
F
is a collection of ordered pairs
such that there is a formula
p
with free variables
x, y
(and maybe more) such
that
(x, y) F p holds, and (x, y) F (x, z) F y = z.
Example. x 7→ {x} is a function-class: take p to be y = {x}.
Back to replacement
How do we talk about function-classes? We cannot refer to classes inside
V
.
Instead, we must refer to the first-order formula p.
Axiom (Axiom of replacement). “The image of a set under a function-class is a
set”. This is an axiom scheme, with an instance for each first-order formula p:
(t
1
) ···(t
n
)
| {z }
parameters
[(x)(y)(z)((p p[z/y]) y = z)]
| {z }
p defines a function-class
[(x)(y)(z y (t)(t x p[t/x, z/y]))]
| {z }
image of x under F is a set
.
Example. For any set
x
, we can form
{{t}
:
t x}
by
p
being
y
=
{x}
”.
However, this is a bad use of replacement, because we can already obtain that
by power set (and separation).
We will give a good example later.
So that’s all the axioms of ZF. Note that we did not include the Axiom of
choice!
Axiom of choice
Definition (ZFC). ZFC is the axioms ZF + AC, where AC is the axiom of
choice, “every family of non-empty sets has a choice function”.
(f)[(x)(x dom f f (x) = )
(g)(dom g = dom f) (x)(x dom g g(x) f (x))]
Here we define a family of sets
{A
i
:
i I}
to be a function
f
:
I V
such that
i 7→ A
i
.