4 Well-quasi-orderings
We end with a discussion of well-quasi-orders. The beginning of all this is the
idea of well-foundedness. We begin by recalling the definition of a quasi-order.
Definition (Quasi-order). An order ⟨X, ≤⟩ is a quasi-order if it is transitive
and reflexive, i.e. for all a, b, c ∈ X,
– If a ≤ b and b ≤ c, then a ≤ c.
– We always have a ≤ a.
Note that unlike a poset, we do not require antisymmetry, i.e. we can have
distinct a, b satisfying a ≤ b and b ≤ a. There is a canonical way of turning a
quasi-order into a partial order, namely by identifying elements satisfying a ≤ b
and b ≤ a, and so it appears we don’t lose much by restricting to partial orders.
However, it turns out a lot of the orders we construct are naturally quasi-orders,
not partial orders, and the extra hypothesis of being a partial order is not really
useful. So we will talk about quasi-orders.
The main idea behind the whole chapter is the notion of well-foundedness.
We probably first met this concept in set theory, where the axiom of foundation
said sets are well-founded. Chances are, it was motivated by the desire to rule
out scenarios such as x ∈ x. However, as we do set theory, we figure out that the
real benefit it brings us is the ability to do ε-induction.
In general, for an arbitrary relation R on X, we want to admit the following
R-induction principle:

    (∀x ∈ X)((∀y ∈ X)(R(y, x) → ϕ(y)) → ϕ(x))
    ------------------------------------------
    (∀x ∈ X) ϕ(x)
How can this fail? Suppose the top line holds, but the bottom line does not. We
set
A = {x : ¬ϕ(x)}.
Then we know this set is non-empty. Moreover, the top line says
    (∀x ∈ A)(∃y ∈ A) R(y, x).
Thus, if we forbid such sets from existing, then we can admit the induction
principle.
Definition (Well-founded relation). A relation R on X is said to be well-founded
if for all non-empty A ⊆ X, there is some x ∈ A such that for any y ∈ A, we
have ¬R(y, x).
There is another way of expressing well-foundedness, if we believe in the axiom
of dependent choice.
Axiom (Axiom of dependent choice). Let X be a set and R a relation on X.
The axiom of dependent choice says that if for all x ∈ X there exists y ∈ X such
that R(x, y), then we can find a sequence x_1, x_2, · · · such that R(x_i, x_{i+1}) for
all i ∈ ℕ.
This is a rather weak form of the axiom of choice, which we will freely assume.
Assuming this, being well-founded is equivalent to the non-existence of infinite
decreasing R-chains.
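For a finite relation, the definition is directly checkable: repeatedly discard the
R-minimal elements, and the relation is well-founded exactly when everything
eventually gets discarded. Here is a minimal Haskell sketch of this idea (the
representation and the names Rel, isMinimal, wellFounded are ours, made up
for illustration):

import Data.List (partition)

-- A finite relation as a list of pairs (y, x) meaning "y R x".
type Rel a = [(a, a)]

-- x is R-minimal in the set as if no y in as satisfies R(y, x).
isMinimal :: Eq a => Rel a -> [a] -> a -> Bool
isMinimal r as x = not (any (\y -> (y, x) `elem` r) as)

-- On a finite carrier, R is well-founded iff repeatedly removing the
-- R-minimal elements empties the carrier: every non-empty subset we ever
-- see must contain an R-minimal element.
wellFounded :: Eq a => Rel a -> [a] -> Bool
wellFounded _ [] = True
wellFounded r as =
  case partition (isMinimal r as) as of
    ([], _)   -> False        -- a non-empty subset with no R-minimal element
    (_, rest) -> wellFounded r rest

-- wellFounded [(1,2),(2,3)] [1,2,3]       == True
-- wellFounded [(1,2),(2,3),(3,1)] [1,2,3] == False  (a cycle)
-- wellFounded [(1,1)] [1]                 == False  (x R x)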
Why do we care about well-foundedness? There are many reasons, and one we
can provide comes from computer science. In general, it guarantees termination.
When writing programs, we often find ourselves writing loops like
while ( some condition holds ) {
do something;
}
This repeats the “do something” until the condition fails. In general, there is no
reason why such a program should eventually terminate.
Often, the loop is actually of the form
while ( i != BOT ) {
do something with i;
}
where i takes values in some partial order R, with BOT being the bottom element.
Suppose every time the loop runs, the value of i decreases. If R is well-founded,
then we are guaranteed that the program will indeed terminate. More generally,
if R doesn’t have a unique bottom element, but is still well-founded, then loops
of the form
while ( i is not a minimal element ) {
do something with i;
}
are still guaranteed to terminate.
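As a concrete (and standard) illustration, here is Euclid’s algorithm phrased as
such a loop in Haskell: the state (a, b) is updated until b reaches the bottom
element 0, and b strictly decreases in the well-founded order on ℕ at every step,
so the recursion must terminate. (This example is ours, not from the notes.)

-- Euclid's algorithm as a loop: b plays the role of i, and 0 is BOT.
-- At each step b is replaced by a `mod` b < b, so b strictly decreases
-- along the well-founded order < on the naturals.
euclid :: Integer -> Integer -> Integer
euclid a 0 = a                      -- i == BOT: stop
euclid a b = euclid b (a `mod` b)   -- do something with i; i decreases

-- euclid 252 105 == 21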
Example. Consider typed λ-calculus, which is the form of λ-calculus where we
require each term to have a type, and all terms have to be typed, just as we did
when we used it to decorate natural deduction proofs.
We let X be the set of all typed λ-terms up to α-equivalence, and for f, g ∈ X,
we write f < g if g can be β-reduced to f. It is a remarkable theorem that this
relation is well-founded. In other words, we can keep applying β-reduction in
any way we like, and we are guaranteed to always end up in a normal form. It is
also true that the normal form of a λ-term is unique, but this is not implied by
well-foundedness.
It is clear from the definition that a quasi-order is never well-founded, since it
is reflexive. It is also clear that it fails for a silly reason, and we can fix it by
defining
Definition (Well-founded quasi-order). A quasi-order is well-founded if there is
no strictly decreasing infinite sequence.
Here we are not worried about choice issues anymore.
Once we are convinced that being well-founded is a good thing, it is natural to
ask what operations preserve well-foundedness. We begin by listing some
constructions we can do.
Definition (P(X)). Let ⟨X, ≤_X⟩ be a quasi-order. We define a quasi-order ≤⁺_X
on P(X) by

    X_1 ≤⁺_X X_2  if  (∀x_1 ∈ X_1)(∃x_2 ∈ X_2) (x_1 ≤_X x_2).
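On finite sets (represented here simply as lists), the lifted order can be computed
directly. A minimal Haskell sketch, assuming the base order is given as a function
leq (the name leqPow is ours, for illustration only):

-- Lift a base quasi-order leq on elements to the order ≤⁺ on finite sets:
-- X1 ≤⁺ X2 iff every element of X1 is below some element of X2.
leqPow :: (a -> a -> Bool) -> [a] -> [a] -> Bool
leqPow leq xs1 xs2 = all (\x1 -> any (leq x1) xs2) xs1

-- leqPow (<=) [1, 3] [2, 5] == True
-- leqPow (<=) [1, 6] [2, 5] == False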
Definition (X^{<ω}). Let ⟨X, ≤_X⟩ be a quasi-order. We let X^{<ω} be the set of all
finite lists (i.e. sequences) in X. We define a quasi-order ≤_s on X^{<ω} recursively
by
– nil ≤_s ℓ_1, where nil is the empty list;
– tail(ℓ_1) ≤_s ℓ_1;
– if tail(ℓ_1) ≤_s tail(ℓ_2) and head(ℓ_1) ≤_X head(ℓ_2), then ℓ_1 ≤_s ℓ_2
for all lists ℓ_1, ℓ_2.
Equivalently, {x_i}_{i=1}^n ≤_s {y_i}_{i=1}^ℓ if there is a subsequence {y_{i_k}}_{k=1}^n
of {y_i} such that x_k ≤ y_{i_k} for all k.
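This equivalent formulation can be checked greedily: scan the longer list from
the left, matching each element of the shorter list against the first remaining
element that dominates it. A Haskell sketch, with our own names and again
assuming the base order is a function leq:

-- xs ≤_s ys iff some subsequence of ys dominates xs pointwise under leq.
leqList :: (a -> a -> Bool) -> [a] -> [a] -> Bool
leqList _   []       _  = True                  -- nil ≤_s anything
leqList _   _        [] = False                 -- a non-empty list never embeds into nil
leqList leq (x : xs) (y : ys)
  | leq x y   = leqList leq xs ys               -- match x against y, continue
  | otherwise = leqList leq (x : xs) ys         -- skip y, look further right

-- leqList (<=) [1, 4] [2, 3, 5] == True
-- leqList (<=) [4, 1] [2, 3, 5] == False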
Finally, the construction we are really interested in is trees.
Definition (Tree). Let ⟨X, ≤_X⟩ be a quasi-order. The set of all trees (in X) is
defined inductively as follows:
If x ∈ X and L is a list of trees, then (x, L) is a tree. We call x a node of
the tree. In particular, (x, nil) is a tree. We write

    root(x, L) = x,    children(x, L) = L.
Haskell programmers would define this by
data Tree a = Branch a [Tree a]
We write Trees(X) for the set of all trees on X.
We define an order relation ≤_s on Trees(X) as follows: let T_1, T_2 ∈ Trees(X).
Then T_1 ≤_s T_2 if
(i) T_1 ≤_s T′ for some T′ ∈ children(T_2); or
(ii) root(T_1) ≤_X root(T_2) and children(T_1) ≤_s children(T_2) as lists.
It is an exercise to think hard and convince yourself this recursively-defined
relation is indeed well-defined.
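Using the Tree type declared above, the tree order can be transcribed almost
literally in Haskell. The names leqTree and leqList are ours; leqList is the list
embedding from the earlier sketch, repeated so this block stands alone apart
from the data declaration:

leqList :: (a -> a -> Bool) -> [a] -> [a] -> Bool
leqList _   []       _  = True
leqList _   _        [] = False
leqList leq (x : xs) (y : ys)
  | leq x y   = leqList leq xs ys
  | otherwise = leqList leq (x : xs) ys

-- T1 ≤_s T2 iff T1 embeds into some child of T2 (clause (i)), or the roots
-- are related and the children lists are related as lists (clause (ii)).
leqTree :: (a -> a -> Bool) -> Tree a -> Tree a -> Bool
leqTree leq t1@(Branch x1 cs1) (Branch x2 cs2) =
     any (leqTree leq t1) cs2                        -- clause (i)
  || (leq x1 x2 && leqList (leqTree leq) cs1 cs2)    -- clause (ii)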
Which of these operations preserve well-foundedness? It is not hard to see
the following:
Lemma. If ⟨X, ≤_X⟩ is a well-founded quasi-order, then so are X^{<ω} and Trees(X).
The reason why this works is that the lifts of the order on X to X^{<ω} and
Trees(X) are “of finite character”. To compare two elements in X^{<ω} or Trees(X),
we only have to appeal to finitely many instances of ≤_X.
On the other hand, ⟨P(X), ≤⁺_X⟩ is in general not well-founded. For example,
⟨ℕ, =⟩ is a well-founded quasi-order, being the reflexive transitive closure of the
trivial relation. But ⟨P(ℕ), =⁺⟩ has an infinite strictly decreasing sequence.
Indeed, =⁺ is just the subset relation, and ℕ ⊋ ℕ \ {0} ⊋ ℕ \ {0, 1} ⊋ · · · is
such a sequence.
Definition (Well-quasi-order). A well-quasi-order (WQO) is a well-founded
quasi-order ⟨X, ≤_X⟩ such that ⟨P(X), ≤⁺_X⟩ is also well-founded.
Note that the power set of a well-quasi-order need not be a well-quasi-order.
The question we want to answer is whether the operations X^{<ω} and Trees(X)
preserve well-quasi-orders. To do so, we try to characterize well-quasi-orders in
a more concrete way.
When does a quasi-order fail to be a well-quasi-order? Suppose x_1, x_2, x_3, · · ·
is an infinite antichain, i.e. a collection of mutually unrelated elements. Then as
above,

    ⟨{x_i : i > n} : n ∈ ℕ⟩

is an infinite descending sequence in ⟨P(X), ≤⁺⟩. Of course, to be well-founded,
we also need the non-existence of infinite decreasing sequences.
Proposition. Let ⟨X, ≤⟩ be a quasi-order. Then the following are equivalent:
(i) ⟨X, ≤⟩ is a well-quasi-order.
(ii) There is no infinite strictly decreasing sequence and no infinite anti-chain.
(iii) Whenever we have any sequence x_i ∈ X whatsoever, we can find i < j such
that x_i ≤ x_j.
In standard literature, (iii) is often taken as the definition of a well-quasi-order
instead.
Proof.
(i) ⇒ (ii): We just showed this.
(iii) ⇒ (i): Suppose X_1, X_2, X_3, · · · is a strictly decreasing chain in P(X).
Then by definition, we can pick x_i ∈ X_i such that x_i is not ≤ anything in
X_{i+1}. Now for any j > i, we know x_i ≰ x_j, because x_j is ≤ something in
X_{i+1}. So we have found a sequence {x_i} such that x_i ≰ x_j for all i < j,
contradicting (iii). (A strictly decreasing sequence in X itself would contradict
(iii) in the same way, so (iii) gives both halves of (i).)
(iii) ⇒ (ii) is clear.
(ii) ⇒ (iii): We show that whenever {x_i} is a sequence with no i < j such
that x_i ≤ x_j, then it contains either an infinite anti-chain or an infinite
descending chain.
To do so, we two-colour [ℕ]² by

    χ({i < j}) = 0 if x_i > x_j, and 1 otherwise.

By Ramsey’s theorem, this has an infinite monochromatic set. If we have
an infinite monochromatic set of colour 0, then this gives a descending chain.
Otherwise, if it is of colour 1, then together with our hypothesis, we know
{x_i} is an anti-chain.
If we want to reason about well-founded relations, then the following notion
is often a useful one to have in mind:
Definition (Well-founded part of a relation). Let ⟨X, R⟩ be a set with a relation.
Then the well-founded part of the relation is the ⊆-least subset A ⊆ X satisfying
the property
    if x is such that all R-predecessors of x are in A, then x ∈ A.
So for example, any minimal element of
X
would be in the well-founded part.
One can prove that the well-founded part is indeed well-founded without much
difficulty.
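For a finite relation, the well-founded part can be computed as a least fixed
point: start with nothing and repeatedly add every element all of whose
R-predecessors have already been added. A small Haskell sketch of this, with
our own names and the same pair-list representation of relations as before
(assuming the carrier list has no duplicates):

import Data.List (nub)

type Rel a = [(a, a)]   -- pairs (y, x) meaning "y R x"

-- The well-founded part: the least A such that whenever every
-- R-predecessor of x lies in A, x itself lies in A.  On a finite carrier we
-- simply iterate this closure rule until nothing new is added.
wellFoundedPart :: Eq a => Rel a -> [a] -> [a]
wellFoundedPart r carrier = go []
  where
    preds x = nub [y | (y, x') <- r, x' == x]
    step a  = [x | x <- carrier, all (`elem` a) (preds x)]
    go a    = let a' = step a
              in if length a' == length a then a else go a'

-- wellFoundedPart [(1,2),(2,3),(4,4)] [1,2,3,4] == [1,2,3]
-- (4 never enters, because of the loop 4 R 4)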
Is there a good notion of the WQO-part of a quasi-order? It turns out not,
or at least there isn’t an obvious one that works. For our later discussion, it is
convenient to have the following definition:
Definition (Bad sequence). Let ⟨X, ≤⟩ be a well-founded quasi-order. A sequence
{x_i} is bad if for all i < j, we have x_i ≰ x_j.
Then a quasi-order is a WQO iff it has no bad sequences.
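For finite sequences, badness is a simple check, which is the shape of condition
we will meet again in Friedman’s finite form below. A one-line Haskell sketch
(the name isBad is ours):

import Data.List (tails)

-- A finite sequence is bad if no earlier element is ≤ a later one.
isBad :: (a -> a -> Bool) -> [a] -> Bool
isBad leq xs = and [ not (leq x y) | (x : ys) <- tails xs, y <- ys ]

-- isBad (<=) [3, 2, 1] == True,  isBad (<=) [3, 1, 2] == False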
The idea is now to pick a bad sequence that is “minimal” in some sense; looking
at things below this minimal bad sequence will then give us a well-quasi-order.
This “WQO-part” isn’t some object canonically associated to the order ⟨X, ≤⟩,
because it depends on the choice of the minimal bad sequence, but it is very
useful in our upcoming proof.
Definition (Minimal bad sequence). Let ⟨X, ≤⟩ be a quasi-order. A minimal
bad sequence is a bad sequence {x_i} such that for each k ∈ ℕ, x_k is a ≤-minimal
element in

    {x : there exists a bad sequence starting with x_1, x_2, · · · , x_{k−1}, x}.
Lemma. Let ⟨X, ≤⟩ be a well-founded quasi-order that is not a WQO. Then it
has a minimal bad sequence.
Proof. Just pick each x_i according to the requirement in the definition. Such an
x_i can always be found because ⟨X, ≤⟩ is well-founded.
What makes minimal bad sequences useful is the following promised lemma:
Lemma (Minimal bad sequence lemma). Let ⟨X, ≤⟩ be a quasi-order and
B = {b_i} a minimal bad sequence. Let

    X′ = {x ∈ X : (∃n ∈ ℕ) x < b_n}.

Then ⟨X′, ≤⟩ is a WQO.
Proof. Suppose {s_i} is a bad sequence in X′. We prove by induction that nothing
in {s_i} is below any b_n, which gives a contradiction.
Suppose s_i < b_0 for some i. Then s_i, s_{i+1}, s_{i+2}, · · · is a bad sequence whose
first element is less than b_0. This contradicts the minimality of B.
For the induction step, suppose nothing in {s_i} is strictly less than any of
b_0, · · · , b_n. Suppose, for contradiction, that s_i < b_{n+1}. Now consider the
sequence

    b_0, · · · , b_n, s_i, s_{i+1}, s_{i+2}, · · · .

By minimality of B, we know this cannot be a bad sequence. So some element of
the sequence is ≤ some later element. They cannot both come from the {b_k},
nor both from the {s_k}, since both of those sequences are bad. So there is some
b_j ≤ s_k with j ≤ n and k ≥ i.
Consider this s_k. Since s_k ∈ X′, we have s_k < b_m for some m. Moreover, by
the induction hypothesis, we must have m > n. Then by transitivity, we have
b_j ≤ b_m, contradicting the fact that B is bad.
We need one more technical lemma.
Lemma (Perfect subsequence lemma). Let ⟨X, ≤⟩ be a WQO. Then every
sequence {x_n} in X has a perfect subsequence, i.e. an infinite subset A ⊆ ℕ such
that for all i < j in A, we have x_i ≤ x_j.
By the definition of a well-quasi-order, we can always find some i < j such that
x_i ≤ x_j. The lemma says we can find an infinite subset on which this always
works.
Proof. We apply Ramsey’s theorem. We two-colour [ℕ]² by

    c(i < j) = red if x_i ≤ x_j, and blue if x_i ≰ x_j.

Then Ramsey’s theorem gives us an infinite monochromatic set. But we cannot
have an infinite monochromatic blue set, as this would give a bad sequence. So
there is a monochromatic red set, which is a perfect subsequence.
We are now in a position to prove that lists over a WQO are WQOs.
Lemma (Higman’s lemma). Let ⟨X, ≤_X⟩ be a WQO. Then ⟨X^{<ω}, ≤_s⟩ is a
WQO.
Proof. We already know that X^{<ω} is well-founded. Suppose it is not a WQO.
Then there is a minimal bad sequence {x_i} of X-lists under ≤_s.
Now consider the sequence {head(x_i)} (no x_i can be nil, since nil ≤_s everything,
so the heads exist). By the perfect subsequence lemma, after removing some
elements of the sequence, we may wlog this is a perfect sequence.
Now consider the sequence {tail(x_i)}. Note that tail(x_i) < x_i for each i
(here it is crucial that lists are finite). Thus, using the notation in the minimal
bad sequence lemma, we know tail(x_i) ∈ X′ for all i.
But X′ is a well-quasi-order. So we can find some i < j such that tail(x_i) ≤_s
tail(x_j). But also head(x_i) ≤_X head(x_j) by perfectness. So x_i ≤_s x_j, and
this is a contradiction.
We can apply a similar juggle to prove Kruskal’s theorem. The twist is that
we will have to apply Higman’s lemma!
Theorem (Kruskal’s theorem). Let ⟨X, ≤_X⟩ be a WQO. Then ⟨Trees(X), ≤_s⟩
is a WQO.
Proof. We already know that Trees(X) is well-founded. Suppose it is not a
WQO. Then we can pick a minimal bad sequence {x_i} of trees.
As before, we may wlog {root(x_i)} is a perfect sequence. Consider

    Y = {T : T ∈ children(x_i) for some i}.

Then Y ⊆ X′. So Y is a WQO, and hence by Higman’s lemma, Y^{<ω} is a
WQO. So there exist some i < j such that children(x_i) ≤_s children(x_j) as lists.
But by perfectness, we have root(x_i) ≤_X root(x_j). So by definition of the
ordering, x_i ≤_s x_j, and this is a contradiction.
What good is this theorem? It is possible to use Kruskal’s theorem to prove
the termination of some algorithms, but this is not a computer science course, and
we will not go into that. Instead, we look at some proof-theoretic consequences
of Kruskal’s theorem.
We begin with the simplest possible WQO: the WQO with one element only
and the only possible order. Then a tree over this WQO is just a tree with no
labels on its nodes, i.e. a plain unlabelled tree.
Consider the statement

    P(k) = there exists some n ∈ ℕ such that there is no finite bad
    sequence T_1, · · · , T_n where T_i has k + i nodes.
This is a finitary version of the statement of Kruskal’s theorem. We claim that
P(k) is true for all k. Suppose not. Then for each n, we find a bad sequence
T^n_1, · · · , T^n_n, where T^n_i is a tree with k + i nodes. We can put these trees
in a grid:
    T^1_1
    T^2_1   T^2_2
    T^3_1   T^3_2   T^3_3
    T^4_1   T^4_2   T^4_3   T^4_4
    T^5_1   T^5_2   T^5_3   T^5_4   T^5_5
We look at the first column. Each tree in it has k + 1 nodes. But there are only
finitely many such trees. So there is some tree that appears infinitely often. Pick
one, call it T_1, and throw away all other rows. Similarly, we can then find some
tree that appears infinitely often in the second column, call it T_2, and throw the
other rows away. We keep going. Then we find a sequence

    T_1, T_2, T_3, · · ·

such that every initial segment is a bad sequence. So this is a bad sequence,
which is impossible.
Proposition (Friedman’s finite form). For all k ∈ ℕ, the statement P(k) is
true.
Why do we care about this? The statement P(k) is something we can write
out in Peano arithmetic, as it is finitary in nature. But our proof certainly
didn’t live in Peano arithmetic, as we had to talk about infinite sets. It turns
out for each k ∈ ℕ, we can indeed prove P(k) in Peano arithmetic. However,
it is impossible to prove ∀k P(k)! This in particular means we cannot “prove
Kruskal’s theorem” in Peano arithmetic (whatever that might mean).
We can also relate this to ordinal analysis. Given a WQO ⟨X, ≤⟩, we can
construct a well-founded structure from it. The obvious thing to try is to take
⟨P(X), ≤⁺⟩, but this is not what we want. The problem is that talking about
power sets involves talking about infinite sets, and this is not something we know
how to do in Peano arithmetic.
Instead, we let X̃ be the set of all finite bad sequences in X, and order them
by s ≤ t if s is an end-extension of t. Note that this is “upside down”. Then by
definition of a WQO, this is a well-founded downward-branching tree whose top
element is the empty sequence. Then by taking the rank of this structure, we
obtain an ordinal.
Using Kruskal’s theorem, we can produce some really huge WQOs, which in
turn give us some really huge ordinals. So in some sense, being able to prove
Kruskal’s theorem means we can prove the well-foundedness of some really big
ordinal in our theory.
We end with a natural generalization of WQOs. Suppose ⟨X, ≤⟩ is a WQO.
Then ⟨P(X), ≤⁺⟩ need not be a WQO. Suppose it is not, and let X_1, X_2, X_3, · · ·
be a bad sequence of subsets. Then for each i < j, we can find some x_{ij} ∈ X_i
such that x_{ij} is not ≤ anything in X_j.
Using dependent choice, we can find a function f : {⟨i, j⟩ : i < j ∈ ℕ} → X
such that for all i < j < k, we have f(i, j) ≰ f(j, k). This is a strengthening of
the notion of a bad sequence, where the sequence is now indexed by pairs (i < j).
Definition (ω²-good). A quasi-order ⟨X, ≤⟩ is ω²-good if for any

    f : {⟨i, j⟩ : i < j ∈ ℕ} → X,

there is some i < j < k such that f(i, j) ≤ f(j, k).
There is an analogue of the perfect subsequence lemma for these, but there
is no analogous notion of a minimal bad sequence (or at least no obvious one).