4Topological Dynamics in Ramsey Theory

III Ramsey Theory



4 Topological Dynamics in Ramsey Theory
Recall the finite sum theorem for any fixed
k
, whenever
N
is finitely coloured,
we can find
x
1
, ··· , x
k
such that
F S
(
x
1
, x
2
, ··· , x
k
) is monochromatic. One
natural generalization is to ask the following question can we find an infinite
set A such that
F S(A) =
(
X
xB
x : B A finite non-empty
)
is monochromatic? The first thought upon hearing this question is probably
either “this is so strong that there is no way it can be true”, or “we can deduce
it easily, perhaps via some compactness argument, from the finite sum theorem”.
Both of these thoughts are wrong. It is true that it is always possible to find
these x
1
, x
2
, ···, but the proof is hard. This is Hindman’s theorem.
The way we are going to approach this is via topological dynamics, which
is the title of this chapter. The idea is we construct a metric space (and in
particular a dynamical system) out of the Ramsey-theoretic problem we are
interested in, and then convert the Ramsey-theoretic problem into a question
about topological properties of the space we are constructed.
In the case of Hindman’s theorem, it turns out the “topological” form is a
general topological fact, which we can prove for a general dynamical system.
What is the advantage of this approach? When we construct the dynamical
system we are interested in, what we get is highly “non-geometric”. It would
be very difficult to think of it as an actual “space”. It is just something that
happens to satisfy the definition of a metric space. However, when we prove the
general topological facts, we will think of our metric space as a genuine space
like
R
n
, and this makes it much easier to visualize what is going on in the proofs.
A less obvious benefit of this approach is that we will apply Tychonoff’s
theorem many times, and also use the fact that the possibly infinite intersection
of nested compact sets is open. If we don’t formulate our problem in topological
terms, then we might find ourselves re-proving these repeatedly, which further
complicates the proof.
We now begin our journey.
Definition
(Dynamical system)
.
A dynamical system is a pair (
X, T
) where
X
is the compact metric space and T is a homeomorphism on X.
Example. (T = R/Z, T
α
(x) = x + α) for α R is a dynamical system.
In this course, this is not the dynamical system we are interested in. In fact,
we are only ever going to be interested in one dynamical system.
We fix number of colours
k N
. Instead of looking at a particular colouring,
we consider the space of all k-colourings. We let
C = {c : Z [k]}
=
[k]
Z
.
We endow this with the product topology. Since [
k
] has the discrete topology,
our basic open sets have the form
U = {c : Z [k] : c(i
1
) = c
1
, ··· , c(i
s
) = c
s
}
for some i
1
, ··· , i
s
Z and c
1
, ··· , c
s
[k].
Since [
k
] is trivially compact, by Tychonoff’s theorem, we know that
C
is
compact. But we don’t really need Tychonoff, since we know that
C
is metrizable,
with metric
ρ(c
1
, c
2
) =
1
n + 1
,
where
n
is the largest integer for which
c
1
and
c
2
agree on [
n, n
]. This metric
is easily seen to give rise to the same topology, and by hand, we can see this is
sequentially compact.
We now have to define the operation on C we are interested in.
Definition (Left shift). The left shift operator L : C C is defined by
(Lc)(i) = c(i + 1).
We observe that this map is invertible, with inverse given by the right shift. This
is one good reason to work over
Z
instead of
N
. Moreover, we see that
L
is nice
and uniformly continuous, since if
ρ(c
1
, c
2
)
1
n + 1
,
then
ρ(Lc
1
, Lc
2
)
1
n
.
Similarly,
L
1
is uniformly continuous. So it follows that (
C, L
) is a dynamical
system.
Instead of proving Hindman’s theorem directly, we begin by re-proving van
der Waerden’s theorem using this approach, to understand better how it works.
Theorem
(Topological van der Waerden)
.
Let (
X, T
) be a dynamical system.
Then there exists an
ε >
0 such that whenever
r N
, then we can find
x X
and n N such that ρ(x, T
in
x) < ε for all i = 1, ··· , r.
In words, this says if we flow around
X
under
T
, then eventually some point
must go back near to itself, and must do that as many times as we wish to, in
regular intervals.
To justify calling this the topological van der Waerden theorem, we should
be able to easily deduce van der Waerden’s theorem from this. It is not difficult
to imagine this is vaguely related to van der Waerden in some sense, but it is
not obvious how we can actually do the deduction. After all, topological van
der Waerden theorem talks about the whole space of all colourings, and we do
not get to control what
x
we get back from it. What we want is to start with a
single colouring c and try to say something about this particular c.
Now of course, we cannot restrict to the sub-system consisting only of
c
itself,
because, apart from it being really silly, the function
L
does not restrict to a
map
{c} {c}
. We might think of considering the set of all translates of
c
.
While this is indeed closed under
L
, this is not a dynamical system, since it is
not compact! The solution is to take the closure of that.
Definition
(Orbital closure)
.
Let (
X, T
) be a dynamical system, and
x X
.
The orbital closure ¯x of x is the set cl{T
s
x : s Z}.
Observe that ¯x is a closed subset of a compact space, hence compact.
Proposition. (
¯
X, T ) is a dynamical system.
Proof.
It suffices to show that
¯x
is closed under
T
. If
y ¯x
, then we have
some
s
i
such that
T
s
i
x y
. Since
T
is continuous, we know
T
s
i
+1
x T y
. So
T y ¯x. Similarly, T
1
¯x = ¯x.
Once we notice that we can produce such a dynamical system from our
favorite point, it is straightforward to prove van der Waerden.
Corollary
(van der Waerden theorem)
.
Let
r, k N
. Then whenever
Z
is
k-coloured, then there is a monochromatic arithmetic progression of length r.
Proof.
Let
c
:
Z
[
k
]. Consider (
¯c, L
). By topological van der Waerden, we can
find
x ¯c
and
n N
such that
ρ
(
x, L
in
x
)
<
1 for all
i
= 1
, ··· , r
. In particular,
we know that x and L
in
x agree at 0. So we know that
x(0) = L
in
x(0) = x(in)
for all i = 0, ··· , r.
We are not quite done, because we just know that
x ¯c
, and not that
x = L
k
x for some k.
But this is not bad. Since x ¯c, we can find some s Z such that
ρ(T
s
c, x)
1
rn + 1
.
This means x and T
s
c agree on the first rn + 1 elements. So we know
c(s) = c(s + n) = ··· = c(s + rn).
Since we will be looking at these
¯c
a lot, it is convenient to figure out what
this
¯c
actually looks like. It turns out there is a reasonably good characterization
of these orbital closures.
Let’s look at some specific examples. Consider c given by
Then the orbital closure has just two points, c and Lc.
What if we instead had the following?
Then ¯c contains all translates of these, but also
,
We define the following:
Definition (Seq). Let c : Z [k]. We define
Seq(c) = {(c(i), ··· , c(i + r)) : i Z, r N}.
It turns out these sequences tell us everything we want to know about orbital
closures.
Proposition. We have c
0
¯c iff Seq(c
0
) Seq(c).
The proof is just following the definition and checking everything works.
Proof.
We first prove
. Suppose
c
0
¯c
. Let (
c
0
(
i
)
, ··· , c
0
(
i
+
r
1))
Seq
(
c
).
Then we have s Z such that
ρ(c
0
, L
s
c) <
1
1 + max(|i|, |i + s 1|)
,
which implies
(c(s + i), ··· , c(s + i + r 1)) = (c
0
(i), ··· , c
0
(i + s 1)).
So we are done.
For , if Seq(c
0
) Seq(c), then for all n N, there exists s
n
Z such that
(c
0
(n), ··· , c
0
(n)) = (c(s
n
n), ··· , c(s
n
+ n)).
Then we have
L
s
i
c c
0
.
So we have c
0
¯c.
We now return to talking about topological dynamics. We saw that in our
last example, the orbital closure of
c
has three “kinds” of things translates of
c
, and the all-red and all-blue ones. The all-red and all-blue ones are different,
because they seem to have “lost” the information of
c
. In fact, the orbital closure
of each of them is just that colouring itself. This is a strict subset of
¯c
. These
are phenomena we don’t want.
Definition
(Minimal dynamical system)
.
We say (
X, T
) is minimal if
¯x
=
X
for all
x X
. We say
x X
is minimal if (
¯x, T
) is a minimal dynamical system.
Proposition. Every dynamical system (X, T ) has a minimal point.
Thus, most of the time, when we want to prove something about dynamical
systems, we can just pass to a minimal subsystem, and work with it.
Proof.
Let
U
=
{¯x
:
x X}
. Thus is a family of closed sets, ordered by inclusion.
We want to apply Zorn’s lemma to obtain a minimal element. Consider a chain
S in U. If
¯x
1
¯x
2
··· ¯x
n
,
then their intersection is
¯x
n
, which is in particular non-empty. So any finite
collection in S has non-empty intersection. Since X is compact, we know
\
¯xS
¯x 6= .
We pick
z
\
¯xS
¯x 6= .
Then we know that
¯z ¯x
for all ¯x S. So we know
¯z
\
¯xS
¯x 6= .
So by Zorn’s lemma, we can find a minimal element (in both senses).
Example. Consider c given by
Then we saw that ¯c contains all translates of these, and also
,
Then this system is not minimal, since the orbital closure of the all-red one is a
strict subsystem of ¯c.
We are now going to derive some nice properties of minimal systems.
Lemma.
If (
X, T
) is a minimal system, then for all
ε >
0, there is some
m N
such that for all x, y X, we have
min
|s|≤m
ρ(T
s
x, y) < ε.
Proof. Suppose not. Then there exists ε > 0 and points x
i
, y
i
X such that
min
|s|≤i
ρ(T
s
x
i
, y
i
) ε.
By compactness, we may pass to subsequences, and assume
x
i
x
and
y
i
y
.
By continuity, it follows that
ρ(T
s
x, y) ε
for all s Z. This is a contradiction, since ¯x = X by minimality.
We now want to prove topological van der Waerden.
Theorem
(Topological van der Waerden)
.
Let (
X, T
) be a dynamical system.
Then there exists an
ε >
0 such that whenever
r N
, then we can find
x X
and n N such that ρ(x, T
in
x) < ε for all i = 1, ··· , r.
Proof.
Without loss of generality, we may assume (
X, T
) is a minimal system.
We induct on
r
. If
r
= 1, we can just choose
y X
, and consider
T y, T
2
y, ···
.
Then note that by compactness, we have s
1
, s
2
N such that
ρ(T
s
1
y, T
s
2
y) < ε,
and then take x = T
s
1
y and n = s
2
s
1
.
Now suppose the result is true for
r >
1, and that we have the result for all
ε > 0 and r 1.
Claim.
For all
ε >
0, there is some point
y X
such that there is an
x X
and n N such that
ρ(t
in
x, y) < ε
for all 1 i r.
Note that this is a different statement from the theorem, because we are not
starting at
x
. In fact, this is a triviality. Indeed, let (
x
0
, n
) be as guaranteed by
the hypothesis. That is,
ρ
(
x
0
, T
in
x
0
)
< ε
for all 1
i r
. Then pick
y
=
x
0
and x = T
n
x
0
.
The next goal is to show that there is nothing special about this y.
Claim.
For all
ε >
0 and for all
z X
, there exists
x
z
X
and
n N
for
which ρ(T
in
x
z
, z) < ε.
The idea is just to shift the picture so that
y
gets close to
z
, and see where
we send x to. We will use continuity to make sure we don’t get too far away.
We choose
m
as in the previous lemma for
ε
2
. Since
T
m
, T
m+1
, ··· , T
m
are all uniformly continuous, we can choose
ε
0
such that
ρ
(
a, b
)
< ε
0
implies
ρ(T
s
a, T
s
b) <
ε
2
for all |s| m.
Given
z X
, we obtain
x
and
y
by applying our first claim to
ε
0
. Then we
can find
s Z
with
|s| m
such that
ρ
(
T
s
y, z
)
<
ε
2
. Consider
x
z
=
T
s
x
. Then
ρ(T
in
x
z
, z) ρ(T
in
x
z
, T
s
y) + ρ(T
s
y, z)
ρ(T
s
(T
in
x), T
s
y) +
ε
2
ε
2
+
ε
2
= ε
Claim.
For all
ε >
0 and
z X
, there exists
x X
,
n N
and
ε
0
>
0 such
that T
in
(B(x, ε
0
)) B(z, ε) for all 1 i r.
We choose ε
0
by continuity, using the previous claim.
The idea is that we repeatedly apply the final claim, and then as we keep
moving back, eventually two points will be next to each other.
We pick
z
0
X
and set
ε
0
=
ε
2
. By the final claim, there exists
z
1
X
and
some
n
1
N
such that
T
in
1
(
B
(
z
1
, ε
1
))
B
(
z
0
, ε
0
) for some 0
< ε
1
ε
0
and all
1 i r.
Inductively, we find z
s
X, n
s
N and some 0 < ε
s
ε
s1
such that
T
in
s
(B(z
s
, ε
s
)) B(z
s1
, ε
s1
)
for all 1 i r.
By compactness, (
z
s
) has a convergent subsequence, and in particular, there
exists i < j N such that ρ(z
i
, z
j
) <
ε
2
.
Now take x = z
j
, and
n = n
j
+ n
j1
+ ··· + n
i+1
Then
T
`n
(B(x, ε
j
)) B(z
i
, ε
i
).
for all 1 ` r. But we know
ρ(z
i
, z
j
)
ε
2
,
and
ε
i
ε
0
ε
2
.
So we have
ρ(T
`n
x, x) ε
for all 1 ` r.
In the remaining of the chapter, we will actually prove Hindman’s theorem.
We will again first prove a “topological” version, but unlike the topological van
der Waerden theorem, it will look nothing like the actually Hindman’s theorem.
So we still need to do some work to deduce the actual version from the topological
version.
To state topological Hindman, we need the following definition:
Definition (Proximal points). We say x, y X are proximal if
inf
sZ
ρ(T
s
x, T
s
y) = 0.
Example.
Two colourings
c
1
, c
2
C
are proximal iff there are arbitrarily large
intervals where they agree.
The topological form of Hindman’s theorem says the following:
Theorem
(Topological Hindman’s theorem)
.
Let (
X, T
) be a dynamical system,
and suppose
X
=
¯x
for some
x X
. Then for any minimal subsystem
Y X
,
then there is some y Y such that x and y are proximal.
We will first show how this implies Hindman’s theorem, and then prove this.
To do the translation, we need to, of course, translate concepts in dynamical
systems to concepts about colourings. We previously showed that for colourings
c, c
0
, we have
c
0
¯c
iff
Seq
(
c
0
)
Seq
(
c
). The next thing to figure out is when
c
:
Z
[
k
] is minimal. By definition, this means for any
x ¯c
, we have
¯x
=
¯c
.
Equivalently, we have Seq(x) = Seq(c).
We introduce some notation.
Notation.
For an interval
I Z
, we say
I
=
{a, a
+ 1
, ··· , b}
, we write
c
(
I
) for
the sequence (c(a), c(a + 1), ··· , c(b)).
Clearly, we have
Seq(c) = {c(I) : I Z is an interval}.
We say for intervals
I
1
, I
2
Z
, we have
c
(
I
1
)
4 c
(
I
2
), if
c
(
I
1
) is a subsequence
of c(I
2
).
Definition
(Bounded gaps property)
.
We say a colouring
c
:
Z
[
k
] has the
bounded gaps property if for each interval
I Z
, there exists some
M >
0 such
that c(I) 4 c(U) for every interval U Z of length at least M.
For example, every periodic colouring has the bounded gaps property.
It turns out this is precisely what we need to characterize minimal colourings.
Proposition.
A colouring
c
:
Z
[
k
] is minimal iff
c
has the bounded gaps
property.
Proof.
Suppose
c
has the bounded gaps property. We want to show that for all
x ¯c
, we have
Seq
(
x
) =
Seq
(
c
). We certainly got one containment
Seq
(
x
)
Seq
(
c
). Consider any interval
I Z
. We want to show that
c
(
I
)
Seq
(
x
). By
the bounded gaps property, there is some
M
such that any interval
U Z
of
length > M satisfies c(I) 4 c(U). But since x ¯c, we know that
x([0, ··· , M]) = c([t, t + 1, ··· , t + M ])
for some t Z. So c(I) 4 x([0, ··· , M ]), and this implies Seq(c) Seq(x).
For the other direction, suppose
c
does not have the bounded gaps property.
So there is some bad interval
I Z
. Then for any
n N
, there is some
t
n
such
that
c(I) 64 c([t
n
n, t
n
n + 1, ··· , t
n
+ n]).
Now consider
x
n
=
L
t
n
(
c
). By passing to a subsequence if necessary, we may
assume that
x
n
x
. Clearly,
c
(
I
)
6∈ Seq
(
x
). So we have found something in
Seq(c) \ Seq(x). So c 6∈ ¯x.
It turns out once we have equated minimality with the bounded gaps property,
it is not hard to deduce Hindman’s theorem.
Theorem
(Hindman’s theorem)
.
If
c
:
N
[
k
] is a
k
-colouring, then there
exists an infinite A N such that F S(A) is monochromatic.
Proof.
We extend the colouring
c
to a colouring of
Z
by defining
x
:
Z
[
k
+ 1]
with
x(i) = x(i) = c(i)
if
x 6
= 0, and
x
(0) =
k
+ 1. Then it suffices to find an infinite an infinite
A Z
such that F S(A) is monochromatic with respect to x.
We apply topological Hindman to (
¯x, L
) to find a minimal colouring
y
such
that x and y are proximal. Then either
inf
nN
ρ(L
n
x, L
n
y) = 0 or inf
nN
ρ(L
n
x, L
n
y) = 0.
We wlog it is the first case, i.e.
x
and
y
are positively proximal. We now build
up this infinite set A we want one by one.
Let the colour of
y
(0) be red. We shall in fact pick 0
< a
1
< a
2
< ···
inductively so that x(t) = y(t) = red for all t F S(a
1
, a
2
, ···).
By the bounded gaps property of
y
, there exists some
M
1
such that for any
I Z of length M
1
, then it contains y([0]), i.e. y([0]) 4 y(I).
Since
x
and
y
are positively proximal, there exists
I Z
such that
|I| M
and
min
(
I
)
>
0 such that
x
(
I
) =
y
(
I
). Then we pick
a
1
I
such that
x(a
1
) = y(a
1
) = y(0).
Now we just have to keep doing this. Suppose we have found
a
1
< ··· < a
n
as required. Consider the interval
J
= [0
, ··· , a
1
+
a
2
+
···
+
a
n
]. Again by
the bounded gaps property, there exists some
M
n+1
such that if
I Z
has
|I| M
n+1
, then y(J) 4 y(I).
We choose an interval I Z such that
(i) x(I) = y(I)
(ii) |I| M
n+1
(iii) min I >
P
n
i=1
a
i
.
Then we know that
y([0, ··· , a
1
+ ··· + a
n
]) 4 y(I).
Let
a
n+1
denote by the position at which
y
([0
, ··· , a
1
+
···
+
a
n
]) occurs in
y
(
I
).
It is then immediate that
x
(
z
) =
y
(
z
) =
red
for all
z F S
(
a
1
, ··· , a
n+1
), as any
element in F S(a
1
, ··· , a
n+1
) is either
t
0
for some t
0
F S(a
1
, . . . , a
n
);
a
n+1
; or
a
n+1
+ t
0
for some t
0
F S(a
1
, ··· , a
n
),
and these are all red by construction.
Then A = {a
1
, a
2
, ···} is the required set.
The heart of the proof is that we used topological Hindman to obtain a
minimal proximal point, and this is why we can iteratively get the a
n
.
It remains to prove topological Hindman.
Theorem
(Topological Hindman’s theorem)
.
If (
X, T
) is a dynamical system,
X
=
¯x
and
Y X
is minimal, then there exists
y Y
such that
x
and
y
are
proximal.
This is a long proof, and we will leave out proofs of basic topological facts.
Proof.
If
X
is a compact metric space, then
X
X
=
{f
:
X X}
is compact
under the product topology by Tychonoff. The basic open sets are of the form
{f : X Y | f(x
i
) U
i
for i = 1, ··· , k}
for some fixed x
i
X and U
i
X open.
Now any function g : X X can act on X
X
in two obvious ways
Post-composition L
g
: X
X
X
X
be given by L
g
(f) = g f.
Pre-composition R
g
: X
X
X
X
be given by R
g
(f) = f g.
The useful thing to observe is that
R
g
is always continuous, while
L
g
is continuous
if g is continuous.
Now if (X, T ) is a dynamical system, we let
E
T
= cl{T
s
: s Z} X
X
.
The idea is that we look at what is going on in this space instead. Let’s note a
few things about this subspace E
T
.
E
T
is compact, as it is a closed subset of a compact set.
f E
T
iff for all
ε >
0 and points
x
1
, ··· , x
k
X
, there exists
s Z
such
that ρ(f(x
i
), T
s
(x
i
)) < ε.
E
T
is closed under composition.
So in fact,
E
T
is a compact semi-group inside
X
X
. It turns out every proof of
Hindman’s theorem involves working with a semi-group-like object, and then
trying to find an idempotent element.
Theorem
(Idempotent theorem)
.
If
E X
X
is a compact semi-group, then
there exists g E such that g
2
= g.
This is a strange thing to try to prove, but it turns out once we came up
with this statement, it is not hard to prove. The hard part is figuring out this is
the thing we want to prove.
Proof.
Let
F
denote the collection of all compact semi-groups of
E
. Let
A F
be minimal with respect to inclusion. To see
A
exists, it suffices (by Zorn) to
check that any chain
S
in
F
has a lower bound in
F
, but this is immediate,
since the intersection of nested compact sets is noon-empty.
Claim. Ag = A for all g A.
We first observe that
Ag
is compact since
R
g
is continuous. Also,
Ag
is a
semigroup, since if f
1
g, f
2
g Ag, then
f
1
gf
2
g = (f
1
gf
2
)g A
g
.
Finally, since g A, we have Ag A. So by minimality, we have Ag = A.
Now let
B
g
= {f A : fg = g}.
Claim. B
g
= A for all g A.
Note that
B
g
is non-empty, because
Ag
=
A
. Moreover,
B
is closed, as
B
is
the inverse of
{g}
under
R
g
. So
B
is compact. Finally, it is clear that
B
is a
semi-group. So by minimality, we have B
g
= A.
So pick any g A, and then g
2
= g.
Proof of topological Hindman (continued).
We are actually going to work in a
slightly smaller compact semigroup than E
T
. We let F E
T
be defined by
F = {f E
T
: f(x) Y }.
Claim. F X
X
is a compact semigroup.
Before we prove the claim, we see why it makes sense to consider this
F
.
Suppose the claim is true. Then by applying the idempotent theorem, to get
g F such that g
2
= g. We now check that x, g(x) are proximal.
Since g F E
T
, we know for all ε, there exists s Z such that
ρ(g(x), T
s
(x)) <
ε
2
, ρ(g(g(x)), T
s
(g(x))) <
ε
2
.
But we are done, since
g
(
x
) =
g
(
g
(
x
)), and then we conclude from the triangle
inequality that
ρ(T
s
(x), T
s
(g(x))) < ε.
So x and g(x) Y are proximal.
It now remains to prove the final claim. We first note that
F
is non-empty.
Indeed, pick any
y Y
. Since
X
=
¯x
, there exists some sequence
T
n
i
(
x
)
y
.
Now by compactness of
E
T
, we can find some
f E
T
such that (
T
n
i
)
i0
cluster
at
f
, i.e. every open neighbourhood of
f
contains infinitely many
T
n
i
. Then
since X is Hausdorff, it follows that f(x) = y. So f F.
To show
F
is compact, it suffices to show that it is closed. But since
Y
=
¯y
for all
y Y
by minimality, we know
Y
is in particular closed, and so
F
is closed
in the product topology.
Finally, we have to show that
F
is closed under composition, so that it is a
semi-group. To show this, it suffices to show that any map
f E
T
sends
Y
to
Y
. Indeed, by minimality of
Y
, this is true for
T
s
for
s Z
, and then we are
done since Y is closed.
The actual hard part of the proof is deciding we should prove the idempotent
theorem. It is not an obvious thing to try!