III Ramsey Theory - Topological Dynamics in Ramsey Theory

4 Topological Dynamics in Ramsey Theory

Recall the finite sum theorem: for any fixed $n$, whenever $\mathbb{N}$ is finitely coloured, we can find $x_1, \cdots, x_n$ such that $FS(x_1, \cdots, x_n)$ is monochromatic. One natural generalization is to ask the following question: can we find an infinite set $A$ such that

\[ FS(A) = \left\{ \sum_{x \in B} x : B \subseteq A \text{ finite non-empty} \right\} \]

is monochromatic? The first thought upon hearing this question is probably either “this is so strong that there is no way it can be true”, or “we can deduce it easily, perhaps via some compactness argument, from the finite sum theorem”.

Both of these thoughts are wrong. It is true that it is always possible to find these $x_1, x_2, \cdots$, but the proof is hard. This is Hindman’s theorem.
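As a small illustration of the finite-sum set, the following sketch computes $FS$ of a finite list and tests whether a colouring is constant on it. The helper names `finite_sums` and `is_monochromatic`, and the parity colouring, are illustrative assumptions and not part of the notes.

```python
from itertools import combinations

def finite_sums(xs):
    """Return FS(xs): the sums of all finite non-empty sub-collections of xs."""
    xs = list(xs)
    sums = set()
    for size in range(1, len(xs) + 1):
        for subset in combinations(xs, size):
            sums.add(sum(subset))
    return sums

def is_monochromatic(colouring, points):
    """True if the colouring takes a single value on all the given points."""
    return len({colouring(p) for p in points}) == 1

# Hypothetical 2-colouring of the positive integers: colour by parity.
parity = lambda n: n % 2

print(sorted(finite_sums([2, 4, 8])))                     # [2, 4, 6, 8, 10, 12, 14]
print(is_monochromatic(parity, finite_sums([2, 4, 8])))   # True
print(is_monochromatic(parity, finite_sums([1, 2])))      # False
```

For the parity colouring an infinite monochromatic $FS(A)$ is easy to write down (take $A$ to be the even numbers); the content of Hindman's theorem is that such an $A$ exists for every finite colouring.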

The way we are going to approach this is via topological dynamics, which

is the title of this chapter. The idea is we construct a metric space (and in

particular a dynamical system) out of the Ramsey-theoretic problem we are

interested in, and then convert the Ramsey-theoretic problem into a question

about topological properties of the space we have constructed.

In the case of Hindman’s theorem, it turns out the “topological” form is a

general topological fact, which we can prove for a general dynamical system.

What is the advantage of this approach? When we construct the dynamical

system we are interested in, what we get is highly “non-geometric”. It would

be very difficult to think of it as an actual “space”. It is just something that

happens to satisfy the definition of a metric space. However, when we prove the

general topological facts, we will think of our metric space as a genuine space, and this makes it much easier to visualize what is going on in the proofs.

A less obvious benefit of this approach is that we will apply Tychonoff’s

theorem many times, and also use the fact that the possibly infinite intersection of nested non-empty compact sets is non-empty. If we don’t formulate our problem in topological

terms, then we might find ourselves re-proving these repeatedly, which further

complicates the proof.

We now begin our journey.

Definition (Dynamical system). A dynamical system is a pair $(X, T)$ where $X$ is a compact metric space and $T$ is a homeomorphism on $X$.

Example. $(\mathbb{T} = \mathbb{R}/\mathbb{Z}, T_\alpha(x) = x + \alpha)$ for $\alpha \in \mathbb{R}$ is a dynamical system.

In this course, this is not the dynamical system we are interested in. In fact,

we are only ever going to be interested in one dynamical system.

We fix a number of colours $k \in \mathbb{N}$. Instead of looking at a particular colouring, we consider the space of all $k$-colourings. We let

\[ C = \{c : \mathbb{Z} \to [k]\} \cong [k]^{\mathbb{Z}}. \]

We endow this with the product topology. Since $[k]$ has the discrete topology, our basic open sets have the form

\[ U = \{c : \mathbb{Z} \to [k] : c(i_1) = c_1, \cdots, c(i_s) = c_s\} \]

for some $i_1, \cdots, i_s \in \mathbb{Z}$ and $c_1, \cdots, c_s \in [k]$.

Since $[k]$ is trivially compact, by Tychonoff’s theorem, we know that $C$ is compact. But we don’t really need Tychonoff, since we know that $C$ is metrizable, with metric

\[ \rho(c_1, c_2) = \frac{1}{n + 1}, \]

where $n$ is the largest integer for which $c_1$ and $c_2$ agree on $[-n, n]$. This metric is easily seen to give rise to the same topology, and by hand, we can see this is sequentially compact.

We now have to define the operation on C we are interested in.

Definition (Left shift). The left shift operator L : C → C is defined by

(Lc)(i) = c(i + 1).

We observe that this map is invertible, with inverse given by the right shift. This is one good reason to work over $\mathbb{Z}$ instead of $\mathbb{N}$. Moreover, we see that $L$ is nice and uniformly continuous, since if

\[ \rho(c_1, c_2) \leq \frac{1}{n + 1}, \]

then

\[ \rho(Lc_1, Lc_2) \leq \frac{1}{n}. \]

Similarly, $L^{-1}$ is uniformly continuous. So it follows that $(C, L)$ is a dynamical system.
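To make the metric and the shift concrete, here is a minimal sketch that models a colouring as a Python function on the integers. Two conventions are assumptions made for the sketch and are not taken from the notes: colourings that differ at $0$ get distance $1$, and colourings that agree on $[-\text{cutoff}, \text{cutoff}]$ are treated as equal. The names `rho`, `left_shift` and `right_shift` are likewise illustrative.

```python
def rho(c1, c2, cutoff=1000):
    """Distance 1/(n+1), where n is the largest integer with c1 = c2 on [-n, n].

    Assumptions of this sketch: distance 1 if c1(0) != c2(0), and distance 0
    if the colourings agree on the whole window [-cutoff, cutoff].
    """
    if c1(0) != c2(0):
        return 1.0
    n = 0
    while n < cutoff and c1(n + 1) == c2(n + 1) and c1(-(n + 1)) == c2(-(n + 1)):
        n += 1
    if n == cutoff:
        return 0.0
    return 1.0 / (n + 1)

def left_shift(c):
    """(Lc)(i) = c(i + 1)."""
    return lambda i: c(i + 1)

def right_shift(c):
    """Inverse of the left shift: (L^{-1} c)(i) = c(i - 1)."""
    return lambda i: c(i - 1)

# Two hypothetical colourings that agree exactly on [-4, 4].
c1 = lambda i: 0
c2 = lambda i: 1 if abs(i) >= 5 else 0

print(rho(c1, c2))                           # 0.2, i.e. 1/(4 + 1)
print(rho(left_shift(c1), left_shift(c2)))   # 0.25, i.e. 1/4
```

The two printed values match the estimate above: agreement on $[-4, 4]$ before shifting leaves agreement on $[-3, 3]$ after shifting.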

Instead of proving Hindman’s theorem directly, we begin by re-proving van

der Waerden’s theorem using this approach, to understand better how it works.

Theorem (Topological van der Waerden). Let $(X, T)$ be a dynamical system. Then for every $\varepsilon > 0$ and every $r \in \mathbb{N}$, we can find $x \in X$ and $n \in \mathbb{N}$ such that $\rho(x, T^{in} x) < \varepsilon$ for all $i = 1, \cdots, r$.

In words, this says if we flow around $X$ under $T$, then eventually some point must go back near to itself, and must do that as many times as we wish to, in regular intervals.
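For the circle rotation $x \mapsto x + \alpha$ on $\mathbb{R}/\mathbb{Z}$ the statement can be checked numerically, since the distance from $x$ to $T^{in} x$ is the distance from $in\alpha$ to the nearest integer, whatever $x$ is. The sketch below searches for a suitable $n$ by brute force; the function names, the search bound `max_n` and the use of floating point are assumptions of the sketch, not part of the notes.

```python
import math

def circle_distance(a, b):
    """Distance between a and b on the circle R/Z."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

def simultaneous_return(alpha, r, eps, max_n=10**6):
    """Smallest n <= max_n with dist(i*n*alpha, 0) < eps for i = 1..r, else None."""
    for n in range(1, max_n + 1):
        if all(circle_distance(i * n * alpha, 0.0) < eps for i in range(1, r + 1)):
            return n
    return None

# Rotation by sqrt(2): look for n with n, 2n and 3n all returning within 0.05.
n = simultaneous_return(math.sqrt(2), r=3, eps=0.05)
print(n)
```

This only illustrates the statement for one very geometric system; the proof in these notes has to work for systems, such as spaces of colourings, where no such direct computation is available.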

To justify calling this the topological van der Waerden theorem, we should

be able to easily deduce van der Waerden’s theorem from this. It is not difficult

to imagine this is vaguely related to van der Waerden in some sense, but it is

not obvious how we can actually do the deduction. After all, topological van

der Waerden theorem talks about the whole space of all colourings, and we do

not get to control what $x$ we get back from it. What we want is to start with a single colouring $c$ and try to say something about this particular $c$.

Now of course, we cannot restrict to the sub-system consisting only of $c$ itself, because, apart from it being really silly, the function $L$ does not restrict to a map $\{c\} \to \{c\}$. We might think of considering the set of all translates of $c$. While this is indeed closed under $L$, this is not a dynamical system, since it is not compact! The solution is to take the closure of that.

Definition (Orbital closure). Let $(X, T)$ be a dynamical system, and $x \in X$. The orbital closure $\bar{x}$ of $x$ is the set $\operatorname{cl}\{T^s x : s \in \mathbb{Z}\}$.

Observe that $\bar{x}$ is a closed subset of a compact space, hence compact.

Proposition. $(\bar{x}, T)$ is a dynamical system.

Proof. It suffices to show that $\bar{x}$ is closed under $T$. If $y \in \bar{x}$, then we have some $s_i$ such that $T^{s_i} x \to y$. Since $T$ is continuous, we know $T^{s_i + 1} x \to Ty$. So $Ty \in \bar{x}$. Similarly, $T^{-1} \bar{x} = \bar{x}$.

Once we notice that we can produce such a dynamical system from our

favorite point, it is straightforward to prove van der Waerden.

Corollary (van der Waerden theorem). Let $r, k \in \mathbb{N}$. Then whenever $\mathbb{Z}$ is $k$-coloured, there is a monochromatic arithmetic progression of length $r$.

Proof. Let $c : \mathbb{Z} \to [k]$. Consider $(\bar{c}, L)$. By topological van der Waerden, we can find $x \in \bar{c}$ and $n \in \mathbb{N}$ such that $\rho(x, L^{in} x) < 1$ for all $i = 1, \cdots, r$. In particular, we know that $x$ and $L^{in} x$ agree at $0$. So we know that

\[ x(0) = L^{in} x(0) = x(in) \]

for all $i = 0, \cdots, r$.

We are not quite done, because we just know that $x \in \bar{c}$, and not that $x = L^s c$ for some $s \in \mathbb{Z}$.

But this is not bad. Since $x \in \bar{c}$, we can find some $s \in \mathbb{Z}$ such that

\[ \rho(L^s c, x) \leq \frac{1}{rn + 1}. \]

This means $x$ and $L^s c$ agree on $[-rn, rn]$, and in particular at $0, n, \cdots, rn$. So we know

\[ c(s) = c(s + n) = \cdots = c(s + rn). \]
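The finite content of the corollary can be explored by brute force. The sketch below searches a finite list of colours for a monochromatic arithmetic progression with $r$ terms; the function name `monochromatic_ap` and the list encoding of a colouring are assumptions made for illustration. It is a direct search, not the dynamical argument used in the proof.

```python
from itertools import product

def monochromatic_ap(colours, r):
    """Return (s, n) with colours[s] == colours[s+n] == ... == colours[s+(r-1)*n],
    for some common difference n >= 1, or None if no such progression exists."""
    N = len(colours)
    for n in range(1, N):
        for s in range(0, N - (r - 1) * n):
            if all(colours[s + i * n] == colours[s] for i in range(r)):
                return s, n
    return None

# A 2-colouring of 8 consecutive integers with no monochromatic 3-term progression.
print(monochromatic_ap([0, 0, 1, 1, 0, 0, 1, 1], 3))   # None

# Every 2-colouring of 9 consecutive integers contains one.
print(all(monochromatic_ap(list(col), 3) is not None
          for col in product(range(2), repeat=9)))      # True
```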

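Since we will be looking at these $\bar{c}$ a lot, it is convenient to figure out what this $\bar{c}$ actually looks like. It turns out there is a reasonably good characterization of these orbital closures.

Let’s look at some specific examples. Consider $c$ given by

[Figure not recovered: a colouring $c$ of $\mathbb{Z}$.]

Then the orbital closure has just two points, $c$ and $Lc$.

What if we instead had the following?

[Figure not recovered: a second colouring $c$ of $\mathbb{Z}$.]

Then $\bar{c}$ contains all translates of these, but also

[Figure not recovered: further colourings in $\bar{c}$; the later discussion identifies them as the all-red and all-blue colourings.]

We define the following: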

Definition (Seq). Let c : Z → [k]. We define

Seq(c) = {(c(i), ··· , c(i + r)) : i ∈ Z, r ∈ N}.

It turns out these sequences tell us everything we want to know about orbital

closures.

Proposition. We have $c' \in \bar{c}$ iff $\mathrm{Seq}(c') \subseteq \mathrm{Seq}(c)$.

The proof is just following the definition and checking everything works.

Proof. We first prove $\Rightarrow$. Suppose $c' \in \bar{c}$. Let $(c'(i), \cdots, c'(i + r - 1)) \in \mathrm{Seq}(c')$. Then we have $s \in \mathbb{Z}$ such that

\[ \rho(c', L^s c) < \frac{1}{1 + \max(|i|, |i + r - 1|)}, \]

which implies

\[ (c(s + i), \cdots, c(s + i + r - 1)) = (c'(i), \cdots, c'(i + r - 1)). \]

So we are done.

For $\Leftarrow$, if $\mathrm{Seq}(c') \subseteq \mathrm{Seq}(c)$, then for all $n \in \mathbb{N}$, there exists $s_n \in \mathbb{Z}$ such that

\[ (c'(-n), \cdots, c'(n)) = (c(s_n - n), \cdots, c(s_n + n)). \]

Then we have $L^{s_n} c \to c'$. So we have $c' \in \bar{c}$.
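For periodic colourings the criterion $\mathrm{Seq}(c') \subseteq \mathrm{Seq}(c)$ can be tested on a computer, because every word of a given length already occurs starting inside one period. The sketch below assumes a periodic colouring is encoded by the string it repeats, and only compares words up to a chosen length `max_length`, so it is an approximation of the full condition; the names `words` and `seq_contained` are hypothetical.

```python
def words(pattern, length):
    """All words of the given length occurring in the periodic colouring
    that repeats `pattern` in both directions."""
    p = len(pattern)
    return {tuple(pattern[(i + j) % p] for j in range(length)) for i in range(p)}

def seq_contained(pattern1, pattern2, max_length):
    """Check Seq(c1) ⊆ Seq(c2) for all words of length 1..max_length,
    where c1 and c2 are the periodic colourings with the given patterns."""
    return all(words(pattern1, n) <= words(pattern2, n)
               for n in range(1, max_length + 1))

# "BR" repeated is a translate of "RB" repeated, so it lies in the orbital closure.
print(seq_contained("BR", "RB", 6))   # True
# The all-red colouring contains the word (R, R), which never occurs in "RB" repeated.
print(seq_contained("R", "RB", 6))    # False
```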

We now return to talking about topological dynamics. We saw that in our last example, the orbital closure of $c$ has three “kinds” of things: translates of $c$, and the all-red and all-blue ones. The all-red and all-blue ones are different, because they seem to have “lost” the information of $c$. In fact, the orbital closure of each of them is just that colouring itself. This is a strict subset of $\bar{c}$. These are phenomena we don’t want.

Definition (Minimal dynamical system). We say $(X, T)$ is minimal if $\bar{x} = X$ for all $x \in X$. We say $x \in X$ is minimal if $(\bar{x}, T)$ is a minimal dynamical system.

Proposition. Every dynamical system (X, T ) has a minimal point.

Thus, most of the time, when we want to prove something about dynamical

systems, we can just pass to a minimal subsystem, and work with it.

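Proof. Let $\mathcal{U} = \{\bar{x} : x \in X\}$. This is a family of closed sets, ordered by inclusion. We want to apply Zorn’s lemma to obtain a minimal element. Consider a chain $S$ in $\mathcal{U}$. If

\[ \bar{x}_1 \supseteq \bar{x}_2 \supseteq \cdots \supseteq \bar{x}_n, \]

then their intersection is $\bar{x}_n$, which is in particular non-empty. So any finite collection in $S$ has non-empty intersection. Since $X$ is compact, we know

\[ \bigcap_{\bar{x} \in S} \bar{x} \neq \emptyset. \]

We pick

\[ z \in \bigcap_{\bar{x} \in S} \bar{x}. \]

Then we know that $\bar{z} \subseteq \bar{x}$ for all $\bar{x} \in S$. So we know

\[ \bar{z} \subseteq \bigcap_{\bar{x} \in S} \bar{x}, \]

that is, $\bar{z}$ is a lower bound for $S$ in $\mathcal{U}$. So by Zorn’s lemma, we can find a minimal element (in both senses).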

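Example. Consider $c$ given by

[Figure not recovered: the second colouring from the earlier example.]

Then we saw that $\bar{c}$ contains all translates of these, and also

[Figure not recovered: the all-red and all-blue colourings.]

Then this system is not minimal, since the orbital closure of the all-red one is a strict subsystem of $\bar{c}$.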

We are now going to derive some nice properties of minimal systems.

Lemma. If $(X, T)$ is a minimal system, then for all $\varepsilon > 0$, there is some $m \in \mathbb{N}$ such that for all $x, y \in X$, we have

\[ \min_{|s| \leq m} \rho(T^s x, y) < \varepsilon. \]

Proof. Suppose not. Then there exists $\varepsilon > 0$ and points $x_i, y_i \in X$ such that

\[ \min_{|s| \leq i} \rho(T^s x_i, y_i) \geq \varepsilon. \]

By compactness, we may pass to subsequences, and assume $x_i \to x$ and $y_i \to y$. By continuity, it follows that

\[ \rho(T^s x, y) \geq \varepsilon \]

for all $s \in \mathbb{Z}$. This is a contradiction, since $\bar{x} = X$ by minimality.

We now want to prove topological van der Waerden.

Theorem (Topological van der Waerden). Let $(X, T)$ be a dynamical system. Then for every $\varepsilon > 0$ and every $r \in \mathbb{N}$, we can find $x \in X$ and $n \in \mathbb{N}$ such that $\rho(x, T^{in} x) < \varepsilon$ for all $i = 1, \cdots, r$.

Proof. Without loss of generality, we may assume $(X, T)$ is a minimal system. We induct on $r$. If $r = 1$, we can just choose $y \in X$, and consider $Ty, T^2 y, \cdots$. Then note that by compactness, we have $s_1 < s_2 \in \mathbb{N}$ such that

\[ \rho(T^{s_1} y, T^{s_2} y) < \varepsilon, \]

and then take $x = T^{s_1} y$ and $n = s_2 - s_1$.

Now suppose $r > 1$, and that we have the result for all $\varepsilon > 0$ and $r - 1$.

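Claim. For all $\varepsilon > 0$, there is some point $y \in X$ such that there is an $x \in X$ and $n \in \mathbb{N}$ such that

\[ \rho(T^{in} x, y) < \varepsilon \]

for all $1 \leq i \leq r$.

Note that this is a different statement from the theorem, because the points $T^{in} x$ are required to be close to $y$ rather than to the starting point $x$. In fact, this is a triviality. Indeed, let $(x_0, n)$ be as guaranteed by the hypothesis. That is, $\rho(x_0, T^{in} x_0) < \varepsilon$ for all $1 \leq i \leq r - 1$. Then pick $y = x_0$ and $x = T^{-n} x_0$, so that $T^{in} x = T^{(i-1)n} x_0$ for $1 \leq i \leq r$.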

The next goal is to show that there is nothing special about this y.

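Claim. For all $\varepsilon > 0$ and for all $z \in X$, there exists $x_z \in X$ and $n \in \mathbb{N}$ for which $\rho(T^{in} x_z, z) < \varepsilon$ for all $1 \leq i \leq r$.

The idea is just to shift the picture so that $y$ gets close to $z$, and see where we send $x$ to. We will use continuity to make sure we don’t get too far away.

We choose $m$ as in the previous lemma for $\frac{\varepsilon}{2}$. Since $T^{-m}, T^{-m+1}, \cdots, T^m$ are all uniformly continuous, we can choose $\varepsilon'$ such that $\rho(a, b) < \varepsilon'$ implies $\rho(T^s a, T^s b) < \frac{\varepsilon}{2}$ for all $|s| \leq m$.

Given $z \in X$, we obtain $x$ and $y$ by applying our first claim to $\varepsilon'$. Then we can find $s \in \mathbb{Z}$ with $|s| \leq m$ such that $\rho(T^s y, z) < \frac{\varepsilon}{2}$. Consider $x_z = T^s x$. Then

\[ \begin{aligned} \rho(T^{in} x_z, z) &\leq \rho(T^{in} x_z, T^s y) + \rho(T^s y, z) \\ &< \rho(T^s(T^{in} x), T^s y) + \frac{\varepsilon}{2} \\ &< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \end{aligned} \]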

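Claim. For all $\varepsilon > 0$ and $z \in X$, there exists $x \in X$, $n \in \mathbb{N}$ and $\varepsilon' > 0$ such that $T^{in}(B(x, \varepsilon')) \subseteq B(z, \varepsilon)$ for all $1 \leq i \leq r$.

We choose $\varepsilon'$ by continuity, using the previous claim.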

The idea is that we repeatedly apply the final claim, and then as we keep

moving back, eventually two points will be next to each other.

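We pick $z_0 \in X$ and set $\varepsilon_0 = \frac{\varepsilon}{2}$. By the final claim, there exists $z_1 \in X$ and some $n_1 \in \mathbb{N}$ such that $T^{in_1}(B(z_1, \varepsilon_1)) \subseteq B(z_0, \varepsilon_0)$ for some $0 < \varepsilon_1 \leq \varepsilon_0$ and all $1 \leq i \leq r$.

Inductively, we find $z_s \in X$, $n_s \in \mathbb{N}$ and some $0 < \varepsilon_s \leq \varepsilon_{s-1}$ such that

\[ T^{in_s}(B(z_s, \varepsilon_s)) \subseteq B(z_{s-1}, \varepsilon_{s-1}) \]

for all $1 \leq i \leq r$.

By compactness, $(z_s)$ has a convergent subsequence, and in particular, there exists $i < j \in \mathbb{N}$ such that $\rho(z_i, z_j) < \frac{\varepsilon}{2}$.

Now take $x = z_j$, and

\[ n = n_j + n_{j-1} + \cdots + n_{i+1}. \]

Then

\[ T^{\ell n}(B(x, \varepsilon_j)) \subseteq B(z_i, \varepsilon_i) \]

for all $1 \leq \ell \leq r$. But we know

\[ \rho(z_i, z_j) \leq \frac{\varepsilon}{2} \]

and

\[ \varepsilon_i \leq \varepsilon_0 \leq \frac{\varepsilon}{2}. \]

So we have

\[ \rho(T^{\ell n} x, x) \leq \varepsilon \]

for all $1 \leq \ell \leq r$.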

In the remainder of the chapter, we will actually prove Hindman’s theorem.

We will again first prove a “topological” version, but unlike the topological van

der Waerden theorem, it will look nothing like the actual Hindman’s theorem.

So we still need to do some work to deduce the actual version from the topological

version.

To state topological Hindman, we need the following definition:

Definition (Proximal points). We say $x, y \in X$ are proximal if

\[ \inf_{s \in \mathbb{Z}} \rho(T^s x, T^s y) = 0. \]

Example. Two colourings $c_1, c_2 \in C$ are proximal iff there are arbitrarily large intervals where they agree.
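The criterion in the example can be probed numerically on finite windows. The sketch below measures the longest interval of agreement inside a window; the function name `longest_agreement` and the two sample colourings are assumptions chosen for illustration. Growth of this quantity as the window grows is only suggestive of proximality, since a finite computation cannot establish "arbitrarily large".

```python
def longest_agreement(c1, c2, lo, hi):
    """Length of the longest interval inside [lo, hi) on which c1 and c2 agree."""
    best = run = 0
    for i in range(lo, hi):
        run = run + 1 if c1(i) == c2(i) else 0
        best = max(best, run)
    return best

# Hypothetical colourings: all zero, versus one that is 1 exactly at powers of two.
zero = lambda i: 0
sparse = lambda i: 1 if i > 0 and (i & (i - 1)) == 0 else 0

print(longest_agreement(zero, sparse, 0, 64))     # 31
print(longest_agreement(zero, sparse, 0, 1024))   # 511
```

Here the two colourings differ at infinitely many positions, yet the gaps between disagreements grow without bound, which is exactly the situation the example describes.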

The topological form of Hindman’s theorem says the following:

Theorem (Topological Hindman’s theorem). Let $(X, T)$ be a dynamical system, and suppose $X = \bar{x}$ for some $x \in X$. Then for any minimal subsystem $Y \subseteq X$, there is some $y \in Y$ such that $x$ and $y$ are proximal.

We will first show how this implies Hindman’s theorem, and then prove this.

To do the translation, we need to, of course, translate concepts in dynamical

systems to concepts about colourings. We previously showed that for colourings $c, c'$, we have $c' \in \bar{c}$ iff $\mathrm{Seq}(c') \subseteq \mathrm{Seq}(c)$. The next thing to figure out is when $c : \mathbb{Z} \to [k]$ is minimal. By definition, this means for any $x \in \bar{c}$, we have $\bar{x} = \bar{c}$. Equivalently, we have $\mathrm{Seq}(x) = \mathrm{Seq}(c)$.

We introduce some notation.

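Notation. For an interval $I \subseteq \mathbb{Z}$, say $I = \{a, a + 1, \cdots, b\}$, we write $c(I)$ for the sequence $(c(a), c(a + 1), \cdots, c(b))$.

Clearly, we have

\[ \mathrm{Seq}(c) = \{c(I) : I \subseteq \mathbb{Z} \text{ is an interval}\}. \]

For intervals $I_1, I_2 \subseteq \mathbb{Z}$, we write $c_1(I_1) \preccurlyeq c_2(I_2)$ if $c_1(I_1)$ is a subsequence of $c_2(I_2)$, i.e. it occurs in $c_2(I_2)$ as a block of consecutive terms.

Definition (Bounded gaps property). We say a colouring $c : \mathbb{Z} \to [k]$ has the bounded gaps property if for each interval $I \subseteq \mathbb{Z}$, there exists some $M > 0$ such that $c(I) \preccurlyeq c(U)$ for every interval $U \subseteq \mathbb{Z}$ of length at least $M$.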

For example, every periodic colouring has the bounded gaps property.
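The periodic case, and a colouring that fails the property, can both be checked on a finite range. In the sketch below a colouring is a Python function on the integers, a word is a tuple of colours, and "contains" means occurring as a block of consecutive values; the helper names and the two sample colourings are assumptions made for illustration.

```python
def contains_word(c, word, start, length):
    """True if `word` occurs as consecutive values of c inside [start, start + length)."""
    k = len(word)
    return any(
        all(c(t + j) == word[j] for j in range(k))
        for t in range(start, start + length - k + 1)
    )

def bounded_gap_on_range(c, word, M, lo, hi):
    """True if every window of length M starting in [lo, hi) contains `word`."""
    return all(contains_word(c, word, s, M) for s in range(lo, hi))

periodic = lambda i: i % 3            # period 3
step = lambda i: 0 if i < 0 else 1    # one colour on the left, another on the right

print(bounded_gap_on_range(periodic, (0, 1, 2), 5, -30, 30))   # True
print(bounded_gap_on_range(periodic, (0, 1, 2), 4, -30, 30))   # False
print(bounded_gap_on_range(step, (0,), 50, -30, 30))           # False
```

For the period-3 colouring, windows of length 5 always contain the word $(0, 1, 2)$ while windows of length 4 need not, so $M = 5$ works for that word. For the step colouring, no $M$ works for the word $(0)$, because windows to the right of the origin never contain the left-hand colour.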

It turns out this is precisely what we need to characterize minimal colourings.

Proposition. A colouring $c : \mathbb{Z} \to [k]$ is minimal iff $c$ has the bounded gaps property.

Proof. Suppose $c$ has the bounded gaps property. We want to show that for all $x \in \bar{c}$, we have $\mathrm{Seq}(x) = \mathrm{Seq}(c)$. We certainly got one containment $\mathrm{Seq}(x) \subseteq \mathrm{Seq}(c)$. Consider any interval $I \subseteq \mathbb{Z}$. We want to show that $c(I) \in \mathrm{Seq}(x)$. By the bounded gaps property, there is some $M$ such that any interval $U \subseteq \mathbb{Z}$ of length $> M$ satisfies $c(I) \preccurlyeq c(U)$. But since $x \in \bar{c}$, we know that

\[ x([0, \cdots, M]) = c([t, t + 1, \cdots, t + M]) \]

for some $t \in \mathbb{Z}$. So $c(I) \preccurlyeq x([0, \cdots, M])$, and this implies $\mathrm{Seq}(c) \subseteq \mathrm{Seq}(x)$.

For the other direction, suppose $c$ does not have the bounded gaps property. So there is some bad interval $I \subseteq \mathbb{Z}$. Then for any $n \in \mathbb{N}$, there is some $t_n$ such that

\[ c(I) \not\preccurlyeq c([t_n - n, t_n - n + 1, \cdots, t_n + n]). \]

Now consider $x_n = L^{t_n}(c)$. By passing to a subsequence if necessary, we may assume that $x_n \to x$. Clearly, $c(I) \notin \mathrm{Seq}(x)$. So we have found something in $\mathrm{Seq}(c) \setminus \mathrm{Seq}(x)$. So $c \notin \bar{x}$.

It turns out once we have equated minimality with the bounded gaps property,

it is not hard to deduce Hindman’s theorem.

Theorem (Hindman’s theorem). If $c : \mathbb{N} \to [k]$ is a $k$-colouring, then there exists an infinite $A \subseteq \mathbb{N}$ such that $FS(A)$ is monochromatic.

Proof. We extend the colouring $c$ to a colouring of $\mathbb{Z}$ by defining $x : \mathbb{Z} \to [k + 1]$ with

\[ x(i) = x(-i) = c(i) \]

for $i > 0$, and $x(0) = k + 1$. Then it suffices to find an infinite $A \subseteq \mathbb{Z}$ such that $FS(A)$ is monochromatic with respect to $x$.

We apply topological Hindman to $(\bar{x}, L)$ to find a minimal colouring $y$ such that $x$ and $y$ are proximal. Then either

\[ \inf_{n \in \mathbb{N}} \rho(L^n x, L^n y) = 0 \quad \text{or} \quad \inf_{n \in \mathbb{N}} \rho(L^{-n} x, L^{-n} y) = 0. \]

We wlog it is the first case, i.e. $x$ and $y$ are positively proximal. We now build up this infinite set $A$ we want one by one.

Let the colour of $y(0)$ be red. We shall in fact pick $0 < a_1 < a_2 < \cdots$ inductively so that $x(t) = y(t) = \text{red}$ for all $t \in FS(a_1, a_2, \cdots)$.

By the bounded gaps property of $y$, there exists some $M_1$ such that any interval $I \subseteq \mathbb{Z}$ of length $\geq M_1$ contains $y([0])$, i.e. $y([0]) \preccurlyeq y(I)$.

Since $x$ and $y$ are positively proximal, there exists an interval $I \subseteq \mathbb{Z}$ with $|I| \geq M_1$ and $\min(I) > 0$ such that $x(I) = y(I)$. Then we pick $a_1 \in I$ such that

\[ x(a_1) = y(a_1) = y(0). \]

Now we just have to keep doing this. Suppose we have found $a_1 < \cdots < a_n$ as required. Consider the interval $J = [0, \cdots, a_1 + \cdots + a_n]$. Again by the bounded gaps property, there exists some $M_{n+1}$ such that if $I \subseteq \mathbb{Z}$ has $|I| \geq M_{n+1}$, then $y(J) \preccurlyeq y(I)$.

We choose an interval $I \subseteq \mathbb{Z}$ such that

(i) $x(I) = y(I)$

(ii) $|I| \geq M_{n+1}$

(iii) $\min I > \sum_{i=1}^n a_i$

Then we know that

\[ y([0, \cdots, a_1 + \cdots + a_n]) \preccurlyeq y(I). \]

Let $a_{n+1}$ denote the position at which $y([0, \cdots, a_1 + \cdots + a_n])$ occurs in $y(I)$. It is then immediate that $x(z) = y(z) = \text{red}$ for all $z \in FS(a_1, \cdots, a_{n+1})$, as any element in $FS(a_1, \cdots, a_{n+1})$ is either

– $t'$ for some $t' \in FS(a_1, \cdots, a_n)$;

– $a_{n+1}$; or

– $a_{n+1} + t'$ for some $t' \in FS(a_1, \cdots, a_n)$,

and these are all red by construction.
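The three-case description of $FS(a_1, \cdots, a_{n+1})$ used in the inductive step can be checked directly on small examples. The sketch below builds the larger finite-sum set from the smaller one and compares it with a direct computation; the function names `finite_sums` and `extend` and the sample numbers are illustrative assumptions.

```python
from itertools import combinations

def finite_sums(xs):
    """FS(xs): sums of all finite non-empty sub-collections of xs."""
    xs = list(xs)
    return {sum(sub) for k in range(1, len(xs) + 1) for sub in combinations(xs, k)}

def extend(fs_old, a_new):
    """Build FS(a_1, ..., a_{n+1}) from FS(a_1, ..., a_n) via the three cases:
    an old sum t', the new element itself, or the new element plus an old sum."""
    return set(fs_old) | {a_new} | {a_new + t for t in fs_old}

old = finite_sums([3, 5, 11])
print(extend(old, 40) == finite_sums([3, 5, 11, 40]))   # True
```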

Then $A = \{a_1, a_2, \cdots\}$ is the required set.

The heart of the proof is that we used topological Hindman to obtain a minimal proximal point, and this is why we can iteratively get the $a_i$.

It remains to prove topological Hindman.

Theorem (Topological Hindman’s theorem). If $(X, T)$ is a dynamical system, $X = \bar{x}$ and $Y \subseteq X$ is minimal, then there exists $y \in Y$ such that $x$ and $y$ are proximal.

This is a long proof, and we will leave out proofs of basic topological facts.

Proof. If $X$ is a compact metric space, then $X^X = \{f : X \to X\}$ is compact under the product topology by Tychonoff. The basic open sets are of the form

\[ \{f : X \to X \mid f(x_i) \in U_i \text{ for } i = 1, \cdots, k\} \]

for some fixed $x_i \in X$ and $U_i \subseteq X$ open.

Now any function $g : X \to X$ can act on $X^X$ in two obvious ways:

– Post-composition: let $L_g : X^X \to X^X$ be given by $L_g(f) = g \circ f$.

– Pre-composition: let $R_g : X^X \to X^X$ be given by $R_g(f) = f \circ g$.

The useful thing to observe is that $R_g$ is always continuous, while $L_g$ is continuous if $g$ is continuous.

Now if $(X, T)$ is a dynamical system, we let

\[ E_T = \operatorname{cl}\{T^s : s \in \mathbb{Z}\} \subseteq X^X. \]

The idea is that we look at what is going on in this space instead. Let’s note a few things about this subspace $E_T$:

– $E_T$ is compact, as it is a closed subset of a compact set.

– $f \in E_T$ iff for all $\varepsilon > 0$ and points $x_1, \cdots, x_k \in X$, there exists $s \in \mathbb{Z}$ such that $\rho(f(x_i), T^s(x_i)) < \varepsilon$ for all $i = 1, \cdots, k$.

– $E_T$ is closed under composition.

So in fact, $E_T$ is a compact semi-group inside $X^X$. It turns out every proof of Hindman’s theorem involves working with a semi-group-like object, and then trying to find an idempotent element.

Theorem (Idempotent theorem). If $E \subseteq X^X$ is a compact semi-group, then there exists $g \in E$ such that $g^2 = g$.

This is a strange thing to try to prove, but it turns out once we came up

with this statement, it is not hard to prove. The hard part is figuring out this is

the thing we want to prove.

Proof. Let $\mathcal{F}$ denote the collection of all compact semi-groups of $E$. Let $A \in \mathcal{F}$ be minimal with respect to inclusion. To see $A$ exists, it suffices (by Zorn) to check that any chain $S$ has a lower bound in $\mathcal{F}$, but this is immediate, since the intersection of nested compact sets is non-empty.

Claim. $Ag = A$ for all $g \in A$.

We first observe that $Ag$ is compact since $R_g$ is continuous. Also, $Ag$ is a semigroup, since if $f_1 g, f_2 g \in Ag$, then

\[ f_1 g f_2 g = (f_1 g f_2) g \in Ag. \]

Finally, since $g \in A$, we have $Ag \subseteq A$. So by minimality, we have $Ag = A$.

Now let

\[ B_g = \{f \in A : fg = g\}. \]

Claim. $B_g = A$ for all $g \in A$.

Note that $B_g$ is non-empty, because $Ag = A$. Moreover, $B_g$ is closed, as it is the inverse image of $\{g\}$ under $R_g$. So $B_g$ is compact. Finally, it is clear that $B_g$ is a semi-group. So by minimality, we have $B_g = A$.

So pick any $g \in A$, and then $g^2 = g$.
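The finite analogue of the idempotent theorem is easy to see by computation: the powers of a single self-map of a finite set form a finite semigroup under composition, and some power is idempotent. The sketch below finds it by listing powers until they repeat. This illustrates the finite case only and is not the Zorn's lemma argument above; the tuple encoding of maps and the names `compose` and `idempotent_power` are assumptions of the sketch.

```python
def compose(f, g):
    """Return f ∘ g for self-maps of {0, ..., m-1} stored as tuples."""
    return tuple(f[g[i]] for i in range(len(g)))

def idempotent_power(f):
    """Return an idempotent element among the powers f, f^2, f^3, ... of f."""
    seen = []
    g = f
    while g not in seen:          # the list of powers is finite, so this stops
        seen.append(g)
        g = compose(f, g)
    for h in seen:
        if compose(h, h) == h:
            return h

# Hypothetical map on {0, 1, 2, 3}: a 3-cycle on {0, 1, 2}, with 3 sent to 0.
f = (1, 2, 0, 0)
g = idempotent_power(f)
print(g)                     # (0, 1, 2, 2)
print(compose(g, g) == g)    # True
```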

Proof of topological Hindman (continued). We are actually going to work in a slightly smaller compact semigroup than $E_T$. We let $F \subseteq E_T$ be defined by

\[ F = \{f \in E_T : f(x) \in Y\}. \]

Claim. $F \subseteq X^X$ is a compact semigroup.

Before we prove the claim, we see why it makes sense to consider this $F$. Suppose the claim is true. Then by applying the idempotent theorem, we get $g \in F$ such that $g^2 = g$. We now check that $x, g(x)$ are proximal. Since $g \in F \subseteq E_T$, we know for all $\varepsilon$, there exists $s \in \mathbb{Z}$ such that

\[ \rho(g(x), T^s(x)) < \frac{\varepsilon}{2}, \quad \rho(g(g(x)), T^s(g(x))) < \frac{\varepsilon}{2}. \]

But we are done, since $g(x) = g(g(x))$, and then we conclude from the triangle inequality that

\[ \rho(T^s(x), T^s(g(x))) < \varepsilon. \]

So $x$ and $g(x) \in Y$ are proximal.

It now remains to prove the final claim. We first note that $F$ is non-empty. Indeed, pick any $y \in Y$. Since $X = \bar{x}$, there exists some sequence $T^{n_i}(x) \to y$. Now by compactness of $E_T$, we can find some $f \in E_T$ such that $(T^{n_i})_{i \geq 0}$ cluster at $f$, i.e. every open neighbourhood of $f$ contains infinitely many $T^{n_i}$. Then since $X$ is Hausdorff, it follows that $f(x) = y$. So $f \in F$.

To show $F$ is compact, it suffices to show that it is closed. But since $Y = \bar{y}$ for all $y \in Y$ by minimality, we know $Y$ is in particular closed, and so $F$ is closed in the product topology.

Finally, we have to show that $F$ is closed under composition, so that it is a semi-group. To show this, it suffices to show that any map $f \in E_T$ sends $Y \to Y$. Indeed, by minimality of $Y$, this is true for $T^s$ for $s \in \mathbb{Z}$, and then we are done since $Y$ is closed.

The actual hard part of the proof is deciding we should prove the idempotent

theorem. It is not an obvious thing to try!