III Ramsey Theory

3 Partition Regularity
In the previous chapter, the problems we studied were mostly “linear” in nature.
We had some linear system, namely that encoding the fact that a sequence is an
AP, and we wanted to know if it had a monochromatic solution. More generally,
we can define the following:
Definition (Partition regularity). We say an $m \times n$ matrix $A$ over $\mathbb{Q}$ is partition regular if whenever $\mathbb{N}$ is finitely coloured, there exists an $x \in \mathbb{N}^n$ such that $Ax = 0$ and $x$ is monochromatic, i.e. all coordinates of $x$ have the same colour. Recall that $\mathbb{N}$ does not include 0.
There is no loss of generality in assuming $A$ in fact has entries in $\mathbb{Z}$, by scaling $A$, but sometimes it is (notationally) convenient to consider cases where the entries are rational.
The question of the chapter is the following: when is a matrix partition regular? We begin by looking at some examples.
Example (Schur's theorem). Schur's theorem says whenever $\mathbb{N}$ is finitely coloured, there exists a monochromatic set of the form $\{x, y, x + y\}$. In other words the matrix $(1\ 1\ {-1})$ is partition regular, since
\[ \begin{pmatrix} 1 & 1 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0 \iff z = x + y. \]
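As a quick sanity check (our own brute force, not part of the notes), one can verify that every 2-colouring of $\{1, \cdots, 5\}$ already contains a monochromatic triple $x, y, x + y$, allowing $x = y$, while the colouring $\{1, 4\}$, $\{2, 3\}$ shows that $\{1, \cdots, 4\}$ does not suffice:

```python
from itertools import product

def mono_schur_triple(colour):
    """colour: dict n -> 0/1. Is there x, y, x+y (x <= y) in one colour?"""
    return any(colour[x] == colour[y] == colour[x + y]
               for x in colour for y in colour
               if x <= y and x + y in colour)

# {1,...,4} admits a triple-free 2-colouring, but {1,...,5} does not.
assert not mono_schur_triple({1: 0, 4: 0, 2: 1, 3: 1})
assert all(mono_schur_triple(dict(zip(range(1, 6), cs)))
           for cs in product([0, 1], repeat=5))
```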
Example. How about $(2\ 3\ {-5})$? This is partition regular, because we can pick any $x$, and we have
\[ \begin{pmatrix} 2 & 3 & -5 \end{pmatrix} \begin{pmatrix} x \\ x \\ x \end{pmatrix} = 0. \]
This is a trivial example.
How about van der Waerden’s theorem?
Example. Van der Waerden's theorem says there is a monochromatic 3-AP $\{x_1, x_2, x_3\}$ whenever $\mathbb{N}$ is finitely-coloured. We know $x_1, x_2, x_3$ forms a 3-AP iff
\[ x_3 - x_2 = x_2 - x_1, \]
or equivalently
\[ x_3 + x_1 = 2 x_2. \]
This implies that $(1\ {-2}\ 1)$ is partition regular. But this is actually not a very interesting thing to say, because $x_1 = x_2 = x_3$ is always a solution to this equation. So this falls into the previous "trivial" case.
On the second example sheet we saw a stronger version of van der Waerden. It says we can always find a monochromatic set of the form
\[ \{d, a, a + d, a + 2d, \cdots, a + md\}. \]
By including this variable, we can write down the property of being a progression in a non-trivial format by
\[ \begin{pmatrix} 1 & 1 & -1 & 0 \\ 2 & 1 & 0 & -1 \end{pmatrix} \begin{pmatrix} d \\ a \\ x_2 \\ x_3 \end{pmatrix} = 0. \]
This obviously generalizes to an arbitrary $m$-AP, with matrix
\[ \begin{pmatrix}
1 & 1 & -1 & 0 & 0 & \cdots & 0 \\
1 & 2 & 0 & -1 & 0 & \cdots & 0 \\
1 & 3 & 0 & 0 & -1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1 & m & 0 & 0 & 0 & \cdots & -1
\end{pmatrix}. \]
We’ve seen three examples of partition regular matrices. Of course, not every
matrix is partition regular. The matrix
1 1
is not partition regular, for the
silly reason that two positive things cannot add up to zero.
Let’s now look at some non-trivial first matrix that is not partition regular.
Example. The matrix $(2\ {-1})$ is not partition regular, since we can put $c(x) = (-1)^n$, where $n$ is the maximum integer such that $2^n \mid x$. Then $\{x, 2x\}$ is never monochromatic.

A similar argument shows that if $\lambda \in \mathbb{Q}$ is such that $(\lambda\ {-1})$ is partition regular, then $\lambda = 1$.
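This colouring is easy to experiment with. Here is a quick sketch (the function name is ours) checking that under $c(x) = (-1)^n$, with $2^n$ the largest power of 2 dividing $x$, the pair $\{x, 2x\}$ is never monochromatic:

```python
def c(x):
    """Colour (-1)**n, where 2**n is the largest power of 2 dividing x."""
    n = 0
    while x % 2 == 0:
        x //= 2
        n += 1
    return (-1) ** n

# x and 2x differ by exactly one factor of 2, so their colours differ.
assert all(c(x) != c(2 * x) for x in range(1, 10_000))
```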
But what if we write down a random matrix, say $(2\ 3\ {-6})$? The goal of this chapter is to find a complete characterization of matrices that are partition regular.
Definition (Columns property). Let
\[ A = \begin{pmatrix} c^{(1)} & c^{(2)} & \cdots & c^{(n)} \end{pmatrix}. \]
We say $A$ has the columns property if there is a partition $[n] = B_1 \cup B_2 \cup \cdots \cup B_d$ such that
\[ \sum_{i \in B_s} c^{(i)} \in \operatorname{span}\{c^{(i)} : i \in B_1 \cup \cdots \cup B_{s-1}\} \]
for $s = 1, \cdots, d$. In particular,
\[ \sum_{i \in B_1} c^{(i)} = 0, \]
since the span of the empty set is $\{0\}$.
What does this mean? Let’s look at the matrices we’ve seen so far.
Example. $(1\ 1\ {-1})$ has the columns property by picking $B_1 = \{1, 3\}$ and $B_2 = \{2\}$.
Example. $(2\ 3\ {-5})$ has the columns property by picking $B_1 = \{1, 2, 3\}$.
Example. The matrix
\[ \begin{pmatrix}
1 & 1 & -1 & 0 & 0 & \cdots & 0 \\
1 & 2 & 0 & -1 & 0 & \cdots & 0 \\
1 & 3 & 0 & 0 & -1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1 & m & 0 & 0 & 0 & \cdots & -1
\end{pmatrix} \]
has the columns property. Indeed, we take $B_1 = \{1, 3, \cdots, m + 2\}$, and since $\{c^{(3)}, \cdots, c^{(m+2)}\}$ spans all of $\mathbb{R}^m$, we know picking $B_2 = \{2\}$ works.
Example. $(\ell\ {-1})$ has the columns property iff $\ell = 1$. In particular, $(1\ 1)$ does not have the columns property.
Given these examples, it is natural to conjecture the following:
Theorem (Rado's theorem). A matrix $A$ is partition regular iff it has the columns property.
This is a remarkable theorem! The property of being partition regular involves a lot of quantifiers, over infinitely many things, but the columns property is entirely finite, and we can get a computer to check it for us easily.
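The claim that a computer can check the columns property can be made concrete. Below is a sketch (our own code, not from the notes) that brute-forces all ordered partitions of the columns and tests the span condition with exact rational arithmetic; it is only sensible for small $n$:

```python
from fractions import Fraction
from itertools import product

def _rank(rows):
    """Rank of a list of rational row vectors, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def _in_span(v, vecs):
    """v lies in span(vecs) iff appending v does not raise the rank."""
    if all(x == 0 for x in v):
        return True
    return bool(vecs) and _rank(vecs + [v]) == _rank(vecs)

def has_columns_property(A):
    """Brute-force the definition: some ordered partition B_1, ..., B_d of
    the columns has each block-sum in the span of the earlier blocks."""
    A = [[Fraction(x) for x in row] for row in A]
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    for f in product(range(1, n + 1), repeat=n):  # f[j] = block of column j
        d = max(f)
        if set(f) != set(range(1, d + 1)):
            continue  # block labels must be exactly 1..d, none empty
        if all(_in_span([sum(cols[j][i] for j in range(n) if f[j] == s)
                         for i in range(m)],
                        [cols[j] for j in range(n) if f[j] < s])
               for s in range(1, d + 1)):
            return True
    return False
```

For instance, `has_columns_property([[1, 1, -1]])` is `True` (take $B_1 = \{1, 3\}$, $B_2 = \{2\}$), while `has_columns_property([[1, 1]])` is `False`.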
Another remarkable property of this theorem is that neither direction is obvious! It turns out that showing partition regularity implies the columns property is (marginally) easier, because if we know something is partition regular, then we can try to cook up some interesting colourings and see what happens. The other direction is harder.
To get a feel of the result, we will first prove it in the case of a single equation. The columns property in particular becomes much simpler: it just means that there exists a non-empty subset of the $a_i$'s that sums up to zero.
Theorem. If $a_1, \cdots, a_n \in \mathbb{Q} \setminus \{0\}$, then $(a_1\ \cdots\ a_n)$ is partition regular iff there exists a non-empty $I \subseteq [n]$ such that
\[ \sum_{i \in I} a_i = 0. \]
For a fixed prime $p$, we let $d(x)$ denote the last non-zero digit of $x$ in base $p$, i.e. if
\[ x = d_n p^n + d_{n-1} p^{n-1} + \cdots + d_0, \]
then
\[ L(x) = \min\{i : d_i \neq 0\} \]
and $d(x) = d_{L(x)}$. We now prove the easy direction of the theorem.
Proposition. If $a_1, \cdots, a_n \in \mathbb{Q} \setminus \{0\}$ and $(a_1\ a_2\ \cdots\ a_n)$ is partition regular, then
\[ \sum_{i \in I} a_i = 0 \]
for some non-empty $I \subseteq [n]$.
Proof. We wlog $a_i \in \mathbb{Z}$, by scaling. Fix a suitably large prime
\[ p > \sum_{i=1}^n |a_i|, \]
and consider the $(p - 1)$-colouring of $\mathbb{N}$ where $x$ is coloured $d(x)$. We find $x_1, \cdots, x_n$ such that
\[ \sum a_i x_i = 0 \]
and $d(x_i) = d$ for some $d \in \{1, \cdots, p - 1\}$. We write out everything in base $p$, and let
\[ L = \min\{L(x_i) : 1 \leq i \leq n\}, \]
and set
\[ I = \{i : L(x_i) = L\}. \]
Then for all $i \in I$, we have
\[ x_i \equiv d p^L \pmod{p^{L+1}}, \]
while $x_i \equiv 0 \pmod{p^{L+1}}$ for $i \notin I$. On the other hand, we are given that
\[ \sum a_i x_i = 0. \]
Taking mod $p^{L+1}$ gives us
\[ \sum_{i \in I} a_i d p^L \equiv 0 \pmod{p^{L+1}}. \]
Since $d p^L$ is invertible mod $p$, we know
\[ \sum_{i \in I} a_i \equiv 0 \pmod{p}. \]
But by our choice of $p$, this implies that $\sum_{i \in I} a_i = 0$.
Here we knew ahead of time what prime $p$ to pick. If we didn't, then we could consider all primes $p$, and for each $p$ find an $I_p \subseteq [n]$ such that
\[ \sum_{i \in I_p} a_i \equiv 0 \pmod{p}. \]
Then some $I_p$ has to occur for infinitely many primes, and then we are done. Note that this merely uses the fact that if a fixed number is $0 \bmod n$ for arbitrarily large $n$, then it must be zero. This is not some deep local-global correspondence number theoretic fact.
It turns out this is essentially the only way we know for proving this theorem.
One possible variation is to use the “first non-zero digit” to do the colouring,
but this is harder.
Let’s now try and do the other direction. Before we do that, we start by
doing a warm up. Last time, we proved that if we had
1 λ
, then this is
partition regular iff λ = 1. We now prove a three element version.
Proposition. The equation $(1\ \lambda\ {-1})$ is partition regular for all $\lambda \in \mathbb{Q}$.
Proof. We may wlog $\lambda > 0$. If $\lambda = 0$, then this is trivial, and if $\lambda < 0$, then we can multiply the whole equation by $-1$.

Say
\[ \lambda = \frac{r}{s}. \]
The idea is to try to find solutions of this in long arithmetic progressions. Suppose $\mathbb{N}$ is $k$-coloured. We let
\[ \{a, a + d, \cdots, a + nd\} \]
be a monochromatic AP, for $n$ sufficiently large.

If $sd$ were the same colour as this AP, then we'd be done, as we can take
\[ x = a, \quad y = sd, \quad z = a + \frac{r}{s} \cdot sd. \]
In fact, if any of $sd, 2sd, \cdots, \frac{n}{r} sd$ have the same colour as the AP, then we'd be done by taking
\[ x = a, \quad y = isd, \quad z = a + \frac{r}{s} \cdot isd = a + ird \leq a + nd. \]
If this were not the case, then $\{sd, 2sd, \cdots, \frac{n}{r} sd\}$ is $(k - 1)$-coloured, and this is just a scaled up copy of $\mathbb{N}$. So we are done by induction on $k$.
Note that when $\lambda = 1$, the statement that $(1\ 1\ {-1})$ is partition regular may be proved by Ramsey's theorem. Can we prove this more general result by Ramsey's theorem as well? The answer is, we don't know.
It turns out this is not just a warm up, but the main ingredient of what we
are going to do.
Theorem. If $a_1, \cdots, a_n \in \mathbb{Q} \setminus \{0\}$, then $(a_1\ \cdots\ a_n)$ is partition regular iff there exists a non-empty $I \subseteq [n]$ such that
\[ \sum_{i \in I} a_i = 0. \]
Proof. One direction was done. To do the other direction, recall that we had a really easy case of, say, $(2\ 3\ {-5})$, because we could just make all the variables the same. In the general case, we can't quite do this, but we may try to solve the equation with as few distinct variables as possible. In fact, we shall find some monochromatic $x, y, z$, and then assign each of $x_1, \cdots, x_n$ to be one of $x, y, z$.
We know
\[ \sum_{i \in I} a_i = 0. \]
We now partition $I$ into two pieces: we fix $i_0 \in I$, and set
\[ x_i = \begin{cases} x & i = i_0 \\ y & i \in I \setminus \{i_0\} \\ z & i \notin I. \end{cases} \]
We would be done if whenever we finitely colour $\mathbb{N}$, we can find monochromatic $x, y, z$ such that
\[ a_{i_0} x + \sum_{i \in I \setminus \{i_0\}} a_i y + \sum_{i \notin I} a_i z = 0. \]
But, since
\[ \sum_{i \in I} a_i = 0, \]
this is equivalent to
\[ a_{i_0} x - a_{i_0} y + (\text{something}) z = 0. \]
Since $a_{i_0} \neq 0$, we can divide out by $a_{i_0}$ to obtain an equation of the form $x - y + \lambda z = 0$ with $\lambda \in \mathbb{Q}$, and we are done by the previous case.
Note that our proof of the theorem shows that if an equation is partition regular for all last-digit base $p$ colourings, then it is partition regular for all colourings. This might sound like an easier thing to prove than the full-blown Rado's theorem, but it turns out the only proof we have for this implication is via Rado's theorem.
We now prove the general case. We first do the easy direction, because it is
largely the same as the single-equation case.
Proposition. If $A$ is an $m \times n$ matrix with rational entries which is partition regular, then $A$ has the columns property.
Proof. We again wlog all entries of $A$ are integers. Let the columns of $A$ be $c^{(1)}, \cdots, c^{(n)}$. Given a prime $p$, we consider the $(p - 1)$-colouring of $\mathbb{N}$, where $x$ is coloured $d(x)$, the last non-zero digit in the base $p$ expansion. Since $A$ is partition regular, we obtain a monochromatic solution.
We then get a monochromatic $x_1, \cdots, x_n$ such that $Ax = 0$, i.e.
\[ \sum x_i c^{(i)} = 0. \]
Any such solution with colour $d$ induces a partition of $[n] = B_1 \cup B_2 \cup \cdots \cup B_r$, where
- for all $i, j \in B_s$, we have $L(x_i) = L(x_j)$; and
- for all $s < t$ and $i \in B_s$, $j \in B_t$, we have $L(x_i) < L(x_j)$.
Last time, with the benefit of hindsight, we were able to choose some large prime $p$ that made the argument work. So we use the trick we mentioned after the proof last time. Since there are finitely many possible partitions of $[n]$, we may assume that this particular partition is induced by infinitely many primes $p$. Call this set of primes $P$. We introduce some notation: we say two vectors $u, v \in \mathbb{Z}^m$ satisfy $u \equiv v \pmod{p}$ if $u_i \equiv v_i \pmod{p}$ for all $i = 1, \cdots, m$.
Now we know that
\[ \sum x_i c^{(i)} = 0. \]
Looking at the last non-zero digit in the base $p$ expansion, we have
\[ \sum_{i \in B_1} d c^{(i)} \equiv 0 \pmod{p}. \]
From this we conclude, by multiplying by $d^{-1}$, that
\[ \sum_{i \in B_1} c^{(i)} \equiv 0 \pmod{p} \]
for all $p \in P$. So we deduce that
\[ \sum_{i \in B_1} c^{(i)} = 0. \]
Similarly, for higher $s$, we find that for each base $p$ colouring, we have
\[ \sum_{i \in B_s} p^t d c^{(i)} + \sum_{i \in B_1 \cup \cdots \cup B_{s-1}} x_i c^{(i)} \equiv 0 \pmod{p^{t+1}} \]
for all $s \geq 2$, and some $t$ dependent on $s$ and $p$. Multiplying by $d^{-1}$, we find
\[ \sum_{i \in B_s} p^t c^{(i)} + \sum_{i \in B_1 \cup \cdots \cup B_{s-1}} (d^{-1} x_i) c^{(i)} \equiv 0 \pmod{p^{t+1}}. \quad (*) \]
We claim that this implies
\[ \sum_{i \in B_s} c^{(i)} \in \langle c^{(i)} : i \in B_1 \cup \cdots \cup B_{s-1} \rangle. \]
This is not exactly immediate, because the values of $x_i$ in $(*)$ may change as we change our $p$. But it is still some easy linear algebra.
Suppose this were not true. Since we are living in a Euclidean space, we have an inner product, and we can find some $v \in \mathbb{Z}^m$ such that
\[ \langle v, c^{(i)} \rangle = 0 \text{ for all } i \in B_1 \cup \cdots \cup B_{s-1}, \]
and
\[ \left\langle v, \sum_{i \in B_s} c^{(i)} \right\rangle \neq 0. \]
But then taking the inner product of $v$ with $(*)$ kills all the terms from $B_1 \cup \cdots \cup B_{s-1}$, and gives
\[ p^t \left\langle v, \sum_{i \in B_s} c^{(i)} \right\rangle \equiv 0 \pmod{p^{t+1}}. \]
Equivalently, we have
\[ \left\langle v, \sum_{i \in B_s} c^{(i)} \right\rangle \equiv 0 \pmod{p}, \]
but since this inner product is a fixed non-zero integer and $p \in P$ can be arbitrarily large, this is a contradiction. So we have shown that $A$ has the columns property with respect to the partition $[n] = B_1 \cup \cdots \cup B_r$.
We now prove the hard direction. We want an analogous gadget to the $(1\ \lambda\ {-1})$ we had for the single-equation case. The definition will seem rather mysterious, but it turns out to be what we want, and its purpose becomes clearer as we look at some examples.
Definition ($(m, p, c)$-set). For $m, p, c \in \mathbb{N}$, a set $S \subseteq \mathbb{N}$ is an $(m, p, c)$-set with generators $x_1, \cdots, x_m$ if $S$ has the form
\[ S = \left\{ \sum_{i=1}^m \lambda_i x_i : \text{for some } j, \text{ we have } \lambda_i = 0 \text{ for all } i < j,\ \lambda_j = c, \text{ and } \lambda_i \in [-p, p] \text{ for } i > j \right\}. \]
In other words, we have
\[ S = \bigcup_{j=1}^m \{ c x_j + \lambda_{j+1} x_{j+1} + \cdots + \lambda_m x_m : \lambda_i \in [-p, p] \}. \]
For each $j$, the set $\{ c x_j + \lambda_{j+1} x_{j+1} + \cdots + \lambda_m x_m : \lambda_i \in [-p, p] \}$ is called a row of $S$.
Example. What does a $(2, p, 1)$-set look like? It has the form
\[ \{x_1 - p x_2, x_1 - (p - 1) x_2, \cdots, x_1 + p x_2\} \cup \{x_2\}. \]
In other words, this is just an arithmetic progression together with its common difference.
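To get a feel for these sets, here is a short generator (our own sketch; the name `mpc_set` is made up) that enumerates an $(m, p, c)$-set row by row, directly from the definition:

```python
from itertools import product

def mpc_set(gens, p, c):
    """Union over j of the rows {c*x_j + sum of lambda_i * x_i for i > j},
    with each lambda_i ranging over the integers in [-p, p]."""
    S = set()
    for j in range(len(gens)):
        for lam in product(range(-p, p + 1), repeat=len(gens) - 1 - j):
            S.add(c * gens[j] + sum(l * x for l, x in zip(lam, gens[j + 1:])))
    return S
```

For generators $(100, 1)$ with $p = 2$, $c = 1$, this returns $\{1, 98, 99, 100, 101, 102\}$: an AP of length $2p + 1$ together with its common difference, matching the example above.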
Example. A $(2, p, 3)$-set has the form
\[ \{3 x_1 - p x_2, \cdots, 3 x_1, \cdots, 3 x_1 + p x_2\} \cup \{3 x_2\}. \]
The idea of an $(m, p, c)$-set is that we "iterate" this process many times, and so an $(m, p, c)$-set consists of "iterated APs" together with various patterns of their common differences.
Our proof strategy is to show that whenever we finitely-colour $\mathbb{N}$, we can always find a monochromatic $(m, p, c)$-set, and that given any matrix $A$ with the columns property and any $(m, p, c)$-set (for suitable $m$, $p$, $c$), there will always be a solution in there.
Proposition. Let $m, p, c \in \mathbb{N}$. Then whenever $\mathbb{N}$ is finitely coloured, there exists a monochromatic $(m, p, c)$-set.
Proof. It suffices to find an $(m, p, c)$-set all of whose rows are monochromatic, since when $\mathbb{N}$ is $k$-coloured, an $(m', p, c)$-set with $m' = mk + 1$ has $m$ monochromatic rows of the same colour by pigeonhole, and these rows contain a monochromatic $(m, p, c)$-set, by restricting to the elements where a lot of the $\lambda_i$ are zero. In this proof, whenever we say $(m, p, c)$-set, we mean one all of whose rows are monochromatic.
We will prove this by induction. We have a $k$-colouring of $[n]$, where $n$ is very very very large. This contains a $k$-colouring of
\[ B = \left\{ c, 2c, \cdots, \left\lfloor \frac{n}{c} \right\rfloor c \right\}. \]
Since $c$ is fixed, we can pick this so that $\frac{n}{c}$ is large. By van der Waerden, we find some monochromatic set
\[ A = \{c x_1 - N d, c x_1 - (N - 1) d, \cdots, c x_1 + N d\} \subseteq B, \]
with $N$ very very large. Since each element of $B$ is a multiple of $c$, we know that $c \mid d$. By induction, we may find an $(m - 1, p, c)$-set in the set $\{d, 2d, \cdots, M d\}$, where $M$ is large. We are now done by taking the $(m, p, c)$-set on generators $x_1, \cdots, x_m$, provided
\[ c x_1 + \sum_{i=2}^m \lambda_i x_i \in A \]
for all $\lambda_i \in [-p, p]$, which is easily seen to be the case, provided $N \geq (m - 1) p M$.
Note that the argument itself is quite similar to that in the $(1\ \lambda\ {-1})$ case.
Recall that Schur’s theorem said that whenever we finitely-colour
N
, we can
find a monochromatic
{x, y, x
+
y}
. More generally, for
x
1
, x
2
, ··· , x
m
N
, we
let
F S(x
1
, ··· , x
m
) =
(
X
iI
x
i
: I [n], I 6=
)
.
The existence of monochromatic $(m, 1, 1)$-sets gives us

Corollary (Finite sums theorem). For every fixed $m$, whenever we finitely-colour $\mathbb{N}$, there exist $x_1, \cdots, x_m$ such that $FS(x_1, \cdots, x_m)$ is monochromatic.
This is since an $(m, 1, 1)$-set contains more things than $FS(x_1, \cdots, x_m)$. This was discovered independently by a bunch of different people, including Folkman, Rado and Sanders.
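For small $m$, the finite sums set is easy to enumerate directly (a sketch with our own helper name):

```python
from itertools import combinations

def FS(xs):
    """All non-empty subset sums of the generators xs."""
    return {sum(sub) for r in range(1, len(xs) + 1)
                     for sub in combinations(xs, r)}
```

For instance `FS([1, 2, 4])` is $\{1, 2, \cdots, 7\}$, and an $(m, 1, 1)$-set on the same generators contains all of these elements (take each $\lambda_i \in \{0, 1\}$).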
Similarly, if we let
\[ FP(x_1, \cdots, x_m) = \left\{ \prod_{i \in I} x_i : I \subseteq [m], I \neq \emptyset \right\}, \]
then we can find these as well. For example, we can restrict attention to $\{2^n : n \in \mathbb{N}\}$, and use the finite sums theorem. This is the same idea as we had when we used van der Waerden to find geometric progressions.
But what if we want both? Can we have $FS(x_1, \cdots, x_m) \cup FP(x_1, \cdots, x_m)$ in the same colour class? The answer is actually not known! Even the case $m = 2$, i.e. finding a monochromatic set of the form $\{x, y, x + y, xy\}$, is open. Until mid 2016, we did not know if we could find $\{x + y, xy\}$ monochromatic (with $x, y > 2$).
To finish the proof of Rado's theorem, we need the following proposition:
Proposition. If $A$ is a rational matrix with the columns property, then there are some $m, p, c \in \mathbb{N}$ such that $Ax = 0$ has a solution inside any $(m, p, c)$-set, i.e. all entries of the solution lie in the $(m, p, c)$-set.
In the case of a single equation, we reduced the general problem to the case of 3 variables only. Here we are going to do something similar: we will use the columns property to reduce the solution to something much smaller.
Proof. We again write
\[ A = \begin{pmatrix} c^{(1)} & c^{(2)} & \cdots & c^{(n)} \end{pmatrix}. \]
Re-ordering the columns of $A$ if necessary, we assume that we have $[n] = B_1 \cup \cdots \cup B_r$ such that $\max(B_s) < \min(B_{s+1})$ for all $s$, and we have
\[ \sum_{i \in B_s} c^{(i)} = \sum_{i \in B_1 \cup \cdots \cup B_{s-1}} q_{is} c^{(i)} \]
for some $q_{is} \in \mathbb{Q}$. These $q_{is}$ only depend on the matrix. In other words, we have
\[ \sum_i d_{is} c^{(i)} = 0, \]
where
\[ d_{is} = \begin{cases} -q_{is} & i \in B_1 \cup \cdots \cup B_{s-1} \\ 1 & i \in B_s \\ 0 & \text{otherwise.} \end{cases} \]
For a fixed $s$, if we scan these coefficients starting from $i = n$ and keep decreasing $i$, then the first non-zero coefficient we see is 1, which is good, because it looks like what we see in an $(m, p, c)$-set.
Now we try to write down a general solution with $r$ many free variables. Given $x_1, \cdots, x_r \in \mathbb{N}$, we look at
\[ y_i = \sum_{s=1}^r d_{is} x_s. \]
It is easy to check that $Ay = 0$, since
\[ \sum_i y_i c^{(i)} = \sum_i \sum_s d_{is} x_s c^{(i)} = \sum_s x_s \sum_i d_{is} c^{(i)} = 0. \]
Now take $m = r$, pick $c$ large enough such that $c d_{is} \in \mathbb{Z}$ for all $i, s$, and finally set $p = \max\{|c d_{is}| : i, s\}$.
Thus, if we consider the $(m, p, c)$-set on generators $(x_m, \cdots, x_1)$ and $y_i$ as defined above, then we have $Ay = 0$ and hence $A(cy) = 0$. Since $cy$ is integral and lies in the $(m, p, c)$-set, we are done!
We have thus proved Rado’s theorem.
Theorem (Rado's theorem). A matrix $A$ is partition regular iff it has the columns property.
So we have a complete characterization of all partition regular matrices.
Note that Rado’s theorem reduces Schur’s theorem, van der Waerden’s
theorem, finite sums theorem etc. to just checking if certain matrices have the
columns property, which are rather straightforward computations.
More interestingly, we can prove some less obvious “theoretical” results.
Corollary (Consistency theorem). If $A$, $B$ are partition regular in independent variables, then
\[ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \]
is partition regular. In other words, we can solve $Ax = 0$ and $By = 0$ simultaneously in the same colour class.

Proof. The matrix
\[ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \]
has the columns property if $A$ and $B$ do.
In fact, much much more is true.
Corollary. Whenever $\mathbb{N}$ is finitely-coloured, one colour class contains solutions to all partition regular systems!

Proof. Suppose not. Then we have $\mathbb{N} = D_1 \cup \cdots \cup D_k$ such that for each $D_i$, there is some partition regular matrix $A_i$ such that we cannot solve $A_i x = 0$ inside $D_i$. But this contradicts the fact that $\operatorname{diag}(A_1, A_2, \cdots, A_k)$ is partition regular (by applying the consistency theorem many times).
Where did this whole idea of the $(m, p, c)$-sets come from? The original proof by Rado didn't use $(m, p, c)$-sets, and this idea appeared only a while later, when we tried to prove a more general result.
In general, we call a set $D \subseteq \mathbb{N}$ partition regular if we can solve any partition regular system in $D$. Then we know that $\mathbb{N}$ and $2\mathbb{N}$ are partition regular sets, but $2\mathbb{N} + 1$ is not (because we can't solve $x + y = z$, say). Then what Rado's theorem says is that whenever we finitely partition $\mathbb{N}$, one piece of $\mathbb{N}$ is partition regular.
In the 1930's, Rado conjectured that there is nothing special about $\mathbb{N}$ to begin with: whenever we break up a partition regular set, one of the pieces is partition regular. This conjecture was proved by Deuber in the 1970s, who introduced the idea of the $(m, p, c)$-set.
It is not hard to check that $D$ is partition regular iff $D$ contains an $(m, p, c)$-set for every $m, p, c$. Then Deuber's proof involves showing that for all $m, p, c, k \in \mathbb{N}$, there exist $n, q, d \in \mathbb{N}$ such that any $k$-colouring of an $(n, q, d)$-set contains a monochromatic $(m, p, c)$-set. The proof is quite similar to how we found $(m, p, c)$-sets in the naturals, but instead of using van der Waerden's theorem, we need the Hales–Jewett theorem.
We end by mentioning an open problem in this area. Suppose $A$ is an $m \times n$ matrix that is not partition regular. That is, there is some $k$-colouring of $\mathbb{N}$ with no solution to $Ax = 0$ in a colour class. Can we find some bound $f(m, n)$ such that every such $A$ has a "bad" colouring with $k < f(m, n)$? This is an open problem, first conjectured by Rado, and we think the answer is yes.
What do we actually know about this problem? The answer is trivially yes for $f(1, 2)$, as there aren't many matrices of size $1 \times 2$, up to rescaling. It is a non-trivial theorem that $f(1, 3)$ exists, and in fact $f(1, 3) \leq 24$. We don't know anything more than that.