III Combinatorics (Full)

Part III — Combinatorics

Based on lectures by B. Bollobas

Notes taken by Dexter Chua

Michaelmas 2017

These notes are not endorsed by the lecturers, and I have modified them (often

significantly) after lectures. They are nowhere near accurate representations of what

was actually lectured, and in particular, all errors are almost surely mine.

What can one say about a collection of subsets of a finite set satisfying certain conditions

in terms of containment, intersection and union? In the past fifty years or so, a good

many fundamental results have been proved about such questions: in the course we

shall present a selection of these results and their applications, with emphasis on the

use of algebraic and probabilistic arguments.

The topics to be covered are likely to include the following:

– The de Bruijn–Erdős theorem and its extensions.

– The Graham–Pollak theorem and its extensions.

– The theorems of Sperner, EKR, LYMB, Katona, Frankl and Füredi.

– Isoperimetric inequalities: Kruskal–Katona, Harper, Bernstein, BTBT, and their applications.

– Correlation inequalities, including those of Harris, van den Berg and Kesten, and the Four Functions Inequality.

– Alon’s Combinatorial Nullstellensatz and its applications.

– LLLL and its applications.

Pre-requisites

The main requirement is mathematical maturity, but familiarity with the basic graph

theory course in Part II would be helpful.

Contents

1 Hall’s theorem

2 Sperner systems

3 The Kruskal–Katona theorem

4 Isoperimetric inequalities

5 Sum sets

6 Projections

7 Alon’s combinatorial Nullstellensatz

1 Hall’s theorem

We shall begin with a discussion of Hall’s theorem. Ideally, you’ve already met

it in IID Graph Theory, but we shall nevertheless go through it again.

Definition (Bipartite graph). We say $G = (X, Y; E)$ is a bipartite graph with bipartition $X$ and $Y$ if $(X \sqcup Y, E)$ is a graph such that every edge is between a vertex in $X$ and a vertex in $Y$.

We say such a bipartite graph is $(k, \ell)$-regular if every vertex in $X$ has degree $k$ and every vertex in $Y$ has degree $\ell$. A bipartite graph that is $(k, \ell)$-regular for some $k, \ell \geq 1$ is said to be biregular.

Definition (Complete matching). Let $G = (X, Y; E)$ be a bipartite graph with bipartition $X$ and $Y$. A complete matching from $X$ to $Y$ is an injection $f : X \to Y$ such that $x f(x)$ is an edge for every $x \in X$.

Hall’s theorem gives us a necessary and sufficient condition for the existence

of a complete matching. Let’s try to first come up with a necessary condition.

If there is a complete matching, then for any subset $S \subseteq X$, we certainly have $|\Gamma(S)| \geq |S|$, where $\Gamma(S)$ is the set of neighbours of $S$. Hall’s theorem says this is also sufficient.

Theorem (Hall, 1935). A bipartite graph $G = (X, Y; E)$ has a complete matching from $X$ to $Y$ if and only if $|\Gamma(S)| \geq |S|$ for all $S \subseteq X$.

This condition is known as Hall’s condition.

Proof. We may assume $G$ is edge-minimal satisfying Hall’s condition. We show that $G$ is a complete matching from $X$ to $Y$. For $G$ to be a complete matching, we need the following two properties:

(i) Every vertex in $X$ has degree 1.

(ii) Every vertex in $Y$ has degree 0 or 1.

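We first examine the second condition. Suppose $y \in Y$ is such that there exist edges $x_1 y, x_2 y \in E$. Then the minimality of $G$ implies there are sets $X_1, X_2 \subseteq X$ with $x_i \in X_i$ such that $|\Gamma(X_i)| = |X_i|$ and $x_i$ is the only neighbour of $y$ in $X_i$ (otherwise we could delete the edge $x_i y$ without violating Hall’s condition).

Now consider the set $X_1 \cap X_2$. We know $\Gamma(X_1 \cap X_2) \subseteq \Gamma(X_1) \cap \Gamma(X_2)$. Moreover, this is strict, as $y$ is in the RHS but not the LHS. So we have

$$|\Gamma(X_1 \cap X_2)| \leq |\Gamma(X_1) \cap \Gamma(X_2)| - 1.$$

But also, using Hall’s condition for the first and the last inequality,

$$\begin{aligned}
|X_1 \cap X_2| &\leq |\Gamma(X_1 \cap X_2)| \\
&\leq |\Gamma(X_1) \cap \Gamma(X_2)| - 1 \\
&= |\Gamma(X_1)| + |\Gamma(X_2)| - |\Gamma(X_1) \cup \Gamma(X_2)| - 1 \\
&= |X_1| + |X_2| - |\Gamma(X_1 \cup X_2)| - 1 \\
&\leq |X_1| + |X_2| - |X_1 \cup X_2| - 1 \\
&= |X_1 \cap X_2| - 1,
\end{aligned}$$

which is a contradiction.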

One then sees that the first condition is also satisfied: if $x \in X$ is a vertex, then the degree of $x$ certainly cannot be 0, or else $|\Gamma(\{x\})| < |\{x\}|$, and we see that $d(x)$ cannot be $> 1$, or else we can just remove an edge from $x$ without violating Hall’s condition.

We shall now describe some consequences of Hall’s theorem. They will

be rather straightforward applications, but we shall later see they have some

interesting consequences.

Let $\mathcal{A} = \{A_1, \ldots, A_m\}$ be a set system. All sets are finite. A set of distinct representatives of $\mathcal{A}$ is a set $\{a_1, \ldots, a_m\}$ of distinct elements $a_i \in A_i$.

Under what condition do we have a set of distinct representatives? If we have one, then for any $I \subseteq [m] = \{1, 2, \ldots, m\}$, we have

$$\left| \bigcup_{i \in I} A_i \right| \geq |I|.$$

We might hope this is sufficient.

Theorem. $\mathcal{A}$ has a set of distinct representatives iff for all $\mathcal{B} \subseteq \mathcal{A}$, we have

$$\left| \bigcup_{B \in \mathcal{B}} B \right| \geq |\mathcal{B}|.$$

This is an immediate consequence of Hall’s theorem.

Proof. Define a bipartite graph as follows: we let $X = \mathcal{A}$, and $Y = \bigcup_{i \in [m]} A_i$. Then draw an edge from $x$ to $A_i$ if $x \in A_i$. Then there is a complete matching of this graph iff $\mathcal{A}$ has a set of distinct representatives, and the condition in the theorem is exactly Hall’s condition. So we are done by Hall’s theorem.
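The correspondence between distinct representatives and matchings can be tried out on small examples. The following is a minimal Python sketch, not part of the lectures: `find_sdr` searches for a set of distinct representatives using the standard augmenting-path method for bipartite matching, and `hall_condition` checks the condition of the theorem by brute force. Both function names are made up for this illustration.

```python
from itertools import combinations

def find_sdr(sets):
    """Return a list rep with rep[i] in sets[i], all distinct, or None."""
    match = {}  # element -> index of the set it currently represents

    def augment(i, seen):
        # Try to give set i a representative, re-assigning others if needed.
        for x in sets[i]:
            if x in seen:
                continue
            seen.add(x)
            if x not in match or augment(match[x], seen):
                match[x] = i
                return True
        return False

    for i in range(len(sets)):
        if not augment(i, set()):
            return None
    rep = [None] * len(sets)
    for x, i in match.items():
        rep[i] = x
    return rep

def hall_condition(sets):
    """Brute-force check: every k of the sets have at least k elements in their union."""
    m = len(sets)
    for k in range(1, m + 1):
        for idx in combinations(range(m), k):
            union = set().union(*(sets[i] for i in idx))
            if len(union) < k:
                return False
    return True

good = [{1, 2}, {2, 3}, {1, 3}]
bad = [{1}, {2}, {1, 2}]
print(find_sdr(good), hall_condition(good))
print(find_sdr(bad), hall_condition(bad))
```

On these two examples the search succeeds exactly when the brute-force condition holds, as the theorem predicts. The brute-force check is exponential in the number of sets and is only meant for tiny inputs.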

Theorem. Let $G = (X, Y; E)$ be a bipartite graph such that $d(x) \geq d(y)$ for all $x \in X$ and $y \in Y$. Then there is a complete matching from $X$ to $Y$.

Proof. Let $d$ be such that $d(x) \geq d \geq d(y)$ for all $x \in X$ and $y \in Y$. For $S \subseteq X$ and $T \subseteq Y$, we let $e(S, T)$ be the number of edges between $S$ and $T$. Let $S \subseteq X$ and $T = \Gamma(S)$. Then we have

$$e(S, T) = \sum_{x \in S} d(x) \geq d|S|,$$

but on the other hand, we have

$$e(S, T) \leq \sum_{y \in T} d(y) \leq d|T|.$$

So, provided $d \geq 1$ (i.e. no vertex of $X$ is isolated), we find that $|T| \geq |S|$. So Hall’s condition is satisfied.

Corollary. If $G = (X, Y; E)$ is a $(k, \ell)$-regular bipartite graph with $1 \leq \ell \leq k$, then there is a complete matching from $X$ to $Y$.

Theorem. Let $G = (X, Y; E)$ be biregular and $A \subseteq X$. Then

$$\frac{|\Gamma(A)|}{|Y|} \geq \frac{|A|}{|X|}.$$

Proof. Suppose $G$ is $(k, \ell)$-regular. Then

$$k|A| = e(A, \Gamma(A)) \leq \ell|\Gamma(A)|.$$

Thus we have

$$\frac{|\Gamma(A)|}{|Y|} \geq \frac{k|A|}{\ell|Y|}.$$

On the other hand, we can count that

$$|E| = |X|k = |Y|\ell,$$

and so

$$\frac{k}{\ell} = \frac{|Y|}{|X|}.$$

So we are done.

Briefly, this says biregular graphs “expand”.

Corollary. Let $G = (X, Y; E)$ be biregular and let $|X| \leq |Y|$. Then there is a complete matching of $X$ into $Y$.

In particular, for any biregular graph, there is always a complete matching

from one side of the graph to the other.

Notation. Given a set $X$, we write $X^{(r)}$ for the set of all subsets of $X$ with $r$ elements, and similarly for $X^{(\geq r)}$ and $X^{(\leq r)}$. If $|X| = n$, then $|X^{(r)}| = \binom{n}{r}$.

Now given a set $X$ and two numbers $r < s$, we can construct a biregular graph $(X^{(r)}, X^{(s)}; E)$, where $A \in X^{(r)}$ is joined to $B \in X^{(s)}$ if $A \subseteq B$.

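Corollary. Let $1 \leq r < s \leq |X|$. Suppose $\left| \frac{|X|}{2} - r \right| \geq \left| \frac{|X|}{2} - s \right|$. Then there exists an injection $f : X^{(r)} \to X^{(s)}$ such that $A \subseteq f(A)$ for all $A \in X^{(r)}$.

If $\left| \frac{|X|}{2} - r \right| \leq \left| \frac{|X|}{2} - s \right|$, then there exists an injection $g : X^{(s)} \to X^{(r)}$ such that $A \supseteq g(A)$ for all $A \in X^{(s)}$.

Proof. Note that $\left| \frac{|X|}{2} - r \right| \leq \left| \frac{|X|}{2} - s \right|$ iff

$$\binom{|X|}{r} \geq \binom{|X|}{s}.$$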

2 Sperner systems

In the next few chapters, we are going to try to understand the power set $P(X)$ of a set. One particularly important structure of $P(X)$ is that it is a graded poset. A lot of the questions we ask can be formulated for arbitrary (graded) posets, but often we will only answer them for power sets, since that is what we are interested in.

Definition (Chain). A subset $C \subseteq S$ of a poset is a chain if any two of its elements are comparable.

Definition (Anti-chain). A subset $A \subseteq S$ is an anti-chain if no two of its elements are comparable.

Given a set $X$, the power set $P(X)$ of $X$ can be viewed as a Boolean lattice. This is a poset by saying $A < B$ if $A \subsetneq B$.

In general, there are many questions we can ask about a poset $S$. For example, we may ask what is the largest possible size of an anti-chain in $S$. While this is quite hard in general, we may be able to produce answers if we impose some extra structure on our posets. One particularly useful notion is that of a graded poset.

Definition (Graded poset). We say $S = (S, <)$ is a graded poset if we can write $S$ as a disjoint union

$$S = \coprod_{i=0}^{n} S_i$$

such that

– $S_i$ is an anti-chain; and

– $x < y$ iff there exist elements $x = z_i < z_{i+1} < \cdots < z_j = y$ such that $z_h \in S_h$.

Example. If $X$ is a set, $P(X)$ is a graded poset with $S_i = X^{(i)}$.

If we want to obtain as large an anti-chain as possible, then we might try $X^{(i)}$ with $i = \lfloor n/2 \rfloor$. But is this actually the largest possible? Or can we construct some funny-looking anti-chain that is even larger? Sperner says no.

Theorem (Sperner, 1928). For $|X| = n$, the maximal size of an antichain in $P(X)$ is $\binom{n}{\lfloor n/2 \rfloor}$, witnessed by $X^{(\lfloor n/2 \rfloor)}$.

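Proof. If $C$ is a chain and $A$ is an antichain, then $|A \cap C| \leq 1$. So it suffices to partition $P(X)$ into

$$m = \max_{0 \leq i \leq n} \binom{n}{i} = \binom{n}{\lfloor n/2 \rfloor} = \binom{n}{\lceil n/2 \rceil}$$

many chains.

We can do so using the injections constructed at the end of the previous section. For $i > \lfloor n/2 \rfloor$, we can construct injections $f_i : X^{(i)} \to X^{(i-1)}$ such that $f_i(A) \subseteq A$ for all $A$. By chaining these together, we partition $X^{(\geq \lfloor n/2 \rfloor)}$ into $m$ chains, each starting in $X^{(\lfloor n/2 \rfloor)}$.

Similarly, we can partition $X^{(\leq \lfloor n/2 \rfloor)}$ into $m$ chains with each chain ending in $X^{(\lfloor n/2 \rfloor)}$. Then glue them together.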

Another way to prove this result is to provide an alternative measure on how

large an antichain can be, and this gives a stronger result.

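Theorem (LYM inequality). Let $\mathcal{A}$ be an antichain in $P(X)$ with $|X| = n$. Then

$$\sum_{r=0}^{n} \frac{|\mathcal{A} \cap X^{(r)}|}{\binom{n}{r}} \leq 1.$$

In particular, $|\mathcal{A}| \leq \max_r \binom{n}{r} = \binom{n}{\lfloor n/2 \rfloor}$, as we already know.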

Proof. A chain $C_0 \subseteq C_1 \subseteq \cdots \subseteq C_m$ is maximal if it has $n + 1$ elements. Moreover, there are $n!$ maximal chains, since we start with the empty set and then, given $C_i$, we produce $C_{i+1}$ by picking one unused element and adding it to $C_i$.

For every maximal chain $C$, we have $|C \cap \mathcal{A}| \leq 1$. Moreover, every set of $k$ elements appears in $k!(n - k)!$ maximal chains, by a similar counting argument as above. So

$$\sum_{A \in \mathcal{A}} |A|!\,(n - |A|)! \leq n!.$$

Dividing by $n!$, the result follows.
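For very small $n$, both Sperner’s theorem and the LYM inequality can be checked exhaustively. The sketch below is only an illustration (the function names are invented here): it enumerates every family of subsets of $\{1, \ldots, n\}$, keeps the antichains, asserts the LYM sum is at most 1 using exact fractions, and records the largest antichain found.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def power_set(n):
    """All subsets of {1, ..., n} as frozensets."""
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(range(1, n + 1), r)]

def is_antichain(family):
    # For frozensets, a < b means a is a proper subset of b.
    return all(not (a < b or b < a) for a, b in combinations(family, 2))

def lym_sum(family, n):
    """Sum over the family of 1 / C(n, |A|), computed exactly."""
    return sum(Fraction(1, comb(n, len(a))) for a in family)

def largest_antichain(n):
    """Brute force over all 2^(2^n) families; feasible only for n <= 4."""
    subsets = power_set(n)
    best = 0
    for mask in range(1 << len(subsets)):
        family = [s for i, s in enumerate(subsets) if mask >> i & 1]
        if is_antichain(family):
            assert lym_sum(family, n) <= 1
            best = max(best, len(family))
    return best

for n in (1, 2, 3):
    print(n, largest_antichain(n), comb(n, n // 2))
```

The two printed numbers agree for each $n$. The search is doubly exponential, so $n = 4$ already takes noticeably longer and anything beyond that is out of reach for this approach.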

There are analogous results for posets more general than just $P(X)$. To formulate these results, we must introduce the following new terminology.

Definition (Shadow). Given $A \subseteq S_i$, the shadow at level $i - 1$ is

$$\partial A = \{x \in S_{i-1} : x < y \text{ for some } y \in A\}.$$

Definition (Downward-expanding poset). A graded poset $S = (S, <)$ is said to be downward-expanding if

$$\frac{|\partial A|}{|S_{i-1}|} \geq \frac{|A|}{|S_i|}$$

for all $A \subseteq S_i$.

We similarly define upward-expanding, and say a poset is expanding if it is upward or downward expanding.

Definition (Weight). The weight of a set $A \subseteq S$ is

$$w(A) = \sum_{i=0}^{n} \frac{|A \cap S_i|}{|S_i|}.$$

The theorem is that the LYM inequality holds in general for any downward expanding poset.

Theorem. If $S$ is downward expanding and $A$ is an anti-chain, then $w(A) \leq 1$. In particular, $|A| \leq \max_i |S_i|$.

Since each $S_i$ is an anti-chain, the largest anti-chain has size $\max_i |S_i|$.

Proof. Write $A_i = A \cap S_i$. We define the span of $A$ to be

$$\operatorname{span} A = \max_{A_j \neq \emptyset} j - \min_{A_i \neq \emptyset} i.$$

We do induction on $\operatorname{span} A$. If $\operatorname{span} A = 0$, then we are done. Otherwise, let $h = \max_{A_j \neq \emptyset} j$, and set $B_{h-1} = \partial A_h$. Then since $A$ is an anti-chain, we know $A_{h-1} \cap B_{h-1} = \emptyset$.

We set $A' = (A \setminus A_h) \cup B_{h-1}$. This is then another anti-chain, by the transitivity of $<$. We then have

$$w(A) = w(A') + w(A_h) - w(B_{h-1}) \leq w(A') \leq 1,$$

where the first inequality uses the downward-expanding hypothesis and the second is the induction hypothesis.

We may want to mimic our other proof of the fact that the largest size of an antichain in $P(X)$ is $\binom{n}{\lfloor n/2 \rfloor}$. This requires the notion of a regular poset.

Definition (Regular poset). We say a graded poset $(S, <)$ is regular if for each $i$, there exist $r_i, s_i$ such that if $x \in S_i$, then $x$ dominates $r_i$ elements at level $i - 1$, and is dominated by $s_i$ elements at level $i + 1$.

Proposition. An anti-chain in a regular poset has weight ≤ 1.

Proof. Let $M$ be the number of maximal chains of length $(n + 1)$, and for each $x \in S_k$, let $m(x)$ be the number of maximal chains through $x$. Then

$$m(x) = \prod_{i=1}^{k} r_i \prod_{i=k}^{n-1} s_i.$$

So if $x, y \in S_i$, then $m(x) = m(y)$.

Now since every maximal chain passes through a unique element in $S_i$, for each $x \in S_i$, we have

$$M = \sum_{x \in S_i} m(x) = |S_i|\, m(x).$$

This gives the formula

$$m(x) = \frac{M}{|S_i|}.$$

Now let $A$ be an anti-chain. Then $A$ meets each chain in $\leq 1$ elements. So we have

$$M = \sum_{\text{maximal chains}} 1 \geq \sum_{x \in A} m(x) = \sum_{i=0}^{n} |A \cap S_i| \cdot \frac{M}{|S_i|}.$$

So it follows that

$$\sum_{i=0}^{n} \frac{|A \cap S_i|}{|S_i|} \leq 1.$$

Let’s now turn to a different problem. Suppose $x_1, \ldots, x_n \in \mathbb{C}$, with each $|x_i| \geq 1$. Given $A \subseteq [n]$, we let

$$x_A = \sum_{i \in A} x_i.$$

We now seek the largest size of $\mathcal{A}$ such that $|x_A - x_B| < 1$ for all $A, B \in \mathcal{A}$. More precisely, we want to find the best choice of $x_1, \ldots, x_n$ and $\mathcal{A}$ so that $|\mathcal{A}|$ is as large as possible while satisfying the above condition.

If we are really lazy, then we might just choose $x_i = 1$ for all $i$. By taking $\mathcal{A} = [n]^{(\lfloor n/2 \rfloor)}$, we can obtain $|\mathcal{A}| = \binom{n}{\lfloor n/2 \rfloor}$.

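Erdős noticed this is the best bound if we require the $x_i$ to be real.

Theorem (Erdős, 1945). Let $x_i$ be all real, $|x_i| \geq 1$. For $A \subseteq [n]$, let

$$x_A = \sum_{i \in A} x_i.$$

Let $\mathcal{A} \subseteq P(n)$ be such that $|x_A - x_B| < 1$ for all $A, B \in \mathcal{A}$. Then $|\mathcal{A}| \leq \binom{n}{\lfloor n/2 \rfloor}$.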

Proof. We claim that we may assume $x_i \geq 1$ for all $i$. To see this, suppose we instead had $x_1 = -2$, say. Then whether or not $1 \in A$ determines whether $x_A$ should include $0$ or $-2$ in the sum. If we replace $x_1$ with $2$, then whether or not $1 \in A$ determines whether $x_A$ should include $0$ or $2$. So replacing $x_1$ with $2$ (and swapping whether $1 \in A$ for every $A \in \mathcal{A}$) just essentially shifts all terms by 2, which doesn’t affect the differences.

But if we assume that $x_i \geq 1$ for all $i$, then we are done, since $\mathcal{A}$ must be an anti-chain, for if $A, B \in \mathcal{A}$ and $A \subsetneq B$, then $x_B - x_A = x_{B \setminus A} \geq 1$.

Doing it for complex numbers is considerably harder. In 1970, Kleitman

found a gorgeous proof for every normed space. This involves the notion of a

symmetric decomposition. To motivate this, we first consider the notion of a

symmetric chain.

Definition (Symmetric chain). We say a chain $\mathcal{C} = \{C_i, C_{i+1}, \ldots, C_{n-i}\}$ is symmetric if $|C_j| = j$ for all $j$.

Theorem. $P(n)$ has a decomposition into symmetric chains.

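Proof. We prove by induction. In the case $n = 1$, we simply have to take $\{\emptyset, \{1\}\}$.

Now suppose $P(n - 1)$ has a symmetric chain decomposition $\mathcal{C}_1 \cup \cdots \cup \mathcal{C}_t$. Given a symmetric chain

$$\mathcal{C}_j = \{C_i, C_{i+1}, \ldots, C_{n-1-i}\},$$

we obtain two chains $\mathcal{C}_j^{(0)}, \mathcal{C}_j^{(1)}$ in $P(n)$ by

$$\begin{aligned}
\mathcal{C}_j^{(0)} &= \{C_i, C_{i+1}, \ldots, C_{n-1-i}, C_{n-1-i} \cup \{n\}\} \\
\mathcal{C}_j^{(1)} &= \{C_i \cup \{n\}, C_{i+1} \cup \{n\}, \ldots, C_{n-2-i} \cup \{n\}\}.
\end{aligned}$$

Note that if $|\mathcal{C}_j| = 1$, then $\mathcal{C}_j^{(1)} = \emptyset$, and we drop this. Under this convention, we note that every $A \in P(n)$ appears in exactly one $\mathcal{C}_j^{(\varepsilon)}$, and so we are done.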
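The inductive construction in this proof translates directly into a short recursion. The following Python sketch is an illustration only, with invented function names: it builds the chains for $P(n)$ from those of $P(n-1)$ by extending each chain at the top and shifting the rest by $\{n\}$, dropping the shifted chain when it is empty.

```python
from math import comb

def symmetric_chains(n):
    """Symmetric chain decomposition of P({1, ..., n}) as lists of frozensets."""
    if n == 1:
        return [[frozenset(), frozenset({1})]]
    chains = []
    for chain in symmetric_chains(n - 1):
        # C^(0): the old chain, extended by (its top set) union {n}.
        chains.append(chain + [chain[-1] | {n}])
        # C^(1): every set except the top one, with n added; dropped if empty.
        shifted = [c | {n} for c in chain[:-1]]
        if shifted:
            chains.append(shifted)
    return chains

def is_symmetric_chain(chain, n):
    """Sizes go up by one at each step, sets are nested, and sizes i, ..., n - i."""
    sizes = [len(c) for c in chain]
    consecutive = all(b == a + 1 for a, b in zip(sizes, sizes[1:]))
    nested = all(a < b for a, b in zip(chain, chain[1:]))
    return consecutive and nested and sizes[0] + sizes[-1] == n

n = 4
chains = symmetric_chains(n)
print(len(chains), comb(n, n // 2))
for chain in chains:
    print([sorted(c) for c in chain])
```

Every symmetric chain meets the middle level exactly once, so the number of chains printed equals $\binom{n}{\lfloor n/2 \rfloor}$, which gives another proof of Sperner’s theorem.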

We are not going to actually need the notion of symmetric chains in our proof. What we need is the “profile” of a symmetric chain decomposition. By a simple counting argument, we see that for $0 \leq i \leq \frac{n}{2}$, the number of chains with $n + 1 - 2i$ sets is

$$\ell(n, i) \equiv \binom{n}{i} - \binom{n}{i-1}.$$

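Theorem (Kleitman, 1970). Let $x_1, x_2, \ldots, x_n$ be vectors in a normed space with norm $\|x_i\| \geq 1$ for all $i$. For $A \in P(n)$, we set

$$x_A = \sum_{i \in A} x_i.$$

Let $\mathcal{A} \subseteq P(n)$ be such that $\|x_A - x_B\| < 1$ for all $A, B \in \mathcal{A}$. Then $|\mathcal{A}| \leq \binom{n}{\lfloor n/2 \rfloor}$.

This bound is indeed the best, since we can pick $x_i = x$ for some $\|x\| \geq 1$, and then we can pick $\mathcal{A} = [n]^{(\lfloor n/2 \rfloor)}$.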

Proof. Call $\mathcal{F} \subseteq P(n)$ sparse if $\|x_E - x_F\| \geq 1$ for all $E, F \in \mathcal{F}$, $E \neq F$. Note that if $\mathcal{F}$ is sparse, then $|\mathcal{F} \cap \mathcal{A}| \leq 1$. So if we can find a decomposition of $P(n)$ into $\binom{n}{\lfloor n/2 \rfloor}$ sparse sets, then we are done.

We call a partition $P(n) = \mathcal{F}_1 \cup \cdots \cup \mathcal{F}_t$ symmetric if the number of families with $n + 1 - 2i$ sets is $\ell(n, i)$, i.e. the “profile” is that of a symmetric chain decomposition.

Claim. P(n) has a symmetric decomposition into sparse families.

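We again induct on $n$. When $n = 1$, we can take $\{\emptyset, \{1\}\}$. Now suppose $\Delta_{n-1}$ is a symmetric decomposition of $P(n - 1)$ as $\mathcal{F}_1 \cup \cdots \cup \mathcal{F}_t$.

Given $\mathcal{F}_j$, we construct $\mathcal{F}_j^{(0)}$ and $\mathcal{F}_j^{(1)}$ “as before”. We pick some $D \in \mathcal{F}_j$, to be decided later, and we take

$$\begin{aligned}
\mathcal{F}_j^{(0)} &= \mathcal{F}_j \cup \{D \cup \{n\}\} \\
\mathcal{F}_j^{(1)} &= \{E \cup \{n\} : E \in \mathcal{F}_j \setminus \{D\}\}.
\end{aligned}$$

The resulting decomposition is certainly still symmetric. The question is whether it is sparse, and this is where the choice of $D$ comes in. The collection $\mathcal{F}_j^{(1)}$ is certainly still sparse, and we must pick a $D$ such that $\mathcal{F}_j^{(0)}$ is sparse.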

To do so, we use Hahn–Banach to obtain a linear functional $f$ such that $\|f\| = 1$ and $f(x_n) = \|x_n\| \geq 1$. We can then pick $D$ to maximize $f(x_D)$. Then we check that if $E \in \mathcal{F}_j$, then

$$f(x_{D \cup \{n\}} - x_E) = f(x_D) - f(x_E) + f(x_n).$$

By assumption, $f(x_n) \geq 1$ and $f(x_D) \geq f(x_E)$. So this is $\geq 1$. Since $\|f\| = 1$, it follows that $\|x_{D \cup \{n\}} - x_E\| \geq 1$.

3 The Kruskal–Katona theorem

For $\mathcal{A} \subseteq X^{(r)}$, recall we defined the lower shadow to be

$$\partial \mathcal{A} = \{B \in X^{(r-1)} : B \subseteq A \text{ for some } A \in \mathcal{A}\}.$$

The question we wish to understand is how small we can make $\partial \mathcal{A}$, relative to $\mathcal{A}$. Crudely, we can bound the size by

$$|\partial \mathcal{A}| \geq |\mathcal{A}| \frac{\binom{n}{r-1}}{\binom{n}{r}} = \frac{r}{n - r + 1} |\mathcal{A}|.$$

But surely we can do better than this. To do so, one reasonable strategy is to first produce some choice of $\mathcal{A}$ we think is optimal, and see how we can prove that it is indeed optimal.

To do so, let’s look at some examples.

Example. Take n = 6 and r = 3. We pick

A = {123, 456, 124, 256}.

Then we have

∂A = {12, 13, 23, 45, 46, 56, 14, 24, 25, 26},

and this has 10 elements.

But if we instead had

A = {123, 124, 134, 234},

then

∂A = {12, 13, 14, 23, 24, 34},

and this only has 6 elements, and this is much better.
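The two shadows in this example are easy to recompute mechanically. The helper names below (`parse`, `shadow`) are made up for this sketch; sets are written as strings of digits, as in the text.

```python
def parse(words):
    """Turn a string such as '123 456' into a list of frozensets of digits."""
    return [frozenset(int(ch) for ch in w) for w in words.split()]

def shadow(family):
    """Lower shadow: every set obtained by deleting one element from a member."""
    return {frozenset(a) - {x} for a in family for x in a}

def show(family):
    """Render a family in the compact digit notation, sorted for readability."""
    return sorted("".join(map(str, sorted(s))) for s in family)

spread_out = parse("123 456 124 256")
bunched = parse("123 124 134 234")
print(len(shadow(spread_out)), show(shadow(spread_out)))
print(len(shadow(bunched)), show(shadow(bunched)))
```

The first family has a shadow of size 10 and the second of size 6, matching the counts above.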

Intuitively, the second choice of $\mathcal{A}$ is better because the terms are “bunched” together.

More generally, we would expect that if we have $|\mathcal{A}| = \binom{k}{r}$, then the best choice should be $\mathcal{A} = [k]^{(r)}$, with $|\partial \mathcal{A}| = \binom{k}{r-1}$. For other choices of $|\mathcal{A}|$, perhaps a reasonable strategy is to find the largest $k$ such that $\binom{k}{r} < |\mathcal{A}|$, and then take $\mathcal{A}$ to be $[k]^{(r)}$ plus some elements. To give a concrete description of which extra elements to pick, our strategy is to define a total order on $[n]^{(r)}$, and say we should pick the initial segment of length $|\mathcal{A}|$.

This suggests the following proof strategy:

(i) Come up with a total order on $[n]^{(r)}$, or even $\mathbb{N}^{(r)}$, such that $[k]^{(r)}$ are initial segments for all $k$.

(ii) Construct some “compression” operators $P(\mathbb{N}^{(r)}) \to P(\mathbb{N}^{(r)})$ that push each element down the ordering without increasing $|\partial \mathcal{A}|$.

(iii) Show that the only subsets of $\mathbb{N}^{(r)}$ that are fixed by the compression operators are the initial segments.

There are two natural orders one can put on $[n]^{(r)}$:

– lex: We say $A < B$ if $\min A \Delta B \in A$.

– colex: We say $A < B$ if $\max A \Delta B \in B$.

Example. For $r = 3$, the elements of $X^{(3)}$ in colex order are

123, 124, 134, 234, 125, 135, 235, 145, 245, 345, 126, . . .

In fact, colex is an order on $\mathbb{N}^{(r)}$, and we see that the initial segment with $\binom{n}{r}$ elements is exactly $[n]^{(r)}$. So this is a good start.
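Both orders can be generated by sorting with a suitable key. In this hedged sketch (function names are invented), lex compares the elements of each set in increasing order, while colex compares them in decreasing order, so that the largest element of the symmetric difference decides the comparison.

```python
from itertools import combinations
from math import comb

def lex_key(a):
    # Smallest elements first: A < B in lex iff min(A delta B) is in A.
    return sorted(a)

def colex_key(a):
    # Largest elements first: A < B in colex iff max(A delta B) is in B.
    return sorted(a, reverse=True)

def r_sets(n, r, key):
    """All r-subsets of {1, ..., n}, sorted by the given key."""
    return sorted(combinations(range(1, n + 1), r), key=key)

def show(sets):
    return ", ".join("".join(map(str, s)) for s in sets)

print("colex:", show(r_sets(6, 3, colex_key)[:11]))
print("lex:  ", show(r_sets(6, 3, lex_key)[:11]))

# The first C(4, 3) sets of [6]^(3) in colex are exactly [4]^(3).
first = r_sets(6, 3, colex_key)[:comb(4, 3)]
print(sorted(first) == sorted(combinations(range(1, 5), 3)))
```

The colex line reproduces the list in the example. The lex line starts 123, 124, 125, 126, which already involves the element 6, illustrating why $[k]^{(r)}$ is an initial segment for colex but not for lex.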

If we believe that colex is indeed the right order to use, then we ought to construct some compression operators. For $i \neq j$, we define the $(i, j)$-compression as follows: for a set $A \in X^{(r)}$, we define

$$C_{ij}(A) = \begin{cases} (A \setminus \{j\}) \cup \{i\} & j \in A,\ i \notin A \\ A & \text{otherwise} \end{cases}$$

For a set system, we define

$$C_{ij}(\mathcal{A}) = \{C_{ij}(A) : A \in \mathcal{A}\} \cup \{A \in \mathcal{A} : C_{ij}(A) \in \mathcal{A}\}.$$

We can picture our universe of sets as follows:

[Figure: pairs of points $B \cup \{j\}$ (top) and $B \cup \{i\}$ (bottom).]

The set system $\mathcal{A}$ is some subset of all these points, and what we are doing is pushing everything down when possible.

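It is clear that we have $|C_{ij}(\mathcal{A})| = |\mathcal{A}|$. We further observe that

Lemma. We have

$$\partial C_{ij}(\mathcal{A}) \subseteq C_{ij}(\partial \mathcal{A}).$$

In particular, $|\partial C_{ij}(\mathcal{A})| \leq |\partial \mathcal{A}|$.

Given $\mathcal{A} \subseteq X^{(r)}$, we say $\mathcal{A}$ is left-compressed if $C_{ij}(\mathcal{A}) = \mathcal{A}$ for all $i < j$. Is this good enough?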

Of course initial segments are left-compressed. However, it turns out the converse is not true.

Example. {123, 124, 125, 126} is left-compressed, but not an initial segment.
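The $(i, j)$-compression and the left-compressed property can be checked directly on this example. The code below is a small sketch with made-up function names; `compress_family` follows the definition of the compression of a set system literally.

```python
from itertools import combinations

def parse(words):
    """Turn a string such as '123 124' into a set of frozensets of digits."""
    return {frozenset(int(ch) for ch in w) for w in words.split()}

def compress_set(a, i, j):
    """C_ij on a single set: replace j by i when j is in A and i is not."""
    if j in a and i not in a:
        return (a - {j}) | {i}
    return a

def compress_family(family, i, j):
    """C_ij on a set system: the images, plus the sets whose image is already present."""
    family = set(family)
    images = {compress_set(a, i, j) for a in family}
    kept = {a for a in family if compress_set(a, i, j) in family}
    return images | kept

def is_left_compressed(family, n):
    return all(compress_family(family, i, j) == set(family)
               for i, j in combinations(range(1, n + 1), 2))

family = parse("123 124 125 126")
initial_segment = parse("123 124 134 234")  # first four 3-sets in colex
print(is_left_compressed(family, 6), family == initial_segment)
print(is_left_compressed(parse("123 456 124 256"), 6))
```

The first line prints `True False`: the family is left-compressed without being the colex initial segment of its size. The spread-out family from the earlier example is not left-compressed.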

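So we want to come up with “more powerful” compressions. For $U, V \in X^{(s)}$ with $U \cap V = \emptyset$, we define a $(U, V)$-compression as follows: for $A \subseteq X$, we define

$$C_{UV}(A) = \begin{cases} (A \setminus V) \cup U & A \cap (U \cup V) = V \\ A & \text{otherwise} \end{cases}$$

Again, for $\mathcal{A} \subseteq X^{(r)}$, we can define

$$C_{UV}(\mathcal{A}) = \{C_{UV}(A) : A \in \mathcal{A}\} \cup \{A \in \mathcal{A} : C_{UV}(A) \in \mathcal{A}\}.$$

Again, $\mathcal{A}$ is $(U, V)$-compressed if $C_{UV}(\mathcal{A}) = \mathcal{A}$.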

This time the behaviour of the compression is more delicate.

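Lemma. Let $\mathcal{A} \subseteq X^{(r)}$ and $U, V \in X^{(s)}$, $U \cap V = \emptyset$. Suppose for all $u \in U$ there exists $v \in V$ such that $\mathcal{A}$ is $(U \setminus \{u\}, V \setminus \{v\})$-compressed. Then

$$\partial C_{UV}(\mathcal{A}) \subseteq C_{UV}(\partial \mathcal{A}).$$

Lemma. $\mathcal{A} \subseteq X^{(r)}$ is an initial segment of $X^{(r)}$ in colex if and only if it is $(U, V)$-compressed for all $U, V$ disjoint with $|U| = |V|$ and $\max V > \max U$.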

Proof. $\Rightarrow$ is clear. Suppose $\mathcal{A}$ is $(U, V)$-compressed for all such $U, V$. If $\mathcal{A}$ is not an initial segment, then there exist $B \in \mathcal{A}$ and $C \notin \mathcal{A}$ such that $C < B$. Then $\mathcal{A}$ is not $(C \setminus B, B \setminus C)$-compressed. A contradiction.

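Lemma. Given $\mathcal{A} \subseteq X^{(r)}$, there exists $\mathcal{B} \subseteq X^{(r)}$ such that $\mathcal{B}$ is $(U, V)$-compressed for all $|U| = |V|$, $U \cap V = \emptyset$, $\max V > \max U$, and moreover

$$|\mathcal{B}| = |\mathcal{A}|, \quad |\partial \mathcal{B}| \leq |\partial \mathcal{A}|. \quad (*)$$

Proof. Let $\mathcal{B}$ be such that

$$\sum_{B \in \mathcal{B}} \sum_{i \in B} 2^i$$

is minimal among those $\mathcal{B}$’s that satisfy $(*)$. We claim that this $\mathcal{B}$ will do. Indeed, if there exists $(U, V)$ such that $|U| = |V|$, $U \cap V = \emptyset$, $\max V > \max U$ and $C_{UV}(\mathcal{B}) \neq \mathcal{B}$, then pick such a pair with $|U|$ minimal. Then apply a $(U, V)$-compression, which is valid since given any $u \in U$ we can pick any $v \in V$ that is not $\max V$ to satisfy the requirements of the previous lemma. This decreases the sum, which is a contradiction.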

From these, we conclude that

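Theorem (Kruskal 1963, Katona 1968). Let $\mathcal{A} \subseteq X^{(r)}$, and let $\mathcal{C} \subseteq X^{(r)}$ be the initial segment with $|\mathcal{C}| = |\mathcal{A}|$. Then

$$|\partial \mathcal{A}| \geq |\partial \mathcal{C}|.$$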

We can now define the shadow function

$$\partial^{(r)}(m) = \min\{|\partial \mathcal{A}| : \mathcal{A} \subseteq X^{(r)}, |\mathcal{A}| = m\}.$$

This does not depend on the size of $X$ as long as $X$ is large enough to accommodate $m$ sets, i.e. $\binom{n}{r} \geq m$. It would be nice if we could concretely understand this function. So let’s try to produce some initial segments.

Essentially by definition, an initial segment is uniquely determined by the last element. So let’s look at some examples.

Example. Take $r = 4$. What is the size of the initial segment ending in 3479? We note that anything that ends in something at most 8 is less than 3479, and there are $\binom{8}{4}$ such elements. If you end in 9, then you are still fine if the second-to-last digit is at most 6, and there are $\binom{6}{3}$ such elements. Continuing, we find that there are

$$\binom{8}{4} + \binom{6}{3} + \binom{4}{2}$$

such elements.
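This count can be confirmed by brute force. The sketch below (the name `colex_rank` is invented here) enumerates all 4-subsets of $\{1, \ldots, 9\}$ and counts those that are at most 3479 in colex, so the element 3479 itself is included in the count.

```python
from itertools import combinations
from math import comb

def colex_key(a):
    # Largest elements first, so that max(A delta B) decides the comparison.
    return sorted(a, reverse=True)

def colex_rank(target):
    """Size of the colex initial segment ending in target (target included)."""
    r, top = len(target), max(target)
    # Anything at most target in colex has maximum at most max(target).
    return sum(1 for a in combinations(range(1, top + 1), r)
               if colex_key(a) <= colex_key(target))

print(colex_rank((3, 4, 7, 9)))
print(comb(8, 4) + comb(6, 3) + comb(4, 2))
```

Both lines print 96: there are 70 sets with largest element at most 8, then 20 sets of the form $\{a, b, c, 9\}$ with $c \leq 6$, then the 6 sets $\{a, b, 7, 9\}$ with $a, b \leq 4$.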

Given $m_r > m_{r-1} > \cdots > m_s \geq s$, we let $B^{(r)}(m_r, m_{r-1}, \ldots, m_s)$ be the initial segment ending in the element

$$m_r + 1,\ m_{r-1} + 1,\ \ldots,\ m_{s+1} + 1,\ m_s,\ m_s - 1,\ m_s - 2,\ \ldots,\ m_s - (s - 1).$$

This consists of the sets $\{a_1 < a_2 < \cdots < a_r\}$ such that there exists $j \in [s, r]$ with $a_i = m_i + 1$ for $i > j$, and $a_j \leq m_j$.

To construct an element in $B^{(r)}(m_r, \ldots, m_s)$, we need to first pick a $j$, and then select $j$ elements that are $\leq m_j$. Thus, we find that

$$|B^{(r)}(m_r, \ldots, m_s)| = \sum_{j=s}^{r} \binom{m_j}{j} = b^{(r)}(m_r, \ldots, m_s).$$

We see that this $B^{(r)}$ is indeed the initial segment in the colex order ending in that element. So we know that for all $m \in \mathbb{N}$, there is a unique sequence $m_r > m_{r-1} > \cdots > m_s \geq s$ such that $m = \sum_{j=s}^{r} \binom{m_j}{j}$.

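It is also not difficult to find the shadow of this set. After a bit of thinking, we see that it is given by

$$B^{(r-1)}(m_r, \ldots, m_s).$$

Thus, we find that

$$\partial^{(r)}\left( \sum_{i=s}^{r} \binom{m_i}{i} \right) = \sum_{i=s}^{r} \binom{m_i}{i-1},$$

and moreover every $m$ can be expressed in the form $\sum_{i=s}^{r} \binom{m_i}{i}$ for some unique choices of $m_i$.

In particular, we have

$$\partial^{(r)}\left( \binom{n}{r} \right) = \binom{n}{r-1}.$$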
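The binomial expansion of $m$ and the resulting value of the shadow function can be computed greedily, and for tiny parameters the result can be compared against an exhaustive search. The following is a sketch under those assumptions, with invented function names; the greedy choice takes the largest $m_r$ with $\binom{m_r}{r} \leq m$ and recurses on the remainder with $r - 1$.

```python
from itertools import combinations
from math import comb

def cascade(m, r):
    """Pairs (m_i, i) for i = r, r-1, ..., s with m = sum of C(m_i, i)."""
    terms = []
    while m > 0 and r > 0:
        k = r
        while comb(k + 1, r) <= m:
            k += 1
        # Now C(k, r) <= m < C(k + 1, r).
        terms.append((k, r))
        m -= comb(k, r)
        r -= 1
    return terms

def shadow_function(m, r):
    """Value of the shadow function read off from the expansion of m."""
    return sum(comb(k, i - 1) for k, i in cascade(m, r))

def shadow(family):
    return {frozenset(a) - {x} for a in family for x in a}

def brute_force_min_shadow(n, r, m):
    """Smallest shadow over all m-element families of r-subsets of {1..n}."""
    sets = [frozenset(c) for c in combinations(range(1, n + 1), r)]
    return min(len(shadow(f)) for f in combinations(sets, m))

print(cascade(96, 4))
for m in range(1, comb(5, 3) + 1):
    print(m, shadow_function(m, 3), brute_force_min_shadow(5, 3, m))
```

For $m = 96$ and $r = 4$ the expansion is $\binom{8}{4} + \binom{6}{3} + \binom{4}{2}$. In the table for $r = 3$ inside a 5-element ground set, the formula and the exhaustive minimum agree in every row, as Kruskal–Katona predicts.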



Since it might be slightly annoying to write $m$ in the form $\sum_{i=s}^{r} \binom{m_i}{i}$, Lovász provided another slightly more convenient bound.

Theorem (Lovász, 1979). If $\mathcal{A} \subseteq X^{(r)}$ with $|\mathcal{A}| = \binom{x}{r}$ for $x \geq 1$, $x \in \mathbb{R}$, then

$$|\partial \mathcal{A}| \geq \binom{x}{r-1}.$$

This is best possible if $x$ is an integer.

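Proof. Let

$$\begin{aligned}
\mathcal{A}_0 &= \{A \in \mathcal{A} : 1 \notin A\} \\
\mathcal{A}_1 &= \{A \in \mathcal{A} : 1 \in A\}.
\end{aligned}$$

For convenience, we write

$$\mathcal{A}_1 - 1 = \{A \setminus \{1\} : A \in \mathcal{A}_1\}.$$

We may assume $\mathcal{A}$ is $(i, j)$-compressed for all $i < j$. We induct on $r$ and then on $|\mathcal{A}|$. We have

$$|\mathcal{A}_0| = |\mathcal{A}| - |\mathcal{A}_1|.$$

We note that $\mathcal{A}_1$ is non-empty, as $\mathcal{A}$ is left-compressed. So $|\mathcal{A}_0| < |\mathcal{A}|$.

If $r = 1$ and $|\mathcal{A}| = 1$ then there is nothing to do.

Now observe that $\partial \mathcal{A}_0 \subseteq \mathcal{A}_1 - 1$, since if $A \in \mathcal{A}$, $1 \notin A$, and $B \subseteq A$ is such that $|A \setminus B| = 1$, then $B \cup \{1\} \in \mathcal{A}$ since $\mathcal{A}$ is left-compressed. So it follows that $|\partial \mathcal{A}_0| \leq |\mathcal{A}_1|$.

Suppose $|\mathcal{A}_1| < \binom{x-1}{r-1}$. Then

$$|\mathcal{A}_0| > \binom{x}{r} - \binom{x-1}{r-1} = \binom{x-1}{r}.$$

Therefore by induction, we have

$$|\partial \mathcal{A}_0| > \binom{x-1}{r-1}.$$

This is a contradiction, since $|\partial \mathcal{A}_0| \leq |\mathcal{A}_1|$. Hence $|\mathcal{A}_1| \geq \binom{x-1}{r-1}$. Hence we are done, since

$$|\partial \mathcal{A}| \geq |\partial \mathcal{A}_1| = |\mathcal{A}_1| + |\partial(\mathcal{A}_1 - 1)| \geq \binom{x-1}{r-1} + \binom{x-1}{r-2} = \binom{x}{r-1}.$$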

4 Isoperimetric inequalities

We are now going to ask a question similar to the one answered by Kruskal–Katona. Kruskal–Katona answered the question of how small $\partial \mathcal{A}$ can be among all $\mathcal{A} \subseteq X^{(r)}$ of fixed size. Clearly, we obtain the same answer if we sought to minimize the upper shadow instead of the lower. But what happens if we want to minimize both the upper shadow and the lower shadow? Or, more generally, if we allow $\mathcal{A} \subseteq P(X)$ to contain sets of different sizes, how small can the set of “neighbours” of $\mathcal{A}$ be?

Definition (Boundary). Let $G$ be a graph and $A \subseteq V(G)$. Then the boundary $b(A)$ is the set of all $x \in G$ such that $x \notin A$ but $x$ is adjacent to $A$.

Example. In the following graph

[Figure: a graph with some vertices coloured green and some coloured red.]

the boundary of the green vertices is the red vertices.

An isoperimetric inequality on $G$ is an inequality of the form

$$|b(A)| \geq f(|A|)$$

for all $A \subseteq G$. Of course, we could set $f \equiv 0$, but we would like to do better than that.

The “continuous version” of this problem is well-known. For example, in a plane, given a fixed area, the perimeter of the area is minimized if we pick the area to be a disc. Similarly, among subsets of $\mathbb{R}^3$ of a given volume, the solid sphere has the smallest surface area. Slightly more exotically, among subsets of $S^2$ of given area, the circular cap has smallest perimeter.

Before we proceed, we note the definition of a neighbourhood :

Definition (Neighbourhood). Let $G$ be a graph and $A \subseteq V(G)$. Then the neighbourhood of $A$ is $N(A) = A \cup b(A)$.

Of course, $|b(A)| = |N(A)| - |A|$, and it is often convenient to express and prove our isoperimetric inequalities in terms of the neighbourhood instead.

If we look at our continuous cases, then we observe that all our optimal figures are balls, i.e. they consist of all the points a distance at most $r$ from a point, for some $r$ and single point. We would hope that this pattern generalizes.

Of course, it would be a bit ambitious to hope that balls are optimal for all graphs. However, we can at least show that it is true for the graphs we care about, namely graphs obtained from power sets.

Definition (Discrete cube). Given a set $X$, we turn $P(X)$ into a graph as follows: join $x$ to $y$ if $|x \Delta y| = 1$, i.e. if $x = y \cup \{a\}$ for some $a \notin y$, or vice versa.

This is the discrete cube $Q_n$, where $n = |X|$.

Example. $Q_3$ looks like

[Figure: $Q_3$ drawn in levels, from top to bottom: 123; 12, 13, 23; 1, 2, 3; ∅.]

This looks like a cube! Indeed, if we identify each $x \in Q_n$ with the 0-1 sequence of length $n$ (e.g. $13 \mapsto 101000\cdots0$), or, in other words, its indicator function, then $Q_n$ is naturally identified with the unit cube in $\mathbb{R}^n$.

[Figure: $Q_3$ drawn as a cube with vertices ∅, 1, 2, 3, 12, 13, 23, 123.]

Note that in this picture, the topmost layer is the points that do have 3, and the bottom layer consists of those that do not have a 3, and we can make similar statements for the other directions.

Example. Take $Q_3$, and try to find a set $A$ of size 4 that has minimum boundary. There are two things we might try: we can take a slice, or we can take a ball. In this case, we see the ball is the best.

We can do more examples, and it appears that the ball $X^{(\leq r)}$ is the best all the time. So that might be a reasonable thing to try to prove. But what if we have $|A|$ such that $|X^{(\leq r)}| < |A| < |X^{(\leq r+1)}|$?

It is natural to think that we should pick an $A$ with $X^{(\leq r)} \subseteq A \subseteq X^{(\leq r+1)}$, so we set $A = X^{(\leq r)} \cup B$, where $B \subseteq X^{(r+1)}$. Such an $A$ is known as a Hamming ball.

What $B$ should we pick? Observe that

$$N(A) = X^{(\leq r+1)} \cup \partial^+ B.$$

So we want to pick $B$ to minimize the upper shadow. So by Kruskal–Katona, we know we should pick $B$ to be the initial segment in the lex order.

Thus, if we are told to pick 1000 points to minimize the boundary, we go up in levels, and in each level, we go up in lex.

Definition (Simplicial ordering). The simplicial ordering on $Q_n$ is defined by $x < y$ if either $|x| < |y|$, or $|x| = |y|$ and $x < y$ in lex.

Our aim is to show that the initial segments of the simplicial order minimize

the neighbourhood. Similar to Kruskal–Katona, a reasonable strategy would be

to prove it by compression.
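Before proving this, the claim can be tested exhaustively on a very small cube. The sketch below uses invented function names: it lists the simplicial order and, for every size $m$, compares the neighbourhood of the initial segment with the smallest neighbourhood over all $m$-element subsets of $Q_n$.

```python
from itertools import combinations

def simplicial_order(n):
    """Subsets of {1, ..., n}: smaller sets first, lex order within each size."""
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(range(1, n + 1), r)]

def neighbourhood(family, n):
    """N(A): A together with every vertex at Hamming distance 1 from A."""
    out = set(family)
    for x in family:
        for i in range(1, n + 1):
            out.add(x ^ {i})  # flip membership of i
    return out

def initial_segments_are_optimal(n):
    """Exhaustive check over all subsets of Q_n; feasible only for n <= 4."""
    order = simplicial_order(n)
    for m in range(len(order) + 1):
        best = min(len(neighbourhood(f, n)) for f in combinations(order, m))
        if len(neighbourhood(order[:m], n)) != best:
            return False
    return True

def name(s):
    return "".join(map(str, sorted(s))) or "∅"

print(", ".join(name(s) for s in simplicial_order(3)))
print(initial_segments_are_optimal(3))
```

For $n = 3$ the order is ∅, 1, 2, 3, 12, 13, 23, 123 and the check returns `True`. The same check for $n = 4$ runs over $2^{16}$ subsets and is slower but still feasible.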

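For $A \subseteq Q_n$, and $1 \leq i \leq n$, the $i$-sections of $A$ are $A_+^{(i)}, A_-^{(i)} \subseteq P(X \setminus \{i\})$ defined by

$$\begin{aligned}
A_-^{(i)} &= \{x \in A : i \notin x\} \\
A_+^{(i)} &= \{x \setminus \{i\} : x \in A,\ i \in x\}.
\end{aligned}$$

These are the top and bottom layers in the $i$ direction.

The $i$-compression (or co-dimension 1 compression) of $A$ is $C_i(A)$, defined by

$$\begin{aligned}
C_i(A)_+ &= \text{first } |A_+| \text{ elements of } P(X \setminus \{i\}) \text{ in simplicial order} \\
C_i(A)_- &= \text{first } |A_-| \text{ elements of } P(X \setminus \{i\}) \text{ in simplicial order.}
\end{aligned}$$

Example. Suppose we work in a discrete cube, where the original set is

[Figure: the original set $A$, not reproduced.]

The resulting set is then

[Figure: the compressed set $C_i(A)$, not reproduced.]

Clearly, we have $|C_i(A)| = |A|$, and $C_i(A)$ “looks more like” an initial segment in simplicial ordering than $A$ did.

We say $A$ is $i$-compressed if $C_i(A) = A$.

Lemma. For $A \subseteq Q_n$, we have $|N(C_i(A))| \leq |N(A)|$.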

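Proof. We have

$$|N(A)| = |N(A_+) \cup A_-| + |N(A_-) \cup A_+|.$$

Take $B = C_i(A)$. Then

$$\begin{aligned}
|N(B)| &= |N(B_+) \cup B_-| + |N(B_-) \cup B_+| \\
&= \max\{|N(B_+)|, |B_-|\} + \max\{|N(B_-)|, |B_+|\} \\
&\leq \max\{|N(A_+)|, |A_-|\} + \max\{|N(A_-)|, |A_+|\} \\
&\leq |N(A_+) \cup A_-| + |N(A_-) \cup A_+| \\
&= |N(A)|.
\end{aligned}$$

Here the second line holds because $B_\pm$ and their neighbourhoods are initial segments of the simplicial order, hence nested, and the first inequality is the induction hypothesis applied in $Q_{n-1}$.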

Since each compression moves us down in the simplicial order, we can keep

applying compressions, and show that

Lemma. For any $A \subseteq Q_n$, there is a compressed set $B \subseteq Q_n$ such that

$$|B| = |A|, \quad |N(B)| \leq |N(A)|.$$

Are we done? Does being compressed imply being an initial segment? No! For $n = 3$, we can take $\{\emptyset, 1, 2, 12\}$, which is obviously compressed, but is not an initial segment. To obtain the actual initial segment, we should replace 12 with 3.

[Figure: the cube $Q_3$ with the vertices ∅, 1, 2, 12 marked.]

For $n = 4$, we can take $\{\emptyset, 1, 2, 3, 4, 12, 13, 23\}$, which is again compressed but not an initial segment. It is an initial segment only if we replace 23 with 14.

[Figure: the cube $Q_4$, drawn as two copies of $Q_3$, with the vertices ∅, 1, 2, 3, 4, 12, 13, 23 marked.]

We notice that these two examples have a common pattern. The “swap” we have to perform to get to an initial segment is given by replacing an element with its complement, or equivalently, swapping something with the opposite diagonal element. This is indeed general.

Lemma. For each $n$, there exists a unique element $z \in Q_n$ such that $z^c$ is the successor of $z$.

Moreover, if $B \subseteq Q_n$ is compressed but not an initial segment, then $|B| = 2^{n-1}$ and $B$ is obtained from taking the initial segment of size $2^{n-1}$ and replacing $z$ with $z^c$.

Proof. For the first part, simply note that complementation is an order-reversing bijection $Q_n \to Q_n$, and $|Q_n|$ is even. So the $2^{n-1}$th element is the only such element $z$.

Now if $B$ is not an initial segment, then we can find some $x < y$ such that $x \notin B$ and $y \in B$. Since $B$ is compressed, it must be the case that for each $i$, there is exactly one of $x$ and $y$ that contains $i$. Hence $x = y^c$. Note that this is true for all $x < y$ such that $x \notin B$ and $y \in B$. So if we write out the simplicial order, then $B$ must look like

[Figure: the simplicial order written in a row, with $B$ containing everything before $x$, then omitting $x$, containing $y$, and containing nothing after $y$.]

since any $x \notin B$ such that $x < y$ must be given by $x = y^c$, and so there must be a unique such $x$, and similarly the other way round. So it must be the case that $y$ is the successor of $x$, and so $x = z$.

We observe that these anomalous compressed sets are worse off than the

initial segments (exercise!). So we deduce that

Theorem (Harper, 1967). Let $A \subseteq Q_n$, and let $C$ be the initial segment in the simplicial order with $|C| = |A|$. Then $|N(A)| \geq |N(C)|$. In particular,

$$|A| = \sum_{i=0}^{r} \binom{n}{i} \quad \text{implies} \quad |N(A)| \geq \sum_{i=0}^{r+1} \binom{n}{i}.$$

The edge isoperimetric inequality in the cube

Let $A \subseteq V$ be a subset of vertices in a graph $G = (V, E)$. Consider the edge boundary

$$\partial_e A = \{xy \in E : x \in A,\ y \notin A\}.$$

Given a graph $G$, and given the size of $A$, can we give a lower bound for the size of $\partial_e A$?

Example. Take $Q_3$. For the vertex isoperimetric inequality, our optimal solution with $|A| = 4$ was given by

[Figure: the cube $Q_3$ with the ball $\{\emptyset, 1, 2, 3\}$ marked.]

The edge boundary has size 6. However, if we just pick a slice, then the edge boundary has size 4 only.

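More generally, consider $Q_n = Q_{2k+1}$, and take the Hamming ball $B_k = X^{(\leq k)}$. Then
\[ \partial_e B_k = \{AB : A \subseteq B \subseteq X, |A| = k, |B| = k + 1\}. \]
So we have
\[ |\partial_e B_k| = \binom{2k+1}{k+1} \cdot (k + 1) \sim \frac{2^n \sqrt{n}}{\sqrt{2\pi}}. \]
However, if we pick the bottom face of $Q_n$, then $|A| = 2^{n-1}$ and $|\partial_e A| = 2^{n-1}$. This is much much better.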

More generally, it is not unreasonable to suppose that sub-cubes are always the best. For a $k$-dimensional sub-cube in $Q_n$, we have
\[ |\partial_e A| = 2^k (n - k). \]
If we want to prove this, and also further solve the problem for $|A|$ not a power of 2, then as our previous experience would suggest, we should define an order on $\mathcal{P}(X)$.

Definition (Binary order). The binary order on $Q_n \cong \mathcal{P}(X)$ is given by $x < y$ if $\max x \Delta y \in y$. Equivalently, define $\varphi: \mathcal{P}(X) \to \mathbb{N}$ by
\[ \varphi(x) = \sum_{i \in x} 2^i. \]
Then $x < y$ if $\varphi(x) < \varphi(y)$.

The idea is that we avoid large elements. The first few elements in the order would look like
\[ \emptyset, 1, 2, 12, 3, 13, 23, 123, \ldots. \]
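To see the binary order concretely, one can sort the subsets of $\{1, \ldots, n\}$ by $\varphi$. This is only an illustrative sketch; the function names `phi`, `binary_order` and `label` are our own.

```python
def phi(x):
    """phi(x) = sum of 2^i over i in x, for x a set of positive integers."""
    return sum(2 ** i for i in x)

def binary_order(n):
    """All subsets of {1, ..., n}, listed in increasing order of phi."""
    subsets = [frozenset(i for i in range(1, n + 1) if mask >> (i - 1) & 1)
               for mask in range(2 ** n)]
    return sorted(subsets, key=phi)

def label(x):
    """Write a set as a string of its elements, e.g. {1, 3} -> '13'."""
    return "".join(str(i) for i in sorted(x)) or "∅"

print(", ".join(label(x) for x in binary_order(3)))
# prints: ∅, 1, 2, 12, 3, 13, 23, 123
```

Since $\varphi$ is injective, comparing $\varphi(x)$ and $\varphi(y)$ is the same as asking which of the two sets contains the largest element of $x \Delta y$.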

Theorem. Let $A \subseteq Q_n$ be a subset, and let $C \subseteq Q_n$ be the initial segment of length $|A|$ in the binary order. Then $|\partial_e C| \leq |\partial_e A|$.

Proof.

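We induct on $n$ using codimension-1 compressions. Recall that we previously defined the sets $A_\pm^{(i)}$. The $i$-compression of $A$ is the set $B \subseteq Q_n$ such that $|B_\pm^{(i)}| = |A_\pm^{(i)}|$, and $B_\pm^{(i)}$ are initial segments in the binary order. We set $D_i(A) = B$.

Observe that performing $D_i$ does not increase the edge boundary. Indeed, given any $A$, we have
\[ |\partial_e A| = |\partial_e A_+^{(i)}| + |\partial_e A_-^{(i)}| + |A_+^{(i)} \Delta A_-^{(i)}|. \]
Applying $D_i$ clearly does not increase any of those terms: the first two by the induction hypothesis, and the third because initial segments are nested. So we are happy.

Now note that if $A \neq D_i A$, then
\[ \sum_{x \in A} \sum_{j \in x} 2^j > \sum_{x \in D_i A} \sum_{j \in x} 2^j. \]
So after applying compressions finitely many times, we are left with a compressed set.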

We now hope that a compressed subset must be an initial segment, but this

is not quite true.

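Claim. If $A$ is compressed but not an initial segment, then
\[ A = B = \mathcal{P}(X \setminus \{n\}) \setminus \{123 \cdots (n - 1)\} \cup \{n\}. \]
By direct computation, we have
\[ |\partial_e B| = 2^{n-1} + 2(n - 2), \]
whereas the initial segment of the same size $2^{n-1}$ has edge boundary $2^{n-1}$, and so the initial segment is better. So we are done.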

The proof of the claim is the same as last time. Indeed, by definition, we can find some $x < y$ such that $x \notin A$ and $y \in A$. As before, for any $i$, it cannot be the case that both $x$ and $y$ contain $i$ or neither contain $i$, since $A$ is compressed. So $x = y^c$, and we are done as before.
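For small $n$ the edge isoperimetric inequality can be confirmed by exhaustive search. The sketch below assumes the bitmask encoding in which bit $i - 1$ stands for element $i$, so that the binary order is just the numeric order on masks and the initial segment of length $m$ is $\{0, 1, \ldots, m - 1\}$. All function names are hypothetical helpers of ours. It also computes the edge boundary of the exceptional compressed set directly, so that it can be compared with the initial segment of the same size.

```python
from itertools import combinations

def edge_boundary(A, n):
    """Number of edges of Q_n leaving A; sets are bitmasks, bit i-1 standing for element i."""
    A = set(A)
    return sum(1 for x in A for i in range(n) if (x ^ (1 << i)) not in A)

def initial_segment_is_optimal(n):
    """Compare the first m masks 0, ..., m-1 with every m-subset of Q_n (brute force)."""
    return all(
        edge_boundary(range(m), n)
        == min(edge_boundary(A, n) for A in combinations(range(2 ** n), m))
        for m in range(2 ** n + 1)
    )

def exceptional_set(n):
    """All subsets of {1, ..., n-1} except the full one, together with the set {n}."""
    top = 2 ** (n - 1) - 1   # bitmask of {1, ..., n-1}
    only_n = 2 ** (n - 1)    # bitmask of {n}
    return (set(range(2 ** (n - 1))) - {top}) | {only_n}

print(all(initial_segment_is_optimal(n) for n in (1, 2, 3)))  # prints: True
# Pairs (boundary of the exceptional set, boundary of the initial segment of equal size).
print([(edge_boundary(exceptional_set(n), n), edge_boundary(range(2 ** (n - 1)), n))
       for n in (3, 4, 5)])
```

The exhaustive check is only feasible for very small $n$, since it looks at all $2^{2^n}$ subsets of the cube.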

5 Sum sets

Let G be an abelian group, and A, B ⊆ G. Define

A + B = {a + b : a ∈ A, b ∈ B}.

For example, suppose $G = \mathbb{Z}$ and $A = \{a_1 < a_2 < \cdots < a_n\}$ and $B = \{b_1 < \cdots < b_m\}$. Surely, $|A + B| \leq nm$, and this bound can be achieved. Can we bound it from below? The elements
\[ a_1 + b_1, a_1 + b_2, \ldots, a_1 + b_m, a_2 + b_m, \ldots, a_n + b_m \]
are certainly distinct, since they are in strictly increasing order. So
\[ |A + B| \geq m + n - 1 = |A| + |B| - 1. \]
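Both bounds over $\mathbb{Z}$ are attained, and a two-line experiment makes this visible. In this sketch the particular sets are our own choice of examples: two arithmetic progressions with the same common difference hit the lower bound, while a "spread out" $B$ hits the upper bound $nm$.

```python
def sumset(A, B):
    """The sum set A + B = {a + b : a in A, b in B} of two sets of integers."""
    return {a + b for a in A for b in B}

A = {0, 1, 2, 3}
B_tight = {0, 1, 2}    # same common difference as A: few distinct sums
B_spread = {0, 4, 8}   # sums a + b determine (a, b) uniquely: many distinct sums

print(len(sumset(A, B_tight)), len(A) + len(B_tight) - 1)   # prints: 6 6
print(len(sumset(A, B_spread)), len(A) * len(B_spread))     # prints: 12 12
```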

What if we are working in a finite group? In general, we don’t have an order, so we can’t make the same argument. Indeed, the same inequality cannot always be true, since $|G + G| = |G|$. Slightly more generally, if $H$ is a subgroup of $G$, then $|H + H| = |H|$.

So let’s look at a group with no non-trivial proper subgroups. In other words, pick $G = \mathbb{Z}_p$.

Theorem (Cauchy–Davenport theorem). Let $A$ and $B$ be non-empty subsets of $\mathbb{Z}_p$ with $p$ a prime, and $|A| + |B| \leq p + 1$. Then
\[ |A + B| \geq |A| + |B| - 1. \]

Proof.

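We may assume $1 \leq |A| \leq |B|$. Apply induction on $|A|$. If $|A| = 1$, then there is nothing to do. So assume $|A| \geq 2$.

Since everything is invariant under translation, we may assume $0, a \in A$ with $a \neq 0$. Then $\{a, 2a, \ldots, pa\} = \mathbb{Z}_p$. Since $|A| \geq 2$ forces $|B| \leq p - 1$, we have $B \neq \mathbb{Z}_p$, so there exists $k \geq 0$ such that $ka \in B$ and $(k + 1)a \notin B$. By translating $B$, we may assume $0 \in B$ and $a \notin B$.

Now $0 \in A \cap B$, while $a \in A \setminus B$. Therefore we have
\[ 1 \leq |A \cap B| < |A|. \]
Hence, by the induction hypothesis applied to $A \cap B$ and $A \cup B$,
\[ |(A \cap B) + (A \cup B)| \geq |A \cap B| + |A \cup B| - 1 = |A| + |B| - 1. \]
Also, clearly
\[ (A \cap B) + (A \cup B) \subseteq A + B. \]
So we are done.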
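The Cauchy–Davenport bound can be checked exhaustively for small primes, and the same check shows that primality matters. This is a brute-force sketch with helper names of our own choosing; it says nothing about how one would prove the theorem.

```python
from itertools import combinations

def sumset_mod(A, B, p):
    """The sum set A + B inside Z_p."""
    return {(a + b) % p for a in A for b in B}

def nonempty_subsets(p):
    """All non-empty subsets of Z_p = {0, ..., p-1}, as tuples."""
    for r in range(1, p + 1):
        yield from combinations(range(p), r)

def cauchy_davenport_holds(p):
    """Check |A + B| >= |A| + |B| - 1 for all non-empty A, B with |A| + |B| <= p + 1."""
    return all(len(sumset_mod(A, B, p)) >= len(A) + len(B) - 1
               for A in nonempty_subsets(p)
               for B in nonempty_subsets(p)
               if len(A) + len(B) <= p + 1)

print(cauchy_davenport_holds(5), cauchy_davenport_holds(7))  # prints: True True
print(cauchy_davenport_holds(4))                             # prints: False
```

For the modulus 4 the inequality fails because of the subgroup $H = \{0, 2\}$, for which $|H + H| = |H| = 2 < 3$.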

Corollary. Let $A_1, \ldots, A_k$ be non-empty subsets of $\mathbb{Z}_p$ such that
\[ \sum_{i=1}^k |A_i| \leq p + k - 1. \]
Then
\[ |A_1 + \cdots + A_k| \geq \sum_{i=1}^k |A_i| - k + 1. \]

What if we don’t take sets, but sequences? Let $a_1, \ldots, a_m \in \mathbb{Z}_n$. What $m$ do we need to take to guarantee that there are elements that sum to 0? By the pigeonhole principle, $m \geq n$ suffices. Indeed, consider the sequence
\[ a_1, a_1 + a_2, a_1 + a_2 + a_3, \cdots, a_1 + \cdots + a_n. \]
If they are all distinct, then one of them must be zero, and so we are done. If they are not distinct, then there must be $k < k'$ such that
\[ a_1 + \cdots + a_k = a_1 + \cdots + a_{k'}. \]
So it follows that
\[ a_{k+1} + \cdots + a_{k'} = 0. \]
So in fact we can even require the elements we sum over to be consecutive. On the other hand, $m \geq n$ is also necessary, since we can take $a_i = 1$ for all $i$.
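The pigeonhole argument is constructive: scanning the partial sums and remembering where each residue was first seen finds a consecutive block summing to zero. A minimal sketch follows; the function name `zero_sum_block` is ours.

```python
def zero_sum_block(a, n):
    """Return (i, j) with a[i:j] non-empty and summing to 0 mod n, or None if no such block.

    When len(a) >= n the pigeonhole principle guarantees that a block is found.
    """
    seen = {0: 0}   # residue of a partial sum -> number of terms in that partial sum
    total = 0
    for j, x in enumerate(a, start=1):
        total = (total + x) % n
        if total in seen:
            # Two partial sums agree, so the terms between them sum to zero.
            return seen[total], j
        seen[total] = j
    return None

print(zero_sum_block([3, 1, 4, 1, 5], 5))   # prints: (1, 3), i.e. the block [1, 4]
print(zero_sum_block([1, 1, 1, 1], 5))      # prints: None (only n - 1 terms, all equal to 1)
```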

We can tackle a harder question, where we require that the sum of a fixed

number of things vanishes.

Theorem (Erdős–Ginzburg–Ziv). Let $a_1, \ldots, a_{2n-1} \in \mathbb{Z}_n$. Then there exists $I \in [2n - 1]^{(n)}$ such that
\[ \sum_{i \in I} a_i = 0 \]
in $\mathbb{Z}_n$.

Proof. First consider the case $n = p$ is a prime. Write
\[ 0 \leq a_1 \leq a_2 \leq \cdots \leq a_{2p-1} < p. \]
If $a_i = a_{i+p-1}$ for some $i$, then there are $p$ terms that are the same, and so we are done by adding them up. Otherwise, set $A_i = \{a_i, a_{i+p-1}\}$ for $i = 1, \ldots, p - 1$, and $A_p = \{a_{2p-1}\}$, then $|A_i| = 2$ for $i = 1, \ldots, p - 1$ and $|A_p| = 1$. Hence we know
\[ |A_1 + \cdots + A_p| \geq (2(p - 1) + 1) - p + 1 = p. \]
Thus, every element in $\mathbb{Z}_p$ is a sum of some $p$ of our terms, and in particular 0 is.

In general, suppose $n$ is not a prime. Write $n = mp$, where $p$ is a prime and $m > 1$. By induction, for every $2m - 1$ terms, we can find $m$ terms whose sum is a multiple of $m$.

Select disjoint $S_1, S_2, \ldots, S_{2p-1} \in [2n - 1]^{(m)}$ such that
\[ \sum_{j \in S_i} a_j = m b_i. \]
This can be done because after selecting, say, $S_1, \ldots, S_{2p-2}$, we have
\[ (2n - 1) - (2p - 2)m = 2m - 1 \]
elements left, and so we can pick the next one.

We are essentially done, because we can pick $j_1, \ldots, j_p$ such that $\sum_{k=1}^p b_{j_k}$ is a multiple of $p$. Then
\[ \sum_{k=1}^p \sum_{j \in S_{j_k}} a_j \]
is a sum of $mp = n$ terms whose sum is a multiple of $mp$.
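For small $n$ the Erdős–Ginzburg–Ziv theorem can be confirmed by brute force, and the same search shows that $2n - 1$ terms are needed. This sketch simply tries all $n$-element index sets; the function name `egz_subset` is our own.

```python
from itertools import combinations, product

def egz_subset(a, n):
    """Indices of n terms of the sequence a whose sum is 0 mod n, or None if there are none."""
    for I in combinations(range(len(a)), n):
        if sum(a[i] for i in I) % n == 0:
            return I
    return None

# Every sequence of length 2n - 1 over Z_n contains n terms summing to zero.
print(all(egz_subset(a, n) is not None
          for n in (1, 2, 3)
          for a in product(range(n), repeat=2 * n - 1)))   # prints: True

# With only 2n - 2 terms this can fail: take n - 1 zeros and n - 1 ones.
n = 4
print(egz_subset([0] * (n - 1) + [1] * (n - 1), n))        # prints: None
```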

6 Projections

So far, we have been considering discrete objects only. For a change, let’s work

with something continuous.

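Let $K \subseteq \mathbb{R}^n$ be a bounded open set. For $A \subseteq [n]$, we set
\[ K_A = \{(x_i)_{i \in A} : \exists y \in K, y_i = x_i \text{ for all } i \in A\} \subseteq \mathbb{R}^A. \]
We write $|K_A|$ for the Lebesgue measure of $K_A$ as a subset of $\mathbb{R}^A$. The question we are interested in is given some of these $|K_A|$, can we bound $|K|$? In some cases, it is completely trivial.

Example. If we have a partition of $A_1 \cup \cdots \cup A_m = [n]$, then we have
\[ |K| \leq \prod_{i=1}^m |K_{A_i}|. \]

But, for example, in $\mathbb{R}^3$, can we bound $|K|$ given $|K_{12}|$, $|K_{13}|$ and $|K_{23}|$?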

It is clearly not possible if we only know, say $|K_{12}|$ and $|K_{13}|$. For example, we can consider the boxes
\[ \left(0, \frac{1}{n^2}\right) \times (0, n) \times (0, n). \]

Proposition. Let $K$ be a body in $\mathbb{R}^3$. Then
\[ |K|^2 \leq |K_{12}| |K_{13}| |K_{23}|. \]

This is actually quite hard to prove! However, given what we have done so far, it is natural to try to compress $K$ in some sense. Indeed, we know equality holds for a box, and if we can make $K$ look more like a box, then maybe we can end up with a proof.

For $K \subseteq \mathbb{R}^n$, its $n$-sections are the sets $K(x) \subseteq \mathbb{R}^{n-1}$ defined by
\[ K(x) = \{(x_1, \ldots, x_{n-1}) \in \mathbb{R}^{n-1} : (x_1, \ldots, x_{n-1}, x) \in K\}. \]

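Proof. Suppose first that each section of $K$ is a square, i.e.
\[ K(x) = (0, f(x)) \times (0, f(x)) \]
for all $x$ and some $f$. Then
\[ |K| = \int f(x)^2 \, dx. \]
Moreover,
\[ |K_{12}| = \left(\sup_x f(x)\right)^2 \equiv M^2, \quad |K_{13}| = |K_{23}| = \int f(x) \, dx. \]
So we have to show that
\[ \left(\int f(x)^2 \, dx\right)^2 \leq M^2 \left(\int f(x) \, dx\right)^2, \]
but this is trivial, because $f(x) \leq M$ for all $x$.

Let’s now consider what happens when we compress $K$. For the general case, define a new body $L \subseteq \mathbb{R}^3$ by setting its sections to be
\[ L(x) = \left(0, \sqrt{|K(x)|}\right) \times \left(0, \sqrt{|K(x)|}\right). \]
Then $|L| = |K|$, and observe that
\[ |L_{12}| \leq \sup_x |K(x)| \leq \left|\bigcup_x K(x)\right| = |K_{12}|. \]
To understand the other two projections, we introduce
\[ g(x) = |K(x)_1|, \quad h(x) = |K(x)_2|. \]
Now observe that
\[ |L(x)| = |K(x)| \leq g(x) h(x). \]
Since $L(x)$ is a square, it follows that $L(x)$ has side length $\leq g(x)^{1/2} h(x)^{1/2}$. So
\[ |L_{13}| = |L_{23}| \leq \int g(x)^{1/2} h(x)^{1/2} \, dx. \]
Since $\int g \, dx = |K_{13}|$ and $\int h \, dx = |K_{23}|$, and the sections of $L$ are squares, we want to show that
\[ \left(\int g^{1/2} h^{1/2} \, dx\right)^2 \leq \left(\int g \, dx\right) \left(\int h \, dx\right). \]
Observe that this is just the Cauchy–Schwarz inequality applied to $g^{1/2}$ and $h^{1/2}$. So we are done.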
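A discrete version of the proposition is easy to experiment with. In the sketch below we assume that $K$ is the union of open unit cubes based at a finite set of lattice points, so that $|K|$ and the areas of the three projections are just counts of points; the names `projection_sizes` and `satisfies_inequality` are hypothetical helpers of ours, and the random sets are only a sanity check, not a proof.

```python
import random

def projection_sizes(points):
    """Sizes of the projections of a finite set of lattice points onto the 12-, 13- and 23-planes."""
    p12 = {(x, y) for x, y, z in points}
    p13 = {(x, z) for x, y, z in points}
    p23 = {(y, z) for x, y, z in points}
    return len(p12), len(p13), len(p23)

def satisfies_inequality(points):
    """Check |K|^2 <= |K_12| |K_13| |K_23| for a union of unit cubes."""
    a, b, c = projection_sizes(points)
    return len(points) ** 2 <= a * b * c

random.seed(0)  # fixed seed so that the experiment is reproducible
grid = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
samples = [set(random.sample(grid, random.randint(1, len(grid)))) for _ in range(1000)]
print(all(satisfies_inequality(K) for K in samples))   # prints: True

# Equality holds for a box: 24^2 = 6 * 8 * 12.
box = {(x, y, z) for x in range(2) for y in range(3) for z in range(4)}
print(len(box) ** 2, projection_sizes(box))            # prints: 576 (6, 8, 12)
```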

Let’s try to generalize this.

Definition ((Uniform) cover). We say a family $A_1, \ldots, A_r \subseteq [n]$ covers $[n]$ if
\[ \bigcup_{i=1}^r A_i = [n], \]
and is a uniform $k$-cover if each $i \in [n]$ is in exactly $k$ many of the sets.

Example. With $n = 3$, the singletons $\{1\}, \{2\}, \{3\}$ form a 1-uniform cover, and so does $\{1\}, \{2, 3\}$. Also, $\{1, 2\}, \{1, 3\}$ and $\{2, 3\}$ form a uniform 2-cover. However, $\{1, 2\}$ and $\{2, 3\}$ do not form a uniform cover of $[3]$.

Note that we allow repetitions.

Example. {1}, {1}, {2, 3}, {2}, {3} is a 2-uniform cover of [3].
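Checking whether a family is a uniform cover only requires counting how often each element is covered. Here is a small sketch, with a family represented as a list of sets so that repetitions are allowed; the function name `uniform_cover_degree` is ours.

```python
from collections import Counter

def uniform_cover_degree(family, n):
    """Return k if the family (a list of subsets of [n]) is a uniform k-cover of [n], else None."""
    counts = Counter(i for A in family for i in A)
    if not set(counts) <= set(range(1, n + 1)):
        return None   # some set is not a subset of [n]
    degrees = {counts[i] for i in range(1, n + 1)}
    if len(degrees) == 1:
        k = degrees.pop()
        return k if k >= 1 else None
    return None

print(uniform_cover_degree([{1}, {2}, {3}], 3))              # prints: 1
print(uniform_cover_degree([{1, 2}, {1, 3}, {2, 3}], 3))     # prints: 2
print(uniform_cover_degree([{1, 2}, {2, 3}], 3))             # prints: None
print(uniform_cover_degree([{1}, {1}, {2, 3}, {2}, {3}], 3)) # prints: 2
```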

Theorem (Uniform cover inequality). If $A_1, \ldots, A_r$ is a uniform $k$-cover of $[n]$, then
\[ |K|^k \leq \prod_{i=1}^r |K_{A_i}|. \]

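Proof. Let $\mathcal{A}$ be a $k$-uniform cover of $[n]$. Note that $\mathcal{A}$ is a multiset. Write
\begin{align*}
\mathcal{A}_- &= \{A \in \mathcal{A} : n \notin A\}\\
\mathcal{A}_+ &= \{A \setminus \{n\} : A \in \mathcal{A}, n \in A\}
\end{align*}
We have $|\mathcal{A}_+| = k$, and $\mathcal{A}_+ \cup \mathcal{A}_-$ forms a $k$-uniform cover of $[n - 1]$.

Now note that if $K \subseteq \mathbb{R}^n$ and $n \notin A$, then
\[ |K_A| \geq |K(x)_A| \tag{1} \]
for all $x$. Also, if $n \in A$, then
\[ |K_A| = \int |K(x)_{A \setminus \{n\}}| \, dx. \tag{2} \]
In the previous proof, we used Cauchy–Schwarz. What we need here is Hölder’s inequality
\[ \int fg \, dx \leq \left(\int f^p \, dx\right)^{1/p} \left(\int g^q \, dx\right)^{1/q}, \]
where $\frac{1}{p} + \frac{1}{q} = 1$. Iterating this, we get
\[ \int f_1 \cdots f_k \, dx \leq \prod_{i=1}^k \left(\int f_i^k \, dx\right)^{1/k}. \]
Now to perform the proof, we induct on $n$. We are done if $n = 1$. Otherwise, given $K \subseteq \mathbb{R}^n$ and $n \geq 2$,
\begin{align*}
|K| &= \int |K(x)| \, dx\\
&\leq \int \prod_{A \in \mathcal{A}_-} |K(x)_A|^{1/k} \prod_{A \in \mathcal{A}_+} |K(x)_A|^{1/k} \, dx && \text{(by induction)}\\
&\leq \prod_{A \in \mathcal{A}_-} |K_A|^{1/k} \int \prod_{A \in \mathcal{A}_+} |K(x)_A|^{1/k} \, dx && \text{(by (1))}\\
&\leq \prod_{A \in \mathcal{A}_-} |K_A|^{1/k} \prod_{A \in \mathcal{A}_+} \left(\int |K(x)_A| \, dx\right)^{1/k} && \text{(by Hölder)}\\
&= \prod_{A \in \mathcal{A}_-} |K_A|^{1/k} \prod_{A \in \mathcal{A}_+} |K_{A \cup \{n\}}|^{1/k}. && \text{(by (2))}
\end{align*}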

This theorem is great, but we can do better. In fact,

Theorem (Box Theorem (Bollobás, Thomason)). Given a body $K \subseteq \mathbb{R}^n$, i.e. a non-empty bounded open set, there exists a box $L$ such that $|L| = |K|$ and $|L_A| \leq |K_A|$ for all $A \subseteq [n]$.

Of course, this trivially implies the uniform cover theorem. Perhaps more

surprisingly, we can deduce this from the uniform cover inequality.

To prove this, we first need a lemma.

Definition (Irreducible cover). A uniform $k$-cover is reducible if it is the disjoint union of two uniform covers. Otherwise, it is irreducible.

Lemma. There are only finitely many irreducible covers of [n].

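Proof. Let $\mathcal{A}$ and $\mathcal{B}$ be covers. We say $\mathcal{A} < \mathcal{B}$ if $\mathcal{A}$ is a proper “subset” of $\mathcal{B}$, i.e. $\mathcal{A} \neq \mathcal{B}$ and for each $A \subseteq [n]$, the multiplicity of $A$ in $\mathcal{A}$ is at most the multiplicity in $\mathcal{B}$.

Then note that the set of irreducible uniform $k$-covers forms an anti-chain (if $\mathcal{A} < \mathcal{B}$ are both uniform covers, then $\mathcal{B}$ is the disjoint union of $\mathcal{A}$ and the uniform cover $\mathcal{B} \setminus \mathcal{A}$), and observe that there cannot be an infinite anti-chain.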

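Proof of box theorem. For $\mathcal{A}$ an irreducible $k$-cover, we have
\[ |K|^k \leq \prod_{A \in \mathcal{A}} |K_A|. \]
Also,
\[ |K_A| \leq \prod_{i \in A} |K_{\{i\}}|. \]
Let $\{x_A : A \subseteq [n]\}$ be a minimal array with $x_A \leq |K_A|$ such that for each irreducible $k$-cover $\mathcal{A}$, we have
\[ |K|^k \leq \prod_{A \in \mathcal{A}} x_A \tag{1} \]
and moreover
\[ x_A \leq \prod_{i \in A} x_{\{i\}} \tag{2} \]
for all $A \subseteq [n]$. We know this exists since there are only finitely many inequalities to be satisfied, and we can just decrease the $x_A$’s one by one. Now again by finiteness, for each $x_A$, there must be at least one inequality involving $x_A$ on the right-hand side that is in fact an equality.

Claim. For each $i \in [n]$, there exists a uniform $k_i$-cover $\mathcal{C}_i$ containing $\{i\}$ with equality
\[ |K|^{k_i} = \prod_{A \in \mathcal{C}_i} x_A. \]

Indeed if $x_{\{i\}}$ occurs on the right of (1), then we are done. Otherwise, it occurs on the right of (2), and then there is some $A$ such that (2) holds with equality. Now there is some cover $\mathcal{A}$ containing $A$ such that (1) holds with equality. Then replace $A$ in $\mathcal{A}$ with $\{\{j\} : j \in A\}$, and we are done.

Now let
\[ \mathcal{C} = \bigcup_{i=1}^n \mathcal{C}_i, \quad \mathcal{C}' = \mathcal{C} \setminus \{\{1\}, \{2\}, \ldots, \{n\}\}, \quad k = \sum_{i=1}^n k_i. \]
Then, since $\mathcal{C}'$ is a uniform $(k - 1)$-cover,
\[ |K|^k = \prod_{A \in \mathcal{C}} x_A \geq |K|^{k-1} \prod_{i=1}^n x_{\{i\}}. \]
So we have
\[ |K| \geq \prod_{i=1}^n x_{\{i\}}. \]
But we of course also have the reverse inequality. So it must be the case that they are equal.

Finally, for each $A$, consider the cover $\{A\} \cup \{\{i\} : i \notin A\}$. Then dividing (1) by $\prod_{i \notin A} x_{\{i\}}$ and using $|K| = \prod_{i=1}^n x_{\{i\}}$ gives us
\[ \prod_{i \in A} x_{\{i\}} \leq x_A. \]
By (2), we have the reverse inequality. So we have
\[ x_A = \prod_{i \in A} x_{\{i\}} \]
for all $A$. So we are done by taking $L$ to be the box with side lengths $x_{\{i\}}$.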

Corollary. If $K$ is a union of translates of the unit cube, then for any (not necessarily uniform) $k$-cover $\mathcal{A}$, we have
\[ |K|^k \leq \prod_{A \in \mathcal{A}} |K_A|. \]
Here a $k$-cover is a cover where every element is covered at least $k$ times.

Proof. Observe that if $B \subseteq A$, then $|K_B| \leq |K_A|$. So we can reduce $\mathcal{A}$ to a uniform $k$-cover.

7 Alon’s combinatorial Nullstellensatz

Alon’s combinatorial Nullstellensatz is a seemingly unexciting result that has

surprisingly many useful consequences.

Theorem (Alon’s combinatorial Nullstellensatz). Let $\mathbb{F}$ be a field, and let $S_1, \ldots, S_n$ be non-empty finite subsets of $\mathbb{F}$ with $|S_i| = d_i + 1$. Let $f \in \mathbb{F}[X_1, \ldots, X_n]$ have degree $d = \sum_{i=1}^n d_i$, and let the coefficient of $X_1^{d_1} \cdots X_n^{d_n}$ be non-zero. Then $f$ is not identically zero on $S = S_1 \times \cdots \times S_n$.

Its proof follows from generalizing facts we know about polynomials in one variable. Here $R$ will always be a ring; $\mathbb{F}$ always a field, and $\mathbb{F}_q$ the unique field of order $q = p^n$. Recall the following result:

Proposition (Division algorithm). Let $f, g \in R[X]$ with $g$ monic. Then we can write
\[ f = hg + r, \]
where $\deg h \leq \deg f - \deg g$ and $\deg r < \deg g$.

Our convention is that $\deg 0 = -\infty$.

Let $X = (X_1, \ldots, X_n)$ be a sequence of variables, and write $R[X] = R[X_1, \ldots, X_n]$.

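Lemma. Let $f \in R[X]$, and for $i = 1, \ldots, n$, let $g_i(X_i) \in R[X_i] \subseteq R[X]$ be monic of degree $\deg g_i = \deg_{X_i} g_i = d_i$. Then there exist polynomials $h_1, \ldots, h_n, r \in R[X]$ such that
\[ f = \sum_{i=1}^n h_i g_i + r, \]
where
\begin{align*}
\deg h_i &\leq \deg f - d_i, & \deg_{X_i} r &\leq d_i - 1,\\
\deg_{X_i} h_i &\leq \deg_{X_i} f - d_i, & \deg_{X_i} r &\leq \deg_{X_i} f,\\
\deg_{X_j} h_i &\leq \deg_{X_j} f, & \deg r &\leq \deg f
\end{align*}
for all $i, j$.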

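Proof. Consider $f$ as a polynomial in $X_1$ with coefficients in $R[X_2, \ldots, X_n]$, then divide by $g_1$ using the division algorithm. So we write
\[ f = h_1 g_1 + r_1. \]
Then we have
\begin{align*}
\deg_{X_1} h_1 &\leq \deg_{X_1} f - d_1, & \deg_{X_1} r_1 &\leq d_1 - 1,\\
\deg h_1 &\leq \deg f - d_1, & \deg_{X_j} r_1 &\leq \deg_{X_j} f,\\
\deg_{X_j} h_1 &\leq \deg_{X_j} f, & \deg r_1 &\leq \deg f.
\end{align*}
Then repeat this with $f$ replaced by $r_1$, $g_1$ by $g_2$, and $X_1$ by $X_2$, and so on for the remaining variables.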

We also know that a polynomial of one variable of degree $n \geq 1$ over a field has at most $n$ zeroes.

Lemma. Let $S_1, \ldots, S_n$ be non-empty finite subsets of a field $\mathbb{F}$, and let $h \in \mathbb{F}[X]$ be such that $\deg_{X_i} h < |S_i|$ for $i = 1, \ldots, n$. Suppose $h$ is identically 0 on $S = S_1 \times \cdots \times S_n \subseteq \mathbb{F}^n$. Then $h$ is the zero polynomial.

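Proof. Let $d_i = |S_i| - 1$. We induct on $n$. If $n = 1$, then we are done. For $n \geq 2$, consider $h$ as a one-variable polynomial in $X_n$ with coefficients in $\mathbb{F}[X_1, \ldots, X_{n-1}]$. Then we can write
\[ h = \sum_{i=0}^{d_n} g_i(X_1, \ldots, X_{n-1}) X_n^i. \]
Fix $(x_1, \ldots, x_{n-1}) \in S_1 \times \cdots \times S_{n-1}$, and set $c_i = g_i(x_1, \ldots, x_{n-1}) \in \mathbb{F}$. Then $\sum_{i=0}^{d_n} c_i X_n^i$ vanishes on $S_n$, and has degree less than $|S_n|$, so it is the zero polynomial. So $c_i = g_i(x_1, \ldots, x_{n-1}) = 0$ for all $(x_1, \ldots, x_{n-1}) \in S_1 \times \cdots \times S_{n-1}$. So by induction, $g_i = 0$. So $h = 0$.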

Another fact we know about polynomials in one variable is that if $f \in \mathbb{F}[X]$ vanishes at distinct points $z_1, \ldots, z_n$, then $f$ is a multiple of $\prod_{i=1}^n (X - z_i)$.

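Lemma. For $i = 1, \ldots, n$, let $S_i$ be a non-empty finite subset of $\mathbb{F}$, and let
\[ g_i(X_i) = \prod_{s \in S_i} (X_i - s) \in \mathbb{F}[X_i] \subseteq \mathbb{F}[X]. \]
Then if $f \in \mathbb{F}[X]$ is identically zero on $S = S_1 \times \cdots \times S_n$, then there exist $h_i \in \mathbb{F}[X]$ with $\deg h_i \leq \deg f - |S_i|$ and
\[ f = \sum_{i=1}^n h_i g_i. \]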

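Proof. By the division algorithm, we can write
\[ f = \sum_{i=1}^n h_i g_i + r, \]
where $r$ satisfies $\deg_{X_i} r < \deg g_i = |S_i|$ for each $i$. But then $r$ vanishes on $S_1 \times \cdots \times S_n$, as both $f$ and the $g_i$ do. So $r = 0$ by the previous lemma.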

We finally get to Alon’s combinatorial Nullstellensatz.

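Theorem (Alon’s combinatorial Nullstellensatz). Let $S_1, \ldots, S_n$ be non-empty finite subsets of $\mathbb{F}$ with $|S_i| = d_i + 1$. Let $f \in \mathbb{F}[X]$ have degree $d = \sum_{i=1}^n d_i$, and let the coefficient of $X_1^{d_1} \cdots X_n^{d_n}$ be non-zero. Then $f$ is not identically zero on $S = S_1 \times \cdots \times S_n$.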

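Proof. Suppose for contradiction that $f$ is identically zero on $S$. Define $g_i(X_i)$ and $h_i$ as before such that
\[ f = \sum_{i=1}^n h_i g_i. \]
Since the coefficient of $X_1^{d_1} \cdots X_n^{d_n}$ is non-zero in $f$, it is non-zero in some $h_j g_j$. But that’s impossible, since
\[ \deg h_j \leq \sum_{i=1}^n d_i - \deg g_j = \sum_{i \neq j} d_i - 1, \]
and so every monomial of degree $\sum_i d_i$ in $h_j g_j$ is divisible by $X_j^{d_j + 1}$. In particular, $h_j g_j$ cannot contain a $X_1^{d_1} \cdots X_n^{d_n}$ term.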

Let’s look at some applications. Here $p$ is a prime, $q = p^k$, and $\mathbb{F}_q$ is the unique field of order $q$.

Theorem (Chevalley, 1935). Let $f_1, \ldots, f_m \in \mathbb{F}_q[X_1, \ldots, X_n]$ be such that
\[ \sum_{i=1}^m \deg f_i < n. \]
Then the $f_i$ cannot have exactly one common zero.

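Proof. Suppose not. We may assume that the common zero is $0 = (0, \ldots, 0)$. Define
\[ f = \prod_{i=1}^m (1 - f_i(X)^{q-1}) - \gamma \prod_{i=1}^n \prod_{s \in \mathbb{F}_q^\times} (X_i - s), \]
where $\gamma$ is chosen so that $f(0) = 0$, namely the inverse of $\left(\prod_{s \in \mathbb{F}_q^\times} (-s)\right)^n$.

Now observe that for any non-zero $x$, some $f_i(x)$ is non-zero, so $f_i(x)^{q-1} = 1$ and the first product vanishes. The second product vanishes as well, since some coordinate of $x$ is non-zero. So $f(x) = 0$.

Thus, we can set $S_i = \mathbb{F}_q$, and they satisfy the hypothesis of the theorem. In particular, the coefficient of $X_1^{q-1} \cdots X_n^{q-1}$ is $-\gamma \neq 0$, because the first product has degree at most $(q - 1) \sum_i \deg f_i < n(q - 1)$. However, $f$ vanishes on $\mathbb{F}_q^n$. This is a contradiction.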

It is possible to prove similar results without using the combinatorial Nullstellensatz. These results are often collectively referred to as Chevalley–Warning theorems.

Theorem (Warning). Let $f(X) = f(X_1, \ldots, X_n) \in \mathbb{F}_q[X]$ have degree $< n$. Then $N(f)$, the number of zeroes of $f$, is a multiple of $p$.

One nice trick in doing these things is that finite fields naturally come with an “indicator function”. Since the multiplicative group has order $q - 1$, we know that if $x \in \mathbb{F}_q$, then
\[ x^{q-1} = \begin{cases} 1 & x \neq 0\\ 0 & x = 0 \end{cases}. \]

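Proof. We have
\[ 1 - f(x)^{q-1} = \begin{cases} 1 & f(x) = 0\\ 0 & \text{otherwise} \end{cases}. \]
Thus, we know
\[ N(f) = \sum_{x \in \mathbb{F}_q^n} (1 - f(x)^{q-1}) = -\sum_{x \in \mathbb{F}_q^n} f(x)^{q-1} \in \mathbb{F}_q. \]
Further, we know that if $0 \leq k \leq q - 1$, then
\[ \sum_{x \in \mathbb{F}_q} x^k = \begin{cases} -1 & k = q - 1\\ 0 & \text{otherwise} \end{cases}. \]
So let’s write $f(x)^{q-1}$ as a linear combination of monomials. Each monomial has degree $< n(q - 1)$. So there is at least one $k$ such that the power of $X_k$ in that monomial is $< q - 1$. Then the sum over $X_k$ vanishes for this monomial. So each monomial contributes 0 to the sum. Hence $N(f) = 0$ in $\mathbb{F}_q$, i.e. $N(f)$ is a multiple of $p$.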
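Warning's theorem can be checked exhaustively over small prime fields. The sketch below restricts to $q = p$ prime, so that arithmetic in $\mathbb{F}_p$ is just arithmetic mod $p$; a polynomial is represented by its list of coefficients against a fixed list of exponent vectors, and all function names are hypothetical helpers of ours.

```python
from itertools import product

def monomials(n, max_deg):
    """Exponent vectors of all monomials in n variables of total degree <= max_deg."""
    return [e for e in product(range(max_deg + 1), repeat=n) if sum(e) <= max_deg]

def evaluate(coeffs, monos, x, p):
    """Value at the point x of the polynomial sum_e coeffs[e] * X^e, computed mod p."""
    total = 0
    for c, e in zip(coeffs, monos):
        term = c
        for xi, ei in zip(x, e):
            term = term * pow(xi, ei, p) % p   # pow(0, 0, p) is 1, as we want for X^0
        total = (total + term) % p
    return total

def zero_counts(p, n):
    """Number of zeros in F_p^n of every polynomial in n variables of total degree < n."""
    monos = monomials(n, n - 1)
    points = list(product(range(p), repeat=n))
    return [sum(evaluate(c, monos, x, p) == 0 for x in points)
            for c in product(range(p), repeat=len(monos))]

for p, n in [(2, 3), (3, 2), (5, 2)]:
    print(p, n, all(N % p == 0 for N in zero_counts(p, n)))   # each line ends in True

# The "indicator function" x^(p-1) on F_7:
print([pow(x, 6, 7) for x in range(7)])   # prints: [0, 1, 1, 1, 1, 1, 1]
```

The degree condition is needed: the polynomial $X_1 X_2$ over $\mathbb{F}_3$ has degree $2 = n$ and exactly 5 zeros.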

We can use Alon’s combinatorial Nullstellensatz to effortlessly prove some of

our previous theorems.

Theorem (Cauchy–Davenport theorem). Let $p$ be a prime and $A, B \subseteq \mathbb{Z}_p$ be non-empty subsets with $|A| + |B| \leq p + 1$. Then $|A + B| \geq |A| + |B| - 1$.

Proof. Suppose for contradiction that $A + B \subseteq C \subseteq \mathbb{Z}_p$, and $|C| = |A| + |B| - 2$. Let’s come up with a polynomial that encodes the fact that $C$ contains the sum $A + B$. We let
\[ f(X, Y) = \prod_{c \in C} (X + Y - c). \]
Then $f$ vanishes on $A \times B$, and $\deg f = |C|$.

To apply the theorem, we check that the coefficient of $X^{|A|-1} Y^{|B|-1}$ is
\[ \binom{|C|}{|A| - 1}, \]
which is non-zero in $\mathbb{Z}_p$, since $|C| < p$. This contradicts Alon’s combinatorial Nullstellensatz.

We can also use this to prove Erdős–Ginzburg–Ziv again.

Theorem (Erdős–Ginzburg–Ziv). Let $p$ be a prime and $a_1, \ldots, a_{2p-1} \in \mathbb{Z}_p$. Then there exists $I \in [2p - 1]^{(p)}$ such that
\[ \sum_{i \in I} a_i = 0 \in \mathbb{Z}_p. \]

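Proof. Define
\begin{align*}
f_1(X_1, \ldots, X_{2p-1}) &= \sum_{i=1}^{2p-1} X_i^{p-1},\\
f_2(X_1, \ldots, X_{2p-1}) &= \sum_{i=1}^{2p-1} a_i X_i^{p-1}.
\end{align*}
Since $\deg f_1 + \deg f_2 = 2p - 2 < 2p - 1$, Chevalley’s theorem tells us there cannot be exactly one common zero. But 0 is one common zero. So there must be another. Take this solution $x$, and let $I = \{i : x_i \neq 0\}$. Then $f_1(x) = 0$ is the same as saying $|I|$ is a multiple of $p$, hence $|I| = p$ as $0 < |I| < 2p$, and $f_2(x) = 0$ is the same as saying $\sum_{i \in I} a_i = 0$.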

We can also consider restricted sums. We set
\[ A \dot{+} B = \{a + b : a \in A, b \in B, a \neq b\}. \]

Example. If $n \neq m$, then
\begin{align*}
[n] \dot{+} [m] &= \{3, 4, \ldots, m + n\}\\
[n] \dot{+} [n] &= \{3, 4, \ldots, 2n - 1\}
\end{align*}
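The two identities in this example can be confirmed directly for small values. A minimal sketch, with `restricted_sumset` and `interval` being names of our own:

```python
def restricted_sumset(A, B):
    """The restricted sum set {a + b : a in A, b in B, a != b}."""
    return {a + b for a in A for b in B if a != b}

def interval(n):
    """The set [n] = {1, ..., n}."""
    return set(range(1, n + 1))

print(sorted(restricted_sumset(interval(3), interval(5))))   # prints: [3, 4, 5, 6, 7, 8]
print(sorted(restricted_sumset(interval(4), interval(4))))   # prints: [3, 4, 5, 6, 7]
```

In the second case the sum $2n$ is lost, because it can only be written as $n + n$, and likewise the sum $2 = 1 + 1$ is lost in both cases.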

From this example, we see that if $|A| \geq 2$, then $|A \dot{+} A|$ can be as small as $2|A| - 3$. In 1964, Erdős and Heilbronn conjectured the following:

Conjecture (Erdős–Heilbronn, 1964). If $2|A| \leq p + 3$, then $|A \dot{+} A| \geq 2|A| - 3$.

This remained open for 30 years, and was proved by Dias da Silva and Hamidoune. A much much simpler proof was given by Alon, Nathanson and Ruzsa in 1996.

Theorem. Let $A, B \subseteq \mathbb{Z}_p$ be such that $2 \leq |A| < |B|$ and $|A| + |B| \leq p + 2$. Then $|A \dot{+} B| \geq |A| + |B| - 2$.

The above example shows we cannot do better.

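Proof. Suppose not. Define
\[ f(X, Y) = (X - Y) \prod_{c \in C} (X + Y - c), \]
where $A \dot{+} B \subseteq C \subseteq \mathbb{Z}_p$ and $|C| = |A| + |B| - 3$. Then $f$ vanishes on $A \times B$.

We have $\deg f = |A| + |B| - 2$, and the coefficient of $X^{|A|-1} Y^{|B|-1}$ is
\[ \binom{|A| + |B| - 3}{|A| - 2} - \binom{|A| + |B| - 3}{|A| - 1} = \frac{|A| - |B|}{|B| - 1} \binom{|A| + |B| - 3}{|A| - 1} \neq 0, \]
since $|A| \neq |B|$ and $|A| + |B| - 3 < p$. Hence by Alon’s combinatorial Nullstellensatz, $f(x, y)$ is not identically zero on $A \times B$. A contradiction.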

Corollary (Erdős–Heilbronn conjecture). If $A, B \subseteq \mathbb{Z}_p$ are non-empty with $|A| + |B| \leq p + 3$, and $p$ is a prime, then $|A \dot{+} B| \geq |A| + |B| - 3$.

Proof. We may assume $2 \leq |A| \leq |B|$. Pick $a \in A$, and set $A' = A \setminus \{a\}$. Then
\[ |A \dot{+} B| \geq |A' \dot{+} B| \geq |A'| + |B| - 2 = |A| + |B| - 3. \]
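As with Cauchy–Davenport, the restricted-sum bound can be checked by brute force for small primes. This sketch enumerates all pairs of non-empty subsets of $\mathbb{Z}_p$; the helper names are ours, and the search is only practical for very small $p$.

```python
from itertools import combinations

def restricted_sumset_mod(A, B, p):
    """The restricted sum set {a + b : a in A, b in B, a != b} inside Z_p."""
    return {(a + b) % p for a in A for b in B if a != b}

def nonempty_subsets(p):
    """All non-empty subsets of Z_p = {0, ..., p-1}, as tuples."""
    for r in range(1, p + 1):
        yield from combinations(range(p), r)

def erdos_heilbronn_holds(p):
    """Check the bound |A|+|B|-3 for all non-empty A, B with |A| + |B| <= p + 3."""
    return all(len(restricted_sumset_mod(A, B, p)) >= len(A) + len(B) - 3
               for A in nonempty_subsets(p)
               for B in nonempty_subsets(p)
               if len(A) + len(B) <= p + 3)

print(erdos_heilbronn_holds(5), erdos_heilbronn_holds(7))   # prints: True True

# The bound is attained by an interval: here |A| = 3 and the restricted sum set has 2|A| - 3 elements.
print(sorted(restricted_sumset_mod((0, 1, 2), (0, 1, 2), 7)))   # prints: [1, 2, 3]
```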

Now consider the following problem: suppose we have a circular table $\mathbb{Z}_{2n+1}$. Suppose the host invites $n$ couples, and the host, being a terrible person, wants the $i$th couple to be a distance $d_i$ apart for some $1 \leq d_i \leq n$. Can this be done?

Theorem. If $2n + 1$ is a prime, then this can be done.

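Proof. We may wlog assume the host is at 0. Write $p = 2n + 1$. We want to partition $\mathbb{Z}_p \setminus \{0\}$ into $n$ pairs $\{x_i, x_i + d_i\}$. Consider the polynomial ring $\mathbb{Z}_p[X_1, \ldots, X_n] = \mathbb{Z}_p[X]$. We define
\[ f(X) = \prod_i X_i (X_i + d_i) \prod_{i < j} (X_i - X_j)(X_i + d_i - X_j)(X_i - X_j - d_j)(X_i + d_i - X_j - d_j). \]
We want to show this is not identically zero on $\mathbb{Z}_p^n$.

First of all, we have
\[ \deg f = 4 \binom{n}{2} + 2n = 2n^2. \]
So we are good. The coefficient of $X_1^{2n} \cdots X_n^{2n}$ is the same as that in
\[ \prod_i X_i^2 \prod_{i < j} (X_i - X_j)^4 = \prod_i X_i^2 \prod_{i \neq j} (X_i - X_j)^2 = \prod_i X_i^{2n} \prod_{i \neq j} \left(1 - \frac{X_i}{X_j}\right)^2. \]
Thus, we are looking for the constant term in
\[ \prod_{i \neq j} \left(1 - \frac{X_i}{X_j}\right)^2. \]
By a question on the example sheet, this is
\[ \binom{2n}{2, 2, \ldots, 2} \neq 0 \text{ in } \mathbb{Z}_p. \]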
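The proof above is non-constructive, but for small tables a seating can be found by backtracking. The sketch below assumes the host sits at 0 and looks for $x_1, \ldots, x_n$ such that the pairs $\{x_i, x_i + d_i\}$ partition the non-zero residues mod $p = 2n + 1$; the function name `seat_couples` is our own.

```python
from itertools import product

def seat_couples(d, p):
    """Find x with the pairs {x_i, x_i + d_i} (mod p) partitioning {1, ..., p-1}, or None."""
    n = len(d)
    free = set(range(1, p))   # seats not yet taken (seat 0 is the host's)
    x = []

    def place(i):
        if i == n:
            return True
        for xi in sorted(free):
            yi = (xi + d[i]) % p
            if yi in free:
                free.difference_update((xi, yi))
                x.append(xi)
                if place(i + 1):
                    return True
                x.pop()                 # undo the choice and try the next seat
                free.update((xi, yi))
        return False

    return x if place(0) else None

print(seat_couples((1, 2, 3), 7))
# Every choice of distances works when 2n + 1 is one of the primes 3, 5, 7:
print(all(seat_couples(d, 2 * n + 1) is not None
          for n in (1, 2, 3)
          for d in product(range(1, n + 1), repeat=n)))   # prints: True
```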

Our final example is as follows: suppose we are in $\mathbb{Z}_p$, and $a_1, \ldots, a_p$ and $c_1, \ldots, c_p$ are enumerations of the elements, and $b_i = c_i - a_i$. Then clearly we have $\sum b_i = 0$. Is the converse true? The answer is yes!

Theorem. If $b_1, \ldots, b_p \in \mathbb{Z}_p$ are such that $\sum b_i = 0$, then there exist enumerations $a_1, \ldots, a_p$ and $c_1, \ldots, c_p$ of the elements of $\mathbb{Z}_p$ such that for each $i$, we have $a_i + b_i = c_i$.

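Proof. It suffices to show that for all $(b_i)$, there are distinct $a_1, \cdots, a_{p-1}$ such that $a_i + b_i \neq a_j + b_j$ for all $i \neq j$. Indeed, the remaining elements $a_p$ and $c_p$ are then forced, and $\sum b_i = 0$ guarantees $a_p + b_p = c_p$. Consider the polynomial
\[ \prod_{i < j} (X_i - X_j)(X_i + b_i - X_j - b_j). \]
The degree is
\[ 2 \binom{p - 1}{2} = (p - 1)(p - 2). \]
We then inspect the coefficient of $X_1^{p-2} \cdots X_{p-1}^{p-2}$, and checking that this is non-zero is the same as above.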
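For small primes the final theorem can be confirmed by trying all enumerations. The sketch below is a plain exhaustive search, unrelated to the polynomial method used in the proof; the function name `find_enumerations` is ours.

```python
from itertools import permutations, product

def find_enumerations(b, p):
    """Find enumerations a, c of Z_p with a_i + b_i = c_i for all i, or None."""
    for a in permutations(range(p)):
        c = tuple((ai + bi) % p for ai, bi in zip(a, b))
        if len(set(c)) == p:   # c takes every value of Z_p exactly once
            return a, c
    return None

print(find_enumerations((1, 1, 3, 0, 0), 5))

# Every b in Z_5^5 with sum zero admits such a pair of enumerations.
print(all(find_enumerations(b, 5) is not None
          for b in product(range(5), repeat=5)
          if sum(b) % 5 == 0))   # prints: True
```

The condition $\sum b_i = 0$ cannot be dropped: for example, `find_enumerations((1, 0, 0), 3)` returns `None`.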