3Supersaturation
III Extremal Graph Theory
3 Supersaturation
Suppose we have a graph
G
with
e
(
G
)
> ex
(
n, F
). Then by definition, there is
at least one copy of
F
in
G
. But can we tell how many copies we have? It turns
out this is not too difficult to answer, and in fact we can answer the question for
any hypergraph.
Recall that an
`
-uniform hypergraph is a pair (
V, E
), where
E ⊆ V
(`)
. We
can define the extremal function of a class of hypergraphs F by
ex(n, F) = max{e(G) : |G| = n, G contains no f ∈ F}.
Again we are interested in the limiting density
π(F) = lim
n→∞
ex(n, F)
n
`
.
It is an easy exercise to show that this limit always exists. We solved it explicitly
for graphs explicitly previously, but we don’t really need the Erd¨os–Stone theorem
just to show that the limit exists.
The basic theorem in supersaturation is
Theorem
(Erd¨os–Simonovits)
.
Let
H
be some
`
-uniform hypergraph. Then
for all
ε >
0, there exists
δ
(
H, ε
) such that every
`
-uniform hypergraph
G
with
|G| = n and
e(G) > (π(H) + ε)
n
`
contains bδn
|H|
c copies of H.
Note that
n
|H|
is approximately the number of ways to choose
|H|
vertices
from n, so it’s the number of possible candidates for subgraphs H.
Proof.
For each
m
-set
M ⊆ V
(
G
), we let
G
[
M
] be the sub-hypergraph induced
by these vertices. Let the number of subsets
M
with
e
(
G
[
M
])
>
π(H) +
ε
2
m
`
be η
n
m
. Then we can estimate
(π(H) + ε)
n
`
≤ e(G) =
P
M
e(G[M])
n−`
m−`
≤
η
n
m
m
`
+ (1 − η)
n
m
π(H) +
ε
2
m
`
n−`
m−`
.
So if n > m, then
π(H) + ε ≤ η + (1 + η)
π(H) +
ε
2
.
So
η ≥
ε
2
1 − π(H) −
ε
2
> 0.
The point is that it is positive, and that’s all we care about.
We pick m large enough so that
ex(m, H) <
π(H) +
ε
2
m
`
.
Then H ⊆ G[M ] for at least η
n
m
choices of M. Hence G contains at least
η
n
m
n−|H|
m−|H|
=
η
n
|H|
m
|H|
copies of
H
, and we are done. (Our results hold when
n
is large enough, but we
can choose δ small enough so that δn
|H|
< 1 when n is small)
Let
k
P
(
G
) be the number of copies of
K
p
in
G
. Ramsey’s theorem tells us
k
p
(
G
) +
k
p
(
¯
G
)
>
0 if
|G|
is large (where
¯
G
is the complement of
G
). In the
simplest case where
p
= 3, it turns out we can count these monochromatic
triangles exactly.
Theorem (Lorden, 1961). Let G have degree sequence d
1
, . . . , d
n
. Then
k
3
(G) + k
3
(
¯
G) =
n
3
− (n − 2)e(G) +
n
X
i=1
d
i
2
.
Proof. The number of paths of length 2 in G and
¯
G is precisely
n
X
i=1
d
i
2
+
n − 1 − d
i
2
= 2
n
X
i=1
d
i
2
− 2(n − 2)e(G) + 3
n
3
,
since to find a path of length 2, we pick the middle vertex and then pick the
two edges. A complete or empty
K
3
contains 3 such paths; Other sets of three
vertices contain exactly 1 such path. Hence
n
3
+ 2(k
3
(G) + k
3
(
¯
G)) = number of paths of length 2.
Corollary (Goodman, 1959). We have
k
3
(G) + k
3
(
¯
G) ≥
1
24
n(n − 1)(n − 5).
In particular, the Ramsey number of a triangle is at most 6.
Proof. Let m = e(G). Then
k
3
(G) + k
3
(
¯
G) ≥
n
3
− (n − 2)m + n
2m/n
2
.
Then minimize over m.
This shows the minimum density of a monochromatic
K
3
in a red/blue
colouring of
K
n
(for
n
large) is
∼
1
4
. But if we colour edges monochromatic, then
1
4
is the probability of having a triangle being monochromatic. So the density
is achieved by a “random colouring”. Recall also that the best bounds on the
Ramsey numbers we have are obtained by random colourings. So we might think
the best way of colouring if we want to minimize the number of monochromatic
cliques is to do so randomly.
However, this is not true. While we do not know what the minimum density
of monochromatic
K
4
in a red/blue colouring of
K
n
, it is already known to be
<
1
33
(while
1
32
is what we expect from a random colouring). It is also known by
flag algebras to be >
1
35
. So we are not very far.
Corollary. For m = e(G) and n = |G|, we have
k
3
(G) ≥
m
3n
(4m − n
2
).
Proof. By Lorden’s theorem, we know
k
3
(G) + k
3
(
¯
G) =
n
3
− (n − 2)e(
¯
G) +
X
¯
d
i
2
,
where
¯
d
i
is the degree sequence in
¯
G. But
3k
3
(
¯
G) ≤
X
¯
d
i
2
,
since the sum is counting the number of paths of length 2. So we find that
k
3
(G) ≥
n
3
− (n − 2) ¯m −
2
3
n
2 ¯m/n
2
,
and ¯m =
n
2
− m.
Observe that equality is almost never attained. It is attained only for regular
graphs with no subgraphs looking like
So non-adjacency is an equivalence relation, so the graph is complete multi-partite
and regular. Thus it is T
r
(n) with r | n.
Theorem.
Let
G
be a graph. For any graph
F
, let
i
F
(
G
) be the number of
induced copies of
F
in
G
, i.e. the number of subsets
M ⊆ V
(
G
) such that
G[M]
∼
=
F . So, for example, i
K
p
(G) = k
p
(G).
Define
f(G) =
X
F
α
F
i
F
(G),
with the sum being over a finite collection of graphs
F
, each being complete
multipartite, with
α
F
∈ R
and
α
F
≥
0 if
F
is not complete. Then amongst
graphs of given order,
f
(
G
) is maximized on a complete multi-partite graph.
Moreover, if α
¯
K
3
> 0, then there are no other maxima.
Proof.
We may suppose
α
¯
K
3
>
0, because the case of
α
¯
K
3
= 0 follows from a
limit argument. Choose a graph
G
maximizing
f
and suppose
G
is not complete
multipartite. Then there exist non-adjacent vertices
x, y
whose neighbourhoods
X, Y differ.
There are four contributions to i
F
(G), coming from
(i) F’s that contain both x and y;
(ii) F’s that contain y but not x;
(iii) F’s that contain x but not y;
(iv) F’s that contain neither x nor y.
We may assume that the contribution from (iii)
≥
(ii), and if they are equal,
then |X| ≤ |Y |.
Consider what happens when we remove all edges between
y
and
Y
, and add
edges from y to everything in X. Clearly (iii) and (iv) are unaffected.
–
If (iii)
>
(ii), then after this move, the contribution of (ii) becomes equal to
the contribution of (iii) (which didn’t change), and hence strictly increased.
The graphs
F
that contribute to (i) are not complete, so
α
F
≥
0. Moreover,
since
F
is complete multi-partite, it cannot contain a vertex in
X
∆
Y
. So
making the move can only increase the number of graphs contributing to
(i), and each contribution is non-negative.
–
If (iii) = (ii) and
|X| ≤ |Y |
, then we make similar arguments. The
contribution to (ii) is unchanged this time, and we know the contribution
of (i) strictly increased, because the number of
¯
K
3
’s contributing to (i) is
the number of points not connected to x and y.
In both cases, the total sum increased.
Perhaps that theorem seemed like a rather peculiar one to prove. However,
it has some nice consequences. The following theorem relates
k
p
(
G
) with
k
r
(
G
)
for different p and r:
Theorem
(Bollob´as, 1976)
.
Let 1
≤ p ≤ r
, and for 0
≤ x ≤
n
p
, let
ψ
(
x
) be a
maximal convex function lying below the points
{(k
p
(T
q
(n)), k
r
(T
q
(n))) : q = r − 1, r, . . .} ∪ {(0, 0)}.
Let G be a graph of order n. Then
k
r
(G) ≥ ψ(k
p
(G)).
Proof. Let f(G) = k
p
(G) − ck
r
(G), where c > 0.
Claim. It is enough to show that f is maximized on a Tur´an graph for any c.
Indeed, suppose we plot out the values of (k
p
(T
q
(n)), k
r
(T
q
(n)):
k
p
(G)
k
r
(G)
k
p
(T
r−1
(n))
k
p
(T
r
(n))
k
r
(T
r
(n))
k
p
(T
r+1
(n))
k
r
(T
r+1
(n))
If the theorem doesn’t hold, then we can pick a
G
such that (
k
p
(
G
)
, k
r
(
G
))
lies below
ψ
. Draw a straight line through the point with slope parallel to
ψ
. This
has slope
1
c
>
0 for some
c
. The intercept on the
x
-axis is then
k
p
(
G
)
− ck
r
(
G
),
which would be greater than
f
(
any Tur´an graph
) by convexity, a contradiction.
Now the previous theorem immediately tells us
f
is maximized on some
complete multi-partite graph. Suppose this has
q
class, say of sizes
a
1
≤ a
2
≤
··· ≤ a
q
. It is easy to verify
q ≥ r −
1. In fact, we may assume
q ≥ r
, else the
maximum is on a Tur´an graph T
r−1
(n).
Then we can write
f(G) = a
1
a
q
A − ca
q
a
q
B + C,
where
A, B, C
are rationals depending only on
a
2
, . . . , a
q−1
and
a
1
+
a
q
(
A
and
B
count the number of ways to pick a
K
p
and
K
r
respectively in a way that
involves terms in both the first and last classes).
We wlog assume c is irrational. Hence a
q
a
q
A − ca
1
a
q
= a
1
a
q
(A − cB) 6= 0.
–
If
A−cB <
0, replace
a
1
and
a
q
by 0 and
a
1
+
a
q
. This would then increase
f, which is impossible.
–
If
A −cB >
0 and
a
1
≤ a
q
−
2, then we can replace
a
1
, a
q
by
a
1
+ 1
, a
q
−
1
to increase f.
Hence a
1
≥ a
q
− 1. So G = T
q
(n).