2 Hypothesis testing    IB Statistics

2.4 Tests of homogeneity, and connections to confidence intervals

2.4.1 Tests of homogeneity
Example. 150 patients were randomly allocated to three groups of 50 patients each. Two groups were given a new drug at different dosage levels, and the third group received a placebo. The responses were as shown in the table below.

            Improved   No difference   Worse   Total
Placebo         18           17          15      50
Half dose       20           10          20      50
Full dose       25           13          12      50
Total           63           40          47     150
Here the row totals are fixed in advance, in contrast to the last section, where the row totals were random variables.

For the above, we may be interested in testing H_0: the probability of "improved" is the same for each of the three treatment groups, and so are the probabilities of "no difference" and "worse", i.e. H_0 says that we have homogeneity down the rows.
In general, we have independent observations from r multinomial distributions, each of which has c categories, i.e. we observe an r × c table (n_ij), for i = 1, ··· , r and j = 1, ··· , c, where

    (N_i1, ··· , N_ic) ∼ multinomial(n_i+; p_i1, ··· , p_ic)

independently for each i = 1, ··· , r. We want to test
    H_0 : p_1j = p_2j = ··· = p_rj = p_j,   for j = 1, ··· , c,

and

    H_1 : the p_ij are unrestricted.
Under H_1, for any matrix of probabilities (p_ij),

    like((p_ij)) = ∏_{i=1}^r [ n_i+! / (n_i1! ··· n_ic!) ] p_i1^{n_i1} ··· p_ic^{n_ic},

and

    log like = constant + Σ_{i=1}^r Σ_{j=1}^c n_ij log p_ij.
Using Lagrangian methods, we find that p̂_ij = n_ij / n_i+.
Under H_0,

    log like = constant + Σ_{j=1}^c n_+j log p_j.
By Lagrangian methods, we have p̂_j = n_+j / n_++.
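The Lagrangian step can be spelled out. The following sketch (a standard computation, filled in here for completeness) derives both maximum likelihood estimates:

```latex
% Maximise the H_1 log-likelihood subject to \sum_j p_{ij} = 1 for each row i:
\mathcal{L} = \sum_{i=1}^{r}\sum_{j=1}^{c} n_{ij}\log p_{ij}
  + \sum_{i=1}^{r}\lambda_i\Bigl(1 - \sum_{j=1}^{c} p_{ij}\Bigr),
\qquad
\frac{\partial \mathcal{L}}{\partial p_{ij}}
  = \frac{n_{ij}}{p_{ij}} - \lambda_i = 0
  \;\Longrightarrow\; p_{ij} = \frac{n_{ij}}{\lambda_i}.
% Summing over j and using \sum_j p_{ij} = 1 forces \lambda_i = n_{i+},
% so \hat p_{ij} = n_{ij}/n_{i+}.  The same argument under H_0, with the
% single constraint \sum_j p_j = 1, gives \hat p_j = n_{+j}/n_{++}.
```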
Hence

    2 log Λ = 2 Σ_{i=1}^r Σ_{j=1}^c n_ij log( p̂_ij / p̂_j )
            = 2 Σ_{i=1}^r Σ_{j=1}^c n_ij log( n_ij / (n_i+ n_+j / n_++) ),

which is the same as what we had last time, when the row totals were unrestricted!
We have |Θ_1| = r(c − 1) and |Θ_0| = c − 1. So the degrees of freedom are r(c − 1) − (c − 1) = (r − 1)(c − 1), and under H_0, 2 log Λ is approximately χ²_{(r−1)(c−1)}. Again, it is exactly the same as what we had last time!
We reject H_0 if 2 log Λ > χ²_{(r−1)(c−1)}(α) for an approximate size α test.
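As an illustration (not part of the notes), the statistic 2 log Λ and its degrees of freedom can be computed directly from a table of counts; the function name `homogeneity_test` is our own.

```python
import math

def homogeneity_test(table):
    """Likelihood-ratio statistic 2 log Lambda for homogeneity of an
    r x c table of counts, together with its degrees of freedom."""
    r, c = len(table), len(table[0])
    row = [sum(rw) for rw in table]                               # n_{i+}
    col = [sum(table[i][j] for i in range(r)) for j in range(c)]  # n_{+j}
    n = sum(row)                                                  # n_{++}
    stat = 2 * sum(
        table[i][j] * math.log(table[i][j] / (row[i] * col[j] / n))
        for i in range(r) for j in range(c)
        if table[i][j] > 0   # convention: 0 log 0 = 0
    )
    return stat, (r - 1) * (c - 1)
```

For the drug-trial table above this returns approximately 5.129 with 4 degrees of freedom.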
If we let o_ij = n_ij, e_ij = n_i+ n_+j / n_++, and δ_ij = o_ij − e_ij, then using the same approximating steps as for Pearson's chi-squared test, we obtain

    2 log Λ ≈ Σ_{i,j} (o_ij − e_ij)² / e_ij.
Example. Continuing our previous example, our data is

            Improved   No difference   Worse   Total
Placebo         18           17          15      50
Half dose       20           10          20      50
Full dose       25           13          12      50
Total           63           40          47     150
The expected counts under H_0 are

            Improved   No difference   Worse   Total
Placebo         21          13.3        15.7     50
Half dose       21          13.3        15.7     50
Full dose       21          13.3        15.7     50
Total           63           40          47     150
We find 2 log Λ = 5.129, and we refer this to χ²_4. Clearly this is not significant, as the mean of χ²_4 is 4, so a value of 5.129 is something we would expect to happen solely by chance.
We can calculate the p-value: from tables, χ²_4(0.05) = 9.488, so our observed value is not significant at the 5% level, and the data are consistent with H_0.
We conclude that there is no evidence for a difference between the drug at
the given doses and the placebo.
For interest,

    Σ_{i,j} (o_ij − e_ij)² / e_ij = 5.173,

giving the same conclusion.
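To see that the two statistics agree in this example, here is a quick check (ours, not in the notes) computing both from the observed table:

```python
import math

obs = [[18, 17, 15], [20, 10, 20], [25, 13, 12]]
row = [sum(r) for r in obs]                                 # 50, 50, 50
col = [sum(obs[i][j] for i in range(3)) for j in range(3)]  # 63, 40, 47
n = sum(row)                                                # 150

# expected counts e_ij = n_{i+} n_{+j} / n_{++}
exp = [[row[i] * col[j] / n for j in range(3)] for i in range(3)]

lr = 2 * sum(obs[i][j] * math.log(obs[i][j] / exp[i][j])
             for i in range(3) for j in range(3))
pearson = sum((obs[i][j] - exp[i][j]) ** 2 / exp[i][j]
              for i in range(3) for j in range(3))

print(round(lr, 3), round(pearson, 3))  # 5.129 5.173, both below 9.488
```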
2.4.2 Confidence intervals and hypothesis tests

Confidence intervals or sets can be obtained by inverting hypothesis tests, and vice versa.
Definition (Acceptance region). The acceptance region A of a test is the complement of the critical region C.

Note that when we say "acceptance", we really mean "non-rejection"! The name is purely for historical reasons.
Theorem (Duality of hypothesis tests and confidence intervals). Suppose X_1, ··· , X_n have joint pdf f_X(x | θ) for θ ∈ Θ.
(i) Suppose that for every θ_0 ∈ Θ there is a size α test of H_0 : θ = θ_0. Denote the acceptance region by A(θ_0). Then the set

    I(X) = {θ : X ∈ A(θ)}

is a 100(1 − α)% confidence set for θ.
(ii) Suppose I(X) is a 100(1 − α)% confidence set for θ. Then

    A(θ_0) = {X : θ_0 ∈ I(X)}

is an acceptance region for a size α test of H_0 : θ = θ_0.
Intuitively, this says that "confidence intervals" and "hypothesis acceptance/rejection" are the same thing. After gathering some data X, we can produce, say, a 95% confidence interval (a, b). Then if we want to test the hypothesis H_0 : θ = θ_0, we simply have to check whether θ_0 ∈ (a, b).
On the other hand, if we have a test of H_0 : θ = θ_0 for each θ_0, then the confidence interval is the set of all θ_0 for which we would accept H_0 : θ = θ_0.
Proof. First note that θ_0 ∈ I(X) if and only if X ∈ A(θ_0).
For (i), since the test has size α, we have

    P(accept H_0 | H_0 is true) = P(X ∈ A(θ_0) | θ = θ_0) = 1 − α.
And so

    P(θ_0 ∈ I(X) | θ = θ_0) = P(X ∈ A(θ_0) | θ = θ_0) = 1 − α.
For (ii), since I(X) is a 100(1 − α)% confidence set, we have

    P(θ_0 ∈ I(X) | θ = θ_0) = 1 − α.
So

    P(X ∈ A(θ_0) | θ = θ_0) = P(θ_0 ∈ I(X) | θ = θ_0) = 1 − α.
Example. Suppose X_1, ··· , X_n are iid N(µ, 1) random variables and we want a 95% confidence set for µ.
One way is to use the theorem and find the confidence set belonging to the hypothesis test that we found in the previous example. We find a test of size 0.05 of H_0 : µ = µ_0 against H_1 : µ ≠ µ_0 that rejects H_0 when

    |√n(x̄ − µ_0)| > 1.96

(where 1.96 is the upper 2.5% point of N(0, 1)).
Then

    I(X) = {µ : X ∈ A(µ)} = {µ : |√n(X̄ − µ)| < 1.96}.

So a 95% confidence set for µ is (X̄ − 1.96/√n, X̄ + 1.96/√n).
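The duality can be illustrated numerically with simulated data. This sketch (our own, with an assumed true mean of 2) checks that each µ_0 is rejected by the size-0.05 test exactly when it falls outside the interval obtained by inverting that test:

```python
import math
import random

random.seed(1)
n, mu_true = 25, 2.0  # mu_true is an illustrative choice, not from the notes
xbar = sum(random.gauss(mu_true, 1) for _ in range(n)) / n

# 95% confidence interval obtained by inverting the size-0.05 test
half = 1.96 / math.sqrt(n)
lo, hi = xbar - half, xbar + half

def reject(mu0):
    """Size-0.05 test of H0: mu = mu0 against H1: mu != mu0."""
    return abs(math.sqrt(n) * (xbar - mu0)) > 1.96

# Duality: H0 is rejected exactly when mu0 lies outside (lo, hi)
for mu0 in [1.0, 1.5, 2.0, 2.5, 3.0]:
    assert reject(mu0) == (not (lo < mu0 < hi))
```

Note the equivalence holds whatever value X̄ takes, since both sides rearrange to the same inequality |X̄ − µ_0| > 1.96/√n.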