2.2 Composite hypotheses
For composite hypotheses like H : θ ≥ 0, the error probabilities do not have a single value. We define:
Definition (Power function). The power function is

W(θ) = P(X ∈ C | θ) = P(reject H₀ | θ).
We want W(θ) to be small on H₀ and large on H₁.
Definition (Size). The size of the test is

α = sup_{θ∈Θ₀} W(θ).

This is the worst-case probability of a Type I error, taken over all θ ∈ Θ₀.
For θ ∈ Θ₁, 1 − W(θ) = P(Type II error | θ).
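As a quick illustration of these definitions, here is a minimal numerical sketch in Python (the binomial model, critical region, and numbers are invented for illustration and are not from the notes):

```python
import numpy as np
from scipy.stats import binom

# Illustrative setup: X ~ Binomial(10, theta), testing H0: theta <= 0.5
# against H1: theta > 0.5 with critical region C = {x : x >= 8}.
n, c = 10, 8

def W(theta):
    """Power function W(theta) = P(X in C | theta) = P(X >= c | theta)."""
    return binom.sf(c - 1, n, theta)

theta_grid = np.linspace(0, 0.5, 501)   # grid over Theta_0 = [0, 0.5]
size = W(theta_grid).max()              # size = sup over Theta_0 of W(theta)
print(size)                             # W is increasing, so this is W(0.5)
print(1 - W(0.7))                       # P(Type II error | theta = 0.7)
```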
Sometimes the Neyman-Pearson theory can be extended to one-sided alternatives.
For example, in the previous example, we have shown that the most powerful size α test of H₀ : µ = µ₀ versus H₁ : µ = µ₁ (where µ₁ > µ₀) is given by

C = {x : √n(x̄ − µ₀)/σ₀ > z_α}.
The critical region depends on µ₀, n, σ₀, α, and the fact that µ₁ > µ₀. It does not depend on the particular value of µ₁. This test is then uniformly the most powerful size α test of H₀ : µ = µ₀ against H₁ : µ > µ₀.
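This critical region is easy to turn into a decision rule. The following is a minimal Python sketch (the data and parameter values are made up):

```python
import numpy as np
from scipy.stats import norm

def one_sided_z_test(x, mu0, sigma0, alpha=0.05):
    """Reject H0: mu = mu0 in favour of H1: mu > mu0 when
    sqrt(n) * (xbar - mu0) / sigma0 > z_alpha."""
    n = len(x)
    z = np.sqrt(n) * (np.mean(x) - mu0) / sigma0
    z_alpha = norm.ppf(1 - alpha)   # upper alpha-point: Phi(z_alpha) = 1 - alpha
    return z > z_alpha

# Hypothetical data: n = 25 observations with mu0 = 0, sigma0 = 1.
rng = np.random.default_rng(0)
x = rng.normal(0.4, 1.0, size=25)
print(one_sided_z_test(x, mu0=0.0, sigma0=1.0))
```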
Definition (Uniformly most powerful test). A test specified by a critical region C is a uniformly most powerful (UMP) size α test for testing H₀ : θ ∈ Θ₀ against H₁ : θ ∈ Θ₁ if

(i) sup_{θ∈Θ₀} W(θ) = α;

(ii) for any other test C* with size ≤ α and with power function W*, we have W(θ) ≥ W*(θ) for all θ ∈ Θ₁.
Note that UMP tests may not exist. However, the likelihood ratio test often works.
Example. Suppose X₁, ···, Xₙ are iid N(µ, σ₀²) where σ₀ is known, and we wish to test H₀ : µ ≤ µ₀ against H₁ : µ > µ₀.
First consider testing H₀′ : µ = µ₀ against H₁′ : µ = µ₁, where µ₁ > µ₀. The Neyman-Pearson test of size α of H₀′ against H₁′ has

C = {x : √n(x̄ − µ₀)/σ₀ > z_α}.
We show that C is in fact UMP for the composite hypotheses H₀ against H₁.
For µ ∈ R, the power function is

W(µ) = P_µ(reject H₀)
     = P_µ(√n(X̄ − µ₀)/σ₀ > z_α)
     = P_µ(√n(X̄ − µ)/σ₀ > z_α + √n(µ₀ − µ)/σ₀)
     = 1 − Φ(z_α + √n(µ₀ − µ)/σ₀).
To show this is UMP, first note that W(µ₀) = α (by plugging in) and that W(µ) is an increasing function of µ. So

sup_{µ≤µ₀} W(µ) = α,

and the first condition is satisfied.
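Both facts are easy to verify numerically; a minimal sketch (with arbitrary illustrative values µ₀ = 0, σ₀ = 1, n = 25):

```python
import numpy as np
from scipy.stats import norm

def power(mu, mu0=0.0, sigma0=1.0, n=25, alpha=0.05):
    """W(mu) = 1 - Phi(z_alpha + sqrt(n) * (mu0 - mu) / sigma0)."""
    z_alpha = norm.ppf(1 - alpha)
    return norm.sf(z_alpha + np.sqrt(n) * (mu0 - mu) / sigma0)  # sf = 1 - cdf

mus = np.linspace(-1.0, 1.0, 201)
assert np.all(np.diff(power(mus)) > 0)   # W is increasing in mu
assert np.isclose(power(0.0), 0.05)      # W(mu0) = alpha
# Since W is increasing, sup over {mu <= mu0} of W is W(mu0) = alpha.
```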
For the second condition, observe that for any µ₁ > µ₀, the Neyman-Pearson size α test of H₀′ vs H₁′ has critical region C. Let C* and W* belong to any other test of H₀ vs H₁ of size ≤ α. Then C* can also be regarded as a test of H₀′ vs H₁′ of size ≤ α, and the Neyman-Pearson lemma says that W*(µ₁) ≤ W(µ₁). This holds for all µ₁ > µ₀. So the second condition is satisfied, and C is UMP.
We now consider likelihood ratio tests for more general situations.
Definition (Likelihood of a composite hypothesis). The likelihood of a composite hypothesis H : θ ∈ Θ given data x is

L_x(H) = sup_{θ∈Θ} f(x | θ).
So far we have considered disjoint hypotheses Θ₀, Θ₁, but we are not interested in any specific alternative. So it is easier to take Θ₁ = Θ rather than Θ \ Θ₀. Then
Λ_x(H₀; H₁) = L_x(H₁)/L_x(H₀) = [sup_{θ∈Θ₁} f(x | θ)] / [sup_{θ∈Θ₀} f(x | θ)] ≥ 1,
with large values of Λ indicating departure from H₀.
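Computationally, Λ_x is just two maximisations of the likelihood, one over Θ₁ and one over Θ₀. Here is a minimal sketch for the normal model used in the next example, deliberately using numerical maximisation in place of the closed-form mle (the data are illustrative):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def log_lik(mu, x, sigma0):
    """Log-likelihood of mu for iid N(mu, sigma0^2) data x."""
    return np.sum(norm.logpdf(x, loc=mu, scale=sigma0))

def two_log_lambda(x, mu0, sigma0):
    """2 log Lambda_x(H0; H1) with Theta_0 = {mu0} and Theta_1 = R."""
    # sup over Theta_1 = R, found numerically (it is xbar in closed form)
    res = minimize_scalar(lambda mu: -log_lik(mu, x, sigma0))
    sup_l1 = -res.fun
    sup_l0 = log_lik(mu0, x, sigma0)   # Theta_0 is a point: sup = evaluation
    return 2 * (sup_l1 - sup_l0)

rng = np.random.default_rng(1)
x = rng.normal(0.2, 1.0, size=30)
print(two_log_lambda(x, mu0=0.0, sigma0=1.0))   # always >= 0, i.e. Lambda >= 1
```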
Example. Suppose that X₁, ···, Xₙ are iid N(µ, σ₀²), with σ₀² known, and we wish to test H₀ : µ = µ₀ against H₁ : µ ≠ µ₀ (for a given constant µ₀). Here Θ₀ = {µ₀} and Θ = R.
For the numerator, we have sup_Θ f(x | µ) = f(x | µ̂), where µ̂ is the mle. We know that µ̂ = x̄. Hence

Λ_x(H₀; H₁) = [(2πσ₀²)^(−n/2) exp(−Σ(xᵢ − x̄)²/(2σ₀²))] / [(2πσ₀²)^(−n/2) exp(−Σ(xᵢ − µ₀)²/(2σ₀²))].
Then H₀ is rejected if Λ_x is large.
To make our lives easier, we can use the logarithm instead:

2 log Λ_x(H₀; H₁) = (1/σ₀²)[Σ(xᵢ − µ₀)² − Σ(xᵢ − x̄)²] = n(x̄ − µ₀)²/σ₀².
So we can reject H₀ if

|√n(x̄ − µ₀)/σ₀| > c

for some c.
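The identity above can be sanity-checked numerically; a throwaway sketch (any data set would do):

```python
import numpy as np

rng = np.random.default_rng(2)
x, mu0, sigma0 = rng.normal(0.3, 1.0, size=50), 0.0, 1.0
xbar = x.mean()
lhs = (np.sum((x - mu0)**2) - np.sum((x - xbar)**2)) / sigma0**2
rhs = len(x) * (xbar - mu0)**2 / sigma0**2
assert np.isclose(lhs, rhs)   # 2 log Lambda, computed both ways
```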
We know that under H₀, Z = √n(X̄ − µ₀)/σ₀ ∼ N(0, 1). So the size α generalised likelihood ratio test rejects H₀ if

|√n(x̄ − µ₀)/σ₀| > z_{α/2}.
Alternatively, since n(X̄ − µ₀)²/σ₀² ∼ χ²₁, we reject H₀ if

n(x̄ − µ₀)²/σ₀² > χ²₁(α)

(check that z_{α/2}² = χ²₁(α)).
Note that this is a two-tailed test, i.e. we reject H₀ both for high and low values of x̄.
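A quick check that the two forms of the test agree (a sketch; the quantiles come from scipy):

```python
from scipy.stats import chi2, norm

alpha = 0.05
z = norm.ppf(1 - alpha / 2)      # z_{alpha/2}: upper alpha/2 point of N(0, 1)
c = chi2.ppf(1 - alpha, df=1)    # chi^2_1(alpha): upper alpha point of chi^2_1
assert abs(z**2 - c) < 1e-8
# So |sqrt(n)(xbar - mu0)/sigma0| > z  iff  n(xbar - mu0)^2/sigma0^2 > c.
```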
The next theorem allows us to use likelihood ratio tests even when we cannot
find the exact relevant null distribution.
First consider the “size” or “dimension” of our hypotheses: suppose that H₀ imposes p independent restrictions on Θ. So for example, if Θ = {θ : θ = (θ₁, ···, θ_k)}, and we have
– H₀ : θ_{i₁} = a₁, θ_{i₂} = a₂, ···, θ_{i_p} = a_p; or
– H₀ : Aθ = b (with A p × k and b p × 1 given); or
– H₀ : θᵢ = fᵢ(φ), i = 1, ···, k, for some φ = (φ₁, ···, φ_{k−p}).
We say Θ has k free parameters and Θ₀ has k − p free parameters, and we write |Θ₀| = k − p and |Θ| = k.
Theorem (Generalised likelihood ratio theorem). Suppose Θ₀ ⊆ Θ₁ and |Θ₁| − |Θ₀| = p. Let X = (X₁, ···, Xₙ) with all Xᵢ iid. If H₀ is true, then as n → ∞,

2 log Λ_X(H₀; H₁) ∼ χ²_p.
If H₀ is not true, then 2 log Λ tends to be larger. We reject H₀ if 2 log Λ > c, where c = χ²_p(α), for a test of approximately size α.
We will not prove this result here. In our example above, |Θ₁| − |Θ₀| = 1, and we saw that under H₀, 2 log Λ ∼ χ²₁ exactly for all n, rather than just approximately.
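This exactness is easy to see by simulation; a sketch (the deliberately small n would be useless for an asymptotic result, but is fine here because the χ²₁ distribution is exact):

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(3)
n, mu0, sigma0 = 5, 0.0, 1.0                      # deliberately small n
x = rng.normal(mu0, sigma0, size=(100_000, n))    # 100,000 samples under H0
two_log_lambda = n * (x.mean(axis=1) - mu0)**2 / sigma0**2
print(kstest(two_log_lambda, 'chi2', args=(1,)))  # large p-value expected
```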