2 Hypothesis testing

IB Statistics

2.6 Student’s t-distribution

Definition ($t$-distribution). Suppose that $Z$ and $Y$ are independent, with $Z \sim N(0, 1)$ and $Y \sim \chi_k^2$. Then
\[
  T = \frac{Z}{\sqrt{Y/k}}
\]
is said to have a $t$-distribution on $k$ degrees of freedom, and we write $T \sim t_k$.
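As a sanity check on this definition, one can simulate $T = Z/\sqrt{Y/k}$ directly and compare a tail probability against SciPy's implementation of $t_k$; the degrees of freedom, sample size, and seed below are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

k = 5                                # degrees of freedom (arbitrary choice)
rng = np.random.default_rng(0)
n = 200_000

Z = rng.standard_normal(n)           # Z ~ N(0, 1)
Y = rng.chisquare(k, size=n)         # Y ~ chi^2_k, independent of Z
T = Z / np.sqrt(Y / k)               # T ~ t_k by the definition above

# The empirical tail probability should agree with scipy.stats.t
emp = np.mean(T > 1.0)
theory = stats.t.sf(1.0, df=k)
print(emp, theory)
```

With this many samples the empirical and theoretical tail probabilities should agree to about two decimal places.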

The density of $t_k$ turns out to be
\[
  f_T(t) = \frac{\Gamma((k+1)/2)}{\Gamma(k/2)} \frac{1}{\sqrt{\pi k}} \left(1 + \frac{t^2}{k}\right)^{-(k+1)/2}.
\]
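The density formula above can be checked numerically against SciPy's built-in $t$ density; the degrees of freedom and the evaluation grid are arbitrary.

```python
import numpy as np
from scipy.special import gamma
from scipy import stats

def t_pdf(t, k):
    """Density of t_k, transcribed from the formula above."""
    return (gamma((k + 1) / 2) / gamma(k / 2)
            * (1 / np.sqrt(np.pi * k))
            * (1 + t**2 / k) ** (-(k + 1) / 2))

ts = np.linspace(-4, 4, 9)
for k in (1, 2, 5, 30):
    assert np.allclose(t_pdf(ts, k), stats.t.pdf(ts, df=k))
```

Note that $k = 1$ recovers the Cauchy density $1/(\pi(1+t^2))$, since $\Gamma(1)/\Gamma(1/2) = 1/\sqrt{\pi}$.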

This density is symmetric, bell-shaped, and has a maximum at $t = 0$, rather like the standard normal density. However, it can be shown that $P(T > t) > P(Z > t)$ for $t > 0$, i.e. the $t$-distribution has a “fatter” tail. Also, as $k \to \infty$, $t_k$ approaches the standard normal distribution.
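Both claims are easy to verify numerically: the upper tail of $t_k$ dominates that of $N(0,1)$, and the gap shrinks as $k$ grows. The cutoff $t = 2$ below is an arbitrary choice.

```python
from scipy import stats

t_cut = 2.0
for k in (1, 5, 30):
    # Fatter upper tail: P(T > t) > P(Z > t) for t > 0
    assert stats.t.sf(t_cut, df=k) > stats.norm.sf(t_cut)

# As k grows, the tail probability approaches the normal one
gaps = [stats.t.sf(t_cut, df=k) - stats.norm.sf(t_cut) for k in (2, 10, 100)]
assert gaps[0] > gaps[1] > gaps[2] > 0
print(gaps)
```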

Proposition. If $k > 1$, then $E(T) = 0$. If $k > 2$, then $\operatorname{var}(T) = \frac{k}{k-2}$. If $k = 2$, then $\operatorname{var}(T) = \infty$.

In all other cases the mean and variance are undefined. In particular, the $k = 1$ case has undefined mean and variance; this is known as the Cauchy distribution.
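These moment formulas agree with what SciPy reports for the $t$ family; the particular values of $k$ tried below are arbitrary.

```python
import numpy as np
from scipy import stats

for k in (3, 5, 10):
    mean, var = stats.t.stats(df=k, moments="mv")
    assert abs(mean) < 1e-12                # E(T) = 0 for k > 1
    assert abs(var - k / (k - 2)) < 1e-12   # var(T) = k/(k-2) for k > 2

# k = 2 gives infinite variance; k = 1 is the Cauchy case,
# whose mean and variance are not finite (SciPy reports inf/nan)
assert np.isinf(stats.t.var(df=2))
assert not np.isfinite(stats.t.mean(df=1))
```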

Notation. We write $t_k(\alpha)$ for the upper $100\alpha\%$ point of the $t_k$ distribution, so that $P(T > t_k(\alpha)) = \alpha$.
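In SciPy terms, $t_k(\alpha)$ is the $(1-\alpha)$-quantile of $t_k$; the helper name `t_upper` and the values $k = 9$, $\alpha = 0.025$ below are illustrative choices.

```python
from scipy import stats

def t_upper(k, alpha):
    """Upper 100*alpha% point t_k(alpha), i.e. P(T > t_k(alpha)) = alpha."""
    return stats.t.ppf(1 - alpha, df=k)    # equivalently stats.t.isf(alpha, df=k)

q = t_upper(9, 0.025)
assert abs(stats.t.sf(q, df=9) - 0.025) < 1e-9
print(q)  # roughly 2.26, the familiar two-sided 95% multiplier for n = 10
```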

Why would we define such a weird distribution? The typical application is

to study random samples with unknown mean and unknown variance.

Let $X_1, \cdots, X_n$ be iid $N(\mu, \sigma^2)$. Then $\bar{X} \sim N(\mu, \sigma^2/n)$. So
\[
  Z = \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma} \sim N(0, 1).
\]

Also, $S_{XX}/\sigma^2 \sim \chi^2_{n-1}$ and is independent of $\bar{X}$, and hence of $Z$. So
\[
  \frac{\sqrt{n}(\bar{X} - \mu)/\sigma}{\sqrt{S_{XX}/((n-1)\sigma^2)}} \sim t_{n-1},
\]
or
\[
  \frac{\sqrt{n}(\bar{X} - \mu)}{\sqrt{S_{XX}/(n-1)}} \sim t_{n-1}.
\]
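This statistic is exactly what a one-sample $t$ test computes, so it can be checked against `scipy.stats.ttest_1samp`; the sample and the hypothesised mean below are made up for illustration.

```python
import numpy as np
from scipy import stats

x = np.array([4.2, 5.1, 3.8, 4.9, 5.4, 4.4, 5.0, 4.6])  # made-up sample
mu0 = 4.0                         # hypothesised mean (arbitrary)
n = len(x)

xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)
sigma_tilde = np.sqrt(Sxx / (n - 1))          # sqrt of the unbiased estimator

T = np.sqrt(n) * (xbar - mu0) / sigma_tilde   # ~ t_{n-1} when mu = mu0

t_scipy, p = stats.ttest_1samp(x, mu0)
assert abs(T - t_scipy) < 1e-12
print(T, p)
```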

We write $\tilde{\sigma}^2 = \frac{S_{XX}}{n-1}$ (note that this is the unbiased estimator). Then a $100(1-\alpha)\%$ confidence interval for $\mu$ is found from
\[
  1 - \alpha = P\left(-t_{n-1}\left(\frac{\alpha}{2}\right) \le \frac{\sqrt{n}(\bar{X} - \mu)}{\tilde{\sigma}} \le t_{n-1}\left(\frac{\alpha}{2}\right)\right).
\]
This has endpoints
\[
  \bar{X} \pm \frac{\tilde{\sigma}}{\sqrt{n}}\, t_{n-1}\left(\frac{\alpha}{2}\right).
\]
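A short sketch of these endpoints in code, using a made-up sample and $\alpha = 0.05$; the result is compared against SciPy's built-in `t.interval` as a cross-check.

```python
import numpy as np
from scipy import stats

x = np.array([4.2, 5.1, 3.8, 4.9, 5.4, 4.4, 5.0, 4.6])  # made-up sample
alpha = 0.05
n = len(x)

xbar = x.mean()
sigma_tilde = np.sqrt(np.sum((x - xbar) ** 2) / (n - 1))

# Endpoints X-bar +/- (sigma-tilde / sqrt(n)) * t_{n-1}(alpha/2)
half = sigma_tilde / np.sqrt(n) * stats.t.ppf(1 - alpha / 2, df=n - 1)
lo, hi = xbar - half, xbar + half

# Cross-check against SciPy's built-in interval
lo2, hi2 = stats.t.interval(1 - alpha, n - 1,
                            loc=xbar, scale=sigma_tilde / np.sqrt(n))
assert np.isclose(lo, lo2) and np.isclose(hi, hi2)
print(lo, hi)
```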