3 Linear models

IB Statistics

3.6 Simple linear regression

We can apply our results to the case of simple linear regression. We have
\[
Y_i = a' + b(x_i - \bar{x}) + \varepsilon_i,
\]
where $\bar{x} = \sum x_i/n$ and the $\varepsilon_i$ are iid $N(0, \sigma^2)$ for $i = 1, \cdots, n$.

Then we have
\[
\hat{a}' = \bar{Y} \sim N\left(a', \frac{\sigma^2}{n}\right)
\]
\[
\hat{b} = \frac{S_{xY}}{S_{xx}} \sim N\left(b, \frac{\sigma^2}{S_{xx}}\right)
\]
\[
\hat{Y}_i = \hat{a}' + \hat{b}(x_i - \bar{x})
\]
\[
\mathrm{RSS} = \sum_i (Y_i - \hat{Y}_i)^2 \sim \sigma^2 \chi^2_{n-2},
\]
and $(\hat{a}', \hat{b})$ and $\hat{\sigma}^2 = \mathrm{RSS}/n$ are independent, as we have previously shown.
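These estimators are straightforward to compute directly. As an illustrative sketch (the data, the true parameter values $a' = 2.0$, $b = -1.5$, $\sigma = 0.5$, and the seed are all made up for demonstration), the centred parametrisation can be fitted as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data; the true values a' = 2.0, b = -1.5, sigma = 0.5 are
# illustrative choices, not from the notes.
n = 24
x = np.linspace(0, 10, n)
xbar = x.mean()
Y = 2.0 + (-1.5) * (x - xbar) + rng.normal(0, 0.5, n)

# Estimators in the centred parametrisation
a_hat = Y.mean()                      # a'-hat = Y-bar
Sxx = np.sum((x - xbar) ** 2)
SxY = np.sum((x - xbar) * Y)
b_hat = SxY / Sxx                     # b-hat = S_xY / S_xx

# Fitted values and residual sum of squares
Y_hat = a_hat + b_hat * (x - xbar)
RSS = np.sum((Y - Y_hat) ** 2)

sigma2_mle = RSS / n                  # maximum likelihood estimator of sigma^2
sigma2_unbiased = RSS / (n - 2)       # unbiased estimator (p = 2 here)
```

Centring the $x_i$ makes $\hat{a}'$ and $\hat{b}$ uncorrelated, which is why the two estimators separate so cleanly.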

Note that $\hat{\sigma}^2$ is obtained by dividing RSS by $n$, and is the maximum likelihood estimator. On the other hand, $\tilde{\sigma}^2$ is obtained by dividing RSS by $n - p$, and is an unbiased estimator.

Example. Using the oxygen/time example, we have seen that
\[
\tilde{\sigma}^2 = \frac{\mathrm{RSS}}{n - p} = \frac{67968}{24 - 2} = 3089 = 55.6^2.
\]

So the standard error of $\hat{b}$ is
\[
\mathrm{SE}(\hat{b}) = \sqrt{\tilde{\sigma}^2 (X^T X)^{-1}_{22}} = \sqrt{\frac{3089}{S_{xx}}} = \frac{55.6}{28.0} = 1.99.
\]

So a 95% confidence interval for $b$ has end points
\[
\hat{b} \pm \mathrm{SE}(\hat{b}) \times t_{n-p}(0.025) = -12.9 \pm 1.99 \times t_{22}(0.025) = (-17.0, -8.8),
\]
using the fact that $t_{22}(0.025) = 2.07$.
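The interval above can be reproduced numerically. A minimal sketch, assuming $S_{xx} = 28.0^2 = 784$ (inferred from $\sqrt{S_{xx}} = 28.0$ in the calculation above):

```python
import math
from scipy.stats import t

# Numbers from the oxygen/time example in the notes
n, p = 24, 2
RSS = 67968
b_hat = -12.9
Sxx = 28.0 ** 2                       # assumed: sqrt(S_xx) = 28.0 above

sigma2_tilde = RSS / (n - p)          # ~ 3089
se_b = math.sqrt(sigma2_tilde / Sxx)  # ~ 1.99

t_crit = t.ppf(0.975, df=n - p)       # t_22(0.025) ~ 2.07
ci = (b_hat - se_b * t_crit, b_hat + se_b * t_crit)   # ~ (-17.0, -8.8)
```

Note that `t.ppf(0.975, ...)` gives the upper 2.5% point, matching the notes' convention that $t_{22}(0.025)$ is the point cutting off probability $0.025$ in the upper tail.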

Note that this interval does not contain 0. So if we want to carry out a size $0.05$ test of $H_0: b = 0$ (they are uncorrelated) vs $H_1: b \neq 0$ (they are correlated), the test statistic would be
\[
\frac{\hat{b}}{\mathrm{SE}(\hat{b})} = \frac{-12.9}{1.99} = -6.48.
\]
Then we reject $H_0$ because this is less than $-t_{22}(0.025) = -2.07$.
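The same two-sided test can be sketched in a few lines, using the values from the example above:

```python
from scipy.stats import t

# Values from the example
b_hat, se_b = -12.9, 1.99

t_stat = b_hat / se_b                 # ~ -6.48
t_crit = t.ppf(0.975, df=22)          # t_22(0.025) ~ 2.07

# Two-sided size-0.05 test: reject H0 when |t_stat| exceeds the critical value
reject_H0 = abs(t_stat) > t_crit
```

Equivalently, rejecting $H_0$ here is the same as the 95% confidence interval for $b$ excluding 0, which is exactly the duality between tests and confidence intervals.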