2.5 Least-squares polynomial approximation

If we want to approximate a function with a polynomial, polynomial interpolation might not be the best idea, since all we do is make sure the polynomial agrees with $f$ at certain points, and then hope it is a good approximation elsewhere. Instead, the idea is to choose a polynomial $p \in P_n[x]$ that “minimizes the error”.

What exactly do we mean by minimizing the error? The error is defined as the function $f - p$. So given an appropriate inner product on the vector space of continuous functions, we want to minimize
\[
  \|f - p\|^2 = \langle f - p, f - p\rangle.
\]

This is usually of the form
\[
  \langle f - p, f - p\rangle = \int_a^b w(x)[f(x) - p(x)]^2 \,\mathrm{d}x,
\]
but we can also use alternative inner products such as
\[
  \langle f - p, f - p\rangle = \sum_{j=1}^m w_j [f(\xi_j) - p(\xi_j)]^2.
\]
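
Both inner products are straightforward to evaluate numerically. Here is a minimal Python sketch of the integral and discrete forms; the interval $[-1, 1]$, the weight $w(x) = 1$, the choices of $f$ and $p$, and the equally spaced nodes $\xi_j$ are all illustrative assumptions, not from the notes.

```python
import numpy as np
from scipy.integrate import quad

def inner_continuous(g, h, a=-1.0, b=1.0, w=lambda x: 1.0):
    """<g, h> = integral from a to b of w(x) g(x) h(x) dx."""
    val, _ = quad(lambda x: w(x) * g(x) * h(x), a, b)
    return val

def inner_discrete(g, h, xi, wj):
    """<g, h> = sum_j w_j g(xi_j) h(xi_j)."""
    return sum(w * g(x) * h(x) for x, w in zip(xi, wj))

f = np.cos                           # function to approximate
p = lambda x: 1 - x**2 / 2           # a candidate polynomial in P_2[x]
e = lambda x: f(x) - p(x)            # the error function f - p

print(inner_continuous(e, e))                 # ||f - p||^2, integral form
xi = np.linspace(-1.0, 1.0, 5)
print(inner_discrete(e, e, xi, np.ones(5)))   # ||f - p||^2, discrete form
```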

Unlike polynomial interpolation, there is no guarantee that the approximation agrees with the function anywhere. Unlike polynomial interpolation, there is some guarantee that the total error is small (or as small as we can get, by definition). In particular, if $f$ is continuous, then the Weierstrass approximation theorem tells us the total error must tend to $0$ as $n \to \infty$.

Unsurprisingly, the solution involves the use of the orthogonal polynomials with respect to the corresponding inner products.

Theorem. If $\{p_k\}_{k=0}^n$ are orthogonal polynomials with respect to $\langle\cdot, \cdot\rangle$, then the choice of $c_k$ such that
\[
  p = \sum_{k=0}^n c_k p_k
\]
minimizes $\|f - p\|^2$ is given by
\[
  c_k = \frac{\langle f, p_k\rangle}{\|p_k\|^2},
\]

and the formula for the error is
\[
  \|f - p\|^2 = \|f\|^2 - \sum_{k=0}^n \frac{\langle f, p_k\rangle^2}{\|p_k\|^2}.
\]
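
As a concrete illustration of the theorem, here is a sketch using the Legendre polynomials, which are orthogonal with respect to $w(x) = 1$ on $[-1, 1]$; the choices $f = \exp$ and $n = 4$ are illustrative assumptions. It computes each $c_k = \langle f, p_k\rangle / \|p_k\|^2$ independently and checks that the error formula matches the directly computed error.

```python
import numpy as np
from scipy.integrate import quad
from numpy.polynomial.legendre import Legendre

def inner(g, h):
    """<g, h> for the Legendre inner product: w(x) = 1 on [-1, 1]."""
    val, _ = quad(lambda x: g(x) * h(x), -1.0, 1.0)
    return val

f, n = np.exp, 4                                     # illustrative choices
basis = [Legendre.basis(k) for k in range(n + 1)]    # p_0, ..., p_n

# c_k = <f, p_k> / ||p_k||^2 -- each coefficient is computed independently.
c = [inner(f, pk) / inner(pk, pk) for pk in basis]
p = sum(ck * pk for ck, pk in zip(c, basis))

# Error formula: ||f - p||^2 = ||f||^2 - sum_k <f, p_k>^2 / ||p_k||^2.
direct = inner(lambda x: f(x) - p(x), lambda x: f(x) - p(x))
formula = inner(f, f) - sum(inner(f, pk)**2 / inner(pk, pk) for pk in basis)
print(direct, formula)    # the two agree (up to quadrature error)
```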

Note that the solution decouples, in the sense that $c_k$ depends only on $f$ and $p_k$. If we want to take one more term, we just need to compute an extra term, and not redo our previous work.

Also, we notice that the formula for the error is a positive term $\|f\|^2$ subtracting a lot of squares. As we increase $n$, we subtract more squares, and the error decreases. If we are lucky, the error tends to $0$ as we take $n \to \infty$. Even though we might not know how many terms we need in order to get the error to be sufficiently small, we can just keep adding terms until the computed error is small enough (which is something we have to do anyway even if we knew what $n$ to take), as in the sketch below.
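
A sketch of this strategy, under the same illustrative Legendre setup as above ($f = |x|$ and the tolerance are arbitrary choices): the running error starts at $\|f\|^2$, and each new term subtracts one more square, so we grow the degree until the error is small enough.

```python
import numpy as np
from scipy.integrate import quad
from numpy.polynomial.legendre import Legendre

def inner(g, h):
    """<g, h> for the Legendre inner product: w(x) = 1 on [-1, 1]."""
    val, _ = quad(lambda x: g(x) * h(x), -1.0, 1.0)
    return val

f, tol = np.abs, 1e-3     # f is continuous but not smooth; tol is arbitrary

err2 = inner(f, f)        # start from ||f||^2 ...
k = 0
while err2 > tol and k <= 50:      # degree cap guards against stagnation
    pk = Legendre.basis(k)
    err2 -= inner(f, pk)**2 / inner(pk, pk)   # ... subtract one more square
    k += 1
print(k, err2)            # number of terms used and final squared error
```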

Proof. We consider a general polynomial
\[
  p = \sum_{k=0}^n c_k p_k.
\]

We substitute this in to obtain
\[
  \langle f - p, f - p\rangle = \langle f, f\rangle - 2\sum_{k=0}^n c_k \langle f, p_k\rangle + \sum_{k=0}^n c_k^2 \|p_k\|^2.
\]

Note that there are no cross terms between the different coefficients. We minimize this quadratic by setting the partial derivatives to zero:
\[
  0 = \frac{\partial}{\partial c_k} \langle f - p, f - p\rangle = -2\langle f, p_k\rangle + 2 c_k \|p_k\|^2.
\]

To check this is indeed a minimum, note that the Hessian matrix is the diagonal matrix $\operatorname{diag}(2\|p_0\|^2, \ldots, 2\|p_n\|^2)$, which is positive definite. So this is really a minimum. So we get the formula for the $c_k$'s as claimed, and substituting the formula for $c_k$ back in gives the error formula.

Note that our constructed $p \in P_n[x]$ has a nice property: for $k \leq n$, we have
\[
  \langle f - p, p_k\rangle = \langle f, p_k\rangle - \langle p, p_k\rangle = \langle f, p_k\rangle - \frac{\langle f, p_k\rangle}{\|p_k\|^2}\langle p_k, p_k\rangle = 0.
\]

Thus for all $q \in P_n[x]$, we have
\[
  \langle f - p, q\rangle = 0.
\]

In particular, this is true when $q = p$, and tells us $\langle f, p\rangle = \langle p, p\rangle$. Using this to expand $\langle f - p, f - p\rangle$ gives
\[
  \|f - p\|^2 + \|p\|^2 = \|f\|^2,
\]

which is just a glorified Pythagoras theorem.
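
Both facts are easy to check numerically. Here is a sketch under the same illustrative Legendre setup as before (with $f = \exp$ and $n = 3$ as arbitrary choices): the residual $f - p$ is orthogonal to every basis polynomial, hence to all of $P_n[x]$, and the Pythagoras identity holds.

```python
import numpy as np
from scipy.integrate import quad
from numpy.polynomial.legendre import Legendre

def inner(g, h):
    """<g, h> for the Legendre inner product: w(x) = 1 on [-1, 1]."""
    val, _ = quad(lambda x: g(x) * h(x), -1.0, 1.0)
    return val

f, n = np.exp, 3
basis = [Legendre.basis(k) for k in range(n + 1)]
p = sum((inner(f, pk) / inner(pk, pk)) * pk for pk in basis)
r = lambda x: f(x) - p(x)                        # the residual f - p

print([inner(r, pk) for pk in basis])            # all ~ 0: <f - p, p_k> = 0
print(inner(r, r) + inner(p, p), inner(f, f))    # Pythagoras: the two agree
```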