1 Introduction and preliminaries

1.2 Review of unconstrained optimization
Let $f: \mathbb{R}^n \to \mathbb{R}$, $x^* \in \mathbb{R}^n$. A necessary condition for $x^*$ to minimize $f$ over $\mathbb{R}^n$ is $\nabla f(x^*) = 0$, where
$$\nabla f = \left( \frac{\partial f}{\partial x_1}, \cdots, \frac{\partial f}{\partial x_n} \right)^T$$
is the gradient of $f$.
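As a quick numerical illustration of this first-order condition (my own addition, not part of the notes), here is a small Python sketch, assuming NumPy is available, using the hypothetical function $f(x) = \|x - a\|^2$: a finite-difference gradient vanishes (up to rounding) at the minimizer $x^* = a$ but not at other points.

```python
import numpy as np

def f(x, a):
    # Hypothetical example: f(x) = ||x - a||^2, minimized at x = a.
    return np.sum((x - a) ** 2)

def numerical_gradient(g, x, h=1e-6):
    # Central finite-difference approximation of the gradient of g at x.
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (g(x + e) - g(x - e)) / (2 * h)
    return grad

a = np.array([1.0, -2.0, 0.5])
g = lambda x: f(x, a)

print(numerical_gradient(g, a))            # approximately 0 at the minimizer
print(numerical_gradient(g, np.zeros(3)))  # nonzero away from it
```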
However, this is obviously not a sufficient condition. Any such point can be
a maximum, minimum or a saddle. Here we need a notion of convexity:
Definition (Convex region). A region $S \subseteq \mathbb{R}^n$ is convex iff for all $\delta \in [0, 1]$, $x, y \in S$, we have $\delta x + (1 - \delta) y \in S$. Alternatively, if you take any two points in $S$, the line segment joining them lies completely within the region.
[Figure: a non-convex region and a convex region]
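As an informal sanity check of this definition (again my own illustration, assuming NumPy), the sketch below samples pairs of points in the closed unit ball, an example of a convex region, and verifies that every convex combination tested stays in the ball; running the same test on a non-convex region such as an annulus would quickly produce a counterexample.

```python
import numpy as np

rng = np.random.default_rng(0)

def in_unit_ball(x):
    # Membership test for the illustrative region S = {x : ||x|| <= 1}.
    return np.linalg.norm(x) <= 1.0

violations = 0
for _ in range(1000):
    # Rejection-sample two points of the unit ball in R^3.
    x = rng.uniform(-1, 1, size=3)
    y = rng.uniform(-1, 1, size=3)
    if not (in_unit_ball(x) and in_unit_ball(y)):
        continue
    delta = rng.uniform()
    if not in_unit_ball(delta * x + (1 - delta) * y):
        violations += 1

print("violations of convexity found:", violations)  # expect 0
```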
Definition (Convex function). A function $f: S \to \mathbb{R}$ is convex if $S$ is convex, and for all $x, y \in S$, $\delta \in [0, 1]$, we have
$$\delta f(x) + (1 - \delta) f(y) \geq f(\delta x + (1 - \delta) y).$$
[Figure: for a convex function, the chord value $\delta f(x) + (1 - \delta) f(y)$ lies above $f(\delta x + (1 - \delta) y)$]
A function is concave if $-f$ is convex. Note that a function can be neither concave nor convex.
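To make the defining inequality concrete (an illustration of my own, assuming NumPy; the helper `convexity_gap` is not from the notes), the following sketch checks the inequality numerically for $f(x) = x^2$, which is convex, and exhibits violations for $f(x) = \sin x$, which is neither convex nor concave on $\mathbb{R}$.

```python
import numpy as np

def convexity_gap(f, x, y, delta):
    # delta*f(x) + (1-delta)*f(y) - f(delta*x + (1-delta)*y):
    # nonnegative for every x, y, delta iff f is convex.
    return delta * f(x) + (1 - delta) * f(y) - f(delta * x + (1 - delta) * y)

rng = np.random.default_rng(1)
xs = rng.uniform(-5, 5, size=1000)
ys = rng.uniform(-5, 5, size=1000)
ds = rng.uniform(0, 1, size=1000)

print(np.min(convexity_gap(np.square, xs, ys, ds)))  # nonnegative (up to rounding): x^2 is convex
print(np.min(convexity_gap(np.sin, xs, ys, ds)))     # negative: sin is not convex
```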
We have the following lemma:
Lemma. Let $f$ be twice differentiable. Then $f$ is convex on a convex set $S$ if the Hessian matrix
$$(Hf)_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$$
is positive semidefinite for all $x \in S$, where this fancy term means:
Definition (Positive semidefinite). A matrix $H$ is positive semidefinite if $v^T H v \geq 0$ for all $v \in \mathbb{R}^n$.
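The following sketch (my own addition, assuming NumPy) ties the lemma and this definition together: for the hypothetical function $f(x, y) = x^2 + xy + y^2$ the Hessian is the constant matrix $\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, and positive semidefiniteness of a symmetric matrix can be checked by confirming its eigenvalues are nonnegative.

```python
import numpy as np

def is_positive_semidefinite(H, tol=1e-10):
    # A symmetric matrix is positive semidefinite iff all its eigenvalues
    # are nonnegative, equivalently v^T H v >= 0 for all v.
    return np.all(np.linalg.eigvalsh(H) >= -tol)

# Hessian of the hypothetical f(x, y) = x^2 + x*y + y^2 (constant here).
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])

print(is_positive_semidefinite(H))                         # True, so f is convex
print(is_positive_semidefinite(np.array([[1.0,  0.0],
                                          [0.0, -1.0]])))  # False: a saddle-shaped quadratic
```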
This leads to the following theorem:
Theorem. Let $X \subseteq \mathbb{R}^n$ be convex, and let $f: \mathbb{R}^n \to \mathbb{R}$ be twice differentiable on $X$. If $x^* \in X$ satisfies $\nabla f(x^*) = 0$ and $Hf(x)$ is positive semidefinite for all $x \in X$, then $x^*$ minimizes $f$ on $X$.
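To see the theorem in action (an illustrative sketch of my own, assuming NumPy), take the hypothetical quadratic $f(x) = x^T Q x - b^T x$ with $Q$ symmetric positive definite and $X = \mathbb{R}^n$: stationarity $\nabla f(x^*) = 2Qx^* - b = 0$ pins down $x^*$, the Hessian $2Q$ is positive semidefinite everywhere, and the theorem then guarantees $x^*$ is a global minimizer, which the random comparison below is consistent with.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical convex quadratic f(x) = x^T Q x - b^T x with Q positive definite.
A = rng.standard_normal((3, 3))
Q = A @ A.T + np.eye(3)          # symmetric positive definite by construction
b = rng.standard_normal(3)

f = lambda x: x @ Q @ x - b @ x

# Stationary point: gradient 2 Q x - b = 0, so x* solves (2Q) x = b.
x_star = np.linalg.solve(2 * Q, b)

# Hessian is the constant matrix 2Q; check it is positive semidefinite.
assert np.all(np.linalg.eigvalsh(2 * Q) >= 0)

# The theorem says x* minimizes f; compare against random points.
samples = rng.standard_normal((10000, 3))
print(f(x_star) <= min(f(x) for x in samples))   # expect True
```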
We will not prove these.
Note that this is helpful, since linear functions are convex (and concave). However, our problems are constrained, not unconstrained, so we will have to convert constrained problems into unconstrained ones.