1 Introduction and preliminaries

IB Optimisation

1.2 Review of unconstrained optimization

Let f : R^n → R, x^* ∈ R^n. A necessary condition for x^* to minimize f over R^n is ∇f(x^*) = 0, where

    ∇f = (∂f/∂x_1, ···, ∂f/∂x_n)^T

is the gradient of f.
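To make the first-order condition concrete, here is a small Python sketch that approximates the gradient by central differences and checks that it vanishes at a minimizer. The example function and the point x^* are hypothetical choices, not part of the notes:

```python
import numpy as np

# Hypothetical example: f(x) = (x_1 - 1)^2 + (x_2 + 2)^2, minimized at x* = (1, -2).
def f(x):
    return (x[0] - 1) ** 2 + (x[1] + 2) ** 2

def grad(f, x, h=1e-6):
    """Central-difference approximation to the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

x_star = np.array([1.0, -2.0])
print(grad(f, x_star))  # ≈ [0. 0.]: the necessary condition ∇f(x*) = 0 holds
```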

However, this is obviously not a sufficient condition: any such stationary point can be a maximum, a minimum or a saddle point. Here we need a notion of convexity:

Definition (Convex region). A region S ⊆ R^n is convex iff for all δ ∈ [0, 1] and all x, y ∈ S, we have δx + (1 − δ)y ∈ S. Alternatively, if you take any two points of S, the line joining them lies completely within the region.

(Figure: a non-convex region and a convex region.)

Definition (Convex function). A function f : S → R is convex if S is convex, and for all x, y ∈ S, δ ∈ [0, 1], we have

    δf(x) + (1 − δ)f(y) ≥ f(δx + (1 − δ)y).

(Figure: the chord value δf(x) + (1 − δ)f(y) lies above the graph value f(δx + (1 − δ)y).)

A function is concave if −f is convex. Note that a function can be neither concave nor convex.
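The defining inequality of convexity can be spot-checked numerically. The sketch below uses the hypothetical example f(x) = ‖x‖², which is convex on R^n (the random sampling is just an illustration, not a proof):

```python
import numpy as np

# Hypothetical example: f(x) = ||x||^2 is convex on R^n.
def f(x):
    return float(np.dot(x, x))

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.normal(size=3)
    y = rng.normal(size=3)
    d = rng.uniform()
    # The chord lies above the graph: δf(x) + (1 − δ)f(y) ≥ f(δx + (1 − δ)y).
    assert d * f(x) + (1 - d) * f(y) >= f(d * x + (1 - d) * y) - 1e-12
print("convexity inequality held on all sampled points")
```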

We have the following lemma:

Lemma. Let f be twice differentiable. Then f is convex on a convex set S if the Hessian matrix

    (Hf)_{ij} = ∂²f / (∂x_i ∂x_j)

is positive semidefinite for all x ∈ S, where this fancy term means:

Definition (Positive semidefinite). A matrix H is positive semidefinite if v^T Hv ≥ 0 for all v ∈ R^n.
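In practice one rarely checks v^T Hv ≥ 0 for every v directly: for a symmetric H, positive semidefiniteness is equivalent to all eigenvalues being non-negative. A small Python sketch, where the matrix H is a hypothetical example:

```python
import numpy as np

# Hypothetical Hessian of f(x) = x_1^2 + x_1 x_2 + x_2^2 (constant, since f is quadratic).
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# For symmetric H, v^T H v >= 0 for all v  iff  all eigenvalues of H are >= 0.
eigvals = np.linalg.eigvalsh(H)
print(eigvals)  # both eigenvalues are non-negative, so H is positive semidefinite

# Spot-check the defining inequality on random vectors.
rng = np.random.default_rng(1)
for _ in range(1000):
    v = rng.normal(size=2)
    assert v @ H @ v >= 0
```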

Which leads to the following theorem:

Theorem. Let X ⊆ R^n be convex, and let f : R^n → R be twice differentiable on X. If x^* ∈ X satisfies ∇f(x^*) = 0 and Hf(x) is positive semidefinite for all x ∈ X, then x^* minimizes f on X.

We will not prove these.
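The theorem can be illustrated on a quadratic, where everything is explicit. The sketch below uses a hypothetical f(x) = ½ x^T A x − b^T x with A positive semidefinite, so that ∇f(x) = Ax − b and Hf(x) = A everywhere:

```python
import numpy as np

# Hypothetical quadratic f(x) = 1/2 x^T A x - b^T x with A positive semidefinite.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])

def f(x):
    return 0.5 * x @ A @ x - b @ x

# Stationarity: ∇f(x) = A x - b = 0 gives the linear system A x* = b.
x_star = np.linalg.solve(A, b)

# The Hessian is A everywhere; its eigenvalues are non-negative, so the theorem applies.
assert np.all(np.linalg.eigvalsh(A) >= 0)

# Sanity check: random perturbations never beat x*.
rng = np.random.default_rng(2)
for _ in range(1000):
    assert f(x_star) <= f(x_star + rng.normal(size=2)) + 1e-12
print("x* =", x_star, "minimizes f")
```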

Note that this is helpful, since linear functions are convex (and concave). The problem is that our problems are constrained, not unconstrained. So we will have to convert constrained problems to unconstrained problems.