Csaba Vincze (2013)
University of Debrecen
In what follows let K be a non-empty open convex subset in the coordinate space of dimension n and consider a function
Definition The function f is convex if for any points p and q in K
where λ is in [0,1]. The function f is concave if - f is convex.
The geometric meaning of equation 1.34 is that chords joining the points on the graph are above it. This can be expressed in terms of the so-called epigraph as the following theorem shows.
Proposition 1.6.1 The function f is convex if and only if itsepigraph
is a convex subset in the coordinate space of dimension n+1.
Proof Let f be a convex function and suppose that (p,t) and (q,s) are in epi f. Then
where v:=(1 - λ)p+λq is in K because of its convexity and the scalar "coordinate" satisfies the inequalities
because of the convexity of the function. Therefore epi f is convex (as a set). Conversely, if epi f is a convex set then the chords joining its boundary points are "above" the graph of f and inequality 1.34 follows immediately. ▮
Proposition 1.6.2 (Jensen, Johan) The function f is convex if and only if
for any convex combination of elements from K.
Proof Inequality 1.35 gives the definition of convex functions under the special choice k=2. Conversely, if a function is convex then inequality 1.35 is satisfied in case of k=2 because of the definition of convex functions (if k=1 then there is nothing to prove). For convex combinations involving more that two terms the proof is based on a simple induction. Suppose that 1.35 is true for convex combinations containing at most k - 1 vectors and consider the convex combination
of the elements v(1), ..., v(k) in K. Because at least one of the coefficients must be different from 1 we can write, for example, that
is in K because of its convexity and
and, by the inductive hypothesis,
Relations 1.36 and 1.37 give that
as was to be proved. ▮
Proposition 1.6.3 Let K be a non-empty open convex set. If the function
is convex then it is continuous at any point in K.
Proof Let p in K be a given point (recall that K is a non-empty open convex subset). Without loss of generality we can suppose that p is just the origin 0. As the first step we are going to prove that f is locally bounded. Consider an open box R of dimension n centered at the origin in K. Since the elements in R can be expressed as a convex combination of the vertices
we have that for any v in R
On the other hand
where - v is in R because the origin is the center of the box. Using the upper bound M and the convexity of the function
is a lower bound. Therefore
In the second step we claim that f is locally Lipschitzian. Consider an open ball B centered at p with radius r such that 2B is contained in the box R. Then for each q in B we have a point z not in B but in R such that
is the simple ratio among the points. Using the convexity of the function we have that
because z is not in B but z is in R. Therefore
Using the same argumentation as above for the triplet q, p and u:= - z we have that
The last equation allows us to present a lower estimation
Inequalities 1.39 and 1.42 say that
i.e. the function is locally Lipschitzian and, consequently, it is continuous at p as was to be proved. ▮
Inequalities 1.38 and 1.42 imply more than the continuity: the existence of the one-sided directional derivatives
at each point into each direction. Indeed, consider the function
defined on a sufficiently small open interval. Using the notation q=p+tv inequality 1.42 says that h is bounded from below. Taking t < s and z=p+sv 1.38 shows that h is monotone increasing. Therefore its infimum M*=inf h exists and
For further regularity properties of convex functions see Lebesgue's theorem and . Figure 10 shows why it is important for the point p to be in the interior of the domain.
Definition The element w is called a subgradient of the function f at the point p in K if the inequality
holds for any point q in K. The subdifferential of the function f is the set of its subgradients.
For the geometric description of a subgradient vector write inequality 1.43 into the form
to express that the graph of the function must be entirely above the hyperplane
passing through the point (p, f(p)) in the coordinate space of dimension n+1. The vector (w, - 1) plays the role of the normal vector to the hyperplane 1.44.
The subgradient involves a global property whereas the derivative has a local character. Nevertheless the convexity of the function allows us to describe the set of subgradients locally in terms of the directional derivative.
Proposition 1.6.4 (Local characterization.) Let K be a non-empty open convex set and consider a convex function
The element w is a subgradient at the point p in K if and only if the inequality
holds for any element v in the coordinate space.
Proof Suppose that w is a subgradient of the function f at the point p and let us choose the point q in the special form
where v is a nonzero vector and t is a positive real number which is small enough for q to be in K. Then the relation
follows immediately from the definition of the subgradient. Therefore
In order to see the converse statement let q be an arbitrary point in K and consider the line segment
joining p and q. Since the function is convex, the formula
holds for any parameter t between 0 and 1. Therefore
This means that
as was to be proved. ▮
Corollary 1.6.5 Let K be a non-empty open convex set and consider a convex function
The following conditions are equivalent:
• The point p in K is a global minimizer.
• The zero vector 0 belongs to the subdifferential of f at p.
• For any element v
Proof If p is a global minimizer then for any q in K
showing that 0 is one of the subgradient at p. If 0 is one of the subgradient at p then, by definition
and we have that p is a global minimizer. The equivalence of (ii) and (iii) is a direct consequence of the local characterization Proposition 1.6.4 of the subgradient vectors. ▮
Definition Suppose that
is differentiable at the point p. The gradient vector is defined in terms of the usual partial derivatives:
Actually it is a special notation for the Jacobian matrix at the point p.
For the sake of simplicity we restrict ourselves to the coordinate plane to present the geometric characterization of the gradient vector. We will use the standard symbols x and y for the coordinates of the points in the plane. Let U be a non-empty open subset and consider a (not necessarily convex) function
Suppose that f is continuously differentiable, i.e. it is differentiable everywhere and the partial derivatives are continuous. Let p be a point in U with a non-zero gradient vector. This means, for example, that the partial derivative with respect to the second coordinate at p is different from zero:
Let us define the mapping
is different from zero at p.
Using the inverse mapping theorem we have an inverse function defined on an open neighbourhoof Φ(V) of Φ(p). We are going to give a local parameterization for the level curve
passing through the point p. Let r be a sufficiently small positive real number such that
is a parametrization of the horizontal segment passing through Φ(p) and t is between r and - r. Then
is just a local parametrization of the level curve 1.45 because
which means that the gradient vector field along the level curves is orthogonal to the tangent lines represented by the derivative vector w'.
Remark In case of higher dimensional spaces the gradient vectors are orthogonal to the tangent hyperplanes of the level hypersurfaces.