Ugrás a tartalomhoz

Convex Geometry

Csaba Vincze (2013)

University of Debrecen

8.3 The best affine approximation

8.3 The best affine approximation

Another application of Kirchberger's original theorem is the solution of the problem how to find the best affine approximation for a finite collection F of given points. Geometrically we want to find a hyperplane as "close" to F as possible. Functions with one variables. Let r > 2 different points be given inthe coordinate plane:

( x 1 , y 1 ) , , ( x r , y r ) .

(8.2)

The problem is to find an affine function f(x)=ax+b for which the largest error

δ ( f ) : = m a x { | f ( x 1 ) - y 1 | , , | f ( x r ) - y r | }

(8.3)

is as small as possible. The figure illustrates two approximations of 8.2 with r=5. The negative slope gives a better approximation with respect to 8.3.

Figure 55: The best affine approximation.

The solution of the problem. Let f be a given affine function and consider the sets

A : = { x i | f ( x i ) = y i + δ ( f ) }   a n d   B : = { x i | f ( x i ) = y i - δ ( f ) } .

If

c o n v A c o n v B =

then A and B can be strictly separated by a zero-dimensional hyperplane, i.e. a point x(0) in the coordinate axis[9]. If we rotate the graph of the function around the point (x(0), f(x(0)) with a sufficiently small angle we have a better approximation as the figure shows: for the function f (having a positive slope)

A = x 5   a n d   B = { x 2 , x 3 }

and, consequently,

c o n v A c o n v B = .

The rotation of the graph around the point (x(0), f(x(0))) with a sufficiently small angle into the clockwise direction reduces the absolute value of the difference at the points of both A and B. It should be sufficiently small because the rotation may in fact increase the difference (with absolute value less than δ) at other points of the domain. Therefore if f is one of the best approximations of 8.2 with respect to 8.3 then

c o n v A c o n v B .

To prove the converse statement consider a function f in such a way that the convex hulls of the sets A and B intersect each other. Suppose, in contrary, that f' is a better approximation than f. Then the graph of f' must be lower than the graph of f at each point of A. Especially the same must be true at each point of conv A. On the other hand the graph of f' must be higher than the graph of f at each point of B. Especially the same must be true at each point of conv B. This obviously gives a contradiction to the non-empty intersection of the convex hulls. We have just proved that f is the best approximation if and only if the convex hulls of the sets A and B have a non-empty intersection. Therefore A and B can not be strictly separated which is equivalent, by Kirchberger's separation theorem, to the condition that we have a subset T of A U B containing at most three points such that the sets

T A   a n d   T B

can not be strictly separated. In other words the solutions of the original problem involving r given points are the solutions of the reduced problems involving some three points among the given ones: choose a collection of three points containing T. For the sake of simplicity consider the triplet

( x 1 , y 1 ) , ( x 2 , y 2 ) , ( x 3 , y 3 )

(8.4)

ordered in such a way that x(1)<x(2)<x(3). Then we have the following two possibilities:

A = { x 1 , x 3 }   a n d   B = { x 2 }

or

B = { x 1 , x 3 }   a n d   A = { x 2 } .

Therefore one of the systems of linear equations

y 1 + δ = a x 1 + b ,

y 2 - δ = a x 2 + b ,

y 3 + δ = a x 3 + b

or

y 1 - δ = a x 1 + b ,

y 2 + δ = a x 2 + b ,

y 3 - δ = a x 3 + b

depending on the ordering of the second coordinates should be solved. Geometrically we need a mid-line in the triangle formed by the points (x(i),y(i)), where i=1, 2, 3. After calculating the best approximation for each triplet we can explicitly determine the errors with respect to the full system of points 8.2 to choose the best one.

Functions with two variables. Let r > 3 different points be given in the coordinate spaceof dimension three:

( x 1 , y 1 , z 1 ) , , ( x r , y r , z r ) .

(8.5)

The problem is to find an affine function

f ( x , y ) = a x + b y + c

for which the largest error

δ ( f ) : = m a x { | f ( x 1 , y 1 ) - z 1 | , , | f ( x r , y r ) - z r | }

(8.6)

is as small as possible. The solution of the problem. Let f be a given affine function and consider the sets:

A : = { ( x i , y i ) | f ( x i , y i ) = z i + δ ( f ) }   a n d   B : = { ( x i , y i ) | f ( x i , y i ) = z i - δ ( f ) } .

If

c o n v A c o n v B =

then A and B can be strictly separated by a line l in the coordinate plane (x,y). It can be considered as the orthogonal projection of the line

l ' = { ( x , y , f ( x , y ) ) | ( x , y ) l }

in the space (especially in the graph of the affine function f). Rotating the graph of f around the line l' with a sufficiently small angle we can decrease the (common) error at the points of both A and B. This results in a better approximation than f in a similar way as above. Therefore if f is one of the best approximations of 8.5 with respect to 8.6 then

c o n v A c o n v B .

To prove the converse statement consider a function f in such a way that the convex hulls of the sets A and B intersect each other. Suppose, in contrary, that f' is a better approximation than f. Then the graph of f' must be lower than the graph of f at each point of A. Especially the same must be true at each point of conv A. On the other hand the graph of f' must be higher than the graph of f at each point of B. Especially the same must be true at each point of conv B. This obviously gives a contradiction to the non-empty intersection of the convex hulls. We have just proved that f is the best approximation if and only if the convex hulls of the sets A and B have a non-empty intersection. Therefore A and B can not be strictly separated which is equivalent, by Kirchberger's separation theorem, to the condition that we have a subset T of A U B containing at most four points such that the sets

T A   a n d   T B

can not be strictly separated. In other words the solutions of the original problem involving r given points are the solutions of the reduced problems involving some four points among the given ones: choose a collection of four points containing T.