Home > Geometry, Numerical Analysis, Optimization > Simple optimization problem regarding distances

Simple optimization problem regarding distances

Given {n} points {A_1,...,A_n} in the plane we can study the problem

\displaystyle \min_M MA_1+...+MA_n.

For {n=3} the solution is known and for a triangle which has an angle less than {120^\circ} is the Toricelli (or Fermat) point, which lies at the intersection of the circumcircles of the equilateral triangles built outside the triangle having the sides of the triangle as support.

For {n=4}, if we have a convex quadrilateral, then a simple triangle inequality argument proves that the optimal position for {M} is the intersection of the diagonals.

In general, we cannot precisely say what is the position of the solution. Providing a numerical algorithm which computes the optimal position of {M} is not a hard task, and I will do that in the end of this post.


First let’s prove that the problem always has a solution. I note {|M|=OM} where {O} is a point chosen as the origin of the plane. If {f(M)= MA_1+...+MA_n} then it is easy to see that {\lim_{|M| \rightarrow \infty} f(M)=+\infty}. One way to see this is to write the triangle inequality {OM \leq MA_i+OA_i}.

Therefore the set {\{ M : f(M) \leq k\}} is compact for every real number {k}. It is closed by the continuity of {f} and bounded from the above remark. Pick some {M_0} in the plane and {k=f(M_0)} very large. Since the function {f} is bounded below by {0} there exists {\inf f(M) \geq 0} and {\inf f(M) < k}. We can pick a minimizing sequence {(M_n)} such that {f(M_n) \rightarrow \inf f(M)}.

From a point on {(M_n)} is in {\{ M : f(M) \leq k\}} which is compact. Therefore {(M_n)} has a limit point {S} and taking the limit we find that {f(S)=\inf f(M)}. This proves that the problem has a solution for every configuration of points {A_1,...,A_n}.

Let’s turn to the numerical part. If we want to find numerically the position of {M} then we can use an optimization algorithm like the gradient descent. The objective function is {f(M) = MA_1+...+MA_n} which is easy to calculate.

Let’s find the gradient of {f}. If {v} is a direction in {\Bbb{R}^2} with {|v|=1} then the derivative of {\phi: M \mapsto MA} in {t=0} is

\displaystyle L_v= \lim_{ t \rightarrow 0} \frac{\phi(M+tv)-\phi(M)}{t}.

Denote {M_t = \phi(M+tv)}. Then using the cosine theorem we have

\displaystyle M_tA^2= MA^2+t^2-2t\cdot MA \cdot\cos \angle (v,MA).

Using this we have

\displaystyle L_v = \lim_{t \rightarrow 0} \frac{\sqrt{MA^2+t^2-2t\cdot MA \cdot \cos\angle(v,MA)}-MA}{t}=

\displaystyle = \lim_{t \rightarrow 0} \frac{t^2-2t\cdot MA\cos \angle(v,MA)}{t(\sqrt{MA^2+t^2-2t\cdot MA \cdot \cos\angle(v,MA)}+MA)}=-\cos\angle(v,MA).

This is according to intuition: if we have a segment {MA} and we go from {M} in a direction {v} which points towards the zone of the plane where {A} is then the angle between {MA} and {v} is in the interval {(-\pi/2,pi/2)} region where {\cos} is positive and therefore {L_v} is negative. This means that going in the direction {v} will decrease the distance between {M} and {A} (at least for a little while…).

If the angle between {v} and {MA} is obtuse then {\cos} is negative and going in that direction will increase the distance between {M} and {A}.

The cosine of an angle of two vectors is connected to their scalar product by

\displaystyle \cos \angle (v,MA) = \frac{\langle v , MA \rangle}{ |v||MA|}=\langle v,MA\rangle /MA.

The direction {v} was chosen arbitrary, and to find the gradient we only need the directions corresponding to the canonical base {(1,0)} and {(0,1)}. Therefore if we consider {\phi(M)=\phi(x,y)} where {(x,y)} are the coordinates of {M} then we have

\displaystyle \frac{\partial \phi}{\partial x}(M) = -\langle (1,0) , M \rangle/MA =(x_M - x_A)/\sqrt{(x_M-x_A)^2+(y_M-y_A)^2}


\displaystyle \frac{\partial \phi}{\partial y}(M) = -\langle (0,1) , M \rangle/MA =(y_M - y_A)/\sqrt{(x_M-x_A)^2+(y_M-y_A)^2}

Taking a good look at our results we can see that

\displaystyle \nabla \phi(M) = -\frac{\overrightarrow{MA}}{MA}

which is minus the unit vector in the direction of {A}.

Our function {f} is a sum of functions of the type {\phi} for different points {A} so

\displaystyle \nabla f (M) = -\sum_{i=1}^n \frac{\overrightarrow{MA_i}}{MA_i}.

Having calculated the gradient of {f} we can plug it in a gradient descent algorithm and see the results. Usually for optimization algorithms we need to be close to the solution if we want to find the global minimum. But here, since the distance function is strictly convex, we have a unique minimizer, so the algorithm will converge to it whatever starting point we choose.

Here is a matlab code for calculating the distances and the gradient:

function [val,gradt,exit_opts] = cost_steiner(x,opts)
P = opts.P;
exit_opts = opts;
val = sum(sqrt(sum((P-repmat(x,[1 size(P,2)])).^2,1)));
gradt = P-repmat(x,[1 size(P,2)]);
norms = sqrt(sum(gradt.^2,1));
gradt = gradt*diag(1./norms);
gradt = - sum(gradt,2);
axis equal

Below you have some examples of some executions of the algorithm. Notice that we find the same solution even if we choose different starting points.






  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: