IMC 2021 Problems – Day 1

September 1, 2021 Leave a comment

Problem 1. Let {A} be a real {n\times n} matrix such that {A^3=0}.

(a) Prove that there is a unique real {n\times n} matrix {X} that satisfies the equation

\displaystyle X+AX+XA^2 = A.

(b) Express {X} in terms of {A} 

Problem 2. Let {n} and {k} be fixed positive integers, and let {a} be an arbitrary non-negative integer. Choose a random {k}-element subset {X} of {\{1,2,...,k+a\}} uniformly (i.e., all {k}-element subsets are chosen with the same probability) and, independently of {X}, choose a random {n}-element subset {Y} of {\{1,...,k+n+a\}} uniformly. Prove that the probability

\displaystyle P(\min(Y)>\max(X))does not depend on {a}

Problem 3. We say that a positive real number {d} is good if there exists an infinite sequence {a_1,a_2,... \in (0,d)} such that for each {n} the points {a_1,...,a_n} partition the interval {[0,d]} into segments of length at most {1/n} each. Find \displaystyle \sup \{d | d\text{ is good} \}

Problem 4. Let {f: \Bbb{R}\rightarrow \Bbb{R}} be a function. Suppose that for every {\varepsilon>0} there exists a function {g: \Bbb{R}\rightarrow (0,\infty)} such that for every pair {(x,y)} of real numbers if {|x-y|<\min\{g(x),g(y)\}} then {|f(x)-f(y)|<\varepsilon}.

Prove that {f} is the pointwise limit of a sequence of continuous functions, i.e. there is a sequence of functions {h_n :\Bbb{R} \rightarrow \Bbb{R}} such that {\lim_{n\rightarrow \infty} h_n(x) = f(x)} for every {x \in \Bbb{R}}.

Accelerating convergence: Wynn’s algorithm

May 21, 2021 Leave a comment

Sometimes we know a sequence can be expressed as a power expansion and we might want to evaluate this series for some given value of the variable, to the highest precission possible. If the series converges fast then summing a few terms should give a decent result. If however the sequence converges slowly, many terms should be summed in order to get a good approximation of the final result.

There are many algorithms known for accelerating the convergence of a sequence. One of them is Richardson’s extrapolation. However, here I would like to talk about another such algorithm which is easy to implement and which gives surprisingly good results. I first learned about this algorithm while reading the SIAM 100 digit challenge book (which I recommend to all who want to learn many nice aspects related to scientific computing).

The algorithm takes a sequence of 2n+1 terms, performs some modifications and outputs an extrapolation of the limit of the sequence. Moreover, the result is exact for a power series of exactly n terms. Many online resources exist which can give you the formulas of the epsilon algorithm. Here are the basic ideas. Start with a matrix of size 2n+1\times 2n+1 and set the first column to be equal to the initial values of the sequence.

In order to be coherent with the Python implementation, arrays will be indexed starting from zero. At the first iteration, fill the first 2n elements of the second column with the inverses of the differences of consecutive values from the first column: a_{i,1} = 1/(a_{i+1,0}-a_{i,0}). You may note that the last element from the second column is not used. This will be a characteristic of this algorithm. As we update the columns we have each time one value less than in the previous column.

Starting from the second column, the update formulas change to a_{i,j} = a_{i+1,j-2}+1/(a_{i+1,j-1}-a_{i,j-1}). Perform the operations until you reach column 2n+1. The only element on this column that is updated will give you the approximation of the limit.

Keep in mind that you need an odd number of terms in the sequence in order to apply the algorithm. As an illustration, one may look at the approximation of \pi given by the series of the arctangent: \arctan x = \sum_{n=0}^\infty (-1)^n \displaystyle \frac{x^{2n+1}}{2n+1} . For $x = 1$ four times this series should give \pi. However, note that the terms of the series go to zero very slowly, so this will take a lot of time to converge. Using 200 terms barely gets between two and three digits precision for \pi. Applying Wynn’s algorithm for 21 terms of the sequence gives close to machine precision!

Below you can see a very basic implementation of the algorithm that you can test for yourself. Don’t use it for more than 21 terms since it will give you some errors (since the algorithm cannot advance anymore, and somewhere within there is a division by zero).

# Wynn epsilon algorithm
def Wynn(s):
    n = len(s)
    print(n)
    if(n%2==0):
        s = s[:-1]
        n = n-1
    A = np.zeros((n,n))
    A[:,0] = s
    for j in range(1,n):
        for i in range(0,n-j):
            if j==1:
                A[i,j]=1/(A[i+1,j-1]-A[i,j-1])
            else:
                A[i,j]=A[i+1,j-2]+1/(A[i+1,j-1]-A[i,j-1])
    #print(A)
    return A[0,n-1]

n = 21
s = np.zeros(n)  
s[0] = 4
for i in range(1,n):
    s[i] = (s[i-1]+4*(-1)**i*1/(2*i+1))
    
print(s)
res = Wynn(s)

print("Pi = ",np.pi)
print("Error before Wynn=",np.abs(s[-1]-np.pi)/np.pi)
print("Error after  Wynn=",np.abs(res-np.pi)/np.pi)
Categories: Uncategorized

Inscribed ellipses of largest area in triangles

May 14, 2021 Leave a comment

Ellipses inscribed in a triangle are quite well studied. Such ellipses are often called inellipses. The Wkikpedia article on the Steiner inellipse is quite interesting and summarizes a lot of informations related to the unique ellipse tangent to the edges of the triangles at its endpoints. I wrote an article in the American Mathematical Monthly on the subject a while ago. If you want to consult a copy and don’t have access to AMM check my website!

One of the properties of the Steiner inellipse is the fact that it is the largest area ellipse that is contained in a given triangle. It is interesting to see how this can be proved. When talking about ellipses it is difficult how to see how one can use classical geometrical arguments in order to conclude the proof. A liberating argument the use of affine transformations.

Affine transformations, in short, are transformations of the plane which preserve collinearity and ratios of lengths on lines. They can be thought as translations combined with scalings (which can be with different ratios along different basis directions).  Moreover images of ellipses through affine transformations are still ellipses and of course tangency is preserved.

With this in mind one can easily see that given an arbitrary inellipse E inscribed in a given triangle T it is possible to find an affine mapping which maps this ellipse onto a circle C inscribed into a triangle T_2. Moreover, following the properties of the affine mappings it follows that the ratios between the areas of the inellipse E and the area of T is equal to the ratio between the area of the incircle C to the area of the triangle T_2.

Therefore, the question of finding the maximal area inellipse is reduce to the question of finding what could be the maximal area of the incircle with respect to the area of the triangle. A quick geometric/trigonometric computation shows that the maximal proportion is found when the triangle is equilateral and the ratio in that case is \pi/(3\sqrt{3}).

Moreover, if an affine transformation is used for transforming the Steiner inellipse into a circle, the resulting triangle will have the property that the incircle is tangent to the sides at their midpoints, which only happens for an equilateral triangle. This shows that indeed, the Steiner inellipse has the maximal area among all inellipses and the maximal area is always proportional to the area of the triangle with the factor shown above.

Incidentally, the proof above also shows why there exists such a Steiner inellipse. It is enough to consider an affine mapping that transforms an equilateral triangle into an arbitrary triangle and note that: the incircle is transformed into an inellipse and midpoints are preserved.

Checking the gradient and Hessian implementation using finite differences

April 14, 2021 Leave a comment

Solving a numerical optimization problem depends not only on the optimization algorithm you use, but also on the fact that you implemented correctly the gradient and, eventually, the Hessian matrix associated to the function you want to optimize. The correct implementation of the partial derivatives is not always a trivial calculus question, where you have an analytic formula for your function and you just need to take care when performing the computations. However, even the best students can make a typing error, put a wrong sign somewhere, miss a factor, etc, and in that case the optimization algorithm simply doesn’t work as expected.

Things get even more complicated when the computation of the gradient (and Hessian) goes through some PDE model, multiplying the places in your code where an error might hide. For example in some parametric shape optimization problem, one has a parametric description of the shape and the gradient of the function to be optimized is obtained by putting in the associated shape derivative formula a perturbation field associated to the variation in the corresponding parameter.

Read more…

Gradient algorithm with optimal step: Quadratic Case: theory

April 6, 2021 Leave a comment

In a previous post I looked at the theory behind the convergence of the gradient descent algorithm with fixed step for a quadratic function. In this post I will treat a similar topic, namely the gradient descent algorithm with optimal step for the case of a quadratic function. Let us consider again {J:\Bbb{R}^n\rightarrow \Bbb{R}} defined by

\displaystyle J(x) = \frac{1}{2}\langle Ax,x\rangle -\langle b,x \rangle,

the classical quadratic function, where {A} is symmetric positive definite and {b} is an arbitrary vector. The gradient descent algorithm with optimal step has the form

\displaystyle x_{k+1} = x_k - \mu_k \nabla J(x_k),

where the descent step {\mu_k} is chosen such that

\displaystyle \mu \mapsto J(x_k - \mu \nabla J(x_k))

is minimal.

Read more…

Gradient algorithm: Quadratic Case: theory

March 31, 2021 1 comment

Everyone who did a little bit of numerical optimization knows the principle gradient descent algorithm. This is the simplest gradient based algorithm: given the current iterate, advance in the opposite direction of the gradient with a given step

\displaystyle x_{k+1} = x_k - \mu \nabla J(x_j).

It is straightforward to see that if {\rho} is small enough, {\nabla J(x_k) \neq 0} and {J} is at least {C^1} then

\displaystyle J(x_{k+1}) = J(x_k)-\mu |\nabla J(x_k)|^2 +o(\rho).

This means that there exists a {\rho_k} small enough for which {J(x_{k+1})<J(x_k)}.

Read more…

Putnam 2020 – Problem A3

March 19, 2021 Leave a comment

A3. Let {a_0=\pi/2} and {a_n = \sin(a_{n-1})} for {n\geq 1}. Determine whether

\displaystyle \sum_{n=1}^\infty a_n^2

converges.

Read more…

Putnam 2020 problems – Day 1

March 19, 2021 Leave a comment

A1. How many positive integers {N} satisfy all of the following three conditions?

  • (i) {N} is divisible by {2020}.
  • (ii) {N} has at most {2020} decimal digits.
  • (iii) The decimal digits of {N} are a string of consecutive ones followed by a string of consecutive zeros.

A2. Let {k} be a nonnegative integer. Evaluate

\displaystyle \sum_{j=0}^k 2^{k-j} {k+j \choose j}.

A3. Let {a_0=\pi/2} and {a_n = \sin(a_{n-1})} for {n\geq 1}. Determine whether

\displaystyle \sum_{n=1}^\infty a_n^2

converges.

A4. Consider a horizontal strip of {N+2} squares in which the first and the last square are black and the remaining {N} squares are all white. Choose a white square uniformly at random, choose one of its two neighbors with equal probability, and color this neighboring square black if it is not already black. Repeat this process untill all the remaining white squares have only black neighbors. Let {w(N)} be the expected number of white squares remaining. Find

\displaystyle \lim_{N \rightarrow \infty} \frac{w(N)}{N}.

A5. Let {a_n} be the number of sets {S} of positive integers for which

\displaystyle \sum_{ k \in S} F_k = n,where the Fibonacci sequence {(F_k)_{k\geq 1}} satisfies {F_{k+2}=F_{k+1}+F_k} and begins with {F_1=1,F_2=1,F_3=2,F_4=3}. Find the largest integer {n} such that {a_n = 2020}.

A6. For a positive integer {N}, let {f_N} be the function defined by

\displaystyle f_N(x) = \sum_{n=0}^N \frac{N+1/2-n}{(N+1)(2n+1)}\sin ((2n+1)x).Determine the smallest constant {M} such that {f_N(x)\leq M} for all {N} and all real {x}.

Source: https://kskedlaya.org/putnam-archive/

Circles in a square – Part 2 – upgrade

March 12, 2021 Leave a comment

In the previous post it was shown that if some circles of total length 12 are placed in the unit square there exists a line which intersects at most three of them. The proof looked at the projection of the circles on one side of the square. Since the sum of the diameters was less than 4 it followed that there is a line orthogonal to some edge of the square which does not intersect more than three circles.

But we can say more. Instead of looking at the projection of these circles on one side of the square, look at the projection on the diagonal, which has length \sqrt{2}. The sum of the diameters of these circles is equal to 12/\pi\approx 3.8. Dividing this by \sqrt{2} gives approximately 2.70, which means that the projections of the circles on the diagonal cannot cover it more than three times. As a conclusion, there exists a line orthogonal to one of the diagonals of the square which does not intersect more than two circles!

Furthermore, looking at lines orthogonal to the diagonals gives another perspective into the structure of this problem. In fact, if only finitely many circles are placed inside the square, there are lines which intersect the square, but does not intersect any of the circles! Therefore, the problem is non-trivial only if infinitely many circles with radii that become infinitely small are placed inside the square.

Edit: After some thought, it seems like the statement of the original problem is wrong. Apparently, it seems that one can always find a line which intersects at most one circle. The right question that needs to be asked is the following:

We have some circles in the square with total perimeter equal to 12. Prove that there exists a line which intersects at least four of these circles!

For a proof, check the initial post with arguments reversed. The total projection of the circles on a side of the square does cover it more than three times. Pick one point covered by four of the circles and the orthogonal to the side at this point will intersect four circles!

Circles in a square

March 12, 2021 1 comment

Inside a square of edge length 1 a few circles are drawn. The total perimeter of these circle is equal to 12. Prove that there exists a line that intersects the square which intersects at most three of these circles.

Read more…

Recurrences converging to the wrong limit in finite precision

March 8, 2021 Leave a comment

Consider the following recurrence:

\displaystyle x_0 = 4, \ x_1 = 4.25, \ x_{n+1}=108-(815-1500/x_{n-1})/x_n, \forall n \geq 1.

a) Study this recurrence theoretically: prove that the sequence converges and find its limit.

b) Implement the recurrence using the programming language of your choice and see what is the limit obtained by numerical investigation. Interpret the results!

Read more…

Partitioning a disk with disks

February 7, 2021 2 comments

Suppose that disjoint disks {D_i} with radii {r_i} are placed inside the unit disk {D} such that {\sum_{i=1}^\infty r_i^2=1}. Prove that {\sum_{i=1}^\infty r_i} diverges

Details about this problem and the motivation behind it can be found in the article:An infinite packing theorem for spheres by Oscar Welser, Proc. AMS, Vol. 11, No. 2 (Apr., 1960)

Another way of formulating this problem is the following: a sequence of disks {D_n} which cover {D} exactly (up to a zero-measure set) must have infinite total perimeter.

Read more…

A rational harmonic series

January 9, 2021 1 comment

Can the rational numbers in the interval {[0,1]} be enumerated as a sequence {(q_n)_{n\geq 1}} in such a way that the series {\sum_{n=1}^\infty q_n/n} is convergent?

source: 17th University of Michigan Undergraduate Mathematics Competition 2000, you can also take a look at Bicycle or unicycle? Problem 63

Read more…

Ten bottles of wine – Python code

January 4, 2021 Leave a comment

In a previous post I presented a nice geometry problem which stated that if some bottles are stacked in the box such that the bottom layer extremities touch the sides of the box and each bottle on the next rows touches two bottles from the row below, then the top bottle will be centered in the box.

Below you will find a python code which allows you to produce pictures of stacks of bottles in a box for arbitrary numbers of rows (you may need to modify the code slightly for more than 15 rows).

The code for drawing the bottles goes like this:

import numpy as np
import matplotlib.pyplot as plt

N = 15          # number of rows
r = 1            # radius

skips = np.zeros(N)

# skips between bottom bottles

skips = [0,1.6,0.4,1.2,0.3,0.6,0.7,0.8,0.9,1,0.4,0.4,0.6,0.7,0.9]*r
skips = skips[0:N]
w = N*2*r+np.sum(skips)    # total width

skips = np.cumsum(skips)   # compute cumulative skips

plotw = 0.05*w             # compute space for plot borders

def drawing(N,r,skips):
    lin = np.linspace(0,N,N+1)[0:-1]
    xs = 2*r*lin+skips+r    # coordinates of the bottom bottles
    ys = np.zeros(N)
    plt.figure()
    plt.plot(xs,ys,'.r')
    for i in range(0,N):
        # draw the bottles
        circle1 = plt.Circle((xs[i],ys[i]),r,color='b',fill=False)
        plt.gcf().gca().add_artist(circle1)    
    
    # compute the position of the bottles from 
    # the second row up
    for i in range(1,N):
        newxs = np.zeros(N-i)
        newys = np.zeros(N-i)
        for j in range(0,N-i):
            seg = [xs[j+1]-xs[j],ys[j+1]-ys[j]]
            l   = np.linalg.norm(seg)/2
            nor = [-seg[1],seg[0]]
            nor = nor/np.linalg.norm(nor)
            mid = 0.5*np.array([xs[j+1]+xs[j],ys[j+1]+ys[j]])
            nl  = np.sqrt(4*r**2-l**2)
            newxs[j] = mid[0]+nl*nor[0]
            newys[j] = mid[1]+nl*nor[1]
            # draw circles
            circle1 = plt.Circle((newxs[j],newys[j]),r,color='b',fill=False)
            plt.gcf().gca().add_artist(circle1)
        plt.plot(newxs,newys,'.r')
        xs = newxs
        ys = newys  
    
    print("Last x coordinate: ",newxs[0]," Midpoint: ",w/2)
    
    # box plot 
    botx = np.array([0,w])
    boty = -r*np.ones(2)
    plt.plot(botx,boty,color='k')
    
    leftx = np.zeros(2)
    lefty = np.array([-r,newys[0]+r])
    plt.plot(leftx,lefty,color='k')

    rightx = w*np.ones(2)
    righty = np.array([-r,newys[0]+r])
    plt.plot(rightx,righty,color='k')
    
    # some plot settings
    plt.axis("scaled")
    plt.axis("off")
    plt.xlim([-plotw,w+plotw])
    plt.ylim([-1-plotw,newys+r+plotw])
    plt.savefig("Bottles.png",bbox_inches='tight',pad_inches = -0.1,dpi=300)
    plt.show()
    return 0

drawing(N,r,skips)

Here is an output obtained using this code:

15 rows of bottles: the top one is still centered

The code for drawing the segments between the centers which illustrate the rhombic pattern and which indicate why the top bottle should be centered goes like this:

def drawingRhombi(N,r,skips):
    lin = np.linspace(0,N,N+1)[0:-1]
    xs = 2*r*lin+skips+r   
    ys = np.zeros(N)
    plt.figure()
    plt.plot(xs,ys,'.r',zorder=2)
    
    for i in range(0,N-1):
        plt.plot(xs[i:i+2],ys[i:i+2],color='k',zorder=1)
    
    for i in range(1,N):
        newxs = np.zeros(N-i)
        newys = np.zeros(N-i)
        for j in range(0,N-i):
            seg = [xs[j+1]-xs[j],ys[j+1]-ys[j]]
            l   = np.linalg.norm(seg)/2
            nor = [-seg[1],seg[0]]
            nor = nor/np.linalg.norm(nor)
            mid = 0.5*np.array([xs[j+1]+xs[j],ys[j+1]+ys[j]])
            nl  = np.sqrt(4*r**2-l**2)
            newxs[j] = mid[0]+nl*nor[0]
            newys[j] = mid[1]+nl*nor[1]
            plt.plot([xs[j],newxs[j]],[ys[j],newys[j]],color='k',zorder=1)
            plt.plot([xs[j+1],newxs[j]],[ys[j+1],newys[j]],color='k',zorder=1)
            plt.plot([xs[j],newxs[j]],[-ys[j],-newys[j]],color='k',zorder=1)
            plt.plot([xs[j+1],newxs[j]],[-ys[j+1],-newys[j]],color='k',zorder=1)
        plt.plot(newxs,newys,'.r',zorder=2)
        plt.plot(newxs,-newys,'.r',zorder=2)

        xs = newxs
        ys = newys   
    
    print("Last x coordinate: ",newxs[0]," Midpoint: ",w/2)
    
    plt.axis("scaled")
    plt.axis("off")
    plt.xlim([r-plotw,w-r+plotw])
    plt.ylim([-newys-plotw,newys+plotw])
    plt.savefig("Rhombi.png",bbox_inches='tight',pad_inches = -0.1,dpi=300)

    plt.show()
    return 0

drawingRhombi(N,r,skips)

Here is an output:

The rhombic pattern with symmetry: this shows that the top bottle is exactly centered in the box.
Categories: Uncategorized

Ten Bottles of Wine

December 17, 2020 Leave a comment

Ten identical bottles of wine are stacked up in a bin as shown in the Figure below. Show that the top bottle is centered between the sides of the bin.

Bicycle or Unicycle?, D. Velleman, S. Wagon, Problem 3

One of the nice benefits of being a part of Mathematical Reviews is being able to use some points in order to buy books edited by the American Mathematical Society. The book referenced above is very nice and it contains many puzzle style problems with detailed solutions and nice pictures. The problem at hand is surely intriguing, since the hypotheses are very simple and the conclusion is beautiful.

I am not going to try and write a complete solution here. I will just give you a hint and show you a picture which should help you get to the solution yourself. I wrote a python code which can show a variety of configurations, notably, it can generate such pictures for more than four layers of bottles. The conclusion is still the same: the top bottle is centered, and the proof can be obtained following the same lines.

And here is an example for 10 layers of bottles!

Categories: Geometry, puzzles Tags: ,

Continuous functions on the 2-sphere

December 3, 2020 Leave a comment

The structure of a continuous function defined on the circle with real values is quite intuitive. No one is surprised by the following statement:

A continuous function {f : \Bbb{S}^1 \rightarrow \Bbb{R}} satisfies the following property: there exists {\theta \in \Bbb{S}^1} such that {f(\theta)=f(-\theta)}.

For a proof one simply should take the function {g(\theta) = f(\theta)-f(-\theta)} and notice that {g(-\theta)=-g(\theta)}. Therefore, going from {-\theta} to {\theta} on the circle, the function {g} should pass through zero, which implies that {f} takes the same value on two antipodal points. The following theorems involving the {2} dimensional sphere are less intuitive and it is not immediate to give a proof. Nevertheless, they have some cool applications.

Theorem 1. (Borsuk-Ulam) Let {f: \Bbb{S}^2\rightarrow \Bbb{R}^2} be a continuous function. Then there exists {x \in \Bbb{S}^2} such that {f(x)=f(-x)}.

Note that the proof strategy from the one-dimensional case does not work. The two coordinates of {f(x)-f(-x)} should vanish simultaneously, and it is not straightforward to see why this should happen. Two other similar results are shown below.

Theorem 2. (Kakutani) Let {f: \Bbb{S}^2\rightarrow \Bbb{R}} be a continuous function. Then there exist three points {x_1,x_2,x_3} such that {Ox_1,Ox_2,Ox_3} are mutually orthogonal and {f(x_1)=f(x_2)=f(x_3)}.

Theorem 3. (Dyson) Let {f: \Bbb{S}^2\rightarrow \Bbb{R}} be a continuous function. Then there exist {x_1,x_2,x_3,x_4} vertices of a square centered at the origin such that {f(x_1)=f(x_2)=f(x_3)=f(x_4)}.

We will not try to prove any of these results, since the techniques involved are not always elementary. However, let’s list a short list of cool consequences.

Consequence 1. At any moment in time there exists a place on Earth for which the temperature and the atmospherical pressure is the same as on the opposite side. It is enough to apply Borsuk-Ulam’s theorem for the function which associates to each point the temperature and the pressure, assuming they vary continuously.

Consequence 2. Let {K} be a convex three dimensional body. Then there exists a cube circumscribed to {K} (i.e. every side touches {K}).

For each direction {x \in \Bbb{S}^2} it is possible to associate the distance between the two tangent planes to {K}. This is a continuous function. Applying Kakutani’s result we find that there are three mutually orthogonal directions for which the distances between the tangent planes to {K} orthogonal to these directions are the same. This gives the desired cube circumscribed to {K}. It is possible to generalize this to an arbitrary parallelipiped by using some other variants of Kakutani’s result.

Consequence 3. Let {K} be a three dimensional body which is star-shaped with respect to the origin. Then there exist four points on its boundary which are vertices of a square.

In order to prove this, consider the function {f} which is the distance between the origin and a point of {K} in the direction {x \in \Bbb{S}^2}. This is a continuous function and Dyson’s theorem says that there are two orthogonal diameters in {\Bbb{S}^2} on which this function takes the same value. These will be the vertices of the desired square.

Shapes with minimal area circumscribed around all bodies of constant width

November 26, 2020 Leave a comment

In a previous post I talked about a proof of the Blaschke-Lebesgue theorem (that the Reuleaux triangle is the two dimensional constant width shape with the minimal area). The proof was quite surprising, as it gave a lower bound that was exactly attained for the Reuleaux triangle.

Looking closely at the proof one can deduce the following interesting result:

Among all convex, centrally symmetric domains containing all shapes of a given constant width {w} the regular hexagon of width {w} has the smallest area.

To prove this result is enough to recall the definition of the mixed area of two convex shapes in the plane given by

\displaystyle A(K_1+K_2) = A(K_1)+2A(K_1,K_2)+A(K_2).

Note that {A(K_1,K_2)} is monotone in each one of its arguments.

Then given a constant width shape {B} of width {1}, the shape {B+(-B)} is the unit disk of area {\pi}. Moreover, there exists a hexagon of width one {H} circumscribed to {B}. This gives

\displaystyle \pi = A(B+(-B)) = 2A(B)+2A(B,-B) \leq 2A(B)+2A(H).

This gives {A(B) \geq (\pi-A(H))/2}, and the equality is attained for the Reuleaux triangle.

Suppose now that there exists another shape {H'} which contains all shapes of constant width equal to {1} and {A(H')<A(H)}. Repeating the above argument we would also get

\displaystyle A(B) \geq (\pi-A(H'))/2>(\pi-A(H))/2.

However, plugging the Reuleaux triangle in place of {B} in the above inequality gives a contradiction.

Therefore, among all shapes containing copies of all constant width domains in dimension two, the regular hexagon has the minimal area.

The paper Sets of constant width by G.D. Chakerian gives more details about the proof shown above, as well as some ideas about the three dimensional case, which is still an open problem!

Minimal squared sum minus sum of squares

October 10, 2020 Leave a comment

a) Let {x_1,...,x_n} be real numbers such that {\sum_{i=1}^n |x_i|\leq 1}. Then

\displaystyle \left(\sum_{i=1}^n x_i\right)^2-\sum_{i=1}^n x_i^2 \geq -\frac{1}{2}.

b) Prove that if {\sum_{i=1}^n x_i^2 = \sum_{i=1}^n y_i^2 = 1} then

\displaystyle \left(\sum_{i=1}^n x_iy_i\right)-\sum_{i=1}^n x_i^2y_i^2 \geq -\frac{1}{2}.

Solution: a) Denote with {s_+} and {s_-} the sum of positive and negative terms among the {x_i}:

\displaystyle s_+ = \sum_{x_i\geq 0} x_i,\ s_- = -\sum_{x_i<0} x_i.It is obvious that {\sum_{i=1}^n x_i^2 \leq s_+^2+s_-^2} (it is enough to expand the sums). Moreover, the equality is attained here only if {s_+} and {s_-} contain at most one non-zero term.

Therefore, if {(s_+-s_-)^2-s_+^2-s_-^2\geq -\frac{1}{2}} the original inequality would be proved. This is equivalent to {s_+s_- \leq \frac{1}{4}}. But {s_++s_- = \sum_{i=1}^n |x_i|\leq 1}, which directly implies {s_+s_-\leq \frac{1}{4}}.

Let us now discuss the equality case. Equality holds if and only if {s_+s_-=1/4}. Since {1 \geq (s_++s_-)^2 \geq 4s_+s_-=1} it follows that {s_+=s_-=1/2}. Moreover, equality should also hold in

\displaystyle \sum_{i=1}^n x_i^2 = s_+^2+s_-^2.This means that all but one elements in {s_+} and in {s_-} are zero, which shows that the equality case is attaned for {(1/2,-1/2,0,0...,0)}.

b) It is enough to apply a) for {x_i : = x_iy_i} noting that

\displaystyle (\sum_{i=1}^n |x_iy_i|)^2 \leq \sum_{i=1}^n x_i^2\sum_{i=1}^n y_i^2=1.

Source: The solution here was inspired from the following MathOverflow answer: link.

Categories: Algebra, Inequalities Tags:

Visualizing isosurfaces – Python and Mayavi

October 7, 2020 Leave a comment

Recently I needed to plot some isosurfaces for some 3D unstructured tetrahedral meshes. I had the tetrahedral mesh of some 3D set and a P1 element function defined via the values at the nodes of this mesh. The objective was to visualize some iso surface of the finite element function.

Although this is quite a standard task, it was difficult for me to find the proper solution. Matplotlib did not allow me to interact with the 3D plots, so I decided to look into mayavi. There were a lot of answers on the web concerning isosurfaces on structured meshes (obtained with meshgrids like in Matlab), but that was not what I wanted. There were even some answers where the data from the unstructured mesh was somehow interpolated on a structured mesh and then the isosurface was computed. This is surely un-necessary, since the fact that my mesh was made of tetrahedrons makes it easy to see where the isosurface lies: just look for tetrahedrons in which the values at the vertices are on different sides of the iso-surface value. Fortunately, I did not give up and managed to find how to do this in Mayavi. I found an example of code in some obscure place in the online documentation which did exactly that and I present the code below.

I also had a mesh in the .mesh format (output from FreeFEM) which I needed to import into python. For this I used the “meshio” package.

import meshio
import numpy as np
import matplotlib.pyplot as plt

mesh = meshio.read("CurrentMesh3D.mesh")

mdet = mesh.cells
tetra = mdet[0].data
tri   = mdet[1].data

pts = mesh.points
x = pts[:,0]
y = pts[:,1]
z = pts[:,2]

v = np.loadtxt("OnePhase3D.data")

import mayavi

from mayavi import mlab
from tvtk.api import tvtk

fig = mlab.figure(bgcolor=(1,1,1),size=(400, 350))

tet_type = tvtk.Tetra().cell_type

ug = tvtk.UnstructuredGrid(points=pts)
ug.point_data.scalars = v
ug.point_data.scalars.name = "value"
ug.set_cells(tet_type, tetra)
ds = mlab.pipeline.add_dataset(ug)
iso = mlab.pipeline.iso_surface(ds,contours=[0.5,])

iso.actor.property.opacity = 0.7
iso.contour.number_of_contours = 1

l = mlab.triangular_mesh(x,y,z,tri,color=(0.3,0.3,0.3),
                        opacity=0.4)

mlab.show()

As can be seen in the code above, the isosurface is set with the iso_surface command from mlab.pipeline. Also, it is possible to plot the triangular mesh directly with mlab.triangular_mesh. One result obtained with this code is plotted below.

Isosurface and the surrounding mesh

Of course, the above example can be transformed to work with an unstructured mesh that you already have in python if you just have the tetrahedral information and the values at the vertices. Also you’ll need the triangles making the boundary of the big mesh in order to plot it. You can change the value of the iso surface, the number of isosurfaces, etc by just modifying the appropriate parts of the code.

What I found frustrating is that even though this kind of solution is already implemented, it is not easy to identify in the Mayavi documentation.

Of course, it is also possible to do all of this in ParaView, but I did not manage to find enough example of ParaView scripts in order to learn how to do it. And doing it by hand in ParaView was no fun since I had to do many such pictures and setting them all up to look the same by hand was just too much. (This is again an example of things that are out there, but which are only available to those who know them… Scripting in Paraview seems powerful, but in my case, if I don’t see a large palette of examples I’m unable to learn how to do it)

So hopefully, you’ll find this reference useful if you want to plot an isosufrace of a P1 function on an unstructured mesh in Mayavi!

Challenge: Try to find the place in the Mayavi documentation where a similar example is shown… I did not manage to find the link to put it here even though I saw it two days ago. Search engines did not help. Searching directly will give you examples on meshgrids. I hope I’m wrong and you find it quickly: if you do, please post the link in the comment section! 🙂

IMO 2020 – Problem 2

September 30, 2020 Leave a comment

Problem 2. The real numbers {a,b,c,d} are such that {a \geq b \geq c \geq d>0} and {a+b+c+d=1}. Prove that

\displaystyle (a+2b+3c+4d)a^ab^bc^cd^d<1.

Hints: This problem is not difficult if one knows a bit of real analysis. The function {x \mapsto x^x} is convex and can be extended continuously to {[0,1]} noting that {\lim_{x \rightarrow 0} x^x=1}. Therefore, the function {(a,b,c,d) \mapsto (a+2b+3c+4d)a^ab^bc^cd^d} is can be extended continuously to {[0,1]^4} and it attains its minimum and maximum values. Moreover, the constraint {a+b+c+d=1} defines a compact set inside {[0,1]^4} so there exists a minimum and a maximum for this function on the given domain. If at the maximum some of the variables should be {0}, the restriction {a,b,c,d} would make the inequality in the hypothesis to be sharp, but strict.

Having no other idea than to go ahead with the analysis, a natural idea is to see what happens when perturbing two of the variables while maintaining the constraint. First note that the case where {a=b=c=d=1/4} gives the value {10/16} which is obviously less than {1}. Taking {b,c,d \rightarrow 0} gives values as close to {1} as we want, so this the maximum is surely not attained when all variables are equal. So we may well assume that at the maximum there exist two variables, say {a} and {b} such that {a>b}.

Then, change the variables {a} and {b} to {a-t,b+t}, with {t>0}, which preserves the constraint. Consider the function

\displaystyle f: t \mapsto (a+2b+3c+4d+t)(a-t)^{(a-t)}(b+t)^{(b+t)}c^cd^d.

The derivative of this function is

\displaystyle (a-t)^{(a-t)}(b+t)^{(b+t)}c^cd^d[1+(a+2b+3c+4d+t)[\ln(b+t)-\ln(a-t)]].

It is not difficult to see that the function above is increasing with respect to {t}. Therefore, the function {f} is convex and attains its maxima at the extrema of the function are attained when {a-t} and {b+t} are as far away as possible. Therefore, {(a,b,c,d)} is not a maximum for the above function when {a<1}. A similar argument works when perturbing other pairs of points. Therefore the maximal value is attained when {a=1} and {b=c=d=0} and this maximal value is {1}. Since the hypothesis clearly states {b,c,d>0}, it follows that the upper bound is indeed {1} but is not attained.

%d bloggers like this: