### Archive

Posts Tagged ‘project euler’

## Sum of the Euler Totient function

Given a positive integer ${n}$, the Euler totient function ${\varphi(n)}$ is defined as the number of positive integers less than or equal to ${n}$ which are co-prime with ${n}$ (i.e. their only common factor with ${n}$ is ${1}$); in particular ${\varphi(1)=1}$. There are formulas for computing ${\varphi(n)}$ starting from the factorization of ${n}$. One such formula is

$\displaystyle \varphi(n) = n \prod_{p|n} \left(1-\frac{1}{p}\right),$

where the product is made over all primes dividing ${n}$.
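As an illustration, here is a minimal Python sketch of this product formula, assuming plain trial division for the factorization (the function name `totient` is chosen for this example only):

```python
def totient(n: int) -> int:
    """Euler's totient via n * prod(1 - 1/p) over the primes p dividing n."""
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:      # strip every copy of the prime p
                n //= p
            result -= result // p  # multiply by (1 - 1/p) in integer arithmetic
        p += 1
    if n > 1:                      # what is left is a single prime factor
        result -= result // n
    return result
```

For instance, `totient(12)` uses the primes 2 and 3 and returns `12 * (1/2) * (2/3) = 4`.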

If you have to compute ${\varphi(n)}$ for all numbers less than a threshold then another property could be useful: ${\varphi}$ is multiplicative, that is, ${\varphi(mn) = \varphi(m)\varphi(n)}$ whenever ${\gcd(m,n)=1}$. Therefore you could store all values computed up to ${k}$, and for computing the value ${\varphi(k+1)}$ there are two possibilities: either ${k+1=p^\alpha}$ is a prime power, and then ${\varphi(k+1) = p^{\alpha}-p^{\alpha-1}}$, or ${k+1}$ is not a prime power, and then ${k+1 = mn}$ with ${1<m,n\leq k,\ \gcd(m,n)=1}$. In the second case the stored values give ${\varphi(k+1)=\varphi(m)\varphi(n)}$.
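The idea of reusing stored values can be sketched as follows in Python. This is one possible arrangement, not necessarily the one used for the timings in this post: it assumes a table of smallest prime factors (the names `totients_up_to` and `spf` are made up here) to split each ${k}$ into a prime power and a co-prime remainder.

```python
from math import isqrt

def totients_up_to(limit: int) -> list[int]:
    """Return phi[0..limit], computing phi[k] from values already stored."""
    spf = list(range(limit + 1))             # spf[k] = smallest prime factor of k
    for p in range(2, isqrt(limit) + 1):
        if spf[p] == p:                      # p is prime
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p
    phi = [0] * (limit + 1)
    if limit >= 1:
        phi[1] = 1
    for k in range(2, limit + 1):
        p = spf[k]
        prime_power, m = 1, k
        while m % p == 0:                    # split k = p^alpha * m, gcd(p, m) = 1
            m //= p
            prime_power *= p
        phi_prime_power = prime_power - prime_power // p   # p^alpha - p^(alpha-1)
        # prime power: use the closed formula; otherwise use multiplicativity
        phi[k] = phi_prime_power if m == 1 else phi_prime_power * phi[m]
    return phi
```

Summing the returned list gives the sum of the totients up to `limit` by brute force, which is handy for checking faster methods on small inputs.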

I now come to the main point of this post: computing the sum of all values of the totient function up to a certain ${N}$:

$\displaystyle \text{Compute } S(N) = \sum_{i=1}^N \varphi(i).$

One approach is to compute each ${\varphi(i)}$ and sum them. I will call this the brute-force approach. For all numerical purposes I will use Pari-GP in this post. On my computer it takes less than a second to compute ${S(10^6)}$ and about ${12}$ seconds to compute ${S(10^7)}$. This is super linear in time, since the algorithm computes the factorization for each ${n}$ and then sums the values. Using the sieve approach could improve the timing a bit, but the algorithm is still super linear.

In some Project Euler problems it is not uncommon to have to compute something like ${S(10^{11})}$ or even larger. Therefore, there must be more efficient ways to compute ${S(N)}$ out there, so let’s study some of the properties of ${S(N)}$. In another post I dealt with the acceleration of the computation of the sum of the divisor function.

We have ${S(N) = \sum_{i=1}^N \varphi(i)}$, which is the number of pairs ${(a,b)}$ with ${1\leq a\leq b \leq N}$ such that ${\gcd(a,b) =1}$. It is not difficult to see that the total number of pairs ${(a,b)}$ with ${1\leq a\leq b \leq N}$, without any condition on the gcd, is ${N(N+1)/2}$. Moreover, the possible values of ${\gcd(a,b)}$ are ${1,2,...,N}$. Now, if for ${m \leq N}$ we search instead for pairs satisfying ${\gcd(a,b)=m}$ then we have ${a = ma',\ b = mb'}$ with ${\gcd(a',b')=1}$ and we get

$\displaystyle 1 \leq a' \leq b' \leq N/m,\ \gcd(a',b')=1.$

Therefore the number of pairs with gcd equal to ${m}$ is ${S(\lfloor N/m\rfloor )}$. Now we arrive at an interesting recursive formula:

$\displaystyle S(N) = \frac{N(N+1)}{2} - \sum_{m=2}^N S(\lfloor N/m \rfloor ).$
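As a quick sanity check of this formula (a worked example added for illustration), take ${N=4}$. Directly, ${S(4)=\varphi(1)+\varphi(2)+\varphi(3)+\varphi(4) = 1+1+2+2=6}$. The formula gives the same value, using ${S(1)=1}$ and ${S(2)=2}$:

$\displaystyle S(4) = \frac{4\cdot 5}{2} - S(\lfloor 4/2\rfloor) - S(\lfloor 4/3\rfloor) - S(\lfloor 4/4\rfloor) = 10 - 2 - 1 - 1 = 6.$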

At first sight this looks more complicated, but there is a trick to keep in mind whenever you see a summation over ${m}$ of terms of the form ${\lfloor N/m \rfloor}$: these quantities are constant on large intervals. Indeed,

$\displaystyle \lfloor N/m \rfloor = d \Leftrightarrow md \leq N < m(d+1)\Leftrightarrow N/(d+1) < m \leq N/d.$

Therefore we can change the index of summation from ${m}$ to ${d=\lfloor N/m \rfloor}$. The number of integers ${m}$ with ${\lfloor N/m \rfloor = d}$ is the number of integers in the interval ${I_d = (N/(d+1),N/d]}$, whose length is ${N/d-N/(d+1) = N/(d(d+1))}$. Roughly speaking, for ${d\leq \sqrt{N}}$ this length is at least ${1}$, so ${I_d}$ contains at least one integer, and often many. The part where ${d}$ is larger than ${\sqrt{N}}$ corresponds to ${m}$ smaller than ${\sqrt{N}}$. Therefore, we can split the sum in the formula for ${S(N)}$ into two sums, each with about ${\sqrt{N}}$ terms, and get that

$\displaystyle S(N) = \frac{N(N+1)}{2}- \sum_{m=2}^{\sqrt{N}} S(\lfloor N/m \rfloor)-\sum_{d=1}^{\sqrt{N}}\left(\lfloor N/d \rfloor-\lfloor N/(d+1)\rfloor\right)S(d),$

where in the last sum we must make sure that ${d \neq \lfloor N/d \rfloor}$ in order to avoid duplicating terms in the sum.
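The grouping of the indices ${m}$ into blocks on which ${\lfloor N/m \rfloor}$ is constant can be sketched in a few lines of Python. This is a minimal illustration; the name `floor_blocks` is made up for this example and is not taken from the Pari GP code used for the timings.

```python
def floor_blocks(N: int):
    """Yield (d, first_m, last_m) such that N // m == d exactly for first_m <= m <= last_m."""
    m = 1
    while m <= N:
        d = N // m
        last_m = N // d        # largest m with N // m == d
        yield d, m, last_m
        m = last_m + 1         # jump directly to the start of the next block


for d, first_m, last_m in floor_blocks(10):
    print(f"floor(10/m) = {d} for m in [{first_m}, {last_m}]")
```

Each block costs one step of the loop, so running through all values of ${\lfloor N/m \rfloor}$ takes about ${2\sqrt{N}}$ steps instead of ${N}$.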

We have therefore replaced a sum with ${N}$ terms by two sums with about ${\sqrt{N}}$ terms each. The complexity is not ${\sqrt{N}}$, since the computation is recursive: with memoization alone it is of order ${N^{3/4}}$, and it drops to something like ${N^{2/3}}$ if the values of ${S}$ for small arguments are precomputed with a sieve. Nevertheless, with this new formula and using memoization to keep track of the values of ${S}$ already computed, we can compute ${S(N)}$ very fast:

${S(10^6) = 303963552392}$ is computed instantly (vs ${1}$ second with brute force)

${S(10^7) = 30396356427242}$ takes ${1}$ second (vs ${12}$ seconds with brute force)

${S(10^8)}$ takes ${5}$ seconds (vs over ${3}$ minutes with brute force)

${S(10^9)}$ takes ${30}$ seconds

${S(10^{11})}$ takes about ${12}$ minutes

etc. Recall that these computations are done in Pari GP, which is not too fast. If you use C++ you can compute ${S(10^8)}$ in ${0.2}$ seconds, ${S(10^9)}$ in ${1}$ second and ${S(10^{10})}$ in ${6}$ seconds and ${S(10^{11})}$ in under a minute, if you manage to get past overflow errors.
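For reference, here is a minimal Python sketch of the memoized recursion. It is not the Pari GP or C++ code timed above: the name `totient_sum` is made up, and instead of writing the two sums explicitly it walks over the blocks where ${\lfloor N/m \rfloor}$ is constant, which amounts to the same grouping of terms. Python integers do not overflow, so that issue does not appear here.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def totient_sum(N: int) -> int:
    """S(N) = phi(1) + ... + phi(N), via S(N) = N(N+1)/2 - sum_{m>=2} S(N // m)."""
    if N <= 1:
        return N                      # S(0) = 0, S(1) = 1
    total = N * (N + 1) // 2
    m = 2
    while m <= N:
        d = N // m
        last_m = N // d               # N // m == d for every m in [m, last_m]
        total -= (last_m - m + 1) * totient_sum(d)
        m = last_m + 1
    return total
```

Calling `totient_sum(10**6)` reproduces the value ${303963552392}$ quoted above.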

Categories: Number theory, Programming

## A hint for Project Euler Pb 613

The text for Problem 613 can be found here. The hint is the following picture 🙂

## Project Euler – Problem 264

Today I managed to solve problem 264 from Project Euler. This is my highest rated problem so far: 85%. You can click the link for the full text of the problem. The main idea is to find all triangles ABC with vertices having integer coordinates such that

• the circumcenter O of each of the triangles is the origin
• the orthocenter H (the intersection of the altitudes) is the point of coordinates (0,5)
• the perimeter is lower than a certain bound

I will not give detailed advice or code. You can already find a program online for this problem (I won’t tell you where) and it can serve to verify the final code, before going for the final result. Anyway, following the hints below may help you get to a solution.

The initial idea has to do with a geometric relation linking the points A, B, C, O and H. Anyone who did some problems with vectors and triangles should have come across the needed relation at some time. If not, just search for important lines in triangles, especially the line passing through O and H (and another important point).

Once you find this vectorial relation, it is possible to translate it into coordinates. The fact that points A, B, C are on a circle centered at O shows that their coordinates satisfy an equation of the form $x^2+y^2=n$, where $n$ is a positive integer, not necessarily a square… It is possible to enumerate all solutions to this equation for fixed $n$, simply by looping over $x$ and $y$. This helps you find all lattice points on the circle of radius $\sqrt{n}$.
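A minimal Python sketch of this enumeration step is given below. It is a small variation on the double loop described above: it loops over $x$ only and solves for $y$ with an integer square root. The name `lattice_points_on_circle` is made up for this example, and nothing here touches the orthocenter condition.

```python
from math import isqrt

def lattice_points_on_circle(n: int) -> list[tuple[int, int]]:
    """All integer points (x, y) with x^2 + y^2 == n."""
    points = []
    r = isqrt(n)                       # no solution has |x| larger than this
    for x in range(-r, r + 1):
        y_squared = n - x * x
        y = isqrt(y_squared)
        if y * y == y_squared:         # y_squared is a perfect square
            points.append((x, y))
            if y != 0:                 # avoid listing (x, 0) twice
                points.append((x, -y))
    return points
```

For example, $n=25$ gives the twelve points obtained from $(\pm 5,0)$, $(0,\pm 5)$, $(\pm 3,\pm 4)$ and $(\pm 4,\pm 3)$.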

Once these lattice points are found one needs to check the orthocenter condition. The relations are pretty simple and in the end we have two conditions to check for the sum of the x and y coordinates. The testing procedure is a triple loop. We initially have a list of points on a circle, from the previous step. We loop over them such that we don't count triangles twice: i from 1 to m, j from i+1 to m, k from j+1 to m, etc. Once a suitable solution is found, we compute the perimeter using the classical distance formula between two points given in coordinates. Once the perimeter is computed we add it to the total.

Since the triple loop has cubic complexity, one could turn it into a double loop. Loop over pairs and construct the third point using the orthocenter condition. Then just check if the point is also on the circle. I didn’t manage to make this double loop without overcounting things, so I use it as a test: use double loops to check every family of points on a given circle. If you find something then use a triple loop to count it properly. It turns out that cases where the triple loop is needed are quite rare.

So now you have the ingredients to check if on a circle of given radius there are triangles with the desired properties. Now we just iterate over the square of the radius. The problem is to find the proper upper bound for this radius in order to get all the triangles with perimeter below the bound. It turns out that a simple observation can get you close to a near optimal bound. Since in the end the radii get really large and the size of the triangles gets really large, the segment OH becomes small, being of fixed length 5. When OH is very small, the triangle is almost equilateral. Just use the upper bound for the radius for an equilateral triangle of perimeter equal to the upper bound of 100000 given in the problem.
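To make the equilateral estimate concrete (a rough sketch; the symbols $R$ for the circumradius and $P$ for the perimeter bound are introduced here): an equilateral triangle inscribed in a circle of radius $R$ has side $R\sqrt{3}$, hence perimeter $3\sqrt{3}R$. With $P=10^5$ this gives

$\displaystyle R \approx \frac{P}{3\sqrt{3}} \approx 19245, \qquad n = R^2 \approx \frac{P^2}{27} \approx 3.7\times 10^{8}.$

Among triangles inscribed in a given circle the equilateral one has the largest perimeter, so valid triangles may sit on slightly larger circles and the loop over $n$ should go a bit beyond this value.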

Using these ideas you can build a brute-force algorithm. Plotting the values of the radii which give valid triangles will help you find that you only need to loop over a small part of the radii values. Factoring these values will help you reduce the search space even more. I managed to solve the problem in about 5 hours in Pari GP. This means things could be improved. However, having an algorithm which can give the result in “reasonable” time is fine by me.

## Project Euler 607

If you like solving Project Euler problems you should try Problem number 607. It’s not very hard, as it can be reduced to a small optimization problem. The idea is to find a path which minimizes time, knowing that certain regions correspond to different speeds. A precise statement of the result can be found on the official page. Here’s an image of the path which realizes the shortest time:

## Project Euler tips

A few years back I started working on Project Euler problems mainly because it was fun from a mathematical point of view, but also to improve my programming skills. After solving about 120 problems I seem to have hit a wall, because the numbers involved in some of the problems were just too big for my simple brute-force algorithms.

Recently, I decided to try and see if I can do some more of these problems. I cannot say that I’ve acquired any new techniques between 2012 and 2016 concerning the mathematics involved in these problems. My research topics are usually quite different and my day to day programming routines are more about constructing new stuff which works fast enough than optimizing actual code. Nevertheless, I have some experience coding in Matlab, and I realized that nested loops are to be avoided. Vectorizing the code can speed things up 100 fold.

So the point of Project Euler tasks is making things go well for large numbers. Normally all problems are tested and should run within a minute on a regular machine. This brings us to choosing the right algorithms, the right simplifications and finding the complexity of the algorithms involved.

## Project Euler Problem 285

Another quite nice problem from Project Euler is number 285. The result of the problem depends on the computation of a certain probability, which in turn is related to the computation of a certain area. Below is an illustration of the computation for k equal to 10.

To save you some time, here’s a picture of the case k=1 which I ignored and spent quite a lot of time debugging… Plus, it only affects the last three digits or so after the decimal point…

Here’s a Matlab code which can construct the pictures above and can compute the result for low cases. To solve the problem, you should compute explicitly all these areas.


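```matlab
function problem285(k)
% Monte Carlo picture for the given k, then the areas A(1..k) by quadrature.

N = 100000;                      % number of random sample points

a = rand(1,N);
b = rand(1,N);

% sample points (a,b) for which the rounded square root equals k
ind = find(abs(sqrt((k*a+1).^2+(k*b+1).^2)-k)<0.5);

plot(a(ind),b(ind),'.');
axis equal

M  = k;                          % largest value of k to process
pl = M;                          % value of k whose boundary arcs are drawn; it has to
                                 % match the scatter plot (the original had pl = 1)
A  = zeros(1,M);                 % A(j) = area of the favourable region for k = j

for j = 1:M
    if mod(j,100)==0
        disp(j)                  % progress indicator
    end
    r1 = (j+0.5)/j;              % outer radius
    r2 = (j-0.5)/j;              % inner radius

    % height of each arc above the x axis on [0,1], set to 0 where it is negative
    f1 = @(x) (x<=(-1/j+r1)).*(x>=(-1/j-r1)).*(sqrt(r1^2-(x+1/j).^2)-1/j).*(x>=0).*(x<=1);
    f1 = @(x) f1(x).*(f1(x)>=0);
    f2 = @(x) (x<=(-1/j+r2)).*(x>=(-1/j-r2)).*(sqrt(r2^2-(x+1/j).^2)-1/j).*(x>=0).*(x<=1);
    f2 = @(x) f2(x).*(f2(x)>=0);

    if j == pl
        % draw the two circular arcs and the unit square
        thetas = linspace(0,pi/2,200);
        hold on
        plot(-1/j+r1*cos(thetas),-1/j+r1*sin(thetas),'r','LineWidth',2);
        plot(-1/j+r2*cos(thetas),-1/j+r2*sin(thetas),'r','LineWidth',2);
        plot([0 1 1 0 0],[0 0 1 1 0],'k','LineWidth',3);
        hold off
        axis off
    end

    % area between the two arcs inside the unit square
    A(j) = integral(@(x) f1(x)-f2(x),0,1);
end

% leave a small margin around the picture
xs = xlim;
ys = ylim;

w = 0.01;
axis([xs(1)-w xs(2)+w ys(1)-w ys(2)+w]);

% expected score: k points with probability A(k), summed over k = 1..M
disp(sum((1:M).*A))
```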
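The curves drawn by the code above come from rewriting the scoring condition (a short derivation, reading the condition off the line that defines `ind`): the square root rounds to $k$ exactly when

$\displaystyle k-\tfrac12 \leq \sqrt{(ka+1)^2+(kb+1)^2} < k+\tfrac12 \iff \frac{k-1/2}{k} \leq \sqrt{\left(a+\tfrac1k\right)^2+\left(b+\tfrac1k\right)^2} < \frac{k+1/2}{k}.$

So the favourable region is the part of the unit square lying in the annulus centred at $(-1/k,-1/k)$ with radii `r2` $=(k-1/2)/k$ and `r1` $=(k+1/2)/k$. The corner $(0,0)$ of the square is at distance $\sqrt{2}/k$ from that centre, so the inner circle misses the square entirely when $k-1/2<\sqrt{2}$, that is, only for $k=1$, which would explain why that case needs separate treatment.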