pythonperformance

Project Euler 14 code efficiency


l = [[i, i, 1] for i in range(1,1000000)]
def collatz(li):
    for el in li:
        if el[1] == 1:
            li.remove(el)
        elif el[1] % 2 == 0:
            el[1] = el[1] / 2
            el[2] += 1
        elif el[1] % 2 == 1:
            el[1] = 3*el[1] + 1
            el[2] += 1
    return li
while len(collatz(l)) >= 2:
l = collatz(l)

print l

Hi, this is a (partial) solution to Euler problem 14, written in Python.

Longest Collatz sequence

Problem 14

The following iterative sequence is defined for the set of positive integers:

n → n/2 (n is even)
n → 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:

13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1

It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.

I wrote partial because it does not really output the solution since I can't really run it in the whole 1 - 1000000 range. It's way too slow - taking more than 20 minutes the last time I killed the process. I have barely just started with python and programming in general (about 2 weeks) and I am looking to understand what's the obvious mistake I am making in terms of efficiency. I googled some solutions and even the average ones are orders of magnitude faster than mine. So what am I missing here? Any pointers to literature to avoid making the same mistakes in the future?


Solution

  • the problem is you use brute force algorithm that is inefficient.this is my solution to problem 14 from project Euler. it takes a few second to run. the key is you should save previous results in a dictionary so you don't have to compute those results again.:

    #problem 14 project euler
    import time
    start=time.time()
    has2={}
    def collatz(x):
        seq=[]
        seq.append(x)
        temp=x
        while(temp>1):
            if temp%2==0:
                temp=int(temp/2)
                if temp in has2:
                    seq+=has2[temp]
                    break
                else:
                    seq.append(temp)
            else:
                temp=3*temp+1
                if temp in has2:
                    seq+=has2[temp]
                    break
                else:
                    seq.append(temp)
    
    
        has2[x]=seq            
        return len(seq)
    
    num=0
    greatest=0
    for i in range(1000000):
        c=collatz(i)
        if num<c:
            num=c
            greatest=i
    print('{0} has {1} elements. calculation time ={2} seconds.'.format(greatest,num,time.time()-start))