pythonlistfor-loopprimes

Why does my script stop execution after 33178210th iteration?


I am writing a python program to find and return a list of Twin primes between two numbers.

This is my code:

#prime selector
def is_prime(num):
    if num < 2 or num % 2 == 0: 
        return False
    if num == 2: 
        return True
    upr_lmt = int((num ** 0.5) + 1) 
    for number in range(3, upr_lmt, 2): 
        if num % number == 0:
            return False
    return True


def primeLister(end_point, st_point = 2):
    primeList = []
    for i in range(st_point, end_point + 1):
        if is_prime(i):
            primeList.append(i)
    return primeList

It is saved as prime_selector.py in a folder.

This is the testcase that I have written:

# Test case
from prime_selector import primeLister

def log_newline(file_writer):
    file_writer.write("\n\n")

def log_testcase(case_num, case_desc, result, file_writer):
    file_writer.write(f"================= Test Case {case_num + 1} =================\n")
    print((f"================= Test Case {case_num + 1} ================="))
    file_writer.write(case_desc)
    print(case_desc, end = "")
    file_writer.write(result)
    print(result)

num_list = [99991,999999937]

with open("./primeLister_testcase","w") as file_writer:
    for case_num, i in enumerate(num_list):
        case_desc = f"Listing all prime numbers till {i}:\n"
        result = "".join([f" > {prime}\n" for prime in primeLister(i)])

        log_testcase(case_num, case_desc, result, file_writer)

        if case_num + 1 != len(num_list):
            log_newline(file_writer)    

The testcase is also saved in the same folder locally. The program stops abruptly during the execution of primeLister testcase after writing the primes up to 99991 in the file.

I guess that it stops during the process of listing primes between 99991 and 999999937. I've got no idea why. My computer has an Intel i5-12450H processor with 16GB RAM. If that's relevant.

I have already tried implementing an infinite loop, threading and also using exception handling. Nothing has worked so far. What am I doing wrong?

I was expecting the program to keep executing until it found all the primes within the provided range.


Solution

  • The problem is that your process is aiming to produce about 50 million primes for the input 999999937, accumulating them in a list, and does this in an non-optimal way, as for each integer 𝑘 between 1 and 999999937 you call is_prime, which executes √𝑘/2 iterations when 𝑘 is prime.

    Then, after a long running time, you concatenate those 50 million numbers with their decimal representation into one, long string, which will be about 500 million characters long. That is half a GB if we assume a 1-byte character representation. It may be that your program runs out of memory... if it ever finishes producing the primes in the first place.

    Running your code with a small adaptation, turning primeLister into a generator, and then printing the primes one by one (well, I only printed one per 100000 to save time), I could monitor the progress, I got that generating the first million of primes took about 2 minutes, and it continued like this:

    number of primes time needed
    1,000,000 2 minutes
    2,000,000 5 minutes
    3,000,000 9 minutes
    4,000,000 14 minutes
    .... ...

    I stopped there, but extrapolating, I estimate that it would take about 22 hours to finish the call to primeLister with input 999999937, and only then the string of 0.5 GB would be created from it. I suppose your system just kills the process either because of memory or time constraints. In my test it didn't get killed for the 15 minutes I waited for it. Then I killed it myself.

    But it is clear that this is not practical. You should use a sieve approach.