python-3.x multithreading amazon-s3 boto3

Error "cannot schedule new futures after interpreter shutdown" while working through threading


I've had an issue for three days that I can't resolve on my own. We have an application that creates JSON files and uploads them to Amazon S3 through the boto3 library. The app was developed on Python 3.8 and there were no issues. Then Python was upgraded to 3.9+ and the issue appeared. We need to use threading in this app, so we created a new class for it:

class NewThread(Thread):
    def __init__(self, name):
        Thread.__init__(self)
        self.name = name

    def run(self):
        global i, listings
        if self.name=='control':
            # here is control-thread. Code removed for this example
            while True:
                time.sleep(10)
        else:
            i += 1
            print(f'Thread {self.name} works on {files[i]}')
            try:
                create_file(files[i])
                move_file(c.root+f'json/{files[i].replace(".", "-")}.json', 's3folder')
            except Exception as e:
                get_exception(e)

The create_file() function is long and uninteresting. It creates a JSON file of 20-25 KB and does nothing complicated. The files must then be moved to S3 with the move_file() function. Here is the code:

# Function for moving files to s3 bucket
def move_file(file, path, bucket=c.s3cfg['bucket'], folder=c.s3cfg['folder']):
    s3 = boto3.client('s3', aws_access_key_id=c.s3cfg['access_key'], aws_secret_access_key=c.s3cfg['secret_key'])
    name = file.split('/')
    name = folder + '/' + path + '/' + name[len(name) - 1]
    try:
        s3.upload_file(file, bucket, name)
        os.remove(file)
    except Exception as e:
        get_exception(e)

The threads are started like this:

def start_thread(count=5):
    NewThread(name='control').start()
    for i in range(count):
        name = f'thread_{i+1}'
        threads[name] = NewThread(name=name)
        threads[name].start()
        time.sleep(0.5)

Here is the error message:

cannot schedule new futures after interpreter shutdown; Place: script.py; Line: 49;

This line refers to s3.upload_file(file, bucket, name) in the code. But the error doesn't show up every time; sometimes a few files are sent to the server before the error starts. boto3 works fine in a separate non-threaded script, even when called through the move_file() function. And this code works fine on Python 3.8. It looks like there is some global shutdown variable that is being set to True somewhere during execution. Please help me understand.


Solution

  • I stumbled upon exactly the same problem, and it's not a problem with boto3. MVE:

    import threading
    import boto3
    
    class worker (threading.Thread):
        terminate = False
        def __init__(self):
            threading.Thread.__init__(self)
        def run(self):
            # make BOTO3 CLIENT
            s3_client = boto3.client(...)
            while not self.terminate:
                # boto3 downloads from a global list, not shown in this code
                s3_client.download_file(...)
        def stop(self):
            self.terminate = True
    
    mythread = worker()
    mythread.start()
    # **************** THIS IS IMPORTANT
    mythread.join()
    # **************** /THIS IS IMPORTANT
    

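    Your error is most likely that the main thread doesn't wait for the other threads to complete. boto3 probably needs resources from the main thread for its operations: as far as I can tell, its transfer functions submit work to a concurrent.futures thread pool, and from Python 3.9 on that pool rejects new tasks once the main thread has finished and interpreter shutdown has begun.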

    Adding mythread.join() solved it for me.
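
Applied to a start function like the one in the question, the same idea might look roughly like the sketch below. It is only an illustration under assumptions: fake_upload() is a hypothetical stand-in for the real s3.upload_file() call (it hands work to a thread pool so the example runs without AWS credentials), and start_threads() and its filenames argument are made-up names, not the asker's actual code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fake_upload(filename):
    """Hypothetical stand-in for s3.upload_file(); hands the work to a thread pool."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: f"uploaded {filename}").result()


def start_threads(filenames):
    """Start one worker per file, then block until every worker has finished."""
    results = []
    lock = threading.Lock()

    def worker(filename):
        outcome = fake_upload(filename)
        with lock:  # protect the shared results list
            results.append(outcome)

    threads = [threading.Thread(target=worker, args=(f,)) for f in filenames]
    for t in threads:
        t.start()
    # Keep the main thread alive until all workers are done, so the
    # interpreter does not begin shutting down while uploads are pending.
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    print(start_threads(["a.json", "b.json", "c.json"]))
```

The endless 'control' thread from the question would never return from join(), so it is left out here; creating it with daemon=True is one possible way to keep it from blocking the program's exit.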