[SOLVED] Why is there a big performance difference between those 2 simple python multithreading codes?

Let's consider this python code:

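import time
from concurrent.futures import ThreadPoolExecutor

import requests

# payloads, url and headers are assumed to be defined elsewhere

def process_payload(payload, url, headers):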
    response = requests.post(url, headers=headers, json=payload)
    return response

def parallel_group2(payloads, url, headers):
    with ThreadPoolExecutor() as executor:
        results = executor.map(process_payload,payloads, [url]*len(payloads), [headers]*len(payloads))
    return list(results)

def parallel_group(payloads, url, headers):
    with ThreadPoolExecutor() as executor:
        results = executor.map(requests.post, [url]*len(payloads), [headers]*len(payloads), payloads)
    return list(results)

times = []
# payloads grouped by 15
payloads_grouped = [payloads[i:i+15] for i in range(0, len(payloads), 15)]
print( "shape of payloads_grouped", len(payloads_grouped), " x ", len(payloads_grouped[0]))
for i in range(3):
    start = time.time()
    with ThreadPoolExecutor() as executor:
        # results = executor.map(parallel_group2, payloads_grouped, [url]*len(payloads_grouped), [headers]*len(payloads_grouped))
        results = executor.map(parallel_group, payloads_grouped, [url]*len(payloads_grouped), [headers]*len(payloads_grouped))
    end = time.time()
    times.append(end-start)
    print( "Durations of iterations:", times)
print( "Durations of iterations:", times)
print( "Average time for 150 requests:", sum(times)/len(times))

When I run the script with parallel_group, I get these results very consistently:

Durations of iterations: [5.246389389038086, 5.195073127746582, 5.278628587722778]
Average time for 150 requests: 5.2400303681691485

When I run it with parallel_group2, the results look more like this:

Durations of iterations: [10.99542498588562, 9.43007493019104, 23.003321170806885]
Average time for 150 requests: 10.142940362294516

Could someone with good knowledge of Python multithreading explain why there is such a difference between mapping requests.post directly and mapping a function that just calls requests.post? I don't understand it at all.

I ran the previous code several times and results were consistent.

Edit: the URL is the OpenAI chat completions API, "api.openai.com/v1/chat/completions".

Solution

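Your function parallel_group isn't doing what you hope. Of the 3 arguments you're passing positionally to requests.post, only the first one (the URL) lands where you intend. The signature is requests.post(url, data=None, json=None, **kwargs), so the headers are assigned to data and the payload is assigned to json; no headers parameter is passed at all. The API most likely returns an error, probably faster than it would return a real completion, which would explain the shorter timings, but you're ignoring that possibility because the responses are never checked.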
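
To see the argument binding without hitting the network, here is a minimal sketch. fake_post is a hypothetical stand-in that only mirrors the leading parameters of requests.post (url, data, json, then keyword arguments) and echoes back what it received. The URL, the header value, the payloads and the post_payload helper are placeholders invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_post(url, data=None, json=None, **kwargs):
    """Hypothetical stand-in mirroring the leading parameters of requests.post.

    It performs no request; it just reports where each argument ended up.
    """
    return {"url": url, "data": data, "json": json, "headers": kwargs.get("headers")}


def post_payload(payload, url, headers, post=fake_post):
    # Keyword arguments make the binding explicit, as process_payload does.
    return post(url, headers=headers, json=payload)


# Placeholder values, not taken from the question.
url = "https://example.invalid/endpoint"
headers = {"Authorization": "Bearer PLACEHOLDER"}
payloads = [{"n": 1}, {"n": 2}]
n = len(payloads)

with ThreadPoolExecutor() as executor:
    # Same positional mapping as parallel_group: (url, headers, payload).
    wrong = list(executor.map(fake_post, [url] * n, [headers] * n, payloads))
    # Same shape as parallel_group2: a wrapper that uses keyword arguments.
    right = list(executor.map(post_payload, payloads, [url] * n, [headers] * n))

print("positional:", wrong[0])
print("keyword:   ", right[0])
```

In the positional call the headers dict shows up under data and no headers are passed, while the keyword call puts each value where it was intended. With the real library, calling response.raise_for_status() on each result (or checking response.status_code) would surface the error responses instead of silently timing them.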