The time complexity of (square, naive) matrix multiplication is O(N³); see e.g. this answer.
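For reference, here is a minimal sketch of the naive algorithm that the O(N³) count refers to: three nested loops over plain Python lists, performing N·N·N multiply-adds. (This is only meant to show where the N³ comes from; it is not how np.matmul is implemented.)

def naive_matmul(A, B):
    # A and B are n x n lists of lists; the three nested loops below
    # perform n * n * n multiply-add operations, hence O(N^3)
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C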
I can run a quick script to check this:
import numpy as np
import time
import matplotlib.pyplot as plt
def time_matmul_n(n):
    # Create two random n x n arrays
    arr1 = np.random.rand(n, n)
    arr2 = np.random.rand(n, n)
    t0 = time.process_time_ns()  # start the clock
    np.matmul(arr1, arr2)
    t1 = time.process_time_ns()
    return t1 - t0

N = 100
n_values = range(N)
t_values = np.zeros_like(n_values)
for i, n in enumerate(n_values):
    t_values[i] = time_matmul_n(n)

# Plot the results
plt.plot(n_values, t_values)
plt.show()
This gives me something like
Now I appreciate there is a lot of noise here, and the run time will vary depending on whatever else is running on my laptop, but this doesn't look like it scales anything like O(N³).
I am clearly not understanding something. Can anyone explain why the graph doesn't look O(N³)-like? Does this only matter when N gets large?
Can anyone explain why the graph doesn't look O(N³)-like? Does this only matter when N gets large?
There are a few reasons:
Time complexity is about asymptotic behaviour. By definition this only becomes apparent for values of N larger than some minimum value, which remains to be determined (see the quick sanity check sketched after this list).
For sizes that are too small, the multiplication itself is trivial and takes a millisecond or less, so your time measurements are dominated by other overhead.
Related: increasing the input size in steps of 1 limits the range of N for which you visualise the result. You can improve on this by taking larger (but constant) increments for N.
As different random inputs can give different timings even for the same value of N, erratic results are to be expected. To improve on that, repeat the operation for several random inputs with the same N and then take the average of the time it took.
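To make the first point concrete: cubic scaling means that doubling N should multiply the runtime by roughly a factor of 8 once N is large enough. Here is a rough sanity check along those lines (a sketch only; the helper name measure and the sizes 500 and 1000 are arbitrary choices, and the ratio will fluctuate from run to run):

import numpy as np
import time

def measure(n, repeats=5):
    # Average CPU time (in ns) of one n x n matrix multiplication
    total = 0
    for _ in range(repeats):
        a, b = np.random.rand(n, n), np.random.rand(n, n)
        t0 = time.process_time_ns()
        np.matmul(a, b)
        total += time.process_time_ns() - t0
    return total / repeats

print(measure(1000) / measure(500))  # should land somewhere near 8 for O(N^3)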
Here is how your script could be modified to generate more telling results:
import numpy as np
import time
import matplotlib.pyplot as plt
def time_matmul_n(n):
    acc = 0
    for i in range(10):  # Multiple runs so we can take the average
        arr1 = np.random.rand(n, n)
        arr2 = np.random.rand(n, n)
        t0 = time.process_time_ns()
        np.matmul(arr1, arr2)
        t1 = time.process_time_ns()
        acc += t1 - t0
    return acc / 10

# Larger sizes, increasing in steps
N = 1500
step = 50
n_values = range(step, N, step)
t_values = np.zeros_like(n_values, dtype=float)  # float, since we now store averages

# This takes a while to run
for i, n in enumerate(n_values):
    t_values[i] = time_matmul_n(n)
    print(i, t_values[i])  # Give some indication of progress

# Plot the results
plt.plot(n_values, t_values)
plt.show()
Now you'll get a graph that looks something like this:
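If you want to check the exponent itself rather than just eyeballing the shape of the curve, one option (a sketch, reusing the n_values and t_values arrays produced by the script above) is to fit a straight line to the data on a log-log scale; for O(N³) behaviour the fitted slope should come out close to 3:

# Fit log(t) = slope * log(n) + intercept; cubic scaling gives slope close to 3
slope, intercept = np.polyfit(np.log(np.array(n_values)), np.log(t_values), 1)
print(f"Estimated exponent: {slope:.2f}")

The smallest sizes, where overhead still matters, will drag the fit down, so the estimate may come out somewhat below 3 rather than exactly 3.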