python matplotlib numba numba-pro

Matplotlib with Numba to try and accelerate code


How could I use Numba to accelerate the following code? When I add @njit before func1, several errors are returned. Or is there another way to optimise the code and reduce the overall processing time? Depending on the number of iterations, it can take a very long time. Any explanations would be much appreciated.

from matplotlib import pyplot as plot
import math



def func1(z): 
    
    # enter starting value of x and y 
    x = 0.215 
    y = 0.512


    # enter set of random values for the quadratic map and set to record points of x and y
    a = [0.123,0.234,0.345,0.456,0.567,0.678,0.789,0.890,0.012,0.123,0.234,0.345]
    
    
    
    #starts equation and iterates to form sequence of images
    
    for q in range((z//2),0,-1):
    
        x_list = [x]
        y_list = [y]
        
        for i in range(2500000):
    
            # new values of x and y
            xnew = a[0] + a[1]*x + a[2]*x*x + a[3]*x*y +a[4]*y*y +a[5]*y
            ynew = a[6] + a[7]*x + a[8]*x*x + a[9]*x*y +a[10]*y*y +a[11]*y
    
            x = xnew
            y = ynew
    
            #add values to point set
    
            x_list.append(x)
            y_list.append(y)
    
    
    
        name = "C:\\Users\\user\\imgs\\"+str(q)+'.png'
    
        plot.style.use("dark_background")
        plot.scatter(x_list[100:],y_list[100:], s =0.001, marker='.', linewidth=0, c = '#ffd769')
        plot.axis("off")
        plot.savefig(name,dpi = 800)
        plot.clf()
    
        a = [k+0.0002 for k in a]

if __name__=='__main__': 
    func1(int(input("Enter Number of Frames (even): ")))

Solution

  • Accelerating computation

    Don't try to numba the plot. Separate the computation from the plotting, and apply numba only to the computation: for example, that loop with 2500000 iterations.

    import numpy as np
    from matplotlib import pyplot as plot
    from numba import jit

    @jit(nopython=True)
    def fillXY(x,y,a, x_list, y_list):
        for i in range(2500000):
            # new values of x and y
            xnew = a[0] + a[1]*x + a[2]*x*x + a[3]*x*y +a[4]*y*y +a[5]*y
            ynew = a[6] + a[7]*x + a[8]*x*x + a[9]*x*y +a[10]*y*y +a[11]*y
            x,y=xnew,ynew
    
            #add values to point set
            x_list[i]=x
            y_list[i]=y
    

    Note that I use numpy arrays instead of Python lists, both for x_list/y_list and for a, so that numba is efficient.

    Then func1 is almost unchanged, except for the call to fillXY:

    def func1(z): 
        
        # enter starting value of x and y 
        x = 0.3 # Btw, this is what you should have given instead of illegal python code. A [mre] (minimal reproducible example) is supposed to be runnable as is: just cut&paste and run
        y = 0.2
    
    
        # enter set of random values for the quadratic map and set to record points of x and y
        # And if we don't want to bother giving many numbers for our [mre]
        # we can still fill with random numbers, with a constant seed
        # for the reproducible part.
        np.random.seed(123)
        a = np.random.uniform(-1,1,12) # Note that `a` is a numpy array, not a list here
    
        #starts equation and iterates to form sequence of images
        # I allocate x_list first and fill it afterward, rather than
        # accumulating in a python list. Python is quite efficient at
        # accumulating in lists. But we are not really coding in python here
        # but in compiled machine code, implicitly, through numba. Besides,
        # python is only "relatively" efficient. Pure python is slow anyway;
        # it is just not slower when accumulating than when filling a
        # pre-allocated list. Numpy arrays, however, are a disaster when used
        # to accumulate new elements and, on the contrary, way more efficient
        # when filling a preallocated array.
        x_list = np.empty((2500000,), dtype=np.float32)
        y_list = np.empty((2500000,), dtype=np.float32)
        
        for q in range((z//2),0,-1):
            fillXY(x,y,a,x_list, y_list)
    
            plot.style.use("dark_background")
            plot.clf()
            plot.scatter(x_list[100::11],y_list[100::11], marker='.', linewidth=2, c = '#ffd769')
            plot.axis("off")
            plot.gcf().canvas.draw() # to force redraw at each iteration
            plot.pause(2)
        
            a = a+0.0002 # Positive side effect of a being an ndarray: this line is simpler,
            # though its cpu time is negligible anyway compared to the big "fillXY" part.
    

    With that, on my PC, computing x_list and y_list goes from 16 seconds to 0.03 seconds. So that part is solved.
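    If you want to check the speed-up on your own machine, here is a minimal timing sketch. The names fill_xy_py, fill_xy_jit and timed are hypothetical helpers, not part of the code above. The coefficients are those of the Hénon map, chosen only because that map is known to stay bounded (they are not the values from the question), and n is kept small so the pure Python run stays short. The try/except fallback is an assumption so the sketch still runs where numba is not installed; in that case nothing is compiled and all three timings will be similar.

```python
import time
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: numba may be missing; the sketch then runs uncompiled.
    def njit(func):
        return func


def fill_xy_py(x, y, a, x_list, y_list):
    """Iterate the quadratic map, writing each point into preallocated arrays."""
    for i in range(len(x_list)):
        xnew = a[0] + a[1]*x + a[2]*x*x + a[3]*x*y + a[4]*y*y + a[5]*y
        ynew = a[6] + a[7]*x + a[8]*x*x + a[9]*x*y + a[10]*y*y + a[11]*y
        x, y = xnew, ynew
        x_list[i] = x
        y_list[i] = y


# Same function, compiled lazily by numba on its first call
fill_xy_jit = njit(fill_xy_py)


def timed(func, *args):
    """Return the wall-clock time of one call to func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


# Hénon map (x' = 1 - 1.4*x*x + y, y' = 0.3*x) in the 12-coefficient form
a = np.array([1.0, 0.0, -1.4, 0.0, 0.0, 1.0,
              0.0, 0.3, 0.0, 0.0, 0.0, 0.0])
n = 100_000
x_list = np.empty(n, dtype=np.float64)
y_list = np.empty(n, dtype=np.float64)

t_python = timed(fill_xy_py, 0.0, 0.0, a, x_list, y_list)
t_first = timed(fill_xy_jit, 0.0, 0.0, a, x_list, y_list)   # includes compilation
t_second = timed(fill_xy_jit, 0.0, 0.0, a, x_list, y_list)  # compiled code only
print(f"pure Python: {t_python:.4f}s, "
      f"jit first call: {t_first:.4f}s, jit second call: {t_second:.4f}s")
```

    The first call to a jitted function includes the compilation time, so the second call is the one to compare with pure Python.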

  • Accelerating the plot

    The next problem is that the plot itself (scatter) is very slow. Not as slow as the computation of 2500000 iterations was, but now that we have solved that, what still makes the whole thing slow (though already 3 or 4 times faster) is the scatter. That is very machine dependent. But, well, plot.scatter is not efficient at plotting 2500000 points. And numba cannot help here. Firstly, it understands neither the code nor the API of matplotlib (I mean, they haven't reimplemented that API, as they did for many numpy functions). Secondly, matplotlib has already optimized what needed to be optimized. There isn't any margin for acceleration just by compiling stuff here.

    One partial solution is what I did in the previous code: plotting 1 point out of 11. But that is not entirely satisfactory. For a start, I said "11" instead of "10" because what that equation does is obviously converge toward a central point by alternating around it, so with any even step I would select only one side of the alternating convergence. But that means I need some knowledge of the behaviour of what I am studying, and often I am studying it precisely because I don't have that knowledge (as you can guess, I started by plotting 1 of 10 points, and it is only when I realized that the plot was strangely very different that I understood the "alternating" part). Plus, maybe you really want to see all 2500000 points. I doubt it, though: nobody needs to see that many points. They just form an accumulation, some density, but you don't expect to see them individually. One proof of that is the very small marker size you chose (I increased it, btw).
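    To see the even-step pitfall in isolation, here is a small sketch on a made-up sequence (not the quadratic map itself) that alternates around 0 while converging toward it. The sequence and the names every_10th and every_11th are assumptions for illustration only.

```python
# Hypothetical sequence: the sign flips at every step while the magnitude shrinks
seq = [(-0.999) ** i for i in range(1000)]

every_10th = seq[::10]  # indices 0, 10, 20, ...: all even, so one side only
every_11th = seq[::11]  # indices 0, 11, 22, ...: even and odd alternate

print(all(v > 0 for v in every_10th))  # True: only the positive side is kept
print(any(v < 0 for v in every_11th))  # True: both sides are sampled
```

    An odd step only helps here because the alternation has period 2; for an unknown map there is no guarantee that any fixed step samples the point set fairly.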

    So, one way to solve that is to do that part yourself. After all, you yourself talked about "images" rather than "scatter plots". So why not just build an image?

    def func1(z):  
        x = 0.3
        y = 0.2
        np.random.seed(123)
        a = np.random.uniform(-1,1,12)
    
        x_list = np.empty((2500000,), dtype=np.float32)
        y_list = np.empty((2500000,), dtype=np.float32)
        img = np.empty((1000,1000), dtype=np.uint8) # a 1000x1000 image for rendering
        
        for q in range((z//2),0,-1):
            fillXY(x,y,a,x_list, y_list)
            # Boundaries of the plot
            x1,x2,y1,y2=x_list.min(),x_list.max(),y_list.min(), y_list.max()
            # erase previous image
            img[:]=0
            # Set pixel to bright at each point
            img[((y_list-y1)/(y2-y1)*999).astype(np.int32), ((x_list-x1)/(x2-x1)*999).astype(np.int32)] = 1
        
            plot.style.use("dark_background")
            plot.clf()
            plot.imshow(img)
            plot.axis("off")
            plot.gcf().canvas.draw()
            plot.pause(0.5)
        
            a = a+0.0002
    

    You could even write another numba function that fills the image directly, without going through x_list/y_list. It would be even faster, and would allow a histogram (pixels are brighter when several points fall on the same pixel).
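    A minimal sketch of such a direct fill, under stated assumptions: step and fill_image are hypothetical names, the orbit is iterated twice (once to find the bounds, once to count hits per pixel) so that no point array is stored, and the first skip points are dropped as a transient, like the [100:] slice in the question. The usage example again uses Hénon map coefficients only because that map is known to stay bounded. The try/except fallback is an assumption so the sketch runs even without numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: numba may be missing; the sketch then runs uncompiled.
    def njit(func):
        return func


@njit
def step(x, y, a):
    """One iteration of the 12-coefficient quadratic map."""
    xnew = a[0] + a[1]*x + a[2]*x*x + a[3]*x*y + a[4]*y*y + a[5]*y
    ynew = a[6] + a[7]*x + a[8]*x*x + a[9]*x*y + a[10]*y*y + a[11]*y
    return xnew, ynew


@njit
def fill_image(x, y, a, n, skip, size):
    """Return a size x size array counting how many orbit points hit each pixel."""
    img = np.zeros((size, size), dtype=np.uint32)

    # Drop the transient, then remember where the recorded orbit starts
    for i in range(skip):
        x, y = step(x, y, a)
    x0 = x
    y0 = y

    # Pass 1: find the bounding box of the orbit
    xmin = x
    xmax = x
    ymin = y
    ymax = y
    for i in range(n):
        x, y = step(x, y, a)
        if not (np.isfinite(x) and np.isfinite(y)):
            return img  # orbit diverged: return an empty image
        xmin = min(xmin, x)
        xmax = max(xmax, x)
        ymin = min(ymin, y)
        ymax = max(ymax, y)
    xspan = xmax - xmin
    yspan = ymax - ymin
    if not (xspan > 0.0 and yspan > 0.0):
        return img  # degenerate orbit (for example a fixed point)

    # Pass 2: replay the same orbit and accumulate the histogram
    x = x0
    y = y0
    for i in range(n):
        x, y = step(x, y, a)
        col = int((x - xmin) / xspan * (size - 1))
        row = int((y - ymin) / yspan * (size - 1))
        img[row, col] += 1
    return img


# Usage with Hénon map coefficients (x' = 1 - 1.4*x*x + y, y' = 0.3*x)
a = np.array([1.0, 0.0, -1.4, 0.0, 0.0, 1.0,
              0.0, 0.3, 0.0, 0.0, 0.0, 0.0])
img = fill_image(0.0, 0.0, a, 50_000, 100, 500)
print(img.shape, int(img.sum()))
```

    The result can be passed to plot.imshow like the img array above. Displaying np.log1p(img) instead of img is one possible way to keep sparsely hit pixels visible, and origin="lower" makes the y axis point upward as in a scatter plot.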