pythonperformancenumbaprocessing-efficiencyrunge-kutta

How to speed up this code for Poincare Section using Numba?


I have the following Python code for constructing a Poincare Section for chaotic Pendulum which uses RK4 algorithm and np arrays:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm

# Define function to calculate equations
def f(t, Yvec):
    """
    Computes the vector f.
    """
    theta = Yvec[0]
    omega = Yvec[1]
    phi = Yvec[2]
    fvec = np.zeros(3)
    fvec[0] = omega
    fvec[1] = -g/l*np.sin(theta) - gamma*omega + A*np.sin(phi)
    fvec[2] = drive
    return fvec

# Define a function for RK4 method
def rk4(t, Y_n, h):
    """
    Computes the vector Y(t_{n+1}).
    """
    k1 = h*f(t, Y_n)
    k2 = h*f(t + h/2, Y_n + k1/2)
    k3 = h*f(t + h/2, Y_n + k2/2)
    k4 = h*f(t + h, Y_n + k3)
    Y_nplus1 = Y_n + (k1 + 2*k2 + 2*k3 + k4)/6
    return Y_nplus1

# main program

# Define constants
g = 9.8  # acceleration due to gravity
l = 9.8  # length of the pendulum
gamma = 0.2  # damping coefficient
A = 1.2  # amplitude of driving force
drive = 2/3   # angular frequency of driving force
n = 2  # order of ODE

# Define time interval and step size
dt = 0.001  # step size
t0 = 0  # initial t
tf = 100000  # final t
t = np.arange(t0, tf, dt)

# Initialize array filled with zeroes to store solution after every increment
theta_array = np.zeros(len(t))
omega_array = np.zeros(len(t))
phi_array = np.zeros(len(t))

# Define Y vector with initial conditions which are also the values for first element of the arrays
Y = [0, 0, 2*np.pi] 
theta_array[0] = Y[0]
omega_array[0] = Y[1]
phi_array[0] = Y[2]


# Increment the Y vector according to RK4 and keep storing values in the arrays by defining a loop
sols = []
for i in range(len(t) - 1):
    Y_new = rk4(t[i], Y, dt)
    theta_array[i+1] = Y_new[0]
    omega_array[i+1] = Y_new[1]
    phi_array[i+1] = Y_new[2]
    # Check for Poincare section
    if np.abs(Y_new[2] % (2*np.pi)) < dt:
        sols.append([np.mod(Y_new[0] + np.pi, 2*np.pi) - np.pi, Y_new[1]])
    Y = Y_new
    
sols = np.array(sols)

# Define the color map
cmap = cm.get_cmap('plasma')

# Plot the results
fig, ax = plt.subplots(figsize=(12,8))
for i in range(len(sols)):
    color = cmap(i / len(sols))
    ax.plot(sols[i,0], sols[i,1], '.', markersize=2, color=color)

ax.set_xlabel('θ')
ax.set_ylabel('ω')
plt.show()

Right now it takes more that 30 minutes to run this code. I tried decreasing the step size but then I don't get a clear Poincare Section. How to speed up the compilation of this code using Numba?


Solution

  • Using Numba is pretty simple here : you can use add the decorator @nb.njit to the two first function and convert Y to an array using np.array. This is definitively something you could have simply tried, especially after your previous similar question. Once done, I get a 6x speed-up which is already good but not great.

    The main bottleneck then is the huge number of Numba function calls and the calls to np.abs (Numpy is designed to be fast when operating on large arrays, not scalars). You can put the main loop in a Numba function so to get a better speed up. I get a speed up of 30x on my machine (1 min 41 instead of > 50 min).

    The Numba code is still taking a lot of time allocating array of size 3 which is not efficient. If you want a faster code, you can preallocate all the array and then use loops to avoid creating any (additional) temporary arrays. I think this should be twice faster.

    Here is the final code:

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib import cm
    
    # Define function to calculate equations
    @nb.njit
    def f(t, Yvec):
        """
        Computes the vector f.
        """
        theta = Yvec[0]
        omega = Yvec[1]
        phi = Yvec[2]
        fvec = np.zeros(3)
        fvec[0] = omega
        fvec[1] = -g/l*np.sin(theta) - gamma*omega + A*np.sin(phi)
        fvec[2] = drive
        return fvec
    
    # Define a function for RK4 method
    @nb.njit
    def rk4(t, Y_n, h):
        """
        Computes the vector Y(t_{n+1}).
        """
        k1 = h*f(t, Y_n)
        k2 = h*f(t + h/2, Y_n + k1/2)
        k3 = h*f(t + h/2, Y_n + k2/2)
        k4 = h*f(t + h, Y_n + k3)
        Y_nplus1 = Y_n + (k1 + 2*k2 + 2*k3 + k4)/6
        return Y_nplus1
    
    @nb.njit
    def compute(theta_array, omega_array, phi_array, t, Y, dt):
        sols = []
        for i in range(len(t) - 1):
            Y_new = rk4(t[i], Y, dt)
            theta_array[i+1] = Y_new[0]
            omega_array[i+1] = Y_new[1]
            phi_array[i+1] = Y_new[2]
            # Check for Poincare section
            if np.abs(Y_new[2] % (2*np.pi)) < dt:
                sols.append([np.mod(Y_new[0] + np.pi, 2*np.pi) - np.pi, Y_new[1]])
            Y = Y_new
            
        sols = np.array(sols)
        return sols
    
    # main program
    
    # Define constants
    g = 9.8  # acceleration due to gravity
    l = 9.8  # length of the pendulum
    gamma = 0.2  # damping coefficient
    A = 1.2  # amplitude of driving force
    drive = 2/3   # angular frequency of driving force
    n = 2  # order of ODE
    
    # Define time interval and step size
    dt = 0.001  # step size
    t0 = 0  # initial t
    tf = 100000  # final t
    t = np.arange(t0, tf, dt)
    
    # Initialize array filled with zeroes to store solution after every increment
    theta_array = np.zeros(len(t))
    omega_array = np.zeros(len(t))
    phi_array = np.zeros(len(t))
    
    # Define Y vector with initial conditions which are also the values for first element of the arrays
    Y = np.array([0, 0, 2*np.pi] )
    theta_array[0] = Y[0]
    omega_array[0] = Y[1]
    phi_array[0] = Y[2]
    
    
    # Increment the Y vector according to RK4 and keep storing values in the arrays by defining a loop
    sols = compute(theta_array, omega_array, phi_array, t, Y, dt)
    
    # Define the color map
    cmap = cm.get_cmap('plasma')
    
    # Plot the results
    fig, ax = plt.subplots(figsize=(12,8))
    for i in range(len(sols)):
        color = cmap(i / len(sols))
        ax.plot(sols[i,0], sols[i,1], '.', markersize=2, color=color)
    
    ax.set_xlabel('θ')
    ax.set_ylabel('ω')
    plt.show()
    

    Here is the beautiful result:

    enter image description here