I have input data stored in a dataframe of shape [16, 60000], with each column corresponding to a different time. At every timestep, I'm trying to calculate the integral of the data with respect to two different sets of x values using np.trapz.
I tried :
but I could not see any major improvement. How can I speed up this script?
Here is a minimal snippet:
import numpy as np
import pandas as pd
import time
time_start = time.time()
# Read data
df_data = pd.DataFrame(np.random.randn(16, 60000))
x_values = np.array([0. , 0.03, 0.1 , 0.2 , 0.3 , 0.4 , 0.5 , 0.6 , 0.7 , 0.8 , 0.9 ,
0.85, 0.7 , 0.5 , 0.2 , 0.05])
x_values2 = np.array([0. , 0.043, 0.083, 0.114, 0.13 , 0.134, 0.124, 0.102,
0.078, 0.056, 0.03 , -0.006, -0.02 , -0.055, -0.069, -0.042])
# Get sample characteristics
Ns = df_data.shape[1]
times = range(Ns)
lt_data = [df_data.iloc[:,i] for i in times]
a = np.array([-np.trapz(y=data, x=x_values) for data in lt_data])
b = np.array([np.trapz(y=data, x=x_values2) for data in lt_data])
time_end = time.time()
elapsed = time_end - time_start
print(f'Elapsed: {elapsed:.1f}s')
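Try modifying it this way, after defining the x values: convert the dataframe to a NumPy array and integrate all columns in a single call along axis 0, instead of looping over the columns in Python.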
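data_np = df_data.values
# Integrate down the rows (axis 0) for all 60000 columns at once.
# The minus sign on `a` matches the original loop.
a = -np.trapz(y=data_np, x=x_values[:, np.newaxis], axis=0)
b = np.trapz(y=data_np, x=x_values2[:, np.newaxis], axis=0)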
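To check that the vectorized call gives the same numbers as the per-column loop, a small self-contained comparison can help. This is only a sketch: it uses 2000 columns instead of 60000 to keep the loop quick, a seeded random generator, and a `trapezoid` alias (my own name) that picks `np.trapezoid` where it exists, since `np.trapz` was renamed in NumPy 2.0.

```python
import numpy as np
import pandas as pd

# Assumption: use np.trapezoid on NumPy >= 2.0, otherwise fall back to np.trapz.
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# Smaller stand-in for the [16, 60000] dataframe in the question.
rng = np.random.default_rng(0)
df_data = pd.DataFrame(rng.standard_normal((16, 2000)))

x_values = np.array([0., 0.03, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9,
                     0.85, 0.7, 0.5, 0.2, 0.05])
x_values2 = np.array([0., 0.043, 0.083, 0.114, 0.13, 0.134, 0.124, 0.102,
                      0.078, 0.056, 0.03, -0.006, -0.02, -0.055, -0.069, -0.042])

data_np = df_data.to_numpy()

# Vectorized: one call integrates every column along axis 0.
# A 1-D x of length 16 is matched against the integration axis.
a_vec = -trapezoid(data_np, x=x_values, axis=0)
b_vec = trapezoid(data_np, x=x_values2, axis=0)

# Reference: the original column-by-column loop.
n_cols = data_np.shape[1]
a_loop = np.array([-trapezoid(data_np[:, i], x=x_values) for i in range(n_cols)])
b_loop = np.array([trapezoid(data_np[:, i], x=x_values2) for i in range(n_cols)])

# Both approaches should agree to floating-point tolerance.
print(np.allclose(a_vec, a_loop), np.allclose(b_vec, b_loop))
```

Passing the 1-D `x_values` directly with `axis=0` should be equivalent to the `x_values[:, np.newaxis]` form above; the main gain in either case comes from dropping the Python-level loop over columns.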