pythonpandastradingpandas-resample

Panda Resampling is incorrect for some cases


I have problem with resample function it's work with some timeframes and doesn't work with others

it doesn't work with 5h or 10h intervals, it's start okay and then it's changed to wrong datetime

d = {'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}
df = read.resample('5H').agg(d)
df

output:

                    Open    High    Low     Close
Date
2024-01-21 00:00:00 317.3   319.5   316.1   317.4
2024-01-21 05:00:00 317.4   319.4   317.2   319.0
2024-01-21 10:00:00 319.0   319.6   318.0   319.1
2024-01-21 15:00:00 319.1   321.6   317.5   320.6
2024-01-21 20:00:00 320.6   321.7   318.2   319.5
2024-01-22 01:00:00 319.5   320.1   314.1   317.2
2024-01-22 06:00:00 317.2   318.6   310.7   312.8
2024-01-22 11:00:00 312.8   315.3   308.4   311.1
2024-01-22 16:00:00 311.2   312.5   303.6   308.5
2024-01-22 21:00:00 308.5   308.7   304.8   308.2
2024-01-23 02:00:00 308.1   311.7   308.1   309.9
2024-01-23 07:00:00 310.0   310.2   291.7   295.9

It's start okay until the next day 2024-01-22

and it's work correctly with 2h or 3h or 4h or 6h intervals

d = {'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}
df = read.resample('4H').agg(d)
df

output:

2024-01-21 00:00:00 317.3   319.5   316.1   318.5
2024-01-21 04:00:00 318.5   318.9   317.2   317.5
2024-01-21 08:00:00 317.5   319.6   317.3   319.3
2024-01-21 12:00:00 319.3   319.6   317.5   318.9
2024-01-21 16:00:00 318.9   321.6   317.9   320.6
2024-01-21 20:00:00 320.6   321.7   318.2   318.6
2024-01-22 00:00:00 318.7   320.3   315.3   316.7
2024-01-22 04:00:00 316.7   318.6   314.1   317.8
2024-01-22 08:00:00 317.9   317.9   310.7   313.5
2024-01-22 12:00:00 313.4   315.3   308.4   311.1
2024-01-22 16:00:00 311.2   312.5   303.6   306.6

Solution

  • If you want to resample with 5h periods except the last one, so that it always starts at midnight, you should normalize your dates and use them as grouper for groupby.resample:

    N = 1000
    df = pd.DataFrame(np.random.random((N, 4)),
                      columns=['Open', 'High', 'Low', 'Close'],
                      index=pd.date_range('2024-01-21', periods=N, freq='10min')
                              .rename('Date')
                     )
    
    d = {'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}
    
    out = (df.groupby(df.index.normalize(), group_keys=False)
             .resample('5h', origin='start_day').agg(d)
          )
    

    Example output:

                             Open      High       Low     Close
    Date                                                       
    2024-01-21 00:00:00  0.854852  0.991030  0.001056  0.434836
    2024-01-21 05:00:00  0.423933  0.993614  0.024808  0.970102
    2024-01-21 10:00:00  0.114825  0.992917  0.000970  0.423687
    2024-01-21 15:00:00  0.368081  0.934378  0.019684  0.474279
    2024-01-21 20:00:00  0.618224  0.991313  0.004855  0.828402
    2024-01-22 00:00:00  0.314505  0.981417  0.015882  0.672777
    2024-01-22 05:00:00  0.637003  0.928499  0.018432  0.081630
    2024-01-22 10:00:00  0.798733  0.939966  0.002545  0.779787
    2024-01-22 15:00:00  0.786353  0.980841  0.007235  0.622986
    2024-01-22 20:00:00  0.455437  0.985423  0.207164  0.492761
    2024-01-23 00:00:00  0.759215  0.943299  0.012884  0.610557
    2024-01-23 05:00:00  0.491669  0.907216  0.022135  0.204769
    2024-01-23 10:00:00  0.035259  0.950254  0.020136  0.533030
    2024-01-23 15:00:00  0.977045  0.958560  0.000002  0.791182
    2024-01-23 20:00:00  0.327000  0.953044  0.021118  0.559102
    2024-01-24 00:00:00  0.730636  0.935785  0.004056  0.338256
    2024-01-24 05:00:00  0.539044  0.903956  0.053925  0.022443
    2024-01-24 10:00:00  0.315928  0.939252  0.004503  0.738087
    2024-01-24 15:00:00  0.455187  0.946352  0.084940  0.370417
    2024-01-24 20:00:00  0.373350  0.992475  0.003233  0.813431
    2024-01-25 00:00:00  0.937217  0.975881  0.013434  0.394555
    2024-01-25 05:00:00  0.805051  0.971622  0.066913  0.249198
    2024-01-25 10:00:00  0.271297  0.908170  0.028149  0.002023
    2024-01-25 15:00:00  0.136134  0.946287  0.014009  0.918888
    2024-01-25 20:00:00  0.128861  0.940258  0.004101  0.783039
    2024-01-26 00:00:00  0.663781  0.967509  0.015381  0.151662
    2024-01-26 05:00:00  0.697951  0.914169  0.056867  0.594478
    2024-01-26 10:00:00  0.919909  0.976555  0.038070  0.161614
    2024-01-26 15:00:00  0.149704  0.974648  0.047376  0.664060
    2024-01-26 20:00:00  0.879085  0.996539  0.077073  0.580693
    2024-01-27 00:00:00  0.172167  0.953116  0.012676  0.821176
    2024-01-27 05:00:00  0.966948  0.964718  0.042488  0.130513
    2024-01-27 10:00:00  0.720802  0.932359  0.013581  0.640035
    2024-01-27 15:00:00  0.864662  0.981908  0.021083  0.640291
    2024-01-27 20:00:00  0.090283  0.912008  0.014249  0.558703