
Last real-time candle from yfinance


When you download real-time data from yfinance for multiple tickers at once, the latest data point is often split across several rows:

                                 Open         Volume      
                               BARC.L   BKG.L BARC.L BKG.L
Datetime                                                  
2021-11-11 08:05:19+00:00         NaN  4326.0    NaN   0.0
2021-11-11 08:07:10+00:00  194.539993     NaN    0.0   NaN

I don't care about these small differences in time; I just want the latest value for each stock in the last row.

I have been thinking of grouping the latest rows, but I am not sure how.

Note that I may download many more stocks at once, say 10, in which case the latest values could be spread over 10 separate rows.


Solution

  • Use ffill to forward-fill each column with its most recent non-NaN value, then keep only the last row:

    >>> df.ffill().iloc[[-1]]
                                     Open          Volume
                                   BARC.L   BKG.L  BARC.L  BKG.L
    Datetime                                                        
    2021-11-11 08:07:10+00:00  194.539993  4326.0     0.0    0.0
    

    To go further and reshape the result into one row per stock:

    >>> df.ffill().iloc[[-1]].stack(level=1) \
          .reset_index().rename(columns={'level_1': 'Stock'})
    
                       Datetime   Stock         Open  Volume
    0 2021-11-11 08:07:10+00:00  BARC.L   194.539993     0.0
    1 2021-11-11 08:07:10+00:00   BKG.L  4326.000000     0.0
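
If you want to try this without hitting the network, here is a minimal sketch that rebuilds the two-row frame from the question by hand and applies the same forward-fill idea. The variable names (`last`, `per_stock`, `last_limited`) and the `limit=10` value are my own illustrative choices, not anything yfinance returns or requires, and the `unstack` step is just an alternative way to get one row per stock.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question by hand (assumed layout:
# first column level = field, second level = ticker).
columns = pd.MultiIndex.from_product([["Open", "Volume"], ["BARC.L", "BKG.L"]])
index = pd.DatetimeIndex(
    ["2021-11-11 08:05:19+00:00", "2021-11-11 08:07:10+00:00"],
    name="Datetime",
)
df = pd.DataFrame(
    [
        [np.nan, 4326.0, np.nan, 0.0],
        [194.539993, np.nan, 0.0, np.nan],
    ],
    index=index,
    columns=columns,
)

# Carry each column's most recent non-NaN value downwards,
# then keep only the last row (as a one-row DataFrame).
last = df.ffill().iloc[[-1]]

# Alternative reshape: take that row as a Series and move the
# field level into the columns, giving one row per ticker.
per_stock = last.iloc[0].unstack(level=0)

# Optional: cap how many rows a value may be carried forward, so a
# ticker that stopped updating long ago stays NaN. 10 is arbitrary.
last_limited = df.ffill(limit=10).iloc[[-1]]

print(last)
print(per_stock)
```

Keep in mind that the timestamp on the resulting row is the one from the most recent row overall, so a value filled from an earlier row will look slightly newer than it really is. That matches what the question asks for, but it is worth knowing if the gaps between rows ever get large.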