python pandas yfinance

How to remove the timezone from yfinance data?


I fetch data with the yfinance package and convert it into a pandas dataframe. However, I am unable to save the dataframe to an Excel file:

ValueError: Excel does not support datetimes with timezones. Please ensure that datetimes are timezone unaware before writing to Excel.

The dataframe should have 8 columns, but Spyder says it has 7.

Below is my code:

import yfinance as yf
import pandas as pd

stock = yf.Ticker("BABA")
# get stock info
stock.info

# get historical market data
hist = stock.history(start="2021-03-25",end="2021-05-20",interval="15m")
hist = pd.DataFrame(hist)

# pd.to_datetime(hist['Datetime'])
# hist['Datetime'].dt.tz_localize(None)

hist.to_excel(excel_writer= "D:/data/python projects/stock_BABA2.xlsx")

Solution

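  • Datetime is the index of the dataframe rather than a regular column, which is why Spyder reports 7 columns. You can remove the time zone information from the DatetimeIndex using DatetimeIndex.tz_localize(), as follows: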

    hist.index = hist.index.tz_localize(None)
    

    Edit

    If you are using the yfinance yf.download function (rather than the history function in this question), you can also use the ignore_tz=True parameter, like this:

    hist = yf.download("AAPL", period="1mo", interval="1h", ignore_tz=True)  
    

    The downloaded dataframe will have a Datetime index with the time zone removed (and the times adjusted to local time).

    The documentation for the ignore_tz parameter reads:

    ignore_tz: bool

    When combining from different timezones, ignore that part of datetime. Default depends on interval. Intraday = False. Day+ = True.

    Note that the ignore_tz parameter is not supported by the yfinance Ticker history function. For that function, you have to apply DatetimeIndex.tz_localize() to the index, as in the original solution above.
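To see what tz_localize(None) does without hitting the network, here is a minimal sketch on a synthetic dataframe. The index values, the America/New_York time zone, and the Close prices are made up for illustration and only mimic the shape of an intraday yfinance result. It also shows two variations that are not part of the answer above: converting to UTC before dropping the time zone, and handling the case where Datetime has been turned into a regular column (as the commented-out lines in the question attempted).

```python
import pandas as pd

# Synthetic stand-in for an intraday yfinance result: a tz-aware
# DatetimeIndex named "Datetime" (all values are illustrative).
index = pd.date_range(
    "2021-03-25 09:30",
    periods=3,
    freq="15min",
    tz="America/New_York",
    name="Datetime",
)
hist = pd.DataFrame({"Close": [230.0, 231.5, 229.8]}, index=index)

# Option 1: drop the time zone and keep the local wall-clock time.
naive_local = hist.copy()
naive_local.index = naive_local.index.tz_localize(None)

# Option 2: convert to UTC first, then drop the time zone.
naive_utc = hist.copy()
naive_utc.index = naive_utc.index.tz_convert("UTC").tz_localize(None)

# Option 3: if Datetime is a column instead of the index, use the .dt
# accessor and assign the result back (it is not an in-place operation).
as_column = hist.reset_index()
as_column["Datetime"] = as_column["Datetime"].dt.tz_localize(None)

print(naive_local.index[0])  # 2021-03-25 09:30:00
print(naive_utc.index[0])    # 2021-03-25 13:30:00
```

Any of the three resulting dataframes is timezone-naive, so to_excel should no longer raise the ValueError from the question. Which option fits depends on whether the spreadsheet should show exchange-local times or UTC.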