influxdbinfluxdb-python

write date from yfinance Timestamp to influxdb and query the date - timezone support


I am trying to write date/time data to influxdb and query the data to a dataframe.

when I write the data date time looks like this...

ticker= 'AAPL'
import yfinance as yf
df = yf.Ticker('AAPL').history(period="1d").index[0]
print(df)

output:
Timestamp('2023-01-05 00:00:00-0500', tz='America/New_York')

...and when i query the data to a dataframe and print it I get this:

df['_time']

output:
0   2023-01-05 05:00:00+00:00
Name: _time, dtype: datetime64[ns, tzutc()]

What do I need to do to properly write the time in influxdb?

please see full code below:

########## WRITE ##########

    import yfinance as yf
    import influxdb_client
    from influxdb_client.client.write_api import SYNCHRONOUS, PointSettings


    token = "my-token"
    org = "my-org"
    url = "my-url"
    bucket = "stocks_us"
    retention_policy = "autogen"
    
    client = influxdb_client.InfluxDBClient(url=url, token=token, org=org)
    write_api = client.write_api(write_options=SYNCHRONOUS)

   df = yf.Ticker('AAPL').history(period="1d")

   with client:
    """
    Ingest DataFrame with default tags
    """
    point_settings = PointSettings(**{"ticker": ticker})
    
    write_api = client.write_api(write_options=SYNCHRONOUS, 
                                 point_settings=point_settings)
    write_api.write(bucket=bucket, 
                    org= "dev", 
                    record=df, 
                    data_frame_measurement_name="stock_daily_df")

    client.close()

    print(df)

and

########## QUERY ##########

import influxdb_client 

token = "my-token"
org = "my-org"
url = "my-url"
bucket = "stocks_us"
retention_policy = "autogen"

client = influxdb_client.InfluxDBClient(url=url, token=token, org=org)

query_api = client.query_api()
measurement= "stock_daily_df"

with client:
    
    """
    Querying ingested data
    """
    query = 'from(bucket:"{}")' \
            ' |> range(start: 0, stop: now())' \
            ' |> filter(fn: (r) => r._measurement == "{}")' \
            ' |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")' \
            ' |> filter(fn: (r) => r["ticker"] == "AAPL")'\
            ' |> limit(n:10, offset: 0)'.format(bucket, measurement)
            
    df = query_api.query_data_frame(query=query)

    print(df)

Solution

  • Flux would do all of its work in UTC, which is a simple linear clock, and leave it to the user to figure out the display. Hence to keep the timestamp in consistent, we should convert the timestamp into UTC before inserting the data and convert the data result back to corresponding timestamp once we are done with the query.

    1.Convert the timestamp into UTC in the yfinance library:

    dt.replace(tzinfo=timezone.utc)
    

    2.Convert the UTC timestamp to your local one:

    import "timezone"
    
    option location = timezone.location(name: "America/New_York")
    

    See more details here.