python pandas dataframe weather meteostat

How to get temperature measurements for location and time values in a dataframe?


I have a pandas dataframe consisting of geo-locations and a time in the past.

geo_time = pd.read_csv(r'geo_time.csv')
print(geo_time)

> +---------+---------+---------+-------------------+ 
  | latitude|longitude| altitude|              start|
  +---------+---------+---------+-------------------+ 
  |  48.2393|  11.5713|      520|2020-03-12 13:00:00|
  +---------+---------+---------+-------------------+ 
  |  35.5426| 139.5975|        5|2020-07-31 18:00:00|
  +---------+---------+---------+-------------------+ 
  |  49.2466|-123.2214|        5|2020-06-23 11:00:00|
  +---------+---------+---------+-------------------+ 
  ...

I want to add the temperature at each of these locations and times as a new column, using the Meteostat library in Python.

The library has the "Point" class. For a single location, it works like this:

location = Point(40.416775, -3.703790, 660)

You can now pass this to the "Hourly" class, which gives you a dataframe of different climatic variables. Normally you pass "start" and "end" to get values for every hour in that range, but passing "start" twice gives you only one row for the desired time. The output below is just an example of what the dataframe looks like.

data = Hourly(location, start, start).fetch()
print (data)

>                      temp  dwpt  rhum  prcp  ...  wpgt    pres  tsun  coco
time                                         ...                          
2020-01-10 01:00:00 -15.9 -18.8  78.0   0.0  ...   NaN  1028.0   NaN   0.0

What I want to do now is use the values from the dataframe "geo_time" as parameters for these classes to get a temperature for every row. My naive idea was the following:

geo_time['location'] = Point(geo_time['latitude'], geo_time['longitude'], geo_time['altitude'])

data = Hourly(geo_time['location'], geo_time['start'], geo_time['start'])

Afterwards, I would add the "temp" column from "data" to "geo_time".

Does anyone have an idea how to solve this problem, or know whether Meteostat is even capable of doing this?

Thanks in advance!


Solution

  • With the dataframe you provided:

    import pandas as pd
    from meteostat import Hourly, Point
    
    df = pd.DataFrame(
        {
            "latitude": [48.2393, 35.5426, 49.2466],
            "longitude": [11.5713, 139.5975, -123.2214],
            "altitude": [520, 5, 5],
            "start": ["2020-03-12 13:00:00", "2020-07-31 18:00:00", "2020-06-23 11:00:00"],
        }
    )
    

    Here is one way to do it with Pandas to_datetime and apply methods:

    df["start"] = pd.to_datetime(df["start"], format="%Y-%m-%d %H:%M:%S")
    
    # For each row, fetch the single hourly record at "start" and keep its temperature
    df["temp"] = df.apply(
        lambda x: Hourly(
            Point(x["latitude"], x["longitude"], x["altitude"]),
            x["start"],
            x["start"],
        )
        .fetch()["temp"]
        .values[0],
        axis=1,
    )
    

    Then:

    print(df)
    # Output
       latitude  longitude  altitude               start  temp
    0   48.2393    11.5713       520 2020-03-12 13:00:00  16.8
    1   35.5426   139.5975         5 2020-07-31 18:00:00  24.3
    2   49.2466  -123.2214         5 2020-06-23 11:00:00  14.9
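
One caveat with the approach above: `.values[0]` raises an `IndexError` when the fetched frame has no rows, which can presumably happen if no observation is available for a given point and hour. Below is a minimal sketch of a more defensive variant that writes `NaN` in that case. The helper names `temp_or_nan` and `add_temperature` and the `fetch_hourly` callable are hypothetical, introduced here only for illustration; the actual Meteostat call is passed in from outside so that the sketch runs without network access.

```python
import math

import pandas as pd


def temp_or_nan(hourly_frame: pd.DataFrame) -> float:
    """Return the first 'temp' value of a fetched frame, or NaN if there is none."""
    if hourly_frame.empty or "temp" not in hourly_frame.columns:
        return math.nan
    return float(hourly_frame["temp"].iloc[0])


def add_temperature(df: pd.DataFrame, fetch_hourly) -> pd.DataFrame:
    """Return a copy of df with a 'temp' column filled row by row.

    fetch_hourly is a hypothetical callable (lat, lon, alt, when) -> DataFrame,
    expected to return the hourly records for that point and time.
    """
    out = df.copy()
    out["start"] = pd.to_datetime(out["start"])
    out["temp"] = out.apply(
        lambda row: temp_or_nan(
            fetch_hourly(
                row["latitude"], row["longitude"], row["altitude"], row["start"]
            )
        ),
        axis=1,
    )
    return out


# Assumed wiring to Meteostat, mirroring the calls used above:
#
# from meteostat import Hourly, Point
#
# def meteostat_fetch(lat, lon, alt, when):
#     return Hourly(Point(lat, lon, alt), when, when).fetch()
#
# df = add_temperature(df, meteostat_fetch)
```

Passing the fetch function as a parameter also makes it easy to swap in a cached or stubbed version, which helps when the dataframe is large, since this approach makes one lookup per row.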