python pandas nan windrose

ValueError: cannot convert float NaN to integer when making wind rose


I wrote code to create monthly pollution wind roses for a county in California. Pollution roses are similar to wind roses in that they show the distribution of wind directions, but instead of the magnitude of the wind speed, they plot the concentration of PM2.5. I have used this code on many data sets from the California Air Resources Board, but now that I am using data from a local monitoring network, I get the following error message when I run it:

Traceback (most recent call last):

  File "C:\***.py", line 341, in __call__
    return printer(obj)

  File "C:\***.py", line 253, in <lambda>
    png_formatter.for_type(Figure, lambda fig: print_figure(fig, 'png', **kwargs))

  File "C:\***.py", line 137, in print_figure
    fig.canvas.print_figure(bytes_io, **kw)

  File "C:\***.py", line 2230, in print_figure
    self.figure.draw(renderer)

  File "C:\***.py", line 74, in draw_wrapper
    result = draw(artist, renderer, *args, **kwargs)

  File "C:\***.py", line 51, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)

  File "C:\***.py", line 2780, in draw
    mimage._draw_list_compositing_images(

  File "C:\***.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)

  File "C:\***.py", line 431, in wrapper
    return func(*inner_args, **inner_kwargs)

  File "C:\***.py", line 431, in wrapper
    return func(*inner_args, **inner_kwargs)

  File "C:\***.py", line 960, in draw
    center = self.transWedge.transform((0.5, 0.5))

  File "C:\***.py", line 1765, in transform
    return self.transform_affine(values)

  File "C:\***.py", line 1830, in transform_affine
    mtx = self.get_matrix()

  File "C:\***.py", line 2619, in get_matrix
    inl, inb, inw, inh = self._boxin.bounds

  File "C:\***.py", line 395, in bounds
    (x0, y0), (x1, y1) = self.get_points()

  File "C:\***.py", line 759, in get_points
    wedge = mpatches.Wedge(self._center, points[1, 1],

  File "C:\***.py", line 1167, in __init__
    self._recompute_path()

  File "C:\***.py", line 1179, in _recompute_path
    arc = Path.arc(theta1, theta2)

  File "C:\***.py", line 950, in arc
    n = int(2 ** np.ceil((eta2 - eta1) / halfpi))

ValueError: cannot convert float NaN to integer

Here is a link to my csv file

Here is my code:

import pandas as pd
from windrose import WindroseAxes
import matplotlib.pyplot as plt
import matplotlib.cm as cm

wr = pd.read_csv('IVANCALEX_forSO.csv')
wr = wr.set_index('date')
wr.index = pd.to_datetime(wr.index)

wr["Month"] = wr.index.month
wr['Hour'] = wr.index.hour

month_dict = {1: "January", 2: "February", 3: "March", 4: "April",
               5: "May", 6: "June", 7: "July", 8: "August", 9: "September",
               10: "October", 11: "November", 12: "December"}

xval = ["dir_3135"]
yval = ['Calexico, 604 Kubler Rd', 'Calexico, Alvarez', 'Calexico, Encinas Ave and Ethel St', 'Calexico, Ethel',
       'Calexico, Housing Authority', 'Calexico, Housing Authority West', 'Calexico, Residence', 
       'Holtville, 1015 Miller Rd', 'Holtville, South', '1201 West Hwy 98']

months = [v for k,v in month_dict.items()]
nrows, ncols = 2,6

#bins=np.logspace(0, 4, num=5) #pm10
#bins=np.arange(0, 1, .2) #pm2.5/pm10

for x,y in zip(xval,yval):
    fig = plt.figure(figsize=(15, 10))
    plt.subplots_adjust(hspace=0.5)
    site_name = y.split(",")[0].replace(" ", "_")
    fname = f"pollutionrose_{site_name}.png"
    bins=[-60,-10,0,10,40] #ozone deviations
    fig.tight_layout()
    for i, month in enumerate(months):
        d =  wr[wr["Month"].eq(month)].reset_index(drop=True)
        ax = fig.add_subplot(nrows, ncols, i + 1, projection="windrose")
        ax.set_title(month.capitalize(),fontsize=20, weight='bold')
        ax.bar(d[x], d[y],
           normed=True, opening=0.8,
           bins=bins, cmap=cm.rainbow,
           nsector=8)
        ax.set_xticklabels(['E', 'N-E', 'N', 'N-W', 'W', 'S-W', 'S', 'S-E'],fontsize=18)
        ax.tick_params(axis="y", labelsize=12.5)
        #ax.set_legend(decimal_places=1,fontsize='x-large', loc='best')
        #ax.set_yticklabels(np.arange(11, 77, step=11), fontsize=18)
    ax.figure.savefig(fname, dpi=400) #(8, 56, step=8)

I am unsure why I am getting this error message because I have worked with data that has many NaN values in the past and had no problem. Are there potentially too many NaN values to perform this analysis?

I tried to make this modification:

for i, month in enumerate(months):
    d =  wr[wr["Month"].eq(month)].reset_index(drop=True)
    ax = fig.add_subplot(nrows, ncols, i + 1, projection="windrose")
    ax.set_title(month.capitalize(),fontsize=20, weight='bold')

    # Drop rows with NaN values in d[x] or d[y]
    if d[x].isna().any() or d[y].isna().any():
        d = d.dropna(subset=[x, y])
    
    ax.bar(d[x], d[y],
           normed=True, opening=0.8,
           bins=bins, cmap=cm.rainbow,
           nsector=8)
    ax.set_xticklabels(['E', 'N-E', 'N', 'N-W', 'W', 'S-W', 'S', 'S-E'],fontsize=18)
    ax.tick_params(axis="y", labelsize=12.5)

but it did not resolve the issue. Ultimately I want the result to look like the attached picture.

[image: pollution rose]

UPDATE: I tried to replace the NaN values with -250 since I don't need those values anyway and I am still getting the same error message. The error comes from this portion of the code:

ax.bar(d[x], d[y],
           normed=True, opening=0.8,
           bins=bins, cmap=cm.rainbow,
           nsector=8)

When I inspect the d variable, it is empty, so the code is attempting to make a wind rose with no data. I am not sure why this is happening.

please help!


Solution

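  • You iterate over the month names and then compare the values in the Month column (integers) with each name. An integer never equals a string such as "January", so d is empty for every month and the wind rose is drawn from no data, which is most likely where the NaN in the traceback comes from.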

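    You can simplify your code: remove the line months = [v for k,v in month_dict.items()], iterate over month_dict.items() directly, and use the integer key i both for the filter and for the subplot position. Adjust the following lines in the nested for loop:

    ...
        for i, month in month_dict.items():
            d = wr[wr["Month"].eq(i)].reset_index(drop=True)
    ...
            ax = fig.add_subplot(nrows, ncols, i, projection="windrose")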
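The mismatch described above can be reproduced without any plotting. The following is a minimal sketch, not the asker's actual data: the frame is synthetic, and the pm25 column and the monthly_subsets helper are hypothetical names introduced only for illustration. It shows that filtering the integer Month column by a month name selects nothing, while filtering by the integer key selects the expected rows.

```python
import pandas as pd

# Synthetic stand-in for the CSV (assumption: the real file has a datetime
# index and a wind-direction column; "pm25" is a made-up column name).
idx = pd.date_range("2023-01-01", periods=6, freq="MS")  # Jan..Jun, one row each
wr = pd.DataFrame(
    {"dir_3135": [10.0, 95.0, 180.0, 270.0, 45.0, 300.0],
     "pm25": [5.0, 12.0, 8.0, 20.0, 3.0, 15.0]},
    index=idx,
)
wr["Month"] = wr.index.month  # integers, as in the question

month_dict = {1: "January", 2: "February", 3: "March"}

# Comparing the integer column with a name matches no rows.
empty = wr[wr["Month"].eq("January")]

# Comparing with the integer key selects the January row.
january = wr[wr["Month"].eq(1)]


def monthly_subsets(frame, names):
    """Yield (month number, month name, rows) and skip months with no rows."""
    for number, name in names.items():
        subset = frame[frame["Month"].eq(number)].reset_index(drop=True)
        if subset.empty:
            continue  # nothing to plot for this month
        yield number, name, subset


subsets = list(monthly_subsets(wr, month_dict))
print(len(empty), len(january), [name for _, name, _ in subsets])
```

In the plotting loop, the month number would serve as the subplot position and the name as the title. Skipping empty subsets is an optional guard, assuming a blank panel is acceptable for months with no observations.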