matplotlib graph nan

matplotlib graphs: how to connect points with lines when a column has empty values


I have a graph with multiple parameters, all taken from one big dataframe. One of the parameters has only a few points; the rest of its values are empty (I can make them NaN or zero). How can I connect these points with lines on the graph?

I guess one solution is to copy that parameter to a separate dataframe, drop all the empty values, and then add it to the graph alongside the other parameters from the main dataframe. But I hope there is a faster and smarter way.

For example, Excel has a setting for this when creating a graph: choose data - empty and hidden values - show empty cells as line. Can we do something similar in Python?


Solution

  • According to the "Plotting masked and NaN values" demo in the matplotlib documentation, a line is broken wherever the data contain NaN, so the points cannot be connected without removing the NaNs first. This means each column with NaNs has to be handled separately.

    Using pandas' df.dropna() and df.plot() makes this easier because the index is preserved, so the remaining points keep their original x positions. If you call matplotlib's plot() function directly, you have to keep the x values consistent yourself, as shown in the documentation.

    Here is how to do it using pandas:

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame(
        {
            "A": [1, 3, 5, 3, 1, 7],
            "B": [2, 1, 9, 6, 7, 8],
            "C": [3, np.nan, np.nan, 11, np.nan, 4],
            "D": [4, 8, 7, 2, 5, 7],
            "E": [5, 3, np.nan, 1, np.nan, 8],
        }
    )
    
    # Separate columns with NaN values from columns without
    nan_cols = df.columns[df.isna().any()]
    full_cols = df.columns.difference(nan_cols)
    
    # Plot full columns all at once and keep the Axes for reuse
    ax = df[full_cols].plot(marker="o", alpha=0.3)
    
    # Plot cols with NaNs separately after dropping NaNs, on the same Axes
    for col in nan_cols:
        df[col].dropna().plot(ax=ax, marker="o")
    
    plt.legend()
    plt.tight_layout()
    plt.show()
    

    Plot with NaNs
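
For the case mentioned above where matplotlib's plot() is called directly, a minimal sketch could look like the following. The helper name plot_skipping_nans and the sample data are made up for illustration. The idea is to build a boolean mask of the finite values and apply it to both x and y, so each remaining point keeps its original x position.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_skipping_nans(ax, x, y, **kwargs):
    """Plot y against x on ax, connecting only the non-NaN points.

    Hypothetical helper, not part of matplotlib.
    """
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    mask = np.isfinite(y)  # True where y holds a finite value
    # Apply the same mask to x and y so the points stay aligned
    return ax.plot(x[mask], y[mask], **kwargs)


x = np.arange(6)
c = [3, np.nan, np.nan, 11, np.nan, 4]  # same values as column "C"

fig, ax = plt.subplots()
(line,) = plot_skipping_nans(ax, x, c, marker="o", label="C")
ax.legend()
```

Calling plt.show() afterwards displays the figure. Other columns can be drawn on the same ax with ordinary ax.plot() calls.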