python pandas dataframe scikit-learn quandl

How to smoothly impute values in a Pandas DataFrame?


I am doing a data science project using Streamlit, Pandas and the Quandl Nasdaq Nordic Dataset.


When I fetch the data with the Python Quandl module and plot it with streamlit.area_chart or streamlit.line_chart, the series appears to have some missing values, or values that drop to 0. I wanted to impute these, but whether I used the mean or the median, the imputed data had wide flat sections.



Here is a zoomed-in view of the flat areas:

[Image: Zoom in on Flat Areas on My Graph]


I obviously don't want this. Is there another way to impute values, with pandas, sklearn's SimpleImputer, or any other tool, that preserves the trend in the imputed values?


One idea is to take an average of the surrounding rows, like a moving average, but I am not sure how to implement this or whether it is the best approach.
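
For reference, the moving-average idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Close` column and its values are made up, and zeros are assumed to be missing readings rather than real prices.

```python
import numpy as np
import pandas as pd

# Hypothetical price series with one missing value and one spurious zero.
df = pd.DataFrame({"Close": [10.0, 11.0, np.nan, 13.0, 0.0, 15.0, 16.0]})

# Treat zeros as missing so they get imputed too (an assumption about the data).
values = df["Close"].replace(0, np.nan)

# Centered moving average over 3 rows. NaNs are skipped, and min_periods=1
# lets a window produce a value as long as at least one neighbour is present.
local_mean = values.rolling(window=3, center=True, min_periods=1).mean()

# Fill only the gaps; observed values are left untouched.
df["Close_imputed"] = values.fillna(local_mean)
print(df)
```

A wider gap than the window would still be left as NaN here, so the window size would need to be chosen with the longest expected gap in mind.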


Thank you for your time.



Solution

  • Thanks to ifly6, I have found the solution.

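    By default, DataFrame.interpolate() fills NaN values linearly between the neighbouring observed points, so the filled values follow the trend instead of going flat. Simply assign the interpolated version of the DataFrame, as below: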

    data = df.interpolate()
    

    Simple!
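
Since the question also mentions values that drop to 0, note that interpolate() only fills NaN. A minimal sketch of handling both cases is below. It assumes a genuine price is never exactly 0, and the `Close` column, dates and values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices; the NaN and the 0.0 both stand for missing readings.
idx = pd.date_range("2021-01-01", periods=6, freq="D")
df = pd.DataFrame({"Close": [100.0, np.nan, 104.0, 0.0, 108.0, 110.0]}, index=idx)

# interpolate() ignores zeros, so mark them as missing first
# (assumes a real price is never exactly 0).
cleaned = df.replace(0, np.nan)

# The default method="linear" draws a straight line between the
# neighbouring observed values, treating rows as equally spaced.
data = cleaned.interpolate()
print(data)
```

If the rows are unevenly spaced in time and the index is a DatetimeIndex, passing method="time" may be a better fit, since it weights by the actual gap between timestamps.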