python, signal-processing, data-cleaning, noise-reduction

Cleaning up Data with Variably Noisy Signal


I have been running a study where we unexpectedly received different amounts of noise in our signal depending on how it was collected. Now I am trying to learn how to filter out the noise, though I am fairly new to signal processing / cleaning. With one method of data collection, we got a lot less noise, as shown here: [plot of the low-noise signal]

These are weight measurements on a scale where the object being weighed is removed, manipulated, and then reweighed, which results in the "stair step" pattern. The intent is to use this data later with a change detection algorithm to identify where the weight changes (i.e., the "stair steps") occur, since the volume of data is high enough that doing it by eye is untenable. I think what I need is a low-pass filter to remove those brief instances where the signal spikes. However, with another form of collection we received a lot more noise: [plot of the high-noise signal]

I am literate in Python and R, but am unsure where to begin cleaning such disparate-looking data while preserving its integrity for the subsequent change detection analysis. I thought about simply deleting the high and low weight values, but this causes problems for the change detection algorithm. Any suggestions are appreciated, and I've attached sample data:

https://docs.google.com/spreadsheets/d/1qLFmuP3CfU40MjV9-o2zos4ylTXzHcwo/edit?usp=sharing&ouid=100771405497110181259&rtpof=true&sd=true


Solution

  • You could apply a median filter. It removes both glitches and small fluctuations in the measurements, and it has only one parameter to tune: the size of the window.

    from scipy.ndimage import median_filter
    import matplotlib.pyplot as plt

    # x is the raw 1-D array of weight measurements loaded from your data
    plt.figure(figsize=(14, 4))
    plt.plot(median_filter(x, 10), label='window=10')
    plt.plot(median_filter(x, 100), label='window=100')
    plt.legend()
    plt.grid()
    plt.show()


    [plot of the median-filtered signal for window sizes 10 and 100]
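
  • Once the signal is smoothed, locating the stair steps can be as simple as thresholding the sample-to-sample differences. Here is a minimal sketch, assuming the filtered signal is computed as above and that a hand-picked threshold (the value below is hypothetical) separates real weight changes from residual noise:

    import numpy as np
    from scipy.ndimage import median_filter

    smoothed = median_filter(x, 100)   # x: raw weight measurements, as above
    jumps = np.abs(np.diff(smoothed))  # magnitude of sample-to-sample change
    threshold = 5.0                    # hypothetical, in your data's weight units
    # indices of the samples immediately after each detected step
    change_points = np.where(jumps > threshold)[0] + 1

    If the steps turn out to be gradual rather than abrupt, a dedicated change point detection library such as ruptures may work better than a fixed threshold, but the filtered signal from the median filter is a suitable input either way.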