Tags: pandas, time-series, trend, trendline

Time-series trend analysis in Python


I have data like the table below. I want to find the point where the trend changes when a line is plotted with x = date_code and y = weight_kg, something like the attached image. After that point there should be a constant decline or increase in the weight_kg values.

This is what my data looks like:

   date_code  weight_kg
0        354     215.16
1        355     502.59
2        356     568.15
3        357     328.20
4        358     824.07

I'm trying to find the change point in the trend when the data is plotted. The original data contains many weight_kg and date_code values. I want to divide date_code into specific periods and identify the change point in the weight_kg trend for each period. You can access the data on the link below. I am using Python. The data shared is an example of a single date_code period.

This is what the plot would look like: [image: trend plot]


Solution

  • Here's your solution

    You can change the moving-average window (the window variable in the code) as you like.

    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np
    
    # Read data from Excel file
    df = pd.read_excel("stckovflw.xlsx")
    
    # Extract date_code and weight_kg columns
    date_code = df['date_code']
    weight_kg = df['weight_kg']
    
    # Moving-average window (number of rows); change it to suit your data
    window = 5
    
    # Trailing moving average: each value averages the current row and the window - 1 rows before it
    positive_trend_avg = weight_kg.rolling(window=window).mean()
    # Forward-looking moving average: the same calculation on the reversed series, flipped back
    negative_trend_avg = weight_kg[::-1].rolling(window=window).mean()[::-1]
    
    # Take the peak of the trailing moving average as the changepoint.
    # np.nanargmax returns a position, so it is safe to use with .iloc
    # even when the DataFrame index is not the default 0..n-1.
    max_diff_index = int(np.nanargmax(positive_trend_avg.to_numpy()))
    
    # Slopes from the moving-average value at the changepoint to the last day's weight.
    # negative_trend_avg is NaN for the last window - 1 rows, so slope_negative is NaN
    # if the changepoint falls there.
    slope_positive = (weight_kg.iloc[-1] - positive_trend_avg.iloc[max_diff_index]) / (date_code.iloc[-1] - date_code.iloc[max_diff_index])
    slope_negative = (weight_kg.iloc[-1] - negative_trend_avg.iloc[max_diff_index]) / (date_code.iloc[-1] - date_code.iloc[max_diff_index])
    
    # Plot the data, the detected changepoint, both moving averages,
    # and the slopes from the changepoint to the last day's weight
    plt.plot(date_code, weight_kg, marker='o', linestyle='-', color='b', label='Weight')
    plt.plot(date_code, positive_trend_avg, color='orange', linestyle='--', label=f'{window}-Day Positive Trend')
    plt.plot(date_code, negative_trend_avg, color='green', linestyle='--', label=f'{window}-Day Negative Trend')
    plt.plot([date_code.iloc[max_diff_index], date_code.iloc[-1]], [positive_trend_avg.iloc[max_diff_index], weight_kg.iloc[-1]], color='purple', linestyle='-', label='Positive Trend Slope')
    plt.plot([date_code.iloc[max_diff_index], date_code.iloc[-1]], [negative_trend_avg.iloc[max_diff_index], weight_kg.iloc[-1]], color='red', linestyle='-', label='Negative Trend Slope')
    plt.axvline(x=date_code.iloc[max_diff_index], color='r', linestyle='--', label='Changepoint')
    plt.xlabel('Date Code')
    plt.ylabel('Weight (kg)')
    plt.title('Weight Trend with Moving Averages, Slopes, and Detected Changepoint')
    plt.legend()
    plt.show()
    
    # Print the detected changepoint
    print("Detected changepoint:")
    print("Date Code:", date_code.iloc[max_diff_index])
    print("Weight (kg):", weight_kg.iloc[max_diff_index])
    
    # Print the calculated slopes
    print("Positive Trend Slope:", slope_positive)
    print("Negative Trend Slope:", slope_negative)
    

    Output:

    Detected changepoint:
    Date Code: 396
    Weight (kg): 3155.09
    Positive Trend Slope: -16.937824999999997
    Negative Trend Slope: -12.673725000000001
    

    [image: trend analysis output]
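
The question also asks for a change point in each date_code period, which the code above does not cover. One possible approach, not the method used in the answer above, is a two-segment piecewise linear fit: try every split position, fit a straight line to each side with least squares, and keep the split with the smallest total squared error. The sketch below assumes fixed-length periods; the function names, the period_length and min_points parameters, and the synthetic demo data are all made up for illustration.

```python
import numpy as np
import pandas as pd


def two_segment_changepoint(x, y, min_points=3):
    """Return the position of the first point of the second segment for the
    best two-line least-squares fit, or None if there are too few points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_pos, best_sse = None, np.inf
    # Each side of the split must keep at least min_points observations
    for k in range(min_points, len(x) - min_points + 1):
        sse = 0.0
        for xs, ys in ((x[:k], y[:k]), (x[k:], y[k:])):
            coeffs = np.polyfit(xs, ys, 1)  # straight-line fit
            sse += np.sum((ys - np.polyval(coeffs, xs)) ** 2)
        if sse < best_sse:
            best_pos, best_sse = k, sse
    return best_pos


def changepoints_by_period(df, period_length, x_col="date_code", y_col="weight_kg"):
    """Split x_col into consecutive periods of period_length units and
    return {period number: x value at the detected change point}."""
    period = (df[x_col] - df[x_col].min()) // period_length
    result = {}
    for period_id, group in df.groupby(period):
        group = group.sort_values(x_col)
        pos = two_segment_changepoint(group[x_col], group[y_col])
        result[int(period_id)] = None if pos is None else int(group[x_col].iloc[pos])
    return result


# Synthetic demo data (not the asker's file): two periods of 20 date codes each
t = np.arange(20)
first = np.where(t < 10, 20.0 * t, 290.0 - 12.0 * t)        # rises, then falls
second = np.where(t < 12, 300.0 - 8.0 * t, 100.0 + 5.0 * t)  # falls, then rises
demo = pd.DataFrame({
    "date_code": np.arange(354, 394),
    "weight_kg": np.concatenate([first, second]),
})

print(changepoints_by_period(demo, period_length=20))
# {0: 364, 1: 386}
```

On real, noisy data the best split is only an estimate, so it is worth plotting each period with its detected change point before relying on it. A larger min_points makes the fit less sensitive to a few outliers at either end of a period.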