I have data like that shown in the table below. I want to find the point where the trend changes when a line is plotted with x = date_code and y = weight_kg, something like the attached image. After that point, the weight_kg values should show a steady decline or increase.
This is what my data looks like:
   date_code  weight_kg
0        354     215.16
1        355     502.59
2        356     568.15
3        357     328.20
4        358     824.07
I'm trying to find the change point in the trend when the data is plotted. The original data contains many weight_kg and date_code values. I want to divide date_code into specific periods and identify the change point in the weight_kg trend for each period. You can access the data on the link below. I am using Python. The data shared is an example of one date_code period. You can change the date window as you like.
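To illustrate what I mean by dividing date_code into periods, here is a minimal sketch. It assumes fixed-width periods of 30 date codes; the helper name split_into_periods and the period_length value are hypothetical and only for illustration, and the date codes in the example are made up.

```python
import numpy as np

def split_into_periods(date_codes, period_length=30):
    """Assign each date_code to a fixed-width period (hypothetical helper)."""
    codes = np.asarray(date_codes)
    # Period 0 starts at the smallest date_code; each period spans
    # `period_length` consecutive date codes.
    return (codes - codes.min()) // period_length

# Made-up date codes, not the real dataset
codes = [354, 355, 356, 357, 358, 384, 385, 414]
period_ids = split_into_periods(codes, period_length=30)
print(period_ids)
```

Each period would then be analysed separately, so one change point is reported per period.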
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Read data from Excel file
df = pd.read_excel("stckovflw.xlsx")
# Extract date_code and weight_kg columns
date_code = df['date_code']
weight_kg = df['weight_kg']
# Calculate differences between consecutive weight values
weight_diff = np.diff(weight_kg)
# Find the index of the maximum weight difference
max_diff_index = np.argmax(np.abs(weight_diff))
# Calculate a trailing (backward-looking) and a leading (forward-looking) moving average
# with a window of your choice (e.g., 5); these are used as the positive and negative trends
window = 5
positive_trend_avg = weight_kg.rolling(window=window).mean()
negative_trend_avg = weight_kg[::-1].rolling(window=window).mean()[::-1]
# Calculate the overall moving average (identical to positive_trend_avg)
overall_moving_avg = weight_kg.rolling(window=window).mean()
# Use the peak of the trailing moving average as the changepoint
# (this overwrites the max_diff_index computed from the differences above)
max_diff_index = positive_trend_avg.idxmax()
# Calculate slopes from the moving-average value at the changepoint to the last day's weight
slope_positive = (weight_kg.iloc[-1] - positive_trend_avg.iloc[max_diff_index]) / (date_code.iloc[-1] - date_code.iloc[max_diff_index])
slope_negative = (weight_kg.iloc[-1] - negative_trend_avg.iloc[max_diff_index]) / (date_code.iloc[-1] - date_code.iloc[max_diff_index])
# Plot the data, the detected changepoint, the positive and negative trend moving averages,
# the slopes from changepoint to the last day's weight, and the overall moving average
plt.plot(date_code, weight_kg, marker='o', linestyle='-', color='b', label='Weight')
plt.plot(date_code, positive_trend_avg, color='orange', linestyle='--', label=f'{window}-Day Positive Trend')
plt.plot(date_code, negative_trend_avg, color='green', linestyle='--', label=f'{window}-Day Negative Trend')
plt.plot([date_code.iloc[max_diff_index], date_code.iloc[-1]], [positive_trend_avg.iloc[max_diff_index], weight_kg.iloc[-1]], color='purple', linestyle='-', label='Positive Trend Slope')
plt.plot([date_code.iloc[max_diff_index], date_code.iloc[-1]], [negative_trend_avg.iloc[max_diff_index], weight_kg.iloc[-1]], color='red', linestyle='-', label='Negative Trend Slope')
plt.axvline(x=date_code.iloc[max_diff_index], color='r', linestyle='--', label='Changepoint')
plt.xlabel('Date Code')
plt.ylabel('Weight (kg)')
plt.title('Weight Trend with Positive and Negative Trend Moving Averages, Slopes, Overall Moving Average, and Detected Changepoint')
plt.legend()
plt.show()
# Print the detected changepoint
print("Detected changepoint:")
print("Date Code:", date_code.iloc[max_diff_index])
print("Weight (kg):", weight_kg.iloc[max_diff_index])
# Print the calculated slopes
print("Positive Trend Slope:", slope_positive)
print("Negative Trend Slope:", slope_negative)
Output:
Detected changepoint:
Date Code: 396
Weight (kg): 3155.09
Positive Trend Slope: -16.937824999999997
Negative Trend Slope: -12.673725000000001
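To make the goal more concrete, one possible way to formalize a change point in trend is a two-segment linear fit: try every split position, fit a separate straight line to each side, and keep the split with the lowest total squared error. This is only a sketch of that idea under my own assumptions, not the method used above; the helper name find_trend_changepoint and the min_size parameter are hypothetical, and the data is synthetic.

```python
import numpy as np

def find_trend_changepoint(x, y, min_size=3):
    """Return the index k that starts the second segment of the best two-line fit.

    Hypothetical helper: points before k and points from k onward are each
    fitted with their own straight line; k minimizes the total squared error.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_k, best_sse = None, np.inf
    # Each segment keeps at least `min_size` points
    for k in range(min_size, len(x) - min_size + 1):
        sse = 0.0
        for xs, ys in ((x[:k], y[:k]), (x[k:], y[k:])):
            slope, intercept = np.polyfit(xs, ys, 1)
            sse += np.sum((ys - (slope * xs + intercept)) ** 2)
        if sse < best_sse:
            best_k, best_sse = k, sse
    return best_k

# Synthetic data (not the real dataset): rises for 10 points, then declines
x = np.arange(354, 374)
y = np.concatenate([100.0 * np.arange(10), 850.0 - 60.0 * np.arange(10)])
k = find_trend_changepoint(x, y)
print("Change point at date_code:", x[k])
```

With real, noisy data the split found this way depends on min_size, and a single straight line per side may be too simple, so this only shows the kind of result I am looking for.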