Using GEKKO Python, we have trouble trying to learn a parameter that can vary multiple times per day. In some disciplines this is also known as 'regime detection or regime change detection'. We (me and my colleague Henri ter Hofte from Windesheim University of Applied Sciences) conceived 3 strategies but fail (more below).
Our question(s):
Your help is much appreciated.
=== The problem: We have time series data about: (1) CO₂ concentration (2) ventilation rates (or rather: valve fractions, which give ventilation rates, when multiplied with known maximum ventilation rates) (3) occupancy (number of persons in a room)
For research question (A) we would like to know a proper estimate for (2) for each hour of the day, given time series data about (1) and (3). For research question (B) we would like to know a proper estimate for (3) for each hour of the day, given time series data about (1) and (2).
We focus on research question A (but have similar questions for B).
=== The 3 strategies:
We considered 3 different strategies how to implement this using GEKKO Python: Strategy I. Declare the variable valve_frac as a Manipulated Variabe in our GEKKO model (m.MV), since the GEKKO documentation says these variables can be "adjusted by the optimizer to minimize an objective function at every time point". and "Manipulated variables are like FVs, but can change with each data row, either calculated by the optimizer (STATUS=1) or specified by the user (STATUS=0)." according to https://gekko.readthedocs.io/en/latest/imode.html#mv
Strategy II. Split the time into several shorter time spans (e.g. one time span per hour) and then learn valve_frac as a GEKKO Fixed Variable (m.FV), one for each hour.
Strategy III. Reframe the problem to GEKKO as if it were a control problem: the setpoint is reaching a particular CO2 concentration and GEKKO has can use valve_frac as a Control Variable (m.CV)
We tried implementing strategy I (see more info and code below) but fail to get proper results. Considering some equation derived from physics, we intend to find the best value for some specific variable (valve_frac__0 variable in following table. Having a dataframe (df_learn) like this:
Index | Date-Time | occupancy__p | valve_frac__0 | co2__ppm |
---|---|---|---|---|
1 | 2022.12.01 – 00:00:00 | 0 | 0.51 | 546 |
2 | 2022.12.01 – 00:15:00 | 4 | 0.85 | 820 |
3 | 2022.12.01 – 00:30:00 | 1 | 0.21 | 595 |
4 | 2022.12.01 – 00:45:00 | 2 | 0.74 | 635 |
5 | 2022.12.01 – 00:15:00 | 0 | 0.65 | 559 |
6 | 2022.12.01 – 00:15:00 | 0 | 0.45 | 538 |
7 | 2022.12.01 – 00:15:00 | 2 | 0.82 | 659 |
. | . | . | . | . |
. | . | . | . | . |
. | . | . | . | . |
1920 | 2022.12.20 – 00:15:00 | 3 | 0.73 | 749 |
We are trying to develop a moving horizon estimation model (IMODE=5) or Control model (IMODE=6) to predict the valve_frac__0 value. Following comes the code in Gekko format:
=== Code:
from gekko import GEKKO
# Gekko Model - Initialize
m = GEKKO(remote = False)
m.time = np.arange(0, duration__s, step__s)
# Conversion factors
s_min_1 = 60
min_h_1 = 60
s_h_1 = s_min_1 * min_h_1
mL_m_3 = 1e3 * 1e3
million = 1e6
# Constants
MET__mL_min_1_kg_1_p_1 = 3.5
desk_work__MET = 1.5
P_std__Pa = 101325
R__m3_Pa_K_1_mol_1 = 8.3145
T_room__degC = 20.0
T_std__degC = 0.0
T_zero__K = 273.15
T_std__K = T_zero__K + T_std__degC
T_room__K = T_zero__K + T_room__degC
infilt__m2 = 0.001
# Approximations
room__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_room__K)
std__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_std__K)
co2_ext__ppm = 415
# National averages
weight__kg = 77.5
MET__m3_s_1_p_1 = MET__mL_min_1_kg_1_p_1 * weight__kg / (s_min_1 * mL_m_3)
MET_mol_s_1_p_1 = MET__m3_s_1_p_1 * std__mol_m_3
co2_o2 = 0.894
co2__mol0_p_1_s_1 = co2_o2 * desk_work__MET * MET_mol_s_1_p_1
# Room averages
wind__m_s_1 = 3.0
# GEKKO Manipulated Variables: measured values
occupancy__p = m.MV(value = df_learn.occupancy__p.values)
occupancy__p.STATUS = 0; occupancy__p.FSTATUS = 1
# Strategy I:
valve_frac__0 = m.MV(value = df_learn.valve_frac__0.values)
valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# Strategy II:
#valve_frac__0 = m.FV(value = df_learn.valve_frac__0.values)
#valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# GEKKO Control Varibale (predicted variable)
co2__ppm = m.CV(value = df_learn.co2__ppm.values)
co2__ppm.STATUS = 1; co2__ppm.FSTATUS = 1
# GEKKO - Equations
co2_loss__ppm_s_1 = m.Intermediate((co2__ppm - co2_ext__ppm) * (vent_max__m3_s_1 * valve_frac__0 + wind__m_s_1 * infilt__m2) / room__m3)
co2_gain_mol0_s_1 = m.Intermediate(occupancy__p * co2__mol0_p_1_s_1 / (room__m3 * room__mol_m_3))
co2_gain__ppm_s_1 = m.Intermediate(co2_gain_mol0_s_1 * million)
m.Equation(co2__ppm.dt() == co2_gain__ppm_s_1 - co2_loss__ppm_s_1)
# GEKKO - Solver setting
m.options.IMODE = 5
m.options.EV_TYPE = 1
m.options.NODES = 2
m.solve(disp = False)
The result which I got for each scenario come as follow:
There is no output for simulated “co2__ppm” and the output value for valve_frac__0 = 0
There is big difference between simulated and measured “co2__ppm” and the output value for valve_frac__0 = 0.166 (which is not reasonable)
The code looks like it should work as long as valve_frac__0
is the adjustable unknown parameter that should be estimated from the CO2 PPM data. Here is a result on a smaller subset of the posted data.
The data doesn't fit exactly if there is a lower bound of zero on the valve position.
valve_frac__0 = m.MV(value = valve_frac__0,lb=0)
Otherwise, the valve position can be adjusted to fit the CO2 data perfectly.
Here is a complete script with the sample data.
from gekko import GEKKO
import numpy as np
# Gekko Model - Initialize
m = GEKKO(remote = False)
# data
# 1 2022.12.01 – 00:00:00 0 0.51 546
# 2 2022.12.01 – 00:15:00 4 0.85 820
# 3 2022.12.01 – 00:30:00 1 0.21 595
# 4 2022.12.01 – 00:45:00 2 0.74 635
# 5 2022.12.01 – 00:15:00 0 0.65 559
# 6 2022.12.01 – 00:15:00 0 0.45 538
# 7 2022.12.01 – 00:15:00 2 0.82 659
occupancy__p = np.array([0,4,1,2,0,0,2])
valve_frac__0 = np.array([0.51,0.85,0.21,0.74,0.65,0.45,0.82])
co2__ppm_meas = np.array([546,820,595,635,559,538,659])
duration__s = len(co2__ppm_meas)
m.time = np.linspace(0,duration__s-1,duration__s)
vent_max__m3_s_1 = 1
room__m3 = 1
# Conversion factors
s_min_1 = 60
min_h_1 = 60
s_h_1 = s_min_1 * min_h_1
mL_m_3 = 1e3 * 1e3
million = 1e6
# Constants
MET__mL_min_1_kg_1_p_1 = 3.5
desk_work__MET = 1.5
P_std__Pa = 101325
R__m3_Pa_K_1_mol_1 = 8.3145
T_room__degC = 20.0
T_std__degC = 0.0
T_zero__K = 273.15
T_std__K = T_zero__K + T_std__degC
T_room__K = T_zero__K + T_room__degC
infilt__m2 = 0.001
# Approximations
room__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_room__K)
std__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_std__K)
co2_ext__ppm = 415
# National averages
weight__kg = 77.5
MET__m3_s_1_p_1 = MET__mL_min_1_kg_1_p_1 \
* weight__kg / (s_min_1 * mL_m_3)
MET_mol_s_1_p_1 = MET__m3_s_1_p_1 * std__mol_m_3
co2_o2 = 0.894
co2__mol0_p_1_s_1 = co2_o2 * desk_work__MET * MET_mol_s_1_p_1
# Room averages
wind__m_s_1 = 3.0
# GEKKO Manipulated Variables: measured values
occupancy__p = m.MV(value = occupancy__p)
occupancy__p.STATUS = 0; occupancy__p.FSTATUS = 1
# Strategy I:
valve_frac__0 = m.MV(value = valve_frac__0,lb=0)
valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# Strategy II:
#valve_frac__0 = m.FV(value = df_learn.valve_frac__0.values)
#valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# GEKKO Control Varibale (predicted variable)
co2__ppm = m.CV(value = co2__ppm_meas)
co2__ppm.STATUS = 1; co2__ppm.FSTATUS = 1
# GEKKO - Equations
co2_loss__ppm_s_1 = m.Intermediate((co2__ppm - co2_ext__ppm) \
* (vent_max__m3_s_1 * valve_frac__0 \
+ wind__m_s_1 * infilt__m2) / room__m3)
co2_gain_mol0_s_1 = m.Intermediate(occupancy__p \
* co2__mol0_p_1_s_1 / (room__m3 * room__mol_m_3))
co2_gain__ppm_s_1 = m.Intermediate(co2_gain_mol0_s_1 * million)
m.Equation(co2__ppm.dt() == co2_gain__ppm_s_1 - co2_loss__ppm_s_1)
# GEKKO - Solver setting
m.options.IMODE = 5
m.options.EV_TYPE = 1
m.options.NODES = 2
m.options.SOLVER = 1
m.solve(disp = True)
import matplotlib.pyplot as plt
plt.subplot(2,1,1)
plt.plot(m.time,valve_frac__0.value,'r-',label='Valve Frac')
plt.legend(); plt.grid(); plt.ylabel('Valve Frac')
plt.subplot(2,1,2)
plt.plot(m.time,co2__ppm_meas,'ko',label='Measured')
plt.plot(m.time,co2__ppm.value,'k--',label='Predicted')
plt.legend(); plt.grid()
plt.xlabel('Time'); plt.ylabel('CO2')
plt.savefig('results.png',dpi=300)
plt.show()
For question B, adjust the code to make the valve position fixed at the measured values and the occupancy determined by the optimizer.
occupancy__p = m.MV(value = occupancy__p)
occupancy__p.STATUS = 1; occupancy__p.FSTATUS = 0
# Strategy I:
valve_frac__0 = m.MV(value = valve_frac__0,lb=0)
valve_frac__0.STATUS = 0; valve_frac__0.FSTATUS = 1
Use occupancy__p.MV_STEP_HOR = 2
or higher decrease the frequency at which the optimized parameter can change (e.g. every 2 hours).