I have been trying to run the code below to calculate upper and lower confidence limits using the t distribution, but it keeps throwing the error in the subject. The code is as follows:
def trans_threshold(Day):
    Tran_Cnt=Tran_Cnt_DF[['Sample',Day]].dropna()
    Tran_Cnt=Tran_Cnt.astype({'Sample':'str'})
    Tran_Cnt.dtypes
    #Finding outliers in Materiality via IQR
    X_Tran = Tran_Cnt.drop('Sample', axis=1)
    Tran_arr1 = X_Tran.values
    #Finding the first quartile
    Tran_q1= np.quantile(Tran_arr1, 0.25)
    # finding the 3rd quartile
    Tran_q3 = np.quantile(Tran_arr1, 0.75)
    # finding the iqr region
    Tran_iqr = Tran_q3-Tran_q1
    # finding upper and lower outliers
    Tran_upper_bound = Tran_q3+(1.5*Tran_iqr)
    Tran_lower_bound = Tran_q1-(1.5*Tran_iqr)
    # removing outliers
    Tran_arr2 = Tran_arr1[(Tran_arr1 >= Tran_lower_bound) & (Tran_arr1 <= Tran_upper_bound)]
    #Using t distribution for Materiality Limits
    Tran_Threshold_mat=st.t.interval(alpha=0.99999999999, df=len(Tran_arr2-1),
                                     loc=np.mean(Tran_arr2),
                                     scale=st.sem(Tran_arr2))
    return Tran_Threshold_mat

trn_lim_FullFeed_Mon = trans_threshold(Day)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[106], line 19
17 Tran_arr2 = Tran_arr1[(Tran_arr1 >= Tran_lower_bound) & (Tran_arr1 <= Tran_upper_bound)]
18 #Using t distribution for Materiality Limits
---> 19 Tran_Threshold_mat=st.t.interval(alpha=0.99999999999, df=len(Tran_arr2-1),
20 loc=np.mean(Tran_arr2),
21 scale=st.sem(Tran_arr2))
TypeError: rv_generic.interval() missing 1 required positional argument: 'confidence'
The issue seems to be with the piece of code below. I have provided all the parameters required to calculate the confidence interval, including the degrees of freedom, but it still gives this error. Where am I going wrong, and what needs to be done?
Tran_Threshold_mat=st.t.interval(alpha=0.99999999999, df=len(Tran_arr2-1),
                                 loc=np.mean(Tran_arr2),
                                 scale=st.sem(Tran_arr2))
Also, the Tran_arr2 array looks like this:
array([12617., 12000., 1123., 537., 8605., 4365., 11292., 12231.,
7640., 9583., 9257., 13864., 14682., 11744., 10501., 8694.,
5327., 10066., 13022., 11092., 7444., 11658., 14920., 12849.,
14681., 5719., 11029., 3814., 14703., 5593., 9772., 8851.,
9551., 15975., 6532., 13827., 8547.])
Hence, there is no issue up until the last line of the code block, which estimates the confidence interval using the t distribution.
I have used the below packages:
import pandas as pd
import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt
import matplotlib.ticker as tkr
import matplotlib.scale as mscale
from matplotlib.ticker import FixedLocator, NullFormatter
pd.options.display.float_format = '{:.0f}'.format
pd.options.mode.chained_assignment = None
Note that the signature of scipy.stats.t.interval is interval(confidence, df, loc=0, scale=1). There is no alpha keyword: pass the confidence level positionally, or rename the keyword to confidence.
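As a minimal sketch of that fix, the call could look like the following. The helper name `t_confidence_interval` and the small sample array are hypothetical and only for illustration. The confidence level is passed positionally, which avoids depending on the keyword name. The sketch also assumes the intended degrees of freedom are n - 1: `len(Tran_arr2-1)` in the question subtracts 1 from every element and then takes the length, so it evaluates to n rather than n - 1.

```python
import numpy as np
import scipy.stats as st


def t_confidence_interval(values, confidence=0.99):
    """Return (lower, upper) t-based confidence limits for the mean of `values`.

    Hypothetical helper for illustration; not part of SciPy.
    """
    values = np.asarray(values, dtype=float)
    # Assumption: degrees of freedom are n - 1.
    # Note the parentheses: len(values) - 1, not len(values - 1).
    dof = len(values) - 1
    # Confidence level and df are passed positionally.
    return st.t.interval(confidence, dof,
                         loc=np.mean(values),
                         scale=st.sem(values))


# Small made-up sample, only to show the call shape.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
lower, upper = t_confidence_interval(sample, 0.95)
print(lower, upper)
```

Inside `trans_threshold`, the same change would amount to replacing `alpha=0.99999999999` with `confidence=0.99999999999` (or passing the value positionally) and, if n - 1 is what was meant, writing `df=len(Tran_arr2)-1`.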