Scenario: I have a pandas Series that contains 3 values. Each value can be NaN, 0, or any value above zero. I am trying to get the pct_change across the series whenever possible.
Examples:
[0,nan,50]
[0,0,0]
[0,0,50]
[nan,nan,50]
[nan,nan,0]
[0,0,nan]
[0,nan,0]
What I tried: based on other SO questions, I came up with methods that either ignore the NaN values or shift the series, but these can yield a result with empty values. Ideally, if a result cannot be calculated, I would like to output 0.
Code tried:
import pandas as pd
series_test = pd.Series([0,None,50])
series_test.pct_change().where(series_test.notna()) # tested but gives only NaN or inf
series_test.pct_change(fill_method=None)[series_test.shift(2).notnull()].dropna() # tested but gives empty result
Question: What would be the correct way to approach this?
Expected outputs:
[0,nan,50] - 0 (undefined case)
[0,0,0] - 0 (undefined case)
[0,0,50] - 0 (undefined case)
[nan,nan,50] - 0 (undefined case)
[nan,nan,0] - 0 (undefined case)
[0,0,nan] - 0 (undefined case)
[0,nan,0] - 0 (undefined case)
[1,nan,5] - 400%
[0,1,5] - 400%
[1,2,nan] - 100%
[1,1.3,1.8] - 80%
I think you could dropna, then compute the pct_change, and only keep the max finite value:
series_test.dropna().pct_change().loc[np.isfinite].max()
Or maybe:
series_test.pct_change().where(np.isfinite, 0).max()
Example output for the second approach:
[0, nan, 50] - 0.0
[0, 0, 0] - 0.0
[0, 0, 50] - 0.0
[nan, nan, 50] - 0.0
[nan, nan, 0] - 0.0
[0, 0, nan] - 0.0
[0, nan, 0] - 0.0
[1, nan, 5] - 4.0
[0, 1, 5] - 4.0
[0, 1, nan] - 0.0
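For reference, here is a minimal self-contained sketch of the second approach. The helper name `max_finite_pct_change` is made up for illustration, and the explicit `ffill()` combined with `fill_method=None` is an assumption on my part, meant to reproduce the forward-filling behaviour regardless of pandas version (recent versions deprecate the implicit fill in `pct_change`):

```python
import numpy as np
import pandas as pd


def max_finite_pct_change(values):
    """Largest finite step-to-step percentage change, or 0 if there is none.

    Hypothetical helper wrapping the one-liner above.
    """
    s = pd.Series(values, dtype="float64")
    # Forward-fill explicitly, then disable pct_change's own (deprecated) fill.
    pct = s.ffill().pct_change(fill_method=None)
    # Replace NaN and +/-inf with 0, then keep the largest remaining change.
    return float(pct.where(np.isfinite(pct), 0).max())


examples = [
    [0, np.nan, 50],
    [0, 0, 0],
    [np.nan, np.nan, 50],
    [1, np.nan, 5],
    [0, 1, 5],
    [0, 1, np.nan],
]
for values in examples:
    print(values, "-", max_finite_pct_change(values))
```

Keep in mind that this returns the largest change between consecutive values, not the change from the first to the last value, so [1, 1.3, 1.8] gives roughly 0.385 (the step from 1.3 to 1.8) rather than 0.8.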
Edit: given your comment, it looks like you want to use the first and last positive values (ignoring NaN and 0) to compute the percentage change.
In this case, I'd use a custom function:
def pct_chg(s):
    tmp = s[s>0]
    if len(tmp)>1:
        return (tmp.iloc[-1]-tmp.iloc[0])/tmp.iloc[0]
    return 0
Which should be equivalent to the more verbose:
(series_test
 .where(series_test>0).bfill().ffill()
 .iloc[[0, -1]].pct_change().fillna(0).iloc[-1]
)
Example:
[0, nan, 50] - 0
[0, 0, 0] - 0
[0, 0, 50] - 0
[nan, nan, 50] - 0
[nan, nan, 0] - 0
[0, 0, nan] - 0
[0, nan, 0] - 0
[1, nan, 5] - 4.0
[0, 1, 5] - 4.0
[0, 1, nan] - 0
[1, 1.5, 1.6] - 0.6000000000000001
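To check that the two versions agree, here is a small sketch that runs both over the examples. `pct_chg_chained` is just a hypothetical name for the chained expression, and `fill_method=None` is my addition to avoid the deprecated implicit fill in recent pandas. It should not change the result here, since the NaN values have already been filled by `bfill().ffill()`.

```python
import numpy as np
import pandas as pd


def pct_chg(s):
    # Keep only strictly positive values; NaN > 0 is False, so NaN is dropped too.
    tmp = s[s > 0]
    if len(tmp) > 1:
        return (tmp.iloc[-1] - tmp.iloc[0]) / tmp.iloc[0]
    return 0


def pct_chg_chained(s):
    # Hypothetical name for the chained variant of the same idea.
    return (
        s.where(s > 0).bfill().ffill()   # mask non-positive values, fill the gaps
        .iloc[[0, -1]]                   # keep first and last element only
        .pct_change(fill_method=None)    # change from first to last
        .fillna(0)
        .iloc[-1]
    )


examples = [
    [0, np.nan, 50],
    [0, 0, 0],
    [np.nan, np.nan, 0],
    [1, np.nan, 5],
    [0, 1, 5],
    [1, 2, np.nan],
    [1, 1.3, 1.8],
]
for values in examples:
    s = pd.Series(values, dtype="float64")
    print(values, "-", pct_chg(s), "-", pct_chg_chained(s))
```

Both versions return 0 when fewer than two positive values are available, which covers the "undefined" cases from the question.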