I have this function to calculate log of returns. It works as expected.
def log_returns(prices):
    return np.log(prices / prices.shift(1))
data.apply(lambda x: log_returns(x))
The values returned are very close to those from the pct_change method. Is this expected?
data.pct_change()
Yes, it is expected: for small variations, the change in the natural log is almost equal to the percentage change. That's not a code issue.
Since:
log(A/B) = log(A) - log(B)
and in your case A is B changed by some small relative amount e, that is, A = B(1+e), we get:
log(A/B) = log(A) - log(B) = log(B(1+e)) - log(B)
log(A/B) = log(B) + log((1+e)) - log(B) = log(1+e)
For small values of e, the first-order expansion of the log around 1 is a good approximation:
log(1+e) ≈ e
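To get a feel for how quickly the approximation degrades, here is a minimal sketch. The sample values of e are made up for illustration, and the comparison with e^2/2 relies on the standard Taylor expansion log(1+e) = e - e^2/2 + ...

```python
import numpy as np

# Hypothetical relative changes, from 0.1% up to 50% (illustrative only).
e = np.array([0.001, 0.01, 0.05, 0.1, 0.5])

log_ret = np.log1p(e)      # log(1 + e), numerically accurate for small e
gap = e - log_ret          # amount by which the simple change exceeds the log change
second_order = e**2 / 2    # leading term of the Taylor remainder

for ei, li, gi, si in zip(e, log_ret, gap, second_order):
    print(f"e={ei:<6} log(1+e)={li:.6f}  gap={gi:.6f}  e^2/2={si:.6f}")
```

The gap is always positive and grows roughly like e^2/2, so the two measures are nearly indistinguishable for changes of a percent or so and drift apart for larger moves.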
For a more mathy explanation, see this SO post.
See for yourself with this code:
import pandas as pd
import numpy as np
# 100 evenly spaced values from 0.01 to 0.1
small = np.linspace(0.01, 0.1, 100)
df = pd.DataFrame({"vals": small})
# simple (percentage) change between consecutive rows
df["changes"] = df["vals"].pct_change()
# log return computed as log(A/B)
df["log div"] = np.log(df["vals"] / df["vals"].shift())
# log return computed as log(A) - log(B)
df["diff log"] = np.log(df["vals"]) - np.log(df["vals"].shift())
# both forms agree once the leading NaN is dropped
assert np.allclose(df["log div"].dropna(), df["diff log"].dropna())
and df.head(10):
       vals   changes   log div  diff log
0  0.010000       NaN       NaN       NaN
1  0.010909  0.090909  0.087011  0.087011
2  0.011818  0.083333  0.080043  0.080043
3  0.012727  0.076923  0.074108  0.074108
4  0.013636  0.071429  0.068993  0.068993
5  0.014545  0.066667  0.064539  0.064539
6  0.015455  0.062500  0.060625  0.060625
7  0.016364  0.058824  0.057158  0.057158
8  0.017273  0.055556  0.054067  0.054067
9  0.018182  0.052632  0.051293  0.051293
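Note that only log(1+e) ≈ e is an approximation; the identity log(A/B) = log(1+e) is exact, so the two kinds of return can be converted into each other without loss. A short sketch, assuming a made-up price series (the `prices` values are hypothetical, not the question's `data`):

```python
import numpy as np
import pandas as pd

# Hypothetical price series, for illustration only.
prices = pd.Series([100.0, 101.0, 99.5, 102.0, 102.5])

simple = prices.pct_change()                  # e = A/B - 1
log_ret = np.log(prices / prices.shift(1))    # log(A/B) = log(1 + e)

# exp(log(1 + e)) - 1 = e, so expm1 recovers the simple return.
recovered = np.expm1(log_ret)
# log1p goes the other way: log(1 + e) from the simple return.
log_from_simple = np.log1p(simple)

print(pd.DataFrame({"simple": simple, "log": log_ret, "recovered": recovered}))
```

The first row is NaN in every column because there is no previous price to compare against.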