I'm working to implement a basic Monte Carlo simulator in Python for some project management risk modeling I'm trying to do (basically Crystal Ball / @Risk, but in Python).
I have a set of n
random variables (all scipy.stats
instances). I know that I can use rv.rvs(size=k)
to generate k
independent observations from each of these n
variables.
I'd like to introduce correlations among the variables by specifying an n x n
positive semi-definite correlation matrix.
Is there a clean way to do this in scipy?
What I've Tried
This answer and this answer seem to indicate that "copulas" would be an answer, but I don't see any reference in scipy to them.
This link seems to implement what I'm looking for, but I'm not sure if scipy has this functionality implemented already. I'd also like it to work for non-normal variables.
It seems that the Iman, Conover paper is the standard method.
If you just want correlation through a Gaussian Copula (*), then it can be calculated in a few steps with numpy and scipy.
create multivariate random variables with desired covariance, numpy.random.multivariate_normal
, and creating a (nobs by k_variables) array
apply scipy.stats.norm.cdf
to transform normal to uniform random variables, for each column/variable to get uniform marginal distributions
apply dist.ppf
to transform uniform margin to the desired distribution, where dist
can be one of the distributions in scipy.stats
(*) Gaussian copula is only one choice and it is not the best when we are interested in tail behavior, but it is the easiest to work with for example http://archive.wired.com/techbiz/it/magazine/17-03/wp_quant?currentPage=all
two references
https://stats.stackexchange.com/questions/37424/how-to-simulate-from-a-gaussian-copula
http://www.mathworks.com/products/demos/statistics/copulademo.html
(I might have done this a while ago in python, but don't have any scripts or function right now.)