randomstataexponential-distribution

generating a random sample from an exponential distribution in Stata


I am trying to do a simulation in Stata with a random sample of 10000 for (i) the variable X with pdf f(x) = 2*x*exp(-x^2), X>0, and (ii) Y=X^2 I worked out the cdf of F to be 1-exp(-x^2), so the inverse of F is sqrt(-ln(1-u). I used the following code in Stata:

(1)  
 clear  
 set obs 10000  
 set seed 527665  
 gen u= runiform()  
 gen x= sqrt(-ln(1-u))  
 histogram x  
 summ x, detail  
(mean 0.88, sd 0.46)  
  

(2)  
clear  
set obs 10000  
set seed 527665  
gen u= runiform()  
gen x= (sqrt(-ln(1-u)))^2  
summ x, detail  
(mean 0.99, sd 0.99) 

(3)    
clear  
set obs 10000  
set seed 527665  
gen u= rexponential(1)  
gen x= 2*u*exp(-(u^2))  
summ x, detail  
(mean 0.49, sd 0.28)  

(4)
clear  
set obs 10000  
set seed 527665  
gen v= runiform()  
gen u=1/v  
gen x= 2*u*exp(-(u^2))  
histogram x  
summ x, detail  
(mean 0.22, sd 0.26)

My queries are: (i) (1) and (2) are based on the probability integral transformation, which I have come across but do not understand. If (1) and (2) are valid approaches, what is the intuition behind this, (ii) the output for (3) does not seem correct; I am not sure if I am applying the rexponential function correctly, and what is the scale parameter (there seems to be no explanation on this in stata help) (iii) the output for (4) also does not seem correct, and I was wondering why this approach is flawed.

Thanks


Solution

  • Well, what you worked out as distribution looks ok to me

    If

    PDF(x) = 2 x exp(-x2), x in [0...Infinity) then

    CDF(x) = 1 - exp(-x2)

    which means it is basically square root of exponentially distributed RV. Exponential distribution sampling is done using -ln(1-u) or -ln(u)

    I don't have Stata, just looking at the code

    (1) looks ok, you sample exponential and get square root of it

    (2) looks like you're sampling square root of exponential and immediately square it back. You will get back exponential, I believe

    (3) I don't know what it supposed to mean, exponent of squared exponentials? Should be

    clear  
    set obs 10000  
    set seed 527665  
    gen u = rexponential(1)  
    gen x = sqrt(u)
    summ x, detail  
    

    rexponential() is the same as -ln(1-runiform())

    (4) Does not make sense. Exponent from squared uniform?

    I quickly wrote simple Python code for illustration

    import numpy as np
    import matplotlib.pyplot as plt
    
    x = np.random.random(100000) // uniform in [0...1)
    xx = np.sqrt(-np.log(1.0-x)) // -log(1-x) is exponential, then square root
    
    q = np.linspace(0.0, 3.0, 101)
    z = 2.0*q*np.exp(-q*q)
    
    n, bins, patches = plt.hist(xx, 50, density=True, facecolor='g', alpha=0.75)
    plt.plot(q, z, 'r-')
    plt.show()
    

    with picture

    enter image description here