r plot probability approximation

Comparing the exact CDF with an approximation from the R function ecdf


I want to compute both the exact cumulative distribution function (CDF) and an approximation of it for the following density:

f(x)= U(x;-1,2)/2 + U(x;0,1)/2

where U(.;a,b) is the uniform density function on the interval [a,b].

I know how to compute the exact expression of the CDF. What I do not understand is why the approximation returned by the R function ecdf is so far from the exact CDF (see code below).

#density function
f       = function(x){dunif(x,-1,2)/2+dunif(x)/2}
plot(f,-2,4)

#Approximation
xis     = runif(1000000,-1,2)/2+runif(1000000)/2
Ffapp   = ecdf(xis)
zs      = seq(-1.2, 2.2, by=0.01)
Ffs     = Ffapp(zs)
plot(zs,Ffs,type='l',col="red")

#exact expression  
Ffexact = function(t){(t <= -1) * 0 +
(t>-1 & t<=0) * (t+1)/6 +
(t>0 & t<=1) * ((t+1)/6 + t/2) +
(t>1 & t<=2) * ((t+1)/6 + 1/2) +
(t>2) * 1}
curve(Ffexact,-2,3,add=TRUE)
legend(-1, 0.9, legend=c("Fapp", "Fexact"),col=c("red", "black"), lty=1:1)

How can the poor approximation of the CDF obtained with ecdf be explained?

EDIT: here is the plot. The approximation is quite far from the exact expression. More importantly, I call it a bad approximation because the red curve does not converge to the black curve as the sample size (the first argument of runif) tends to infinity. Mathematically, the empirical CDF should converge pointwise to the exact CDF.

[Plot: Fapp (red) and Fexact (black)]


Solution

  • The problem is in the way you generate your random sample xis. Its histogram, hist(xis), shows that the sample does not follow your density.

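    The factors 1/2 in your density are mixing weights, not scale factors for the sampled values: each draw should come from U(-1,2) with probability 1/2 and from U(0,1) with probability 1/2. Dividing the values by 2 shrinks the distribution, and adding the two vectors gives the distribution of a sum of two random variables, which is not the mixture. Because the two weights are equal, you can instead concatenate two samples of the same size: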

    xis <- c(runif(1000000, -1, 2), runif(1000000))
    

    Now the ecdf approximation closely matches the exact CDF (plot: ECDF).
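
To check this numerically rather than by eye, here is a minimal sketch that measures the largest gap between each empirical CDF and the exact CDF on a grid. The names n, Fexact, from_wide, x_sum, x_mix and sup_dist are made up for this illustration, and the sample size and seed are arbitrary. It also draws the mixture by picking a component for each observation, which is one way to handle weights other than 1/2 (an alternative to concatenating, not a replacement for it).

```r
set.seed(1)    # arbitrary seed, only so the run is reproducible
n <- 100000    # illustrative sample size

# Exact CDF of the mixture, written with punif instead of explicit cases
Fexact <- function(t) punif(t, -1, 2) / 2 + punif(t, 0, 1) / 2

# Sampling from the question: a scaled sum of two uniforms, not the mixture
x_sum <- runif(n, -1, 2) / 2 + runif(n) / 2

# Mixture sampling: choose a component per draw with probability 1/2,
# then take the value from that component
from_wide <- rbinom(n, 1, 0.5) == 1
x_mix <- ifelse(from_wide, runif(n, -1, 2), runif(n, 0, 1))

# Largest absolute gap between an empirical CDF and the exact CDF on a grid
zs <- seq(-1.2, 2.2, by = 0.01)
sup_dist <- function(x) max(abs(ecdf(x)(zs) - Fexact(zs)))

d_sum <- sup_dist(x_sum)   # stays bounded away from 0 however large n is
d_mix <- sup_dist(x_mix)   # shrinks as n grows
print(c(sum = d_sum, mixture = d_mix))
```

The gap for x_sum does not vanish with larger n because that sample lives on [-0.5, 1.5], while the mixture already has CDF value 1/12 at -0.5. For unequal weights, changing the 0.5 in rbinom to the weight of the U(-1,2) component (and adjusting Fexact to match) should be enough.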