pythonnormal-distributioncdf

Python Bivariate Normal CDF with variable upper bound


I am trying to find an elegant way to calculate a bivariate normal CDF with python where one upper bound of the CDF is a function of two variables, of which one is a variable of the bivariate normal density (integral variable).

Example:

from scipy import integrate
import numpy as np

# First define f(x, y) as the bivariate normal distribution with fixed correlation p
p = 0.4
def f(x, y):
    Q = x**2 + y**2 - 2*p*x*y
    return 1/(2*np.pi*np.sqrt(1-p**2))*np.exp(-1/(2*(1-p**2))*Q)

# Define N2(a, b) as the cumulative bivariate normal distribution f where a is constant 
# and b is a function of two variables
def N2(a, b):
    prob, error = integrate.dblquad(f, np.NINF, a, lambda x: np.NINF, b)
    return prob

# Upper bound function of 2 variables example where x is an integral variable
def upper_bound(x, v):
    return 0.5*v*x

# My approach which doesn't work
# Calculate bivariate normal CDF for different values of v in the upper bound
results = [N2(1, upper_bound(x, v)) for v in np.linspace(0.01, 4, 200)]

Any ideas on how I could change my approach so the call to upper_bound(x, v) in results = [N2(1, upper_bound(x, v)) for v in np.linspace(0.01, 4, 200)] would work? Other approaches to tackle the problem also welcome.

Edit: This is the integral I want to compute, where f(x,y) is the bivariate normal density function. Note that the actual upper bound f(x,v) = 0.5*v*x I want to compute is way more complicated, this is just as an example, therefore I do not want to compute it symbolically, for instance with sympy. Also, my goal is to compute the integral for a few hundreds different values of v.

The integral: enter image description here


Solution

  • Although it's slow this approach seems to work.

    The first few lines, up to 'this should produce 1', are a sanity check. I wanted to verify that my approach would correctly calculate the volume under the density. It does.

    I use a variance-covariance matrix to get the desired correlation of 0.4 and avoid writing my own pdf.

    I curry functions in two places so that functions have only single parameters. This makes it possible to calculate the inner integral as a function of x. It also makes it possible to take the v parameter 'outside' the other calculations.

    from toolz import curry
    from scipy.stats import multivariate_normal
    from scipy.integrate import quad
    import numpy as np
    
    @curry
    def bivariate(x,y):
        return multivariate_normal.pdf([x,y],cov=[[1,.4],[.4,1]])
    
    def out_y(x):
        marginal = bivariate(x)
        return quad(marginal, np.NINF, np.PINF)[0]
    
    # this should produce 1
    print (quad(out_y, np.NINF, np.PINF)[0])
    
    # now to what the OP wants
    
    @curry
    def inner_integral(v,x):
        marginal = bivariate(x)
        return quad(marginal, np.NINF, 0.5*v*x)[0]
    
    inner_integral_for_one_v = inner_integral(0.8)
    print (quad(inner_integral_for_one_v,np.NINF, 1)[0])
    

    To use this code you would write something equivalent to:

    for v in range(0,1,0.1):
        inner_integral_for_one_v = inner_integral(v)
        print (quad(inner_integral_for_one_v,np.NINF, 1)[0])