python statistics probability data-science markov

(Python) Markov, Chebyshev, Chernoff upper bound functions

I'm stuck with one task on my learning path.

For the binomial distribution X∼Bp,n with mean μ=np and variance σ**2=np(1−p), we would like to upper bound the probability P(X≥c⋅μ) for c≥1. Three bounds introduced:

Formulas

The task is to write three functions respectively for each of the inequalities. They must take n , p and c as inputs and return the upper bounds for P(X≥c⋅np) given by the above Markov, Chebyshev, and Chernoff inequalities as outputs.

And there is an example of IO:

Code:

print Markov(100.,0.2,1.5)

print Chebyshev(100.,0.2,1.5)

print Chernoff(100.,0.2,1.5)

Output

0.6666666666666666

0.16

0.1353352832366127

I'm completely stuck. I just can't figure out how to plug in all that math into functions (or how to think algorithmically here). If someone could help me out, that would be of great help!

p.s. and all libs are not allowed by task conditions except math.exp

Solution

Ok, let's look at what's given:

Input and derived values:

n = 100
p = 0.2
c = 1.5
m = n*p = 100 * 0.2 = 20
s2 = n*p*(1-p) = 16
s = sqrt(s2) = sqrt(16) = 4

You have multiple inequalities of the form P(X>=a*m) and you need to provide bounds for the term P(X>=c*m), so you need to think how a relates to c in all cases.

Markov inequality: P(X>=a*m) <= 1/a

You're asked to implement Markov(n,p,c) that will return the upper bound for P(X>=c*m). Since from

  P(X>=a*m)
= P(X>=c*m)

it's clear that a == c, you get 1/a = 1/c. Well, that's just

def Markov(n, p, c):
  return 1.0/c

>>> Markov(100,0.2,1.5)
0.6666666666666666

That was easy, wasn't it?

Chernoff inequality states that P(X>=(1+d)*m) <= exp(-d**2/(2+d)*m)

First, let's verify that if

  P(X>=(1+d)*m)
= P(X>=c    *m)

then

1+d = c
  d = c-1

This gives us everything we need to calculate the uper bound:

def Chernoff(n, p, c):
  d = c-1
  m = n*p
  return math.exp(-d**2/(2+d)*m)

>>> Chernoff(100,0.2,1.5)
0.1353352832366127

Chebyshev inequality bounds P(X>=m+k*s) by 1/k**2

So again, if

  P(X>=c*m)
= P(X>=m+k*s)

then

c*m     = m+k*s
m*(c-1) = k*s
k       = m*(c-1)/s

Then it's straight forward to implement

def Chebyshev(n, p, c):
  m = n*p
  s = math.sqrt(n*p*(1-p))
  k = m*(c-1)/s
  return 1/k**2

>>> Chebyshev(100,0.2,1.5)
0.16