I am trying to prove the following property in Dafny: covariance(x,y) <= variance(x) * variance(y)
, where x
and y
are real
.
First of all, I would like to know if I am interpreting covariance and variance correctly. Thus, I make the following question: is it the same proving covariance(x,y) <= variance(x) * variance(y)
and proving covariance(x,y) <= covariance(x,x) * covariance(y,y)
? I make this question because the variance usually appears to the power of 2 (i.e., sigma^2(x)
), which confuses me.
However, I understand from https://en.wikipedia.org/wiki/Covariance that this equivalence holds. Concretely, the following is stated: The variance is a special case of the covariance in which the two variables are identical.
I also take one of the definitions of covariance from that link. Concretely: cov(x,y)=(1/n)*summation(i from 1 to n){(x_i-mean(x))*(y_i-mean(y))}
.
Thus, assuming that I must prove covariance(x,y) <= covariance(x,x) * covariance(y,y)
and that covariance is defined like above, I test the following lemma in Dafny:
lemma covariance_equality (x: seq<real>, y:seq<real>)
requires |x| > 1 && |y| > 1; //my conditions
requires |x| == |y|; //my conditions
requires forall elemX :: elemX in x ==> 0.0 < elemX; //my conditions
requires forall elemY :: elemY in y ==> 0.0 < elemY; //my conditions
ensures covariance_fun (x,y) <= covariance_fun(x,x) * covariance_fun (y,y);
{}
Which cannot be proven automatically. The point is that I do not know how to prove it manually. I made this (not useful) approach of a contraction proof, but I guess this is not the way:
lemma covariance_equality (x: seq<real>, y:seq<real>)
requires |x| > 1 && |y| > 1; //my conditions
requires |x| == |y|; //my conditions
requires forall elemX :: elemX in x ==> 0.0 < elemX; //my conditions
requires forall elemY :: elemY in y ==> 0.0 < elemY; //my conditions
ensures covariance_fun (x,y) <= covariance_fun(x,x) * covariance_fun (y,y);
{
if !(covariance_fun(x,y) < covariance_fun(x,c) * covariance_fun (y,y)) {
//...
assert !(forall elemX' :: elemX' in x ==> 0.0<elemX') && (forall elemY' :: elemY' in y ==> 0.0<elemY'); //If we assume this, the proof is verified
assert false;
}
}
My main question is: how can I prove this lemma?
So that you have all the information, I offer the functions by which covariance calculation is made up. First, the covariance function is as follows:
function method covariance_fun(x:seq<real>, y:seq<real>):real
requires |x| > 1 && |y| > 1;
requires |x| == |y|;
decreases |x|,|y|;
{
summation(x,mean_fun(x),y,mean_fun(y)) / ((|x|-1) as real)
}
As you can see, it performs summation and then divides by the length. Also, note that the summation function receives mean of x
and mean of y
. Summation
is as follows:
function method summation(x:seq<real>,x_mean:real,y:seq<real>,y_mean:real):real
requires |x| == |y|;
decreases |x|,|y|;
{
if x == [] then 0.0
else ((x[0] - x_mean)*(y[0] - y_mean))
+ summation(x[1..],x_mean,y[1..],y_mean)
}
As you can see, it recursively performs (x-mean(x))*(y-mean(y))
of each position.
As for helper functions, mean_fun
calculates the mean and is as follows:
function method mean_fun(s:seq<real>):real
requires |s| >= 1;
decreases |s|;
{
sum_seq(s) / (|s| as real)
}
Note that is uses sum_seq
function which, given a sequence, calculates the sum of all its positions. It is defined recursively:
function method sum_seq(s:seq<real>):real
decreases s;
{
if s == [] then 0.0 else s[0] + sum_seq(s[1..])
}
Now, with all these ingredientes, can anyone tell me (1) if I already did something wrong and/or (2) how to prove the lemma?
PS: I have been looking and this lemma seems to be the Cauchy–Schwarz inequality, but still do not know how to solve it.
Elaborating on the answer
I will (1) define a convariance function cov
in terms of product and then (2) use this definition in a new lemma that should hold.
Also, note that I add the restriction that 'x' and 'y' are non-empty, but you can modify this
//calculates the mean of a sequence
function method mean_fun(s:seq<real>):real
requires |s| >= 1;
decreases |s|;
{
sum_seq(s) / (|s| as real)
}
//from x it constructs a=[x[0]-x_mean, x[1]-x_mean...]
function construct_list (x:seq<real>, m:real) : (a:seq<real>)
requires |x| >= 1
ensures |x|==|a|
{
if |x| == 1 then [x[0]-m]
else [x[0]-m] + construct_list(x[1..],m)
}
//covariance is calculated in terms of product
function cov(x: seq<real>, y: seq<real>) : real
requires |x| == |y|
requires |x| >= 1
ensures cov(x,y) == product(construct_list(x, mean_fun(x)),construct_list(y, mean_fun(y))) / (|x| as real) //if we do '(|x| as real)-1', then we can have a division by zero
//i.e., ensures cov(x,y) == product(a,b) / (|x| as real)
{
//if |a| == 1 then 0.0 //we do base case in '|a|==1' instead of '|a|==1', because we have precondition '|a| >= 1'
//else
var x_mean := mean_fun(x);
var y_mean := mean_fun(y);
var a := construct_list(x, x_mean);
var b := construct_list(y, y_mean);
product(a,b) / (|a| as real)
}
//Now, test that Cauchy holds for Cov
lemma cauchyInequality_usingCov (x: seq<real>, y: seq<real>)
requires |x| == |y|
requires |x| >= 1
requires forall i :: 0 <= i < |x| ==> x[i] >= 0.0 && y[i] >= 0.0
ensures square(cov(x,y)) <= cov(x,x)*cov(y,y)
{
CauchyInequality(x,y);
}
\\IT DOES NOT HOLD!
Proving HelperLemma (not provable?)
Trying to prove HelperLemma
I find that Dafny is able to prove that LHS of a*a*x + b*b*y >= 2.0*a*b*z
is positive.
Then, I started to prove different cases (I know this is not the best procedure):
(1) If one of a
,b
or z
is negative (while the rest positive) or when the three are negative; then RHS is negative
And when RHS is negative, LHS is greater or equal (since LHS is positive).
(2) If a
or b
or z
are 0
, then RHS is 0
, thus LHS is greater or equal.
(3) If a>=1
, b>=1
and z>=1
, I tried to prove this using sqrt_ineq()
.
BUT! We can find a COUNTER EXAMPLE.
With a=1
, b=1
, z=1
, RHS is 2; now, choose x
to be 0
and y
to be 0
(i.e., LHS=0). Since 0<2
, this is a counter example.
Am I right? If not, how can I prove this lemma?
lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
requires x >= 0.0 && y >= 0.0 //&& z >= 0.0
requires square(z) <= x * y
ensures a*a*x + b*b*y >= 2.0*a*b*z
{
assert square(z) == z*z;
assert a*a*x + b*b*y >= 0.0; //a*a*x + b*b*y is always positive
if ((a<0.0 && b>=0.0 && z>=0.0) || (a>=0.0 && b<0.0 && z>=0.0) || (a>=0.0 && b>=0.0 && z<0.0) || (a<0.0 && b<0.0 && z<0.0)) {
assert 2.0*a*b*z <=0.0; //2.0*a*b*z <= 0.0 is negative or 0 when product is negative
assert a*a*x + b*b*y >= 2.0*a*b*z; // When 2.0*a*b*z <= 0.0 is negative or 0, a*a*x + b*b*y is necessarily greater
}
if (a==0.0 || b==0.0 || z==0.0){
assert a*a*x + b*b*y >= 2.0*a*b*z;
}
else if (a>=1.0 && b>=1.0 && z>= 1.0){
assert a*a >= a;
assert b*b >= b;
assert a*a+b*b >= a*b;
assert square(z) <= x * y;
//sqrt_ineq(square(z),x*y);
//assert square(z) == z*z;
//sqrt_square(z);
//assert sqrt(z) <= sqrt(x*y);
//assert (x==0.0 || y == 0.0) ==> (x*y==0.0) ==> (square(z) <= 0.0) ==> z<=0.0;
assert a*a*x + b*b*y >= 2.0*a*b*z;
//INDEED: COUNTER EXAMPLE: a=1, b=1, z=1. Thus: RHS is 2; now, choose x to be 0 and y to be 0 (i.e., LHS=0). Since 0<2, this is a counter example.
}
}
Edited after clarification/fixing of post condition.
Here is attempt to do it in Dafny using calculation and induction. Few comments on verification
Since task at hand is verification, I assumed sqrt function and some of its properties. HelperLemma is provable provided we assume some properties of sqrt function. There are two assumption in verification that is easy to prove.
function square(r: real) : real
ensures square(r) >= 0.0
{
r * r
}
function sqrt(r: real) : real
function {:fuel 2} product(a: seq<real>, b: seq<real>) : real
requires |a| == |b|
{
if |a| == 0 then 0.0
else a[0] * b[0] + product(a[1..], b[1..])
}
lemma sqrt_ineq(a: real, b: real)
requires a >= 0.0 && b >= 0.0
requires a <= b
ensures sqrt(a) <= sqrt(b)
lemma sqrt_square(a: real)
ensures sqrt(square(a)) == a
lemma CauchyBaseInequality(a1: real, a2: real, b1: real, b2: real)
ensures square(a1*b1 + a2*b2) <= (a1*a1 + a2*a2) * (b1*b1 + b2*b2)
{
calc <= {
square(a1*b1 + a2*b2) - (a1*a1 + a2*a2) * (b1*b1 + b2*b2);
a1*b1*a1*b1 + a2*b2*a2*b2 + 2.0*a1*b1*a2*b2 - a1*a1*b1*b1 - a1*a1*b2*b2 - a2*a2*b1*b1 - a2*a2*b2*b2;
2.0*a1*b1*a2*b2 - a1*a1*b2*b2 - a2*a2*b1*b1;
- square(a1*b2 - a2*b1);
0.0;
}
}
lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
requires x >= 0.0 && y >= 0.0
requires square(z) <= x * y
ensures a*a*x + b*b*y >= 2.0*a*b*z
lemma CauchyInequality(a: seq<real>, b: seq<real>)
requires |a| == |b|
ensures square(product(a, b)) <= product(a, a) * product(b, b)
{
if |a| <= 2 {
if |a| == 2 {
CauchyBaseInequality(a[0], a[1], b[0], b[1]);
}
}
else {
var x, xs := a[0], a[1..];
var y, ys := b[0], b[1..];
calc >= {
product(a, a) * product(b, b) - square((product(a, b)));
(x*x + product(xs, xs))*(y*y + product(ys, ys)) - square(x*y + product(xs, ys));
x*x*y*y + y*y*product(xs, xs) + x*x*product(ys, ys) + product(xs, xs)*product(ys, ys)
- x*x*y*y - square(product(xs, ys)) - 2.0*x*y*product(xs, ys);
y*y*product(xs, xs) + x*x*product(ys, ys) + product(xs, xs) * product(ys, ys)
- square(product(xs, ys)) - 2.0*x*y*product(xs, ys);
{ CauchyInequality(xs, ys); }
y*y*product(xs, xs) + x*x*product(ys, ys) - 2.0*x*y*product(xs, ys);
{
CauchyInequality(xs, ys);
assume product(xs, xs) >= 0.0;
assume product(ys, ys) >= 0.0;
HelperLemma(y, product(xs, xs), x, product(ys, ys), product(xs, ys));
}
0.0;
}
}
}
lemma cauchyInequality_usingCov (x: seq<real>, y: seq<real>)
requires |x| == |y|
requires |x| >= 1
requires forall i :: 0 <= i < |x| ==> x[i] >= 0.0 && y[i] >= 0.0
ensures square(cov(x,y)) <= cov(x,x)*cov(y,y)
{
var x_mean := mean_fun(x);
var y_mean := mean_fun(y);
var a := construct_list(x, x_mean);
var b := construct_list(y, y_mean);
assert cov(x, y) == product(a, b) / (|x| as real);
assert cov(x, x) == product(a, a) / (|x| as real);
assert cov(y, y) == product(b, b) / (|x| as real);
CauchyInequality(a, b);
}
Verification of Helper Lemma
function square(r: real): real
ensures square(r) >= 0.0
{
r * r
}
function abs(a: real) : real
ensures abs(a) >= 0.0
{
if a < 0.0 then -a else a
}
function sqrt(r: real): real
requires r >= 0.0
ensures square(sqrt(r)) == r && sqrt(r) >= 0.0
lemma sqrtIneq(a: real, b: real)
requires a >= 0.0 && b >= 0.0
requires a <= b
ensures sqrt(a) <= sqrt(b)
lemma sqrtLemma(a: real)
ensures sqrt(square(a)) == abs(a)
lemma sqrtMultiply(a: real, b: real)
requires a >= 0.0 && b >= 0.0
ensures sqrt(a*b) == sqrt(a) * sqrt(b);
lemma differenceLemma(a: real, b: real)
requires a >= abs(b)
ensures a - b >= 0.0 && a + b >= 0.0
{
}
lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
requires x >= 0.0 && y >= 0.0
requires square(z) <= x * y
ensures a*a*x + b*b*y >= 2.0*a*b*z
{
var sx := sqrt(x);
var sy := sqrt(y);
assert sx * sx == x;
assert sy * sy == y;
if (a >= 0.0 && b >= 0.0) || (a < 0.0 && b < 0.0) {
calc >= {
a*a*x + b*b*y - 2.0*a*b*z;
a*a*sx*sx + b*b*sy*sy - 2.0*a*b*sx*sy + 2.0*a*b*sx*sy - 2.0*a*b*z;
square(a*sx - b*sy) + 2.0*a*b*sx*sy - 2.0*a*b*z;
2.0*a*b*sx*sy - 2.0*a*b*z;
2.0*a*b*(sx*sy - z);
{
sqrtIneq(square(z), x * y);
assert sqrt(x * y) >= sqrt(square(z));
sqrtMultiply(x, y);
assert sx * sy >= sqrt(square(z));
sqrtLemma(z);
assert sx * sy >= abs(z);
differenceLemma(sx*sy, z);
assert sx*sy >= z;
}
0.0;
}
}
else {
calc >= {
a*a*x + b*b*y - 2.0*a*b*z;
a*a*sx*sx + b*b*sy*sy + 2.0*a*b*sx*sy - 2.0*a*b*sx*sy - 2.0*a*b*z;
square(a*sx + b*sy) - 2.0*a*b*sx*sy - 2.0*a*b*z;
- 2.0*a*b*sx*sy - 2.0*a*b*z;
- 2.0*a*b*(sx*sy + z);
{
sqrtIneq(square(z), x * y);
assert sqrt(x * y) >= sqrt(square(z));
sqrtMultiply(x, y);
assert sx * sy >= sqrt(square(z));
sqrtLemma(z);
assert sx * sy >= abs(z);
differenceLemma(sx*sy, z);
assert sx*sy >= -z;
}
0.0;
}
}
}