What is the minimum sample size n (that is, the length n = length(x) of the data vector x) such that the difference D = 1 - statx4(x)/statx5(x) between the functions statx4 and statx5 is at most 1/100, i.e. D ≤ 1/100?
And here are the functions:
statx4 <- function(x) {
  numerator <- sum((x - mean(x))^2)
  denominator <- length(x)
  result <- numerator / denominator
  return(result)
}
statx5 <- function(x) {
  numerator <- sum((x - mean(x))^2)
  denominator <- length(x) - 1
  result <- numerator / denominator
  return(result)
}
I've been working through this exercise set for a while, but haven't managed to get anything valid for this question. Could you point me in the right direction?
For normally distributed data, you can find it by simulation as follows:
statx4 <- function(x) {
  numerator <- sum((x - mean(x))^2)
  denominator <- length(x)
  result <- numerator / denominator
  return(result)
}
statx5 <- function(x) {
  numerator <- sum((x - mean(x))^2)
  denominator <- length(x) - 1
  result <- numerator / denominator
  return(result)
}
D <- function(x) {
  1 - statx4(x) / statx5(x)
}
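DD <- function(N = 1111, seed = 1) {
  set.seed(seed)
  # TRUE where D > 1/100; n = 1 stays NA because statx5 divides by zero there
  Logi <- rep(NA, N)
  for (n in 2:N) {
    x <- rnorm(n)
    y <- D(x)
    # tiny tolerance so floating-point rounding at the boundary is not counted
    Logi[n] <- (y > 1/100 + 1e-12)
  }
  return(Logi)
}
# named min_n so that base::min is not masked
min_n <- numeric(100)
for (seed in 1:100) {
  message(seed)
  exceeds <- DD(1000, seed)
  # largest n for which D still exceeds 1/100
  min_n[seed] <- max(which(exceeds))
}
# the minimum sample size is one more than that
Answer <- mean(min_n) + 1
Answer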
Note that the function D compares the two variance estimators:
it is one minus the ratio of the biased variance (statx4, divisor n) to the unbiased sample variance (statx5, divisor n - 1).
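The definition of D can be worked out by hand, which shows that the result does not depend on the data at all. A short derivation, assuming n ≥ 2 and that the values in x are not all equal (so the sum of squares S, a symbol introduced here for convenience, is positive):

```latex
\[
S = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
D = 1 - \frac{S/n}{S/(n-1)} = 1 - \frac{n-1}{n} = \frac{1}{n}
\]
\[
D \le \frac{1}{100} \iff \frac{1}{n} \le \frac{1}{100} \iff n \ge 100
\]
```

So under these assumptions the minimum sample size is n = 100, for any distribution, not only the normal one.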
I think this problem should be stated more clearly in mathematical terms.
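As a quick numerical check, here is a minimal sketch that evaluates D for a range of sample sizes and looks for the first n with D ≤ 1/100. The names ratio_gap, n_values, gaps, tol and first_n are made up for this illustration, the range 2:200 is an arbitrary choice, and the small tolerance is an assumption added to absorb floating-point rounding at the boundary.

```r
# Compact versions of the two functions from the question
statx4 <- function(x) sum((x - mean(x))^2) / length(x)        # divisor n
statx5 <- function(x) sum((x - mean(x))^2) / (length(x) - 1)  # divisor n - 1

# Hypothetical helper: the quantity D from the question
ratio_gap <- function(x) 1 - statx4(x) / statx5(x)

set.seed(1)
n_values <- 2:200  # start at 2, since statx5 divides by zero for n = 1
gaps <- vapply(n_values, function(n) ratio_gap(rnorm(n)), numeric(1))

# D agrees with 1/n up to floating-point rounding
stopifnot(isTRUE(all.equal(gaps, 1 / n_values)))

# Smallest n with D <= 1/100 (tolerance guards against rounding at n = 100)
tol <- 1e-12
first_n <- n_values[which(gaps <= 1/100 + tol)[1]]
first_n
```

Since D agrees with 1/n for every n tried, replacing rnorm with another generator such as runif should give the same first_n.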