r, function, difference, sample-size

Minimum sample size n such that difference is no more than


What is the minimum sample size n (that is, the length n = length(x) of the data vector x) such that the difference D = 1 - statx4(x)/statx5(x) between the functions statx4 and statx5 is no more than 1/100, i.e. D ≤ 1/100?

And here are the functions:

# Sum of squared deviations from the mean, divided by n
statx4 <- function(x) {
  numerator <- sum((x - mean(x))^2)
  denominator <- length(x)
  result <- numerator / denominator
  return(result)
}
# Sum of squared deviations from the mean, divided by n - 1
statx5 <- function(x) {
  numerator <- sum((x - mean(x))^2)
  denominator <- length(x) - 1
  result <- numerator / denominator
  return(result)
}

I've been working through this exercise set for a while, but haven't managed to get a valid answer to this question. Could you point me in the right direction?


Solution

  • For normally distributed data, the minimum n can be found by simulation as follows:

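    # Biased variance: sum of squared deviations divided by n.
    statx4 <- function(x) {
      numerator <- sum((x - mean(x))^2)
      denominator <- length(x)
      result <- numerator / denominator
      return(result)
    }
    # Unbiased variance: sum of squared deviations divided by n - 1.
    statx5 <- function(x) {
      numerator <- sum((x - mean(x))^2)
      denominator <- length(x) - 1
      result <- numerator / denominator
      return(result)
    }

    D <- function(x) {
      1 - statx4(x) / statx5(x)
    }

    # For each n in 2..N, draw n standard normal values and flag whether D > 1/100.
    # n = 1 is left as NA because statx5 divides by zero there.
    DD <- function(N = 1111, seed = 1) {
      set.seed(seed)
      Logi <- rep(NA, N)
      for (n in 2:N) {
        x <- rnorm(n)
        y <- D(x)
        # Small tolerance so rounding error at the boundary is not read as y > 1/100.
        Logi[n] <- (y > 1/100 + 1e-12)
      }
      return(Logi)
    }

    min_n <- vector()
    for (seed in 1:100) {
      message(seed)
      # Smallest n whose D is no longer above 1/100, using this seed's draws.
      min_n[seed] <- which(!DD(1000, seed))[1]
    }

    Answer <- mean(min_n)
    Answer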
    

    Note that the function D evaluates the relative difference between the unbiased variance (statx5, which divides by n - 1) and the ordinary variance (statx4, which divides by n).
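
    Writing the two functions out side by side suggests that D may not depend on the data at all. The sketch below assumes n ≥ 2 and that the values in x are not all identical (so the shared numerator is nonzero); the symbols S and \bar{x} are introduced here for illustration and are not part of the original exercise.

    ```latex
    % S is the numerator shared by statx4 and statx5 (notation assumed here)
    S = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
    \mathrm{statx4}(x) = \frac{S}{n}, \qquad
    \mathrm{statx5}(x) = \frac{S}{n-1}

    % S cancels in the ratio, so D depends only on n
    D = 1 - \frac{S/n}{S/(n-1)} = 1 - \frac{n-1}{n} = \frac{1}{n}

    % The condition on D becomes a condition on n
    D \le \frac{1}{100} \iff \frac{1}{n} \le \frac{1}{100} \iff n \ge 100
    ```

    Under those assumptions the smallest admissible sample size would be n = 100, whatever distribution the data come from.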

    I think this problem could be stated more clearly in mathematical terms.
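
One way to make the problem more concrete is to check numerically whether D depends on the data or only on length(x). The following is a minimal sketch, not the exercise's intended solution: it redefines statx4, statx5 and D in compact form, compares D with 1/n for normal and exponential samples (the choice of rexp and of the tested sample sizes is an assumption made purely for illustration), and then searches for the smallest n with 1/n ≤ 1/100.

```r
# Compact versions of the functions from the question
statx4 <- function(x) sum((x - mean(x))^2) / length(x)
statx5 <- function(x) sum((x - mean(x))^2) / (length(x) - 1)
D <- function(x) 1 - statx4(x) / statx5(x)

set.seed(1)
for (n in c(2, 10, 99, 100, 101)) {
  d_norm <- D(rnorm(n))  # normally distributed sample
  d_exp  <- D(rexp(n))   # exponentially distributed sample (illustrative choice)
  # all.equal compares up to a small numerical tolerance
  stopifnot(isTRUE(all.equal(d_norm, 1 / n)),
            isTRUE(all.equal(d_exp, 1 / n)))
}

# If D equals 1/n, the smallest n with D <= 1/100 can be searched directly
min_n <- which(1 / (1:1000) <= 1/100)[1]
min_n
```

If the checks pass, D behaves like 1/n for both kinds of sample, and min_n evaluates to 100, so the simulation over seeds is not strictly needed to answer the question.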