samplesize package in R, understanding the parameters

Small disclaimer: I considered posting this on Cross Validated, but I feel this is more about a software implementation. The question can be migrated if you disagree.

I am trying out the package samplesize. I am trying to decipher what the k parameter for the function n.ttest is. The following is stated in the documentation:

k Sample fraction k

This is not very helpful. What exactly is this parameter?

I am performing the following calculation. All the essential values are in the vals variable, which I provide below:

power <- 0.90
alpha <- 0.05
vals <- ??? # These values are provided below
mean.diff <- vals[1,2]-vals[2,2]
sd1 <- vals[1,3]
sd2 <- vals[2,3]
k <- vals[2,4]/(vals[1,4]+vals[2,4])
design <- "unpaired"
fraction <- "unbalanced"
variance <- "equal"

# Get the sample size
n.ttest(power = power, alpha = alpha, mean.diff = mean.diff, 
        sd1 = sd1, sd2 = sd2, k = k, design = design, 
        fraction = fraction, variance = variance)

vals contains the following values:

> vals
  affected       mean       sd length
1        1 -0.8007305 7.887657     57
2        2  4.5799913 6.740781     16

Is k the proportion of one group in the total number of observations, or is it something else? If I am correct, does the proportion correspond to the group with sd1 or the group with sd2?

Solution

Your first instinct was right -- this belongs on stats.SE and not on SO. The parameter k has a statistical interpretation which can be found in any reference on power analysis. In a two-sample test where the second sample is constrained to be a fixed fraction of the first, k is that fraction: the ratio n2/n1 of the second group's size to the first group's size, not the second group's share of the total number of observations.
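
To make the difference between the two readings concrete, here is a small base-R sketch using the group sizes from the question (57 and 16). The variable names are illustrative only, and the total N = 73 is chosen simply so that the split comes out exact; the last two lines mirror the n1 and n2 lines of the n.ttest excerpt quoted in this answer.

```r
# Group sizes from the question's data (group 1: 57, group 2: 16)
n1_obs <- 57
n2_obs <- 16

k_ratio <- n2_obs / n1_obs             # n2 / n1: the reading described above
k_share <- n2_obs / (n1_obs + n2_obs)  # group 2's share of the total: the question's guess

# Splitting a total N into two groups when k = n2 / n1
N  <- n1_obs + n2_obs                  # illustrative total
n1 <- N / (1 + k_ratio)
n2 <- k_ratio * n1

print(c(k_ratio = k_ratio, k_share = k_share))
print(c(n1 = n1, n2 = n2))
```

With the ratio reading the split recovers the original group sizes, which it would not do if k_share were plugged into the same two lines.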

You can see the relevant lines of the code here (lines 106 to 120 of n.ttest):

unbalanced = {
    df <- n.start - 2
    c <- (mean.diff/sd1) * (sqrt(k)/(1 + k))
    tkrit.alpha <- qt(conf.level, df = df)
    tkrit.beta <- qt(power, df = df)
    n.temp <- ((tkrit.alpha + tkrit.beta)^2)/(c^2)
    while (n.start <= n.temp) {
        n.start <- n.start + 1
        tkrit.alpha <- qt(conf.level, df = n.start - 2)
        tkrit.beta <- qt(power, df = n.start - 2)
        n.temp <- ((tkrit.alpha + tkrit.beta)^2)/(c^2)
    }
    n1 <- n.start/(1 + k)
    n2 <- k * n1
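
If you want to step through that logic without the package, the excerpt can be wrapped in a standalone function. This is only a sketch: the name n_unbalanced_sketch is hypothetical, and the excerpt does not show how conf.level or the initial n.start are set, so the two-sided conf.level = 1 - alpha/2 and the starting value of 4 are assumptions. Its results may therefore differ from what n.ttest itself returns.

```r
# Hypothetical helper, NOT part of samplesize: mirrors the quoted "unbalanced" branch.
n_unbalanced_sketch <- function(power, alpha, mean.diff, sd1, k, n.start = 4) {
  conf.level <- 1 - alpha / 2          # ASSUMPTION: two-sided test
  # Standardised effect, scaled for the allocation ratio k = n2 / n1
  eff <- (mean.diff / sd1) * (sqrt(k) / (1 + k))

  df <- n.start - 2
  n.temp <- (qt(conf.level, df = df) + qt(power, df = df))^2 / eff^2

  # Increase the total sample size until it exceeds the required size
  while (n.start <= n.temp) {
    n.start <- n.start + 1
    df <- n.start - 2
    n.temp <- (qt(conf.level, df = df) + qt(power, df = df))^2 / eff^2
  }

  n1 <- n.start / (1 + k)              # group 1 gets 1 / (1 + k) of the total
  n2 <- k * n1                         # group 2 is k times group 1
  list(total = n.start, n1 = n1, n2 = n2)
}

# Usage with the numbers from the question
res <- n_unbalanced_sketch(power = 0.90, alpha = 0.05,
                           mean.diff = -0.8007305 - 4.5799913,
                           sd1 = 7.887657, k = 16 / 57)
print(res)
```

Whatever total the loop settles on, the final two assignments guarantee n2/n1 equals k, which is the point of the excerpt.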

In your case:

library(samplesize)

vals = data.frame(
  affected = c(1, 2), 
  mean = c(-0.8007305, 4.5799913), 
  sd = c(7.887657, 6.740781), 
  length = c(57, 16))

power <- 0.90
alpha <- 0.05
mean.diff <- vals[1,2]-vals[2,2]
sd1 <- vals[1,3]
sd2 <- vals[2,3]
# k is the ratio n2 / n1, not group 2's share of the total
# (i.e. not vals[2,4]/(vals[1,4]+vals[2,4]) as in the question)
k <- vals[2,4]/vals[1,4]

design <- "unpaired"
fraction <- "unbalanced"
variance <- "equal"

# Get the sample size
tt1 = n.ttest(power = power, 
        alpha = alpha, 
        mean.diff = mean.diff, 
        sd1 = sd1, 
        sd2 = sd2, 
        k = k, 
        design = design, 
        fraction = fraction, 
        variance = variance)

You can then check that the reported size of group 2 is the size of group 1 scaled by k and rounded up:

assertthat::are_equal(ceiling(tt1$`Sample size group 1`*tt1$Fraction), 
                      tt1$`Sample size group 2`)