Small disclaimer: I considered posting this on Cross Validated, but I feel that it is more closely related to a software implementation. The question can be migrated if you disagree.
I am trying out the package samplesize. I am trying to decipher what the k parameter for the function n.ttest is. The following is stated in the documentation:
k Sample fraction k
This is not very helpful. What exactly is this parameter?
I am performing the following calculations; all the essential values are in the vals variable, which I provide below:
power <- 0.90
alpha <- 0.05
vals <- ??? # These values are provided below
mean.diff <- vals[1,2]-vals[2,2]
sd1 <- vals[1,3]
sd2 <- vals[2,3]
k <- vals[2,4]/(vals[1,4]+vals[2,4])
design <- "unpaired"
fraction <- "unbalanced"
variance <- "equal"
# Get the sample size
n.ttest(power = power, alpha = alpha, mean.diff = mean.diff,
        sd1 = sd1, sd2 = sd2, k = k, design = design,
        fraction = fraction, variance = variance)
vals contains the following values:
> vals
  affected       mean       sd length
1        1 -0.8007305 7.887657     57
2        2  4.5799913 6.740781     16
Is k the proportion of one group in the total number of observations? Or is it something else? If I am correct, does the proportion correspond to the group with sd1 or the group with sd2?
Your first instinct was right: this belongs on stats.SE and not on SO. The parameter k has a statistical interpretation which can be found in any reference on power analysis. In a two-sample test where the second sample is constrained to be a certain fraction of the first, k is that fraction, so it sets the size of the second sample relative to the first (n2 = k * n1). It is therefore the ratio of the two group sizes, not the share of one group in the total number of observations.
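As a minimal sketch of the difference, using the group sizes from the question (57 and 16) and the illustrative names n1, n2, k_ratio and k_proportion, the two candidate definitions of k give different values. This assumes k is the ratio n2/n1, as the n2 <- k * n1 line in the function's source suggests:

```r
n1 <- 57  # size of group 1 (from the question's vals table)
n2 <- 16  # size of group 2

k_ratio      <- n2 / n1         # group 2 as a fraction of group 1
k_proportion <- n2 / (n1 + n2)  # share of group 2 in the total (the question's guess)

# Only the ratio recovers the size of group 2 from the size of group 1
k_ratio * n1       # 16
k_proportion * n1  # about 12.5, not 16
```

With the ratio, the fraction is tied to the second group: it describes group 2 (sd2) relative to group 1 (sd1).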
You can see the relevant lines of the code here (lines 106 to 120 of n.ttest):
unbalanced = {
    df <- n.start - 2
    c <- (mean.diff/sd1) * (sqrt(k)/(1 + k))
    tkrit.alpha <- qt(conf.level, df = df)
    tkrit.beta <- qt(power, df = df)
    n.temp <- ((tkrit.alpha + tkrit.beta)^2)/(c^2)
    while (n.start <= n.temp) {
        n.start <- n.start + 1
        tkrit.alpha <- qt(conf.level, df = n.start -
            2)
        tkrit.beta <- qt(power, df = n.start - 2)
        n.temp <- ((tkrit.alpha + tkrit.beta)^2)/(c^2)
    }
    n1 <- n.start/(1 + k)
    n2 <- k * n1
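The excerpt relies on n.start and conf.level, which are set earlier in n.ttest and are not shown here. As a rough standalone sketch of the same search, those two values are passed in as arguments below. Their defaults are placeholders chosen for illustration, not the package's values, and the function name unbalanced_n is hypothetical:

```r
# Sketch of the "unbalanced" search in base R.
# Assumptions: conf.level and n.start are placeholders, not taken from the package.
unbalanced_n <- function(mean.diff, sd1, k, power = 0.90,
                         conf.level = 0.95, n.start = 3) {
  # Standardised effect, scaled for unequal group sizes (called c in the excerpt)
  effect <- (mean.diff / sd1) * (sqrt(k) / (1 + k))
  repeat {
    df <- n.start - 2
    n.temp <- (qt(conf.level, df = df) + qt(power, df = df))^2 / effect^2
    if (n.start > n.temp) break  # total sample size is now large enough
    n.start <- n.start + 1
  }
  n1 <- n.start / (1 + k)  # the total is split so that n1 + n2 = n.start
  n2 <- k * n1             # group 2 is k times group 1
  list(total = n.start, n1 = n1, n2 = n2)
}

# Values from the question, with k taken as the ratio 16/57
k <- 16 / 57
res <- unbalanced_n(mean.diff = -0.8007305 - 4.5799913, sd1 = 7.887657, k = k)
res$n2 / res$n1  # equals k by construction
```

The last two assignments in the function are what pin down the meaning of k: the total is divided so that group 2 is always k times group 1.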
In your case:
library(samplesize)
vals = data.frame(
  affected = c(1, 2),
  mean = c(-0.8007305, 4.5799913),
  sd = c(7.887657, 6.740781),
  length = c(57, 16))
power <- 0.90
alpha <- 0.05
mean.diff <- vals[1,2]-vals[2,2]
sd1 <- vals[1,3]
sd2 <- vals[2,3]
# k is the ratio n2/n1 (group 2 relative to group 1), not n2/(n1 + n2)
k <- vals[2,4]/vals[1,4]
design <- "unpaired"
fraction <- "unbalanced"
variance <- "equal"
# Get the sample size
tt1 = n.ttest(power = power,
              alpha = alpha,
              mean.diff = mean.diff,
              sd1 = sd1,
              sd2 = sd2,
              k = k,
              design = design,
              fraction = fraction,
              variance = variance)
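You can check that the sample size of group 2 is the sample size of group 1 multiplied by the fraction k, rounded up:
assertthat::are_equal(ceiling(tt1$`Sample size group 1`*tt1$Fraction),
                      tt1$`Sample size group 2`)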