I want to calculate the effect size of my variables. I am getting the error "missing value where TRUE/FALSE needed" even though I purged my data.frame of NAs beforehand. Any idea why this is happening?
I am using the cohens_d() function of rstatix.
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
My data.frame looks like this:
structure(list(y = c(7.18497519069826, 7.3003780648707, 7.17955179116519,
8.36921585741014, 8.15836249209525, 7.09061070782841, 7.49108141342319,
7.1846914308176, 6.67089495352021, 6.69143515214406, 6.42357351973274,
7.52608069180203, 7.24501887073775, 6.85901814388889, 7.57170883180869,
7.33425264233423, 8.04921802267018, 7.03181227133037, 7.59494473669508,
7.19479175772192, 7.50365451924296, 7.98766626492627, 7.69670578093392,
7.60357736815147, 6.96018527660461, 6.87390159786446, 7.06818586174616,
7.73303668293358, 7.00902574208691, 7.43980621139333, 7.21563756343506,
7.28869626059026, 7.16435285578444, 8.40397796366936, 8.11092624226642,
6.87139778148748, 7.28510702956681, 7.28533222764388, 7.09131515969722,
6.75541746281094, 7.48515334990365, 7.04727486738418, 7.05153839051533,
6.94610823043691, 6.88677264305444, 7.17522180034305, 8.01535975540921,
6.97657921864011, 7.44994098877334, 7.24328614608345, 6.94987770403687,
7.0265332645233, 7.03662889536216, 6.7070589406276, 7.44075170047919,
6.58972625625424, 6.75913881628117, 7.41597441137657, 7.57460994134019
), x = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), levels = c("untreated", "VRZ", "AMB", "untreated_107"), class = "factor")), row.names = c(NA,
-59L), class = c("tbl_df", "tbl", "data.frame"), na.action = structure(c(`58` = 58L), class = "omit"))
r_test %>%
  cohens_d(y ~ x) %>%
  as.data.frame()
Any idea what the problem is?
Similarly, when I tried to use the function wilcox_effsize() instead, R returned the following error:
"can't deal with factors containing only one level"
When I used this very similar data.frame, the analysis worked even though it contained NAs:
structure(list(y = c(9.91e+08, 8.17e+08, 461200000, 15330000,
175100000, 50320000, 13590000, 22970000, 2778000, 3453000, 12890000,
375900000, 44590000, 1.611e+09, 1e+09, 889900000, 373200000,
NA, NA, NA, NA, NA, 5010000, 6549000, 23160000, 32520000, 7707000,
556900000, 634600000, 820900000, 391400000, 498300000, 147900000,
646900000, 22060000, 1e+07, 306800000, 319400000, 41290000, 94100000,
127200000, 117200000, 618300000, 570700000, 617100000, 284900000,
449600000, 3866000, 6918000, 4177000, 14870000, 29380000, 2815000,
1619000, 3126000, 1710000, 2191000), x = structure(c(1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L), levels = c("untreated", "VRZ", "AMB"
), class = "factor")), row.names = c(NA, -57L), class = c("tbl_df",
"tbl", "data.frame"))
EDIT:
The problem is that there is one unused factor level, namely untreated_107. There are several ways to deal with this situation:
Use droplevels from base R:
library(rstatix)
library(tidyverse)
r_test %>%
  mutate(x = droplevels(x)) %>%
  cohens_d(y ~ x) %>%
  as.data.frame()
  .y.    group1    group2    effsize n1 n2 magnitude
1   y       AMB untreated -1.1805582 19 20     large
2   y       AMB       VRZ -0.4735816 19 20     small
3   y untreated       VRZ  0.6551090 20 20  moderate
With fct_drop from forcats:
library(forcats)
library(rstatix)
library(tidyverse)
r_test %>%
  mutate(x = fct_drop(x)) %>%
  cohens_d(y ~ x) %>%
  as.data.frame()
Or sidestep the unused factor level altogether by converting x to character (conceptually questionable, though, since x is presumably a factor for a reason):
library(rstatix)
library(tidyverse)
r_test %>%
mutate(x = as.character(x)) %>%
cohens_d(y ~ x) %>%
as.data.frame()
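To check whether a factor carries unused levels before running the test, something like the following base R sketch may help. It uses a small made-up factor (toy_x is a hypothetical stand-in, not the r_test data) that declares untreated_107 as a level without any observation in it. This is the kind of state a factor can end up in when rows are removed, for example by na.omit(), since removing rows does not remove levels; the na.action attribute in the dput output above suggests that is what happened to row 58.

```r
# Hypothetical toy factor: four declared levels, only three actually observed
toy_x <- factor(
  c("untreated", "untreated", "VRZ", "VRZ", "AMB", "AMB"),
  levels = c("untreated", "VRZ", "AMB", "untreated_107")
)

# Counts per level; an unused level appears with a count of 0
level_counts <- table(toy_x)
print(level_counts)

# Names of the levels that have no observations
unused_levels <- names(level_counts)[level_counts == 0]
print(unused_levels)

# droplevels() removes the empty level and leaves the data untouched
toy_x_clean <- droplevels(toy_x)
print(levels(toy_x_clean))
```

Running table() on the grouping column of the real data in the same way should show a zero count for untreated_107.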