
Standardize names in a column and calculate the geometric mean by group in R


I have a data table in which I want to standardize the values in the "Sex" column and calculate the geometric mean of Score within each Group (x, y and z in the table).

I would appreciate your help. The data.table is below.

library(data.table)
dt <- data.table(Group = c("x","x","x","y","z","z"), Sex = c("Man","Female","Feminine","Male","M","F"), Score = c(0,0.4,0.1,0.5,3,2.1))

Thank you.


Solution

  • Is this what you want?

    # Geometric mean: the n-th root of the product of n values
    geomean <- function(v) prod(v)^(1/length(v))
    # Apply it to Score within each Group
    res <- tapply(dt$Score, dt$Group, geomean)
    

    which gives

    > res
          x       y       z 
    0.00000 0.50000 2.50998 
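
    Group x comes out as 0 because one of its scores is 0, and any product that contains a zero is zero. As an aside, the same quantity can also be computed on the log scale, which avoids overflow or underflow when many values are multiplied. A minimal sketch, assuming all scores are non-negative (geomean_log is a hypothetical name, not part of the answer above):

    ```r
    # Hypothetical alternative: geometric mean via logs, assuming v >= 0
    geomean_log <- function(v) exp(mean(log(v)))

    geomean_log(c(3, 2.1))       # same value as sqrt(3 * 2.1)
    geomean_log(c(0, 0.4, 0.1))  # log(0) is -Inf, so the result is still 0
    ```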
    

    Alternatively, use ave to add the group-wise result as a new column:

    dt <- within(dt, gm <- ave(Score, Group, FUN = geomean))
    > dt
       Group      Sex Score      gm
    1:     x      Man   0.0 0.00000
    2:     x   Female   0.4 0.00000
    3:     x Feminine   0.1 0.00000
    4:     y     Male   0.5 0.50000
    5:     z        M   3.0 2.50998
    6:     z        F   2.1 2.50998
    

    EDIT:

    If you want to group the data by both Group and Sex, first standardize Sex to its upper-case first letter, then pass both columns to ave:

    dt <- within(transform(dt,Sex = toupper(substr(Sex,1,1))),
                 gm <- ave(Score,Group,Sex,FUN = geomean))
    

    which gives

    > dt
       Group Sex Score  gm
    1:     x   M   0.0 0.0
    2:     x   F   0.4 0.2
    3:     x   F   0.1 0.2
    4:     y   M   0.5 0.5
    5:     z   M   3.0 3.0
    6:     z   F   2.1 2.1
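
    The same result can also be sketched in data.table's own syntax, which updates dt by reference instead of copying it. This is a stylistic assumption rather than part of the answer above, and sex_map is a hypothetical lookup table that lists each observed spelling explicitly instead of relying on the first letter:

    ```r
    library(data.table)

    dt <- data.table(Group = c("x","x","x","y","z","z"),
                     Sex = c("Man","Female","Feminine","Male","M","F"),
                     Score = c(0,0.4,0.1,0.5,3,2.1))

    geomean <- function(v) prod(v)^(1/length(v))

    # Hypothetical lookup: every observed spelling mapped to a standard code
    sex_map <- c(Man = "M", Male = "M", M = "M",
                 Female = "F", Feminine = "F", F = "F")

    # Standardize Sex, then add the geometric mean per Group and Sex
    dt[, Sex := unname(sex_map[Sex])]
    dt[, gm := geomean(Score), by = .(Group, Sex)]
    dt
    ```

    Any spelling missing from sex_map becomes NA, which makes unexpected values easy to spot.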