r levels

Calculate mean of levels interval


I have factor levels that are interval labels, and I want to calculate the mean of each interval's two endpoints. Do I have to use gsub to replace characters, or is there another way?

# Reproduce data
x <- c("(-48.2,-47.8]", "(-61.9,-61.5]", "(-52.2,-51.8]", "(-43.7,-43.3]", "(-51.4,-51]", "(-43.3,-42.9]", "(-43.7,-43.3]", "(-47.4,-47]")

# My data has the form below
X <- as.factor(x)

# I want the mean of the endpoints of e.g. X[1]
# mean(X[1]) = mean(c(-48.2, -47.8)) = -48

Solution

  • You could try this approach using dplyr and tidyr, which keeps both interval endpoints as numeric columns:

    library(dplyr)
    library(tidyr)
    
    # Split each label at the comma, strip the brackets, then average the endpoints
    data.frame(x) %>%
      separate(x, into = c("num1", "num2"), sep = ",") %>%
      mutate(num1 = as.numeric(gsub("[][()]", "", num1)),
             num2 = as.numeric(gsub("[][()]", "", num2)),
             mean = (num1 + num2) / 2)
    

    Output:

    #    num1  num2  mean
    # 1 -48.2 -47.8 -48.0
    # 2 -61.9 -61.5 -61.7
    # 3 -52.2 -51.8 -52.0
    # 4 -43.7 -43.3 -43.5
    # 5 -51.4 -51.0 -51.2
    # 6 -43.3 -42.9 -43.1
    # 7 -43.7 -43.3 -43.5
    # 8 -47.4 -47.0 -47.2