I need to merge two data frames by the species columns (present in both data frames) and add the mean values from two other two columns of the data frame 2 (df2).
Example:
df1 <- as.data.frame(rbind("A","B","C"))
colnames(df1) <- "species"
df1 <-
species
A
B
C
a <- rbind(replicate(2,sample(1:5,5,rep=TRUE)))
b <- rbind(replicate(2,sample(2:8,10,rep=TRUE)))
c <- rbind(replicate(2,sample(4:11,6,rep=TRUE)))
a1 <- cbind(replicate(5, "A"),a)
b1 <- cbind(replicate(10, "B"),b)
c1 <- cbind(replicate(6, "C"),c)
df2 <- as.data.frame(rbind(a1,b1,c1))
colnames(df2) <- c("species", "variable1", "variable2")
df2[, c(2:3)] <- sapply(df2[, c(2:3)], as.numeric)
df2 <-
species variable1 variable2
A 4 3
A 3 4
A 5 4
A 3 3
A 1 4
B 6 7
B 8 5
B 5 8
B 3 5
B 2 2
B 5 7
B 5 7
B 4 3
B 4 2
B 6 7
C 11 4
C 10 9
C 6 5
C 6 10
C 8 11
C 6 4
The outcome I'm looking for is the following:
result_df <-
mean_variable1 mean_variable2
A X X
B X X
C X X
I need the mean of the two variables for each species of the df1 (A, B and C).
Any idea on how to do this maybe using dplyr?
I think the below would work? If you need to join on multiple character columns that exist in both, can do by = c("var1", "var2", ...)
result_df <- df2 %>% group_by(species) %>%
summarize(mean_variable1 = mean(variable1)
,mean_variable2 = mean(variable2)) %>%
inner_join(df1, by = c("species"))