I am trying to create a new column based on conditional matching of two columns that share the same factor levels, in order to summarise the data.
This is how the data frame looks:
This is how I want to summarise it:
This is my attempt, which does not work:
probe$Cluster <- "cluster"
for (i in 1:length(probe)) {
  if (is.na(probe[i, 4])) {
    probe[i, 9] <- probe[i, 6]
  }
  if (is.na(probe[i, 6])) {
    probe[i, 9] <- probe[i, 4]
  }
  if (identical(probe[i, 4], probe[i, 6])) {
    probe[i, 9] <- probe[i, 4]
  }
}
if (!identical(probe[i, 4], probe[i, 6])) {
  probe[i, 9] <- probe[i, 4]
  rep(probe[i, 1:9] %>% probe[i, 9] <- probe[i, 6}
# Then create a summary like this:
Sum <- probe %>%
  group_by(Method, Cluster) %>%
  summarise(mean(relation, na.rm = FALSE),
            numberobservations = length(unique(GenA))) %>%
  data.frame()
Thank you for any advice.
Can't verify without sample data that I can load (without retyping from a picture), but it looks like you're going for something like this:
library(dplyr)
probe %>%
  mutate(Cluster = coalesce(ClusterA, ClusterB)) %>% # use 1st non-NA from cols
  group_by(Method, Cluster) %>%
  summarize(mean = mean(relation, na.rm = TRUE),
            numberobservations = n(), .groups = "drop")
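Since the original data was only posted as a picture, here is a small self-contained sketch of the same approach on made-up data. The column names Method, GenA, ClusterA, ClusterB and relation are assumed from the question and the code above, and every value is invented for illustration. It also assumes dplyr 1.0.0 or later, which is needed for the .groups argument.

```r
library(dplyr)

# Toy stand-in for `probe`; all values are invented for illustration.
probe <- data.frame(
  Method   = c("m1", "m1", "m1", "m2", "m2"),
  GenA     = c("g1", "g2", "g3", "g1", "g2"),
  ClusterA = c("c1", NA,   "c2", "c1", NA),
  ClusterB = c(NA,   "c1", "c2", NA,   "c2"),
  relation = c(0.2, 0.4, 0.6, 0.8, 1.0),
  stringsAsFactors = FALSE
)

result <- probe %>%
  # Take ClusterA where it is not NA, otherwise fall back to ClusterB
  mutate(Cluster = coalesce(ClusterA, ClusterB)) %>%
  group_by(Method, Cluster) %>%
  summarize(mean = mean(relation, na.rm = TRUE),
            numberobservations = n(), .groups = "drop")

print(result)
```

Note that coalesce() keeps the ClusterA value whenever both columns are filled in, even if they disagree, so rows with two different clusters would need separate handling. If you want the number of distinct genes per group rather than the number of rows, n_distinct(GenA) could replace n().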