I have a dataframe:
dput(gene1[1:5,1:5])
structure(list(en_Adipose_Subcutaneous.db = c(0.0531016390078734,
-0.00413407782001034, -0.035434632568444, 0.00968736935965742,
0.0523714252287003), en_Adipose_Visceral_Omentum.db = c(0, 0,
0, 0, 0), en_Adrenal_Gland.db = c(0, 0, 0, 0, 0), en_Artery_Aorta.db = c(0,
0, 0, 0, 0), en_Artery_Coronary.db = c(0, 0, 0, 0, 0)), row.names = c("rs1041770",
"rs12628452", "rs915675", "rs11089130", "rs36061596"), class = "data.frame")
I want to keep only those rows that have a nonzero value in at least two columns, and remove the rows that have a value in only one column. I wrote this code:
one_tissueonly <- NULL
for(i in 1:552){
y <- which(gene1[i,]!=0) ## >1 means more than one col
if(length(y)>1){ ##select only for one col:
value <- gene1[i,]
}
one_tissueonly <- rbind(one_tissueonly,value)
}
But it generates duplicated rows. For example, the first row is repeated in the result built with rbind:
dput(one_tissueonly[1:5,1:5])
structure(list(en_Adipose_Subcutaneous.db = c(0.0531016390078734,
0.0531016390078734, 0.0531016390078734, 0.00968736935965742,
0.0523714252287003), en_Adipose_Visceral_Omentum.db = c(0, 0,
0, 0, 0), en_Adrenal_Gland.db = c(0, 0, 0, 0, 0), en_Artery_Aorta.db = c(0,
0, 0, 0, 0), en_Artery_Coronary.db = c(0, 0, 0, 0, 0)), row.names = c("rs1041770",
"rs10417701", "rs10417702", "rs11089130", "rs36061596"), class = "data.frame")
Does anyone know how to solve this? Thank you.
Following Gregor Thomas's advice (modified, since you want markers that show a value in two or more tissues):
# using brackets: count the nonzero entries per row and keep rows with more than one
gene1 <- gene1[rowSums(gene1 != 0) > 1, ]
# using subset(): same condition
gene1 <- subset(gene1, rowSums(gene1 != 0) > 1)
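
As for why the loop in the question repeats rows: the rbind() call sits outside the if block, so it runs on every iteration, and value still holds the last row that passed the test. That old row appears to be appended again for each row that fails, and rbind() then renames the duplicates (hence names like "rs10417701"). Below is a minimal sketch on a made-up data frame (toy and its values are hypothetical, not from your data) showing the rowSums() count and a loop with rbind() moved inside the if:

```r
# Toy data (hypothetical values): rows are markers, columns are tissues
toy <- data.frame(
  tissue_a = c(0.05, 0, -0.03, 0),
  tissue_b = c(0, 0, 0.10, 0.20),
  tissue_c = c(0.02, 0, 0, 0),
  row.names = c("snp1", "snp2", "snp3", "snp4")
)

# toy != 0 is a logical matrix; rowSums() counts the TRUE values in each row
nonzero_per_row <- rowSums(toy != 0)

# Vectorised filter: keep rows with a nonzero value in more than one column
kept <- toy[nonzero_per_row > 1, ]

# Loop version: rbind() is inside the if, so only passing rows are appended
kept_loop <- NULL
for (i in seq_len(nrow(toy))) {
  if (sum(toy[i, ] != 0) > 1) {
    kept_loop <- rbind(kept_loop, toy[i, ])
  }
}

rownames(kept)
rownames(kept_loop)
```

Both versions should keep the same rows here. The vectorised form is usually preferable, since growing a data frame with rbind() inside a loop gets slow as the number of rows increases.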