I have a data frame that looks like this:
df <- data.frame(type=c("SNP","DEL","SNP","SNP"),geneA=c(1,1,1,0), geneB=c(0,0,1,1), geneC=c(1,0,0,1))
  type geneA geneB geneC
1  SNP     1     0     1
2  DEL     1     0     0
3  SNP     1     1     0
4  SNP     0     1     1
I want to make an UpSet plot in R to find the common genes, and
I also want to plot the distribution of types (SNP or DEL) in a histogram.
This is my code so far:
upset(df,
attribute.plots = list(gridrows=50,
plots=list(list(plot=histogram,
x="type"))))
And this is the error, which I cannot solve:
Error: StatBin requires a continuous x variable: the x variable is discrete. Perhaps you want stat="count"?
Any help is highly appreciated.
I am not sure exactly what you mean by plotting the distribution of types (SNP or DEL) in a histogram. A histogram bins a continuous variable, which is why StatBin complains about the discrete type column. Because type is categorical (SNP or DEL), you probably want to plot the counts of SNP and DEL in a bar plot. If that's true, you can try this way:
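library(ggplot2)  # provides ggplot(), geom_bar(), theme(), unit()
# Bar plot of counts for a discrete column; x is the column name as a string.
mybarplot <- function(mydata, x) {
  ggplot(mydata, aes_string(x = x)) +
    geom_bar() +  # geom_bar() uses stat = "count", so a discrete x works
    theme(plot.margin = unit(c(0.5, 0.5, 0, 0), "cm"),
          legend.key.size = unit(0.4, "cm"))
}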
This function takes a data frame and the name of one of its columns as inputs and
returns a bar plot. You then pass this function to upset() so that the bar plot is
drawn alongside your UpSet plot.
library(UpSetR)
# df is the data frame from the question
upset(df,
      attribute.plots = list(gridrows = 50,
                             plots = list(list(plot = mybarplot,
                                               x = "type"))))
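If you want to check the numbers the bar plot should show, here is a minimal sketch that computes the counts per type on the example data, outside of upset(). It assumes only that ggplot2 is installed. The names type_counts, p and bar_counts are just illustrative, and the plot is built directly with aes(x = type) rather than through mybarplot.

```r
library(ggplot2)

# Same toy data as in the question.
df <- data.frame(type  = c("SNP", "DEL", "SNP", "SNP"),
                 geneA = c(1, 1, 1, 0),
                 geneB = c(0, 0, 1, 1),
                 geneC = c(1, 0, 0, 1))

# Counts per type computed directly with base R.
type_counts <- table(df$type)
print(type_counts)  # DEL: 1, SNP: 3

# The same counts as ggplot2 computes them for a bar plot.
# geom_bar() defaults to stat = "count", so a discrete x is accepted.
p <- ggplot(df, aes(x = type)) + geom_bar()

# layer_data() returns the data ggplot2 computed for the first layer;
# the bars are ordered alphabetically (DEL, then SNP).
bar_counts <- layer_data(p)$count
print(bar_counts)   # 1 3
```

Both approaches should agree, which is a quick way to confirm that the bar plot attached to the UpSet plot is counting rows per type rather than binning values.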