I have a data set of genes from two populations, and I want to assess whether these genes are differentially expressed between the two populations. That's why I want to obtain an adjusted p-value for each gene.
I know little about coding. I have tried both Python and R, without any success.
My data has this structure:
,CONTROL1,CONTROL2,CONTROL3,PROBLEM1,PROBLEM2,PROBLEM3
gene1,31.7,6.31,0.632,0.021,0.159,0.026,
gene2,31.7,6.31,0.632,0.021,0.159,0.026,
gene3,31.7,6.31,0.632,0.021,0.159,0.026,
...
gene_n,31.7,6.31,0.632,0.021,0.159,0.026,
I have tried everything I could find on the Internet, but nothing has worked.
I want to obtain a list similar to this:
gene_name, adjusted_p-value
gene1, 0.001
gene2, 0.3
gene3, 0.9
...
gene_n, 0.004
If someone could give me a hint about where to look or how to do it, I would be very grateful. Thanks!
Short answer: if you're working with differential expression data, it's probably worth using DESeq2.
https://bioconductor.org/packages/release/bioc/html/DESeq2.html
The tutorial is pretty good: https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html
It walks you through how to go from a count matrix, which is basically what you have, to a set of adjusted p-values.
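In case it helps to see what the "adjusted" part means: by default DESeq2 adjusts its p-values with the Benjamini-Hochberg procedure. Below is a minimal Python sketch of that adjustment step only. It is not DESeq2 and does not compute the raw p-values; the function name `benjamini_hochberg` and the raw p-values are made up for illustration. Also worth checking: DESeq2 expects un-normalized integer counts, and the values in your example look like they may already be normalized.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, so each adjusted
    # value is the minimum of its own scaled value (p * m / rank) and the
    # adjusted values of all larger p-values. This keeps them monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per gene (NOT computed from the data above).
raw = {"gene1": 0.001, "gene2": 0.04, "gene3": 0.03, "gene4": 0.5}

adjusted = dict(zip(raw, benjamini_hochberg(list(raw.values()))))

# Print in the layout asked for in the question.
print("gene_name,adjusted_p-value")
for gene, p in adjusted.items():
    print(f"{gene},{p:.4g}")
```

The raw p-values still have to come from a statistical test suited to your data, which is the part DESeq2 handles for count data. With only three samples per group, a simple per-gene test will have very little power, which is one reason a dedicated package is the better route.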
For future questions like this, you might have better luck at https://bioinformatics.stackexchange.com/ than here.