I am trying to run a section of code once for each value in a column of a dataframe that holds all my data. An example of the dataframe's layout is below:
HumanName | log2FoldChange | pvalue | gene |
---|---|---|---|
Rob | -3e00 | 4e-06 | GeneA |
Carol | -2e04 | 2e-09 | GeneA |
Pamela | 4e-06 | 5e-04 | GeneA |
Rob | 4e-02 | 1e-10 | GeneB |
Carol | -1e-10 | 4e-06 | GeneB |
Pamela | -8e03 | 3e-09 | GeneB |
I now want to run a section of code that produces separate dataframes based on the values in the "gene" column. So I'd ultimately end up with one MvWup for GeneA and a separate MvWup for GeneB.
GeneSetUp <- subset(FullSet0.05, log2FoldChange>0)
GeneSetDown <- subset(FullSet0.05, log2FoldChange<0)
GeneSetUpLF1 <- subset(GeneSetUp, log2FoldChange > 1)
GeneSetDownLF1 <- subset(GeneSetDown, log2FoldChange< -1)
GeneSetUpLF1G <- subset(GeneSetUpLF1, select = -c(pvalue, log2FoldChange))
GeneSetDownLF1G <- subset(GeneSetDownLF1, select = -c(pvalue, log2FoldChange))
MvWup<-as.vector(unlist(GeneSetUpLF1G))
MvWdown<-as.vector(unlist(GeneSetDownLF1G))
I have gone back and forth between a for loop and a map approach but am struggling with both. I've created a function for the above code but can't seem to apply it correctly. Any guidance would be very appreciated.
# Sample dataframe
df <- data.frame(
  HumanName = c("Rob", "Carol", "Pamela", "Rob", "Carol", "Pamela"),
  log2FoldChange = c(-3e00, -2e04, 4e-06, 4e-02, -1e-10, -8e03),
  pvalue = c(4e-06, 2e-09, 5e-04, 1e-10, 4e-06, 3e-09),
  gene = c("GeneA", "GeneA", "GeneA", "GeneB", "GeneB", "GeneB")
)
# Split the dataframe by 'gene'
gene_list <- split(df, df$gene)
# Initialize empty list to store results
results <- list()
# Iterate through each gene-specific dataframe
for (gene_name in names(gene_list)) {
  gene_data <- gene_list[[gene_name]]
  # Filtering logic
  GeneSetUp <- subset(gene_data, log2FoldChange > 0)
  GeneSetDown <- subset(gene_data, log2FoldChange < 0)
  GeneSetUpLF1 <- subset(GeneSetUp, log2FoldChange > 1)
  GeneSetDownLF1 <- subset(GeneSetDown, log2FoldChange < -1)
  GeneSetUpLF1G <- subset(GeneSetUpLF1, select = -c(pvalue, log2FoldChange))
  GeneSetDownLF1G <- subset(GeneSetDownLF1, select = -c(pvalue, log2FoldChange))
  # Store filtered dataframes in a list with dynamic names
  results[[paste0("MvWup_", gene_name)]] <- GeneSetUpLF1G
  results[[paste0("MvWdown_", gene_name)]] <- GeneSetDownLF1G
}
# Example: Access result for GeneA
results$MvWup_GeneA
results$MvWdown_GeneA
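Since the question mentions wrapping the filtering in a function and mapping it over the genes, here is a minimal sketch of that approach using base R's `split()` and `lapply()`. The helper name `split_up_down` and its `cutoff` argument are made up for illustration, and the sketch assumes that only the `HumanName` values are wanted in the final `MvWup` and `MvWdown` vectors (the original `unlist()` call would also flatten the `gene` column into the result).

```r
# Sample data copied from the question
df <- data.frame(
  HumanName = c("Rob", "Carol", "Pamela", "Rob", "Carol", "Pamela"),
  log2FoldChange = c(-3e00, -2e04, 4e-06, 4e-02, -1e-10, -8e03),
  pvalue = c(4e-06, 2e-09, 5e-04, 1e-10, 4e-06, 3e-09),
  gene = c("GeneA", "GeneA", "GeneA", "GeneB", "GeneB", "GeneB")
)

# Hypothetical helper (not from the original post): for one gene's rows,
# return the names above the cutoff and the names below the negative cutoff.
# which() drops rows where log2FoldChange is NA, mirroring subset().
split_up_down <- function(gene_data, cutoff = 1) {
  up_rows <- which(gene_data$log2FoldChange > cutoff)
  down_rows <- which(gene_data$log2FoldChange < -cutoff)
  list(
    MvWup = as.character(gene_data$HumanName[up_rows]),
    MvWdown = as.character(gene_data$HumanName[down_rows])
  )
}

# Apply the helper to each gene-specific dataframe; the list keeps gene names
by_gene <- lapply(split(df, df$gene), split_up_down)

# Access the vectors for one gene
by_gene$GeneA$MvWup
by_gene$GeneA$MvWdown
```

With the sample data, no row has a log2FoldChange above 1, so both `MvWup` vectors come back empty. Keeping the results in a nested list rather than in separately named objects makes it easy to loop over genes later, for example with `names(by_gene)`.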