Currently I have a data frame which has the acronyms of unique cancer types (hotspot_mockup), like so:
Cancer | Gene |
---|---|
AASTR | IDH1 |
ACRM | NRAS |
In another data frame, I have these 184 unique acronyms and their corresponding full names (new_hotspot_cancers). This is in the form:
Acronym | Full Name |
---|---|
AASTR | Anaplastic Astrocytoma |
ACRM | Acral Melanoma |
I want to replace the acronyms in the first data frame with the corresponding full names from the second data frame (assuming, of course, that the acronym exists in the second data frame). Overall, I want the result to look like:
Cancer | Gene |
---|---|
Anaplastic Astrocytoma | IDH1 |
Acral Melanoma | NRAS |
I was thinking of some kind of "for" loop, but I know this is frowned upon in R. As always, any guidance would be greatly appreciated!
> I was thinking of some kind of "for" loop, but I know this is frowned upon in R.

It's not that it's frowned upon; it's that people with experience in other programming languages tend to use for loops in R when they are not needed, either because R vectorizes by default or because functions like `lapply()`, or `map()` from the `purrr` package, do the job of a for loop more efficiently.
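
To make the point about vectorization concrete, here is a small sketch in base R. The vector `acronyms` and the choice of `tolower()` and `nchar()` are only illustrative assumptions, not something from your data; the idea is that the loop and the vectorized call give the same result.

```r
# Illustrative vector (assumed for this example only)
acronyms <- c("AASTR", "ACRM")

# Loop version: fill a pre-allocated vector one element at a time
lower_loop <- character(length(acronyms))
for (i in seq_along(acronyms)) {
  lower_loop[i] <- tolower(acronyms[i])
}

# Vectorized version: tolower() already works on the whole vector
lower_vec <- tolower(acronyms)

# lapply() applies a function to each element and returns a list
lengths_list <- lapply(acronyms, nchar)

identical(lower_loop, lower_vec)
```

The loop is not wrong, it is just more code than the one-line vectorized call.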
In this case, you can just do a `left_join()` from the `dplyr` package.

    df1 <- data.frame(Cancer = c("AASTR", "ACRM"), Gene = c("IDH1", "NRAS"))
    df2 <- data.frame(Acronym = c("AASTR", "ACRM"), Full_Name = c("Anaplastic Astrocytoma", "Acral Melanoma"))
    dplyr::left_join(df1, df2, by = c("Cancer" = "Acronym"))

      Cancer Gene              Full_Name
    1  AASTR IDH1 Anaplastic Astrocytoma
    2   ACRM NRAS         Acral Melanoma
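
Note that the join adds a `Full_Name` column rather than overwriting `Cancer`. If you want the full names in the `Cancer` column itself, one possible approach is a lookup with base R's `match()`. This is a minimal sketch under some assumptions: the third row (`"XYZ"` / `"TP53"`) is a made-up example of an acronym missing from the lookup table, and I have assumed you want to keep the acronym when no full name is found.

```r
# Toy data mirroring the question; the "XYZ"/"TP53" row is hypothetical
# and only there to show what happens when an acronym has no match
df1 <- data.frame(Cancer = c("AASTR", "ACRM", "XYZ"),
                  Gene = c("IDH1", "NRAS", "TP53"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Acronym = c("AASTR", "ACRM"),
                  Full_Name = c("Anaplastic Astrocytoma", "Acral Melanoma"),
                  stringsAsFactors = FALSE)

# For each acronym in df1, find its row position in df2 (NA if absent)
idx <- match(df1$Cancer, df2$Acronym)

# Use the full name where a match exists, otherwise keep the acronym
df1$Cancer <- ifelse(is.na(idx), df1$Cancer, df2$Full_Name[idx])
df1
```

With your real data you would use `hotspot_mockup` and `new_hotspot_cancers` in place of `df1` and `df2`, adjusting the column names to whatever they are actually called.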