I'm trying to figure out how to use R to search a CSV file, created from PubMed with the RISmed package, for certain terms such as "latino". I'd like to create a new variable "Latino" that reads the whole row and records yes or no depending on whether the word is mentioned anywhere in it.
How would I be able to do this, and which package do you recommend?
Here is a sample of my code:
library(RISmed)
library(dplyr) # tibble and other functions
RCT_topic <- 'randomized clinical trial'
RCT_query <- EUtilsSummary(RCT_topic, mindate=2016, maxdate=2017, retmax=100)
summary(RCT_query)
RCT_records <- EUtilsGet(RCT_query)
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
RCT_data <- tibble('PMID'=PMID(RCT_records),
                   'Title'=ArticleTitle(RCT_records),
                   'Abstract'=AbstractText(RCT_records),
                   'YearPublished'=YearPubmed(RCT_records),
                   'Month.Published'=MonthPubmed(RCT_records),
                   'Country'=Country(RCT_records),
                   'Grant'=GrantID(RCT_records),
                   'Acronym'=Acronym(RCT_records),
                   'Agency'=Agency(RCT_records),
                   'Mesh'=Mesh(RCT_records))
Why not use grepl to add a column indicating whether or not a search term is found in the Abstract column of your search results? grepl returns a logical vector: TRUE where your pattern is found and FALSE where it is not.
# There are no mentions of "Latino" or "latino" in your df.
RCT_data$Latino <- grepl("Latino|latino",RCT_data$Abstract)
# There are several mentions of the word "pain":
RCT_data$Pain <- grepl("pain",RCT_data$Abstract)
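The question also asks to check the whole row and to record yes or no rather than TRUE or FALSE. One way to do that, as a minimal sketch in base R, is to paste every column of a row into one string and run grepl on it with ignore.case = TRUE. The data frame df below is a small hypothetical stand-in for RCT_data, not real PubMed output. Note that missing values are pasted as the text "NA", and list columns such as Mesh may need to be flattened to character first.

```r
# Hypothetical stand-in for RCT_data (made-up rows, for illustration only)
df <- data.frame(
  PMID     = c("1", "2", "3"),
  Title    = c("Pain outcomes in adults",
               "A trial in Latino communities",
               "Sleep and diet"),
  Abstract = c("We measured chronic pain.",
               "Participants were recruited locally.",
               NA),
  stringsAsFactors = FALSE
)

# Collapse each row into a single string so every column is searched
row_text <- do.call(paste, c(df, sep = " "))

# ignore.case = TRUE matches "Latino", "latino", "LATINO", ...
df$Latino <- ifelse(grepl("latino", row_text, ignore.case = TRUE), "yes", "no")
df$Pain   <- ifelse(grepl("pain",   row_text, ignore.case = TRUE), "yes", "no")

print(df[, c("PMID", "Latino", "Pain")])
```

With this approach a term counts as found if it appears in any column, so a match in the Title alone is enough to give "yes". No extra package is needed beyond base R.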