I'm getting crazy with a (apparently) very simple problem.
I stored in a data frame Excel formulas as strings. Some of these strings are hyperlinks. From these hyperlink formulas, I want to extract only the displayed text and modify in place the cell of the data frame accordingly.
For columns whose first row contains a string with an hyperlink formula, it works for the whole column.
For columns whose first row doesn't contain a string with an hyperlink formula, it doesn't work for the whole column.
Here is a reprex:
DF <- data.frame(
x = c('HYPERLINK("[Individual files/NetworkData4MassiveExport_Individual_1.xlsx]1_2016_Nodes!A3", "\'Nodes\' tab")', 'NA()'),
y = c('NA()', 'HYPERLINK("[Individual files/NetworkData4MassiveExport_Individual_2.xlsx]2_2019_Nodes!A3", 1)'))
DF
DF[] <- lapply(DF, function(x)
ifelse(substr(x, 1, 9) == 'HYPERLINK', trimws(gsub('[\")]', '', unlist(strsplit(x, ','))[2])), gsub('"', '', x)))
DF
Maybe a single sub()
command can handle it.
lapply(DF, \(x) sub("HYPERLINK\\(.+, (.+)\\)", "\\1", x)) |>
as.data.frame()
# x y
# 1 "'Nodes' tab" NA()
# 2 NA() 1