I would like to output data from R to an excel file, and have been using the openxlsx package.
writeDataTable is the function I have been using. The problem is that I want the entries of one of the columns to be a mixture of numbers and text. As a column of a dataframe in R must have all the same type of entries, everything gets converted to character entries.
When I open the excel file that is outputted I get the little green triangle in the corner of the number cells, telling me that they are a number stored as text and giving me the option to convert to number.
I'm wondering is it possible to get R to convert this for me in the Workbook Object that is created before saving the file.
I've spent a lot of time googling the problem, looking for different libraries that might help, but haven't been able to find anything so far. I have an example below. Note that this isn't the exact data frame I am using, but is for demonstrative purposes.
df = data.frame(A = c('Dog', '5', '7.04'), B = c('Cat', '12', '1.23'))
wb = createWorkbook()
addWorksheet(wb, "Sheet2")
writeDataTable(wb, "Sheet2", df)
output_file = "C:\\Users\\johndoe\\documents\\excel_file.xlsx"
saveWorkbook(wb, output_file)
Any help would be much appreciated.
Here's a way, but it will be painfully slow.
### unchanged
library(openxlsx)
df = data.frame(A = c('Dog', '5', '7.04'), B = c('Cat', '12', '1.23'))
wb = createWorkbook()
addWorksheet(wb, "Sheet2")
writeDataTable(wb, "Sheet2", df)
### this is the new part
for (cn in seq_len(ncol(df))) {
for (rn in seq_len(nrow(df))) {
if (!is.numeric(df[rn,cn]) && !is.na(val <- as.numeric(as.character(df[rn,cn])))) {
writeData(wb, "Sheet2", val, startCol = cn, startRow = 1L + rn)
}
}
}
### unchanged
saveWorkbook(wb, output_file)
The catch is that this is writing one cell at a time.
Ways to improve this:
If you have more numbers than strings, you might want to reverse this a little by converting the relevant columns with as.numeric
(producing a lot of NA
s), and then one-by-one over-write a cell's NA
value with the previous string.
You can look for runs of cells to replace (perhaps using rle
within a column), which will write clump-by-clump instead of cell-by-cell.