I have a data frame that I want to save into XLSX format using openxlsx2
.
This data frame contains entries with specific columns with identical values.
I want to highlight those entries (the whole row) in the output XLSX file, using different fill colors.
I'm not going to go into detail on how I get those entries, but at the end, I end up with a data frame pretty similar to this modified mtcars
in the MWE example below:
#load data
data(mtcars)
mtcars$model <- rownames(mtcars)
rownames(mtcars) <- NULL
brand_split <- strsplit(mtcars$model, "\\s+")
mtcars$brand <- sapply(brand_split, "[", 1)
#color data frame
color_df <- data.frame(model=c("Mazda RX4","Mazda RX4 Wag","Merc 230","Merc 240D","Merc 280","Merc 280C","Merc 450SE","Merc 450SL","Merc 450SLC","Fiat 128","Fiat X1-9"),
color=c("red","red","blue","blue","green","green","yellow","yellow","yellow","purple","purple"))
mtcars <- dplyr::left_join(mtcars, color_df, by="model")
main_cols <- c('model','brand','gear','carb')
other_cols <- names(mtcars)[!(names(mtcars) %in% main_cols)]
mtcars <- mtcars[,c(main_cols, other_cols)]
... which produces:
> mtcars
model brand gear carb mpg cyl disp hp drat wt qsec vs am color
1 Mazda RX4 Mazda 4 4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 red
2 Mazda RX4 Wag Mazda 4 4 21.0 6 160.0 110 3.90 2.875 17.02 0 1 red
3 Datsun 710 Datsun 4 1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 <NA>
4 Hornet 4 Drive Hornet 3 1 21.4 6 258.0 110 3.08 3.215 19.44 1 0 <NA>
5 Hornet Sportabout Hornet 3 2 18.7 8 360.0 175 3.15 3.440 17.02 0 0 <NA>
6 Valiant Valiant 3 1 18.1 6 225.0 105 2.76 3.460 20.22 1 0 <NA>
7 Duster 360 Duster 3 4 14.3 8 360.0 245 3.21 3.570 15.84 0 0 <NA>
8 Merc 240D Merc 4 2 24.4 4 146.7 62 3.69 3.190 20.00 1 0 blue
9 Merc 230 Merc 4 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 blue
10 Merc 280 Merc 4 4 19.2 6 167.6 123 3.92 3.440 18.30 1 0 green
...
I want to create an XLSX file with that data frame (minus the color
column), in which each row is colored by the color defined in the color
column.
I checked different resources, like this and this... but nothing comes close to what I want to do, which seems to me like the easiest and most obvious case.
I found this question, which seems to want to accomplish the same, but they use the outdated openxlsx
, not openxlsx2
.
I tried looking in the openxlsx2 manual for similar functions, but no success so far.
This is my code so far, to just save the data in XLSX format:
mtcars_wb <- openxlsx2::wb_workbook()
mtcars_wb$add_worksheet(sheet = "mtcars")
mtcars_wb$add_data(sheet = "mtcars", x = mtcars[,-ncol(mtcars)], na.strings = "NA")
#
output_file <- "mtcars_test.xlsx"
openxlsx2::wb_save(mtcars_wb, file = output_file)
... which produces this:
Two problems here:
For 1
, based on the question above that uses openxlsx
, I want to do something similar like this (which obviously does not work):
for (color in unique(color_df$color)){
rows <- which(mtcars$color==color)
if (length(rows)>0){
color_style <- openxlsx2::create_style(fontColour = "#000000", fgFill = color)
mtcars_wb$add_style(sheet = "mtcars", cols = 1:ncol(mtcars), rows = rows, style = color_style, gridExpand = TRUE)
}
}
For 2
as a bonus, I would like to change the output window size, which I cannot figure out how... maybe something like this (which also does not work)?
options("openxlsx2.outputSize" = <MUCH BIGGER WINDOW SIZE>)
At the end, the output XLSX file should look like:
Thanks!
I've renamed the data frame to df
for simplicity. This gives the output you desire:
packages <- c("openxlsx2", "dplyr")
lapply(packages, library, character.only = TRUE)
# load data
df <- mtcars
df$model <- rownames(df)
rownames(df) <- NULL
brand_split <- strsplit(df$model, "\\s+")
df$brand <- sapply(brand_split, "[", 1)
# colors
color_df <- data.frame(model=c("Mazda RX4","Mazda RX4 Wag","Merc 230","Merc 240D","Merc 280","Merc 280C","Merc 450SE","Merc 450SL","Merc 450SLC","Fiat 128","Fiat X1-9"),
color=c("red","red","blue","blue","green","green","yellow","yellow","yellow","purple","purple"))
df <- left_join(df, color_df, by="model")
# pull out colors, don't want in output.
colors <- df$color
df <- df %>% select(-color)
main_cols <- c('model','brand','gear','carb')
other_cols <- names(df)[!(names(df) %in% main_cols)]
df <- df[,c(main_cols, other_cols)]
# WB and WS objects
mywb <- wb_workbook()
mywb$add_worksheet("mysheet")
wb_add_data(mywb, "mysheet", df, dims="A1")
# color the rows
lapply(seq_along(colors),
function(x) if(!(is.na(colors[x]))) {
mywb$add_fill("mysheet",
dims=paste0("A", x+1, ":", int2col(ncol(df)), x+1),
color=wb_color(name=colors[x]))
}
)
wb_save(mywb, "out.xlsx")