rexcelopenxlsx

Failed to keep the first empty rows when importing data from Excel into R


I'd like to keep the full data, including any empty rows and columns, when importing. However, I am not sure why the read.xlsx() function from the openxlsx package does not achieve this if the first row is an empty row.

For example, if we look at this simple .xlsx file:

enter image description here

And using this code to import data:

library(openxlsx)

read.xlsx(xlsxFile      = "dummy.xlsx",
          sheet         = 1,
          skipEmptyRows = FALSE,
          skipEmptyCols = FALSE,
          colNames      = FALSE)

I'd expect an empty row to also be present in R as the first row. However, the result shows that the first row is being skipped, and the actual value from row 2 is now in the first row:

enter image description here

It's a small issue but I do need the accurate coordinates for some tasks later. I'd be greatly appreciate if you know how to fix this issue. Many thanks!


Solution

  • openxlsx apparently has no option to do this.

    From this question, you can use read_xlsx from readxl which is part of the tidyverse. Set the range parameter using cellranger::cell_limits

    library(readxl)
    read_xlsx("test.xlsx", col_names = FALSE, range = cellranger::cell_limits(ul = c(1L, 1L)))
    New names:                                                                             
    • `` -> `...1`
    # A tibble: 2 × 1
      ...1                    
      <chr>                   
    1 NA                      
    2 Data row 1 (Excel row 2)