rexcelopenxlsx

openxlsx package, read.xlsx check.names=false still placing a . in column names


Usuaully I use Tidyverse to read in excel files with the read_excel command, however I encountered the dreaded "Unknown or uninitialised column" bug that refers to a non existent column and then warns about said not existent column from then on through the workflow.

So I decided to use openxlsx instead to read in the excel files. All was going well until I realised that openxlsx sees column names with white space as not syntactically correct and it adds a . to replace the whitespace. So 'Customer Name' becomes 'Customer.Name'.

I tried using the check.names=FALSE command to leave the headers in tact, but the package seems to ignore this command.

Many of the headers might have more than a single space between the words and the format has to stay the same. I cannot use an excel package that relies on Java as our company has blocked it.

How can I force openxlsx to leave the header alone?

Example of the code I am using is here: IMACS <- read.xlsx("//zfsstdscun001a.rz.ch.com/UKGI_Pricing/Bus_Insights/R_Scripts/IMACS.xlsx",check.names=FALSE, sheet = "IMACS")


Solution

  • All credit to @Matt on this.

    Using readxl and read_excel together worked a treat.

    IMACS <- readxl::read_excel("//zfsstdscun001a.rz.com/UKGI_Pricing/Bus_Insights/R_Scripts/CAT Risks/IMACSV2.xlsx",
                                sheet = "IMACS")
    

    With openxlsx package, the following can be used:

    IMACS <- read.xlsx("//zfsstdscun001a.rz.ch.com/UKGI_Pricing/Bus_Insights/R_Scripts/IMACS.xlsx", 
                       sep.names = " ", sheet = "IMACS")