rvalidationxlconnect

XLConnect function is not passing argument to system.file


I have an issue with XLConnect library. Its system.file function, I've only seen be used with

file1 <- system.file(file, package ="XLConnect")

Where file is always equal to a string. However I have a list of strings with the file name of each file in it. I have this inside of a for loop that iterates through the list. For whatever reason, file1 is empty even though the file object has the equivalent "soandso.xlsx".

So my question is do I need file to be a string?

Edit #1: So I switched from using the XLConnect library which was giving me a separate error to xlsx library and at least now, I can get the code to run, albeit with different errors.

I think I am making progress though because I had the code unsuccessfully run past the first loop (since k incremented to 2) and I had something populate my dataset variable (where I store the 'read.xlsx2' reponse).

I've attached screenshot of my current error:

Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  : java.lang.IllegalArgumentException: Your InputStream was neither an OLE2 stream, nor an OOXML stream

Screenshot of my R Studio

Edit # 2:

https://www.dropbox.com/s/s9413oee146v497/Q1%2C%202011%20-%20Halton%2C%20all%20home%20types.xlsx?dl=0 https://www.dropbox.com/s/tgaup2ihnlq7br2/Q4%2C%202015%20-%20Halton%2C%20condo%20types.xlsx?dl=0

I've included 2/60 file I need to attach. All files at my destination are Excel files, which are 10 columns wide and variable row length - due to new regions or just regions just being renamed. The new error I'm getting is :

In addition: Warning message: In unzip(xlsxFile, exdir = xmlDir) : error 1 in extracting from zip file

Edit#3:

source('~/.active-rstudio-document', echo=TRUE)

> library("openxlsx")

> file_list = list.files(pattern = "xlsx$")

> dataset = data.frame()

> for (file in file_list){
+ 
+   print(file)
+   temp = read.xlsx(file,sheet = 1, colNames = F)
+   temp$quarter = substr(file,1,2)
+   temp$year = s .... [TRUNCATED] 
[1] "~$Q1, 2011 - Halton, all home types.xlsx"
Error in file(con, "r") : invalid 'description' argument
In addition: Warning message:
In unzip(xlsxFile, exdir = xmlDir) : error 1 in extracting from zip file

Sorry but I'm still having issues, however I know where the problem is. For some reason what's being printed as file is coming with "~$" before it, when I input just the file name 'Q1, 2011 - Halton, all home types.xlsx' as the file, it works and I get data to be inputted into dataframe. Now the problem with this I obviously cannot mass import 60 different Excel names...


Solution

  • Try using openxlsx library and the following code:

    library("openxlsx")
    
    file_list = list.files(pattern = "xlsx$")
    
    dataset = data.frame()
    
    for (file in file_list){
        print(file)
        temp = read.xlsx(file,sheet = 1, colNames = F)
        temp$quarter = substr(file,1,2)
        temp$year = substr(file,4,8)
        dataset = rbind(dataset,temp)
    }
    

    Edit 1: Added a print(file) to detect which xlsx file is having unzip problems.

    Edit 2: Added quarter and year information by reading them from the filename. This only works if all filenames follow the same pattern.