xml r sorting dataframe odk

Preserve order of variables when merging XML files into a data frame in R


I've got a directory full of XML files that I'd like to merge into a data frame. The XML files are the output of an ODK survey, and contain values only for the variables that the respondent did not leave blank. After importing, unlisting, and merging, the original order in which the questions were asked is lost. I have (or can make) a data frame with the questions in one column and their order in the second. How can I use that data frame to re-order the columns of the resulting data frame?

Here is an example workflow:

library(XML)
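# All files under the working directory, including subdirectories
filenames = list.files(recursive=TRUE)

# Flatten the first file into a one-row data frame
a = xmlToList(filenames[1])
a = data.frame(t(unlist(a)))

# Merge each remaining file in, keeping all rows and columns
for (i in seq_along(filenames)[-1]){
    b = xmlToList(filenames[i])
    b = data.frame(t(unlist(b)))
    a = merge(a,b,all=TRUE)
    print(paste("merged ",i))
}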

Now for the reproducible part; here is an example a and b:

a = data.frame(v1=2, v5=2, v8=1)
b = data.frame(v1=2, v2=4, v8=3)
m = merge(a,b,all=TRUE)

Which gives me:

R> m
  v1 v8 v5 v2
1  2  1  2 NA
2  2  3 NA  4

I have (can make) a data frame that looks like this:

orderframe = data.frame(varname=c("v1", "v2", "v5", "v8"),
                         order = c(1, 2, 5, 8))

How can I use orderframe to sort the columns of m?


Solution

  • I'm not quite sure what you mean by "sort the columns". If you just mean re-arranging the columns so that column 1 is v1, column 2 is v2, column 3 is v5, etc., then this should do it:

    m[,as.character(orderframe$varname)]
    

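    The as.character call is needed when varname is a factor, which is the default for string columns created by data.frame() in R versions before 4.0. Indexing with a factor would use its integer codes rather than its labels. The call is harmless if varname is already a character vector.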
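
The one-liner above relies on the rows of orderframe already being in the desired order, and on every listed variable being present in m. A minimal sketch that instead sorts by the order column and tolerates mismatches might look like the following. The shuffled orderframe and the extra variable v9 are assumptions made up for illustration, as are the helper names wanted, present, and extra.

```r
# Example data from the question
a <- data.frame(v1 = 2, v5 = 2, v8 = 1)
b <- data.frame(v1 = 2, v2 = 4, v8 = 3)
m <- merge(a, b, all = TRUE)

# Hypothetical orderframe: rows deliberately shuffled, plus a question (v9)
# that no respondent answered and that is therefore absent from m
orderframe <- data.frame(varname = c("v8", "v1", "v9", "v5", "v2"),
                         order   = c(8, 1, 9, 5, 2))

# Variable names sorted by the order column (as.character covers the factor case)
wanted <- as.character(orderframe$varname[order(orderframe$order)])

# Keep only names that actually exist in m, preserving the sorted order
present <- wanted[wanted %in% names(m)]

# Columns of m that orderframe does not mention go at the end
extra <- setdiff(names(m), wanted)

m_sorted <- m[, c(present, extra), drop = FALSE]
names(m_sorted)
```

Filtering with %in% avoids the "undefined columns selected" error that m[, wanted] would raise for v9, and drop = FALSE keeps the result a data frame even if only one column survives.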