r gsub xtable

How can I use gsub in a user-supplied input to xtable's sanitize.text.function argument?


I have a table of summary statistics created in R. Rows correspond to variables, columns to different samples. I want to export this table to LaTeX using the xtable package. However, some variables are on a much larger scale, so I would like to round only those variables. I have tried passing a user-supplied function to xtable's sanitize.text.function argument to do this:

library(data.table)
library(xtable)

dt <- data.table(sample1 = c(1.11, 2222.22), sample2 = c(3.33, 44444.44))  # data for MWE
rownames(dt) <- c('var 1', 'var 2')
xt <- print(xtable(dt),
            format.args=list(big.mark=","),
            sanitize.text.function = \(x) gsub('([0-9]+,[0-9]{3})\\.[0-9]{2}', '\\1', x)) 

# OUTPUT:
# \begin{table}[ht]
# \centering
# \begin{tabular}{rrr}
#   \hline
#  & sample1 & sample2 \\ 
#   \hline
# var 1 & 1.11 & 3.33 \\ 
#   var 2 & 2,222.22 & 44,444.44 \\ 
#    \hline
# \end{tabular}
# \end{table}

However, the variables are not rounded in the output table as I would like them to be. Calling the same gsub on the string returned by print.xtable works:

gsub('([0-9]+,[0-9]{3})\\.[0-9]{2}', '\\1', xt)
[1] "\\begin{table}[ht]\n\\centering\n\\begin{tabular}{rrr}\n  \\hline\n & sample1 & sample2 \\\\ \n  \\hline\nvar 1 & 1.11 & 3.33 \\\\ \n  var 2 & 2,222 & 44,444 \\\\ \n   \\hline\n\\end{tabular}\n\\end{table}\n"

So why doesn't this work directly in the call to print.xtable? Having to save the printed output and then call gsub on it is a pain.

Bonus: I am trying this approach because, as far as I can see, xtable only lets me set the number of decimal places for a whole column, not for a whole row. Any approach that allows me to fix the number of decimal places for a row would also solve my problem.


Solution

  • As pointed out in @ZéLoff's answer, sanitize.text.function is only applied to character columns, so I would need to convert the input columns to character to use it. However, then I cannot use big.mark for large numbers. An alternative is to pre-process the numeric columns using round, floor, or trunc, depending on the desired output. With no further processing, the print.xtable output still includes trailing zeros, even though the inputs have been rounded. It is better to pass arguments to formatC through format.args (here drop0trailing) so that the processed numeric columns look as I want in the output table.

    For this example:

    library(data.table)
    library(xtable)

    dt <- data.table(sample1 = c(1.11, 2222.22), sample2 = c(3.33, 44444.44))  # data for MWE
    round_cols <- c(1, 2)
    dt[2, (round_cols) := lapply(.SD, round), .SDcols = round_cols]
    rownames(dt) <- c('var 1', 'var 2')
    xt <- print(xtable(dt),
                format.args = list(big.mark = ",",
                                   drop0trailing = TRUE),  # drop the ".00" left after rounding
                comment = FALSE)
    
    \begin{table}[ht]
    \centering
    \begin{tabular}{rrr}
      \hline
     & sample1 & sample2 \\ 
      \hline
    var 1 & 1.11 & 3.33 \\ 
      var 2 & 2,222 & 44,444 \\ 
       \hline
    \end{tabular}
    \end{table}
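
For the per-row part of the question, another route is to format each row yourself with formatC before handing the table to xtable. The sketch below is only an illustration: format_rows and digits_by_row are hypothetical names, not part of xtable, and it uses a plain data.frame rather than a data.table so that row names are kept. Because the columns become character, xtable prints them as they are, and big.mark still applies since formatC adds it.

```r
# Hypothetical helper (not part of xtable): format every row of a numeric
# data frame with its own number of decimal places.
format_rows <- function(df, digits_by_row, big.mark = ",") {
  stopifnot(length(digits_by_row) == nrow(df))
  out <- lapply(df, function(col) {
    # Pair each value in the column with the digits chosen for its row.
    mapply(function(value, d) {
      formatC(value, format = "f", digits = d, big.mark = big.mark)
    }, col, digits_by_row, USE.NAMES = FALSE)
  })
  as.data.frame(out, row.names = rownames(df), stringsAsFactors = FALSE)
}

df <- data.frame(sample1 = c(1.11, 2222.22),
                 sample2 = c(3.33, 44444.44),
                 row.names = c("var 1", "var 2"))

# Two decimals for the first row, none for the second.
formatted <- format_rows(df, digits_by_row = c(2, 0))

# Print as LaTeX only if xtable is installed; align = "rrr" keeps the
# character columns right-aligned like numeric ones.
if (requireNamespace("xtable", quietly = TRUE)) {
  print(xtable::xtable(formatted, align = "rrr"), comment = FALSE)
}
```

The trade-off is that the columns are no longer numeric, so format.args and digits in xtable have no effect on them; all number formatting has to happen inside the helper.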