Tags: r, abstract-syntax-tree, symbolic-math, arithmetic-expressions

Conversion of an abstract syntax tree with R


Given an arithmetic expression, for example x + y*z, I want to convert it to add(x, multiply(y, z)).

I found a helpful function here:

> getAST <- function(ee) purrr::map_if(as.list(ee), is.call, getAST)
> getAST(quote(x + y*z)) 
[[1]]
`+`

[[2]]
x

[[3]]
[[3]][[1]]
`*`

[[3]][[2]]
y

[[3]][[3]]
z

Calling rapply(result, as.character, how = "list") on this result gives character strings instead of symbols.

How can I get add(x, multiply(y, z)) from this AST? It becomes more complicated when the expression contains parentheses:

> getAST(quote((x + y) * z)) 
[[1]]
`*`

[[2]]
[[2]][[1]]
`(`

[[2]][[2]]
[[2]][[2]][[1]]
`+`

[[2]][[2]][[2]]
x

[[2]][[2]][[3]]
y



[[3]]
z

The answer does not have to use the getAST function; it is just one possible way to go.

Of course in my real use case the expressions are longer.


Here is a solution (I think) for the case when there are no parentheses:

getAST <- function(ee) purrr::map_if(as.list(ee), is.call, getAST)

ast <- rapply(getAST(quote(x + y*z)), as.character, how = "list")

convertAST <- function(ast) {
  # Map the operator to a function name; fail loudly on anything else, e.g. "("
  op <- switch(
    ast[[1]],
    "+" = "add",
    "-" = "subtract",
    "*" = "multiply",
    "/" = "divide",
    stop("unsupported operator: ", ast[[1]])
  )
  # An operand is either a leaf (a character string) or a nested list to recurse into
  operand <- function(x) if (is.character(x)) x else convertAST(x)
  sprintf("%s(%s, %s)", op, operand(ast[[2]]), operand(ast[[3]]))
}

convertAST(ast)
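
One possible way to extend this to expressions with parentheses is to recurse over the quoted expression itself rather than over the list from getAST. The sketch below is only an illustration: to_prefix is a hypothetical name, and it assumes the function position of every call is a plain symbol (which holds for ordinary arithmetic). A parenthesised sub-expression is a call to `(` with a single argument, so it is simply unwrapped.

```r
# Hypothetical helper (not from any package): turn a quoted arithmetic
# expression into a string in function-call notation.
to_prefix <- function(e) {
  ops <- c("+" = "add", "-" = "subtract", "*" = "multiply", "/" = "divide")
  # Leaves (symbols and constants) are deparsed unchanged
  if (!is.call(e)) return(deparse(e))
  head <- as.character(e[[1]])  # assumes e[[1]] is a symbol
  # `(` is a call with one argument: drop it and convert what is inside
  if (identical(head, "(")) return(to_prefix(e[[2]]))
  # Convert every argument recursively
  args <- vapply(as.list(e)[-1], to_prefix, character(1))
  # Unknown function names are kept as they are
  name <- if (head %in% names(ops)) ops[[head]] else head
  sprintf("%s(%s)", name, paste(args, collapse = ", "))
}

to_prefix(quote(x + y*z))
## [1] "add(x, multiply(y, z))"
to_prefix(quote((x + y) * z))
## [1] "multiply(add(x, y), z)"
```

Note that a unary minus such as -x would come out as subtract(x) with this sketch, which may or may not be what is wanted.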

Solution

  • We can use substitute like this:

    subst <- function(e, sub = list(`+` = "add", 
                                    `-` = "minus",
                                    `/` = "divide",
                                    `*` = "multiply")) {
      sub <- Map(as.name, sub)
      do.call("substitute", list(e, sub))
    }
    
    # test
    e <- quote(x + (y + 1) * z)
    res <- subst(e); res
    ## add(x, multiply((add(y, 1)), z))
    
    # evaluate test against values
    add <- `+`; multiply <- `*`; x <- 1; y <- 2; z <- 3
    eval(res)
    ## [1] 10
    

    If you want a character string result, use deparse1:

    deparse1(subst(e))
    ## [1] "add(x, multiply((add(y, 1)), z))"
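
    The result above still contains the redundant parentheses from the input, because `(` is itself a call in the expression. If they are unwanted, they could be stripped after the substitution with a small recursive helper. This is a sketch only: drop_parens is a hypothetical name, and it assumes ordinary calls without empty arguments. Stripping after the substitution is deliberate; once the operators are plain function calls, nesting alone carries the grouping.

    ```r
    # Hypothetical helper: remove every `(` call from a language object
    drop_parens <- function(e) {
      # Symbols and constants are returned unchanged
      if (!is.call(e)) return(e)
      # A parenthesised sub-expression is `(`(inner): keep only the inner part
      if (identical(e[[1]], as.name("("))) return(drop_parens(e[[2]]))
      # Otherwise rebuild the call from its processed components
      as.call(lapply(as.list(e), drop_parens))
    }

    # Applied to an expression of the form returned by subst(e) above
    res <- quote(add(x, multiply((add(y, 1)), z)))
    deparse1(drop_parens(res))
    ## [1] "add(x, multiply(add(y, 1), z))"
    ```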