r parsing substitution

How does `substitute` choose the deparsed representation of objects?


base::substitute's documentation describes it as follows:

Substitution takes place by examining each component of the parse tree as follows: If it is not a bound symbol in ‘env’, it is unchanged. If it is a promise object, ... the expression slot of the promise replaces the symbol. If it is an ordinary variable, its VALUE is substituted, unless ‘env’ is .GlobalEnv in which case the symbol is left unchanged.
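
To make those three cases concrete, here is a minimal sketch of how I read that passage. The names f, arg, e, x and y are made up for illustration and do not come from the documentation.

```r
# Illustrative names only: f, arg, e, x and y are made up for this sketch.

# 1. Promise: inside a function, an argument is a promise, so its
#    expression slot replaces the symbol.
f <- function(arg) substitute(arg)
promise_case <- f(a + b)
print(promise_case)

# 2. Ordinary variable bound in env: its value is substituted.
#    y is not bound in e, so it is left unchanged.
e <- new.env()
e$x <- 1
bound_case <- substitute(x + y, e)
print(bound_case)

# 3. env is the global environment: the symbol is left unchanged,
#    whether or not x is bound there.
global_case <- substitute(x + 1, globalenv())
print(global_case)
```

In the second case the returned call contains the number 1 itself where the symbol x used to be, which is what leads to the question about "value" below.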

In the final sentence, "value" is ambiguous: we need to put a deparsed representation of the object into the returned language object. But how is this representation chosen?

I assumed it would use base::deparse, but the results are not consistent with that:

e <- new.env()
e$df <- tibble::tibble(1:5)
e$env <- new.env()
e$x <- 1

substitute(x + 1, e)    # 1 + 1
substitute(df + 1, e)   # list(`1:5` = 1:5) + 1 (does not match deparse(e$df))
substitute(env + 1, e)  # <environment> + 1     (does match deparse(e$env))

# if expr to substitute is a name, return the object itself 
# (no need to return a language object, no need for deparsing)
identical(substitute(df, e), e$df)
identical(substitute(env, e), e$env) 

Solution

  • I don't see any ambiguity. "value" refers to the value bound to the symbol, that is, the object itself. The documentation clearly says:

    substitute returns the parse tree for the (unevaluated) expression expr, substituting any variables bound in env.

    Don't confuse what is printed with the actual representation of "value".

    e1 <- new.env()
    e1$DF <- data.frame(a = 1)
    
    x <- substitute(DF + 1, env = e1)
    
    print(x)
    #list(a = 1) + 1
    
    as.list(x)
    # [[1]]
    # `+`
    # 
    # [[2]]
    #   a
    # 1 1
    # 
    # [[3]]
    # [1] 1
    
    str(as.list(x)[[2]])
    #'data.frame':  1 obs. of  1 variable:
    #  $ a: num 1
    
    
    e1$env <- new.env()
    x <- substitute(env + 1, env = e1)
    
    print(x)
    #<environment> + 1
    
    as.list(x)
    # [[1]]
    # `+`
    # 
    # [[2]]
    # <environment: 0x0000025fe3a9f040>
    # 
    # [[3]]
    # [1] 1
    
    call("+", e1$env, 1)
    #<environment> + 1
    

    So maybe your question boils down to how calls are printed? Answering that properly would require a dive into the internal source code of print.default. Deparsing is involved, and I believe the printed result is equivalent to deparse1(x, control = "niceNames").

    deparse1(x, control = "niceNames")
    #[1] "<environment> + 1"
    
    x <- substitute(DF + 1, env = e1)
    deparse1(x, control = "niceNames")
    #[1] "list(a = 1) + 1"
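
    To underline that the call holds the object itself and not a deparsed version of it, here is a minimal sketch reusing the DF example from above. The variable names result, reparsed and reparsed_error are only illustrative.

    ```r
    e1 <- new.env()
    e1$DF <- data.frame(a = 1)

    x <- substitute(DF + 1, env = e1)

    # The second element of the call is the data frame itself,
    # not text and not a call to list().
    same_object <- identical(x[[2]], e1$DF)
    class(x[[2]])

    # Evaluating the call works on the embedded object directly;
    # no variable named DF is looked up at this point.
    result <- eval(x)

    # The printed/deparsed text is lossy. Parsing it back gives a call
    # to list() in that position, so the data frame is gone.
    reparsed <- parse(text = deparse1(x, control = "niceNames"))[[1]]
    class(reparsed[[2]])

    # Evaluating the re-parsed call fails, because list(a = 1) + 1
    # is not a valid arithmetic operation.
    reparsed_error <- tryCatch(eval(reparsed), error = function(err) err)
    inherits(reparsed_error, "error")
    ```

    So the text you see when the call is printed is only a display of the language object, while the object that was substituted in is kept unchanged.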