Why is it not possible to replace part of a global vector in this example?


R version 4.5.0 (2025-04-11 ucrt) RGui console in Windows 11.

I happened to come across the following problem. Consider the global assignment statement:

xxx <<- c(1,2)
xxx[1:2]
ls()

This gives

[1] 1 2
[1] "xxx"

However, this example:

xxx <<- c(1,2)
xxx[1:2]
ls()
xxx[1:2] <<- c(11, 22)

results in

Error: object 'xxx' not found

Q: Why is object xxx not found? Where can I find the relevant documentation?

Note: The above statements were executed immediately after startup in the R console for Windows 11. This issue may occur when copying instructions from a function to the global environment.

Environment

  globalenv()
# <environment: R_GlobalEnv>
  environment()
# <environment: R_GlobalEnv>

Solution

  • I think this is a result of bad design in R (which might be excused because it was inherited from S, I'm not sure). Here's what is going on:

    When you run

    xxx <<- c(1,2)
    

    R will look for xxx in the ancestors of the current environment, starting with its parent. (The current environment is what environment() prints, here <environment: R_GlobalEnv>.) This search fails, because there's no xxx higher up in the environment tree. At this point, a good design would signal an error and you wouldn't try that again.

    However, R doesn't signal an error; it defaults to making the assignment in globalenv(). This leads to tons of confusion, e.g. to people thinking <<- means "global assignment", when really it means "assignment to an ancestor environment or maybe global". It would be better if it were just "assignment to an ancestor environment".

    Your second assignment is more complicated, because it assigns to a subset of xxx. If you had used the regular assignment operator

    xxx[1:2] <- c(11, 22)
    

    R would go through several steps. As the R Language Definition manual explains, these are equivalent to

    `*tmp*` <- xxx
    xxx <- "[<-"(`*tmp*`, 1:2, value=c(11, 22))
    rm(`*tmp*`)
    

    The manual also explains what happens when you do both, as in your

    xxx[1:2] <<- c(11, 22)
    

    though I think there are typos. Correcting those, this is equivalent to

    `*tmp*` <- get("xxx", envir=parent.env(environment()), inherits=TRUE)
    `*tmp*`[1:2] <- c(11, 22)
    xxx <<- `*tmp*`
    rm(`*tmp*`)
    

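    and here the first line fails, because xxx is neither in the parent of the current environment nor in any of that parent's ancestors (the search starts above the global environment, where xxx actually lives). This is a good thing. We should have had a failure in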

    xxx <<- c(1,2)
    

    too, but we don't because of the bad design.
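
The behaviour described in this answer can be reproduced with a minimal sketch like the one below. It evaluates each statement explicitly in globalenv() through eval(quote(...)), which is only a device to make the result independent of where the code is pasted or sourced. The names top, created_globally, msg, after_plain, after_super and update_xxx are hypothetical helpers, not part of the question, and the sketch assumes that no object called xxx exists on the search path beforehand.

```r
# Evaluate everything explicitly in the global environment.
top <- globalenv()

# 1. Plain superassignment: no `xxx` exists in any ancestor of the global
#    environment, so R falls back to creating it in the global environment.
eval(quote(xxx <<- c(1, 2)), top)
created_globally <- exists("xxx", envir = top, inherits = FALSE)

# 2. Subset superassignment: R first reads `xxx` starting from the PARENT of
#    the global environment, where it does not exist, so this signals an error.
msg <- tryCatch(
  eval(quote(xxx[1:2] <<- c(11, 22)), top),
  error = function(e) conditionMessage(e)
)
print(msg)

# 3. The ordinary operator works at top level, because it reads and writes
#    `xxx` in the current (global) environment.
eval(quote(xxx[1:2] <- c(11, 22)), top)
after_plain <- get("xxx", envir = top)

# 4. Inside a function, `<<-` works: the search starts in the function's
#    enclosing environment and reaches the global `xxx`.
update_xxx <- function() {
  xxx[1:2] <<- c(111, 222)
  invisible(NULL)
}
update_xxx()
after_super <- get("xxx", envir = top)
print(after_super)
```

Steps 3 and 4 suggest the practical rule: when moving code from a function body to the top level, change <<- to <- for anything that modifies part of an existing object.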