In R (4.4.0), I found that the base function merge accepts an argument iby in place of by. RGui with vanilla settings reproduces the same result on two different computers I have.
However, when I inspect the source code of base::merge with getAnywhere(merge.data.frame), there is no mention of an iby argument. Thus, I have no idea why a call using the iby argument runs successfully.
Can anybody tell me why this is the case? What is the mechanism behind it?
df1 <- data.frame(ID = c(1, 2, 3), Name = c("Alice", "Bob", "Charlie"), stringsAsFactors = FALSE)
df2 <- data.frame(ID = c(1, 2, 4), Age = c(24, 25, 26), stringsAsFactors = FALSE)
result_iby <- merge(df1, df2, iby = "ID")
result_by <- merge(df1, df2, by = "ID")
print(result_iby)
OK, I have a theory. First of all, this works too:
result_garbage <- merge(df1, df2, garbage = "ID")
There are two reasons this looks like it works.
1. merge has a ... argument that will swallow any unrecognized arguments.
2. The default value of the by argument is intersect(names(x), names(y)), which happens to be "ID" in this case.
So the iby argument is being ignored, but the same value is being filled in by the default.
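Here is a minimal sketch of that mechanism. toy_merge is a hypothetical function made up for illustration (it is not part of base R); it only copies the relevant shape of the merge.data.frame signature, i.e. a by argument with the same default, followed by ...:

```r
# Hypothetical toy function (NOT base R): same 'by' default as merge.data.frame,
# followed by '...' which silently collects anything unrecognized.
toy_merge <- function(x, y, by = intersect(names(x), names(y)), ...) {
  extras <- list(...)                      # everything that matched no formal argument
  list(by = by, ignored = names(extras))   # report what was used and what was swallowed
}

df1 <- data.frame(ID = c(1, 2, 3), Name = c("Alice", "Bob", "Charlie"))
df2 <- data.frame(ID = c(1, 2, 4), Age = c(24, 25, 26))

res <- toy_merge(df1, df2, iby = "ID")
print(res$by)       # the default kicked in: the shared column name
print(res$ignored)  # the misspelled argument ended up in '...'

# The real method's formal arguments: 'by' and '...' are there, 'iby' is not.
print(names(formals(merge.data.frame)))
```

Since iby is not a prefix of any formal argument, partial matching does not apply either, so it can only land in ....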
A test of this is that merge(df1, df2, by = "garbage") throws "Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column", while merge(df1, df2, iby = "garbage") works fine (merges on "ID").
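That test can be run as a short script. This is just a sketch using the same data frames as in the question; the variable names are mine, and the exact wording of the error message may vary with locale:

```r
df1 <- data.frame(ID = c(1, 2, 3), Name = c("Alice", "Bob", "Charlie"))
df2 <- data.frame(ID = c(1, 2, 4), Age = c(24, 25, 26))

# A bad value for the real 'by' argument is an error; capture the message.
by_garbage <- tryCatch(
  merge(df1, df2, by = "garbage"),
  error = function(e) conditionMessage(e)
)
print(by_garbage)

# The same bad value under a misspelled name is silently discarded ...
iby_garbage <- merge(df1, df2, iby = "garbage")

# ... so the result matches a call that passes no 'by' at all.
default_merge <- merge(df1, df2)
print(identical(iby_garbage, default_merge))
```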
Arguably R should be more helpful about reporting when unrecognized arguments are passed through ... and discarded ... As people have realized this, functions like rlang::check_dots_used (and similar functions in other packages) have become more widely used ...
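To illustrate the kind of check meant here without depending on any package, below is a rough base-R sketch. strict_merge is a hypothetical wrapper of my own (it is not how rlang implements its check, and it is not part of base R); it assumes data frame inputs and simply refuses any named argument that merge.data.frame does not declare:

```r
# Hypothetical wrapper (not base R, not rlang): error on unknown named arguments
# instead of letting '...' swallow them. Assumes x and y are data frames.
strict_merge <- function(x, y, ...) {
  arg_names <- names(list(...))
  known <- setdiff(names(formals(merge.data.frame)), "...")
  # Keep only named arguments, then drop the ones the method declares.
  # Note: this is stricter than R itself, since abbreviated names are rejected too.
  unknown <- setdiff(arg_names[nzchar(arg_names)], known)
  if (length(unknown) > 0) {
    stop("unrecognized argument(s): ", paste(unknown, collapse = ", "))
  }
  merge(x, y, ...)
}

df1 <- data.frame(ID = c(1, 2, 3), Name = c("Alice", "Bob", "Charlie"))
df2 <- data.frame(ID = c(1, 2, 4), Age = c(24, 25, 26))

ok <- strict_merge(df1, df2, by = "ID")    # behaves like merge()
bad <- tryCatch(
  strict_merge(df1, df2, iby = "ID"),      # the typo is now reported
  error = function(e) conditionMessage(e)
)
print(bad)
```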