Under R version 4.4.2 (2024-10-31) -- "Pile of Leaves", on the latest macOS:
$ R --vanilla
> load(file="tttdf")
> str(ttt)
'data.frame': 3 obs. of 17 variables:
$ .mn.r : num 0 0 0
$ .sd.r : num 0 0 0
$ .mn.g : num 0 0 0
$ .sd.g : num 0 0 0
$ .cor.r.g : num 1 1 1
$ sep : num -1 -1 -1
$ beta.g.ldp : num 0 0 0
$ beta.dp.ldp: num 1 1 1
$ beta.r.ldp : num 0 0 0
$ sep : num -2 -2 -2
$ lastdpr : num -3 -5 -6
$ declinedpr : num 0 2 3
$ sep : num -3 -3 -3
$ beta.r.lr : num 0 0 0
$ beta.g.lg : num 0 0 0
$ beta.g.lr : num 0 0 0
$ beta.r.lg : num 0 0 0
> ttt <- within(ttt, hello <- 22)
Error in `[<-.data.frame`(`*tmp*`, nl, value = list(hello = 22, .mn.r = c(0, :
duplicate subscripts for columns
> ## make it work
> xxx <- ttt[,1:ncol(ttt)]
> xxx <- within(xxx, hello <- 22)
I have no idea what could be causing this, which is also why I can't shorten the example (e.g., by removing columns).
The sep column is duplicated: it appears three times in your str() output. Subsetting the data frame using ttt[, 1:ncol(ttt)] automatically repairs the column names, which resolves the issue.
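If you want to confirm this on your own data, you can check the names for repeats. This is only a minimal sketch: I don't have your file, so ttt_demo is a made-up stand-in for ttt with three sep columns, and the other column names (x, y, z) are placeholders.

```r
# Hypothetical stand-in for the question's data frame (not the real ttt):
# three columns share the name "sep"
ttt_demo <- data.frame(x = 0, sep = -1, y = 1, sep = -2, z = 0, sep = -3,
                       check.names = FALSE)

# anyDuplicated() returns 0 if all names are unique, otherwise the
# position of the first repeated name
first_dup <- anyDuplicated(names(ttt_demo))
first_dup

# Names that occur more than once
dup_names <- unique(names(ttt_demo)[duplicated(names(ttt_demo))])
dup_names

# How often each duplicated name occurs
table(names(ttt_demo))[dup_names]
```

A non-zero result from anyDuplicated() means within() is likely to hit the same "duplicate subscripts for columns" error.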
In the following example, I create a data frame with two identical column names. It produces the same error you get. Once I subset the columns, their names are made unique.
df <- data.frame(a = 1, a = 2, check.names = FALSE)
within(df, hello <- 22)
# Error in `[<-.data.frame`(`*tmp*`, nl, value = list(hello = 22, a = 1, :
# duplicate subscripts for columns
df[1:ncol(df)]
# a a.1
# 1 1 2
Explanation:
The behavior that subsetting produces unique names is documented in help(`[.data.frame`); column names will be transformed to be unique, using make.unique(), if necessary (e.g., if columns are selected more than once, or if more than one column of a given name is selected if the data frame has duplicate column names). Also see help(make.names), which additionally produces 'valid' names.
> make.unique(names(df))
[1] "a" "a.1"
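Instead of relying on the side effect of subsetting, you could also repair the names explicitly before calling within(). A minimal sketch using the same toy data frame as above; whether renaming or dropping the extra sep columns is the right choice for your data is something only you can decide.

```r
# Same toy data frame as above: two columns named "a"
df <- data.frame(a = 1, a = 2, check.names = FALSE)

# Make the names unique in place ("a", "a.1")
names(df) <- make.unique(names(df))

# within() now runs without the "duplicate subscripts" error
df <- within(df, hello <- 22)
names(df)
```

This makes the repair visible in the code, rather than hidden in an otherwise no-op subset such as ttt[, 1:ncol(ttt)].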