I have searched many times, but none of the results did what I want.
A sample dataset is provided below:
year = c(1991,1996,2001,2006,2011,2016,2021)
factor(year, levels = c(1991,1996,2001,2011,2016,2021))
The result was:
[1] 1991 1996 2001 <NA> 2011 2016 2021
Levels: 1991 1996 2001 2011 2016 2021
I want 2006 to belong to the same level as 2001, so my desired outcome is:
[1] 1991 1996 2001 2006 2011 2016 2021
Levels: 1991 1996 2001 2011 2016 2021
Is it possible to give 2006 the same level as 2001 without changing the original content of the vector year?
If you dig into the source code of factor, I think the answer becomes clear (and I believe the answer to your question is "No"):
> factor
function (x = character(), levels, labels = levels, exclude = NA,
    ordered = is.ordered(x), nmax = NA)
{
    if (is.null(x))
        x <- character()
    nx <- names(x)
    if (missing(levels)) {
        y <- unique(x, nmax = nmax)
        ind <- order(y)
        levels <- unique(as.character(y)[ind])
    }
    force(ordered)
    if (!is.character(x))
        x <- as.character(x)
    levels <- levels[is.na(match(levels, exclude))]
    f <- match(x, levels)
    if (!is.null(nx))
        names(f) <- nx
    if (missing(labels)) {
        levels(f) <- as.character(levels)
    }
    else {
        nlab <- length(labels)
        if (nlab == length(levels)) {
            nlevs <- unique(xlevs <- as.character(labels))
            at <- attributes(f)
            at$levels <- nlevs
            f <- match(xlevs, nlevs)[f]
            attributes(f) <- at
        }
        else if (nlab == 1L)
            levels(f) <- paste0(labels, seq_along(levels))
        else stop(gettextf("invalid 'labels'; length %d should be 1 or %d",
            nlab, length(levels)), domain = NA)
    }
    class(f) <- c(if (ordered) "ordered", "factor")
    f
}
<bytecode: 0x00000186f0fe3640>
<environment: namespace:base>
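As we can see, levels is either derived from the unique values of x (when the levels argument is not provided) or taken from the given levels after dropping excluded values, levels[is.na(match(levels, exclude))]. Each element of x is then looked up with match(x, levels), so a value that does not appear in levels becomes NA. That means it is not possible for two different x values to share a single level while each still shows its own original value.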
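To make the lookup step concrete, here is a minimal sketch that reproduces the match() call by hand and then shows one possible alternative, assuming it is acceptable for 2006 to be displayed as 2001 inside the factor. The names lv, codes, merged and df are introduced only for this illustration. Merging through duplicated labels relies on the unique(xlevs <- as.character(labels)) branch visible in the source above, so it assumes an R version whose factor() contains that branch.

```r
year <- c(1991, 1996, 2001, 2006, 2011, 2016, 2021)
lv   <- c(1991, 1996, 2001, 2011, 2016, 2021)   # hypothetical name: levels without 2006

# factor() stores, for each element, its position in levels.
# 2006 is not in lv, so its code is NA.
codes <- match(as.character(year), as.character(lv))
print(codes)

# Possible alternative (an assumption about what is acceptable):
# list 2006 as a level, but give it the same label as 2001.
# Duplicated labels are collapsed into one level.
merged <- factor(year,
                 levels = c(1991, 1996, 2001, 2006, 2011, 2016, 2021),
                 labels = c(1991, 1996, 2001, 2001, 2011, 2016, 2021))
print(levels(merged))
print(as.integer(merged))

# The original vector is untouched, so both can be kept side by side.
df <- data.frame(year = year, group = merged)
print(df)
```

In this sketch the fourth element of merged prints as 2001, not 2006, which is consistent with the "No" above: the factor can group 2006 with 2001, but it cannot keep showing 2006 without 2006 being a level. The original values remain available in year.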