I often categorise times into day/night using cut(). Because cut() doesn't understand that clock times wrap around zero, I first divide the hours into three groups (night on either side of day), and then merge the two "night" factor levels. This can be done by giving the same "night" value twice to levels(). E.g.
x <- c(4, 10, 23) # i.e. 4 am, 10 am, 11 pm
x <- cut(x
, breaks = c(0, 6, 22, 23)
, include.lowest = FALSE
, labels = c("night2", "day", "night1"))
# [1] night2 day night1
# Levels: night2 day night1
levels(x) <- c("night", "day", "night")
x
# [1] night day night
# Levels: night day
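As an aside, the breaks above leave hour 0 (and anything after 23) outside every interval, so those values would come back as NA. A minimal base R sketch of the same merge over the full 24-hour range, assuming hours are recorded as numbers from 0 to 24 (the names `hours` and `period` are only illustrative):

```r
# Hours on a 24-hour clock, including midnight (0)
hours <- c(0, 4, 10, 23)

# Three intervals: [0, 6], (6, 22], (22, 24]; include.lowest = TRUE keeps hour 0
period <- cut(hours,
              breaks = c(0, 6, 22, 24),
              include.lowest = TRUE,
              labels = c("night2", "day", "night1"))

# Assigning a duplicated label merges the two night levels into one
levels(period) <- c("night", "day", "night")
print(period)
```

Whether hours 6 and 22 themselves should count as night or day depends on the data; here the intervals are closed on the right, as in cut()'s default.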
Now I'm trying to do the same thing with a huge dataset in an ff object:
require(ff)
require(ffbase)
y <- ff(c(4, 10, 23))
y <- ff(cut(y
, breaks = c(0, 6, 22, 23)
, include.lowest = FALSE
, labels = c("night2", "day", "night1")))
y
# ff (open) integer length=3 (3) levels: night2 day night1
# [1] [2] [3]
# night2 day night1
levels(y) <- c("night", "day", "night")
y
# ff (open) integer length=3 (3) levels: night day night
# [1] [2] [3]
# night day night
Note that in this case, levels() has retained three factor levels, two of which have the same label. recodeLevels looked promising but doesn't quite do the same thing:
y <- recodeLevels(y, c("night", "day", "night"))
y
# ff (open) integer length=3 (3) levels: night day night
# [1] [2] [3]
# NA day NA
I've also tried duplicate "night" labels within cut() (actually cut.ff()), but it still returns three levels, plus a warning that duplicate levels in factors are deprecated.
Thanks for your advice.
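This might be what you are looking for. Use recodeLevels from package ff: first assign the duplicated labels with levels(), then recode to the unique set of levels and reset the levels to that set: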
require(ff)
y <- c(4, 10, 23)
y <- ff(cut(y, breaks = c(0, 6, 22, 23), include.lowest = FALSE,
labels = c("night2", "day", "night1")))
levels(y) <- c("night", "day", "night")
alllevs <- c("night", "day")
y <- recodeLevels(y, alllevs)
levels(y) <- alllevs
y
# ff (open) integer length=3 (3) levels: night day
# [1] [2] [3]
# night day night
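To see why the recoding step is needed, it may help to spell out what merging levels by label amounts to on the underlying integer codes. The following is only a base R illustration of the idea, not a description of how ff or recodeLevels work internally; all variable names are hypothetical:

```r
# Levels as produced by cut(), and the label each one should end up with
old_levels <- c("night2", "day", "night1")
merged     <- c("night", "day", "night")

# The unique set of target levels: "night", "day"
new_levels <- unique(merged)

# Integer codes as a factor would store them for the hours 4, 10, 23
codes <- c(1L, 2L, 3L)

# Map each old code to its merged label, then to the position of that
# label in the unique level set
new_codes <- match(merged[codes], new_levels)

# Rebuild a factor with only two levels
result <- factor(new_levels[new_codes], levels = new_levels)
print(result)
```

Relabelling alone changes the names attached to the three codes, while the codes themselves still distinguish three levels; collapsing them requires remapping the codes as well.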