Tags: r, null, levels

Levels function returning NULL


I'm hoping this is an easy fix. Whenever I run levels(df), I get NULL as the output. This isn't specific to one data frame; it happens with any data set I use. I suspect there may be an issue with one of my packages. Has anyone run into this, or does anyone know of a fix? Thanks


Solution

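  • levels() only returns something useful for a factor vector. A data frame has no levels of its own, so levels(df) returns NULL. This is normal behavior, not a package problem.

    Example: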

    > df <- data.frame(a = factor(c('a','b','c'), levels = c('a','b','c','d','e')),
    +                  b = factor(c('a','b','c')), 
    +                  c = factor(c('a','a','c')))
    > levels(df)
    NULL
    

    To see the levels of every column in your data frame, you can use lapply():

    > lapply(df, levels)
    $a
    [1] "a" "b" "c" "d" "e"
    
    $b
    [1] "a" "b" "c"
    
    $c
    [1] "a" "c"
    

    If you want the levels of a specific column, you can specify that instead:

    > levels(df[, 2])
    [1] "a" "b" "c"
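
    If the data frame mixes factor and non-factor columns, levels() also returns NULL for the non-factor ones. A minimal sketch of restricting the lookup to factor columns, assuming a hypothetical data frame df2 (its column names are made up for illustration):

    ```r
    # Hypothetical data frame with one factor, one numeric and one character column
    df2 <- data.frame(a = factor(c("a", "b", "c"), levels = c("a", "b", "c", "d", "e")),
                      n = c(1, 2, 3),
                      s = c("x", "y", "z"),
                      stringsAsFactors = FALSE)

    # Non-factor columns have no levels attribute
    levels(df2$n)   # NULL
    levels(df2$s)   # NULL

    # Keep only the factor columns, then ask each one for its levels
    factor_cols <- Filter(is.factor, df2)
    lvls <- lapply(factor_cols, levels)
    lvls
    ```

    Filter() applied to a data frame keeps the columns for which the predicate is TRUE, so lvls here only has an entry for column a.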
    

    EDIT: To answer question below on why apply(df, 2, levels) returns NULL.

    Note the following from the documentation for apply():

    "In all cases the result is coerced by as.vector to one of the basic vector types before the dimensions are set, so that (for example) factor results will be coerced to a character array."

    You can see the coercion by checking the class of each column as apply() sees it, and by trying a few other functions:

    > apply(df, 2, levels)
    NULL
    > apply(df, 2, class)
              a           b           c 
    "character" "character" "character" 
    > apply(df, 2, function(i) levels(i))
    NULL
    > apply(df, 2, function(i) levels(factor(i)))
    $`a`
    [1] "a" "b" "c"
    
    $b
    [1] "a" "b" "c"
    
    $c
    [1] "a" "c"
    

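    Note that even though we can force apply() to treat the columns as factors, we lose the original levels and ordering that were set when df was created (see column `a`, which no longer has the levels "d" and "e"). This is because apply() first converts the data frame to a matrix with as.matrix(), so each column reaches the function as a plain character vector.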
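
    The difference between apply() and lapply()/sapply() can be checked directly. This is a minimal sketch that rebuilds the same df as above; the variable names m, classes and counts are only illustrative:

    ```r
    df <- data.frame(a = factor(c("a", "b", "c"), levels = c("a", "b", "c", "d", "e")),
                     b = factor(c("a", "b", "c")),
                     c = factor(c("a", "a", "c")))

    # apply() works on the matrix form of the data frame,
    # and a matrix built from factor columns holds character data
    m <- as.matrix(df)
    typeof(m)                     # "character"

    # lapply()/sapply() iterate over the columns themselves,
    # so each column is still a factor with its original levels
    classes <- sapply(df, class)  # "factor" for a, b and c
    counts  <- sapply(df, nlevels)
    counts                        # a = 5, b = 3, c = 2
    ```

    Because the columns stay factors, column a still reports all five levels, including the unused "d" and "e".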