r mlr

Why won't MLR package create the single classification task for my data?


I am having an issue similar to this person's, but the link to the tutorial they reference seems broken, and my problem relates to a single classifying function, whereas most other posts on this seem to be about multiple classifying functions.

Here is my data:

structure(list(Month_Name = structure(c(10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L), levels = c("April", "December", "February", "January", 
"June", "March", "May", "November", "October", "September"), class = "factor"), 
    Coffee_Cups = c(3, 0, 2, 6, 4, 5, 3, 3, 2, 2, 3, 1, 1, 3, 
    2, 2, 0, 1, 1, 4, 4, 3, 0, 1, 3, 0, 0, 0, 0, 2, 0, 1, 2, 
    3, 2, 2, 4, 3, 6, 6, 3, 4, 6, 8, 3, 5, 0, 2, 2, 8, 6, 4, 
    6, 4, 4, 2, 6, 6, 5, 1, 3, 1, 5, 4, 6, 5, 0, 6, 6, 4, 4, 
    2, 2, 6, 6, 7, 3, 3, 0, 5, 7, 6, 3, 5, 3, 3, 1, 9, 9, 3, 
    3, 6, 6, 6, 3, 0, 7, 6, 6, 3, 9, 3, 8, 8, 3, 3, 7, 6, 3, 
    3, 3, 6, 6, 6, 1, 9, 3, 3, 2, 6, 3, 6, 9, 6, 8, 9, 6, 6, 
    6, 0, 3, 0, 3, 3, 6, 3, 0, 9, 3, 0, 2, 0, 6, 6, 6, 3, 6, 
    3, 9, 3, 0, 0, 6, 3, 3, 3, 3, 6, 0, 6, 3, 3, 5, 5, 3, 0, 
    6, 4, 2, 0, 2, 4, 0, 6, 4, 4, 2, 2, 0, 9, 6, 3, 6, 6, 9, 
    0, 6, 6, 6, 6, 6, 6, 3, 3, 0, 9, 6, 3, 6, 3, 6, 1, 6, 6, 
    6, 6, 6, 1, 3, 9, 6, 3, 6, 9, 3, 5, 6, 3, 0, 6, 3, 3, 5, 
    0, 6, 3, 5, 3, 0, 6, 7, 3, 6, 6, 6, 6, 3, 5, 6, 7, 6, 6, 
    4, 6, 4, 5, 5, 6, NA, 8, 6, 6, 6, 9, 3, 3, 9, 7, 8, 4, 3, 
    3, 3, 6, 6, 6, 3, 4, 3, 3, 6, 4, 3, 3, 4, 6, 0, 3, 6, 4, 
    3, 3, 7, 4, 4, 3, 1, 6, 4, 6), Mins_Work = c(435, 350, 145, 
    135, 15, 60, 60, 390, 395, 395, 315, 80, 580, 175, 545, 230, 
    435, 370, 255, 515, 330, 65, 115, 550, 420, 45, 266, 196, 
    198, 220, 17, 382, 0, 180, 343, 207, 263, 332, 0, 0, 259, 
    417, 282, 685, 517, 111, 64, 466, 499, 460, 269, 300, 427, 
    301, 436, 342, 229, 379, 102, 146, NA, 94, 345, 73, 204, 
    512, 113, 135, 458, 493, 552, 108, 335, 395, 508, 546, 396, 
    159, 325, 747, 650, 377, 461, 669, 186, 220, 410, 708, 409, 
    515, 413, 166, 451, 660, 177, 192, 191, 461, 637, 297, 601, 
    586, 270, 479, 0, 480, 397, 174, 111, 0, 610, 332, 345, 423, 
    160, 611, 0, 345, 550, 324, 427, 505, 632, 560, 230, 495, 
    235, 522, 654, 465, 377, 260, 572, 612, 594, 624, 237, 0, 
    38, 409, 634, 292, 706, 399, 568, 0, 694, 298, 616, 553, 
    581, 423, 636, 623, 338, 345, 521, 438, 504, 600, 616, 656, 
    285, 474, 688, 278, 383, 535, 363, 470, 457, 303, 123, 363, 
    329, 513, 636, 421, 220, 430, 428, 536, 156, 615, 429, 103, 
    332, 250, 281, 248, 435, 589, 515, 158, 0, 649, 427, 193, 
    225, 0, 280, 163, 536, 301, 406, 230, 519, 0, 303, 472, 392, 
    326, 368, 405, 515, 308, 259, 769, 93, 517, 261, 420, 248, 
    265, 834, 313, 131, 298, 134, 385, 648, 529, 487, 533, 641, 
    429, 339, 508, 560, 439, 381, 397, 692, 534, 148, 366, 167, 
    425, 315, 476, 384, 498, 502, 308, 360, 203, 410, 626, 593, 
    409, 531, 157, 0, 357, 443, 615, 564, 341, 352, 609, 686, 
    386, 323, 362, 597, 325, 51, 570, 579, 284, 0, 530, 171, 
    640, 263, 112, 217, 152, 203, 394)), row.names = c(NA, -290L
), class = c("tbl_df", "tbl", "data.frame"))

I'm trying to use the makeClassifTask function, but when I use the following code:

task.work <- makeClassifTask(
  data = work,
  target = "class"
)

I get the following error:

Warning in makeTask(type = type, data = data, weights = weights, blocking = blocking,  :
  Provided data is not a pure data.frame but from class tbl_df, hence it will be converted.
Error in makeSupervisedTask("classif", data, target, weights, blocking,  : 
  Column names of data doesn't contain target var: class

I tried manually converting the Month_Name variable to a factor with as.factor and the two numeric variables with as.double, but this does not seem to have fixed the issue. I've also tried using both a regular data frame and a tibble, but the results remain the same. Is there something else I'm missing here?


Solution

  • target expects a column name from your input data.frame, in this case "Month_Name". You passed "class", which does not exist in your object.

    You don't need to define the object class when creating a data.frame, as you did with class = . From your example, it looks like you are confusing setting an object's class in R with passing a (factor) variable to the target argument, which should be the name of a column in the data.frame you're handing over.

    PS: When starting out fresh, you're way better off using {mlr3} instead of {mlr}. The latter has been deprecated for 3 years.
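
    A minimal sketch of the fix, assuming a small made-up stand-in for the work object from the question (the six rows below are illustrative, not the real data). It first checks that the target name is actually a column, then builds the task only if {mlr} happens to be installed:

```r
# Small stand-in for the `work` tibble from the question (values are made up).
work <- data.frame(
  Month_Name  = factor(c("October", "October", "September",
                         "September", "November", "November")),
  Coffee_Cups = c(3, 0, 2, 6, 4, 5),
  Mins_Work   = c(435, 350, 145, 135, 15, 60)
)

# `target` must name an existing column of the data.
target_ok  <- "Month_Name" %in% names(work)  # TRUE: this column exists
target_bad <- "class" %in% names(work)       # FALSE: hence the error in the question

# Only attempt the mlr call if the package is available.
if (requireNamespace("mlr", quietly = TRUE)) {
  task.work <- mlr::makeClassifTask(
    data   = as.data.frame(work),  # a plain data.frame avoids the tbl_df conversion warning
    target = "Month_Name"          # column name of the factor to predict
  )
  print(task.work)
}
```

    Running names(work) on your own object is a quick way to see which values target will accept.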