r lm poly

Use poly on dataframe column


I read a data frame from a CSV file like so:

wood <- read_csv("/Users/name/Desktop/AR/Exercise data-20201109/woodstrength.csv")

I then select x and y:

x <- wood %>% select(Conc)
y <- wood %>% select(Strength)

The relationship can be modeled with a polynomial of degree 2, so I do:

m <- lm(y ~ poly(x, 2, raw = TRUE))

which returns

non-numeric argument to binary operator

but x looks like this

> x
# A tibble: 19 x 1
    Conc
   <dbl>
 1   1  
 2   1.5
 3   2  
 4   3  
 5   4  
 6   4.5
 7   5  
 8   5.5
 9   6  
10   6.5
11   7  
12   8  
13   9  
14  10  
15  11  
16  12  
17  13  
18  14  
19  15 

What am I doing wrong?


Solution

  • The help page for poly (?poly) says:

    poly(x, ..., degree = 1, coefs = NULL, raw = FALSE, simple = FALSE)
    [...]
    x, newdata: a numeric vector at which to evaluate the polynomial. ‘x’
              can also be a matrix.  Missing values are not allowed in ‘x’.
    

    Your dataset is a tibble, and select() returns a tibble even when you select a single column:

    library(dplyr)
    wood <- tibble(Conc = rnorm(10), Strength = rnorm(10))
    x <- wood %>% select(Conc)
    
    class(x)
    [1] "tbl_df"     "tbl"        "data.frame"
    

    You get that error because poly() does arithmetic on x that expects a numeric vector or matrix. A data.frame (or, in your case, a tibble) is a list, so that arithmetic fails. Extracting the column itself gives a plain numeric vector, which is why this works:

    class(wood[["Conc"]])
    [1] "numeric"
    

    To get a numeric vector instead of a one-column tibble, use pull():

    x <- wood %>% pull(Conc)
    y <- wood %>% pull(Strength)
    m <- lm(y ~ poly(x, 2, raw = TRUE))
    

    Or skip the intermediate variables and refer to the columns by name, passing the tibble as data:

    m <- lm(Strength ~ poly(Conc, 2, raw = TRUE), data = wood)
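
To check that the two fixes above are equivalent, here is a minimal self-contained sketch. The data are simulated stand-ins (the real woodstrength.csv is not available here, so the numbers and the quadratic used to generate them are assumptions), and the names m_vec and m_df are only illustrative.

```r
library(dplyr)

# Simulated stand-in for the woodstrength data; the values are made up.
set.seed(1)
wood <- tibble(
  Conc = seq(1, 15, length.out = 19),
  Strength = 5 + 10 * Conc - 0.6 * Conc^2 + rnorm(19)
)

# select() keeps the tibble wrapper; pull() returns the bare column.
is.numeric(wood %>% select(Conc))  # FALSE
is.numeric(wood %>% pull(Conc))    # TRUE

# Fit 1: numeric vectors extracted with pull()
x <- wood %>% pull(Conc)
y <- wood %>% pull(Strength)
m_vec <- lm(y ~ poly(x, 2, raw = TRUE))

# Fit 2: column names in the formula, tibble passed via data =
m_df <- lm(Strength ~ poly(Conc, 2, raw = TRUE), data = wood)

# The coefficients match; only their names differ.
all.equal(unname(coef(m_vec)), unname(coef(m_df)))  # TRUE

# With the data = form, predicting at new concentrations only needs
# a data frame that has a Conc column.
predict(m_df, newdata = tibble(Conc = c(2.5, 7.5)))
```

The data = form is usually the more convenient of the two, since predict() can look up Conc in newdata by name instead of relying on a free-standing x in the workspace.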