
Formulas in user-defined functions in R


Formulas are a very useful feature of R's statistical and graphical functions. Like everyone, I use these functions, but I have never written a function that takes a formula object as an argument. Could someone help me, either by linking to a readable introduction to this side of R programming or by giving a self-contained example?


Solution

  • You can use model.matrix() and model.frame() to evaluate the formula:

    lm1 <- lm(log(Volume) ~ log(Girth) + log(Height), data=trees)
    print(lm1)
    
    form <- log(Volume) ~ log(Girth) + log(Height)
    
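    # model.matrix() builds the design matrix, including the intercept column;
    # the response still has to be computed separately
    mm <- model.matrix(form, trees)
    lm2 <- lm.fit(mm, log(trees[,"Volume"]))
    print(coefficients(lm2))
    
    # model.frame() evaluates every term, response first (column 1);
    # it has no intercept column, so add one by hand
    mf <- model.frame(form, trees)
    lm3 <- lm.fit(as.matrix(data.frame("Intercept"=1, mf[,-1])), mf[,1])
    print(coefficients(lm3))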
    

    which yields

    Call:
    lm(formula = log(Volume) ~ log(Girth) + log(Height), data = trees)
    
    Coefficients:
    (Intercept)   log(Girth)  log(Height)
          -6.63         1.98         1.12
    
    (Intercept)  log(Girth) log(Height)
         -6.632       1.983       1.117  
    Intercept  log.Girth. log.Height.
         -6.632       1.983       1.117
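
Since the question asks for a function that takes a formula as an argument, the same calls can be wrapped up as a rough sketch. The name fit_ols is made up for illustration and is not part of the answer above; the sketch assumes a plain least-squares fit with no weights, offsets or missing-value handling. It uses model.response() to pull the left-hand side out of the model frame, and passes the frame's terms to model.matrix() so the intercept does not have to be added by hand.

```r
# Hypothetical helper (the name fit_ols is an assumption, not from the answer):
# least-squares coefficients from a formula and a data frame.
fit_ols <- function(formula, data) {
  stopifnot(inherits(formula, "formula"), is.data.frame(data))

  # Evaluate every variable in the formula inside `data`
  mf <- model.frame(formula, data = data)

  # Left-hand side of the formula (the response)
  y <- model.response(mf)

  # Right-hand side as a design matrix; the intercept column is added
  # automatically unless the formula removes it (e.g. with "- 1")
  X <- model.matrix(attr(mf, "terms"), mf)

  # Fit and return the named coefficient vector
  coef(lm.fit(X, y))
}

# Usage with the built-in trees data
fit_ols(log(Volume) ~ log(Girth) + log(Height), trees)
```

Called this way, it should return the same coefficients as the lm() fit above. Because the transformations live in the formula, the caller never has to apply log() to the data manually.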