
Use of ~ (tilde) in the R programming language


I saw in a tutorial about regression modeling the following command:

myFormula <- Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

What exactly does this command do, and what is the role of ~ (tilde) in the command?


Solution

  • The expression to the right of <- is a formula object. Formulas are most often used to specify a statistical model: the term on the left of the ~ is the response, and the terms on the right of the ~ are the explanatory variables. So in English you'd say something like "Species depends on Sepal Length, Sepal Width, Petal Length and Petal Width".

    The myFormula <- part of that line stores the formula in an object called myFormula so you can use it in other parts of your R code.
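
To make the above concrete, here is a minimal sketch that builds the formula from the question and inspects it. It assumes the built-in iris data set, which the variable names suggest but the question does not confirm; the name mf is only for illustration.

```r
# iris ships with R (datasets package), so no extra packages are needed.
myFormula <- Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

class(myFormula)      # "formula"
all.vars(myFormula)   # every variable named in the formula, response first

# The two sides of the ~ can be extracted by position.
myFormula[[2]]        # left-hand side: Species
myFormula[[3]]        # right-hand side: the sum of the four predictors

# Modelling functions take the stored formula plus a data frame.
# model.frame() just gathers the columns that the formula mentions.
mf <- model.frame(myFormula, data = iris)
names(mf)
```

Creating the formula does not evaluate anything: Species and the other names are only looked up later, when a function such as model.frame() is given a data frame to find them in.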


    Other common uses of formula objects in R

    The lattice package uses them to specify the variables to plot.
    The ggplot2 package uses them to specify panels for plotting.
    The dplyr package uses them for non-standard evaluation.
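
As a rough illustration of the first two uses listed above, the sketch below again assumes the iris data set and that the lattice and ggplot2 packages may or may not be installed, so each call is guarded. The names panels, p1 and p2 are hypothetical.

```r
# A one-sided formula (nothing on the left of ~) is still a formula object.
panels <- ~ Species
class(panels)    # "formula"
length(panels)   # 2: the ~ itself and its single right-hand side

# lattice: y ~ x | g plots y against x, with one panel per level of g.
if (requireNamespace("lattice", quietly = TRUE)) {
  p1 <- lattice::xyplot(Sepal.Length ~ Sepal.Width | Species, data = iris)
}

# ggplot2: a one-sided formula names the variable that defines the panels.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p2 <- ggplot2::ggplot(iris, ggplot2::aes(Sepal.Width, Sepal.Length)) +
    ggplot2::geom_point() +
    ggplot2::facet_wrap(~ Species)
}
```

In both cases the formula describes variables rather than a statistical model; printing p1 or p2 draws the corresponding plot.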