r, dataframe, coefficients

How to add coefficients to an existing data frame so that each variable's contribution to the final prediction is shown?


Suppose I have a data frame df with variables y, x1 and x2, where x1 is a continuous variable and x2 is a factor.

Let's say I have a model:

    model <- glm(y ~ x1 + x2, data = df, family = binomial)

This gives a fitted model object from which I can extract the coefficients with model$coefficients.

However, for use in another program I would like to export the data frame df, but I'd also like to be able to display the results of the model beyond simply adding the fitted values to the data frame.

Therefore I would like to have coeff1*x1 and coeff2*x2 in the same data frame as well, so that I can use them together with the original data to display their effects. The problem is that x2 is a multi-level factor, so it has one coefficient per non-reference level rather than a single coeff2, and simply looping over the coefficients and multiplying the variables by them is not a good fit.
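
To make the difficulty concrete, here is a minimal sketch with hypothetical toy data (the names toy and fit and the levels a, b, c are made up for illustration). It assumes R's default treatment contrasts, under which a three-level factor contributes two coefficients, so there is no single coefficient to multiply x2 by.

```r
# Hypothetical toy data: x2 is a factor with three levels
set.seed(7)
toy <- data.frame(
  x1 = rnorm(30),
  x2 = factor(rep(c("a", "b", "c"), each = 10))
)
toy$y <- rbinom(30, 1, 0.5)

fit <- glm(y ~ x1 + x2, data = toy, family = binomial)

# One coefficient per non-reference level of x2 (level "a" is the reference)
coef_names <- names(coef(fit))
print(coef_names)
```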

Is there another way to add two new variables to the dataframe df such that they've been derived from combining the original variables x1, x2 and their respective coefficients?


Solution

  • Try:

    set.seed(123)
    N <- 10
    
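    # build the predictors first: data.frame() cannot refer to its own columns
    # note: x2 is numeric here, so it gets a single coefficient
    x1 <- rnorm(N, 10, 1)
    x2 <- sample(1:3, N, TRUE)
    df <- data.frame(x1 = x1,
                     x2 = x2,
                     y = as.integer(50 - x2 * 0.4 + x1 * 1.2 + rnorm(N, 0, 0.5) > 52))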
    
    model <- glm(y ~ x1 + x2, data = df, family = binomial)
    
    # add column for intercept
    df <- cbind(x0 = rep(1, N), df)
    df$intercept <- df$x0 * model$coefficients["(Intercept)"]
    df[["coeff1*x1"]] <- df$x1 * model$coefficients["x1"]
    df[["coeff2*x2"]] <- df$x2 * model$coefficients["x2"]
    
    #    x0        x1 x2 y intercept     coeff1*x1     coeff2*x2
    # 1   1  9.439524  1 1  24.56607 -3.361333e-06 -4.281056e-07
    # 2   1  9.769823  1 1  24.56607 -3.478949e-06 -4.281056e-07
    # 3   1 11.558708  1 1  24.56607 -4.115956e-06 -4.281056e-07
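
One way to sanity-check columns built like this is to confirm that they add up to the linear predictor reported by predict(). The sketch below is self-contained and uses its own hypothetical data and names (dat, fit, x1_contrib, x2_contrib), with x2 numeric as in the example above; it is an illustration, not part of the original answer.

```r
# Hypothetical data with a numeric x2, so each predictor has one coefficient
set.seed(1)
n <- 50
x1 <- rnorm(n)
x2 <- sample(1:3, n, replace = TRUE)
y <- rbinom(n, 1, plogis(0.5 * x1 - 0.3 * x2))
dat <- data.frame(x1, x2, y)

fit <- glm(y ~ x1 + x2, data = dat, family = binomial)
b <- coef(fit)

# Per-row contribution of each term on the link (log-odds) scale
dat$intercept <- b[["(Intercept)"]]
dat$x1_contrib <- dat$x1 * b[["x1"]]
dat$x2_contrib <- dat$x2 * b[["x2"]]

# The contributions are the terms of the linear predictor, so their sum
# should match predict(type = "link") up to floating-point tolerance
eta <- dat$intercept + dat$x1_contrib + dat$x2_contrib
ok <- isTRUE(all.equal(eta, unname(predict(fit, type = "link"))))
print(ok)
```

Note that these contributions live on the link scale; applying plogis() to their sum gives the fitted probabilities.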
    

    Alternatively:

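    # run this instead of the block above, on the original df (columns x1, x2, y)
    # add column for intercept
    df <- cbind(x0 = rep(1, N), df)
    # multiply each predictor column by its coefficient; this relies on the
    # columns (x0, x1, x2) being in the same order as model$coefficients
    tmp <- as.data.frame(Map(function(x, y) x * y, subset(df, select = -y), model$coefficients))
    names(tmp) <- paste0("coeff*", names(model$coefficients))
    cbind(df, tmp)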
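
Both variants above treat x2 as numeric. When x2 is a real factor, as in the question, one possible approach is to work from the design matrix instead of the raw columns. The sketch below is an assumption-laden illustration rather than the original answer's method: the data, the names dat, fit, contrib, per_term and the effect_ prefix are hypothetical. It uses model.matrix() to get the dummy columns, sweep() to scale each by its coefficient, and the "assign" attribute to add the dummy columns of one factor back into a single column per term.

```r
# Hypothetical data where x2 is a three-level factor
set.seed(42)
n <- 60
dat <- data.frame(
  x1 = rnorm(n, 10, 1),
  x2 = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
dat$y <- rbinom(n, 1, plogis(0.8 * (dat$x1 - 10) + ifelse(dat$x2 == "c", 0.7, 0)))

fit <- glm(y ~ x1 + x2, data = dat, family = binomial)

# Design matrix: intercept, x1, and one dummy column per non-reference level
X <- model.matrix(fit)

# Scale every design-matrix column by its coefficient
contrib <- sweep(X, 2, coef(fit), `*`)

# "assign" maps each column to its term: 0 = intercept, 1 = x1, 2 = x2
term_id <- attr(X, "assign")
term_names <- c("intercept", attr(terms(fit), "term.labels"))

# Sum the columns that belong to the same term (the dummies of x2)
per_term <- sapply(split(seq_along(term_id), term_id),
                   function(j) rowSums(contrib[, j, drop = FALSE]))
colnames(per_term) <- paste0("effect_", term_names)

# Original data plus one contribution column per model term
out <- cbind(dat, per_term)
head(out)
```

The row sums of per_term equal the linear predictor, so the exported columns can be recombined in another program. predict(fit, type = "terms") returns similar per-term columns, but centered, with the offset stored in the "constant" attribute.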