r linear-regression quadprog

Constraints for a Linear Model Using Quadprog for Ranges/Limits on Coefficients


I'm doing some experimenting, and I'm well aware that constraining coefficients is rarely needed, but here goes.

In the following data, I have used quadprog to solve a linear model. Note that X1 is simply the intercept.

library(quadprog)

X1 <- 1
X2 <- c(4.374, 2.3708, -7.3033, 12.0803, -0.4098, 53.0631, 13.1304, 7.3617, 16.6252, 27.3394)
X3 <- c(2.6423, 2.6284, 36.9398, 15.9278, 18.3124, 54.5039, 3.764, 19.0552, 25.4906, 13.0112)
X4 <- c(4.381, 3.144, 9.506, 15.329, 21.008, 38.091, 22.399, 13.223, 17.419, 19.87)
X <- as.matrix(cbind(X1,X2,X3,X4))
Y <- as.matrix(c(37.7,27.48,24.08,25.97,16.65,73.77,45.10,53.35,61.95,71.15))
M1 <- solve.QP(t(X) %*% X, t(Y) %*% X, matrix(0, 4, 0), c())$solution  # no constraints yet

The challenge is to subject certain coefficients to constraints. I know that I should be altering the Amat and bvec arguments (according to Linear regression with constraints on the coefficients). However, I'm not sure how to set them up so that the following constraints are met.

The output reads [1] 37.3468790 1.2872473 -0.0177749 -0.5988443, where the values would be predicted fitted values of X1, X2, X3 and X4.

Constraints (subject to)…

X2 <= .899

0 <= X3 <= .500

0 <= X4 <= .334

Solution

  • Just to clarify: You give an expected output "where the values would be predicted fitted values of X1, X2, X3 and X4". I assume you are referring to estimated (not fitted) parameter values.

    It's quite straightforward to implement constrained parameters when modelling data in rstan. rstan is an R interface to Stan, which in turn is a probabilistic programming language for Bayesian statistical inference.

    Here is an example based on the sample data you provide.

    1. First, let's define our model. Note that we constrain parameters beta2, beta3 and beta4 according to your requirements.

      model <- "
      data {
          int<lower=1> n;   // number of observations
          int<lower=0> k;   // number of parameters
          matrix[n, k] X;   // data
          vector[n] y;      // response
      }
      
      parameters {
          real beta1;                                // X1 unconstrained
          real<upper=0.899> beta2;                   // X2 <= .899
          real<lower=0, upper=0.5> beta3;            // 0 <= X3 <= 0.5
          real<lower=0, upper=0.334> beta4;          // 0 <= X4 <= 0.334
          real<lower=0> sigma;                       // residual standard deviation
      }
      
      model {
          // Likelihood
          y ~ normal(beta1 * X[, 1] + beta2 * X[, 2] + beta3 * X[, 3] + beta4 * X[, 4], sigma);
      }"
      
    2. Next, we fit the model to your sample data.

      library(rstan)
      rstan_options(auto_write = TRUE)
      options(mc.cores = parallel::detectCores())
      df <- cbind.data.frame(Y, X)
      fit <- stan(
          model_code = model,
          data = list(
              n = nrow(df),
              k = ncol(df[, -1]),
              X = df[, -1],
              y = df[, 1]))
      
    3. Finally, let's print the parameter estimates in a tidy way.

      library(broom)
      tidy(fit, conf.int = TRUE)
      # # A tibble: 5 x 5
      #  term  estimate std.error conf.low conf.high
      #  <chr>    <dbl>     <dbl>    <dbl>     <dbl>
      #1 beta1   29.2      6.53   16.9        42.8
      #2 beta2    0.609    0.234   0.0149      0.889
      #3 beta3    0.207    0.138   0.00909     0.479
      #4 beta4    0.164    0.0954  0.00780     0.326
      #5 sigma   16.2      5.16    9.42       29.1
      

      We can also plot parameter estimates including CI.
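      A minimal sketch, assuming the `fit` stanfit object produced in step 2 is available; rstan's built-in `plot` method for stanfit objects draws point estimates with credible intervals.

```r
# Point estimates with credible intervals for the constrained coefficients;
# assumes `fit` is the stanfit object from step 2 above
plot(fit, pars = c("beta1", "beta2", "beta3", "beta4"))
```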

    Note that the parameter estimates are consistent with the imposed constraints.

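    For comparison, the same box constraints can also be imposed directly in solve.QP via the Amat and bvec arguments, as the question suggests. A sketch: solve.QP minimises 1/2 b'Db - d'b subject to t(Amat) %*% b >= bvec, so each "<=" constraint is negated into ">=" form, with one column of Amat per constraint.

```r
library(quadprog)

# Sample data from the question (X1 is the intercept, recycled by cbind)
X1 <- 1
X2 <- c(4.374, 2.3708, -7.3033, 12.0803, -0.4098, 53.0631, 13.1304, 7.3617, 16.6252, 27.3394)
X3 <- c(2.6423, 2.6284, 36.9398, 15.9278, 18.3124, 54.5039, 3.764, 19.0552, 25.4906, 13.0112)
X4 <- c(4.381, 3.144, 9.506, 15.329, 21.008, 38.091, 22.399, 13.223, 17.419, 19.87)
X <- as.matrix(cbind(X1, X2, X3, X4))
Y <- c(37.7, 27.48, 24.08, 25.97, 16.65, 73.77, 45.10, 53.35, 61.95, 71.15)

# One column per constraint, each rewritten in ">=" form:
#   -b2 >= -0.899;  b3 >= 0;  -b3 >= -0.5;  b4 >= 0;  -b4 >= -0.334
Amat <- cbind(c(0, -1,  0,  0),
              c(0,  0,  1,  0),
              c(0,  0, -1,  0),
              c(0,  0,  0,  1),
              c(0,  0,  0, -1))
bvec <- c(-0.899, 0, -0.5, 0, -0.334)

fit_qp <- solve.QP(Dmat = t(X) %*% X, dvec = t(X) %*% Y,
                   Amat = Amat, bvec = bvec)
fit_qp$solution
```

    Unlike the Bayesian fit, this returns the exact constrained least-squares solution; the two approaches should agree to the extent that the posterior is dominated by the likelihood.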