r matrix regression hypothesis-test singular

"System is computationally singular" error from linearHypothesis, but the matrix has full rank


I'm performing a Mincer-Zarnowitz test to assess the goodness of fit of a time series regression. The test (https://eranraviv.com/volatility-forecast-evaluation-in-r/) boils down to two steps: first, regressing the observations on the fitted values; second, jointly testing that the intercept of that regression is 0 and the coefficient on the fitted values is 1.
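
For reference, the two steps can be written as follows. The symbols \alpha, \beta and \varepsilon_t are notation assumed here for illustration, not taken from the linked post.

```latex
\[
\text{obs}_t = \alpha + \beta\,\text{fit}_t + \varepsilon_t,
\qquad
H_0:\ \alpha = 0 \ \text{and}\ \beta = 1 .
\]
```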

I attached the first 20 observations of my vectors of observations (obs) and fitted values (fit); the full dataset gives the same error. Using R, I first run the regression of obs on fit and save it as MZ2. Then I use the linearHypothesis function from the car package to test the joint hypothesis above. The rank of the model matrix (MZ2$rank) is maximal (2), so the matrix should be invertible. Yet I receive the error: Error in solve.default(vcov.hyp) : system is computationally singular: reciprocal condition number = 6.22676e-17. The same code runs without error when each hypothesis is tested on its own.

I don't understand why I get this error. I would have expected summary with the vcov option to fail with the same error when computing the asymptotic (robust) standard errors, but it doesn't.

Any idea what causes this error? Thank you.

obs <-c(13964892, 10615134, 12066946,  8394110,  8991798, 12456120,  8981580,
        9261421, 12976910, 19263428,  6453574,  9025350, 12455365,  9711284,
        14876416, 11643567,  8383892, 10234233,  7601169, 10136608)
fit <- c(12478069, 11826724, 10706274, 10573869, 10413272, 10789469,
        9401626, 10067159, 12939216, 11535966, 10890038, 10634312, 11122152,
        11309619, 10877766, 10330747, 10034014, 10912567,  9204140,  9532570)
MZ2 <- lm(obs ~ fit)
summary(MZ2, vcov = vcovHC, type = "HC3")
  # Call:
  #   lm(formula = obs ~ fit)
  # 
  # Residuals:
  #   Min       1Q   Median       3Q      Max 
  # -4605688 -1518159  -543282  1318148  7130691 
  # 
  # Coefficients:
  #   Estimate    Std. Error t value Pr(>|t|)  
  # (Intercept) -7039028.9827  6717707.9500  -1.048   0.3086  
  # fit                1.6619        0.6209   2.676   0.0154 *
  #   ---
  #   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  # 
  # Residual standard error: 2565000 on 18 degrees of freedom
  # Multiple R-squared:  0.2847,    Adjusted R-squared:  0.2449 
  # F-statistic: 7.163 on 1 and 18 DF,  p-value: 0.0154
#
# JOINT TEST
#
require(car)
linearHypothesis(MZ2, c("(Intercept) = 0", "fit = 1"))
Error in solve.default(vcov.hyp) : 
  system is computationally singular: reciprocal condition number = 6.22676e-17
In addition: Warning message:
In constants(lhs, cnames_symb) : NAs introduced by coercion
> MZ2$rank
[1] 2
#
# UNIVARIATE TESTS
#
linearHypothesis(MZ2, c("(Intercept) = 0"))
Linear hypothesis test

Hypothesis:
(Intercept) = 0

Model 1: restricted model
Model 2: obs ~ fit

  Res.Df             RSS Df     Sum of Sq     F Pr(>F)
1     19 125618245448671                              
2     18 118396383219614  1 7221862229057 1.098 0.3086
> linearHypothesis(MZ2, c("fit = 1"))
Linear hypothesis test

Hypothesis:
fit = 1

Model 1: restricted model
Model 2: obs ~ fit

  Res.Df             RSS Df     Sum of Sq      F Pr(>F)
1     19 125870444423604                               
2     18 118396383219614  1 7474061203991 1.1363 0.3005

Solution

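  • Your values are very large, so the quantities computed from them are badly scaled for the machine. In particular, the covariance matrix that linearHypothesis has to invert is so ill-conditioned that its reciprocal condition number (6.22676e-17 in your error message) falls below double-precision machine epsilon (about 2.2e-16), and solve() then treats it as singular even though the model has full rank. It is similar to something discussed here

    Ideally, you go back to the linear model that gives you the predictions and rescale your dependent variable, for example by dividing by 1e3 or 1e6.

    With what you have now, you can rescale both vectors and then test the joint hypothesis: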

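    # Rescale both series to millions so the numbers are on a manageable scale
    df <- data.frame(obs = obs / 1e6, fit = fit / 1e6)
    MZ2 <- lm(obs ~ fit, data = df)
    library(car)
    # Joint test that the intercept is 0 and the slope is 1
    linearHypothesis(MZ2, c("(Intercept) = 0", "fit = 1"))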
    
    Linear hypothesis test
    
    Hypothesis:
    (Intercept) = 0
    fit = 1
    
    Model 1: restricted model
    Model 2: obs ~ fit
    
      Res.Df    RSS Df Sum of Sq      F Pr(>F)
    1     20 126.05                           
    2     18 118.40  2    7.6573 0.5821 0.5689
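
To see the scaling problem directly, here is a minimal sketch in base R (no car needed). It compares the reciprocal condition number of the coefficient covariance matrix before and after rescaling, and then computes the joint F-test from the restricted and unrestricted residual sums of squares, which avoids inverting that matrix. The names mz_raw, mz_scaled, rc_raw, rc_scaled, n_restr, f_stat and p_val are illustrative and not part of the original code. This is a way to check the explanation, not necessarily what linearHypothesis does internally.

```r
# Data copied from the question
obs <- c(13964892, 10615134, 12066946,  8394110,  8991798, 12456120,  8981580,
          9261421, 12976910, 19263428,  6453574,  9025350, 12455365,  9711284,
         14876416, 11643567,  8383892, 10234233,  7601169, 10136608)
fit <- c(12478069, 11826724, 10706274, 10573869, 10413272, 10789469,
          9401626, 10067159, 12939216, 11535966, 10890038, 10634312, 11122152,
         11309619, 10877766, 10330747, 10034014, 10912567,  9204140,  9532570)

# Mincer-Zarnowitz regression on the raw data and on data rescaled to millions
mz_raw    <- lm(obs ~ fit)
mz_scaled <- lm(I(obs / 1e6) ~ I(fit / 1e6))

# Reciprocal condition numbers of the coefficient covariance matrices,
# printed next to machine epsilon for comparison
rc_raw    <- rcond(vcov(mz_raw))
rc_scaled <- rcond(vcov(mz_scaled))
print(c(raw = rc_raw, scaled = rc_scaled, eps = .Machine$double.eps))

# Joint F-test of H0: intercept = 0 and slope = 1 from the two RSS values.
# Under H0 the restricted model is obs = fit, so its residuals are obs - fit.
rss_u   <- sum(residuals(mz_raw)^2)   # unrestricted RSS
rss_r   <- sum((obs - fit)^2)         # restricted RSS
n_restr <- 2                          # number of restrictions
df_u    <- df.residual(mz_raw)        # residual degrees of freedom
f_stat  <- ((rss_r - rss_u) / n_restr) / (rss_u / df_u)
p_val   <- pf(f_stat, n_restr, df_u, lower.tail = FALSE)
print(c(F = f_stat, p = p_val))
```

Because the F statistic is a ratio of sums of squares, it does not depend on the units of obs and fit, so it should agree with the F = 0.5821 in the table above whether or not the data are rescaled.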