rspatialkriginggstat

Error in ST kriging of uneven spacetime data " the leading minor of order 2 is not positive definite"


I am trying to use universal kriging on some irregular data. I keep getting an error:

Error in chol.default(A) : the leading minor of order 2 is not positive definite

When I run the krigST command. Here is a sample of my data:

long lat date_from date_to var1 var2 var3 var4 resp tid spid
144.12 -33.73 2019-01-03 2019-02-02 700 103 48.6 4.93 -4.04 1 1
144.46 -32.59 2019-01-03 2019-02-02 698 89 65.8 16.69 -2.93 1 2
144.98 -31.05 2019-01-03 2019-02-02 1947 145 145.4 8.22 -7.05 1 3
144.46 -32.59 2019-01-07 2019-02-06 686 95 61.4 7.03 -3.94 2 2
145.44 -33.25 2019-01-07 2019-02-06 637 151 25 7.76 -5.09 2 4
146.15 -34.01 2019-01-07 2019-02-06 849 75 11 9.86 -5.46 2 6
145.8 -33.45 2019-02-03 2019-03-02 704 90 14.2 10.91 -5.28 3 5
146.15 -34.01 2019-02-03 2019-03-02 15 1 12.9 8.4 -3.41 3 6
144.98 -31.05 2019-03-13 2019-04-12 975 153 87.3 13.09 -6.1 4 3
145.44 -33.25 2019-03-13 2019-04-12 1025 124 33.4 22.13 -6.89 4 4
144.46 -32.59 2019-04-22 2019-05-21 264 192 20 5.52 -4.28 5 2
145.44 -33.25 2019-04-22 2019-05-21 530 136 17.2 9.27 -5.48 5 4
145.8 -33.45 2019-04-22 2019-05-21 790 87 16.6 15.63 -6.57 5 5
145.8 -33.45 2019-05-03 2019-06-02 1074 67 19.7 9.35 -6.94 6 5
146.15 -34.01 2019-05-03 2019-06-02 140 7 49.1 13.98 -3.12 6 6
144.98 -31.05 2019-06-02 2019-07-01 1104 126 21.2 17.88 -6.96 7 3
144.46 -32.59 2019-06-03 2019-07-02 351 174 61 7.45 -5.14 8 2
145.44 -33.25 2019-06-03 2019-07-02 573 122 1.4 13.98 -6.42 8 4
146.15 -34.01 2019-06-03 2019-07-02 651 53 12.4 9.06 -6.21 8 6
145.8 -33.45 2019-07-15 2019-08-14 1058 75 11.8 12.73 -6.77 9 5

This is the code I'm using:

dat<- read.csv("mydat.csv")


#load libraries
library(gstat)
library(sp)
library(spacetime)

#Format lat and long & remove the duplicates
COOR=unique(cbind(dat$long,dat$lat))

#create SpatialPoints object from coordinates CRS stands for Coordinate reference system
COOR.formatted=SpatialPoints(COOR,proj4string=CRS("+proj=longlat +datum=WGS84"))

#create index  for STSDF object
(index <- matrix(cbind(dat$spid, dat$tid), nrow= length(dat$spid), ncol=2, byrow = FALSE ))

dataframe<-data.frame(var1 = dat$var1,var2=dat$var2,
                      var3 = dat$var3,var4 = dat$var4, resp = dat$resp,
                      lat=dat$lat,     lat.sq=dat$lat^2, ID=paste("ID",1:nrow(dat)))
DATE.formatted <- unique(as.Date(dat$date_from))
DATE.formatted.end <- as.POSIXct(unique(as.Date(dat$date_to)))
datST = STSDF(COOR.formatted, DATE.formatted,  dataframe, 
              index, endTime=DATE.formatted.end )


summary(datST)
#################################create some theoretical models###################
var.uk.o=variogramST(resp~ var1+var2+var3+var4,
                     datST,tlags=0:7, 
                     na.omit=TRUE,)
plot(var.uk.o)
type <-c("Exp", "Sph")
psill=2
spatialVgm = vgm(psill=psill,type[1], range=120, nugget=0)
temporalVgm=vgm(psill=psill,type[1], range=150, nugget=0)
jointVgm = vgm(psill=psill,type[2], range=120, nugget=0)
fit.ssum.o = vgmST("sumMetric", space = spatialVgm, 
                   time = temporalVgm, 
                   joint = jointVgm, 
                   nugget = 0, 
                   stAni = 25)
fit.ssum.o=fit.StVariogram(var.uk.o,  
                           fit.ssum.o ,fit.method = 8)
#this is where it goes wrong. 
pred.a <- krigeST(resp~1,
                  data=datST, newdata=datST,
                  modelList=fit.ssum.o)
pred.b <- krigeST(resp~var1+var2+var3+var4,
                  data=datST, newdata=datST,
                  modelList=fit.ssum.o)
#I tried increasing the value of the response variable but I still get the error. 
dat$resp= dat$resp+10000


I tried to increase the response variable to move it away from zero, because my data is very noisy, but it doesn't work (I've tried small increments 10, 20, 50....1000, 10000 nothing works. I added the optional endTime option in the STSDF object but that didn't work either.


Solution

  • I'm able to answer this question at last. It turns out you also get this error if there are 'duplicate' entries in your data. My data was too noisy so I grouped the dates so they all started on the 1st of each month. This smoothed the data, however I was still getting an error because my smoothing had resulted in some peusdo-duplicate data i.e.

    144.46 -32.59 2019-01-03 2019-02-02 698 89 65.8 16.69 -2.93 1 2

    144.46 -32.59 2019-01-07 2019-02-06 686 95 61.4 7.03 -3.94 2 2

    became

    144.46 -32.59 2019-01-01 2019-02-02 698 89 65.8 16.69 -2.93 1 2

    144.46 -32.59 2019-01-01 2019-02-06 686 95 61.4 7.03 -3.94 1 2

    So I had to calculate the mean values of my response and predictor variables and now the code works correctly.

    This is a fine fix for this data, as I want to interpolate monthly values for several years. This is the little bit of code that fixed it

     p<- length(colnames(data))
     avg=aggregate(data[,5:p],by=list(data$long, data$lat, data$Date_from),FUN=mean,simplify=TRUE,na.rm=TRUE)
    

    The date_to data, was no longer relevant as I fixed that when I grouped the data into monthly bags.