I want to do a cross-validation for the ca20-Dataset from the geoR package. With for example the meuse-dataset, this works fine, but for this dataset, I encounter a strange problem with the dimensions of the SpatialPointsDataFrame. Maybe you can try this for yourself and explain why the autoKrige.cv function does not work (I tried several nfold-values but this only changes the locations-value of the error message...):
library(geoR)
library(gstat)
library(automap)
data(ca20)
east=ca20$coords[,1]
north=ca20$coords[,2]
concentration=ca20$data
frame=data.frame(east,north)
data=data.frame(concentration)
points<-SpatialPoints(data.frame(east,north),proj4string=CRS(as.character(NA)))
pointsframe<-SpatialPointsDataFrame(points,data, coords.nrs = numeric(0),proj4string = CRS(as.character(NA)), match.ID = TRUE)
krig=autoKrige(pointsframe$concentration~1,pointsframe)
plot(krig)
cv=autoKrige.cv(pointsframe$concentration~1,pointsframe)
I hope someone can reproduce the problem, my R version is 2.15, all packages are up to date (at least not older than a month or so...).
Thanks for your help!!
First, the way you build your SpatialPointsDataFrame can be done more easily:
library(geoR)
library(gstat)
library(automap)
...and build the SPDF:
pointsframe = data.frame(ca20$coords)
pointsframe$concentration = ca20$data
coordinates(pointsframe) = c("east", "north")
The problem you have is in how you use the formula
argument. You add the spatial object pointsframe
to the formula, in essence putting a vector directly into the formula. You should just use the column name in the formula, like this:
cv=autoKrige.cv(concentration~1,pointsframe)
and it works:
> summary(cv)
[,1]
mean_error -0.01134
me_mean -0.0002237
MAE 6.02
MSE 60.87
MSNE 1.076
cor_obspred 0.7081
cor_predres 0.01343
RMSE 7.802
RMSE_sd 0.7041
URMSE 7.802
iqr 9.519