rstatisticsspatialrastergstat

gstat in R - Variogram cutoff distance is not working at larger specified distances with large gridded datasets


I am attempting to compute variograms in R with the gstat package of biomass data across management areas. The biomass data is a raster dataset with a 3.5 ft resolution or 1.0668m. The size of the spatialpointsDataFrame I am passing to the variogram function is 18.6 Mb (814223 elements). (I have also tried the spatialpixelsDataFrame, but it does not like the 1.0668m pixel size). When I run the code:

v = variogram(ras.grid1@data[[1]]~1, data = ras.grid1)

and look at output "v", I get distance values that are much larger than the management area (and much larger than 1/3 of the diagonal length).

When I run the variogram function on smaller management units (40 ha) it gives me results that I would expect (this is using a SpatialPointsDataFrame with the size of 7.9 Mb and 344259 elements).

If I hard code the cutoff to be smaller, with the initial larger raster dataset to 200m, it again provides distance values I expect. If I try upping the distance let's say 600m again it provides distance values much larger than the 600m cutoff specified. 300m also provides unexpected results. For example:

    ####variogram computation with 200m cutoff....It works
    v = variogram(ras.grid1@data[[1]]~1, data = ras.grid1, cutoff=200)
    v
              np       dist    gamma dir.hor dir.ver   id
    1   195954282   8.874169 4990.504       0       0 var1
    2   572500880  20.621626 5627.534       0       0 var1
    3   958185761  33.701344 5996.423       0       0 var1
    4  1288501796  46.920392 6264.396       0       0 var1
    5  1652274803  60.198360 6472.187       0       0 var1
    6  1947750363  73.502011 6642.960       0       0 var1
    7  2282469596  86.807781 6802.124       0       0 var1
    8  2551355646 100.131946 6942.277       0       0 var1
    9  2849678492 113.441335 7049.838       0       0 var1
    10 3093057361 126.751400 7149.102       0       0 var1
    11 3375989515 140.081110 7240.848       0       0 var1
    12 3585116223 153.418095 7322.990       0       0 var1
    13 3821495516 166.721460 7394.616       0       0 var1
    14 4036375072 180.053643 7443.040       0       0 var1
    15 4235205167 193.389119 7476.061       0       0 var1

    ####variogram computation with 600m cutoff....It returns unexpected 
    ####distance values
    v2 = variogram(ras.grid1@data[[1]]~1, data = ras.grid1, cutoff=600) 
    v2
       np        dist      gamma dir.hor dir.ver   id
    1  1726640923    26.54691   5759.951       0       0 var1
    2   593559666   510.62232  53413.914       0       0 var1
    3  3388536438   229.26702  15737.659       0       0 var1
    4  1464228507   966.36789  49726.788       0       0 var1
    5  3503141163   623.13559  25680.965       0       0 var1
    6   878031648  3454.21122 117680.266       0       0 var1
    7  2233138601  1761.91799  50996.719       0       0 var1
    8  3266098834  1484.40162  37369.451       0       0 var1
    9  4056578316  1420.49358  31556.527       0       0 var1
    10  254561085 26030.66780 517601.669       0       0 var1
    11  562144107 13256.59985 239163.649       0       0 var1
    12  557621435 14631.84504 243476.857       0       0 var1
    13  385648032 22771.12890 352898.971       0       0 var1
    14 4285655256  2163.11091  31213.201       0       0 var1
    15 3744542323  2575.19496  34709.529       0       0 var1

Also if I scale the data up to 3m I again get the expected distance values.

I am not sure if the large size of raster dataset is causing the issue and what I am trying to do is not possible, or if I doing something wrong or if there is another way?

Thank you for the help and interest.


Solution

  • After exploring this in more detail, it does seem to be the size of the SpatialPointsDataFrame causing the issue. On my machine keeping the size under 10 Mb seemed to do the trick. To reduce the size of the SpatialPointsDataFrame I sampled the original raster using:

      ras.grid<-ras.grid[sample(1:length(ras.grid), 350000),]