The entries on the diagonal of a spatial distance matrix should be zero, bc they represent the distance between each location and itself. But the rdist.earth()
function from the fields
R package
sometimes give me non-zeros on the diagonal:
> # Set number of decimals of output display
> options(digits=8)
> # Some longitude, latitude data
> LLdat
lon lat
1 -105.85878 43.65797
2 -105.81812 43.57009
3 -105.80796 43.57748
>
> # Create distance matrix
> library(fields)
> distmat <- rdist.earth(LLdat,LLdat)
> distmat
1 2 3
1 0.0000000 6.410948951394 6.12184338
2 6.4109490 0.000059058368 0.72150586
3 6.1218434 0.721505863563 0.00000000
In the above distance matrix, the second entry on the diagonal is 0.000059058368
, in miles (the default unit), whereas the other two are 0.0000000
. First, why do the entries of the second column show more digits than the other two? And why is the entry on the second diagonal not zero to 8 decimals like the others? The discrepancy doesn't seem small enough to attribute to floating point rounding error.
Now compare the output of rdist.earth()
to that of a different package, geosphere
, and function distGeo()
, which computes the distance between two points (not a full distance matrix). Here, we compute the distance between each point and itself. The output vector units are in meters:
> library(geosphere)
> distmat2 <- distGeo(LLdat,LLdat)
> distmat2
[1] 0 0 0
So with distGeo()
, all three distances measures are in agreement and appropriately zero.
Is there something I'm missing? Or does this indicate a problem with rdist.earth()
?
Unfortunately, it is a rounding error.
If you look at the source code, you can replicate the issue:
x1 <- LLdat
R <- 3963.34
coslat1 <- cos((x1[, 2] * pi)/180)
sinlat1 <- sin((x1[, 2] * pi)/180)
coslon1 <- cos((x1[, 1] * pi)/180)
sinlon1 <- sin((x1[, 1] * pi)/180)
pp <- cbind(coslat1 * coslon1, coslat1 * sinlon1, sinlat1) %*%
t(cbind(coslat1 * coslon1, coslat1 * sinlon1, sinlat1))
return_val = (R * acos(ifelse(abs(pp) > 1, 1 * sign(pp), pp)))
The function first calcs an intermediate matrix pp:
print (pp)
[,1] [,2] [,3]
[1,] 1.0000000000 0.9999986917 0.9999988071
[2,] 0.9999986917 1.0000000000 0.9999999834
[3,] 0.9999988071 0.9999999834 1.0000000000
It seems like the diagonal is all the same. However:
print(pp, digits=22)
[,1] [,2] [,3]
[1,] 1.0000000000000000000000 0.9999986917465573110775 0.9999988070789928018556
[2,] 0.9999986917465573110775 0.9999999999999998889777 0.9999999834298258782894
[3,] 0.9999988070789928018556 0.9999999834298258782894 1.0000000000000000000000
> acos(0.9999999999999998889777) * R
[1] 5.905836821e-05
> acos(1.0000000000000000000000) * R
[1] 0