I'm having a problem analysing spatial data in R for machine learning. Is it possible that coordinate_names = "geometry" and coords_as_features = FALSE don't do what I thought they would with the geometry column, and that this is what causes the error below?
What is the actual purpose of these two parameters?
library(sf)
library(terra)
library(mlr3)
library(mlr3learners)
library(mlr3spatial)
trainingdata <- sf::st_read("https://github.com/LOEK-RS/FOSSGIS2025-examples/raw/refs/heads/main/data/temp_train.gpkg")
predictors <- terra::rast("https://github.com/LOEK-RS/FOSSGIS2025-examples/raw/refs/heads/main/data/predictors.tif")
trainDat <- sf::st_as_sf(terra::extract(predictors, trainingdata, bind = TRUE))
task <- as_task_regr_st(
  x = trainDat,
  target = "temp",
  coordinate_names = "geometry",
  coords_as_features = FALSE
)
#> Error in as_data_backend.data.frame(backend) :
#> Assertion on 'data' failed: Must have unique colnames, but element 24 is duplicated.
Created on 2025-07-14 with reprex v2.1.1
Two possible workarounds for your consideration (they make sense to me, though neither is elegant):
Option 1: If X and Y in the raw data match the coordinates stored in geometry, drop the two columns in advance. (For the record, they do not match in this case, but the code below assumes they are consistent.)
task <- as_task_regr_st(
  x = subset(trainDat, select = -c(X, Y)),
  target = "temp",
  coords_as_features = TRUE
)
Option 2: Preserve the original data by renaming the columns. With coords_as_features = FALSE, the coordinates are not included as features in subsequent calculations.
colnames(trainDat)[colnames(trainDat) == "X"] <- "lon"
colnames(trainDat)[colnames(trainDat) == "Y"] <- "lat"
task <- as_task_regr_st(
  x = trainDat,
  target = "temp",
  coords_as_features = FALSE
)
task
trainDat contains the columns X, Y and geometry. as_task_regr_st tries to add X and Y columns extracted from geometry, so we end up with duplicated column names. This should work:
library(sf)
library(terra)
library(mlr3)
library(mlr3learners)
library(mlr3spatial)
trainingdata = sf::st_read("https://github.com/LOEK-RS/FOSSGIS2025-examples/raw/refs/heads/main/data/temp_train.gpkg")
predictors = terra::rast("https://github.com/LOEK-RS/FOSSGIS2025-examples/raw/refs/heads/main/data/predictors.tif")
trainDat = sf::st_as_sf(terra::extract(predictors, trainingdata, bind = TRUE))
trainDat$X = NULL
trainDat$Y = NULL
task = as_task_regr_st(
  x = trainDat,
  target = "temp",
  coords_as_features = FALSE
)
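The duplicated-column explanation above can be reproduced in base R without any spatial packages. The sketch below is only an illustration under stated assumptions: attrs and coords are hypothetical stand-ins (with made-up values) for the attribute table of trainDat and for the coordinates derived from its geometry, and it assumes the coordinates are appended as columns named X and Y, as described above. It is not the actual mlr3spatial implementation.

```r
# Hypothetical stand-in for the attribute table of trainDat (values are made up).
attrs <- data.frame(temp = c(10.2, 11.5), X = c(1, 2), Y = c(3, 4))

# Hypothetical stand-in for the coordinates derived from the geometry column,
# assumed here to be named X and Y.
coords <- data.frame(X = c(1, 2), Y = c(3, 4))

# cbind() on data frames keeps duplicated names instead of renaming them,
# so the combined table ends up with two X and two Y columns.
combined <- cbind(attrs, coords)
dup <- names(combined)[duplicated(names(combined))]
print(dup)

# Dropping the pre-existing X and Y before combining leaves unique names.
keep <- setdiff(names(attrs), names(coords))
fixed <- cbind(attrs[, keep, drop = FALSE], coords)
print(anyDuplicated(names(fixed)))
```

On the real data, a quick check such as intersect(names(trainDat), c("X", "Y")) before building the task would show whether the clash is present.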