I'm trying to crop a netcdf file with a polygon with stars
package from daily netcdf data. I think I have managed to do it and could get this plot
with this script
library(tidyverse)
library(sf)
library(stars)
# Input nc file
nc.file <- "20220301120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.1.nc"
# read nc data
nc.data <- read_ncdf(nc.file, var="analysed_sst")
# Read mask coordinates
coordenades.poligon <- read_csv("coordenades_poligon.csv")
colnames(coordenades.poligon) <- c("lon","lat")
# Build sf polygon to crop data
polygon <- coordenades.poligon %>%
st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
summarise(geometry = st_combine(geometry)) %>%
st_cast("POLYGON")
# Crop data
nc.stars.crop <- st_crop(nc.data,polygon)
# plot
ggplot() + geom_stars(data=nc.stars.crop) +
coord_equal() + theme_void() +
scale_x_discrete(expand=c(0,0))+
scale_y_discrete(expand=c(0,0))
Now I would like to combine lon, lat and analysed_sst in a data frame. I managed to extract coordinates with
nc.stars.coords <- as.data.frame(st_coordinates(nc.stars.crop))
But can't find how to get the corresponding sst values to cbind with longitude and latitude. Maybe there are other solutions with ncdf4
package.
Thank you very much for your help
EDIT 1
Link to SST original data (nc file): SST data
EDIT 2 Added head of coordenades_poligons.csv. First columns are longitude and latitude points, third column is the area ID and fourth one denotes the season. These are just the coordinates of a single area filtered by ID and season.
12.5,44.5,Z1,S
2,44.5,Z1,S
0,41.5,Z1,S
4,40,Z1,S
9,40,Z1,S
9,42,Z1,S
0,41.5,Z2,S
I am making assumptions here, because this is not my area of expertise, but you are able to simply transform this into a dataset using the raster
-package. This seems to be the way to go, also according to this author.
raster::as.data.frame(nc.stars.crop, xy = TRUE)
At least for me this worked. And then you could transform it back into a simple features object, if you are so inclined with
raster::as.data.frame(nc.stars.crop, xy = TRUE) %>%
sf::st_as_sf(coords = c('lon','lat'))
However, the transformation to lon/lat is not really exact, because it produces point data, whereas the original information is raster data. So there is clearly information that gets lost.
sf::st_as_sf()
seems to work out of the box for this, but I am not sure, because I have no way to validate the transformation of the original data. For me the following worked:
read_ncdf('20220301120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.1.nc', var="analysed_sst") %>%
sf::st_as_sf()
This creates polygons, the size of your initial raster tiles and seems to conserve all necessary information.
Finally, here is a work-around to extracting exactly the data you were plotting. You can access the data that ggplot used, by assigning the ggplot to a variable and then accessing the data layer.
p <- ggplot() + geom_stars(data=nc.stars.crop) +
coord_equal() + theme_void() +
scale_x_discrete(expand=c(0,0))+
scale_y_discrete(expand=c(0,0))
p$layers[[1]]$data