I have downloaded multiple Lm3 4 km netcdf files (n = 4640) for sea surface temperature from the Ocean Colour Project, NASA.
class : SpatRaster
dimensions : 766, 709, 1 (nrow, ncol, nlyr)
resolution : 0.04165021, 0.04165796 (x, y)
extent : 24.61, 54.14, 5.8, 37.71 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
source : AQUA_MODIS.20120101.L3m.DAY.SST.x_sst.nc:sst
varname : sst (Sea Surface Temperature)
name : sst
unit : degree_C
My data spans from 2012 to 2024 for an 82 km area in The Red Sea, and I want to extract point data for sea surface temperature for specific dates.
I have successfully extracted values for sea surface temperature from the netcdf files using the code below, but I'm not sure if I'm just extracting values from the first netcdf layer rather than looping through all the files (1:4640), and I'm unsure how to incorporate the temporal element (the column 'Dates') with the point data over multiple years on specific dates.
I can't share the data because of ownership issues.
library(terra)
library(lubricate)
#Download Sea Surface Temperature
folder_SST<-"~/Documents/GIS_Data/SST/requested_files"
files_SST <-list.files(folder_SST, pattern='*.nc', full.names ="TRUE")
#Loop through all the file paths and the AQUA MODIS netcdf4 files to gain access to them:
for (file in files_SST) {
nc_SST=open.nc(files_SST)
print.nc(nc_SST)
}
#Select the sea surface temperature values from AQUA MODIS files
SSTs <- terra::rast(files_SST[1], "sst")
#Make the dataframe a spatial object of class = "sf" with a CRS of 4326
Ds_Points <- st_as_sf(x=MyDf,
coords = c("Longitude_E_DD", "Latitude_N_DD"),
crs = 4326)
#Format the dates in the dataframe
#Get dates
MyDf <- MyDf %>% mutate(Dates = as.Date(as.character(MyDf$Date), format = "%d/%m/%Y"))
#Extract the SST values from the AQUA MODIS files
SSTs_Data <- terra::extract(SSTs, Ds_Points)
Desired output:
Survey_Number Dates Longitude_E_DD Latitude_N_DD SST
1 2012-08-01 33.89083 27.26778 23.635
2 2012-06-02 33.86782 27.40854 23.640
3 2012-02-07 33.86230 27.44623 23.690
4 2012-02-12 33.88653 27.26957 23.635
5 2012-02-13 33.88766 27.26848 23.635
6 2012-02-14 33.85000 27.36111 23.780
7 2012-02-15 33.86177 27.41302 23.640
In the code you have, you are indeed only extracting values from the first netcdf file. This line in your script:
SSTs <- terra::rast(files_SST[1], "sst")
means you are loading the first file in your files_SST list and only using that file in the rest of your script. You will need to use a "for loop" to iterate over each file. I do not see a reference to MyDf above your Ds_Points definition, so I manually created MyDf using the lat/long values of the first 4 points that show up in your desired results. I also downloaded 3 of the more recent SST files from the given source to use for this.
This is my solution below:
library(tidyverse)
library(terra)
library(tidyterra)
library(sf)
library(ncdf4)
library(lubridate)
library(stringr)
# create vectors of longs and lats for sample points
# add these to MyDf, then convert MyDf to sf and save sf to Ds_Points
longs <- c(33.89083, 33.86782, 33.86230, 33.88653)
lats <- c(27.26778, 27.40854, 27.44623, 27.26957)
MyDf <- data.frame(longs, lats) %>%
rename(
Longitude_E_DD = longs,
Latitude_N_DD = lats
)
Ds_Points <- st_as_sf(x = MyDf,
coords = c("Longitude_E_DD", "Latitude_N_DD"),
crs = 4326)
######################################################################
# set path to data folder
# get sea surface temperature data files
folder_SST <- paste(here::here(), '/data', sep = '')
files_SST <- list.files(folder_SST, full.names ="TRUE")
# set column names in expected output df,
# then add them to new sst_df
cols <- c('Survey Number', 'Date', 'Longitude_E_DD', 'Latitude_N_DD', 'SST')
sst_df <- data.frame()
for (col in cols) {
sst_df[1, col] <- NA
}
######################################################################
# initialize index and survey_num
index <- 1
survey_num <- 1
# loop through all the AQUA MODIS netcdf4 files
for (file in files_SST) {
# open file as raster and get sst layer
nc_SST = rast(file, "sst")
# get formatted date from file name
date <- as.Date(as.character(str_split(tail(str_split(file, '/')[[1]], 1), '\\.')[[1]][2]), format = "%Y%m%d")
# iterate over each point
for (i in 1:nrow(Ds_Points)) {
# get point
d_point <- Ds_Points[i,]
# get point's lat and long
lon <- st_coordinates(d_point)[,'X']
lat <- st_coordinates(d_point)[,'Y']
# get SST at that point from SST file
# must convert point geometry to spatvector to use in terra::extract function
SST_data <- terra::extract(nc_SST, as_spatvector(d_point, geom = 'geometry', crs = st_crs(Ds_Points)))
# insert data into sst_df at row index
sst_df[index, 'Survey Number'] <- survey_num
sst_df[index, 'Date'] <- as.character(date)
sst_df[index, 'Longitude_E_DD'] <- lon
sst_df[index, 'Latitude_N_DD'] <- lat
sst_df[index, 'SST'] <- SST_data$sst
# next row index
index <- index + 1
}
# next survey number/file
survey_num <- survey_num + 1
}
# show completed sst_df
sst_df
This is the output:
Survey Number Date Longitude_E_DD Latitude_N_DD SST
1 2024-10-28 33.89083 27.26778 26.980
1 2024-10-28 33.86782 27.40854 27.000
1 2024-10-28 33.86230 27.44623 27.075
1 2024-10-28 33.88653 27.26957 26.980
2 2024-10-30 33.89083 27.26778 27.055
2 2024-10-30 33.86782 27.40854 27.005
2 2024-10-30 33.86230 27.44623 27.060
2 2024-10-30 33.88653 27.26957 27.055
3 2024-11-01 33.89083 27.26778 27.265
3 2024-11-01 33.86782 27.40854 27.220
3 2024-11-01 33.86230 27.44623 27.165
3 2024-11-01 33.88653 27.26957 27.265
This script finds the SST values at each of the points in Ds_Points for each of the netcdf files, then adds the data to sst_df. There are 4 points, so there are 4 SST values for each of the 3 dates/SST files. The date was extracted from the file name of the netcdf file. I figured the survey number was unique for each date.
Keep in mind, this particular solution will be a bit slow given how many files you have and even slower depending on how many points you have (since it will iterate over each point for each file).